Re: Porting network driver to 2.4.0

2001-01-10 Thread Manfred Spraul
Andi Kleen wrote: On Wed, Jan 10, 2001 at 03:40:50PM -0500, Jonathan Earle wrote: Where do I go from here? Is there info somewhere to help with this? Is this a bigger job than it looks on the surface? Try http://www.firstfloor.org/~andi/softnet I would ask someone from znxz. I

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-10 Thread Manfred Spraul
Frank de Lange wrote: Hi'all, Ever since I put two ethernet-cards (cheap Winbond W89C940 based PCI NE2K clones) in my BP-6 system, I've been experiencing intermittent network hangs. Which driver do you use? The driver in 2.4.0 contains several bugfixes. If that driver still hangs then

Re: Compatibility issue with 2.2.19pre7

2001-01-11 Thread Manfred Spraul
Trond Myklebust wrote: As for the issue of casting 'fh->data' as a 'struct knfsd' then that is a perfectly valid operation. No it isn't. fh->data is an array of characters, thus without any alignment restrictions. 'struct knfsd' begins with a pointer, thus it must be 4 or 8 byte aligned.
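
The alignment point in this message can be illustrated with a small user-space sketch. The struct `handle` and both helper names are hypothetical stand-ins, not the knfsd definitions; the sketch only assumes a C11 compiler. It shows that a character buffer carries no alignment guarantee, and that copying with memcpy() is a portable way to read a pointer-bearing struct out of such a buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a struct that begins with a pointer. */
struct handle {
    void *owner;
    uint32_t generation;
};

/* Returns 1 if p satisfies the alignment requirement of struct handle. */
int handle_is_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(struct handle)) == 0;
}

/* Portable read: copy the bytes instead of casting the char pointer. */
struct handle handle_load(const unsigned char *raw)
{
    struct handle h;
    memcpy(&h, raw, sizeof h);
    return h;
}
```

A direct cast of `raw` to `struct handle *` would only be safe when handle_is_aligned(raw) holds, which a plain character array does not promise.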

Apology for duplicates (was Re: Compatibility issue with 2.2.19pre7 (fwd))

2001-01-11 Thread Manfred Spraul
of them was unreachable for 50 minutes. It seems that sendmail resent the message to all receivers, although the first 5 deliveries were successful. One retry every 10 minutes -- 6 duplicates. Sorry, Manfred Spraul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" i

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Let's decode it: IO APIC #2.. NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 12 0FF 0F 0 1 0 1 0 1 1 91 13 0FF 0F 0 1 1 1 0 1 1 99 IRR for interrupt 19 is set, that means the IO APIC has sent the interrupt to a cpu but not yet received the corresponding EOI. That bit is read

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
[EMAIL PROTECTED] said: IRR for interrupt 19 is set, that means the IO APIC has sent the interrupt to a cpu but not yet received the corresponding EOI. OK, but couldn't we reset it by sending an extra EOI when the drivers decide that they've missed interrupts? How? You send an

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware

2001-01-12 Thread Manfred Spraul
Alan Cox wrote: Frank, could you try what happens with the NMI oopser disabled? The second major difference I'm immediately aware of is the number of the reschedule/tlb flush/etc interrupt: 2.2 uses the lowest priority, 2.4 the highest priority. Im trying to remember what they

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Frank de Lange wrote: On Fri, Jan 12, 2001 at 06:16:36PM +0100, Manfred Spraul wrote: I would first concentrate on the differences between 2.2 and 2.4: Frank, could you try what happens with the NMI oopser disabled? Here's the results with nmi_watchdog=0 After network hang

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware

2001-01-12 Thread Manfred Spraul
Ingo Molnar wrote: we *already* reorder vector numbers and spread them out as much as possible. We do this in 2.2 as well. We did this almost from day 1 of IO-APIC support. If any manually allocated IRQ vector creates a '3 vectors in the same 16-vector region' situation then thats a bug in

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Linus wrote: Does this seem to happen mainly with drivers that use "disable_irq()" and "enable_irq()"? I know the ne drivers do (through the 8390 module), and some others do too (3c59x). I removed the disable_irq lines from 8390.c, and that fixed the problem: no hang within 2 minutes - the

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Frank de Lange wrote: On Fri, Jan 12, 2001 at 08:04:24PM +0100, Manfred Spraul wrote: I removed the disable_irq lines from 8390.c, and that fixed the problem: no hang within 2 minutes - the test is still running. Frank, could you double check it? I'm currently running my own patched

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Linus Torvalds wrote: I'd like to know _which_ of the two makes a difference (or does it only trigger with both of them enabled)? And even then I'm not sure that it is "the" solution - both changes to io-apic handling had some reason for them. Ingo, what was the focus-cpu thing? Frank,

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Ingo Molnar wrote: okay - i just wanted to hear a definitive word from you that this fixes your problem, because this is what we'll have to do as a final solution. (barring any other solution.) Ingo, is that possible? The current fix is "disable_irq_nosync() and enable_irq() cause

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware related?

2001-01-12 Thread Manfred Spraul
Frank de Lange wrote: On Fri, Jan 12, 2001 at 09:54:31PM +0100, Manfred Spraul wrote: I have found one combination that doesn't hang with the unpatched 8390.c, but network throughput is down to 1/2. I hope that's due to the debugging changes. Hm, could it be that the fact that network

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware

2001-01-12 Thread Manfred Spraul
Alan Cox wrote: Could you disable both bandaids? I disabled them, no problems so far. Now back to the disable_irq_nosync(). Ok so it looks like the disable_irq code is buggy. Unfortunately its not just used for these drivers they are just the heaviest users. Given that we can see the

Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardware

2001-01-12 Thread Manfred Spraul
Frank de Lange wrote: It could be that people using those cards are not the ones who tend to go for the (somewhat tricky) BP6 board... I doubt that it's BP6 specific: I have the problem with a Gigabyte BXD board and I doubt that Ingo used a BP6. Perhaps 82093AA specific (the IO APIC chip

Call for testers: ne2k-pci and io apic (was: Re: QUESTION: Network hangs with BP6...)

2001-01-13 Thread Manfred Spraul
Russell King wrote: Doesn't the NCR53C9x SCSI drivers use disable_irq() a lot? Do they have any problems? It seems that a certain timing is necessary: one flood ping or a single ncp usually doesn't trigger any problems, but 2 concurrent flood pings hang the network after 5-10 seconds. It's

Re: Call for testers: ne2k-pci and io apic (was: Re: QUESTION: Network hangs with BP6...)

2001-01-13 Thread Manfred Spraul
It seems that no one uses a Ne2000 compatible pci NIC with a newer motherboard (every K7 board, Intel 8xx boards, via apollo pro 133), but I've set up a tiny web site that describes my problem: colorfullife.com/~manfred/io_apic -- Manfred

Re: Question regarding driver developement

2001-01-14 Thread Manfred Spraul
The only way I have found so far is to have two FIFO buffers in the driver (in and out) and use a daemon running in user space to manage the disk access. Have you thought about using mmap and raw-io? * the kernel driver allocates a fifo (probably a ring?) buffer. The driver implement

Re: Oops in rtl8139, and more

2001-01-15 Thread Manfred Spraul
The problem is clear: rtl8139_resume() unconditionally restarts the hardware, even if the network was not yet started. The hardware immediately notices something, and sends an interrupt. The oops happens during rtl8139_open(): the function calls request_irq(), but assumes that the interrupts are

[2.4.1-pre8] MPP related OPPS

2001-01-19 Thread Manfred Spraul
[Paul Mackerras and linux-ppp added to the cc list] It seems that the MPPP reconstruction queue got corrupted: ppp_mp_reconstruct() called kfree_skb(), and within kfree_skb() the call to skb->destructor() crashed: skb->destructor was 0x01010101. I reported this a few months ago without

Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)

2001-01-20 Thread Manfred Spraul
TD's are around 32 bytes big (actually, they may be 48 or even 64 now, I haven't checked recently). That's a waste of space for an entire page. However, having every driver implement its own slab cache seems a complete waste of time when we already have the code to do so in

Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)

2001-01-21 Thread Manfred Spraul
Russell King wrote: Johannes Erdfelt writes: They need to be visible via DMA. They need to be 16 byte aligned. We also have QH's which have similar requirements, but we don't use as many of them. Can we get away from the "16 byte aligned" and make it "n byte aligned"? I believe that

Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)

2001-01-21 Thread Manfred Spraul
Russell King wrote: Manfred Spraul writes: Not yet, but that would be a 2 line patch (currently it's hardcoded to BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN flag). I don't think there's a problem then. However, if slab can be told "I want 1024

Re: [PATCH] - filesystem corruption on soft RAID5 in 2.4.0+

2001-01-21 Thread Manfred Spraul
I've attached Holger's testcase (ext2, SMP, raid5) boot with "mem=64M" and run the attached script. The script creates and deletes 9 directories with 10.000 in each dir. Neil, could you run it? I don't have an raid 5 array - SMP+ext2 without raid5 is ok. Holger, what's your ext2 block size, and

Re: [PATCH] Re: Q: natsemi.c spinlocks

2001-01-22 Thread Manfred Spraul
Donald Becker wrote: However, natsemi.c's spinlock needs to be retained, and extended into start_tx(), because this driver has a race which has cropped up in a few others: ... if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { /* WINDOW HERE */
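
The queue-full check quoted in this message relies on free-running producer and consumer counters. The sketch below shows only that index arithmetic in user-space C. It does not model the race or the locking under discussion, and the value chosen for TX_QUEUE_LEN and the function names are assumptions made for illustration.

```c
#include <stdint.h>

#define TX_QUEUE_LEN 10   /* assumed value, for illustration only */

/*
 * cur_tx counts descriptors handed to the hardware, dirty_tx counts
 * descriptors already reclaimed. Both only ever increase, so unsigned
 * subtraction gives the number in flight even after wraparound.
 */
uint32_t tx_in_flight(uint32_t cur_tx, uint32_t dirty_tx)
{
    return (uint32_t)(cur_tx - dirty_tx);
}

/* Mirrors the quoted condition: full once TX_QUEUE_LEN - 1 are in flight. */
int tx_queue_full(uint32_t cur_tx, uint32_t dirty_tx)
{
    return tx_in_flight(cur_tx, dirty_tx) >= TX_QUEUE_LEN - 1;
}
```

In a driver, the result of such a check is only meaningful if dirty_tx cannot advance between the test and the action taken on it, which is the window the message points at.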

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-23 Thread Manfred Spraul
I read through the tcpdump, and it seems that Linux completely ignores packets with out-of-window sequence numbers: * the Solaris computer (dynamic...) sends further data although the Linux box (static) says 'win 0'. See lines 2067, 2069, 2076, ... 2066 16:31:43.108759 eth0 static.8664

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-23 Thread Manfred Spraul
I checked RFC793, and AFAICS Solaris is the culprit: it sends out invalid packets, Linux ignores them and thus Linux doesn't receive acks. Which Solaris version do you use? * The last valid ack from the Solaris computer is for byte 1583721, win 8760 (line 2078) * No packet after line 2078 from

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-24 Thread Manfred Spraul
Yes, Linux is __very__ not right doing this. RFC requires to accept ACK, URG and RST on any segment adjacent to window, even if window is zero. Interesting: I checked the RFC 793 and came to the conclusion that Linux is correct. ("special allowance should be made to accept valid ACKs" not
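
For readers following the disagreement, here is a hedged sketch of the segment acceptability table from RFC 793, written as plain user-space C. The function names are made up for this example, and the sketch covers only the four-row table; it does not implement the "special allowance" for ACKs, URGs and RSTs in a zero window that the thread is arguing about.

```c
#include <stdint.h>

/* True if seq lies in [start, start + len), computed modulo 2^32. */
static int seq_in_window(uint32_t start, uint32_t len, uint32_t seq)
{
    return (uint32_t)(seq - start) < len;
}

/* RFC 793 acceptability test for an incoming segment. */
int segment_acceptable(uint32_t rcv_nxt, uint32_t rcv_wnd,
                       uint32_t seg_seq, uint32_t seg_len)
{
    if (seg_len == 0) {
        if (rcv_wnd == 0)
            return seg_seq == rcv_nxt;      /* empty segment, zero window */
        return seq_in_window(rcv_nxt, rcv_wnd, seg_seq);
    }
    if (rcv_wnd == 0)
        return 0;                           /* data into a zero window */
    /* Acceptable if the first or the last octet falls inside the window. */
    return seq_in_window(rcv_nxt, rcv_wnd, seg_seq) ||
           seq_in_window(rcv_nxt, rcv_wnd, seg_seq + seg_len - 1);
}
```

Under this table a data segment sent into 'win 0' is never acceptable, while an empty segment at exactly RCV.NXT is.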

Re: Linux Post codes during runtime, possibly OT

2001-01-26 Thread Manfred Spraul
+ * + * Changed the slow-down I/O port from 0x80 to 0x19. 0x19 is a + * DMA controller scratch register. [EMAIL PROTECTED] */ What about making that a config option? default: delay with 'outb 0x80', other options could be udelay(n); (n=1,2,3) outb 0x19 0x80 is a

Re: [ANNOUNCE] Kernel Janitor's TODO list

2001-01-28 Thread Manfred Spraul
Anything which uses sleep_on() has a 90% chance of being broken. Fix them all, because we want to remove sleep_on() and friends in 2.5. Then you can add 'calling schedule() with disabled local interrupts()' to your list. -- Manfred

flush_scheduled_tasks() question

2001-01-28 Thread Manfred Spraul
Is it intentional that dummy_task is not initialized? Ok, it won't crash because the current __run_task_queue() implementation doesn't call tq->routine if it's NULL, but IMHO it's ugly. Additionally I don't like the loop in flush_scheduled_tasks(), what about replacing it with a locked semaphore

Re: [ANNOUNCE] Kernel Janitor's TODO list

2001-01-28 Thread Manfred Spraul
Arnaldo Carvalho de Melo wrote: Em Sun, Jan 28, 2001 at 05:14:37PM +0100, Manfred Spraul escreveu: Anything which uses sleep_on() has a 90% chance of being broken. Fix them all, because we want to remove sleep_on() and friends in 2.5. Then you can add 'calling schedule

Re: [ANNOUNCE] Kernel Janitor's TODO list

2001-01-28 Thread Manfred Spraul
David Woodhouse wrote: TIOCMIWAIT does restore_flags() before interruptible_sleep_on(). It's broken too. Yes, and I found a second bug: it doesn't sti() immediately after interruptible_sleep_on(), thus cli() doesn't reacquire the global irq lock -- the atomic copy won't be atomic on SMP.

Re: flush_scheduled_tasks() question

2001-01-29 Thread Manfred Spraul
David Woodhouse wrote: -static struct tq_struct dummy_task; +static struct tq_struct dummy_task /* = all zero */; That comment is superfluous - that's just C. The non-obvious part is +static struct tq_struct dummy_task; /* remains zero, run_task_queue() supports tqs.routine==NULL*/ BUT: The

Re: [patch] 2.4.0, 2.4.0-ac12: APIC lock-ups

2001-01-29 Thread Manfred Spraul
"Maciej W. Rozycki" wrote: I'll implement an 82489DX update in a few days, but for now I'd like everyone interested to test the following patch as much as possible. It applies to 2.4.0, 2.4.0-ac12 and 2.4.1-pre11 cleanly. I'm not totally convinced that this fixes all problems: No lockup,

Re: [ANNOUNCE] Kernel Janitor's TODO list

2001-01-31 Thread Manfred Spraul
Alan Cox wrote: And one more point for the Janitor's list: Get rid of superfluous irqsave()/irqrestore()'s - in 90% of the cases either spin_lock_irq() or spin_lock() is sufficient. That's both faster and more readable. Expect me to drop any submissions that do this. I'd rather take

[PATCH] new version of singlecopy pipe

2001-05-11 Thread Manfred Spraul
@@ -2,6 +2,9 @@ * linux/fs/pipe.c * * Copyright (C) 1991, 1992, 1999 Linus Torvalds + * + * Major pipe_read() and pipe_write() cleanup: Single copy, + * fewer schedules. Copyright (C) 2001 Manfred Spraul */ #include <linux/mm.h> @@ -10,6 +13,8 @@ #include <linux/slab.h> #include <linux

Re: [PATCH] new version of singlecopy pipe

2001-05-12 Thread Manfred Spraul
J . A . Magallon wrote: On 05.11 Manfred Spraul wrote: Please test it. The kernel space part should be ok, but I know that the patch can cause deadlocks with buggy user space apps. I tried your patch on 2.4.4-ac8, and something strange happens. Untarring linux-2.4.4 takes

ACPI oops with 2.4.4-ac8

2001-05-12 Thread Manfred Spraul
linux-2.4.4-ac8 old bios, no complete acpi support. from dmesg: ACPI: System description tables not found Unable to handle kernel NULL pointer dereference at virtual address 00d4 EIP: acpi_get_timer+19 Call trace: bm_initialize bm_osl_init acpi_gbl_FADT is NULL. If you

[PATCH] winbond-840 update

2001-05-12 Thread Manfred Spraul
/drivers/net/winbond-840.c Sat May 12 11:59:43 2001 @@ -32,10 +32,13 @@ synchronize tx_q_bytes software reset in tx_timeout Copyright (C) 2000 Manfred Spraul + * further cleanups + Copyright (c) 2001 Manfred Spraul

Re: [PATCH] winbond-840 update

2001-05-12 Thread Manfred Spraul
Jeff Garzik wrote: Manfred Spraul wrote: @@ -437,9 +439,9 @@ if (option > 0) { if (option & 0x200) np->full_duplex = 1; - np->default_port = option & 15; - if (np->default_port) - np

Re: [PATCH] new version of singlecopy pipe

2001-05-17 Thread Manfred Spraul
David S. Miller wrote: J . A . Magallon writes: What platform? Any more info ? No, I thought it might be some cache flushing issue on a non-x86 machine. I found the problem: I sent out the old patch :-( Attached is the correct version of patch-copy_user_user. --

[PATCH] winbond-840 update

2001-05-19 Thread Manfred Spraul
/drivers/net/winbond-840.c Fri Apr 20 20:54:23 2001 +++ build-2.4/drivers/net/winbond-840.c Sat May 19 14:14:22 2001 @@ -32,10 +32,16 @@ synchronize tx_q_bytes software reset in tx_timeout Copyright (C) 2000 Manfred Spraul + * further

[PATCH] winbond update

2001-05-25 Thread Manfred Spraul
-840.c Fri May 25 23:23:07 2001 @@ -32,12 +32,22 @@ synchronize tx_q_bytes software reset in tx_timeout Copyright (C) 2000 Manfred Spraul + * further cleanups + power management. + support for big endian

Re: [lkml]Re: interrupt problem with MPS 1.4 / not with MPS 1.1 ?

2001-05-31 Thread Manfred Spraul
I know that with MPS 1.4, the USB controller finds itself at an unshared interrupt 19. I can't reboot at the moment to check. lspci -vxxx -s 00:07.0 the APIC sits in the southbridge. the low 2 bits of offset 0x58 must be set [route USB IRQ to APIC], and lspci -vx -s 00:07.2 offset 0x3C

Re: [lkml]Re: [lkml]Re: interrupt problem with MPS 1.4 / not with MPS 1.1 ?

2001-05-31 Thread Manfred Spraul
[EMAIL PROTECTED] wrote: 00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00 [UHCI]) Subsystem: Unknown device 0925:1234 Flags: bus master, medium devsel, latency 32, IRQ 5 I/O ports at a000 [size=32] Capabilities: [80] Power

Re: interrupt problem with MPS 1.4 / not with MPS 1.1 ?

2001-06-01 Thread Manfred Spraul
[EMAIL PROTECTED] wrote: :setpci -s 00:07.2 INTERRUPT_LINE=15 :lspci -vx -s 00:07.2 00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00 [UHCI]) Subsystem: Unknown device 0925:1234 Flags: bus master, medium devsel, latency 32, IRQ 19 I/O

[PATCH] natsemi update

2001-06-02 Thread Manfred Spraul
: (Manfred Spraul) + * pci dma + * SMP locking update + * full reset added into tx_timeout + * correct multicast hash generation + [copied from a natsemi driver version +from Myrio Corporation, Greg Smith

multicast hash incorrect on big endian archs

2001-06-04 Thread Manfred Spraul
I noticed that the multicast hash calculations assumed little endian byte ordering in the winbond-840 driver, and it seems that several other drivers are also affected: 8139too, epic100, fealnx, pci-skeleton, sis900, starfire, sundance, via-rhine, yellowfin perhaps
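
One way such a byte-order dependency can arise is when a hash bit is set through a wide bitmap type and the table is later written out in narrower, register-sized pieces. The sketch below is a generic illustration, not code from any of the listed drivers: the 64-bit filter laid out as four 16-bit words and the function names are assumptions. Indexing the word and the bit explicitly gives the same table contents on little and big endian hosts.

```c
#include <stdint.h>
#include <string.h>

#define HASH_WORDS 4   /* assumed layout: 64-bit filter as four 16-bit words */

/* Reset every filter word to zero. */
void hash_filter_clear(uint16_t filter[HASH_WORDS])
{
    memset(filter, 0, HASH_WORDS * sizeof filter[0]);
}

/*
 * Set hash bit 0..63. The word index and the bit within the word are
 * computed explicitly, so the result does not depend on how the host
 * stores wider integers in memory.
 */
void hash_filter_set(uint16_t filter[HASH_WORDS], unsigned int bit)
{
    bit &= 63;
    filter[bit >> 4] |= (uint16_t)(1u << (bit & 15));
}
```

Each `filter[i]` can then be written to the corresponding 16-bit register without any further byte swapping in the hash logic itself.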

RE: usb-uhci forgets to destroy kmem entries

2000-09-15 Thread Manfred Spraul
+#ifdef DEBUG_SLAB + if (retval 0 ) { + if(kmem_cache_destroy(uhci_desc_kmem)) Why only #ifdef DEBUG_SLAB? AFAICS the driver should always destroy its slab cache. Please cc, I'm not subscribed to linux-kernel. -- Manfred

Kernel 2.4.2 - kernel BUG at apic.c:220!

2001-02-24 Thread Manfred Spraul
kernel BUG at apic.c:220! From apic.c: /* * Double-check wether this APIC is really registered. */ if (!test_bit(GET_APIC_ID(apic_read(APIC_ID)), phys_cpu_present_map)) BUG(); Really odd. That's usually a sign of a bad MP table. Could you

Re: kernel lock contention and scalability

2001-02-25 Thread Manfred Spraul
Jonathan Lahr wrote: To discover possible locking limitations to scalability, I have collected locking statistics on a 2-way, 4-way, and 8-way performing as networked database servers. I patched the [48]-way kernels with Kravetz's multiqueue patch in the hope that mitigating runqueue_lock

[PATCH][CFT] per-process namespaces for Linux

2001-02-25 Thread Manfred Spraul
* large cleanup of boot process (ramdisk handling, etc.) Have you thought about supporting .tar.gz into ramfs? Creating custom boot images would be simpler. -- Manfred

oops followed by kernel BUGs

2001-02-25 Thread Manfred Spraul
When I woke today I found I'd gotten the following oops, kernel: EIP:0010:[bdput+5/96] Code; _EIP: 0: f0 ff 4b 08 lock decl 0x8(%ebx) Call Trace: [clear_inode+194/220] [dispose_list+59/84] kernel: eax: 0002 ebx: 0002 ecx: ca58a648 edx: c15ddfa4

Re: PROBLEM: Network hanging - Tulip driver with Netgear (Lite-On)

2001-02-26 Thread Manfred Spraul
I think I found the bug: Someone (Jeff?) removed the line tp->advertising[phy_idx++] = reg4; from tulip/tulip_core.c pnic_check_duplex uses that variable :-( There are 2 workarounds: * change pnic_check_duplex: s/tp->advertising[0]/tp->mii_advertise/g * remove the new mii_advertise

Re: PROBLEM: Network hanging - Tulip driver with Netgear (Lite-On)

2001-02-26 Thread Manfred Spraul
Jeff Garzik wrote: Pat, Manfred, in pnic_check_duplex, make this change: -negotiated = mii_reg5 & tp->advertising[0]; +negotiated = mii_reg5 & tulip_mdio_read(dev, tp->phys[0], 4); The change fixed the problem. Manfred Spraul wrote: I think I found the bug: Someone

[PATCH] minor bug in ipc/sem.c

2001-02-27 Thread Manfred Spraul
try_atomic_semop() corrupts the process id associated with a semaphore if a semaphore operation with semval==0 (i.e. wait until the semaphore value becomes zero) blocks. I've attached a patch against 2.4.2-ac4, it also applies to 2.4.2 -- Manfred --- 2.4/ipc/sem.c Mon Feb 26

Re: PROBLEM: Kernel bug in inode.c:885 when floppy disk removed

2001-02-28 Thread Manfred Spraul
Alexander Viro wrote: - Doctor, it hurts when I do it! - Don't do it, then. Interesting bugfix: have you checked which BUG was triggered? It's a bug in ext2_free_inode(): if an I/O error occurs, then clear_inode() is not called, but super_operation.delete_inode() must call clear_inode()

Re: paging behavior in Linux

2001-02-28 Thread Manfred Spraul
When I run my program on a Red Hat Linux machine, I don't get results as expected; the work thread seems to be stuck when the prefetch thread is waiting on a page fault. That's a known problem: The paging io for a process is controlled with a per-process semaphore. The semaphore is held while

Q: explicit alignment control for the slab allocator

2001-03-01 Thread Manfred Spraul
Alan added a CONFIG option for FORCED_DEBUG slab debugging, but there is one minor problem with FORCED_DEBUG: FORCED_DEBUG disables HW_CACHEALIGN, and several drivers assume that HW_CACHEALIGN implies a certain alignment (iirc usb/uhci.c assumes 16-byte alignment) I've attached a patch that

Re: Q: explicit alignment control for the slab allocator

2001-03-01 Thread Manfred Spraul
Mark Hemment wrote: The original idea behind offset was for objects with a "hot" area greater than a single L1 cache line. By using offset correctly (and to my knowledge it has never been used anywhere in the Linux kernel), a SLAB cache creator (caller of kmem_cache_create()) could ask

Re: Q: explicit alignment control for the slab allocator

2001-03-01 Thread Manfred Spraul
Mark Hemment wrote: On Thu, 1 Mar 2001, Manfred Spraul wrote: Mark Hemment wrote: The original idea behind offset was for objects with a "hot" area greater than a single L1 cache line. By using offset correctly (and to my knowledge it has never been use

Re: Q: explicit alignment control for the slab allocator

2001-03-02 Thread Manfred Spraul
Quoting Mark Hemment [EMAIL PROTECTED]: In which cases an offset alignment is really a win? You've got me. :) I don't know. In the Bonwick paper, such a facility was described, so I thought "hey, sounds like that might be useful". Could be a win on archs with small L1 cache

Re: Q: explicit alignment control for the slab allocator

2001-03-02 Thread Manfred Spraul
Mark Hemment wrote: Hmm, on that note, seen the L1 line size defined for a Pentium ? 128 bytes!! (CONFIG_X86_L1_CACHE_SHIFT of 7). That is probably going to waste a lot of space for small objects. No, it doesn't: HWCACHE_ALIGN means "do not cross a cache line boundary".
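
The "do not cross a cache line boundary" reading can be made concrete with a short sketch. This is an illustration of the idea, not the code in mm/slab.c: the function names are invented, and the rule assumed here is that the alignment is halved for as long as the object still fits into half of it, so several small objects can share one line without any of them straddling two lines.

```c
#include <stddef.h>

/*
 * Pick the smallest power-of-two alignment (not larger than the line
 * size) that still holds the whole object. line_size must be a power
 * of two.
 */
size_t effective_align(size_t obj_size, size_t line_size)
{
    size_t align = line_size;
    while (align > 1 && obj_size <= align / 2)
        align /= 2;
    return align;
}

/* Round the object size up to a multiple of its effective alignment. */
size_t padded_size(size_t obj_size, size_t line_size)
{
    size_t align = effective_align(obj_size, line_size);
    return (obj_size + align - 1) & ~(align - 1);
}
```

With a 128-byte line, a 20-byte object would get 32-byte slots under this rule, so four objects share a line instead of each one consuming 128 bytes.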

Re: Q: explicit alignment control for the slab allocator

2001-03-01 Thread Manfred Spraul
"David S. Miller" wrote: Manfred, why are you changing the cache alignment to SMP_CACHE_BYTES? If you read the original SLAB papers and other documents, the code intends to color the L1 cache not the L2 or subsidiary caches. I'll undo that change. I only found this comment in the source

Re: [Re: paging behavior in Linux]

2001-03-02 Thread Manfred Spraul
Neelam Saboo wrote: hi, After I installed a newer version of Kernel (2.4.2) and enable DMA option in hardware configuration, the behavior changes. I can see performance improvements when another thread is used. Also, i can see timing overlaps between two threads. i.e. when one thread is

Re: PROBLEM: Network hanging - Tulip driver with Netgear (Lite-On)

2001-03-02 Thread Manfred Spraul
Jeff Garzik wrote: Manfred Spraul wrote: Could you double check the code in tulip_core.c, around line 1450? IMHO it's bogus. 1) if the network card contains multiple mii's, then the advertised value of all mii's is changed to the advertised value of the first mii. I'm really

Re: [prepatches] removal of console_lock

2001-03-04 Thread Manfred Spraul
- Major revamp of printk(). The approach taken in printk() is to try to acquire the (new) console_sem. If we succeed, the output is placed into the log buffer and is printed to the consoles. If we fail to acquire the semaphore we just buffer the output in the log buffer and the

Re: kmalloc() alignment

2001-03-04 Thread Manfred Spraul
Does kmalloc() make any guarantees of the alignment of allocated blocks? Will the returned block always be 4-, 8- or 16-byte aligned, for example? 4-byte alignment is guaranteed on 32-bit cpus, 8-byte alignment on 64-bit cpus. -- Manfred

Re: SLAB vs. pci_alloc_xxx in usb-uhci patch

2001-03-05 Thread Manfred Spraul
And mm/slab.c changes semantics when CONFIG_SLAB_DEBUG is set: it ignores SLAB_HWCACHE_ALIGN. That seems more like the root cause of the problem to me! HWCACHE_ALIGN does not guarantee a certain byte alignment. And additionally it's not even guaranteed that kmalloc() uses that HWCACHE_ALIGN.

Re: SLAB vs. pci_alloc_xxx in usb-uhci patch

2001-03-06 Thread Manfred Spraul
David Brownell wrote: There are two problems I see. (1) CONFIG_SLAB_DEBUG breaks the documented requirement that the slab cache return adequately aligned data ... adequately aligned for the _cpu_, not for some controllers. It's neither documented that HW_CACHEALIGN aligns to 16 byte

Re: Mapping a piece of one process' addrspace to another?

2001-03-07 Thread Manfred Spraul
pipe_read() and pipe_write() cleanup, kiobuf based + * single copy Copyright (C) 2001 Manfred Spraul */ #include <linux/mm.h> @@ -10,6 +13,8 @@ #include <linux/malloc.h> #include <linux/module.h> #include <linux/init.h> +#include <linux/iobuf.h> +#include <linux/highmem.h> #include

Re: Hashing and directories

2001-03-07 Thread Manfred Spraul
Jamie wrote: Linus Torvalds wrote: The long-term solution for this is to create the new VM space for the new process early, and add it to the list of mm_struct's that the swapper knows about, and then just get rid of the pages[MAX_ARG_PAGES] array completely and instead just populate

Re: Kernel 2.4.2 command execution hangs and then succeded after 2 minutes....!? STRACE-DUMP

2001-03-07 Thread Manfred Spraul
- Original Message - From: "Andrea Barisani" [EMAIL PROTECTED] To: "Manfred Spraul" [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Wednesday, March 07, 2001 3:03 PM Subject: Re: Kernel 2.4.2 command execution hangs and then succeded after 2 minutes!? STRACE-DUMP

Re: Hashing and directories

2001-03-07 Thread Manfred Spraul
From: "Jamie Lokier" [EMAIL PROTECTED] Manfred Spraul wrote: I'm not sure that this is the right way: It means that every exec() must call dup_mmap(), and usually only to copy a few hundred bytes. But I don't see a sane alternative. I won't propose to create a temporary file i

BUG? race between kswapd and ptrace (access_process_vm )

2001-03-07 Thread Manfred Spraul
Is kswapd now running without lock_kernel()? Then there is a race between swapout and ptrace: access_process_vm() accesses the page table entries, only protected with the mmap_sem semaphore and lock_kernel(). Isn't spin_lock(&mm->page_table_lock); missing in access_one_page() [in

Re: Q: explicit alignment control for the slab allocator

2001-03-07 Thread Manfred Spraul
From: "Jes Sorensen" [EMAIL PROTECTED] "Manfred" == Manfred Spraul [EMAIL PROTECTED] writes: Manfred Mark Hemment wrote: As no one uses the feature it could well be broken, but is that a reason to change its meaning? Manfred Some hardware drivers use HW_CACHEALIG

flush_page_to_ram() question in kernel/ptrace.c

2001-03-08 Thread Manfred Spraul
From linux/kernel/ptrace.c, access_one_page(): flush_cache_page(vma, addr); if (write) { maddr = kmap(page); memcpy(maddr + (addr & ~PAGE_MASK), buf, len); flush_page_to_ram(page); flush_icache_page(vma, page);

Re: BUG? race between kswapd and ptrace (access_process_vm )

2001-03-08 Thread Manfred Spraul
Rik van Riel wrote: On Wed, 7 Mar 2001, Manfred Spraul wrote: Is kswapd now running without lock_kernel()? Indeed ... Then there is a race between swapout and ptrace: access_process_vm() accesses the page table entries, only protected with the mmap_sem semaphore and lock_kernel

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread Manfred Spraul
Manfred Spraul */ #include <linux/mm.h> @@ -10,6 +13,8 @@ #include <linux/slab.h> #include <linux/module.h> #include <linux/init.h> +#include <linux/iobuf.h> +#include <linux/highmem.h> #include <asm/uaccess.h> #include <asm/ioctls.h> @@ -36,97 +41,149 @@ down(PIPE_SEM(*inode)); } +struct

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread Manfred Spraul
From: [EMAIL PROTECTED] Hello! * davem's patch breaks apps that assume that write(,PIPE_BUF) after poll(POLLOUT) never blocks, even for blocking pipes. Pardon, but PIPE_BUF = PAGE_SIZE yet, so that fears have no reasons. The difference is the = davem's patch + if (count =

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread Manfred Spraul
From: [EMAIL PROTECTED] PS BTW "all unix" is unlikely to include freebsd. 8) freebsd, openbsd, netbsd, tru64, openvms - all unix versions I found free telnet guest accounts for. Running for cover, Manfred

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread Manfred Spraul
From: [EMAIL PROTECTED] freebsd Very funny, the idea is borrowed from there. As you could understand your patch kills it. PAGE_SIZE is one of the most frequently used transfer unit. freebsd-4.0 doesn't use direct transfers for PAGE_SIZE'd pipe write()s: it uses MINDIRECT=8192. (and

Re: system hang with __alloc_page: 1-order allocation failed

2001-03-13 Thread Manfred Spraul
Maybe it would be good to lower the default threads-max to about 10% or less of physical memory ? And MIN_THREADS_FOR_ROOT should be reintroduced: the define is still there, but the actual code is missing. I've attached an older patch that: * reintroduces MIN_THREADS_FOR_ROOT (or remove

Re: system hang with __alloc_page: 1-order allocation failed

2001-03-13 Thread Manfred Spraul
From: "Chris Evans" [EMAIL PROTECTED] I thought (on Intel) there was a 4092 hard limit? That's the 2.2 limit, it's gone. The new limit is total memory and pid space. The pid's are intentionally limited to 15 bits, the remaining bits are reserved. In the worst case one running process can

Re: [OOPS] 8139too

2001-03-14 Thread Manfred Spraul
Hello LKML! i686 2.4.2 UP+kdb+lm_sensors+pcmcia after APM laptop suspend to disk 8139too is build-in, not pcmcia I often get hangups after suspend-to-disk if I'm connected to a hub/switch. This is the first oops I've actually seen and copied it by hand: I remember a similar bug report.

Re: Performance is weird (fwd)

2001-03-15 Thread Manfred Spraul
One difference between idle and a running user space app is that the kernel-user space return path checks for pending softirqs, but the idle thread doesn't. Perhaps cpu_idle() should also check for pending softirq's before hlt'ing? idle thread is running. * hw interrupt * * hw interrupt handler

Re: Performance is weird (fwd)

2001-03-15 Thread Manfred Spraul
I've attached a patch. I tried to trigger the problem with my 10 MBit ne2k-pci connection, but without success. Could you try it? I've tested it with -ac17, and it applies to 2.4.2 cleanly. -- Manfred --- 2.4/arch/i386/kernel/process.c Thu Feb 22 22:28:52 2001 +++

Re: changing mm->mmap_sem (was: Re: system call for process information?)

2001-03-18 Thread Manfred Spraul
The problem is that mmap_sem seems to be protecting the list of VMAs, so taking _only_ the page_table_lock could let a VMA change under us while a page fault is underway ... No, that can't happen. VMA changes only happen if both the mmap_sem and the page table lock are acquired. (check

Re: [CHECKER] blocking w/ spinlock or interrupt's disabled

2001-03-18 Thread Manfred Spraul
enclosed are 163 potential bugs in 2.4.1 where blocking functions are called with either interrupts disabled or a spin lock held. The checker works by: Here's the file manifest. Apologies. drivers/atm/idt77105.c [...] drivers/char/cyclades.c Unfortunately schedule() with disabled

Re: Question about memory usage in 2.4 vs 2.2

2001-03-21 Thread Manfred Spraul
inode_cache 189974 243512 480 30439 30439 1 : 124 62 dentry_cache 201179 341940 128 11398 11398 1 : 252 126 1) number of used objects 2) number of allocated objects 3) size of each object 4) number of slabs that are at least partially in use 5) number of slabs that are allocated for the cache
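The field list above can be sketched as a small parser. This is only an illustration: the names `SlabLine`, `parse_slab_line`, `object_bytes` and `utilization` are hypothetical, and only the first five numeric columns described in the message are interpreted. The remaining columns are left alone because the snippet does not explain them.

```python
from typing import NamedTuple


class SlabLine(NamedTuple):
    """First five numeric columns of one slabinfo line, as listed in the message."""
    name: str
    used_objs: int    # 1) number of used objects
    total_objs: int   # 2) number of allocated objects
    obj_size: int     # 3) size of each object, in bytes
    used_slabs: int   # 4) slabs that are at least partially in use
    total_slabs: int  # 5) slabs that are allocated for the cache


def parse_slab_line(line: str) -> SlabLine:
    """Parse the cache name and the five documented columns; ignore the rest."""
    fields = line.split()
    numbers = [int(value) for value in fields[1:6]]
    return SlabLine(fields[0], *numbers)


def object_bytes(slab: SlabLine) -> int:
    """Bytes covered by allocated objects (excludes any per-slab overhead)."""
    return slab.total_objs * slab.obj_size


def utilization(slab: SlabLine) -> float:
    """Fraction of allocated objects that are actually in use."""
    return slab.used_objs / slab.total_objs


if __name__ == "__main__":
    samples = [
        "inode_cache 189974 243512 480 30439 30439 1 : 124 62",
        "dentry_cache 201179 341940 128 11398 11398 1 : 252 126",
    ]
    for text in samples:
        slab = parse_slab_line(text)
        print(slab.name, object_bytes(slab), f"{utilization(slab):.0%}")
```

Run against the two quoted lines, this makes it easy to see how much memory each cache holds in objects and how many of those objects are actually in use.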

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Manfred Spraul
sct wrote: We've already got measurements showing how insane this is. Raw IO requests, plus internal pagebuf contiguous requests from XFS, have to get broken down into page-sized chunks by the current ll_rw_block() API, only to get reassembled by the make_request code. It's *enormous*

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-10 Thread Manfred Spraul
In user space, how do you know when it's safe to reuse the buffer that was handed to sendmsg() with the MSG_NOCOPY flag? Or does sendmsg() with that flag block until the buffer isn't needed by the kernel any more? If it does block, doesn't that defeat the use of non-blocking I/O?

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-10 Thread Manfred Spraul
Ingo Molnar wrote: On Wed, 10 Jan 2001, Manfred Spraul wrote: That means sendmsg() changes the page tables? I measured smp_call_function on my Dual Pentium 350, and it took around 1950 cpu ticks. well, this is a performance problem if you are using threads. For normal processes

Re: filp_open() in 2.2.19 causes memory corruption

2001-04-23 Thread Manfred Spraul
Are you sure the trace is decoded correctly? CPU:0 EIP:0010:[sys_mremap+31/884] EFLAGS: 00010206 Code: ac ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 75 d9 ac ae is lodsb scasb Could you run #objdump --disassemble-all --reloc linux/mm/mremap.o | less and check that the

Re: [PATCH] Longstanding elf fix (2.4.3 fix)

2001-04-23 Thread Manfred Spraul
Well looking a little more closely than I did last night it looks like access_process_vm (called from ptrace) can cause what amounts to a page fault at pretty arbitrary times. It's also used for several /proc/pid files. I remember that I got crashes with concurrent exec+cat /proc/pid/cmdline

Re: Severe trashing in 2.4.4

2001-04-29 Thread Manfred Spraul
On Sun, Apr 29, 2001 at 01:58:52PM -0400, Alexander Viro wrote: Hmm... I'd say that you also have a leak in kmalloc()'ed stuff - something in 1K--2K range. From your logs it looks like the thing never shrinks and grows pretty fast... You could enable STATS in mm/slab.c, then the number of

Re: AC'97 (VT82C686A)

2001-04-30 Thread Manfred Spraul
Observe that the PCI DWORD (long) register at DWORD offset 15 consists of 4 byte-wide registers (from the PCI specification), Max_lat, Min_Gnt, Interrupt pin, and interrupt line. Nothing has to fit into 4 bits, you have 8 bits. I haven't looked at the Linux code, but if it provides only 4
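The register layout described above can be sketched as follows. This is a minimal illustration, assuming the DWORD has already been read from configuration space as a 32-bit little-endian value. The function name `unpack_dword15` and the sample value are hypothetical and do not come from any driver.

```python
DWORD_INDEX = 15               # DWORD offset named in the message
BYTE_OFFSET = DWORD_INDEX * 4  # same register expressed as a byte offset (0x3C)


def unpack_dword15(value: int) -> dict:
    """Split the 32-bit register into its four byte-wide fields.

    The lowest byte is Interrupt Line and the highest byte is Max_Lat,
    so each field has a full 8 bits available.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    return {
        "interrupt_line": value & 0xFF,
        "interrupt_pin": (value >> 8) & 0xFF,
        "min_gnt": (value >> 16) & 0xFF,
        "max_lat": (value >> 24) & 0xFF,
    }


if __name__ == "__main__":
    # Made-up sample value, for illustration only.
    fields = unpack_dword15(0x0A08010B)
    print(hex(BYTE_OFFSET), fields)
```

Each field is masked with 0xFF, which reflects the point made in the message: nothing needs to fit into 4 bits.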

Re: Followup to previous post: Atlon/VIA Instabilities

2001-05-01 Thread Manfred Spraul
So it seems that CONFIG_X86_USE_3DNOW is simply used to enable access to the routines in mmx.c (the athlon-optimized routines on CONFIG_K7 kernels), so then it appears that somehow this is corrupting memory / not behaving as it should (very technical, right?) :)... Do you use any
