Channel bonding kernel crash, workaround

2001-03-22 Thread Pfenniger Daniel

Hi, 

Up to the latest kernels (through 2.4.2), channel bonding crashes the kernel
(ouch...) when it is turned off (e.g. at reboot).  

Here is a way to avoid this, which might help gurus track down the bug. 
Suppose ifconfig says: 
 
bond0 Link encap:Ethernet  HWaddr 00:40:05:A1:C4:13  
  inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:823297 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0 

eth0  Link encap:Ethernet  HWaddr 00:40:05:A1:C4:13  
  inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:487424 errors:0 dropped:0 overruns:0 frame:0
  TX packets:411649 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:100 
  Interrupt:19 Base address:0xd000 

eth1  Link encap:Ethernet  HWaddr 00:40:05:A1:C4:13  
  inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:487526 errors:0 dropped:0 overruns:0 frame:0
  TX packets:411648 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:100 
  Interrupt:18 Base address:0xb800 

then the exact sequence:

ifconfig eth0 down 
ifconfig eth1 down 
ifconfig bond0 down 
ifconfig eth0 down 
ifconfig eth1 down 

turns off bond0 without a crash.  This was tested on several computers, all 
running kernel 2.4.2 on SMP Pentium II with tulip 21140 and/or 21142/3 NICs.
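
The sequence above could be scripted roughly as follows. This is only a sketch: the helper name `bonding_shutdown_commands` is made up for illustration, and it merely builds the command list from the post instead of running anything.

```python
def bonding_shutdown_commands(master, slaves):
    """Return the ifconfig calls in the order given in the post:
    slaves down, master down, then the slaves down once more."""
    slaves_down = [f"ifconfig {s} down" for s in slaves]
    return slaves_down + [f"ifconfig {master} down"] + slaves_down


# Interface names taken from the ifconfig listing in the post.
for cmd in bonding_shutdown_commands("bond0", ["eth0", "eth1"]):
    print(cmd)
```

Keeping the order in one place makes it harder to bring the master down first by accident, which is the case the post reports as crashing.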

Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



athlon+2.2+pdc20267=hang?

2001-03-22 Thread Derrick J Brashear

Earlier today I swapped an Athlon (tbird) 850 and an Epox 8KTA3 in for the 
dual Celeron I had, moving all the cards into the new system. One of these 
was a Promise PDC20267 with four 40 GB disks attached. The machine would not 
boot; I assumed it was the i686-smp kernel and installed a Redhat 
7.0-provided i386 kernel. Several hours and a dozen or so boots later, it 
looks like when the BIOS on the PDC20267 is installed, the system hangs 
while booting at the point where it would probe C/H/S from the devices 
attached to the PDC20267 (they've already been identified by that point).

Short and sweet: The goal is to run 2.2.18+ide.2.2.18.02122001 with the 
PDC20267 installed and all the disks attached; I assume this is possible 
and I'm missing something obvious, since the controller and disks worked in 
the dual Celeron, and I'm hoping someone has an idea what I'm missing.

Long and boring:
-Tried 2.2.18+ide.2.2.18.1209 built for i386, i586, i586tsc, and i686 
uniprocessor, all resulted in a system hang at the time when hd{e,f,g,h} 
would be probed and C/H/S info printed

-Tried 2.2.18+ide.2.2.18.02122001 built for i586, same result.
-Removed PDC20267, above kernel worked
-Attached PDC20267 and removed disks, above kernel worked
-Tried each of the 2 buses in turn, failed as above
-Tried 2.2.17 without ide patch, specifying ide2=0xac00,0xb002 at boot 
time, failed as above

Based on this, and noting that the PDC20267 didn't install its BIOS when no 
devices were attached to it, I assume that is where the conflict lies.

Device identifies itself thus:
  Bus  0, device   8, function  0:
Unknown mass storage controller: Promise Technology Unknown device (rev 2).
  Vendor id=105a. Device id=4d30.
  Medium devsel.  IRQ 10.  Master Capable.  Latency=32.
  I/O at 0xac00 [0xac01].
  I/O at 0xb000 [0xb001].
  I/O at 0xb400 [0xb401].
  I/O at 0xb800 [0xb801].
  I/O at 0xbc00 [0xbc01].




Re: Linux 2.4.2-ac21

2001-03-22 Thread Geert Uytterhoeven

On Thu, 22 Mar 2001, Alan Cox wrote:
> 2.4.2-ac21
> o atyfb mode updates for powermac (Olaf Hering)

60 Hz modes should be marked 60 Hz.
Add separator comment.

--- linux-2.4.2-ac21/drivers/video/macmodes.c.orig  Fri Mar 23 08:17:54 2001
+++ linux-2.4.2-ac21/drivers/video/macmodes.c   Fri Mar 23 08:37:27 2001
@@ -96,11 +96,11 @@
FB_SYNC_HOR_HIGH_ACT|FB_SYNC_VERT_HIGH_ACT, FB_VMODE_NONINTERLACED
 }, {
/* 1152x768, 60 Hz, Titanium PowerBook */
-   "mac21", 75, 1152, 768, 15386, 158, 26, 29, 3, 136, 6,
+   "mac21", 60, 1152, 768, 15386, 158, 26, 29, 3, 136, 6,
FB_SYNC_HOR_HIGH_ACT|FB_SYNC_VERT_HIGH_ACT, FB_VMODE_NONINTERLACED
 }, {
/* 1600x1024, 60 Hz, Non-Interlaced (112.27 MHz dotclock) */
-   "mac22", 75, 1600, 1024, 8908, 88, 104, 1, 10, 16, 1,
+   "mac22", 60, 1600, 1024, 8908, 88, 104, 1, 10, 16, 1,
FB_SYNC_HOR_HIGH_ACT|FB_SYNC_VERT_HIGH_ACT, FB_VMODE_NONINTERLACED
 }
 
@@ -162,6 +162,7 @@
 { VMODE_1024_768_75V, &mac_modedb[9] },
 { VMODE_1024_768_70, &mac_modedb[8] },
 { VMODE_1024_768_60, &mac_modedb[7] },
+/* 1152x768 */
 { VMODE_1152_768_60, &mac_modedb[14] },
 /* 1152x870 */
 { VMODE_1152_870_75, &mac_modedb[11] },

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds




Re: Linux 2.4.2 fails to merge mmap areas, 700% slowdown.

2001-03-22 Thread Mike Galbraith

On 22 Mar 2001, Kevin Buhr wrote:

> Mike Galbraith <[EMAIL PROTECTED]> writes:
> >
> >       2.4.2.ac20.virgin   2.4.3-pre6
> > real  11m0.708s           11m58.617s
> > user  15m8.720s           7m29.970s
> > sys   1m31.410s           0m41.590s
> >
> > It looks like ac20 is doing some double accounting.

[snip]

> Mike, would you like to try out the following (untested) patch against
> vanilla ac20 to see if it does the trick?

Yes, that fixed it.

-Mike





Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Rik van Riel

On 22 Mar 2001, Michael Peddemors wrote:

> Hear, hear.. killing qmail on a server whose sole task is running mail
> doesn't seem to make much sense either..

I won't defend the current OOM killing code.

Instead, I'm asking everybody who's unhappy with the
current code to come up with something better.

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




use the kernel to change an irq?

2001-03-22 Thread Jacob Luna Lundberg


Oh Great Gurus:

I have an agp video card that seems quite picky about interrupts, and a
bios that is insisting on sharing the video card's interrupt with whatever
is in the first pci slot.  So my question is, is there any way for the
kernel to more or less say ``screw you'' to the bios and pick the irq for
the video card itself?  I have a spare irq I'd love for it to use...

Oh, almost forgot:  Yes, I'd just vacate the pci slot below the video
card, but sadly all my pci slots are in use.  :(

Ok, I'll admit the card is an nVidia card and I'm trying to use the (evil)
binary drivers.  But note I'm *not* asking for help with that directly.
I'm merely asking if there's a way to avoid sharing the interrupt...

Thanks Muchly,
-Jacob

-- 

The authoritarian attitude has to be fought wherever
you find it, lest it smother you and other hackers.

 - Eric S. Raymond




Re: /linuxrc query

2001-03-22 Thread Amit D Chaudhary

Hi,

As a side note, we keep our rootfs on flash as a tar.gz; /linuxrc reads it and 
extracts it onto a ramfs before doing a pivot_root. 
To summarize, pivot_root has been a life saver, as the earlier real_root_dev 
mechanism might not have been useful in this case.
We are not using the ramfs limits for now, but will do so soon.

Thanks
Amit

Werner Almesberger wrote:

> Amit D Chaudhary wrote:
> 
>> what does redirecting stdin\stdout\stderr to dev/console achieve? I thought 
>> since the root is now the "new" root, dev/console will be used automatically?
> 
> 
> No, you would continue using the file descriptors which are already
> open, i.e. on /dev/console on the old root.
> 
> 
>> Also, why chroot, why not call init directly?
> 
> 
> To make sure the root of the current process is indeed changed.
> pivot_root currently forces a chroot on all processes (except the
> ones that have explicitly moved out of /) in order to move all the
> kernel threads too, but this is not a nice solution. Once a better
> solution is implemented for the kernel threads, we might drop the
> forced chroot, and then the explicit chroot here becomes important.
> 
> 
>> Since the above never returns, what follows is not freed.
> 
> 
> You can run them later, e.g. /etc/rc.d/rc.local
> Or, if you need the space immediately, make "what-follows" a
> script that first frees them, and then exec's init.
> 
> - Werner




Re: /linuxrc query

2001-03-22 Thread Amit D Chaudhary

Hi,

Thanks for the response. Please see below.

Werner Almesberger wrote:

> Amit D Chaudhary wrote:
> 
> No, you would continue using the file descriptors which are already
> open, i.e. on /dev/console on the old root.
OK, that makes sense. And the child processes that follow will then use the new fds.
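
Werner's point about already-open descriptors can be illustrated with a loose user-space analogy. This sketch does not call pivot_root; it only assumes a POSIX filesystem and shows that a descriptor keeps referring to the file it was opened on, even after the same path has come to name a different file. The helper name `write_after_path_swap` is hypothetical.

```python
import os
import tempfile


def write_after_path_swap():
    """Open a file, make the same path name a new file, then write
    through the old descriptor and report where the data ended up."""
    with tempfile.TemporaryDirectory() as top:
        path = os.path.join(top, "console")
        with open(path, "w") as old:
            os.rename(path, path + ".old")   # the path no longer names 'old'
            with open(path, "w") as new:     # a fresh file now owns the path
                new.write("new")
            old.write("old")                 # still lands in the renamed file
        with open(path) as f:
            current = f.read()
        with open(path + ".old") as f:
            previous = f.read()
        return current, previous


print(write_after_path_swap())
```

In the same way, a /linuxrc that does not reopen dev/console keeps talking to the console device node on the old root.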

>> Also, why chroot, why not call init directly?
> 
> 
> To make sure the root of the current process is indeed changed.
> pivot_root currently forces a chroot on all processes (except the
> ones that have explicitly moved out of /) in order to move all the
> kernel threads too, but this is not a nice solution. Once a better
> solution is implemented for the kernel threads, we might drop the
> forced chroot, and then the explicit chroot here becomes important.
So it is not a requirement currently, but it is useful to keep the script 
independent of the current pivot_root implementation.


> You can run them later, e.g. /etc/rc.d/rc.local
> Or, if you need the space immediately, make "what-follows" a
> script that first frees them, and then exec's init.
Sure, I will put them in a script that does that. I had left them in /linuxrc as I 
thought that's what initrd.txt suggested. But other information in 
initrd.txt suggests otherwise, hence the query here.

I am assuming umount, and thereby blockdev, after pivot_root and before "chroot 
. init ..." don't make sense, as files (dev/console among others) are, or might 
still be, in use.

Best Regards
Amit





Re: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Dave Zarzycki

On Thu, 22 Mar 2001, Alan Cox wrote:

> This is commonly done using the speedstep feature on intel cpus. Speedstep
> can generate events so the OS knows about it but Intel are not telling
> people about how this works.
<...snip...>
> We certainly could recalibrate the clock if we could get events out of
> ACPI, APM or some other source.

Specific events for Speedstep on/off would be nice, but in practice, can
we re-calibrate whenever there is a change in the power status (on
battery, charging, etc.)?
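
To see why a recalibration would matter, here is a rough model of a delay loop that counts CPU cycles. The clock rates are made-up numbers and `actual_delay_ms` is a hypothetical helper, not kernel code.

```python
def actual_delay_ms(requested_ms, calibrated_hz, current_hz):
    """Wall-clock duration of a cycle-counting delay loop that was
    calibrated at calibrated_hz but is now running at current_hz."""
    cycles = requested_ms * calibrated_hz / 1000.0
    return cycles / current_hz * 1000.0


# Hypothetical: calibrated on battery at 500 MHz, later running at 650 MHz.
print(round(actual_delay_ms(10, 500e6, 650e6), 2))
```

Under these assumptions a requested 10 ms delay finishes early, which is the dangerous direction for drivers that rely on a minimum wait.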

davez

-- 
Dave Zarzycki
http://thor.sbay.org/~dave/




Re: PATCH against 2.4.2: TTY hangup on PPP channel corrupts kernel memory

2001-03-22 Thread Kevin Buhr

Paul Mackerras <[EMAIL PROTECTED]> writes:
> 
> I didn't realize you were talking about linux 2.4.0 and pppd 2.3.11.

That was my stupid oversight.  I carefully tested and retested the
patch under 2.4.0-test5, then ported it to 2.4.2 and sent it off with
only a cursory check using the new pppd [smack forehead here].

> > In particular, the comment above "ppp_asynctty_close" is misleading.
> > It's true that the TTY layer won't call any further line discipline
> > entries while the "close" is executing; however, there may be
> > processes already sleeping in line discipline functions called before
> > the hangup.  For example, "ppp_asynctty_close" could be called while
> > we sleep in the "get_user" in "ppp_channel_ioctl" (called from
> > "ppp_asynctty_ioctl").  Therefore, calling "PPPIOCATTACH" on an
> > unattached PPP-disciplined TTY could, in unlikely circumstances
> > (argument swapped out), lead to a crash.
> 
> Yuck.  I don't see that we can protect against this without having
> some sort of lock in the tty structure, though.  We can't protect the
> existence of the channel structure with a lock inside that structure.
> Ideally the necessary protection would be provided at the tty level.

Well, the closer I look at line discipline locking the less I think I
understand it.  I can't even see what prevents an "ldisc.close"
function from being called when an "ldisc.open" is sleeping on a
memory allocation.

Can someone help me understand?

When changing line disciplines, "sys_ioctl" gets the big kernel lock
for us, and the "tty_set_ldisc" function doesn't get any additional
locks.  It just calls the line discipline "open" function.

Suppose, at this point, the modem hangs up.  From a hardware
interrupt, "tty_hangup" is called which schedule_tasks the tq_hangup
routine, "do_tty_hangup".

Now, suppose the line discipline "open" function doesn't do any
special locking and has a harmless-looking "kmalloc" that isn't
GFP_ATOMIC.  It falls asleep and gives up the big kernel lock!!

Now, the eventd kernel thread wakes up and runs "do_tty_hangup".
"do_tty_hangup" has no trouble getting the big kernel lock and running
the "flush_buffer", "write_wakeup", and "close" line discipline
functions on the half-initialized line discipline, all with no further
locking.  In a naive implementation, "close" would start freeing the
same kernel structures that "open" hasn't had a chance to allocate!
And, now, "open" is free to wake up and try allocating structures for
a line discipline that has already been shut down from the TTY.

Does this mean that all line discipline implementations must use a
spinlock around critical code in "open", "close", and every other line
discipline function?  It looks like they must, and it looks like most
don't right now.
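
The open/close race described above can be mimicked in user space. The following is a toy model only: the class `Ldisc` and its fields are invented for illustration, and a Python lock stands in for whatever locking a real line discipline would use. The allocation happens outside the lock (it may sleep), and the result is only installed if no close has happened in the meantime.

```python
import threading


class Ldisc:
    """Toy line discipline whose open() may sleep while allocating."""

    def __init__(self):
        self.lock = threading.Lock()
        self.closed = False
        self.state = None

    def open(self, alloc):
        buf = alloc()              # may sleep; no lock is held across it
        with self.lock:
            if self.closed:        # a hangup won the race: back out
                return False
            self.state = buf
            return True

    def close(self):
        with self.lock:
            self.closed = True
            self.state = None


ld = Ldisc()


def sleeping_alloc():
    ld.close()                     # the hangup runs while open() "sleeps"
    return bytearray(16)


print(ld.open(sleeping_alloc), ld.state)
```

The point of the flag check is that "open" notices the shutdown after waking up instead of populating a discipline that has already been torn down.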

Maybe I'm just overlooking something obvious.

> >Can we
> > eliminate "ppp_channel_ioctl" from "ppp_async.c" entirely, as in the
> > patch below?  We're requiring people to upgrade to "pppd" 2.4.0
> > anyway, and it has no need for these calls.  This would give me a warm,
> > fuzzy feeling.
> 
> Sure, that would be fine.  I'll make up a patch and send it to Linus.

Thank you.

Kevin <[EMAIL PROTECTED]>



/linuxrc query

2001-03-22 Thread Amit D Chaudhary

Hi,

I have an initrd working, with a /linuxrc on it that runs. My question 
concerns the commands after pivot_root (which works like a charm, thanks to 
initrd.txt):

What does redirecting stdin/stdout/stderr to dev/console achieve? I thought 
that since the root is now the "new" root, dev/console would be used automatically. 
Also, why chroot, why not call init directly?
#exec chroot . sbin/init 3 <dev/console >dev/console 2>&1

Since the above never returns, what follows is not freed. Does this mean I have 
around 4-6 MB of RAM being used up unnecessarily? Any solution?

#umount /initrd
#blockdev --flushbufs /dev/ram0    # /dev/rd/0 if using devfs


Thanks and Regards
Amit




Re: [linux-lvm] EXT2-fs panic (device lvm(58,0)):

2001-03-22 Thread Andreas Dilger

Al Viro writes:
> On Fri, 23 Mar 2001, Stephen C. Tweedie wrote:
> > On Wed, Mar 07, 2001 at 01:35:05PM -0700, Andreas Dilger wrote:
> > > The only remote possibility is in ext2_free_blocks() if block+count
> > > overflows a 32-bit unsigned value.  Only 2 places call ext2_free_blocks()
> > > with a count != 1, and ext2_free_data() looks to be OK.  The other
> > > possibility is that i_prealloc_count is bogus - that is it!  Nowhere
> > > is i_prealloc_count initialized to zero AFAICS.
> > >
> > Did you ever push this to Alan and/or Linus?  This looks pretty
> > important!
> 
> It isn't. Check fs/inode.c::clean_inode(). Specifically,
> memset(&inode->u, 0, sizeof(inode->u));
> The thing is called both by get_empty_inode() and by get_new_inode() (the
> former - just before returning, the latter - just before calling
> ->read_inode()).

If this is the case, then all of the other zero initializations can be
removed as well.  I figured that if most of the fields needed to be
zeroed, then ones _not_ being zeroed would lead to this problem.

FYI Stephen, the original poster followed up that the problem was with
an IBM SCSI RAID card...

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/   -- Dogbert



Can't get serial.c to work with Xircom Cardbus Ethernet+Modem

2001-03-22 Thread Tom Sightler

Hi all,

I saw a discussion on this list about this problem earlier, but could not
find that it had actually been resolved.

With the removal of serial_cb from the 2.4.3pre kernels I can no longer use
the modem of my Xircom adapter.  According to the posts in the other thread
serial.c should now provide this functionality, however it still does not,
at least for me.

The thread seemed to come to the conclusion that this was caused because the
serial driver only looks for PCI devices of class SERIAL and not MODEM.  I
tried the patch shown there for the 5.05 serial driver but it still doesn't
find the serial interface on my Xircom 10/100 Ethernet+56K Modem combo card.

I'm pretty sure the issue is not caused by the problem above, because as far
as I can tell the modem on the adapter does present itself as a PCI SERIAL
class device as shown by the following lspci output:

[root@iso-2146-l1 ttsig]# /sbin/lspci
02:00.0 Ethernet controller: Xircom Cardbus Ethernet 10/100 (rev 03)
02:00.1 Serial controller: Xircom Cardbus Ethernet + 56k Modem (rev 03)

[root@iso-2146-l1 ttsig]# /sbin/lspci -n
02:00.0 Class 0200: 115d:0003 (rev 03)
02:00.1 Class 0700: 115d:0103 (rev 03)

[root@iso-2146-l1 ttsig]# /sbin/lspci -v
02:00.0 Ethernet controller: Xircom Cardbus Ethernet 10/100 (rev 03)
Subsystem: Xircom Cardbus Ethernet 10/100
Flags: bus master, medium devsel, latency 64, IRQ 11
I/O ports at 1800 [size=128]
Memory at 1480 (32-bit, non-prefetchable) [size=2K]
Memory at 14800800 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at 1440 [size=16K]
Capabilities: [dc] Power Management version 1

02:00.1 Serial controller: Xircom Cardbus Ethernet + 56k Modem (rev 03) (prog-if 02 [16550])
Subsystem: Xircom CBEM56G-100 Ethernet + 56k Modem
Flags: medium devsel, IRQ 11
I/O ports at 1880 [size=8]
Memory at 14801000 (32-bit, non-prefetchable) [size=2K]
Memory at 14801800 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at 14404000 [size=16K]
Capabilities: [dc] Power Management version 1

I'm pretty sure that Class 0700 is the proper class for a PCI serial device.
The serial_cb driver from 2.4.2 always recognized this device properly and
set it up as /dev/ttyS1 using IO 0x1880 and IRQ 11.  It showed under
setserial as follows:

/dev/ttyS1, UART: 16550A, Port: 0x1880, IRQ: 11

Now with serial.c it doesn't even get reported, I get the following when I
load serial.c:

Serial driver version 5.05.SA (2000-09-14) with MANY_PORTS MULTIPORT
SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A

I know the version doesn't show as 5.05A, but I applied the patch by hand
and left off that part.  I'm pretty sure the patch is irrelevant since the
device does show up as a true PCI SERIAL Class device.

Any ideas?  I may look at it more tomorrow.  For now I'm back to using
serial_cb which still works fine (even though that apparently surprises many
people).

Later,
Tom





Re: PATCH against 2.4.2: TTY hangup on PPP channel corrupts kernel memory

2001-03-22 Thread Paul Mackerras

Kevin Buhr writes:

> I didn't realize my specific hang was a peculiarity of the older
> attachment style.  The channel created by pushing the PPP line

I didn't realize you were talking about linux 2.4.0 and pppd 2.3.11.

> discipline onto a TTY was connected to a unit with a PPPIOCATTACH
> ioctl on the TTY---this didn't really "attach" the channel; it still
> had a refcnt of only one.  Through the old compatibility interface, it
> was possible to call ppp_asynctty_read -> ppp_channel_read -> ppp_read
> on the channel's "struct ppp_file" and wait on the channel's "rwait".
> If the modem hung up, "do_tty_hangup" would call "ppp_asynctty_close"
> (with a reader still in "ppp_asynctty_read") and the "struct channel"
> would be freed in "ppp_unregister_channel".

That's one of the main reasons why I removed the compatibility
stuff. :)

> I think your analysis of how things presently are with 2.4.2 and a
> modern "pppd" is correct...
> 
> Since the new "pppd" uses an explicit PPPIOCATTCHAN / PPPIOCCONNECT
> sequence, the refcnt gets bumped to 2 and stays there while the
> channel is attached.  So, this specific hang isn't a problem anymore
> for "ppp_async.c".  It's still a problem with "ppp_synctty.c", though
> (when used with "pppd" 2.3.11, say).  Is the compatibility stuff in
> there slated for removal, too?

Yep, and we should take out the stuff in ppp_generic.c that was called
by the compatibility stuff in the channels, too.

> In particular, the comment above "ppp_asynctty_close" is misleading.
> It's true that the TTY layer won't call any further line discipline
> entries while the "close" is executing; however, there may be
> processes already sleeping in line discipline functions called before
> the hangup.  For example, "ppp_asynctty_close" could be called while
> we sleep in the "get_user" in "ppp_channel_ioctl" (called from
> "ppp_asynctty_ioctl").  Therefore, calling "PPPIOCATTACH" on an
> unattached PPP-disciplined TTY could, in unlikely circumstances
> (argument swapped out), lead to a crash.

Yuck.  I don't see that we can protect against this without having
some sort of lock in the tty structure, though.  We can't protect the
existence of the channel structure with a lock inside that structure.
Ideally the necessary protection would be provided at the tty level.

> I assume PPPIOCATTACH (on the TTY) is deprecated in favor of
> PPPIOCATTCHAN / PPPIOCCONNECT (on the "/dev/ppp" handle).  Can we
> eliminate "ppp_channel_ioctl" from "ppp_async.c" entirely, as in the
> patch below?  We're requiring people to upgrade to "pppd" 2.4.0
> anyway, and it has no need for these calls.  This would give me a warm,
> fuzzy feeling.

Sure, that would be fine.  I'll make up a patch and send it to Linus.

Paul.



Please review patchlet for ov511 (2.4.2-ac19)

2001-03-22 Thread Pete Zaitcev

Here is the deal:

we have a guy here with a webcam and the following scenario:
1. ov511 disconnects, everything dies/releases/closes fine,
2. webcam software starts polling open/sleep/open/sleep/...
3. ov511_probe works and reaches ov511_configure,
   calls video_register_device().
4. Webcam software opens and oopses on the semaphore
   that was not initialized yet.

I think video_register_device needs to be done last, when
everything else is ready to accept application requests.
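
The ordering argument can be shown with a small user-space model. None of these names come from the driver: `Camera`, `registry` and `probe` are invented, and a Python lock plays the role of the uninitialized semaphore. The device only becomes visible to openers once its setup is complete.

```python
import threading


class Camera:
    def __init__(self):
        self.lock = None               # stands in for the driver's semaphore

    def configure(self):
        self.lock = threading.Lock()   # everything open() depends on

    def open(self):
        with self.lock:                # would fail if configure() had not run
            return True


registry = {}                          # what a polling application can see


def probe(name):
    cam = Camera()
    cam.configure()                    # finish all setup first...
    registry[name] = cam               # ...and only then register the device
    return cam


probe("video0")
print(registry["video0"].open())
```

If the registration line were moved above `configure()`, an application polling the registry could call `open()` on a camera whose lock does not exist yet, which is the situation in step 4.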

Someone please review. The error handling style
of ov511 spins my head. I may be missing a code path somewhere.

Thanks in advance,
-- Pete

--- linux-2.4.2-ac19/drivers/usb/ov511.c Thu Jan  4 13:15:32 2001
+++ linux-2.4.2-ac19-p3/drivers/usb/ov511.c Thu Mar 22 19:55:59 2001
@@ -3141,11 +3141,6 @@
 
init_waitqueue_head(&ov511->wq);
 
-   if (video_register_device(&ov511->vdev, VFL_TYPE_GRABBER) < 0) {
-   err("video_register_device failed");
-   return -EBUSY;
-   }
-
if (ov511_write_regvals(dev, aRegvalsInit)) goto error;
if (ov511_write_regvals(dev, aRegvalsNorm511)) goto error;
 
@@ -3214,7 +3209,6 @@
return 0;

 error:
-   video_unregister_device(&ov511->vdev);
usb_driver_release_interface(&ov511_driver,
&dev->actconfig->interface[ov511->iface]);
 
@@ -3323,6 +3317,11 @@
ov511->buf_state = BUF_NOT_ALLOCATED;
} else {
err("Failed to configure camera");
+   goto error;
+   }
+
+   if (video_register_device(&ov511->vdev, VFL_TYPE_GRABBER) < 0) {
+   err("video_register_device failed");
goto error;
}
 



Re: 2.4.2-ac21

2001-03-22 Thread Tony Hoffmann

I also had my 3c905 behave this way with ac21.  ac20 is ok.  System uses an ABit kt7a 
board.

Andrew Morton wrote:

> Lawrence Walton wrote:
> >
> > Hello all
> > 2.4.2-ac21 seems to have a couple problems.
> > ...
> >
> > Mar 22 15:15:55 the-penguin kernel: NETDEV WATCHDOG: eth0: transmit timed out
> > ...
> > 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP] (prog-if 00 [Normal decode])
>
> People have recently been changing VIA PCI bridge settings
> to try to fix the file corruption thing.  There has been one
> report that this change causes a 3c905C to go silly.
>
> This looks like the same problem to me.
>
> Arjan?
>



Re: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread sfr

> Boot with the 'notsc' option is one approach. We certainly could recalibrate
> the clock if we could get events out of ACPI, APM or some other source. Maybe
> someone at IBM knows something on the thinkpad front here. If there is for
> example an additional apm event or irq we can enable for the thinkpads to see
> the speed change we can make it work

On the ThinkPad 600E (at least), we get a Power Status Change APM event.

Cheers,
Stephen Rothwell

P.S. We actually get two of these events each time we remove or insert the
power cord ...



/proc/stat disk_io entries

2001-03-22 Thread Tony . Young

All,

Firstly, my relevant system stats:
kernel  linux 2.4.3-pre6
hda IDE Drive
hdb CD drive
hdc IDE Drive
hdd IDE Drive
sda SCSI Drive

The problem I'm seeing is that IO stats (disk_io) aren't being shown in
/proc/stat for the 2 harddrives on the second ide controller (hdc and hdd).

I checked the kernel code and found the function kstat_read_proc in
fs/proc/proc_misc.c which loops through from 0 up to DK_MAX_MAJOR and prints
out the stats to /proc/stat for each drive. However, DK_MAX_MAJOR is set to
16 in include/linux/kernel_stat.h, which means that the drives on my second
ide controller, with a major number of 22, aren't included in the loop.
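
The effect of the loop bound can be sketched like this. It is a simplified model rather than the kernel code: `reported_majors` is a made-up helper, and it assumes the loop runs over majors strictly below the limit, which matches the observation that a limit of 23 is needed to see major 22.

```python
DK_MAX_MAJOR = 16   # value quoted in the post from include/linux/kernel_stat.h


def reported_majors(majors, limit=DK_MAX_MAJOR):
    """Majors that a loop over range(limit) would ever print."""
    return [m for m in majors if m < limit]


print(reported_majors([3, 8, 22]))          # stock limit
print(reported_majors([3, 8, 22], 23))      # limit raised as in the post
print(reported_majors([22, 33, 34], 23))    # further IDE controllers
```

The last call shows the open question from the post: any fixed limit below 35 still hides majors 33 and 34.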

I modified the value of DK_MAX_MAJOR to 23 and rebuilt, and /proc/stat now
shows the 2 missing harddrives. I'm uncomfortable sending in a patch for
this as I'm not familiar enough with the code to understand the full
ramifications of changing this value. Considering also that the value 23
still doesn't include any tertiary or quaternary ide controllers (33 and
34) makes me wonder what the correct value should really be.

I'm also curious, after considering the above, about whether or not a hash
table(s) would be better suited than the current implementation of
2-dimensional arrays for disk stats (dk_drive, dk_drive_rio, dk_drive_wio,
etc).

I've brought this to the list because I'm not sure of the correct solution
and I couldn't work out if there was a specific maintainer of this code.

It also seems strange to me that the identifiers for the values for disk_io
in /proc/stat are (major_number,disk_number) tuples rather than
(major,minor). The current implementation with my change now shows my first
ide drive to be identified as (8,0), while my second and third ide drives
(hdc and hdd) are identified as (22,2) and (22,3) respectively rather than
(22,0) and (22,1) - I presume because they are in the 3rd and 4th ide
positions. Using disk_number instead of minor number also makes it more
difficult for any user programs reading /proc/stat to trace the entry back
to a physical device. Any program must make assumptions that major numbers 8
and 22 refer to /dev/hd* entries, and that disk number 0 translates to 'a',
1 to 'b', 2 to 'c', etc and can then work out that 22,2 means /dev/hdc.
These assumptions, of course, break with the use of devfs when not using
devfsd to provide the necessary links.
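
A user program doing the guesswork described above might look roughly like this. Several things here are assumptions and not taken from the post: the exact `disk_io:` line layout of `(major,disk):(counters)`, the sample counter values, and the set of conventional IDE major numbers. The helper names are hypothetical.

```python
import re

IDE_MAJORS = {3, 22, 33, 34}   # assumption: conventional IDE major numbers

ENTRY = re.compile(r"\((\d+),(\d+)\):\(([\d,]+)\)")


def parse_disk_io(line):
    """Map (major, disk) -> tuple of counters from a 'disk_io:' line."""
    stats = {}
    for major, disk, counters in ENTRY.findall(line):
        key = (int(major), int(disk))
        stats[key] = tuple(int(c) for c in counters.split(","))
    return stats


def guess_device(major, disk):
    """Guess the /dev name from the disk index; None if unknown."""
    if major in IDE_MAJORS:
        return "/dev/hd" + chr(ord("a") + disk)
    return None


# Made-up sample line in the assumed layout.
sample = "disk_io: (3,0):(10,6,48,4,32) (22,2):(7,7,56,0,0) (22,3):(2,2,16,0,0)"
for key in parse_disk_io(sample):
    print(key, guess_device(*key))
```

The hard-coded table is exactly the fragile part: it has no way to confirm that the guessed name exists, for instance under devfs without devfsd.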

I welcome any comments, but please CC me directly as I'm not subscribed to
the list.

Tony...
--
Tony Young
Senior Software Engineer
Integrated Research Limited
Level 10, 168 Walker St
North Sydney, NSW 2060, Australia
Ph:  +61 2 9966 1066
Fax: +61 2 9966 1042
Mob: 0414 649942




Re: user space web server accelerator support

2001-03-22 Thread Fabio Riccardi

Dave, Zach,

thanks for your help, I've implemented a file descriptor passing mechanism
very similar to that of Zach's and it worked.

The problem now is performance: fd passing is utterly slow!

On my system (a 1GHz Pentium III + 2G RAM) I can do 1300 SpecWeb99 with a
khttp-like socket passing mechanism, while I only get something like 500 using
file descriptor passing. Indeed with fd passing I decrease Apache's
performance instead of increasing it!

I've checked my code several times and I don't believe that I have introduced
any specific bottleneck of my own (the code actually is quite trivial).

I've profiled the kernel and some interesting differences show:

With direct socket passing, 1300 SpecWeb load:

  9759 total                   0.0071
   902 handle_IRQ_event        7.5167
   256 skb_clone               0.6957
   256 do_tcp_sendpages        0.0954
   239 tcp_v4_rcv              0.1572
   238 schedule                0.1766
   226 __kfree_skb             0.9741
   207 skb_release_data        1.7845
   204 tcp_transmit_skb        0.1541
   199 d_lookup                0.6910
   190 path_walk               0.0973
   181 ip_output               0.6754
   168 fget                    2.2105
   165 do_softirq              1.1786
   158 do_generic_file_read    0.1287

With file descriptor passing, 500 SpecWeb load:

  8621 total                   0.0063
  7037 schedule                5.2203
   462 handle_IRQ_event        3.8500
   188 __wake_up               0.9216
   114 unix_stream_data_wait   0.4191
    81 __switch_to             0.3750
    58 schedule_timeout        0.3718
    25 d_lookup                0.0868
    20 skb_clone               0.0543
    19 path_walk               0.0097
    17 tcp_transmit_skb        0.0128
    17 do_tcp_sendpages        0.0063
    17 do_softirq              0.1214
    15 system_call             0.2679
    15 sys_rt_sigtimedwait     0.0207

Zach, have you ever noticed such a performance bottleneck in your phhttpd?

SpecWeb has about 30% of its load as dynamic requests, so the amount of
forwarding is definitely significant in my case. Some time ago I measured
khttp's impact in socket passing and I found that it was negligible
(forwarding everything to Apache instead of having it directly listening on
the socket had an impact of a few percent).

My impression from a first look at the profiling data is that the kernel is
doing a very poor job of scheduling and is ping-ponging between processes...
as if it is not doing any buffering whatsoever and is doing a context switch
for every passed file descriptor.
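
For readers unfamiliar with the mechanism, here is a minimal Python sketch of
descriptor passing over a Unix-domain socket with SCM_RIGHTS. It is not the
code discussed in this thread; the helper names send_fds and recv_fds are
hypothetical. It sends several descriptors in one message, which is one
possible way to avoid a wakeup per descriptor, assuming the sender has more
than one descriptor ready at a time.

```python
import array
import os
import socket

def send_fds(sock, fds):
    """Send all descriptors in fds in a single SCM_RIGHTS message."""
    payload = array.array("i", fds)
    # One byte of ordinary data is sent alongside the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])

def recv_fds(sock, max_fds):
    """Receive up to max_fds descriptors sent by send_fds()."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(max_fds * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial descriptor.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return list(fds)

# Demo: pass the write ends of two pipes across a socket pair in one message.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r1, w1 = os.pipe()
r2, w2 = os.pipe()
send_fds(left, [w1, w2])
got = recv_fds(right, 2)
os.write(got[0], b"one")
os.write(got[1], b"two")
print(os.read(r1, 3), os.read(r2, 3))
```

Whether batching helps in the SpecWeb case depends on how many connections
are pending when the forwarder runs; that is an assumption, not a measurement.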

Any thoughts?

 - Fabio





sis900 ethernet card support

2001-03-22 Thread Rama Krishna Mandava




Hi all,

I am using a SiS900 chipset, which contains a sis900 Fast Ethernet card. I
installed Red Hat Linux 6.2, but I could not configure the network card.

I want to know whether kernel 2.2.15 supports the sis900 network card.

I am trying to insert sis900 as a module, but I get the message
"Device or resource busy".





 




Re: Only 10 MB/sec with via 82c686b - FIXED

2001-03-22 Thread SodaPop

Further note - kernels built with K6-2 support seem to be just fine.  But
all athlon/K7 kernels die horribly, with greatly varying death messages.
Most commonly I get bogus pointer/dereference errors and eventually init
gets killed, other times it just locks up, sometimes I get things like
'cannot exec syslogd: Out of memory'.  It looks like the memory registers
are horked up somehow.

I could try to copy some of this out by hand if anyone thought it
worthwhile.  Either way, I think IWill has some work to do yet on their
system bios.

-dennis T




On Wed, 21 Mar 2001 [EMAIL PROTECTED] wrote:

> On 20 Mar, SodaPop wrote:
>
> > I have an IWill KK-266R motherboard with an athlon-c 1200
> > processor in it, and for the life of me I can't get more than
> > 10 MB/sec through the on-board ide controller.  Yes, all the
> > appropriate support is turned on in the kernel to enable dma
> > and specific chipset support, and yes, I think I have all
> > relevant patches and a reasonable kernel.
>
>  Yes, actually I'm seeing the same on a KT133 board from Elitegroup.
>  Although here I get a bit more: 15 MB/s
>
> > I noted a number of other interesting things;  one, that -X33,
> > -X34, and -X64 through -X69 all have the same 10 MB/sec transfer
> > rate, and two, that the 10 MB/sec transfer rate can be linearly
> > increased to 12 MB/sec by raising the system bus from 100 mhz to
> > 120 mhz (all components are safely rated at 133, no overclocking
> > involved.)
>
>  Duh, before making such a claim you should consider the fact that
>  this is overclocking your PCI/AGP bus and I have yet to see any
>  graphic cards/IDE controllers/other devices which are rated for
>  37MHz PCI bus speed.
>
> --
>
> Servus,
>Daniel
>




thunderbird 1.2G + kk266 + 2.4.x oops and crash

2001-03-22 Thread Disconnect

Long and short is I have a new mobo/cpu/ram (see below) that runs great
under Win98 and passes memtest86 (3 extended runs as pc133/cas2, 3
standard runs as pc100/cas3) but oops's almost immediately under Linux
2.4.x (2.4.2 and 2.4.1 at the least.)

With a few rare exceptions (usually kupdated) all of the oops's are in
kswapd (I can manually decode the call stack/etc if someone lets me know
which info they need and confirms for me real quick how to get all the
info out.  I do have the system.map on another machine, so just a
pointer/url/etc is cool.)

I have tried a couple of kernel builds, with no change.  HD access doesn't
seem to affect it (at least, e2fsck on a 10 gig partition doesn't bomb)
but doing actual work does.  (Work like, say, booting w/o init=/bin/sh ;)
..)

Hardware list:
1.2G AMD Thunderbird
Iwill kk266 (not ide-raid) mobo, via apollo kt133a (specs url below)
2 256M pc133/cas2 amd-approved dimms
new amd-approved power supply (bios and windows list voltages/cooling as
reasonable)
bunch of pci cards that don't seem to affect things either way (only one I
haven't pulled is voodoo3, since there is no onboard video)

Mobo is jumpered to 100mhz FSB (which is correct for the chip) and
multiplier/voltage/etc is set to 'auto'.

Things tried:
memtest86, passed
win98, runs fine
set speed down (1150 and 1100), no change
set ide to paranoid (noautotune, no dma, no blockmode, etc), no change
bang head on wall, no change


Full mobo specs:
http://www.iwillusa.com/products/spec.asp?ModelName=KK266=

Any help much appreciated.

---
-BEGIN GEEK CODE BLOCK-
Version: 3.1 [www.ebb.org/ungeek]
GIT/CC/CM/AT d--(-)@ s+:-- a-->? C$ ULBS*$ P+>+++ L>+ 
E--- W+++ N+@ o+>$ K? w--->+ O- M V-- PS+() PE Y+@ PGP++() t 5--- 
X-- R tv+@ b>$ DI D++(+++) G++ e* h(-)* r++ y++
--END GEEK CODE BLOCK--



Re: Linux 2.4.2ac22

2001-03-22 Thread Aaron Tiensivu

> o Fix ppp memory corruption (Kevin Buhr)
> | Bizzarely enough a direct re-invention of a 1.2 ppp bug

Could this explain my MPPP skb corruption I've reported since 2.3.x?





Re: Only 10 MB/sec with via 82c686b - FIXED

2001-03-22 Thread SodaPop


Regarding the overclocking of the PCI bus, I was not aware of this.  The
documentation led me to believe the pci clock was fixed, however further
experimentation indicates that's clearly not the case.  Thanks.

Regarding the fix:  I installed an Ensoniq AudioPCI sound card, and
experienced horrible distortion, crackling, and high pitched chirps any
time I tried to use the device.  I noticed that various interrupts were
causing chunks of the real audio to sometimes slip through; on a whim I
tried ping flooding a nearby machine and the sound quality improved
greatly.

Putting two and two together, it occurred to me that the motherboard was
having irq/interrupt routing problems.  The disks could not get reasonable
throughput because the interrupts were getting choked or held up, and the
sound card couldn't properly function either.

Wonder of wonders, I flashed the bios to the latest and greatest version.
Current data transfer rates are 35.7 MB/sec on both udma drives, exactly
as expected and darn close to the continuous read limits of the disks.
The audio also started working, flawlessly.

There are other issues however - the athlon now runs significantly hotter
at idle for one, but the most serious is that the K7 kernel optimizations
cause horrendous kernel panics and crashes.  I'm running now on a kernel
compiled for 386, which seems to be stable.  I'll attempt to build other
kernels to see if I can figure out what's going on.

Net result:  IWill KK266 motherboards have bios problems; it may be a good
idea to upgrade the bios.

-dennis T


On Wed, 21 Mar 2001 [EMAIL PROTECTED] wrote:

> On 20 Mar, SodaPop wrote:
>
> > I have an IWill KK-266R motherboard with an athlon-c 1200
> > processor in it, and for the life of me I can't get more than
> > 10 MB/sec through the on-board ide controller.  Yes, all the
> > appropriate support is turned on in the kernel to enable dma
> > and specific chipset support, and yes, I think I have all
> > relevant patches and a reasonable kernel.
>
>  Yes, actually I'm seeing the same on a KT133 board from Elitegroup.
>  Although here I get a bit more: 15 MB/s
>
> > I noted a number of other interesting things;  one, that -X33,
> > -X34, and -X64 through -X69 all have the same 10 MB/sec transfer
> > rate, and two, that the 10 MB/sec transfer rate can be linearly
> > increased to 12 MB/sec by raising the system bus from 100 mhz to
> > 120 mhz (all components are safely rated at 133, no overclocking
> > involved.)
>
>  Duh, before making such a claim you should consider the fact that
>  this is overclocking your PCI/AGP bus and I have yet to see any
>  graphic cards/IDE controllers/other devices which are rated for
>  37MHz PCI bus speed.
>
> --
>
> Servus,
>Daniel
>




Some strange patch to drivers/input/keybdev.c

2001-03-22 Thread Pete Zaitcev

Some guy sent me the attached patch. He says it allows
him to use 2 additional keys on the 106 key USB keyboard.
I never saw a 106 key keyboard before, USB or not.
Does anyone understand what is going on? Vojtech?

-- Pete

--- drivers/input/keybdev.c.orig  Sat Sep  2 19:01:55 2000
+++ drivers/input/keybdev.c   Sat Sep  2 20:21:07 2000
@@ -49,11 +49,11 @@
     64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
     80, 81, 82, 83, 43, 85, 86, 87, 88,115,119,120,121,375,123, 90,
    284,285,309,298,312, 91,327,328,329,331,333,335,336,337,338,339,
-   367,294,293,286,350, 92,334,512,116,377,109,111,373,347,348,349,
+   367,294,293,286,350, 92,334,512,116,377,109,111,115,347,348,349,
    360, 93, 94, 95, 98,376,100,101,357,316,354,304,289,102,351,355,
    103,104,105,275,281,272,306,106,274,107,288,364,358,363,362,361,
    291,108,381,290,287,292,279,305,280, 99,112,257,258,113,270,114,
-   118,117,125,374,379,259,260,261,262,263,264,265,266,267,268,269,
+   118,117,125,374,125,259,260,261,262,263,264,265,266,267,268,269,
    271,273,276,277,278,282,283,295,296,297,299,300,301,302,303,307,
    308,310,313,314,315,317,318,319,320,321,322,323,324,325,326,330,
    332,340,341,342,343,344,345,346,356,359,365,368,369,370,371,372 };



Re: kernel_thread vs. zombie

2001-03-22 Thread Andrew Morton

Martin Frey wrote:
> 
> >>  - When started during boot (low PID (9)) It becomes a zombie
> >>  - When started from a process that quits after sending the ioctl,
> >>it is correctly "garbage collected".
> >>  - When started from a process that stays around, it becomes
> >>a zombie too
> 
> >Take a look at kernel/kmod.c:call_usermodehelper().  Copy it.
> >
> >This will make your thread a child of keventd.  This takes
> >care of things like chrootedness, uids, cwds, signal masks,
> >reaping children, open files, and all the other crud which
> >you can accidentally inherit from your caller.
> >
> So depending on the state of the caller daemonize() will not really
> put us into the background as we want.

Well, kernel_thread() will put you in the background, in the
sense that it creates an async thread.  But you inherit
heaps of stuff from the parent.  daemonize() cleans up
some of those things, but it can't clean up everything.

Kernel threads *need* to run in a well-understood and
sensible environment.  We went through a lot of fun late
last year when there was a sudden proliferation of kernel
threads and quite a few things were subtly broken.

Things like kernel threads blocking signals because that's
what their user-space parent happened to do.  Things like
user-space applications receiving a surprise SIGCHLD from
the kernel as a consequence of some system call which they
happened to have executed some while beforehand.

One approach would be to tromp through your task state setting
everything back where you want it.  That's quite complex.  Plus
there's the issue of who reaps the thread when it exits.

So I think it's reasonable to use keventd as `kinit', if you like.
Something which knows how to launch and reap kernel daemons, and
which provides a known environment to them.

A kernel API function (`kernel_daemon'?) which does all this
boilerplate is needed, I think.
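
As a user-space analogy only (this is not kernel_thread() or keventd code),
the reaping problem looks like this in Python: an exited child stays in the
process table as a zombie until some parent waits for it. The function name
run_child_and_reap is hypothetical.

```python
import os

def run_child_and_reap(exit_code):
    """Fork a child that exits at once, then reap it so no zombie remains."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running the parent's cleanup handlers.
        os._exit(exit_code)
    # Until the parent calls waitpid(), the exited child remains a zombie.
    reaped_pid, status = os.waitpid(pid, 0)
    return reaped_pid == pid, os.WEXITSTATUS(status)

print(run_child_and_reap(7))  # (True, 7)
```

A parent that stays around and never waits leaves the zombie in place, which
matches the third case in the quoted report; a parent that exits hands the
child to init, which reaps it.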

-



Re: Linux 2.4.2-ac21

2001-03-22 Thread Keith Owens

On Fri, 23 Mar 2001 01:50:49 +, 
"Andrew Morton" <[EMAIL PROTECTED]> wrote:
>Keith Owens wrote:
>> 
>> Am I the only person who is annoyed that nmi watchdog is now off by
>> default and the only way to activate it is by a boot parameter?  You
>> cannot even patch the kernel to build a version that has nmi watchdog
>> on because the startup code runs out of the __setup routine, no boot
>> parameter, no watchdog.
>
>It was causing SMP boxes to crash mysteriously after
>several hours or days.  Quite a lot of them.  Nobody
>was able to explain why, so it was turned off.

I know why it was turned off by default.  The annoying thing is that now
the *only* way to activate the watchdog is via a boot command.  It is
not possible to compile a standard debugging kernel with this option
turned on, you have to rely on every user setting the boot options for
every kernel.  If it is going to be off by default there should be a
way to patch the kernel to make it on by default.
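
A small Python model of the behaviour being asked for, as a sketch only. It
is not the kernel's __setup code: the function name and the build_default
parameter are hypothetical and stand in for a value a debugging kernel could
be built with, which a boot parameter may still override.

```python
def nmi_watchdog_setting(cmdline, build_default=0):
    """Return the nmi_watchdog value from a kernel command line, else build_default."""
    value = build_default
    for word in cmdline.split():
        key, sep, arg = word.partition("=")
        if key == "nmi_watchdog" and sep and arg.isdigit():
            value = int(arg)  # a later occurrence overrides an earlier one
    return value

print(nmi_watchdog_setting("ro root=/dev/hda1"))                   # 0
print(nmi_watchdog_setting("ro root=/dev/hda1", build_default=1))  # 1
print(nmi_watchdog_setting("ro nmi_watchdog=0", build_default=1))  # 0
```

With only the first case available, every user of a debugging kernel has to
remember the boot option, which is the complaint above.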




Re: [linux-lvm] EXT2-fs panic (device lvm(58,0)):

2001-03-22 Thread Alexander Viro

On Fri, 23 Mar 2001, Stephen C. Tweedie wrote:

> Hi,
>
> On Wed, Mar 07, 2001 at 01:35:05PM -0700, Andreas Dilger wrote:
>
> > The only remote possibility is in ext2_free_blocks() if block+count
> > overflows a 32-bit unsigned value.  Only 2 places call ext2_free_blocks()
> > with a count != 1, and ext2_free_data() looks to be OK.  The other
> > possibility is that i_prealloc_count is bogus - that is it!  Nowhere
> > is i_prealloc_count initialized to zero AFAICS.
> >
> Did you ever push this to Alan and/or Linus?  This looks pretty
> important!

It isn't. Check fs/inode.c::clean_inode(). Specifically,
memset(&inode->u, 0, sizeof(inode->u));
The thing is called both by get_empty_inode() and by get_new_inode() (the
former - just before returning, the latter - just before calling
->read_inode()).
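
The effect of zeroing the whole union can be modelled in user space with
Python's ctypes. This is only a sketch: Ext2Info and InodeUnion are
hypothetical stand-ins with two fields, not the real kernel structures.

```python
import ctypes

class Ext2Info(ctypes.Structure):
    # Hypothetical stand-in for the ext2 member of inode->u; only two fields shown.
    _fields_ = [("i_block_group", ctypes.c_uint32),
                ("i_prealloc_count", ctypes.c_uint32)]

class InodeUnion(ctypes.Union):
    # Hypothetical stand-in for inode->u; "raw" pads the union to 64 bytes.
    _fields_ = [("ext2_i", Ext2Info),
                ("raw", ctypes.c_ubyte * 64)]

u = InodeUnion()
# Simulate stale memory left over from a previous user of the inode.
ctypes.memset(ctypes.addressof(u), 0xFF, ctypes.sizeof(u))
stale = u.ext2_i.i_prealloc_count
# Zero the whole union, as the memset in clean_inode() does for inode->u.
ctypes.memset(ctypes.addressof(u), 0, ctypes.sizeof(u))
print(hex(stale), u.ext2_i.i_prealloc_count)  # 0xffffffff 0
```

Because the memset covers the whole union, every filesystem-specific field,
including i_prealloc_count, starts at zero without an explicit assignment.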
Cheers,
Al




Re: Linux 2.4.2-ac21

2001-03-22 Thread Andrew Morton

Keith Owens wrote:
> 
> Am I the only person who is annoyed that nmi watchdog is now off by
> default and the only way to activate it is by a boot parameter?  You
> cannot even patch the kernel to build a version that has nmi watchdog
> on because the startup code runs out of the __setup routine, no boot
> parameter, no watchdog.

It was causing SMP boxes to crash mysteriously after
several hours or days.  Quite a lot of them.  Nobody
was able to explain why, so it was turned off.



RE: kernel_thread vs. zombie

2001-03-22 Thread Martin Frey

>>  - When started during boot (low PID (9)) It becomes a zombie
>>  - When started from a process that quits after sending the ioctl,
>>it is correctly "garbage collected".
>>  - When started from a process that stays around, it becomes 
>>a zombie too

>Take a look at kernel/kmod.c:call_usermodehelper().  Copy it.
>
>This will make your thread a child of keventd.  This takes
>care of things like chrootedness, uids, cwds, signal masks,
>reaping children, open files, and all the other crud which
>you can accidentally inherit from your caller.
>
So depending on the state of the caller daemonize() will not really
put us into the background as we want. With being created from
keventd we inherit a state as we'd like to have in a kernel thread.
Did I get it right?
I will change my example and test that.

Thanks,

Martin



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Michael Peddemors

Hear, hear.. killing qmail on a server whose sole task is running mail doesn't seem to 
make much sense either..

> > Clearly, Linux cannot be reliable if any process can be killed

> > at any moment. I am not happy at all with my recent experiences.
> 
> Really the whole oom_kill process seems bass-ackwards to me.  I can't in my mind
> logically justify annihilating large-VM processes that have been running for 
> days or weeks instead of just returning ENOMEM to a process that just started 
> up.
> 
> We run Oracle on a development box here, and it's always the first to get the
> axe (non-root process using 70-80 MB VM).  Whenever someone's testing decides to 
> run away with memory, I usually spend the rest of the day getting intimate with
> the backup files, since SIGKILLing random Oracle processes, as you might have
> guessed, has a tendency to rape the entire database.

-- 
"Catch the Magic of Linux..."

Michael Peddemors - Senior Consultant
LinuxAdministration - Internet Services
NetworkServices - Programming - Security
WizardInternet Services http://www.wizard.ca
Linux Support Specialist - http://www.linuxmagic.com

(604)589-0037 Beautiful British Columbia, Canada




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Rik van Riel

On Sat, 23 Mar 2002, Martin Dalecki wrote:

> This is due to the broken calculation formula in oom_kill().

Feel free to write better-working code.

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: PROBLEM: 2.2.18 oops leaves umount hung in disk sleep

2001-03-22 Thread Camm Maguire

Greetings!  Here are the contiguous lines from kern.log:

Mar 21 01:14:47 intech9 kernel: eth0: bogus packet: status=0x80 nxpg=0x57 size=1270
Mar 21 01:14:49 intech9 kernel: Unable to handle kernel NULL pointer dereference at 
virtual address 
Mar 21 01:14:49 intech9 kernel: current->tss.cr3 = 02872000, %%cr3 = 02872000
Mar 21 01:14:49 intech9 kernel: *pde = 
Mar 21 01:14:49 intech9 kernel: Oops: 
Mar 21 01:14:49 intech9 kernel: CPU:0
Mar 22 12:30:08 intech9 kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.

Why would this have not been included, would you happen to know?  In
any case, I understand that it's pretty much impossible to debug now,
right?  dmesg wrapped around by the time I got to it (I seem to be
having a lot of ethernet bogus packet messages, as shown above.  I've
chalked this up to the heavy traffic during the amanda backup, but
maybe something is wrong here too/instead?)
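
A minimal Python sketch for checking whether a saved log excerpt still holds
the parts of an Oops that are needed for decoding. The function name is
hypothetical, and the default marker list is an assumption built from the
'EIP:' line asked for in the quoted reply below.

```python
def missing_oops_fields(log_lines, required=("Oops:", "EIP:")):
    """Return the required markers that never appear in any kernel log line."""
    return [marker for marker in required
            if not any(marker in line for line in log_lines)]

# Lines taken from the kern.log excerpt above.
excerpt = [
    "Mar 21 01:14:49 intech9 kernel: Oops: ",
    "Mar 21 01:14:49 intech9 kernel: CPU:0",
    "Mar 22 12:30:08 intech9 kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.",
]
print(missing_oops_fields(excerpt))  # ['EIP:']
```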

Thanks again!

Trond Myklebust <[EMAIL PROTECTED]> writes:

> > " " == Camm Maguire <[EMAIL PROTECTED]> writes:
> 
>  > I'd be happy to generate one if I could.  I've got the system
>  > map.  The defaults reported by ksymoops are all correct.  Don't
>  > know why it didn't give me more info.  Normally, the info is
>  > reported by klogd anyway, but not here.  I've sent you all I
>  > currently have.  If you can suggest how I can get more, would
>  > be glad to do so.
> 
> 
> Unless you happen to have a dump from 'dmesg', there's probably not
> much you can do to recover the rest of the Oops...
> 
> We need at least the line 'EIP:' if we're to find out where the fault
> occurred. Are you certain that it can't be found in the syslog?
> 
>  > I thought I was running v3.  Can't seem to find anything now
>  > which indicates the protocol version in use, but was under the
>  > impression that v4 was only an option in 2.4.x, no?
> 
> 
> Mar 21 01:14:49 intech9 automount[305]: using kernel protocol version 3 on reawaken
> 
> Sorry, the above message fooled me.
> 
> 
> Cheers,
>   Trond
> 
> 

-- 
Camm Maguire[EMAIL PROTECTED]
==
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah



Re: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Tim Wright

If it's a 500MHz Thinkpad, then I'm guessing it's something like a 600X.
That doesn't have Speedstep. The speed changes are done by some circuitry
in the laptop. I can try to find out more if this would help.
The newer machines are using Speedstep.

Tim

On Thu, Mar 22, 2001 at 11:37:43PM +, Alan Cox wrote:
> > thanks, i just tested the "notsc" option (.config has CONFIG_X86_TSC
> > enabled=y, but CONFIG_M586TSC is not enabled.. if that's ok), but this time
> ...
> > boot and stay on battery power exclusively.  did anyone else expect this
> > behaviour?  
> 
> Errmm no.. 

-- 
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI



Re: [linux-lvm] EXT2-fs panic (device lvm(58,0)):

2001-03-22 Thread Stephen C. Tweedie

Hi,

On Wed, Mar 07, 2001 at 01:35:05PM -0700, Andreas Dilger wrote:

> The only remote possibility is in ext2_free_blocks() if block+count
> overflows a 32-bit unsigned value.  Only 2 places call ext2_free_blocks()
> with a count != 1, and ext2_free_data() looks to be OK.  The other
> possibility is that i_prealloc_count is bogus - that is it!  Nowhere
> is i_prealloc_count initialized to zero AFAICS.
> 
Did you ever push this to Alan and/or Linus?  This looks pretty
important!

Cheers,
 Stephen

> ==
> diff -ru linux/fs/ext2/ialloc.c.orig linux/fs/ext2/ialloc.c
> --- linux/fs/ext2/ialloc.c.orig   Fri Dec  8 18:35:54 2000
> +++ linux/fs/ext2/ialloc.c        Wed Mar  7 12:22:11 2001
> @@ -432,6 +444,8 @@
>   inode->u.ext2_i.i_file_acl = 0;
>   inode->u.ext2_i.i_dir_acl = 0;
>   inode->u.ext2_i.i_dtime = 0;
> + inode->u.ext2_i.i_prealloc_count = 0;
>   inode->u.ext2_i.i_block_group = i;
>   if (inode->u.ext2_i.i_flags & EXT2_SYNC_FL)
>   inode->i_flags |= S_SYNC;
> diff -ru linux/fs/ext2/inode.c.orig linux/fs/ext2/inode.c
> --- linux/fs/ext2/inode.c.orig    Tue Jan 16 01:29:29 2001
> +++ linux/fs/ext2/inode.c Wed Mar  7 12:05:47 2001
> @@ -1048,6 +1038,8 @@
>   (((__u64)le32_to_cpu(raw_inode->i_size_high)) << 32);
>   }
>   inode->i_generation = le32_to_cpu(raw_inode->i_generation);
> + inode->u.ext2_i.i_prealloc_count = 0;
>   inode->u.ext2_i.i_block_group = block_group;
>  
>   /*



Version 6.1.8 of the aic7xxx driver available

2001-03-22 Thread Justin T. Gibbs

As always, the latest version of this driver is available here:

http://people.FreeBSD.org/~gibbs/linux/

Complete CHANGELOG is now available at the above URL.

I try to filter through LK as often as I can, but for
best response, please email issues regarding this driver to
me directly.

Changes since 6.1.6:

Change 168 on 2001/03/22 by gibbs@overdrive

Bump version to 6.1.8.

Change 167 on 2001/03/22 by gibbs@overdrive

aic7xxx_linux.c:
aic7xxx_linux.h:
Add support for switching from full to basic
command queuing.  Flags in the ahc_linux_device
structure indicate what kind of queuing to perform.

In the past, we issued an ordered tag every 250
transactions.  We now issue an ordered tag every
250 transactions issued without the device queue
going empty.

aic7xxx_proc.c:
Use an unsigned long for total number of commands
sent to a device.  %q and %lld don't seem to work
under Linux or I'd have used a uint64_t.

Change 166 on 2001/03/22 by gibbs@overdrive

aic7770.c:
aic7xxx_pci.c:
Don't map our interrupt until after we are fully setup to
handle interrupts.  Our interrupt line may be shared so
an interrupt could occur at any time.

aic7xxx.h:
aic7xxx.c:
Add support for switching from fully blown tagged queuing
to just using simple queue tags should the device reject
an ordered tag.

Remove per-target "current" disconnect and tag queuing
enable flags.  These should be per-device and are not
referenced internally by the driver, so we let the OSM
track this state if it needs to.

Use SCSI-3 message terminology.

Change 165 on 2001/03/19 by gibbs@overdrive

aic7770.c:
ahc_reset() leaves the card in a paused state.
Re-arrange the code so we reset the chip earlier
so we can avoid a manual pause during setup.

Setup the controller without enabling card interrupts.

aic7xxx.c:
Fix a bug in ahc_lookup_phase_entry().  We never traversed
past the first entry.  This routine is only used in
diagnostics so this had only a limited effect.

Start out life with card interrupts disabled.  The bus
code will enable the interrupts once setup is complete
and our handler is in place.

Initialize our softc unit to -1 so that code such as
ahc_linux_next_unit() can traverse the list looking for
colliding unit numbers without tripping over entries that
have not yet had their unit number set.

Enhance ahc_dump_card_state().  OSMs should be able to
rely on this to dump any controller specific data of
interest.  Most of the additional registers printed
used to be printed in the FreeBSD timeout handler.

Add a function pointer in our softc for a bus specific
interrupt handler.  This removes some dependencies on
the PCI code so that bus attachments can be compiled
as modules separate from the core.

aic7xxx.reg:
Use the naming for bit 5 of DFSTATUS in the data book,
FIFOQWDEMP.

aic7xxx.seq:
In our idle loop, use an or instruction to set PRELOADEN
rather than rewriting the contents of DMAPARAMS to
DFCNTRL.  The latter may re-enable the DMA engine if
the idle loop is called to complete the preload of at
least one segment when a target disconnects on an S/G
segment boundary but before we have completed fetching
the next segment.  This corrects a hang, usually in
message out phase, when this situation occurs.  This
bug has been here for a long time, so the situation
is rare, but not impossible to reproduce.

Wait for at least 8 bytes in the FIFO before testing to
see if the DMA fetch of an SCB has stalled.  The old
code used FIFOEMP, which goes false on a single byte.
Since we drain the FIFO 8 bytes at a time, using FIFOQWDEMP
is safer.

If a device happens to be exceptionally slow in asserting
HDONE, our workaround for a stalled SCB dma can be triggered.
Make this situation non-fatal by terminating our FIFO
emptying should we 

Re: Linux 2.4.2-ac21

2001-03-22 Thread Keith Owens

On Fri, 23 Mar 2001 00:02:54 +0100, 
Frank de Lange <[EMAIL PROTECTED]> wrote:
>Linux 2.4.2-ac21 does not like my box, or the other way around:
>
>loading the agpgart module (MGA G400 AGP) -> system hangs
>loading the SCSI module (53c875) -> system hangs
>
>In both cases, the magic SysRq sequence does not work, but it is still possible
>to ping the box from the outside. Connecting to it (ssh) does not work,
>however. I backed out both the SCSI driver patches as well as the agpgart
>patches, but this did not fix the symptoms. Looks more like a module-loading
>related issue, but I have not found it yet.
>
>All this on an SMP (Abit BP6) box by the way...

Activate the nmi watchdog with nmi_watchdog=1 in the boot parameters[*].
That will trip after 5 seconds and point to where it is hanging.  If
the nmi watchdog alone does not give enough data, add the kdb patch
(with nmi watchdog on) and start debugging.
http://oss.sgi.com/projects/kdb/download/ix86/, the -ac20 patch should
fit -ac21 as well.

Am I the only person who is annoyed that nmi watchdog is now off by
default and the only way to activate it is by a boot parameter?  You
cannot even patch the kernel to build a version that has nmi watchdog
on because the startup code runs out of the __setup routine, no boot
parameter, no watchdog.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Sound issues with m805lr motheboard

2001-03-22 Thread Brent D. Norris

> It might be interesting to strace the realserver startup
> both under 2.2 and 2.4 -

Here you go.  Also sorry Alan for such a goofy email earlier.  Does anyone
have any ideas on the sound driver issue?

Brent Norris



execve("Bin/rmserver", ["Bin/rmserver", "rmserver.cfg"], [/* 23 vars */]) = 0
uname({sys="Linux", node="linux-wolf", ...}) = 0
brk(0)  = 0x823318c
open("/etc/ld.so.preload", O_RDONLY)= -1 ENOENT (No such file or directory)
open("/lib/libNoVersion.so.1", O_RDONLY) = 4
fstat64(4, {st_mode=S_IFREG|0755, st_size=15967, ...}) = 0
close(4)= 0
open("/lib/libNoVersion.so.1", O_RDONLY) = 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\10"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=15967, ...}) = 0
old_mmap(NULL, 7272, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40018000
mprotect(0x40019000, 3176, PROT_NONE)   = 0
old_mmap(0x40019000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0x40019000
close(4)= 0
open("/etc/ld.so.cache", O_RDONLY)  = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=23137, ...}) = 0
old_mmap(NULL, 23137, PROT_READ, MAP_PRIVATE, 4, 0) = 0x4001a000
close(4)= 0
open("/lib/libdl.so.2", O_RDONLY)   = 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\34\0\000"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=60598, ...}) = 0
old_mmap(NULL, 12244, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4002
mprotect(0x40022000, 4052, PROT_NONE)   = 0
old_mmap(0x40022000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x1000) = 0x40022000
close(4)= 0
open("/usr/lib/libstdc++.so.2.8", O_RDONLY) = 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\364\204"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=375773, ...}) = 0
old_mmap(NULL, 263568, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40023000
mprotect(0x40054000, 62864, PROT_NONE)  = 0
old_mmap(0x40054000, 57344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x3) = 0x40054000
old_mmap(0x40062000, 5520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40062000
close(4)= 0
open("/lib/libm.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20J\0\000"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=530027, ...}) = 0
old_mmap(NULL, 128792, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40064000
mprotect(0x40083000, 1816, PROT_NONE)   = 0
old_mmap(0x40083000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x1e000) = 0x40083000
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\300\1"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=5155229, ...}) = 0
old_mmap(NULL, 1214792, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40084000
mprotect(0x401a4000, 35144, PROT_NONE)  = 0
old_mmap(0x401a4000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x11f000) = 0x401a4000
old_mmap(0x401a9000, 14664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x401a9000
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\300\1"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=5155229, ...}) = 0
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\300\1"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=5155229, ...}) = 0
close(4)= 0
open("/lib/libm.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20J\0\000"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=530027, ...}) = 0
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\300\1"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=5155229, ...}) = 0
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\300\1"..., 1024) = 1024
fstat64(4, {st_mode=S_IFREG|0755, st_size=5155229, ...}) = 0
close(4)= 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401ad000
munmap(0x4001a000, 23137)   = 0
getpid()= 807
brk(0)  = 0x823318c
brk(0x82331ac)  = 0x82331ac
brk(0x8234000)  = 0x8234000
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, 

[PATCH] Fix races in 2.4.2-ac22 SysV shared memory

2001-03-22 Thread Stephen C. Tweedie

Hi,

The patch below is for two races in sysV shared memory.

The first (minor) one is in shmem_free_swp:

        swap_free (entry);
        *ptr = (swp_entry_t){0};
        freed++;
        if (!(page = lookup_swap_cache(entry)))
                continue;
        delete_from_swap_cache(page);
        page_cache_release(page);

has a window between the first swap_free() and the
lookup_swap_cache().  If the swap_free() frees the last ref to the
swap entry and another cpu allocates and caches the same entry before
the lookup, we'll end up destroying another task's swap cache.
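
The window can be modelled in userspace. The following is only a toy
sketch, not kernel code: struct swap_slot, other_cpu_tries_alloc() and
the two free_*_order() helpers are hypothetical names, and the "other
cpu" is simulated by a plain function call placed where the
interleaving would happen.

```c
/* Toy model of one swap entry: a refcount plus the owner of its
 * swap-cache page (0 means "not cached"). Illustrative only. */
struct swap_slot {
    int refs;
    int cache_owner;
};

/* Simulates another cpu that reuses the slot as soon as it is free. */
void other_cpu_tries_alloc(struct swap_slot *s)
{
    if (s->refs == 0) {
        s->refs = 1;
        s->cache_owner = 2;     /* task 2 now caches this entry */
    }
}

/* Old order: drop the reference first, then look up the cache.
 * Returns the id of the task whose cache page gets destroyed. */
int free_old_order(struct swap_slot *s)
{
    int victim;

    s->refs--;                  /* swap_free() */
    other_cpu_tries_alloc(s);   /* the race window */
    victim = s->cache_owner;    /* lookup_swap_cache() */
    s->cache_owner = 0;         /* delete_from_swap_cache() */
    return victim;
}

/* New order: handle the cache while still holding the reference. */
int free_new_order(struct swap_slot *s)
{
    int victim;

    other_cpu_tries_alloc(s);   /* same interleaving, but refs != 0 */
    victim = s->cache_owner;
    s->cache_owner = 0;
    s->refs--;                  /* swap_free() last */
    return victim;
}
```

Starting from a slot { refs = 1, cache_owner = 1 }, the old order ends
up removing task 2's cache page, while the new order only ever removes
its own.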

The second is nastier.  shmem_nopage() uses the inode semaphore to
serialise access to shmem_getpage_locked() for paging in shared memory
segments.  Lookups in the page cache and in the shmem swap vector are
done to locate the entry.  _getpage_ can move entries from swap to
page cache under protection of the shmem's info->lock spinlock.

shmem_writepage() is locked via the page lock, and moves shmem pages
from the page cache to the swap cache under protection of the same
info->lock spinlock.

However, shmem_nopage() does not hold this spinlock while doing its
lookups in the page cache and swap vectors, so it can race with
writepage, with one cpu in the middle of moving the page out of the
page cache in writepage and the other cpu then failing to find the
entry either in the page cache or in the shm swap entry vector.

Feedback welcome.

Cheers, 
 Stephen


--- mm/shmem.c.~1~  Fri Mar 23 00:26:49 2001
+++ mm/shmem.c  Fri Mar 23 00:42:21 2001
@@ -121,13 +121,13 @@
if (!ptr->val)
continue;
entry = *ptr;
-   swap_free (entry);
*ptr = (swp_entry_t){0};
freed++;
-   if (!(page = lookup_swap_cache(entry)))
-   continue;
-   delete_from_swap_cache(page);
-   page_cache_release(page);
+   if ((page = lookup_swap_cache(entry)) != NULL) {
+   delete_from_swap_cache(page);
+   page_cache_release(page);   
+   }
+   swap_free (entry);
}
return freed;
 }
@@ -218,15 +218,24 @@
 }
 
 /*
- * Move the page from the page cache to the swap cache
+ * Move the page from the page cache to the swap cache.
+ *
+ * The page lock prevents multiple occurences of shmem_writepage at
+ * once.  We still need to guard against racing with
+ * shmem_getpage_locked().  
  */
 static int shmem_writepage(struct page * page)
 {
int error;
struct shmem_inode_info *info;
swp_entry_t *entry, swap;
+   struct inode *inode;
 
-   info = &page->mapping->host->u.shmem_i;
+   if (!PageLocked(page))
+   BUG();
+   
+   inode = page->mapping->host;
+   info = &inode->u.shmem_i;
swap = __get_swap_page(2);
if (!swap.val) {
set_page_dirty(page);
@@ -234,11 +243,11 @@
return -ENOMEM;
}
 
-   spin_lock(&info->lock);
-   shmem_recalc_inode(page->mapping->host);
entry = shmem_swp_entry(info, page->index);
if (IS_ERR(entry))  /* this had been allocted on page allocation */
BUG();
+   spin_lock(&info->lock);
+   shmem_recalc_inode(page->mapping->host);
error = -EAGAIN;
if (entry->val) {
__swap_free(swap, 2);
@@ -268,6 +277,10 @@
  * If we allocate a new one we do not mark it dirty. That's up to the
  * vm. If we swap it in we mark it dirty since we also free the swap
  * entry since a page cannot live in both the swap and page cache
+ *
+ * Called with the inode locked, so it cannot race with itself, but we
+ * still need to guard against racing with shm_writepage(), which might
+ * be trying to move the page to the swap cache as we run.
  */
 static struct page * shmem_getpage_locked(struct inode * inode, unsigned long idx)
 {
@@ -276,31 +289,57 @@
struct page * page;
swp_entry_t *entry;
 
-   page = find_lock_page(mapping, idx);;
+   info = &inode->u.shmem_i;
+
+repeat:
+   page = find_lock_page(mapping, idx);
if (page)
return page;
 
-   info = &inode->u.shmem_i;
entry = shmem_swp_entry (info, idx);
if (IS_ERR(entry))
return (void *)entry;
+
+   spin_lock (&info->lock);
+   
+   /* The shmem_swp_entry() call may have blocked, and
+* shmem_writepage may have been moving a page between the page
+* cache and swap cache.  We need to recheck the page cache
+* under the protection of the info->lock spinlock. */
+
+   page = find_lock_page(mapping, idx);
+   if (page) {
+   spin_unlock (&info->lock);
+   return page;
+   }
+   
if (entry->val) {
unsigned long flags;
 
/* Look it up and read it in.. */
page = 

Re: 2.4.2-ac21

2001-03-22 Thread Andrew Morton

Lawrence Walton wrote:
> 
> Hello all
> 2.4.2-ac21 seems to have a couple problems.
> ...
> 
> Mar 22 15:15:55 the-penguin kernel: NETDEV WATCHDOG: eth0: transmit timed out
> ...
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP] (prog-if 00 
>[Normal decode])

People have recently been changing VIA PCI bridge settings
to try to fix the file corruption thing.  There has been one
report that this change causes a 3c905C to go silly.

This looks like the same problem to me.

Arjan?




Re: 2.4.2-ac21

2001-03-22 Thread Alan Cox

> Hello all
> 2.4.2-ac21 seems to have a couple problems. First the fs was acting very
> strangely, while compiling; the compiler complained about being unable
> to find files and directory's that existed. I was able to cd to those
> directory's and see the files with ls, (I was recompiling ac20 at the
> time.). Second was every half a minute or so, I would get this message.


Ok the further VIA bitfiddling with the pci config is causing the problems
it seems. I'll back that out soon




Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Jean Tourrilhes  <[EMAIL PROTECTED]> wrote:
>
>   I agree that the IrDA stack is full of irq/locking bugs (there
>is a patch of mine waiting in Dag's mailbox), but this one is not a
>bug, it's a false positive.
>   The restore_flags(flags); will restore the state of the
>interrupt register before the cli happened, so will automatically
>reenable interrupts.

Look closer. The error report is a bit bogus, because it points out as
an error the "return" that is _correct_, not the "return" that is buggy.

Their checkers verify that all exits out of a function have the same
characteristics, and they found a case where one exit exits with
interrupts still disabled, while another one exits after having done a
"restore_flags()". 

So it looks like a real bug, it's just that the error is the _earlier_
return value, not the one pointed at.
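
The shape of such a bug can be sketched in userspace. This is a
hedged illustration, not the IrDA code in question: irqs_on and the
save_flags()/cli()/restore_flags() macros below are stand-ins for the
kernel primitives, and buggy_update()/fixed_update() are hypothetical
names.

```c
/* Userspace stand-ins for the kernel interrupt primitives. */
int irqs_on = 1;
#define save_flags(f)     ((f) = (unsigned long)irqs_on)
#define cli()             (irqs_on = 0)
#define restore_flags(f)  (irqs_on = (int)(f))

/* Two exits with different interrupt state: the early one is the bug,
 * even though a checker may flag the later, correct one. */
int buggy_update(int busy)
{
    unsigned long flags;

    save_flags(flags);
    cli();
    if (busy)
        return -1;              /* exits with interrupts still disabled */
    restore_flags(flags);
    return 0;                   /* correct exit */
}

/* Every exit restores the saved state. */
int fixed_update(int busy)
{
    unsigned long flags;
    int ret = 0;

    save_flags(flags);
    cli();
    if (busy)
        ret = -1;
    restore_flags(flags);
    return ret;
}
```

Calling buggy_update(1) leaves irqs_on at 0, while fixed_update(1)
returns the same error with irqs_on back at 1.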

Linus



Re: [PATCH] gcc-3.0 warnings

2001-03-22 Thread J . A . Magallon


On 03.23 Alan Cox wrote:
> > page_cache_release(page);
> > -out:
> 
> out:;
> 

Yes, a null statement can shut up the compiler. But what is the purpose of
a jump to the end instead of a return? Some optimization?

> does that trick
> 
> > -   default:
> > +   default:;
>

Same. I have not tested whether gcc-3 will complain about a switch that does
not cover all values (i.e., no default:). But the logical thing would be to
kill the default: completely. Mmmm, and will older compilers accept it with no
default:?

> 
> The aic7xxx change looks right too. Someone with the hardware handy needs to
> check that one though.
>

It works on my 7880.

> As to the asm - I'll apply it to -ac if you can verify the asm after changes
> goes happily through the older gcc/binutils (should do) and send me a nice
> clean diff of just those changes
> 

Is there an unwritten standard for coding those asm statements?
For example:
"  adcl 12(%1), %0\n"
"1:adcl 16(%1), %0\n"
"  lea 4(%1), %1\n"

or

"adcl 12(%1), %0\n\t"
"1:  adcl 16(%1), %0\n\t"
"lea 4(%1), %1\n\t"

-- 
J.A. Magallon  #  Let the source
mailto:[EMAIL PROTECTED]  #  be with you, Luke... 

Linux werewolf 2.4.2-ac21 #5 SMP Thu Mar 22 23:47:26 CET 2001 i686




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Martin Dalecki

Rik van Riel wrote:
> 
> On Sat, 23 Mar 2002, Martin Dalecki wrote:
> 
> > Uptime of a process is a much better mesaure for a killing
> > candidate then it's size.
> 
> You'll have fun with your root shell, then  ;)

You mean the remote one? 

> The current OOM code takes things like uptime, used cpu, size
> and a bunch of other things into account.
> 
> If it turns out that the code is not attaching a proper weight
> to some of these factors, you should be sending patches, not
> flames.

Did I say anything insulting? I have just stated what I think
is more important... BTW: it's not quite obvious that
you have to look into oom_kill to find it in the kernel
source. (Yes, I did just run find /usr/src/linux -name
"oom*"
because I happened to remember, but still!)

OK, I will just place a - in front of the description lines where I think
you were misled:



 * Good in this context means that:
 * 1) we lose the minimum amount of work done
-* 2) we recover a large amount of memory
 * 3) we don't kill anything innocent of eating tons of memory
-* 4) we want to kill the minimum amount of processes (one)
 * 5) we try to kill the process the user expects us to kill, this
 *algorithm has been meticulously tuned to meet the priniciple
 *of least surprise ... (be careful when you change it)

The following is a wrong assumption. You usually nice processes into
the background just to guarantee, for example, smooth interaction
in case you want to log in to the machine at some point.

For example, let's have a dedicated http server which does a lot of
embedded perl.
It's quite clever to renice it back, just in case this
remote machine gets overloaded, because otherwise your chances
of getting a login in case the machine starts to thrash
would be much worse. But this doesn't mean that the
process isn't more important, because you do it to make the
machine crawl through high load peaks and still let you in in
case you have something urgent to do on it.

/*
 * Niced processes are most likely less important, so double
 * their badness points.
 */
if (p->nice > 0)
points *= 2;

BTW: Why the hell don't you just use a polynomial approximation for
int_sqrt? The range of values is very limited and you are
working in a finite ring anyway, so you could very easily find
a simple approximation which wouldn't need any looping.

This should be reversted:

points /= int_sqrt(cpu_time);
points /= int_sqrt(int_sqrt(run_time));
points = p->mm->total_vm;

/*
 * CPU time is in seconds and run time is in minutes. There is no
 * particular reason for this other than that it turned out to work
 * very well in practice. This is not safe against jiffie wraps
 * but we don't care _that_ much...
 */
cpu_time = (p->times.tms_utime + p->times.tms_stime) >> (SHIFT_HZ + 3);
run_time = (jiffies - p->start_time) >> (SHIFT_HZ + 10);

points /= int_sqrt(cpu_time);
points /= int_sqrt(int_sqrt(run_time));


==

NOW I SEE THE MOST IMPORTANT MISTAKE:

There should be a normalization of the units

CPU_time/total_uptime
RUN_time/total_uptime
mem/total_mem.

Otherwise you can't map the intended logic safely enough
onto the calculation you do. You compare bits with seconds, which is
WRONG.

Let:
 m := memory used by the process
 M := the total memory in the system
 c := cpu time used by the process
 u := uptime of the process
 U := uptime of the system

Then you calculate points
as 

(m / sqrt(c)) / sqrt(sqrt(u))

Which is just a very weird function with non-homogeneous behaviour.
(Just take the first derivative of it in any dimension to see what I
mean)


To represent your intended logic you should calculate:

 x * (m / M) + y * (U / c) + z * (U / u),

where x, y, z are constants representing the heuristic weight
one gives to each of those particular measures.

A simple *normalized* polynomial is the only thing people and computers can
really deal with.
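
As a rough userspace sketch of the weighted sum above (not a patch
against mm/oom_kill.c): struct proc_stats, struct weights,
normalized_badness() and the 1.0 floor used to avoid dividing by zero
are all assumptions made for illustration, and the weight values are
left to the caller.

```c
/* Per-process inputs, named after the definitions in the mail. */
struct proc_stats {
    double m;   /* memory used by the process */
    double c;   /* cpu time used by the process */
    double u;   /* uptime of the process */
};

/* The constants x, y, z; concrete values are up to the caller. */
struct weights {
    double x, y, z;
};

/* Assumed guard: a freshly started process has c == 0 or u == 0. */
static double at_least(double v, double floor)
{
    return v < floor ? floor : v;
}

/* x * (m / M) + y * (U / c) + z * (U / u); higher means a better
 * candidate for killing. M is total memory, U is system uptime. */
double normalized_badness(const struct proc_stats *p, double M, double U,
                          const struct weights *w)
{
    return w->x * (p->m / M)
         + w->y * (U / at_least(p->c, 1.0))
         + w->z * (U / at_least(p->u, 1.0));
}
```

With M = 1024, U = 1000 and all weights 1, a process with m = 512,
c = 100, u = 500 scores 0.5 + 10 + 2 = 12.5. Note that the two time
terms are not bounded the way m / M is, so the weights have to
compensate for that.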

> (the code is full of comments, so it should be easy enough to
> find your way around the code and tweak it until it does the
> right thing in a number of test cases)
> 
> regards,
> 
> Rik
> --
> Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml
> 
> Virtual memory is like a game you can't win;
> However, without VM there's truly nothing to lose...
> 
> http://www.surriel.com/
> http://www.conectiva.com/   http://distro.conectiva.com/

-- 
- phone: +49 214 8656 283
- job:   eVision-Ventures AG, LEV .de (MY OPINIONS ARE MY OWN!)
- langs: de_DE.ISO8859-1, en_US, pl_PL.ISO8859-2, last ressort:
ru_RU.KOI8-R



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Martin Dalecki

Stephen Clouse wrote:
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On Sat, Mar 23, 2002 at 01:33:50AM +0100, Martin Dalecki wrote:
> > AMEN! TO THIS!
> > Uptime of a process is a much better mesaure for a killing candidate
> > then it's size.
> 
> Thing is, if you take a good study of mm/oom_kill.c, it *does* take start time

I did. The thing is, Rik used a non-normalized formula in oom_kill for
calculating the kill penalty a process gets. This is the main
reason for its uncontrollable behaviour.

> into account, as well as CPU time.  The problem is that a process (like Oracle,
> in our case) using ludicrous amounts of memory can still rank at the top of the
> list, even with the time-based reduction factors, because total VM is the
> starting number in the equation for determining what to kill.  Oracle or what
> not sitting at 80 MB for a day or two will still find a way to outrank the
> newly-started 1 MB shell process whose malloc triggered oom_kill in the first
> place.

This is due to the broken calculation formula in oom_kill().

> 
> If anything, time really needs to be a hard criterion for sorting the final list
> on and not merely a variable in the equation and thus tied to vmsize.
> 
> This is why the production database boxen aren't running 2.4 yet.  I can control
> Oracle's usage very finely (since it uses a fixed memory pool preallocated at
> startup), but if something else decides to fire up on there (like the nightly
> backup and maintenance routine) and decides it needs just a pinch more memory
> than what's available -- ick.  2.2.x doesn't appear to enforce new memory
> allocation with a sniper rifle -- the new process just suffers a pleasant ("Out
> of memory!") or violent (SIGSEGV) death.

And you should never ever overcommit memory to oracle! Don't make the
buffers bigger than half the memory in the system, really. There ARE
circumstances where oracle is using all available memory in a very random
manner.



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Stephen Clouse

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Sat, Mar 23, 2002 at 01:33:50AM +0100, Martin Dalecki wrote:
> AMEN! TO THIS!
> Uptime of a process is a much better mesaure for a killing candidate
> then it's size.

Thing is, if you take a good study of mm/oom_kill.c, it *does* take start time
into account, as well as CPU time.  The problem is that a process (like Oracle,
in our case) using ludicrous amounts of memory can still rank at the top of the 
list, even with the time-based reduction factors, because total VM is the
starting number in the equation for determining what to kill.  Oracle or what
not sitting at 80 MB for a day or two will still find a way to outrank the
newly-started 1 MB shell process whose malloc triggered oom_kill in the first
place.

If anything, time really needs to be a hard criterion for sorting the final list
on and not merely a variable in the equation and thus tied to vmsize.

This is why the production database boxen aren't running 2.4 yet.  I can control
Oracle's usage very finely (since it uses a fixed memory pool preallocated at
startup), but if something else decides to fire up on there (like the nightly
backup and maintenance routine) and decides it needs just a pinch more memory
than what's available -- ick.  2.2.x doesn't appear to enforce new memory 
allocation with a sniper rifle -- the new process just suffers a pleasant ("Out
of memory!") or violent (SIGSEGV) death.

- -- 
Stephen Clouse <[EMAIL PROTECTED]>
Senior Programmer, IQ Coordinator Project Lead
The IQ Group, Inc. 

-BEGIN PGP SIGNATURE-
Version: PGP 6.5.8

iQA/AwUBOrqW3wOGqGs0PadnEQLZUwCfWTr8HwAChQamWWvWWzZcX5DZ8PAAnROB
Ja25OAQu3W1h7Ck0SU/TfKj8
=VlQt
-END PGP SIGNATURE-



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Mikael Pettersson

On Thu, 22 Mar 2001 23:43:57 + (GMT), Alan Cox wrote:

> > >How do you return an out of memory error to a C program that is out of memory
> > >due to a stack growth fault. There is actually not a language construct for it
> > SIGSEGV.
> > Stack overflow for a language like C using standard implementation techniques
> > is the same as a page fault while accessing a page for which there is no backing
> > store. SIGSEGV is the logical choice, and the one I'd expect on other Unices.
> 
> Guess again. You are expanding the stack because you have no room left on it.
> You take a fault. You want to report a SIGSEGV. Now where are you
> going to put the stack frame ?
> 
> SIGSEGV in combination with a preallocated alternate stack maybe

Oh I know 99% of the processes getting this will die. The behaviour I'd
expect from vanilla code in this particular case (stack overflow) is:
- page fault in stack "segment"
- no backing store available
- post SIGSEGV to current
  * push sighandler frame on current stack (or altstack, if registered) [+]
  * no room? SIG_DFL, i.e kill

My point is that with overcommit removed, there's no question as to
which process is actually out of memory. No need for the kernel to guess;
since it doesn't guess, it cannot guess wrong.

Concerning the stack: sure, oom makes it problematic to report the
error in a useful way. So use sigaltstack() and SA_ONSTACK. [+]
Processes that don't do this get killed, but not because oom_kill
did some fancy guesswork.

[+] Speaking as a hacker on a runtime system for a concurrent
programming language (Erlang), I consider the current Unix/POSIX/Linux
default of having the kernel throw up[*] at the user's current stack
pointer to be unbelievably broken. sigaltstack() and SA_ONSTACK should
not be options but required behaviour.

[*] Signal & trap frames used to be called "stack puke" in old 68k days.

/Mikael



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Andrew Morton

Mikael Pettersson wrote:
> 
> [+] Speaking as a hacker on a runtime system for a concurrent
> programming language (Erlang), I consider the current Unix/POSIX/Linux
> default of having the kernel throw up[*] at the user's current stack
> pointer to be unbelievably broken. sigaltstack() and SA_ONSTACK should
> not be options but required behaviour.
> 

Why?  What problem does stack puke cause?



Re: Linux 2.4.2-ac21

2001-03-22 Thread Sergey Kubushin

On Thu, 22 Mar 2001, Alan Cox wrote:

OK.

> > On Thu, 22 Mar 2001, Alan Cox wrote:
> >
> > Does not build for PPro/P-II. i586 is OK.
>
> You need to avoid enabling 64G support. The PAE stuff (as Linus said
> with
> 2.4.3pre6) is currently broken. Once Linus and co fix it I'll merge the
> fixed
> one

---
Sergey Kubushin Sr. Unix Administrator
CyberBills, Inc.Phone:  702-567-8857
874 American Pacific Dr,Fax:702-567-8808
Henderson, NV, 89014




Re: Linux 2.4.2ac22

2001-03-22 Thread Doug Ledford

Alan Cox wrote:

> o   Next incarnation of the i810 audio driver   (Doug Ledford)

Is this the i810 that's in Red Hat's CVS or the last copy of the big file that
I sent you?  If it's the last copy of the big file I sent you, then it has a
memory leak that needs fixed.  I committed the fix for the memory leak to the
CVS archive something like two days ago.  The patch is attached.

-- 

 Doug Ledford <[EMAIL PROTECTED]>  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems

--- linux/drivers/sound/i810_audio.c.save   Wed Mar 21 20:44:29 2001
+++ linux/drivers/sound/i810_audio.c        Wed Mar 21 20:44:34 2001
@@ -1820,12 +1820,11 @@
return -EBUSY;
}
stop_dac(state);
-   dealloc_dmabuf(state);
}
if(dmabuf->enable & ADC_RUNNING) {
stop_adc(state);
-   dealloc_dmabuf(state);
}
+   dealloc_dmabuf(state);
if (file->f_mode & FMODE_WRITE) {
state->card->free_pcm_channel(state->card, dmabuf->write_channel->num);
}



2.4.2-ac21

2001-03-22 Thread Lawrence Walton

Hello all
2.4.2-ac21 seems to have a couple problems. First the fs was acting very
strangely, while compiling; the compiler complained about being unable
to find files and directories that existed. I was able to cd to those
directories and see the files with ls (I was recompiling ac20 at the
time). Second, every half a minute or so, I would get this message.

Mar 22 15:15:55 the-penguin kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar 22 15:15:55 the-penguin kernel: eth0: transmit timed out, tx_status 00 status e000.
Mar 22 15:15:55 the-penguin kernel:   diagnostics: net 0cd8 media 8880 dma 00a0.
Mar 22 15:15:55 the-penguin kernel:   Flags; bus-master 1, dirty 16(0) current 32(0)
Mar 22 15:15:55 the-penguin kernel:   Transmit list 17a7e200 vs. d7a7e200.
Mar 22 15:15:55 the-penguin kernel:   0: @d7a7e200  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   1: @d7a7e240  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   2: @d7a7e280  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   3: @d7a7e2c0  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   4: @d7a7e300  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   5: @d7a7e340  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   6: @d7a7e380  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   7: @d7a7e3c0  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   8: @d7a7e400  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   9: @d7a7e440  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   10: @d7a7e480  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   11: @d7a7e4c0  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   12: @d7a7e500  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   13: @d7a7e540  length 806e status 006e
Mar 22 15:15:55 the-penguin kernel:   14: @d7a7e580  length 806e status 806e
Mar 22 15:15:55 the-penguin kernel:   15: @d7a7e5c0  length 8063 status 8063
Mar 22 15:15:55 the-penguin kernel: eth0: Resetting the Tx ring pointer.

I rebooted back into 2.4.2-ac20 and everything was fine. 
No oops, or other kernel messages out of the ordinary.
I was pretty afraid of booting back into ac21.



Linux the-penguin 2.4.2-ac20 #2 Thu Mar 22 16:04:42 PST 2001 i686 unknown

Gnu C                  2.95.3
Gnu make               3.79.1
binutils               2.11.90.0.1
util-linux
util-linux             Note: /usr/bin/fdformat is obsolete and is no longer available.
util-linux             Please use /usr/bin/superformat instead (make sure you have the
util-linux             fdutils package installed first).  Also, there had been some
util-linux             major changes from version 4.x.  Please refer to the documentation.
util-linux
modutils               2.4.2
e2fsprogs              1.19
PPP                    2.4.0
Linux C Library        2.2.2
Dynamic linker (ldd)   2.2.2
Procps                 2.0.7
Net-tools              1.59
Console-tools          0.2.3
Sh-utils               2.0.11
Modules Loaded         nls_cp437 nls_iso8859-1 smbfs mga ipx agpgart emu10k1 soundcore 3c59x

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_3DNOW=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_TOSHIBA is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_SMP is not set
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set

Re: [PATCH] gcc-3.0 warnings

2001-03-22 Thread Alan Cox

>   page_cache_release(page);
> -out:

out:;

does that trick

> - default:
> + default:;

Agree - done

> --- linux-2.4.2-ac21/net/ipv4/icmp.c.orig Thu Mar 22 23:39:22 2001
> +++ linux-2.4.2-ac21/net/ipv4/icmp.c  Thu Mar 22 23:42:23 2001

Again out:;

>   goto error;
> - default:
> + default:;

Ok

The aic7xxx change looks right too. Someone with the hardware handy needs to
check that one though.

As to the asm - I'll apply it to -ac if you can verify the asm after changes
goes happily through the older gcc/binutils (should do) and send me a nice
clean diff of just those changes
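
A minimal userspace sketch of the "out:;" and "default:;" trick discussed above. In C a label has to be attached to a statement, so a label sitting directly before a closing brace labels nothing, and an empty statement fixes that. The function names below are made up for illustration and do not come from the kernel tree.

```c
#include <stddef.h>

/* Hypothetical helper: counts the entries before the first NULL,
 * leaving through a label at the very end of the function. */
void count_until_null(const char *const *items, size_t n, size_t *count)
{
    size_t i;

    *count = 0;
    for (i = 0; i < n; i++) {
        if (items[i] == NULL)
            goto out;
        (*count)++;
    }
out:;   /* the empty statement gives the label something to label */
}

/* Hypothetical helper: the same trick for a switch that ends in default. */
int classify(int cmd)
{
    int result = -1;

    switch (cmd) {
    case 1:
        result = 0;
        break;
    default:;   /* empty statement after the last label in the block */
    }
    return result;
}
```

Replacing "goto out" with a plain "return", as the smbfs patch does, is the other way to get rid of the warning when the label guards no cleanup code.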



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[PATCH] gcc-3.0 warnings

2001-03-22 Thread J . A . Magallon

Hi, kernel list readers.

I have been building (and hopefully booting) ac-21 with the gcc-3.0 snapshot
dated 20010312. I have cleared 99% of the warnings that 3.0 issues
when building the kernel. Obviously, only in the main kernel part for
i386 and the drivers I use. I suppose other archs will require a similar
cleanup.

All are related to multiline strings in asm() statements, which seem to have
been deprecated, and out: or default: labels at the end of blocks. The patch
is inlined.
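
A small sketch of the string form the asm() changes rely on, assuming only standard C: adjacent string literals are joined by the compiler, so each assembler line can be its own literal ending in an explicit "\n\t". The template is held in an ordinary char array here rather than passed to asm(), so the example builds on any target; the names are illustrative only.

```c
#include <string.h>

/* Each source line is a separate literal; the compiler concatenates
 * them into one string, with the line breaks spelled out explicitly. */
static const char asm_template[] =
    "repnz; scasb\n\t"
    "jnz 1f\n\t"
    "dec %%edi\n"
    "1:";

/* Returns the number of assembler lines contained in the template. */
size_t asm_template_lines(void)
{
    size_t lines = 1;
    const char *p;

    for (p = asm_template; *p != '\0'; p++)
        if (*p == '\n')
            lines++;
    return lines;
}
```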

There are a couple more curious errors:
1) Is this a bug ?
make[1]: Entering directory `/usr/src/linux-2.4.2-ac21/arch/i386/kernel'
gcc ... -c -o setup.o setup.c
setup.c: In function `get_cpuinfo':
setup.c:2378: warning: unused variable `x86_udelay_tsc'
I have not patched this. Is it a remnant of previous work, or should it be used
for something and the use has been lost by mistake?

2)
gcc ... -c -o aic7xxx.o aic7xxx.c
aic7xxx.c: In function `ahc_print_scb':
aic7xxx.c:1335: warning: operation on `i' may be undefined
(nine times)
The piece of code is three reps of this:
printf(" %#02x %#02x %#02x %#02x\n",
hscb->shared_data.cdb[i++],
hscb->shared_data.cdb[i++],
hscb->shared_data.cdb[i++],
hscb->shared_data.cdb[i++]);
I suppose that gcc claims that the result is dependent on the evaluation order
of the args to printf(), so it is potentially dangerous. I just changed it
to
hscb->shared_data.cdb[ 1],
hscb->shared_data.cdb[ 2],
hscb->shared_data.cdb[ 3],
etc.
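
A hedged sketch of the same idea in standalone C. Modifying i several times inside one argument list is undefined, because the arguments are not sequenced relative to each other; indexing from a fixed base avoids the problem. The helper name and its parameters are hypothetical and are not taken from aic7xxx.c.

```c
#include <stdio.h>

/* Formats four consecutive bytes starting at cdb[start].
 * Using start + k instead of four i++ expressions means the output
 * no longer depends on the order in which arguments are evaluated. */
int format_cdb_row(char *buf, size_t len, const unsigned char *cdb, int start)
{
    return snprintf(buf, len, " %#02x %#02x %#02x %#02x\n",
                    (unsigned)cdb[start],
                    (unsigned)cdb[start + 1],
                    (unsigned)cdb[start + 2],
                    (unsigned)cdb[start + 3]);
}
```

The caller would then advance its own index by four after each row, which keeps the increment in a separate statement.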

If maintainers do not like the way I corrected this, at least it is a list
of things to look at.

BTW, after those changes, the kernel built and booted ok.

 patch-gcc-3

--- linux-2.4.2-ac21/fs/smbfs/cache.c.orig  Fri Mar 23 00:45:27 2001
+++ linux-2.4.2-ac21/fs/smbfs/cache.c   Fri Mar 23 00:46:04 2001
@@ -34,7 +34,7 @@
 
page = grab_cache_page(>i_data, 0);
if (!page)
-   goto out;
+   return;
 
if (!Page_Uptodate(page))
goto out_unlock;
@@ -47,7 +47,6 @@
 out_unlock:
UnlockPage(page);
page_cache_release(page);
-out:
 }
 
 /*
--- linux-2.4.2-ac21/fs/smbfs/ioctl.c.orig  Fri Mar 23 00:46:22 2001
+++ linux-2.4.2-ac21/fs/smbfs/ioctl.c   Fri Mar 23 00:46:56 2001
@@ -45,7 +45,7 @@
if (!copy_from_user(&opt, (void *)arg, sizeof(opt)))
result = smb_newconn(server, &opt);
break;
-   default:
+   default:;
}
 
return result;
--- linux-2.4.2-ac21/include/asm-i386/string.h.orig Thu Mar 22 23:17:03 2001
+++ linux-2.4.2-ac21/include/asm-i386/string.h  Thu Mar 22 23:20:40 2001
@@ -516,12 +516,12 @@
 {
if (!size)
return addr;
-   __asm__("repnz; scasb
-   jnz 1f
-   dec %%edi
-1: "
-   : "=D" (addr), "=c" (size)
-   : "0" (addr), "1" (size), "a" (c));
+   __asm__("repnz; scasb\n\t"
+   "   jnz 1f\n\t"
+   "   dec %%edi\n\t"
+   "1:"
+   : "=D" (addr), "=c" (size)
+   : "0" (addr), "1" (size), "a" (c));
return addr;
 }
 
--- linux-2.4.2-ac21/include/asm-i386/system.h.orig Thu Mar 22 23:20:50 2001
+++ linux-2.4.2-ac21/include/asm-i386/system.h  Thu Mar 22 23:21:47 2001
@@ -145,10 +145,10 @@
unsigned int low, unsigned int high)
 {
 __asm__ __volatile__ (
-   "1: movl (%0), %%eax;
-   movl 4(%0), %%edx;
-   cmpxchg8b (%0);
-   jnz 1b"
+   "1: movl (%0), %%eax;\n\t"
+   "movl 4(%0), %%edx;\n\t"
+   "cmpxchg8b (%0);\n\t"
+   "jnz 1b"
::  "D"(ptr),
"b"(low),
"c"(high)
--- linux-2.4.2-ac21/include/asm-i386/checksum.h.orig   Thu Mar 22 23:21:58 2001
+++ linux-2.4.2-ac21/include/asm-i386/checksum.h        Thu Mar 22 23:25:19 2001
@@ -69,25 +69,24 @@
  unsigned int ihl) {
unsigned int sum;
 
-   __asm__ __volatile__("
-   movl (%1), %0
-   subl $4, %2
-   jbe 2f
-   addl 4(%1), %0
-   adcl 8(%1), %0
-   adcl 12(%1), %0
-1: adcl 16(%1), %0
-   lea 4(%1), %1
-   decl %2
-   jne 1b
-   adcl $0, %0
-   movl %0, %2
-   shrl $16, %0
-   addw %w2, %w0
-   adcl $0, %0
-   notl %0
-2:
-   "
+   __asm__ __volatile__(
+"  movl (%1), %0\n"
+"  subl $4, %2\n"
+"  jbe 2f\n"
+"  addl 4(%1), %0\n"
+"  adcl 8(%1), %0\n"
+"  adcl 12(%1), %0\n"
+"1:adcl 16(%1), %0\n"
+"  lea 4(%1), %1\n"
+"  decl %2\n"
+"  jne 1b\n"
+"  adcl $0, %0\n"
+"  movl %0, %2\n"
+"  shrl $16, %0\n"
+"  addw %w2, %w0\n"
+"  adcl $0, %0\n"
+"  notl %0\n"
+"2:"

CML2 version 0.9.5 is available.

2001-03-22 Thread Eric S. Raymond

The latest version is always available at http://www.tuxedo.org/~esr/cml2/

Release 0.9.5: Thu Mar 22 18:21:12 EST 2001
* Put Python version guard up front so user won't see a stack
  trace from bad imports.
* Follow through on representing numbers as numbers internally.

My most persistent bug finder, Giacomo Catenazzi, reported no bugs in 0.9.4, 
but I found some.  The conversion of the internals to use numbers for
numbers rather than strings was incomplete.

It's very likely that the next CML2 release, just in time for the 2.5
kickoff workshop, will be 1.0.0.  I'm assuming kernel version 2.4.3 
will issue sometime before that and will resync the rules files with it.
-- 
Eric S. Raymond, http://www.tuxedo.org/~esr/

Love your country, but never trust its government.
-- Robert A. Heinlein.



Re: kernel_thread vs. zombie

2001-03-22 Thread Andrew Morton

Benjamin Herrenschmidt wrote:
> 
> >daemonize() makes calls that are all protected with the
> >big kernel lock in do_exit(). All usages of daemonize have
> >the big kernel lock held. So I guess it just needs it.
> >
> >Please let me know whether you have success if it makes
> >a difference with having it held.
> 
> With a few more experiments, I have this behaviour:
> 
> (I hold the kernel lock, daemonize(), and release the kernel lock, then do
> my probe thing which takes a few seconds, and let the thread die by itself)
> 
>  - When started during boot (low PID (9)) It becomes a zombie
>  - When started from a process that quits after sending the ioctl,
>it is correctly "garbage collected".
>  - When started from a process that stays around, it becomes a zombie too
> 
> So something is not working, or I'm missing something obvious, or whatever...
> 
> Any clue ?

Take a look at kernel/kmod.c:call_usermodehelper().  Copy it.

This will make your thread a child of keventd.  This takes
care of things like chrootedness, uids, cwds, signal masks,
reaping children, open files, and all the other crud which
you can accidentally inherit from your caller.

something like:

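struct my_struct
{
	struct tq_struct tq;		/* task queue entry handed to keventd */
	int (*function)(void *);	/* thread body; kernel_thread() wants int (*)(void *) */
	struct semaphore sem;
};

/* keventd runs this */
void helper(void *data)
{
	struct my_struct *my_ptr = data;

	/* CLONE_FLAGS stands for whatever clone flags you need */
	kernel_thread(my_ptr->function, my_ptr, CLONE_FLAGS|SIGCHLD);
}

void start_thread(struct my_struct *my_ptr)
{
	my_ptr->tq.sync = 0;
	INIT_LIST_HEAD(&my_ptr->tq.list);
	my_ptr->tq.routine = helper;	/* routine and data live in the tq_struct */
	my_ptr->tq.data = my_ptr;
	schedule_task(&my_ptr->tq);
}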



Re: serial driver question

2001-03-22 Thread Thorsten Kranzkowski

On Thu, Mar 22, 2001 at 06:07:52PM -0500, John Covici wrote:
> I have been wondering about the serial drivers shared irq
> configuration parameter.  Will it allow two dumb serial ports which
> know nothing about sharing irq's to actually share the same irq, or

no.

> does the actual hardware have to support some kind of irq sharing for
> this to work?

yes.

For one, there are multiport serial cards that combine every port's irq
into one.

PCI cards should also be able to share their irqs (with other PCI cards).

But you can't share an ISA irq with a PCI one.

And you can't share one ISA card's irq with another ISA one nor the irqs of
a dumb ISA multiport card either ...
... without some modification, that is ;-)

Imagine the situation on a typical ISA card:

   Card  ISA Mainboard
  connector

---+  "  | to other ISA connectors
   |   Jumper "  |
  IC   |-+--*=*---"--+
   | |"  |
---+ +--* *-- "  | to IRQ-Controller-IC
 |"
 +--* *--


Now replace the jumper for each irq sharing device with a diode and add one
resistor to the irq line:

---+Diode "  | to other ISA connectors
   |   e.g. 1n4148"  |
  IC   |-+--*=*--|>|---+--"--+
   | |  A   K  |  "  |
---+ +--* *--- |  "  | to IRQ-Controller-IC
 | #  "
 +--* *--- # Resistor 20kOhm
   #
   |
 __|__ GND

So for 4 serial ports sharing a single irq line you would use 4 diodes and
1 resistor.


> I did try two ports on the same irq, but one of them isn't seen at all
> by Linux, so I am quite curious whether I am barking up the wrong
> line?

It should be seen. You won't be able to use them effectively (they'll be
transmitting only about 16 bytes every 30 seconds or something) but they
should definitely both be detected.
Did you use setserial to convince the kernel of their presence?

> 
> Thanks.
> 

Bye,
Thorsten

-- 
| Thorsten KranzkowskiInternet: [EMAIL PROTECTED]|
| Mobile: ++49 170 1876134   Snail: Niemannsweg 30, 49201 Dissen, Germany |
| Ampr: dl8bcu@db0lj.#rpl.deu.eu, [EMAIL PROTECTED] [44.130.8.19] |



Linux 2.4.2ac22

2001-03-22 Thread Alan Cox


ftp://ftp.kernel.org/pub/linux/kernel/people/alan/2.4/

Intermediate diffs are available from

http://www.bzimage.org

(Note that the cmsfs port to 2.4 is a work in progress)

2.4.2-ac22
o   Fix dereference after free in megaraid driver   (me)
o   Fix crash if we run out of memory during a link (me)
follow [found by Stanford tools]
o   Fix crash if we run out of memory during
block_truncate_page [found by Stanford tools]   (me)
o   Update Alpha to pre6 style pte/pmd_alloc(Ivan Kokshaysky)
o   Fix ppp memory corruption   (Kevin Buhr)
| Bizarrely enough a direct re-invention of a 1.2 ppp bug
o   Fix heavy stack usage in tty_foo_devfs()(Jeff Dike)
o   Make alloc_tty_struct always use kmalloc(Andrew Morton)
o   Document task struct locking rules  (Andrew Morton)
o   Document SAK properly   (Andrew Morton)
o   Fix SAK deadlocks   (Andrew Morton)
o   Fix inline/type order for picky compiler tools  (Dave Jones)
o   Fix printk levels for various fs printks that   (Andrey Panin)
lacked them
o   Next incarnation of the i810 audio driver   (Doug Ledford)
o   Add __init stuff to 3c515 driver(Andrzej Krzysztofowicz)
o   Add __init stuff to ppp layer   (Andrzej Krzysztofowicz)
o   Remove duplicate NF_TARGET_TCPMSS config text   (Steven Cole)
o   Fix missing unlock_kernel in pcwd   (me)
| Found by Stanford tools
o   Fix missing unlock_kernels in es1371(me)
| Found by Stanford tools
o   Fix missing unlock_kernels in es1370(me)
| Found by Stanford tools
o   Fix missing unlock_kernels in esssolo1  (me)
| Found by Stanford tools
o   Fix missing unlock kernels in sonicvibes(me)
| Found by Stanford tools
o   Fix missing unlock kernels in fb mmap   (me)
| Found by Stanford tools
o   Fix missing unlock_super in UFS code(me)
| Found by Stanford tools


2.4.2-ac21
o   Merge with Linus 2.4.3pre6
o   Close last known reiserfs tail bug  (Chris Mason)
o   Fix link order bug with iso8859_8 and cp1255(Dan Aloni)
o   Generate generic CPU namings for 386/486(Cesar Eduardo Barros)
o   First set of ISDN fixes from Stanford code  (Kai Germaschewski)
analyser
o   Allow up to 16 parallel ports by default(Tim Waugh)
o   Use long delays on low speed usb hub ports  (Pete Zaitcev)
o   Update credits for assorted Australians (Stephen Rothwell)
o   Fix ali_restore_regs thinko (Pavel Roskin)
o   Fix whiteheat usb driver bugs   (Greg Kroah-Hartman)
o   Fix kfree in belkin_sa  (Greg Kroah-Hartman)
o   Fix omninet copy*user bug   (Greg Kroah-Hartman)
o   Fix modular atyfb   (Geert Uytterhoeven)
o   Update joystick and input drivers   (Vojtech Pavlik)
o   Relax checksum enforcement on ISAPnP CSN(Gunther Mayer)
o   Resync ids/comments with ISDN cvs   (Kai Germaschewski)
o   Update Harald Hoyer Credits entry   (Harald Hoyer)
o   Fix off by 2* mtrr handling bug (David Wragg)
o   Fix irda hang on boot   (Dag Brattli)
o   FB device init updates  (Geert Uytterhoeven)
o   Add it8712 misp eval board support  (P. Popov)
o   Update NEC DDB5476 eval board support   (Jun Sun)
o   Update NEC DDB5074 eval board support   (Ralf Baechle)
o   Add Karsten Merker and Michael Engel to credits (Ralf Baechle)
o   Update Baget port   (Vladimir Roganov,
 Gleb Raiko)
o   Add LVM ioctls to sparc64 ioctl32 convertor (Patrick Caulfield)
o   Powerpc updates for openfirmware mm, python etc (Cort Dougan)
o   Add the casio qv digitalcamera to the usb
unusual devices list(Harald Schreiber)
o   atyfb mode updates for powermac (Olaf Hering)
o   Fix khubd locking   (Pete Zaitcev)
o   More on the great aic7xxx libdb game(Nathan Dabney)
o   Further console handling updates(Andrew Morton)
o   Fix i2o build problem when half modular (Michael Mueller)
o   Fix off by one in prink  check (Mitchell Blank Jr)
o   Fix do_swap_page hang   (Linus Torvalds)

2.4.2-ac20
o   Add support for the GoHubs GO-COM232(Greg Kroah-Hartman)
o   Remove cobalt remnants  (Ralf 

Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Jean Tourrilhes

On Thu, Mar 22, 2001 at 03:49:31PM -0800, Junfeng Yang wrote:
> 
> Sometimes the line number reported by the checker is not correct.
> But if you go into the function, you can find the bug.

Gotcha. It in fact indicates the error at the end of the
function instead of the place where the error is. Very confusing.
So, mea culpa, I was wrong...

Jean



Re: [linux-usb-devel] Re: USB oops Linux 2.4.2ac6

2001-03-22 Thread David Brownell

> I found the problem.
> CONFIG_DEBUG_SLAB "Debug memory allocation"
> in the 2.4.2-ac series doesn't work with USB.
> 
> 2.4.2-ac5 just booted and found the mouse correctly.
> On to ac-21 now...

I just glanced at Alan's change list, it didn't have patches
that seemed to cover that (vs ac20).

You might see what sort of luck you have with the patches
I posted to linux-usb-devel earlier today.  At least both
usb-ohci and usb-uhci enumerated even after configuring
in slab debugging ... but there are bugs yet to be found.
Maybe it deserves a CONFIG_DEBUG_PCI_POOL to
decouple autopoisoning from CONFIG_DEBUG_SLAB.


> Did David Brownell's patch to disable OHCI loading
> on the AMD-756 make it into the source trees?

It's been sent to Linus.  Unless/until someone learns the
vendor fix and implements it, it seems to be the best way
to prevent the 756-specific USB problems (happening
most with lowspeed devices like mice).

- Dave



___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
http://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Jean Tourrilhes

Junfeng Yang wrote :
> Hi,
> 
> Here are yet more results from the MC project.  This checker looks for
> inconsistent usage of interrupt functions.
[...]
> -
> [BUG] error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/net/irda/irport.c:943:irport_net_ioctl: 
>ERROR:INTR:947:997: Interrupts inconsistent, severity `20':997
> 
> /* Disable interrupts & save flags */
> save_flags(flags);
> Start --->
> cli();
> 
> switch (cmd) {
> case SIOCSBANDWIDTH: /* Set bandwidth */
> if (!capable(CAP_NET_ADMIN))
> return -EPERM;
> irda_task_execute(self, __irport_change_speed, NULL, NULL,
> 
> ... DELETED 40 lines ...
> 
> }
> 
> restore_flags(flags);
> 
> Error --->
> return ret;
> }
> 
> static struct net_device_stats *irport_net_get_stats(struct net_device *dev)
> {
> -

I agree that the IrDA stack is full of irq/locking bugs (there
is a patch of mine waiting in Dag's mailbox), but this one is not a
bug, it's a false positive.
The restore_flags(flags); will restore the state of the
interrupt register before the cli happened, so it will automatically
reenable interrupts (if they were enabled on entry). The exact same code
was used all over the kernel before spinlocks were introduced.

So, if you see :
save_flags(flags);
cli();
...
restore_flags(flags);
It's correct (but a bit outdated).
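
A toy userspace model of that pattern, only to show why the sequence is balanced. This is NOT the kernel implementation: a single global variable stands in for the CPU's interrupt-enable flag, and every model_* name is hypothetical.

```c
/* One global bit stands in for the processor's interrupt-enable flag. */
static int irq_enabled = 1;

static void model_save_flags(unsigned long *flags)
{
    *flags = (unsigned long)irq_enabled;   /* remember the current state */
}

static void model_cli(void)
{
    irq_enabled = 0;                       /* mask interrupts */
}

static void model_restore_flags(unsigned long flags)
{
    irq_enabled = (int)flags;              /* put back whatever was saved */
}

/* Runs a critical section and returns the interrupt state afterwards. */
int critical_section(int start_enabled)
{
    unsigned long flags;

    irq_enabled = start_enabled;
    model_save_flags(&flags);
    model_cli();
    /* ... work done with interrupts masked ... */
    model_restore_flags(flags);
    return irq_enabled;
}
```

The state after the section always equals the state before it, so no explicit sti() is needed on the exit path, and the same code stays correct when it is entered with interrupts already disabled.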


> -
> [BUG] error path. this bug is interesting
> 
> 
>/u2/acc/oses/linux/2.4.1/drivers/net/pcmcia/wavelan_cs.c:2561:wavelan_get_wireless_stats:
> ERROR:INTR:2528:2561: Interrupts inconsistent, severity `20':2561
> 
> 
>   /* Disable interrupts & save flags */
> Start --->
>   spin_lock_irqsave (&lp->lock, flags);
> 
>   if(lp == (net_local *) NULL)
> return (iw_stats *) NULL;
>   wstats = &lp->wstats;
> 
>   /* Get data from the mmc */
> 
> ... DELETED 23 lines ...
> 
> 
> #ifdef DEBUG_IOCTL_TRACE
>   printk(KERN_DEBUG "%s: <-wavelan_get_wireless_stats()\n", dev->name);
> #endif
> Error --->
>   return &lp->wstats;
> }
> #endif  /* WIRELESS_EXT */
> 
> -

Didn't look into 2.4.1, but in 2.4.2 the irq_restore is just
above the printk, in the part that is "DELETED". It even has a nice
comment to that effect. Check the code yourself.
So, I guess it's another false positive and a bug in your
parser. That's why it's so "interesting" ;-)

Good luck...

Jean



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Mikael Pettersson

On Thu, 22 Mar 2001 21:23:54 +0000 (GMT), Alan Cox wrote:

>> Really the whole oom_kill process seems bass-ackwards to me.  I can't in my mind
>> logically justify annihilating large-VM processes that have been running for 
>> days or weeks instead of just returning ENOMEM to a process that just started 
>> up.
>
>How do you return an out of memory error to a C program that is out of memory
>due to a stack growth fault. There is actually not a language construct for it

SIGSEGV.
Stack overflow for a language like C using standard implementation techniques
is the same as a page fault while accessing a page for which there is no backing
store. SIGSEGV is the logical choice, and the one I'd expect on other Unices.

oom_kill should simply fail the current allocation which cannot be satisfied,
either by having {s,}brk/mmap return error or by posting a SIGSEGV. This would
actually also be the correct answer, if Linux didn't overcommit memory ...

Remove the overcommit crap and oom_kill can go away; this entails ensuring
that mmap() honors MAP_RESERVE/MAP_NORESERVE.

/Mikael



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Rik van Riel

On Fri, 23 Mar 2001, Guest section DW wrote:
> On Thu, Mar 22, 2001 at 10:52:09PM +, Alan Cox wrote:
>
> > You can do overcommit avoidance in Linux if you are bored enough to try it.
>
> Would you accept it as the default? Would Linus?

It wouldn't help.  Suppose you run without overcommit and you
fill up RAM and swap to the last page.

Then you change the size of one of the windows on your desktop
and a program gets sent -SIGWINCH. In order to process this
signal, the program needs to allocate some variables on its
stack, possibly needing a new page to be allocated for its
stack ...

... and since this is something which could happen to any program
on the system, the result of non-overcommit would be getting a
random process killed (though not completely random, syslogd and
klogd would get killed more often than the others).

The only solution to not getting processes killed is to run with
enough memory and swap space; having an OOM killer which takes care
*NOT* to let any random innocent process get killed is nothing but
a bonus, IMHO.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/




RE: kernel_thread vs. zombie

2001-03-22 Thread Benjamin Herrenschmidt

>daemonize() makes calls that are all protected with the
>big kernel lock in do_exit(). All usages of daemonize have
>the big kernel lock held. So I guess it just needs it.
>
>Please let me know whether you have success if it makes
>a difference with having it held.

With a few more experiments, I have this behaviour:

(I hold the kernel lock, daemonize(), and release the kernel lock, then do
my probe thing which takes a few seconds, and let the thread die by itself)

 - When started during boot (low PID (9)) It becomes a zombie
 - When started from a process that quits after sending the ioctl,
   it is correctly "garbage collected".
 - When started from a process that stays around, it becomes a zombie too

So something is not working, or I'm missing something obvious, or whatever...

Any clue ?

Ben.







Re: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Alan Cox

>   thanks, i just tested the "notsc" option (.config has CONFIG_X86_TSC
> enabled=y, but CONFIG_M586TSC is not enabled.. if that's ok), but this time
...
> boot and stay on battery power exclusively.  did anyone else expect this
> behaviour?  

Errmm no.. 




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Alan Cox

> > Even if malloc fails the situation is no different.
> Why do you say so?

Because you will fail on other things - stack overflow, signal delivery,
eventually it will get you. You just cut the odds down. 

> > You can do overcommit avoidance in Linux if you are bored enough to try it.
> 
> Would you accept it as the default? Would Linus?

I'd like to have it there as an option. As to the default - You would have to
see how much applications assume they can overcommit and rely on it. You might
find you need a few Gbytes of swap just to boot




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Guest section DW

On Thu, Mar 22, 2001 at 10:52:09PM +, Alan Cox wrote:

> > You see, the bug is that malloc does not fail. This means that the
> > decisions about what to do are not taken by the program that knows
> > what it is doing, but by the kernel.

> Even if malloc fails the situation is no different.

Why do you say so?

> You can do overcommit avoidance in Linux if you are bored enough to try it.

Would you accept it as the default? Would Linus?

(With disk I/O we are terribly conservative, using very cautious settings,
and many people use hdparm to double or triple their disk speed.
But for a few these optimistic settings cause data corruption,
so we do not make it the default.
Similarly I would be happy if the "no overcommit", "no OOM killer"
situation was the default. The people who need a reliable system
will leave it that way. The people who do not mind if some process
is killed once in a while use vmparm or /proc/vm/overcommit or so
to make Linux achieve more on average.)

Andries



Re: Linux 2.4.2-ac21

2001-03-22 Thread Sergey Kubushin

On Thu, 22 Mar 2001, Alan Cox wrote:

Does not build for PPro/P-II. i586 is OK.

=== Cut ===
ld -m elf_i386 -T /tmp/build-kernel/usr/src/linux-2.4.2ac21/arch/i386/vmlinux.lds -e 
stext arch/i386/kernel/head.o arch/i386/kernel/init_task.o init/main.o init/version.o \
--start-group \
arch/i386/kernel/kernel.o arch/i386/mm/mm.o kernel/kernel.o mm/mm.o fs/fs.o 
ipc/ipc.o \
drivers/block/block.o drivers/char/char.o drivers/misc/misc.o 
drivers/net/net.o drivers/media/media.o  drivers/char/drm/drm.o drivers/net/fc/fc.o 
drivers/net/appletalk/appletalk.o drivers/net/tokenring/tr.o drivers/net/wan/wan.o 
drivers/atm/atm.o drivers/cdrom/driver.o drivers/pci/driver.o drivers/video/video.o 
drivers/net/hamradio/hamradio.o drivers/md/mddev.o \
net/network.o \
/tmp/build-kernel/usr/src/linux-2.4.2ac21/arch/i386/lib/lib.a 
/tmp/build-kernel/usr/src/linux-2.4.2ac21/lib/lib.a 
/tmp/build-kernel/usr/src/linux-2.4.2ac21/arch/i386/lib/lib.a \
--end-group \
-o vmlinux
arch/i386/mm/mm.o: In function `do_check_pgt_cache':
arch/i386/mm/mm.o(.text+0x201): undefined reference to `get_pmd_slow'
mm/mm.o: In function `clear_page_tables':
mm/mm.o(.text+0x150): undefined reference to `pmd_free'
mm/mm.o: In function `__pmd_alloc':
mm/mm.o(.text+0x1fe4): undefined reference to `get_pmd_slow'
mm/mm.o(.text+0x207a): undefined reference to `pmd_free'
mm/mm.o(.text+0x208a): undefined reference to `pgd_populate'
make: *** [vmlinux] Error 1
=== Cut ===

Here is the config (processor part). Full config is available on request.
Everything's modular except romfs and initrd.

=== Cut ===
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_TOSHIBA=m
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
# CONFIG_HIGHMEM4G is not set
CONFIG_HIGHMEM64G=y
CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_HAVE_DEC_LOCK=y
=== Cut ===

---
Sergey Kubushin Sr. Unix Administrator
CyberBills, Inc.Phone:  702-567-8857
874 American Pacific Dr,Fax:702-567-8808
Henderson, NV, 89014




Re : 16 potential locking bugs in 2.4.1 (wavelan patch attached)

2001-03-22 Thread Jean Tourrilhes

Andy Chou :
> Here are some more results from the MC project. These are 16 errors found 
> in 2.4.1 related to inconsistent use of locks. As usual, if you can 
> verify any of these or show that they are false positives, please let us 
> know by CC'ing [EMAIL PROTECTED] 
> 
> -Andy Chou 
>
> -
> 
> [BUG] error condition 
> 
>/u2/acc/oses/linux/2.4.1/drivers/net/pcmcia/wavelan_cs.c:2561:wavelan_get_wireless_stats:
> ERROR:LOCK:2528:2561: Inconsistent 
> lock using `spin_lock':2528 
> 
> static iw_stats * 
> wavelan_get_wireless_stats(device * dev) 
> { 
>   ... 
> --> Lock 
>   spin_lock_irqsave (&lp->lock, flags); 
> 
>   if(lp == (net_local *) NULL) 
> --> Missing unlock? 
> return (iw_stats *) NULL; 
> 
> -

Thanks for the hint (actually, also thanks to LWN for
reporting this, I don't read the list).
At first, I felt offended to have such an obvious bug in my
driver, and then I checked the master copy of the driver in the Pcmcia
package that I maintain, and it doesn't contain this bug. So whoever
did the port from Pcmcia -> kernel introduced this one :-(
Patch attached. Have fun...

Jean



diff -u -p linux/drivers/net/pcmcia/wireless.24d/wavelan_cs.c 
linux/drivers/net/pcmcia/wavelan_cs.c
--- linux/drivers/net/pcmcia/wireless.24d/wavelan_cs.c  Thu Mar 22 15:08:46 2001
+++ linux/drivers/net/pcmcia/wavelan_cs.c   Thu Mar 22 15:10:25 2001
@@ -2524,11 +2524,13 @@ wavelan_get_wireless_stats(device * dev)
   printk(KERN_DEBUG "%s: ->wavelan_get_wireless_stats()\n", dev->name);
 #endif
 
+  /* Pure paranoia */
+  if(lp == (net_local *) NULL)
+return (iw_stats *) NULL;
+
   /* Disable interrupts & save flags */
   spin_lock_irqsave (&lp->lock, flags);
 
-  if(lp == (net_local *) NULL)
-return (iw_stats *) NULL;
   wstats = &lp->wstats;
 
   /* Get data from the mmc */
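
The shape of the fix above, sketched in standalone C under stated assumptions: a counter stands in for the spinlock, and all toy_* names are hypothetical. The point is only the ordering: reject the bad pointer before taking the lock, so that no return path leaves the lock held.

```c
#include <stddef.h>

struct toy_lock { int held; };                 /* stand-in for a spinlock */
struct toy_dev  { struct toy_lock lock; int stats; };

static void toy_lock_acquire(struct toy_lock *l) { l->held++; }
static void toy_lock_release(struct toy_lock *l) { l->held--; }

/* Returns a pointer to the stats, or NULL if there is no device. */
const int *toy_get_stats(struct toy_dev *dev)
{
    const int *result;

    if (dev == NULL)            /* check first: nothing is locked yet */
        return NULL;

    toy_lock_acquire(&dev->lock);
    result = &dev->stats;       /* work done under the lock */
    toy_lock_release(&dev->lock);

    return result;
}
```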



RE: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Woller, Thomas


> > I wonder if there is a way to modify mdelay to use a kernel timer if
> > interval > 10msec? I am not familiar with this section of the kernel, but I
> > do know that Microsoft's similar function KeStallExecutionProcessor is not
> > recommended for more than 50 *micro*seconds.
> 
> Basically the same kind of recommendation applies. But as with all rules its
> sometimes appropriate to break it

thanks, i just tested the "notsc" option (.config has CONFIG_X86_TSC
enabled=y, but CONFIG_M586TSC is not enabled.. if that's ok), but this time
I booted and kept the machine on battery power the ENTIRE time, i had not
tried this before.  the MHZ value Detected in time.c is 132Mhz (down from
500Mhz if not on battery power).  but the interesting thing that i just
noticed is that the mdelay() wait time, is STILL about 25% of what it should
delay.  i use 10000 (for a 10 second delay) and get only about 2-3 seconds
out of it.  this smaller delay occurs with or without "notsc" on the boot
line.  now, i did not expect this behaviour if i did not plug in to get more
CPU speed, with the calculated cpu rate when on battery power.  i expected
that mdelay() would function properly with the appropriate wait time if i
booted and stayed on battery power, at the same reduced CPU frequency.
Alan, you might have answered this in your first post but i don't understand the
INTEL speedstep logic well enough to know if this is expected behaviour.  but the
bottom line is that my delay of 700 milliseconds in the driver fails if i
boot and stay on battery power exclusively.  did anyone else expect this
behaviour?  
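
One possible reading of these numbers, offered as arithmetic only and not as a diagnosis. Assume (this is an assumption, not something established in the thread) that the busy-wait loop count is fixed when the delay loop is calibrated at one clock rate, and that the loop later executes at a different rate. The wall-clock delay then scales by calibrated rate over running rate. The function name is hypothetical.

```c
/* Hypothetical model: a delay loop calibrated at calib_mhz but executed
 * at run_mhz. The iteration count is fixed at calibration time, so the
 * real delay is requested_ms * calib_mhz / run_mhz. */
unsigned long effective_delay_ms(unsigned long requested_ms,
                                 unsigned long calib_mhz,
                                 unsigned long run_mhz)
{
    return requested_ms * calib_mhz / run_mhz;
}
```

With the figures quoted above, a 10000 ms request calibrated at 132 MHz and executed at 500 MHz gives 2640 ms, which is in the range of the 2-3 seconds and roughly 25% that were observed. That would only apply if the CPU were in fact running faster than the rate detected in time.c.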



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Doug Ledford

Alan Cox wrote:
> 
> > > How do you return an out of memory error to a C program that is out of memory
> > > due to a stack growth fault. There is actually not a language construct for it
> >
> > Simple, you reclaim a few of those uptodate buffers.  My testing here has
> 
> If you have reclaimable buffers you are not out of memory. If oom is triggered
> in that state it is a bug. If you are complaining that the oom killer triggers
> at the wrong time then thats a completely unrelated issue.

Ummm, yeah, that would pretty much be the claim.  Real easy to reproduce too. 
Take your favorite machine with lots of RAM, run just a handful of startup
processes and system daemons, then log in on a few terminals and do:

while true; do bonnie -s (1/2 ram); done

Pretty soon, system daemons will start to die.

-- 

 Doug Ledford <[EMAIL PROTECTED]>  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems



Re: Linux 2.4.2-ac21

2001-03-22 Thread Alan Cox

> On Thu, 22 Mar 2001, Alan Cox wrote:
> 
> Does not build for PPro/P-II. i586 is OK.

You need to avoid enabling 64G support. The PAE stuff (as Linus said with
2.4.3pre6) is currently broken. Once Linus and co fix it I'll merge the fixed
one

Alan




serial driver question

2001-03-22 Thread John Covici

I have been wondering about the serial driver's shared irq
configuration parameter.  Will it allow two dumb serial ports which
know nothing about sharing irq's to actually share the same irq, or
does the actual hardware have to support some kind of irq sharing for
this to work?

I did try two ports on the same irq, but one of them isn't seen at all
by Linux, so I am quite curious whether I am barking up the wrong
tree?

Thanks.


-- 
 John Covici
 [EMAIL PROTECTED]



Re: USB oops Linux 2.4.2ac6

2001-03-22 Thread Thomas Dodd

Peter Zaitcev wrote:
> 
> > > 2.4.2-ac6
> > > o USB hub kmalloc wrong size corruption fix (Peter Zaitcev)
> 
> > The first line of the oops is
> >
> > 
> > kernel BUG at slab.c:1398!
> > 
> > Any other ideas to try?
> > -Thomas
> 
> I did not break it, honest! I will be looking in a USB mouse
> problem though. If you need an immediate resolution, nice
> folks at [EMAIL PROTECTED] may be able to help.
> Or may be not :)

I found the problem.
CONFIG_DEBUG_SLAB "Debug memory allocation"
in the 2.4.2-ac series doesn't work with USB.

2.4.2-ac5 just booted and found the mouse correctly.
On to ac-21 now...

Did David Brownell's patch to disable OHCI loading
on the AMD-756 make it into the source trees?

-Thomas



Re: [PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated)

2001-03-22 Thread Rik van Riel

On Thu, 22 Mar 2001, Richard Jerrell wrote:

> 2.4.1 has a memory leak (temporary) where anonymous memory pages
> that have been moved into the swap cache will stick around after
> their vma has been unmapped by the owning process.

> free_pte in mm/memory.c has been modified to check to see if the
> page is only being referenced by the swap cache

Your idea is nice, but the patch lacks a few things:

- SMP locking, what if some other process faults in this page
  between the atomic_read of the page count and the test later?
- testing if our process is the _only_ user of this swap page,
  for eg. apache you'll have lots of COW-shared pages .. it would
  be good to keep the page in memory for our siblings

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Alan Cox

> > How do you return an out of memory error to a C program that is out of memory
> > due to a stack growth fault. There is actually not a language construct for it
> 
> Simple, you reclaim a few of those uptodate buffers.  My testing here has

If you have reclaimable buffers you are not out of memory. If oom is triggered
in that state it is a bug. If you are complaining that the oom killer triggers
at the wrong time then thats a completely unrelated issue.




Re: supermount ?

2001-03-22 Thread Alan Cox

> Supermount sounds to me like a very important part of linux, at least for us 
> who like our cds/dvds/etc. to work as easily as in e.g. windows. For linux to 
> be popular among "normal" users, it should be present on every system with 
> local removable drives. So, my question is: why isn't supermount a standard 
> part of the kernel, or at least a module ?

Because it wants rewriting as a clean file system using the 2.4 dcache and
layering itself above the real fs. In theory the infrastructure for this is
all there. 

Alan




Re: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Alan Cox

> I wonder if there is a way to modify mdelay to use a kernel timer if
> interval > 10msec? I am not familiar with this section of the kernel, but I
> do know that Microsoft's similar function KeStallExecutionProcessor is not
> recommended for more than 50 *micro*seconds.

Basically the same kind of recommendation applies. But as with all rules its
sometimes appropriate to break it




Re: [CHECKER] 8 more potential locking problems

2001-03-22 Thread Alan Cox

> We modified our compiler extension for checking lock consistency and
> found 8 more potential errors.

All 8 are real, all 8 fixed in my tree. 
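
For readers unfamiliar with this class of bug: each report flags a function
that takes lock_kernel() on entry but leaves through at least one return path
without the matching unlock_kernel(). A minimal user-space sketch of the usual
cure, a single exit label, follows. The lock is a plain counter standing in
for the big kernel lock, and demo_midi_release(), fake_lock_kernel() and
friends are hypothetical names; this is not the driver code, nor necessarily
the fix that went into the tree mentioned above.

```c
#include <errno.h>

/* Stand-in for the big kernel lock: a depth counter, not a real lock. */
static int lock_depth;

static void fake_lock_kernel(void)   { lock_depth++; }
static void fake_unlock_kernel(void) { lock_depth--; }

/*
 * Hypothetical release routine. 'busy' and 'nonblock' model the
 * "output still pending" and O_NONBLOCK conditions seen in the reports.
 */
int demo_midi_release(int busy, int nonblock)
{
	int ret = 0;

	fake_lock_kernel();
	if (busy && nonblock) {
		ret = -EBUSY;	/* the flagged code returned here with the lock held */
		goto out;
	}
	/* ... normal teardown would go here ... */
out:
	fake_unlock_kernel();	/* every path leaves through this unlock */
	return ret;
}

/* Lets a caller check that no path leaked the lock. */
int demo_lock_depth(void)
{
	return lock_depth;
}
```

After any call to demo_midi_release(), demo_lock_depth() should be back at 0,
whichever branch was taken.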




Re: Linux 2.4.2-ac21

2001-03-22 Thread Frank de Lange

Oops...

Linux 2.4.2-ac21 does not like my box, or the other way around:

loading the agpgart module (MGA G400 AGP) -> system hangs
loading the SCSI module (53c875) -> system hangs

In both cases, the magic SysRq sequence does not work, but it is still possible
to ping the box from the outside. Connecting to it (ssh) does not work,
however. I backed out both the SCSI driver patches as well as the agpgart
patches, but this did not fix the symptoms. Looks more like a module-loading
related issue, but I have not found it yet.

All this on an SMP (Abit BP6) box by the way...

The changes which introduced these symptoms have occurred somewhere between -ac7
and -ac21, since -ac7 DID run on the same hardware.

Cheers//Frank
-- 
  W  ___
 ## o o\/ Frank de Lange \
 }#   \|   /  \
  ##---# _/   \
      \  +31-320-252965/
   \[EMAIL PROTECTED]/
-
 [ "Omnis enim res, quae dando non deficit, dum habetur
et non datur, nondum habetur, quomodo habenda est."  ]



RE: Incorrect mdelay() results on Power Managed Machines x86

2001-03-22 Thread Grover, Andrew

> During resume the IBM thinkpad with the cs46xx driver needs to delay 700
> milliseconds, so if the machine is booted up on battery power, then to
> ensure that the delay is long enough, a value of 3000 milliseconds
> must be programmed into the driver (3 seconds!).  all the mdelay and udelay
> wait times are incorrect by the same factor, resulting in some serious
> problems when attempting to wait specific delay times in other parts of the
> driver.

Well yes this is a problem, but only when starting out with a low effective
CPU freq and going high - the reverse is usually OK because longer than
anticipated waits are OK.
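
The size of the error can be estimated with a little arithmetic. The sketch
below is only a model: it assumes the busy-wait loop count is calibrated once
at one clock rate and that the loop then runs proportionally faster or slower
at another. The function names are made up for illustration, and the 132 and
500 MHz figures are the ones reported elsewhere in this thread.

```c
/*
 * Model only: assumes the delay loop is calibrated once at calib_mhz
 * and that its speed scales linearly with the CPU clock.
 */

/* Delay actually observed when the CPU runs at run_mhz. */
unsigned long effective_delay_ms(unsigned long requested_ms,
				 unsigned long calib_mhz,
				 unsigned long run_mhz)
{
	return requested_ms * calib_mhz / run_mhz;
}

/* Value to request so that at least wanted_ms elapses (rounded up). */
unsigned long required_request_ms(unsigned long wanted_ms,
				  unsigned long calib_mhz,
				  unsigned long run_mhz)
{
	return (wanted_ms * run_mhz + calib_mhz - 1) / calib_mhz;
}
```

Under those assumptions effective_delay_ms(10000, 132, 500) gives 2640, in
line with the reported 2-3 seconds, and required_request_ms(700, 132, 500)
gives 2652, close to the 3000 the driver ended up needing.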

However, you can alleviate this problem by not using udelay (or mdelay) but
using a kernel timer. I would think you should be doing this anyway (700ms
is a LONG TIME) but this should also work regardless of effective CPU freq.

A grep of the kernel source shows cs46xx isn't even doing the biggest
mdelay. I can understand the use of spinning on a calibrated loop for less
than a clock tick, but I gotta think there are better ways for longer
periods.

I wonder if there is a way to modify mdelay to use a kernel timer if
interval > 10msec? I am not familiar with this section of the kernel, but I
do know that Microsoft's similar function KeStallExecutionProcessor is not
recommended for more than 50 *micro*seconds.

Regards -- Andy




Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Alan Cox

> > Eventually you have to kill something or the machine deadlocks.
> 
> Alan, this is a fake argument.

No it is not.

> You see, the bug is that malloc does not fail. This means that the
> decisions about what to do are not taken by the program that knows
> what it is doing, but by the kernel.

Even if malloc fails the situation is no different. You can do 
overcommit avoidance in Linux if you are bored enough to try it. I did it
in 1.2 one afternoon when bored. You simply account address space. Almost
everything you need to touch is in mm/*.c and localised. The only exception
is ptrace.
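
What "account address space" means can be shown with a toy model. This is a
user-space sketch under simplifying assumptions, not the 1.2 patch and not
kernel code: struct vm_account, vm_reserve() and vm_release() are hypothetical
names, and a single page counter stands in for whatever the real accounting
would track.

```c
#include <errno.h>
#include <stddef.h>

/* Toy commit accounting: pages promised so far versus a fixed limit. */
struct vm_account {
	size_t limit_pages;	/* assumed to be RAM plus swap, in pages */
	size_t committed_pages;	/* address space already promised */
};

/* Refuse the request up front instead of overcommitting. */
int vm_reserve(struct vm_account *acct, size_t pages)
{
	if (pages > acct->limit_pages - acct->committed_pages)
		return -ENOMEM;
	acct->committed_pages += pages;
	return 0;
}

/* Give back address space when a mapping goes away. */
void vm_release(struct vm_account *acct, size_t pages)
{
	if (pages > acct->committed_pages)
		pages = acct->committed_pages;
	acct->committed_pages -= pages;
}
```

In this model the allocation that would exceed the limit fails at request
time, so the caller sees the error rather than being killed later.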




Re: 2.4.2-ac21: aviplay slowdown

2001-03-22 Thread Alan Cox

> I compiled 2.4.3-pre6 and 2.4.2-ac21 and noticed, that aviplay works
> much worse than before. Avifile benchmark told me:
> 
> Average video output speed: 20.566223 Mb/s
> 
> On 2.2.18 and earlier 2.4.2-ac* it gives 50-55Mb/s.
> 
> mtrr is enabled:
> 
> [jp@darkwood jp]$ cat /proc/mtrr
> reg00: base=0xe8000000 (3712MB), size=  32MB: write-combining, count=2
> 
> My hardware: K6-2 500, VIA MVP3, Voodoo3

Are the numbers comparable if you have mtrr disabled on both the old and new 
kernel tree ?



Re: supermount ?

2001-03-22 Thread Erik Gustavsson


I don't know if this applies to 2.4.2, but there is a patch for 2.4.0:

http://www.geocities.com/SiliconValley/Lab/8144/supermount.html

/cyr

---
Lister: Shouldn't this plug in to something?
Holly: Yes, that joins up with the white cable.
--- Lister electrocuted ---
Holly: ...or was that the yellow cable? Yes, it should have
been the yellow cable.




Re: Serial port latency

2001-03-22 Thread Theodore Tso

On Thu, Mar 22, 2001 at 09:32:39PM +0100, Geir Thomassen wrote:
> 
> The serial port chip is 16550A, which has a built in fifo. Can this be
> the source of my problems ?

Well, if you set the uart to be 16450 using setserial, this will cause
Linux to avoid enabling the FIFO.  That will cause the loop to save
the 4 character times (which at 9600 bps is 4ms).  If your original
protocol is writing six characters, and then reads 2 characters in a
tight loop, that means a total cycle takes 8ms, and disabling the FIFO
will have significant savings assuming that all other causes of
latencies have been removed.  (The FIFO delay can cause a slowdown by
50%).
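
The figures above can be checked with a short calculation. The sketch assumes
10 bits on the wire per character (8N1 framing), which the message does not
state, and takes the 4 character times of FIFO wait from the text. Function
names are illustrative only.

```c
/* Microseconds needed to move one character on the wire, rounded down. */
unsigned long char_time_us(unsigned long baud, unsigned int bits_per_char)
{
	return (unsigned long)bits_per_char * 1000000UL / baud;
}

/*
 * Time for one request/response cycle of 'chars' characters, plus an
 * optional FIFO wait expressed in character times.
 */
unsigned long cycle_time_us(unsigned long baud, unsigned int bits_per_char,
			    unsigned int chars, unsigned int fifo_wait_chars)
{
	return (unsigned long)(chars + fifo_wait_chars) *
	       char_time_us(baud, bits_per_char);
}
```

At 9600 bps this gives about 1041 us per character, 8328 us for the
eight-character cycle, and 12492 us once four character times of FIFO wait
are added, which is the 50% slowdown mentioned.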

- Ted



Multicast and IP-conntrack problem

2001-03-22 Thread Michel Wilson

Hi!

I'm having some problems with ip-connection tracking and multicast packets:
the conntrack stuff doesn't seem to be able to handle multicast packets,
flooding my logs with messages like these:

Feb 28 15:53:00 procyon kernel: NAT: 0 dropping untracked packet c7105b00 1
195.38.203.147 -> 224.0.0.2
Feb 28 15:53:23 procyon kernel: NAT: 0 dropping untracked packet c7a13740 1
195.38.207.229 -> 224.0.0.2
Feb 28 15:53:26 procyon kernel: NAT: 0 dropping untracked packet c7a13740 1
195.38.207.229 -> 224.0.0.2
Feb 28 15:53:31 procyon kernel: NAT: 0 dropping untracked packet c6535a80 1
195.38.207.229 -> 224.0.0.2
Feb 28 15:54:18 procyon kernel: NAT: 0 dropping untracked packet c7a132c0 1
195.38.202.44 -> 224.0.0.2
Feb 28 15:54:21 procyon kernel: NAT: 0 dropping untracked packet c7a132c0 1
195.38.202.44 -> 224.0.0.2
(this is an old logfile, i disabled logging for these messages, because it
generated several megs each day)

I'm currently using kernel 2.4.0-test9, but a friend of mine is using 2.4.0
and is experiencing the same problem.

Is this a known problem which can't be fixed, or is it fixable? And am i
asking this question in the right place here?

Thanks for any help,

Michel Wilson
<[EMAIL PROTECTED]>
<[EMAIL PROTECTED]>




supermount ?

2001-03-22 Thread Gerry

I recently upgraded my kernel to version 2.4.2, with no problems at all, 
except one: supermount. I guess you already know that supermount hasn't been 
upgraded to support 2.4.2 or even 2.4 yet, and i guess there's nothing to do 
about that but wait. But that's not why i'm writing this.

Supermount sounds to me like a very important part of linux, at least for us 
who like our cds/dvds/etc. to work as easily as in e.g. windows. For linux to 
be popular among "normal" users, it should be present on every system with 
local removable drives. So, my question is: why isn't supermount a standard 
part of the kernel, or at least a module ?

Right now i have to use autofs to manage automounting, but there are several 
problems with that (as it's aimed at use with network devices): e.g., it locks 
my dvd/cdrw-drives every time they get mounted, so that eject isn't possible 
until they get unmounted. Floppy disks aren't updated until they're remounted. 
Setting low timeouts doesn't help with this, since it doesn't seem to work that 
well with local devices for some reason.

So, supermount is required even if autofs is included in the kernel, from my 
point of view anyway. I'm sure there are many people out there like me :)

Any chance supermount will be a standard kernel module in the future ?

Gerry



Re: Linux 2.4.2 fails to merge mmap areas, 700% slowdown.

2001-03-22 Thread Kevin Buhr

Mike Galbraith <[EMAIL PROTECTED]> writes:
> 
>           2.4.2.ac20.virgin   2.4.3-pre6
> real      11m0.708s           11m58.617s
> user      15m8.720s           7m29.970s
> sys       1m31.410s           0m41.590s
> 
> It looks like ac20 is doing some double accounting.

Alan:

In "2.4.2-ac20", the check in "apic.c" in the "APIC_init_uniprocessor"
function to avoid initializing the APIC is:

if (!smp_found_config && !cpu_has_apic)
return -1;

However, in "arch/i386/time.c", we use the following check:

if (!smp_found_config)
smp_local_timer_interrupt(regs);

to see if we need to emulate an smp_local_timer_interrupt from
"do_timer_interrupt".

In Mike's case, I think we have smp_found_config == 0 but cpu_has_apic
== 1, so we're telling the CPU APIC to generate smp_local_timer_interrupts,
and then we're also emulating them on normal timer ticks.  That
doubles the rate at which "smp_local_timer_interrupt" is called,
doubling the process user and system time accounting.
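
The mismatch between the two checks can be modelled in a few lines of
user-space C. This is a sketch under stated assumptions, not kernel code: it
assumes the APIC timer, when active, fires once per ordinary timer tick, and
struct boot_config, apic_timer_active() and local_timer_calls() are
hypothetical names. Passing use_flag = 0 models the ac20 check in time.c;
use_flag = 1 models testing a single "using_apic_timer" flag instead.

```c
struct boot_config {
	int smp_found_config;
	int cpu_has_apic;
};

/* Mirrors the quoted apic.c test: the APIC is skipped only if both are 0. */
static int apic_timer_active(const struct boot_config *cfg)
{
	return !(!cfg->smp_found_config && !cfg->cpu_has_apic);
}

/* Count how often the local timer handler runs over 'ticks' timer ticks. */
long local_timer_calls(const struct boot_config *cfg, long ticks, int use_flag)
{
	int using_apic_timer = apic_timer_active(cfg);
	long calls = 0;
	long t;

	for (t = 0; t < ticks; t++) {
		int emulate;

		if (using_apic_timer)
			calls++;	/* interrupt generated by the APIC timer */

		/* Old check looks at smp_found_config, new one at the flag. */
		emulate = use_flag ? !using_apic_timer : !cfg->smp_found_config;
		if (emulate)
			calls++;	/* emulated from the normal timer tick */
	}
	return calls;
}
```

With smp_found_config == 0 and cpu_has_apic == 1, the old check yields two
calls per tick in this model and the flag-based check yields one.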

Mike, would you like to try out the following (untested) patch against
vanilla ac20 to see if it does the trick?

Kevin <[EMAIL PROTECTED]>

*   *   *

diff -ru linux-2.4.2-ac20/arch/i386/kernel/apic.c linux-2.4.2-ac20-local/arch/i386/kernel/apic.c
--- linux-2.4.2-ac20/arch/i386/kernel/apic.c	Thu Mar 22 12:36:02 2001
+++ linux-2.4.2-ac20-local/arch/i386/kernel/apic.c	Thu Mar 22 15:59:08 2001
@@ -30,6 +30,9 @@
 #include 
 #include 
 
+/* Using APIC to generate smp_local_timer_interrupt? */
+int using_apic_timer = 0;
+
 int prof_multiplier[NR_CPUS] = { 1, };
 int prof_old_multiplier[NR_CPUS] = { 1, };
 int prof_counter[NR_CPUS] = { 1, };
@@ -884,6 +887,8 @@
 
/* and update all other cpus */
smp_call_function(setup_APIC_timer, (void *)calibration_result, 1, 1);
+
+   using_apic_timer = 1;
 }
 
 /*
diff -ru linux-2.4.2-ac20/arch/i386/kernel/time.c linux-2.4.2-ac20-local/arch/i386/kernel/time.c
--- linux-2.4.2-ac20/arch/i386/kernel/time.c	Thu Mar 22 12:36:03 2001
+++ linux-2.4.2-ac20-local/arch/i386/kernel/time.c	Thu Mar 22 16:03:02 2001
@@ -422,7 +422,7 @@
if (!user_mode(regs))
x86_do_profile(regs->eip);
 #else
-   if (!smp_found_config)
+   if (!using_apic_timer)
smp_local_timer_interrupt(regs);
 #endif
 
diff -ru linux-2.4.2-ac20/include/asm-i386/smp.h linux-2.4.2-ac20-local/include/asm-i386/smp.h
--- linux-2.4.2-ac20/include/asm-i386/smp.h	Sun Mar  4 21:35:03 2001
+++ linux-2.4.2-ac20-local/include/asm-i386/smp.h	Thu Mar 22 16:07:28 2001
@@ -34,6 +34,7 @@
 extern unsigned long cpu_online_map;
 extern volatile unsigned long smp_invalidate_needed;
 extern int pic_mode;
+extern int using_apic_timer;
 extern void smp_flush_tlb(void);
 extern void smp_message_irq(int cpl, void *dev_id, struct pt_regs *regs);
 extern void smp_send_reschedule(int cpu);



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread James A. Sutherland

On Wed, 21 Mar 2001, Rik van Riel wrote:

> On Wed, 21 Mar 2001, Patrick O'Rourke wrote:
>
> > Since the system will panic if the init process is chosen by
> > the OOM killer, the following patch prevents select_bad_process()
> > from picking init.
>
> One question ... has the OOM killer ever selected init on
> anybody's system ?

Well, I managed to get the OOM killer killing init once; OTOH, I had just
broken MM completely (disabled freeing of pages entirely!) so that doesn't
really count, I think :-)

> I think that the scoring algorithm should make sure that
> we never pick init, unless the system is screwed so badly
> that init is broken or the only process left ;)

If the system is that badly screwed, killing init is probably the right
thing to do, since this should then cause a panic, and thus a reboot if
the machine is so configured?


James.




Re: PROBLEM: 2.2.18 oops leaves umount hung in disk sleep

2001-03-22 Thread Trond Myklebust

> " " == Camm Maguire <[EMAIL PROTECTED]> writes:

 > I'd be happy to generate one if I could.  I've got the system
 > map.  The defaults reported by ksymoops are all correct.  Don't
 > know why it didn't give me more info.  Normally, the info is
 > reported by klogd anyway, but not here.  I've sent you all I
 > currently have.  If you can suggest how I can get more, would
 > be glad to do so.


Unless you happen to have a dump from 'dmesg', there's probably not
much you can do to recover the rest of the Oops...

We need at least the line 'EIP:' if we're to find out where the fault
occurred. Are you certain that it can't be found in the syslog?

 > I thought I was running v3.  Can't seem to find anything now
 > which indicates the protocol version in use, but was under the
 > impression that v4 was only an option in 2.4.x, no?


Mar 21 01:14:49 intech9 automount[305]: using kernel protocol version 3 on reawaken

Sorry, the above message fooled me.


Cheers,
  Trond



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Ed Tomlinson

On Thursday 22 March 2001 17:00, Guest section DW wrote:
> On Thu, Mar 22, 2001 at 09:23:54PM +, Alan Cox wrote:
> > > Really the whole oom_kill process seems bass-ackwards to me.  I can't
> > > in my mind logically justify annihilating large-VM processes that have
> > > been running for days or weeks instead of just returning ENOMEM to a
> > > process that just started up.
> >
> > How do you return an out of memory error to a C program that is out of
> > memory due to a stack growth fault. There is actually not a language
> > construct for it
>
> Alan, this is a fake argument.
> Linux is bad, and you defend it by saying that it is impossible to be
> perfect.
>
> I have used various Unix flavours for approximately thirty years.
> Stack overflow has not been a real problem. Of course they occurred
> every now and then, but roughly speaking only for unchecked recursion,
> that is, in cases of a program bug.
>
> Presently however, a flawless program can be killed.
> That is what makes Linux unreliable.
>
> > Eventually you have to kill something or the machine deadlocks.
>
> Alan, this is a fake argument.
> When I have a computer algebra system, and it computes millions of
> function values for some expensive function, then it keeps a cache
> of already computed values. Maybe a value is needed again and we
> save ten seconds of computation.
> But of course, when we run out of memory, nothing is easier than
> just throwing this cache out.
>
> You see, the bug is that malloc does not fail. This means that the
> decisions about what to do are not taken by the program that knows
> what it is doing, but by the kernel.

By this argument the OOM kill code is fine...  If malloc is broken, fix it.  
Maybe we need to stage things so that ENOMEM gets returned to requests
before we are totally out of memory.  If the apps ignore the errors then the
kills happen.

Thoughts?
Ed Tomlinson



[CHECKER] 8 more potential locking problems

2001-03-22 Thread Andy Chou

We modified our compiler extension for checking lock consistency and
found 8 more potential errors.

-Andy Chou

Index:

drivers/char/pcwd.c:468:pcwd_close
drivers/sound/es1370.c:1775:es1370_release
drivers/sound/es1370.c:2442:es1370_midi_release
drivers/sound/es1371.c:2611:es1371_midi_release
drivers/sound/esssolo1.c:1936:solo1_midi_release
drivers/sound/sonicvibes.c:2215:sv_midi_release
drivers/video/fbmem.c:545:fb_mmap
fs/ufs/balloc.c:274:ufs_new_fragments

Errors:

-
[BUG] GEM
/home/acc/oses/linux/2.4.1/drivers/char/pcwd.c:468:pcwd_close: ERROR:ALOCK:455:468: 
Inconsistent
lock using `lock_kernel':455

Start --->
lock_kernel();
is_open = 0;
#ifndef CONFIG_WATCHDOG_NOWAYOUT
/*  Disable the board  */
if (revision == PCWD_REVISION_C) {

... DELETED 5 lines ...

}
unlock_kernel();
#endif
}
Error --->
return 0;
}
-
[BUG] 
/home/acc/oses/linux/2.4.1/drivers/sound/es1370.c:2442:es1370_midi_release: 
ERROR:ALOCK:2427:2442: Inconsistent
lock using `lock_kernel':2427

Start --->
lock_kernel();
if (file->f_mode & FMODE_WRITE) {
add_wait_queue(&s->midi.owait, &wait);
for (;;) {
__set_current_state(TASK_INTERRUPTIBLE);

... DELETED 7 lines ...

break;
if (file->f_flags & O_NONBLOCK) {
remove_wait_queue(&s->midi.owait, &wait);
set_current_state(TASK_RUNNING);
Error --->
return -EBUSY;
}
-
[BUG] GEM
/home/acc/oses/linux/2.4.1/drivers/sound/es1370.c:1775:es1370_release: 
ERROR:ALOCK:1759:1775: Inconsistent
lock using `lock_kernel':1759

Start --->
lock_kernel();
if (file->f_mode & FMODE_WRITE)
drain_dac2(s, file->f_flags & O_NONBLOCK);
down(&s->open_sem);
if (file->f_mode & FMODE_WRITE) {

... DELETED 8 lines ...

}
s->open_mode &= (~file->f_mode) & (FMODE_READ|FMODE_WRITE);
wake_up(&s->open_wait);
up(&s->open_sem);
Error --->
return 0;
unlock_kernel();
-
[BUG]
/home/acc/oses/linux/2.4.1/drivers/sound/es1371.c:2611:es1371_midi_release: 
ERROR:ALOCK:2596:2611: Inconsistent
lock using `lock_kernel':2596

Start --->
lock_kernel();
if (file->f_mode & FMODE_WRITE) {
add_wait_queue(&s->midi.owait, &wait);
for (;;) {
__set_current_state(TASK_INTERRUPTIBLE);

... DELETED 7 lines ...

break;
if (file->f_flags & O_NONBLOCK) {
remove_wait_queue(&s->midi.owait, &wait);
set_current_state(TASK_RUNNING);
Error --->
return -EBUSY;
}
-
[BUG]
/home/acc/oses/linux/2.4.1/drivers/sound/esssolo1.c:1936:solo1_midi_release: 
ERROR:ALOCK:1921:1936: Inconsistent
lock using `lock_kernel':1921

Start --->
lock_kernel();
if (file->f_mode & FMODE_WRITE) {
add_wait_queue(&s->midi.owait, &wait);
for (;;) {
__set_current_state(TASK_INTERRUPTIBLE);

... DELETED 7 lines ...

break;
if (file->f_flags & O_NONBLOCK) {
remove_wait_queue(&s->midi.owait, &wait);
set_current_state(TASK_RUNNING);
Error --->
return -EBUSY;
}
-
[BUG]
/home/acc/oses/linux/2.4.1/drivers/sound/sonicvibes.c:2215:sv_midi_release: 
ERROR:ALOCK:2200:2215: Inconsistent
lock using `lock_kernel':2200

Start --->
lock_kernel();
if (file->f_mode & FMODE_WRITE) {
add_wait_queue(&s->midi.owait, &wait);
for (;;) {
__set_current_state(TASK_INTERRUPTIBLE);

... DELETED 7 lines ...

break;
if (file->f_flags & O_NONBLOCK) {
remove_wait_queue(&s->midi.owait, &wait);
set_current_state(TASK_RUNNING);
Error --->
return -EBUSY;
}
-
[BUG]
/home/acc/oses/linux/2.4.1/drivers/video/fbmem.c:545:fb_mmap: ERROR:ALOCK:521:545: 
Inconsistent
lock using `lock_kernel':521

Start --->
lock_kernel();
res = fb->fb_mmap(info, file, vma);
  

[PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated)

2001-03-22 Thread Richard Jerrell

2.4.1 has a memory leak (temporary) where anonymous memory pages that have
been moved into the swap cache will stick around after their vma has been
unmapped by the owning process.  These pages are not free'd in free_pte()
because they are still referenced by the page cache.  In addition, if the
pages are dirty, they will be written out to the swap device before they
are reclaimed even though the owning process no longer will be using the
pages.

free_pte in mm/memory.c has been modified to check to see if the page is
only being referenced by the swap cache (and possibly buffers).  If so,
the buffers (if existent) are free'd and the page and swap cache
entry are removed immediately.

Essentially, this is the same patch as before, but there was one condition
in which we would leak an extra reference to the targeted page if
the counts would not allow us to remove the swap cache entry.  The leak in
2.4.1 also applies to 2.4.2 and 2.4.3-pre5.

Rich Jerrell
[EMAIL PROTECTED]


diff --recursive -u -N linux-2.4.1/mm/memory.c linux-2.4.1-paging-fix/mm/memory.c
--- linux-2.4.1/mm/memory.c Sat Jan 27 22:12:35 2001
+++ linux-2.4.1-paging-fix/mm/memory.c  Thu Feb 15 13:36:06 2001
@@ -281,6 +285,34 @@
return 1;
}
swap_free(pte_to_swp_entry(pte));
+	{
+		int num, target = 1;
+		struct page *page = lookup_swap_cache(pte_to_swp_entry(pte));
+		/* returns the page and takes a reference */
+
+		if (!page || (!VALID_PAGE(page)) || PageReserved(page))
+			return 0;
+
+		num = atomic_read(&page->count);
+		if(page->buffers) target++;		/* 1 count if we have buffers */
+		if(PageSwapCache(page)) target++;	/* 1 count for the page cache */
+							/* 1 count for our reference  */
+
+		if((num == target) && PageSwapCache(page)) {
+			/* SwapCache entry is the only thing referencing this page   */
+			/* (and maybe buffers) asides from us, so to prevent it from */
+			/* sitting around and wasting time/memory, throw it away     */
+			if((page->buffers)) {
+				if(!try_to_free_buffers(page,1)) {	/* Can't get rid of buffers   */
+					page_cache_release(page);	/* get rid of our reference   */
+					return 0;			/* and let someone else do it */
+				}
+			}
+			free_page_and_swap_cache(page);	/* Expects the page to be mapped, so will */
+			return 1;			/* account for the reference we have      */
+		}
+		page_cache_release(page);	/* Remove our reference, we can't do anything */
+	}
return 0;
 }
 



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Doug Ledford

Alan Cox wrote:
> 
> > Really the whole oom_kill process seems bass-ackwards to me.  I can't in my mind
> > logically justify annihilating large-VM processes that have been running for
> > days or weeks instead of just returning ENOMEM to a process that just started
> > up.
> 
> How do you return an out of memory error to a C program that is out of memory
> due to a stack growth fault. There is actually not a language construct for it

Simple, you reclaim a few of those uptodate buffers.  My testing here has
resulted in more of my system daemons getting killed than anything else, and
it never once has solved the actual problem of simple memory pressure from
apps reading/writing to disk and disk cache not releasing buffers quick
enough.

> > It would be nice to give immunity to certain uids, or better yet, just turn the
> > damn thing off entirely.  I've already hacked that in...errr, out.
> 
> Eventually you have to kill something or the machine deadlocks. The oom killing
> doesnt kick in until that point. So its up to you how you like your errors.

I beg to differ.  If you tell me that a machine that looks like this:

[dledford@monster dledford]$ free
             total       used       free     shared    buffers     cached
Mem:       1017800    1014808       2992          0      73644     796392
-/+ buffers/cache:     144772     873028
Swap:            0          0          0
[dledford@monster dledford]$ 

is in need of killing sshd, I'll claim you are smoking some nice stuff ;-)

-- 

 Doug Ledford <[EMAIL PROTECTED]>  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems



Re: [PATCH] Prevent OOM from killing init

2001-03-22 Thread Guest section DW

On Thu, Mar 22, 2001 at 09:23:54PM +, Alan Cox wrote:
> > Really the whole oom_kill process seems bass-ackwards to me.  I can't in my mind
> > logically justify annihilating large-VM processes that have been running for 
> > days or weeks instead of just returning ENOMEM to a process that just started 
> > up.
> 
> How do you return an out of memory error to a C program that is out of memory
> due to a stack growth fault. There is actually not a language construct for it

Alan, this is a fake argument.
Linux is bad, and you defend it by saying that it is impossible to be perfect.

I have used various Unix flavours for approximately thirty years.
Stack overflows have not been a real problem. Of course they occurred
every now and then, but roughly speaking only for unchecked recursion,
that is, in cases of a program bug.

Presently however, a flawless program can be killed.
That is what makes Linux unreliable.

> Eventually you have to kill something or the machine deadlocks.

Alan, this is a fake argument.
When I have a computer algebra system, and it computes millions of
function values for some expensive function, then it keeps a cache
of already computed values. Maybe a value is needed again and we
save ten seconds of computation.
But of course, when we run out of memory, nothing is easier than
just throwing this cache out.

You see, the bug is that malloc does not fail. This means that the
decisions about what to do are not taken by the program that knows
what it is doing, but by the kernel.
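
As a rough illustration of that point (not code from this thread): if malloc did report failure, a program holding such a cache could release it and retry. The sketch below is in C. The names value_cache, cache_drop and alloc_or_drop_cache are hypothetical, and it assumes the allocator returns NULL instead of overcommitting.

```c
#include <stdlib.h>

/* Hypothetical cache of already computed values; names are illustrative. */
struct value_cache {
    double *values;   /* heap block holding cached results, or NULL */
    size_t  count;    /* number of cached results */
};

/* Throw the whole cache away; the values can always be recomputed. */
static void cache_drop(struct value_cache *cache)
{
    free(cache->values);
    cache->values = NULL;
    cache->count = 0;
}

/*
 * Allocate `size` bytes. If malloc reports failure and the cache still
 * holds memory, drop the cache and retry once. This only helps when
 * malloc really returns NULL (an assumption of this sketch).
 */
static void *alloc_or_drop_cache(size_t size, struct value_cache *cache)
{
    void *p = malloc(size);
    if (p == NULL && cache->values != NULL) {
        cache_drop(cache);
        p = malloc(size);
    }
    return p;
}
```

A caller would use alloc_or_drop_cache(n, &cache) wherever it now calls malloc(n); the price of a dropped cache is only recomputation time.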

Andries



Re: [PATCH] Documentation/ioctl-number.txt

2001-03-22 Thread Dave Kleikamp

Alexander Viro wrote:
> 
> On Thu, 22 Mar 2001, Dave Kleikamp wrote:
> 
> > Linus,
> > I would like to reserve a block of 32 ioctl's for the JFS filesystem.
> 
> Details, please? More specifically, what kind of objects are these ioctls
> applied to?

I don't have all the details worked out yet, but the utilities to extend
and defragment the filesystem will operate on a live volume, so the
utilities will need to talk to the filesystem to move blocks, extend the
block map, etc.

The utilities will probably open the root directory and apply the ioctls
to it, unless there is a better way to do it.
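
A minimal sketch of what such a utility call might look like, in C. The request macro EXAMPLE_IOC_EXTEND, its magic character 'j' and number 1, and the function example_extend_fs are all hypothetical placeholders; no JFS ioctl numbers have been assigned in this thread.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request: magic 'j' and number 1 are placeholders, not a
 * reserved or assigned JFS ioctl. */
#define EXAMPLE_IOC_EXTEND _IOW('j', 1, unsigned long)

/*
 * Open the directory at `root` (for example the mount point of a live
 * volume) and pass the request to the filesystem behind it.
 * Returns 0 on success, -1 if the open or the ioctl fails.
 */
static int example_extend_fs(const char *root, unsigned long new_blocks)
{
    int fd = open(root, O_RDONLY);   /* directories can be opened read-only */
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, EXAMPLE_IOC_EXTEND, &new_blocks);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

On a filesystem that does not implement the request, the ioctl call is expected to fail, so the function returns -1.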

-- 
David Kleikamp
IBM Linux Technology Center



Re: Serial port latency

2001-03-22 Thread Jonathan Lundell

>This is what the program does:
>
> fd=open("/dev/ttyS0",O_NOCTTY | O_RDWR);
>
> tcsetattr(fd,TCSANOW, ); /* setting baud, parity, raw mode, etc */
>
> while() {
> write( 6 bytes);   /* send command */
> read( 2 bytes);/* wait for reply */
> }

What are your settings for VTIME and VMIN?
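
For reference, a minimal sketch in C of setting these two fields for a raw-mode port. The helper name setup_raw_reply and the 9600 baud rate are assumptions for illustration; the quoted program does not show its actual settings.

```c
#include <string.h>
#include <termios.h>

/*
 * Fill `tio` for raw (non-canonical) mode with the given VMIN and VTIME.
 * Zeroing the structure clears ICANON, ECHO and all input/output
 * processing. The baud rate is an assumed value.
 */
static void setup_raw_reply(struct termios *tio,
                            unsigned char vmin, unsigned char vtime)
{
    memset(tio, 0, sizeof(*tio));
    tio->c_cflag = CS8 | CREAD | CLOCAL;  /* 8 data bits, receiver on, no modem control */
    tio->c_cc[VMIN] = vmin;               /* minimum bytes before read() returns */
    tio->c_cc[VTIME] = vtime;             /* timer in tenths of a second */
    cfsetispeed(tio, B9600);
    cfsetospeed(tio, B9600);
}
```

With VMIN = 2 and VTIME = 0, a read() of 2 bytes blocks until both reply bytes have arrived. With VMIN = 0 and VTIME > 0, read() returns when a byte arrives or the timer expires, whichever comes first. The filled structure would then be passed to tcsetattr(fd, TCSANOW, &tio).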

-- 
/Jonathan Lundell.



Re: [PATCH] Documentation/ioctl-number.txt

2001-03-22 Thread Alexander Viro



On Thu, 22 Mar 2001, Dave Kleikamp wrote:

> Linus,
> I would like to reserve a block of 32 ioctl's for the JFS filesystem.

Details, please? More specifically, what kind of objects are these ioctls
applied to?




Re: Bug report

2001-03-22 Thread Tim Walberg

"User does not understand how *NIX systems utilize memory" is
not the same thing as "bug". What you are seeing is the kernel
using memory for file system cache.
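
The arithmetic behind the "-/+ buffers/cache" row of free can be sketched in C as follows. The struct and function names are illustrative only; the formulas are simply used minus buffers minus cached, and free plus buffers plus cached.

```c
/* One "Mem:" row of free output, all values in kilobytes. */
struct mem_row {
    long total, used, free_kb, shared, buffers, cached;
};

/* Memory actually held by programs: used minus reclaimable cache. */
static long used_by_programs(const struct mem_row *m)
{
    return m->used - m->buffers - m->cached;
}

/* Memory available to programs: free plus reclaimable cache. */
static long available_to_programs(const struct mem_row *m)
{
    return m->free_kb + m->buffers + m->cached;
}
```

Fed the figures reported after the wc run in the quoted message (used 511520, free 2096, buffers 1252, cached 481020), these give 29248 and 484368, which is the "-/+ buffers/cache" row shown there: programs hold about 29 MB, and the rest is cache the kernel can give back.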

On 03/22/2001 12:58 -0800, Craig Cummings wrote:
>>  Hi there,
>>  
>>  I think this qualifies as a bug but let me know if this could be a
>>  configuration or hardware issue.
>>  
>>  I've been having problems with memory leaks when I run programs on
>>  large (up to 250 MB) text files.  (I know this is huge, but that's the 3
>>  billion base human genome for you.)  At first I thought it was a Perl
>>  problem but I later found that a completely unrelated C program also
>>  caused memory leaks.  I recently upgraded to the 2.4 kernel, hoping to
>>  solve these problems (see below).  However, the memory leaks are still
>>  happening and this time I know the problem is at a deeper level than my
>>  programs.  Some standard UNIX programs are leaking a lot of memory.  I
>>  would appreciate some advice and ultimately, a fix.  Unfortunately, my
>>  programming skills are not sufficient for tinkering with the kernel
>>  source.  Thank you in advance for your help.  Details follow.
>>  
>>  Regards,
>>  
>>  Craig Cummings
>>  
>>  
>>  Here are the specs for my system:
>>  Dell Precision XPSt700, Pentium III, 512 MB RAM
>>  I've recently upgraded from Red Hat 6.2 with the 2.2.14 kernel to
>>  Red Hat 7, then built the 2.4.2 kernel on my own.
>>  
>>  Here's what happens with grep:
>>  
>>  Output of free, freshly booted system:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616      47516     466100          0       2476      27048
>>  -/+ buffers/cache:      17992     495624
>>  Swap:       128480          0     128480
>>  
>>  Output of free after grep 'NT_005289' Data/hs_chr12.fa:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616     183548     330068          0       2624     159616
>>  -/+ buffers/cache:      21308     492308
>>  Swap:       128480          0     128480
>>  
>>  Output of grep 'NT_005289' Data/hs_chr2.fa:
>>  
>>  >gi|12728771|ref|NT_005289.2|Hs2_5446 Homo sapiens chromosome 2 working draft sequence segment
>>  
>>  Output of free after this grep:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616     424272      89344          0       2860     394232
>>  -/+ buffers/cache:      27180     486436
>>  Swap:       128480          0     128480
>>  
>>  Output of grep 'NT_005289' Data/hs_chr2.fa:
>>  
>>  >gi|12728771|ref|NT_005289.2|Hs2_5446 Homo sapiens chromosome 2 working draft sequence segment
>>  
>>  Output of free after this grep:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616     424272      89344          0       2860     394232
>>  -/+ buffers/cache:      27180     486436
>>  Swap:       128480          0     128480
>>  
>>  File sizes of the two files grep'ed:
>>  
>>  -rw-rw-r--    1 cummings genomics 135744469 Mar 12 22:09 Data/hs_chr12.fa
>>  -rw-rw-r--    1 cummings genomics 240244039 Mar 12 22:24 Data/hs_chr2.fa
>>  
>>  Note that these file sizes are equivalent to the amount of memory leaked
>>  when grep is called on that file.
>>  
>>  When I grep the same file a second time, very little additional memory is
>>  leaked.
>>  
>>  
>>  This same phenomenon occurs when I run a different UNIX program, e.g. wc:
>>  
>>  Output of wc -l Data/hs_chr3.fa:
>>  
>>  2915465 Data/hs_chr3.fa
>>  
>>  Output of free:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616     511520       2096          0       1252     481020
>>  -/+ buffers/cache:      29248     484368
>>  Swap:       128480          0     128480
>>  
>>  Interestingly, after running wc a second time on the same file, it goes
>>  very fast and very little additional memory is leaked:
>>  
>>               total       used       free     shared    buffers     cached
>>  Mem:        513616     510732       2884          0       1204     480948
>>  -/+ buffers/cache:      28580     485036
>>  Swap:       128480         40     128440
>>  
>>  
>>  ---
>>  Craig Cummings, Ph.D.
>>  
>>  Relman Laboratory
>>  Stanford University School of Medicine
>>  Department of Microbiology and Immunology
>>  
>>  e-mail: [EMAIL PROTECTED]
>>  phone:  650-498-5998
>>  fax:650-852-3291
>>  

Re: Serial port latency

2001-03-22 Thread Manfred Spraul

Is the computer otherwise idle?
I've seen one unexplainable report with atm problems that disappeared
(!) if a kernel compile was running.

Could you try to run a simple cpu hog (with nice 20)?

<<
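#include <unistd.h>

/* Busy loop: issue a cheap system call forever so the CPU never idles. */
int main(void)
{
	for(;;) getpid();
}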
<<

I'm aware of one bug that could cause a delay of up to 20 ms (cpu_idle()
doesn't check for pending softirq's before sleeping), but that doesn't
explain your 500 ms delay.

--
Manfred




Re: Where's Alan?

2001-03-22 Thread Horst von Brand

"David S. Miller" <[EMAIL PROTECTED]> said:
> alterity  <[EMAIL PROTECTED]> wrote:
> >Haven't seen a post for some time from the usually prolific Mr Cox.
> >What's the gossip?

> They needed some help from him to position Mir for its
> final descent.

Just hope he gets out before the burnup...
-- 
Dr. Horst H. von Brand   mailto:[EMAIL PROTECTED]
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria  +56 32 654239
Casilla 110-V, Valparaiso, ChileFax:  +56 32 797513


