On Sun, Jan 25, 2004 at 11:15:17AM +0200, Tzafrir Cohen wrote:
> On Sun, Jan 25, 2004 at 10:30:15AM +0200, Muli Ben-Yehuda wrote:
> > On Sun, Jan 25, 2004 at 10:21:29AM +0200, Tzafrir Cohen wrote:
> > 
> > > From looking at the source I could only see that it means "something is
> > > fishy". 
> > 
> > 0 order allocation failed means the kernel couldn't allocate even one
> > page of memory. Unless the machines are pretty much out of memory,
> > that should never happen. 
> > 
> > > In one case replacing reiserfs with ext3 made the problem go away.
> > > Naturally this is a drastic solution that I don't want to take.
> > > 
> > > Anybody seen this lately?
> > 
> > Actually, yes, on lkml, unless my memory is playing tricks on me. But
> > I don't remember the details, sorry. Try the archives...  
> 
> Makes some sense. I did see a number of huge processes (e.g. tar eating
> up almost all of my memory, and likewise "proxymap" of postfix).
> 
> I'm positive that the tar process was run as non-root without any
> privileges.
> 
> I also saw VM killing messages. A number of processes managed to get
> killed around the time I saw that message.

Interesting development here:

The system has been upgraded from Mandrake 9.0 to Mandrake 9.2 using
urpmi. It went generally smoother than I thought, but the trouble began
immediately after that. I had not changed the kernel, so I initially
figured it was some new kernel module being loaded for the first time or
so.

After a bit of stracing on two such "inflating" processes (tar and
devfsd) I noticed that they were in a continuous loop, always opening,
reading from and closing /etc/group. And they kept allocating more
memory.
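
A loop like this can be confirmed from a saved strace log by counting how
often each file gets opened. A minimal Python sketch, assuming the log
shows opens as open("PATH", ...) or openat(..., "PATH", ...) lines; the
name count_opens and the sample lines are made up for illustration and are
not taken from the actual trace:

```python
import re
from collections import Counter

# Matches the path argument of open()/openat() calls in strace output.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]+)"')

def count_opens(trace_text):
    """Return a Counter mapping each opened path to its open count."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = OPEN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Illustrative sample only, not a real capture.
sample = '''\
open("/etc/group", O_RDONLY) = 3
read(3, "root::0:root\\n", 4096) = 13
close(3) = 0
open("/etc/group", O_RDONLY) = 3
close(3) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
'''
print(count_opens(sample).most_common(1))  # -> [('/etc/group', 2)]
```

A path whose count keeps climbing while the process grows is the one to
look at first.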

I compared the existing /etc/group to the new default one
(/etc/group.rpmnew) and noticed that:

1. My original file had 'x' in the password field, whereas the new
   default had this field empty.
2. My original file had a group with many members on its last line.

After just deleting the 'x's, devfsd seems to work. tar still looped,
so I removed the big group, and then it functioned as usual.
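
The two differences can be listed mechanically. A rough Python sketch,
assuming the usual name:password:gid:members layout of /etc/group; the
function name inspect_group_file, the max_members threshold of 50 and the
sample data are arbitrary choices for illustration, not known limits of
glibc. An 'x' in the password field is normally valid, so the output only
points at lines to compare by hand:

```python
def inspect_group_file(text, max_members=50):
    """Return (line_number, note) pairs for entries worth a closer look."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 4:
            findings.append((lineno, "malformed: expected 4 fields"))
            continue
        name, passwd, gid, members = fields
        # Non-empty password field (e.g. 'x'); legal, but one of the
        # differences from the new default file.
        if passwd:
            findings.append((lineno, f"{name}: password field is {passwd!r}"))
        # Very long member list; the threshold is an arbitrary assumption.
        member_count = len([m for m in members.split(",") if m])
        if member_count > max_members:
            findings.append((lineno, f"{name}: {member_count} members"))
    return findings

# Hypothetical sample data, not the real file.
sample = "root::0:root\nusers:x:100:" + ",".join(
    f"user{i}" for i in range(60)) + "\n"
for finding in inspect_group_file(sample):
    print(finding)
```

To check a real system, pass in the contents of /etc/group, for example
inspect_group_file(open("/etc/group").read()).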

So the immediate suspect is glibc's handling of edge cases.

Current glibc: glibc-2.3.2-14mdk

I hope this works.


> 
> > 
> > Which kernel is this with? 
> > 
> 
> This time: 2.4.23-pre6 . But I also had something similar with 2.4.23
> 
> And to Shachar: thanks, I tried that. But the corruption re-appeared too
> soon :-(
> 
> -- 
> Tzafrir Cohen                       +---------------------------+
> http://www.technion.ac.il/~tzafrir/ |vim is a mutt's best friend|
> mailto:[EMAIL PROTECTED]       +---------------------------+
> 
> =================================================================
> To unsubscribe, send mail to [EMAIL PROTECTED] with
> the word "unsubscribe" in the message body, e.g., run the command
> echo unsubscribe | mail [EMAIL PROTECTED]

-- 
Tzafrir Cohen                       +---------------------------+
http://www.technion.ac.il/~tzafrir/ |vim is a mutt's best friend|
mailto:[EMAIL PROTECTED]       +---------------------------+
