Re: test10-pre4: deadlock in VM?

2000-10-26 Thread Tigran Aivazian

On Thu, 26 Oct 2000, Rik van Riel wrote:

> On Wed, 25 Oct 2000, Roger Larsson wrote:
> 
> > I noted that even try_to_free_buffers locks lru_list_lock.
> 
> lru_list_lock != pagemap_lru_lock
> 

btw, while we are at it, I am not able to reproduce this with test10-pre5
but am still running tests with higher and higher load... I will let you
know if something of interest happens.

regards,
Tigran

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: test10-pre4: deadlock in VM?

2000-10-26 Thread Rik van Riel

On Wed, 25 Oct 2000, Roger Larsson wrote:

> I noted that even try_to_free_buffers locks lru_list_lock.

lru_list_lock != pagemap_lru_lock

cheers,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: test10-pre4: deadlock in VM?

2000-10-25 Thread Roger Larsson

Not.

It does not lock anything else...
This was not a problem.

/RogerL

Roger Larsson wrote:
> 
> Hi again,
> 
> Please ignore my patch suggestion from getblk -
> it will give problems later - in alloc...
> 
> It is grow_buffers that might need to lock the
> other ones too...
> 
> /RogerL
> 
> --
> Home page:
>   http://www.norran.net/nra02596/

--
Home page:
  http://www.norran.net/nra02596/



Re: test10-pre4: deadlock in VM?

2000-10-25 Thread Roger Larsson

Hi again,

Please ignore my patch suggestion from getblk -
it will give problems later - in alloc...

It is grow_buffers that might need to lock the
other ones too...

/RogerL

--
Home page:
  http://www.norran.net/nra02596/



Re: test10-pre4: deadlock in VM?

2000-10-25 Thread Roger Larsson

Found a strange one.

getblk releases hash_table_lock and lru_list_lock
before calling refill_freelist, which calls grow_buffers,
which locks free_list[].lock
- lru_list_lock and hash_table_lock are not held at that point,
violating the deadlock prevention rules at the beginning of the file.

Patch:
  in getblk, move the call to refill_freelist before
  releasing the locks - ok?

/RogerL
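
The lock-ordering argument above can be illustrated with a minimal user-space sketch in C. The ranks, the `ranked_lock` and `lock_ctx` types, and all helper names are assumptions made for illustration. The thread does not quote the actual rules from the top of the file, and this is not kernel code. The sketch only shows that an ordering rule constrains nested acquisitions, so taking a single lock with nothing else held cannot break it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ranks: when nesting, a lower rank must be taken first. */
enum lock_rank { RANK_LRU_LIST, RANK_HASH_TABLE, RANK_FREE_LIST };

struct ranked_lock {
    atomic_flag busy;        /* minimal user-space spinlock */
    enum lock_rank rank;
};

/* Ranks currently held by one thread of execution (one bit per rank). */
struct lock_ctx { unsigned held; };

static struct ranked_lock lru_list_lock   = { ATOMIC_FLAG_INIT, RANK_LRU_LIST };
static struct ranked_lock hash_table_lock = { ATOMIC_FLAG_INIT, RANK_HASH_TABLE };
static struct ranked_lock free_list_lock  = { ATOMIC_FLAG_INIT, RANK_FREE_LIST };

/* Take the lock only if no lock of equal or higher rank is already held. */
static bool ranked_acquire(struct lock_ctx *ctx, struct ranked_lock *l)
{
    if ((ctx->held >> l->rank) != 0)
        return false;                        /* would break the ordering */
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        ;                                    /* spin until the lock is free */
    ctx->held |= 1u << l->rank;
    return true;
}

static void ranked_release(struct lock_ctx *ctx, struct ranked_lock *l)
{
    assert(ctx->held & (1u << l->rank));     /* must be held by this context */
    ctx->held &= ~(1u << l->rank);
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Taking only the last-ranked lock, with nothing else held, is allowed. */
bool take_free_list_alone(void)
{
    struct lock_ctx ctx = { 0 };
    bool ok = ranked_acquire(&ctx, &free_list_lock);
    if (ok)
        ranked_release(&ctx, &free_list_lock);
    return ok;
}

/* Nesting in rank order is allowed. */
bool nest_in_order(void)
{
    struct lock_ctx ctx = { 0 };
    bool a = ranked_acquire(&ctx, &lru_list_lock);
    bool b = a && ranked_acquire(&ctx, &hash_table_lock);
    bool c = b && ranked_acquire(&ctx, &free_list_lock);
    if (c) ranked_release(&ctx, &free_list_lock);
    if (b) ranked_release(&ctx, &hash_table_lock);
    if (a) ranked_release(&ctx, &lru_list_lock);
    return c;
}

/* Taking lru_list_lock while already holding free_list_lock is refused. */
bool nest_out_of_order(void)
{
    struct lock_ctx ctx = { 0 };
    bool a = ranked_acquire(&ctx, &free_list_lock);
    bool b = a && ranked_acquire(&ctx, &lru_list_lock);
    if (b) ranked_release(&ctx, &lru_list_lock);
    if (a) ranked_release(&ctx, &free_list_lock);
    return b;
}
```

Under these assumed ranks, `take_free_list_alone()` and `nest_in_order()` succeed while `nest_out_of_order()` is refused. The first case matches the situation described for grow_buffers, where only one lock is taken and no others are held.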


Roger Larsson wrote:
> 
> Hi,
> 
> I noted that even try_to_free_buffers locks lru_list_lock.
> Then it tries to take some other locks - maybe one of the other threads
> already holds one of those (hash_table_lock, free_list[index].lock).
> That fits with proc 4, which is executing at the beginning of
> try_to_free_buffers. Does it move?
> Or is it stuck at a spin lock there - which one? A disassembly of
> try_to_free_buffers?
> 
> /RogerL
> 
> Rajagopal Ananthanarayanan wrote:
> >
> > Tigran Aivazian wrote:
> > >
> > > Hi guys,
> > >
> > > When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> > > machine with 6G RAM (highmem+PAE enabled) I got
> > >
> > > __alloc_pages: 0-order allocation failed.
> > >
> > > (probably coming from nfsd, why don't we print eip of the caller there?)
> > >
> > > and the machine locked up (but pingable). So I entered kdb and got stack
> > > traces of all running processes:
> >
> > Hmm. It appears that some of the processes are stuck on this
> > part of page_launder:
> >
> > /*
> >  * Re-take the spinlock. Note that we cannot
> >  * unlock the page yet since we're still
> >  * accessing the page_struct here...
> >  */
> > spin_lock(&pagemap_lru_lock);
> >
> > It will be interesting to see what's going on in each of the cpus.
> > Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
> > and just type "bt" on each cpu. Also, it will be good to see what
> > kswapd (pid 2) is up to ...
> >
> > --
> > Rajagopal Ananthanarayanan ("ananth")
> > Member Technical Staff, SGI.
> > --
> 
> --
> Home page:
>   http://www.norran.net/nra02596/

--
Home page:
  http://www.norran.net/nra02596/



Re: test10-pre4: deadlock in VM?

2000-10-25 Thread Roger Larsson

Hi,

I noted that even try_to_free_buffers locks lru_list_lock.
Then it tries to take some other locks - maybe one of the other threads
already holds one of those (hash_table_lock, free_list[index].lock).
That fits with proc 4, which is executing at the beginning of
try_to_free_buffers. Does it move?
Or is it stuck at a spin lock there - which one? A disassembly of
try_to_free_buffers?

/RogerL

Rajagopal Ananthanarayanan wrote:
> 
> Tigran Aivazian wrote:
> >
> > Hi guys,
> >
> > When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> > machine with 6G RAM (highmem+PAE enabled) I got
> >
> > __alloc_pages: 0-order allocation failed.
> >
> > (probably coming from nfsd, why don't we print eip of the caller there?)
> >
> > and the machine locked up (but pingable). So I entered kdb and got stack
> > traces of all running processes:
> 
> Hmm. It appears that some of the processes are stuck on this
> part of page_launder:
> 
> /*
>  * Re-take the spinlock. Note that we cannot
>  * unlock the page yet since we're still
>  * accessing the page_struct here...
>  */
> spin_lock(&pagemap_lru_lock);
> 
> It will be interesting to see what's going on in each of the cpus.
> Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
> and just type "bt" on each cpu. Also, it will be good to see what
> kswapd (pid 2) is up to ...
> 
> --
> Rajagopal Ananthanarayanan ("ananth")
> Member Technical Staff, SGI.
> --

--
Home page:
  http://www.norran.net/nra02596/



Re: test10-pre4: deadlock in VM?

2000-10-25 Thread Rajagopal Ananthanarayanan

Tigran Aivazian wrote:
> 
> Hi guys,
> 
> When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> machine with 6G RAM (highmem+PAE enabled) I got
> 
> __alloc_pages: 0-order allocation failed.
> 
> (probably coming from nfsd, why don't we print eip of the caller there?)
> 
> and the machine locked up (but pingable). So I entered kdb and got stack
> traces of all running processes:


Hmm. It appears that some of the processes are stuck on this
part of page_launder:

/*
 * Re-take the spinlock. Note that we cannot
 * unlock the page yet since we're still
 * accessing the page_struct here...
 */
spin_lock(&pagemap_lru_lock);

It will be interesting to see what's going on in each of the cpus.
Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
and just type "bt" on each cpu. Also, it will be good to see what
kswapd (pid 2) is up to ...
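
The drop-and-retake pattern that the quoted page_launder comment describes can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `page_sketch`, `launder_sketch`, `write_out` and the spinlock helpers are hypothetical names, the `locked` field stands in for the kernel's page lock bit, and the real page_launder does considerably more. The point is the ordering: the page stays locked while the list lock is dropped for slow work, and the list lock is taken again before the page is unlocked. If some other CPU never releases that lock, every caller ends up spinning at the re-take step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page: only the two flags the sketch needs. */
struct page_sketch {
    bool locked;   /* stands in for the page lock bit */
    bool dirty;
};

/* Stands in for pagemap_lru_lock. */
static atomic_flag lru_lock = ATOMIC_FLAG_INIT;

static void spin_lock_sketch(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;   /* spin until the lock is free */
}

static void spin_unlock_sketch(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Placeholder for slow work such as starting I/O on the page. */
static void write_out(struct page_sketch *p)
{
    p->dirty = false;
}

/* Clean every unlocked dirty page; returns how many were cleaned. */
int launder_sketch(struct page_sketch *pages, int n)
{
    int cleaned = 0;

    spin_lock_sketch(&lru_lock);
    for (int i = 0; i < n; i++) {
        struct page_sketch *p = &pages[i];

        if (p->locked || !p->dirty)
            continue;

        p->locked = true;               /* pin the page before dropping the lock */
        spin_unlock_sketch(&lru_lock);  /* no slow work under the spinlock */
        write_out(p);
        spin_lock_sketch(&lru_lock);    /* re-take the spinlock first ... */
        p->locked = false;              /* ... and only then unlock the page */
        cleaned++;
    }
    spin_unlock_sketch(&lru_lock);
    return cleaned;
}
```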


--
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--


