Re: test10-pre4: deadlock in VM?
On Thu, 26 Oct 2000, Rik van Riel wrote:
> On Wed, 25 Oct 2000, Roger Larsson wrote:
>
> > I noted that even try_to_free_buffers locks lru_list_lock.
>
> lru_list_lock != pagemap_lru_lock

btw, while we are at it, I am not able to reproduce this with test10-pre5
but am still running tests with higher and higher load... I will let you
know if something of interest happens.

regards,
Tigran

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: test10-pre4: deadlock in VM?
On Wed, 25 Oct 2000, Roger Larsson wrote:

> I noted that even try_to_free_buffers locks lru_list_lock.

lru_list_lock != pagemap_lru_lock

cheers,

Rik
--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/               http://www.surriel.com/
Re: test10-pre4: deadlock in VM?
No: grow_buffers does not lock anything else, so
this was not a problem.

/RogerL

Roger Larsson wrote:
>
> Hi again,
>
> Please ignore my patch suggestion from getblk -
> it will give problems later - in alloc...
>
> It is grow_buffers that might need to lock the
> other ones too...
>
> /RogerL

--
Home page:
  http://www.norran.net/nra02596/
Re: test10-pre4: deadlock in VM?
Hi again,

Please ignore my patch suggestion from getblk -
it will give problems later - in alloc...

It is grow_buffers that might need to lock the
other ones too...

/RogerL

--
Home page:
  http://www.norran.net/nra02596/
Re: test10-pre4: deadlock in VM?
Found a strange one.

getblk releases hash_table_lock and lru_list_lock before calling
refill_freelist, which calls grow_buffers, which locks free_list[].lock
with lru_list_lock and hash_table_lock not held. That violates the
deadlock prevention rules at the beginning of the file.

Patch: in getblk, move the call to refill_freelist before releasing
the locks - ok?

/RogerL

Roger Larsson wrote:
>
> Hi,
>
> I noted that even try_to_free_buffers locks lru_list_lock.
> Then it tries to lock some others - maybe one of the other threads
> got one of those (hash_table_lock, free_list[index].lock).
> That fits with proc 4, which is executing at the beginning of
> try_to_free_buffers. Does it move?
> Or is it stuck at a spin lock there - which one? Could we see a
> disassembly of try_to_free_buffers?
>
> /RogerL
>
> Rajagopal Ananthanarayanan wrote:
> >
> > Tigran Aivazian wrote:
> > >
> > > Hi guys,
> > >
> > > When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> > > machine with 6G RAM (highmem+PAE enabled) I got
> > >
> > > __alloc_pages: 0-order allocation failed.
> > >
> > > (probably coming from nfsd, why don't we print eip of the caller there?)
> > >
> > > and the machine locked up (but pingable). So I entered kdb and got stack
> > > traces of all running processes:
> >
> > Hmm. It appears that some of the processes are stuck on this
> > part of page_launder:
> >
> >         /*
> >          * Re-take the spinlock. Note that we cannot
> >          * unlock the page yet since we're still
> >          * accessing the page_struct here...
> >          */
> >         spin_lock(&pagemap_lru_lock);
> >
> > It will be interesting to see what's going on in each of the cpus.
> > Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
> > and just type "bt" on each cpu. Also, it will be good to see what
> > kswapd (pid 2) is up to ...
> >
> > --
> > Rajagopal Ananthanarayanan ("ananth")
> > Member Technical Staff, SGI.

--
Home page:
  http://www.norran.net/nra02596/
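The ordering concern raised in this message can be illustrated with a small user-space sketch. It is not kernel code: pthread mutexes stand in for the spinlocks, and the rank values and the names order_ok, ranked_lock_acquire and ranked_lock_release are hypothetical, invented for illustration. It assumes the rule is a simple total order (lru_list_lock first, then hash_table_lock, then the free list lock); the real rule is whatever the comment at the beginning of the file says.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical ranks: a lower value must be taken before a higher one. */
enum lock_rank {
    RANK_LRU_LIST = 0,   /* stands in for lru_list_lock */
    RANK_HASH_TABLE = 1, /* stands in for hash_table_lock */
    RANK_FREE_LIST = 2,  /* stands in for free_list[].lock */
    RANK_COUNT = 3
};

static pthread_mutex_t rank_mutex[RANK_COUNT] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

/* One bit per rank currently held by the calling thread. */
static _Thread_local unsigned held_mask;

/* Taking `rank` is in order only if no lock of the same or a later rank is held. */
static bool order_ok(unsigned held, enum lock_rank rank)
{
    return (held >> rank) == 0;
}

/* Acquire the lock for `rank`; refuse (return false) on an ordering violation. */
static bool ranked_lock_acquire(enum lock_rank rank)
{
    assert(rank < RANK_COUNT);
    if (!order_ok(held_mask, rank))
        return false;
    pthread_mutex_lock(&rank_mutex[rank]);
    held_mask |= 1u << rank;
    return true;
}

/* Release a lock previously taken with ranked_lock_acquire. */
static void ranked_lock_release(enum lock_rank rank)
{
    assert(held_mask & (1u << rank));
    held_mask &= ~(1u << rank);
    pthread_mutex_unlock(&rank_mutex[rank]);
}
```

Under a rule of this shape, taking only the free list lock with nothing else held is allowed. What the check rejects is asking for an earlier-ranked lock (for example the one standing in for lru_list_lock) while a later-ranked one is still held, which is the pattern that can deadlock two CPUs against each other.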
Re: test10-pre4: deadlock in VM?
Hi,

I noted that even try_to_free_buffers locks lru_list_lock.
Then it tries to lock some others - maybe one of the other threads
got one of those (hash_table_lock, free_list[index].lock).
That fits with proc 4, which is executing at the beginning of
try_to_free_buffers. Does it move?
Or is it stuck at a spin lock there - which one? Could we see a
disassembly of try_to_free_buffers?

/RogerL

Rajagopal Ananthanarayanan wrote:
>
> Tigran Aivazian wrote:
> >
> > Hi guys,
> >
> > When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> > machine with 6G RAM (highmem+PAE enabled) I got
> >
> > __alloc_pages: 0-order allocation failed.
> >
> > (probably coming from nfsd, why don't we print eip of the caller there?)
> >
> > and the machine locked up (but pingable). So I entered kdb and got stack
> > traces of all running processes:
>
> Hmm. It appears that some of the processes are stuck on this
> part of page_launder:
>
>         /*
>          * Re-take the spinlock. Note that we cannot
>          * unlock the page yet since we're still
>          * accessing the page_struct here...
>          */
>         spin_lock(&pagemap_lru_lock);
>
> It will be interesting to see what's going on in each of the cpus.
> Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
> and just type "bt" on each cpu. Also, it will be good to see what
> kswapd (pid 2) is up to ...
>
> --
> Rajagopal Ananthanarayanan ("ananth")
> Member Technical Staff, SGI.

--
Home page:
  http://www.norran.net/nra02596/
Re: test10-pre4: deadlock in VM?
Tigran Aivazian wrote:
>
> Hi guys,
>
> When running SPEC SFS tests against 2.4.0-test10-pre4 on a 4-way SMP
> machine with 6G RAM (highmem+PAE enabled) I got
>
> __alloc_pages: 0-order allocation failed.
>
> (probably coming from nfsd, why don't we print eip of the caller there?)
>
> and the machine locked up (but pingable). So I entered kdb and got stack
> traces of all running processes:

Hmm. It appears that some of the processes are stuck on this
part of page_launder:

        /*
         * Re-take the spinlock. Note that we cannot
         * unlock the page yet since we're still
         * accessing the page_struct here...
         */
        spin_lock(&pagemap_lru_lock);

It will be interesting to see what's going on in each of the cpus.
Use "cpu x" x=0,1,2,3 on your 4 cpu system to switch to cpu x,
and just type "bt" on each cpu. Also, it will be good to see what
kswapd (pid 2) is up to ...

--
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
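The drop-and-retake pattern described by the quoted page_launder comment can be sketched in user space roughly as follows. This is an illustration under stated assumptions, not the kernel's code: struct page_sketch, page_sketch_init, slow_writeout and launder_one are hypothetical names, and pthread mutexes stand in for pagemap_lru_lock and the per-page lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page on an LRU list. */
struct page_sketch {
    pthread_mutex_t page_lock; /* plays the role of the per-page lock */
    bool dirty;
    bool on_list;
};

/* Plays the role of pagemap_lru_lock: protects list membership. */
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

static void page_sketch_init(struct page_sketch *page, bool dirty)
{
    pthread_mutex_init(&page->page_lock, NULL);
    page->dirty = dirty;
    page->on_list = true;
}

/* Placeholder for slow work that must not run under the list lock. */
static void slow_writeout(struct page_sketch *page)
{
    page->dirty = false;
}

/* Try to clean one page; returns true if this call cleaned it. */
static bool launder_one(struct page_sketch *page)
{
    bool cleaned = false;

    assert(page != NULL);
    pthread_mutex_lock(&lru_lock);
    if (page->on_list && page->dirty &&
        pthread_mutex_trylock(&page->page_lock) == 0) {
        /* Drop the list lock for the slow part; the page stays locked. */
        pthread_mutex_unlock(&lru_lock);
        slow_writeout(page);

        /* Re-take the list lock before touching list state again.
         * A caller waits here for as long as someone else holds it. */
        pthread_mutex_lock(&lru_lock);
        cleaned = !page->dirty;

        /* Only now release the page: we are done looking at it. */
        pthread_mutex_unlock(&page->page_lock);
    }
    pthread_mutex_unlock(&lru_lock);
    return cleaned;
}
```

The point of the sketch is the order of the two releases: the list lock is given up during the slow step, but the page lock is held across the re-take, so the page cannot change underneath the caller. If another CPU held the list lock and never released it, every caller would pile up at the re-take, which is what the stack traces appear to show.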