Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-11 Thread Mel Gorman
On Fri, Jan 11, 2013 at 12:51:05AM +, Eric Wong wrote: > Mel Gorman wrote: > > mm: compaction: Partially revert capture of suitable high-order page > > > > > Reported-by: Eric Wong > > Cc: sta...@vger.kernel.org > > Signed-off-by: Mel Gorman > > Thanks, my original use case and test

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-11 Thread Mel Gorman
On Fri, Jan 11, 2013 at 12:51:05AM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: mm: compaction: Partially revert capture of suitable high-order page snip Reported-by: Eric Wong normalper...@yhbt.net Cc: sta...@vger.kernel.org Signed-off-by: Mel Gorman mgor...@suse.de

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman wrote: > mm: compaction: Partially revert capture of suitable high-order page > Reported-by: Eric Wong > Cc: sta...@vger.kernel.org > Signed-off-by: Mel Gorman Thanks, my original use case and test works great after several hours! Tested-by: Eric Wong Unfortunately, I also

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Dumazet
On Thu, 2013-01-10 at 19:42 +, Mel Gorman wrote: > Thanks Eric, it's much appreciated. However, I'm still very much in favour > of a partial revert as in retrospect the implementation of capture took the > wrong approach. Could you confirm the following patch works for you? > It's should

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman wrote: > Thanks Eric, it's much appreciated. However, I'm still very much in favour > of a partial revert as in retrospect the implementation of capture took the > wrong approach. Could you confirm the following patch works for you? > It's should functionally have the same effect as

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Mel Gorman
On Thu, Jan 10, 2013 at 09:25:11AM +, Eric Wong wrote: > Mel Gorman wrote: > > page->pfmemalloc can be left set for captured pages so try this but as > > capture is rarely used I'm strongly favouring a partial revert even if > > this works for you. I haven't reproduced this using your

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman wrote: > page->pfmemalloc can be left set for captured pages so try this but as > capture is rarely used I'm strongly favouring a partial revert even if > this works for you. I haven't reproduced this using your workload yet > but I have found that high-order allocation stress tests

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Mel Gorman
On Thu, Jan 10, 2013 at 09:25:11AM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: page->pfmemalloc can be left set for captured pages so try this but as capture is rarely used I'm strongly favouring a partial revert even if this works for you. I haven't reproduced this using your

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: Thanks Eric, it's much appreciated. However, I'm still very much in favour of a partial revert as in retrospect the implementation of capture took the wrong approach. Could you confirm the following patch works for you? It's should functionally have the same

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Dumazet
On Thu, 2013-01-10 at 19:42 +, Mel Gorman wrote: Thanks Eric, it's much appreciated. However, I'm still very much in favour of a partial revert as in retrospect the implementation of capture took the wrong approach. Could you confirm the following patch works for you? It's should

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: mm: compaction: Partially revert capture of suitable high-order page snip Reported-by: Eric Wong normalper...@yhbt.net Cc: sta...@vger.kernel.org Signed-off-by: Mel Gorman mgor...@suse.de Thanks, my original use case and test works great after several

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: page->pfmemalloc can be left set for captured pages so try this but as capture is rarely used I'm strongly favouring a partial revert even if this works for you. I haven't reproduced this using your workload yet but I have found that high-order allocation

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Mel Gorman wrote: > When I looked at it for long enough I found a number of problems. Most > affect timing but two serious issues are in there. One affects how long > kswapd spends compacting versus reclaiming and the other increases lock > contention meaning that async compaction can abort

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Wed, Jan 09, 2013 at 01:37:46PM +, Mel Gorman wrote: > On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote: > > Mel Gorman wrote: > > > Please try the following patch. However, even if it works the benefit of > > > capture may be so marginal that partially reverting it and

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Tue, Jan 08, 2013 at 06:32:29PM -0800, Eric Dumazet wrote: > On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote: > > On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: > > > Mel Gorman wrote: > > > > Please try the following patch. However, even if it works the benefit of > > > > capture

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote: > Mel Gorman wrote: > > Please try the following patch. However, even if it works the benefit of > > capture may be so marginal that partially reverting it and simplifying > > compaction.c is the better decision. > > I already got my VM

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Eric Wong wrote: > Oops, I had to restart my test :x. However, I was able to reproduce the > issue very quickly again with your patch. I've double-checked I'm > booting into the correct kernel, but I do have more load on this > laptop host now, so maybe that made it happen more quickly...

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Eric Wong wrote: > Eric Dumazet wrote: > > On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: > > > Hmm, it seems sk_filter() can return -ENOMEM because skb has the > > > pfmemalloc() set. > > > > > > > > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack > > > drops

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Dumazet erdnet...@gmail.com wrote: On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: Hmm, it seems sk_filter() can return -ENOMEM because skb has the pfmemalloc() set. One TCP socket keeps retransmitting an SKB via loopback, and

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Oops, I had to restart my test :x. However, I was able to reproduce the issue very quickly again with your patch. I've double-checked I'm booting into the correct kernel, but I do have more load on this laptop host now, so maybe that made it happen more

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture may be so marginal that partially reverting it and simplifying compaction.c is the better decision. I already got

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Tue, Jan 08, 2013 at 06:32:29PM -0800, Eric Dumazet wrote: On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote: On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Mel Gorman
On Wed, Jan 09, 2013 at 01:37:46PM +, Mel Gorman wrote: On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture may be so marginal that partially reverting it and

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-09 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: When I looked at it for long enough I found a number of problems. Most affect timing but two serious issues are in there. One affects how long kswapd spends compacting versus reclaiming and the other increases lock contention meaning that async compaction can

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Eric Dumazet wrote: > On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: > > Hmm, it seems sk_filter() can return -ENOMEM because skb has the > > pfmemalloc() set. > > > > > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack > > drops the packet again and again. > >

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: > > Hmm, it seems sk_filter() can return -ENOMEM because skb has the > pfmemalloc() set. > > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack > drops the packet again and again. sock_init_data() sets

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote: > On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: > > Mel Gorman wrote: > > > Please try the following patch. However, even if it works the benefit of > > > capture may be so marginal that partially reverting it and simplifying > > >

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: > Mel Gorman wrote: > > Please try the following patch. However, even if it works the benefit of > > capture may be so marginal that partially reverting it and simplifying > > compaction.c is the better decision. > > I already got my VM stuck

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Mel Gorman wrote: > Please try the following patch. However, even if it works the benefit of > capture may be so marginal that partially reverting it and simplifying > compaction.c is the better decision. I already got my VM stuck on this one. I had two twosleepy instances, 2774 was the one

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Mel Gorman
On Mon, Jan 07, 2013 at 10:38:50PM +, Eric Wong wrote: > Mel Gorman wrote: > > Right now it's difficult to see how the capture could be the source of > > this bug but I'm not ruling it out either so try the following (untested > > but should be ok) patch. It's not a proper revert, it just

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Eric Wong wrote: > Mel Gorman wrote: > > Right now it's difficult to see how the capture could be the source of > > this bug but I'm not ruling it out either so try the following (untested > > but should be ok) patch. It's not a proper revert, it just disables the > > capture page logic to see

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Mel Gorman mgor...@suse.de wrote: Right now it's difficult to see how the capture could be the source of this bug but I'm not ruling it out either so try the following (untested but should be ok) patch. It's not a proper revert, it just disables the

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Mel Gorman
On Mon, Jan 07, 2013 at 10:38:50PM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Right now it's difficult to see how the capture could be the source of this bug but I'm not ruling it out either so try the following (untested but should be ok) patch. It's not a proper revert, it

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture may be so marginal that partially reverting it and simplifying compaction.c is the better decision. I already got my VM stuck on this one. I had two twosleepy instances, 2774 was

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture may be so marginal that partially reverting it and simplifying compaction.c is the better decision. I already got my VM

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote: On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Please try the following patch. However, even if it works the benefit of capture may be so marginal that partially reverting it and simplifying

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: Hmm, it seems sk_filter() can return -ENOMEM because skb has the pfmemalloc() set. One TCP socket keeps retransmitting an SKB via loopback, and TCP stack drops the packet again and again. sock_init_data() sets sk->sk_allocation to

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Eric Dumazet erdnet...@gmail.com wrote: On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote: Hmm, it seems sk_filter() can return -ENOMEM because skb has the pfmemalloc() set. One TCP socket keeps retransmitting an SKB via loopback, and TCP stack drops the packet again and again.

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Wong
Eric Dumazet wrote: > It would not surprise me if sk_stream_wait_memory() have plain bug(s) or > race(s). > > In 2010, in commit 482964e56e132 Nagendra Tomar fixed a pretty severe > long standing bug. > > This path is not taken very often on most machines. > > I would try the following patch :

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Wong
Mel Gorman wrote: > Right now it's difficult to see how the capture could be the source of > this bug but I'm not ruling it out either so try the following (untested > but should be ok) patch. It's not a proper revert, it just disables the > capture page logic to see if it's at fault. Things

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Dumazet
On Mon, 2013-01-07 at 12:25 +, Mel Gorman wrote: > > > ===> 28014[28017]/stack <=== > > [] release_sock+0xe5/0x11b > > [] sk_stream_wait_memory+0x1f7/0x1fc > > [] autoremove_wake_function+0x0/0x2a > > [] tcp_sendmsg+0x710/0x86d > > [] sock_sendmsg+0x7b/0x93 > > [] sys_sendto+0xee/0x145 > >

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Mel Gorman
On Sun, Jan 06, 2013 at 12:07:00PM +, Eric Wong wrote: > Mel Gorman wrote: > > Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then > > answer the following questions please? > > This is on my main machine running 3.8-rc2 > > > 1. What are the contents of /proc/vmstat at

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Mel Gorman
On Sun, Jan 06, 2013 at 12:07:00PM +, Eric Wong wrote: Mel Gorman mgor...@suse.de wrote: Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then answer the following questions please? This is on my main machine running 3.8-rc2 1. What are the contents of

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Dumazet
On Mon, 2013-01-07 at 12:25 +, Mel Gorman wrote: ===> 28014[28017]/stack <=== [8129fc1d] release_sock+0xe5/0x11b [812a642c] sk_stream_wait_memory+0x1f7/0x1fc [81040d5e] autoremove_wake_function+0x0/0x2a [812d8fc3] tcp_sendmsg+0x710/0x86d

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: Right now it's difficult to see how the capture could be the source of this bug but I'm not ruling it out either so try the following (untested but should be ok) patch. It's not a proper revert, it just disables the capture page logic to see if it's at fault.

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-07 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: It would not surprise me if sk_stream_wait_memory() have plain bug(s) or race(s). In 2010, in commit 482964e56e132 Nagendra Tomar fixed a pretty severe long standing bug. This path is not taken very often on most machines. I would try the

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-06 Thread Eric Wong
Mel Gorman wrote: > Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then > answer the following questions please? This is on my main machine running 3.8-rc2 > 1. What are the contents of /proc/vmstat at the time it is stuck? ===> /proc/vmstat <=== nr_free_pages 40305

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-06 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then answer the following questions please? This is on my main machine running 3.8-rc2 1. What are the contents of /proc/vmstat at the time it is stuck? ===> /proc/vmstat <=== nr_free_pages

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman wrote: > On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: > > Instead, I disabled THP+compaction under v3.7.1 and I've been unable to > > reproduce the issue without THP+compaction. > > > > Implying that it's stuck in compaction somewhere. It could be the case > that

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman wrote: > On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: > > Instead, I disabled THP+compaction under v3.7.1 and I've been unable to > > reproduce the issue without THP+compaction. > > > > Implying that it's stuck in compaction somewhere. It could be the case > that

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Dumazet
On Fri, 2013-01-04 at 16:01 +, Mel Gorman wrote: > Implying that it's stuck in compaction somewhere. It could be the case > that compaction alters timing enough to trigger another bug. You say it > tests differently depending on whether TCP or unix sockets are used > which might indicate

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Mel Gorman
On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: > (changing Cc:) > > Eric Wong wrote: > > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a > > local TCP socket. The isolated code below can reproduces the issue > > after many minutes (<1 hour). It might be easier

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Mel Gorman
On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: (changing Cc:) Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Dumazet
On Fri, 2013-01-04 at 16:01 +, Mel Gorman wrote: Implying that it's stuck in compaction somewhere. It could be the case that compaction alters timing enough to trigger another bug. You say it tests differently depending on whether TCP or unix sockets are used which might indicate multiple

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: Instead, I disabled THP+compaction under v3.7.1 and I've been unable to reproduce the issue without THP+compaction. Implying that it's stuck in compaction somewhere. It could be the case that

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: Instead, I disabled THP+compaction under v3.7.1 and I've been unable to reproduce the issue without THP+compaction. Implying that it's stuck in compaction somewhere. It could be the case that

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong wrote: > Eric Wong wrote: > > I think this requires frequent dirtying/cycling of pages to reproduce. > > (from copying large files around) to interact with compaction. > > I'll see if I can reproduce the issue with read-only FS activity. > > Still successfully running the read-only

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong wrote: > I think this requires frequent dirtying/cycling of pages to reproduce. > (from copying large files around) to interact with compaction. > I'll see if I can reproduce the issue with read-only FS activity. Still successfully running the read-only test on my main machine, will

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong wrote: > Eric Dumazet wrote: > > With the following patch, I cant reproduce the 'apparent stuck' > > Right, the output is just an approximation and the logic there > was bogus. > > Thanks for looking at this. I'm still able to reproduce the issue under v3.8-rc2 with your patch for

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Dumazet wrote: > On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote: > > Eric Wong wrote: > > > [1] my full setup is very strange. > > > > > > Other than the FUSE component I forgot to mention, little depends on > > > the kernel. With all this, the standalone toosleepy can get

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Dumazet
On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote: > Eric Wong wrote: > > [1] my full setup is very strange. > > > > Other than the FUSE component I forgot to mention, little depends on > > the kernel. With all this, the standalone toosleepy can get stuck. > > I'll try to reproduce

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Dumazet
On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote: Eric Wong normalper...@yhbt.net wrote: [1] my full setup is very strange. Other than the FUSE component I forgot to mention, little depends on the kernel. With all this, the standalone toosleepy can get stuck. I'll try to

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote: Eric Wong normalper...@yhbt.net wrote: [1] my full setup is very strange. Other than the FUSE component I forgot to mention, little depends on the kernel. With all this, the

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Dumazet eric.duma...@gmail.com wrote: With the following patch, I cant reproduce the 'apparent stuck' Right, the output is just an approximation and the logic there was bogus. Thanks for looking at this. I'm still able to reproduce the issue

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: I think this requires frequent dirtying/cycling of pages to reproduce. (from copying large files around) to interact with compaction. I'll see if I can reproduce the issue with read-only FS activity. Still successfully running the read-only test on my

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Wong normalper...@yhbt.net wrote: I think this requires frequent dirtying/cycling of pages to reproduce. (from copying large files around) to interact with compaction. I'll see if I can reproduce the issue with read-only FS activity. Still

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
Eric Wong wrote: > [1] my full setup is very strange. > > Other than the FUSE component I forgot to mention, little depends on > the kernel. With all this, the standalone toosleepy can get stuck. > I'll try to reproduce it with less... I just confirmed my toosleepy processes will

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
(changing Cc:) Eric Wong wrote: > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a > local TCP socket. The isolated code below can reproduces the issue > after many minutes (<1 hour). It might be easier to reproduce on > a busy system while disk I/O is happening. s/might

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
(changing Cc:) Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might be easier to reproduce on a busy system while disk I/O is

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: [1] my full setup is very strange. Other than the FUSE component I forgot to mention, little depends on the kernel. With all this, the standalone toosleepy can get stuck. I'll try to reproduce it with less... I just confirmed my toosleepy

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-29 Thread Eric Wong
Eric Wong wrote: > Eric Wong wrote: > > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a > > local TCP socket. The isolated code below can reproduces the issue > > after many minutes (<1 hour). It might be easier to reproduce on > > a busy system while disk I/O is happening.

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-29 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might be easier to reproduce on a busy

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
Eric Wong wrote: > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a > local TCP socket. The isolated code below can reproduces the issue > after many minutes (<1 hour). It might be easier to reproduce on > a busy system while disk I/O is happening. Ugh, I can't seem to

ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might be easier to reproduce on a busy system while disk I/O is happening. This may also be related to an epoll-related issue

ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might be easier to reproduce on a busy system while disk I/O is happening. This may also be related to an epoll-related issue

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduces the issue after many minutes (<1 hour). It might be easier to reproduce on a busy system while disk I/O is happening. Ugh, I