[RFC/PATCH] epoll: replace EPOLL_CTL_DISABLE with EPOLL_CTL_POKE

2012-11-01 Thread Eric Wong
From: Eric Wong normalper...@yhbt.net Date: Fri, 2 Nov 2012 03:47:08 + Subject: [PATCH] epoll: replace EPOLL_CTL_DISABLE with EPOLL_CTL_POKE EPOLL_CTL_POKE may be used to force an item into the epoll ready list. Instead of disabling an item asynchronously via EPOLL_CTL_DISABLE

Re: [RFC/PATCH] epoll: replace EPOLL_CTL_DISABLE with EPOLL_CTL_POKE

2012-11-06 Thread Eric Wong
Christof Meerwald cme...@cmeerw.org wrote: On Fri, 2 Nov 2012 04:13:12 +, Eric Wong wrote: [...] EPOLL_CTL_POKE may be used to force an item into the epoll ready list. Instead of disabling an item asynchronously via EPOLL_CTL_DISABLE, this forces the threads calling epoll_wait

net_dropmon usage documentation/examples?

2013-04-05 Thread Eric Wong
Hi Neil, I'm wondering if you have or know of any public documentation/examples for using net_dropmon. If not, I'll figure it out on my own at some point. Thanks in advance! (Not a very high priority project for me, my network connectivity problems are sadly _very_ obvious at the moment :x)

Re: [RFC PATCH] wfcqueue: implement __wfcq_enqueue_head() (v2)

2013-04-06 Thread Eric Wong
at first, but it does not seem to hurt performance on my 4-core system. In fact, it was slightly better (but within margin of error) time ./eponeshotmt -c 100 -w 4 -t 4 -f 10 real 0m 5.78s user 0m 1.20s sys 0m 21.90s Tested-by: Eric Wong normalper...@yhbt.net Hopefully somebody can test

Re: net_dropmon usage documentation/examples?

2013-04-07 Thread Eric Wong
Neil Horman nhor...@tuxdriver.com wrote: On Fri, Apr 05, 2013 at 07:38:55PM +, Eric Wong wrote: Hi Neil, I'm wondering if you have or know of any public documentation/examples for using net_dropmon. If not, I'll figure it out on my own at some point. Thanks in advance! I

Re: [RFC PATCH] wfcqueue: implement __wfcq_enqueue_head() (v3)

2013-04-07 Thread Eric Wong
). $ time ./eponeshotmt -c 100 -w 4 -t 4 -f 10 real 0m 5.83s user 0m 1.35s sys 0m 21.95s I also ran v2 on Davide Libenzi's totalmess epoll stresser for a few hours yesterday without failures. Running totalmess right now on v3, so far so good :) Tested-by: Eric Wong normalper...@yhbt.net

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-04-11 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: It is provided as a starting point only. Test cases should be ported from Userspace RCU to kernel space and thoroughly run on a wide range of architectures before considering this port production-ready. Hi Mathieu, any progress on this

Re: [PATCH] epoll: fix sparse error on RCU assignment

2013-03-28 Thread Eric Wong
Oleg Nesterov o...@redhat.com wrote: On 03/14, Eric Wong wrote: Oleg Nesterov o...@redhat.com wrote: On 03/10, Eric Wong wrote: This fixes the following sparse error when using CONFIG_SPARSE_RCU_POINTER=y and make C=2 fs/eventpoll.o fs/eventpoll.c:514:17: error

[PATCH -mm] epoll: cleanup: use RCU_INIT_POINTER when nulling

2013-03-28 Thread Eric Wong
It is always safe to use RCU_INIT_POINTER to NULL a pointer. This results in slightly smaller/faster code. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Andrew Morton a...@linux-foundation.org --- Andrew: Sorry for the noise and requiring these cleanups. Would you want a squashed commit

[PATCH] wfcqueue: add function for unsynchronized prepend

2013-03-29 Thread Eric Wong
In some situations, it is necessary to prepend a node to a queue. For epoll, this is necessary for two rare conditions: * when the user triggers -EFAULT * when reinjecting elements from the ovflist (implemented as a stack) Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Mathieu Desnoyers

Re: New copyfile system call - discuss before LSF?

2013-03-31 Thread Eric Wong
Pavel Machek pa...@ucw.cz wrote: Eric Wong wrote: [1] my splice() annoyances: * need to create/manage a pipe * copy size limited by pipe size * doesn't reduce userspace syscalls (just data copy overhead) * easy to misuse and starve with blocking sockets + big buffers

[PATCH v4] epoll: avoid spinlock contention with wfcqueue

2013-04-01 Thread Eric Wong
cache line and helps improve performance. * destroy wakeup source immediately for zombie epitems, there is no need to wait until ep_send_events. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux

Re: [PATCH] wfcqueue: add function for unsynchronized prepend

2013-04-02 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: In some situations, it is necessary to prepend a node to a queue. For epoll, this is necessary for two rare conditions: * when the user triggers -EFAULT * when reinjecting elements from

Re: [PATCH] epoll: fix sparse error on RCU assignment

2013-03-13 Thread Eric Wong
Oleg Nesterov o...@redhat.com wrote: On 03/10, Eric Wong wrote: This fixes the following sparse error when using CONFIG_SPARSE_RCU_POINTER=y and make C=2 fs/eventpoll.o fs/eventpoll.c:514:17: error: incompatible types in comparison expression (different address spaces

[PATCH mm] epoll: lock ep->mtx in ep_free to silence lockdep

2013-03-13 Thread Eric Wong
Technically we do not need to hold ep->mtx during ep_free since we are certain there are no other users of ep at that point. However, lockdep complains with a "suspicious rcu_dereference_check() usage!" message; so lock the mutex before ep_remove to silence the warning. Signed-off-by: Eric Wong

[PATCH] epoll: cleanup: hoist out f_op->poll calls

2013-03-13 Thread Eric Wong
This reduces the amount of code inside the ready list iteration loops for better readability IMHO. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- (I think this depends

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-03-13 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: Ported to the Linux kernel from Userspace RCU library, at commit 108a92e5b97ee91b2b902dba2dd2e78aab42f420. Ref: http://git.lttng.org/userspace-rcu.git It is provided as a starting point only. Test cases should be ported from

[RFC] epoll: avoid spinlock contention with wfcqueue

2013-03-13 Thread Eric Wong
cleanly to 3.9-rc* since there's no epoll changes in that) --8--- From 139f0d4528c3fabc6a54e47be73ba9990b42cdd8 Mon Sep 17 00:00:00 2001 From: Eric Wong normalper...@yhbt.net Date: Thu, 14 Mar 2013 02:37:12 + Subject: [PATCH] epoll: avoid

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-03-14 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: +/* + * Load a data from shared memory. + */ +#define CMM_LOAD_SHARED(p) ACCESS_ONCE(p) When iterating

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-03-14 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: The advantage of using splice() over dequeue() is that you will reduce the amount of interactions between concurrent enqueue

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-03-16 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: +/* + * Load a data from shared memory. + */ +#define CMM_LOAD_SHARED(p

Re: [RFC PATCH] Linux kernel Wait-Free Concurrent Queue Implementation

2013-03-18 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: Thanks for providing this detailed scenario. I think there is an important aspect in the use of splice I suggested on which we are not fully understanding each other. I will annotate your scenario below with clarifications: Ah yes, I

[RFC v2] epoll: avoid spinlock contention with wfcqueue

2013-03-18 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: I'm posting this lightly tested version since I may not be able to do more testing/benchmarking until the weekend. Still lightly tested (on an initramfs KVM, no real applications, yet). Davide's totalmess is still running, so that's probably a good sign

Re: [RFC v2] epoll: avoid spinlock contention with wfcqueue

2013-03-18 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: Eric Wong normalper...@yhbt.net wrote: I'm posting this lightly tested version since I may not be able to do more testing/benchmarking until the weekend. Still lightly tested

Re: [RFC v2] epoll: avoid spinlock contention with wfcqueue

2013-03-18 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: I'm also not entirely sure why you need to add enum epoll_item_state along with expensive atomic ops to compute the state. Wouldn't it be enough to know in which queue the nodes are located

[PATCH mm] epoll: fix suspicious RCU usage in ep_poll_callback

2013-03-20 Thread Eric Wong
The commit "epoll: use RCU to protect wakeup_source in epitem" introduced the ep_pm_stay_awake_rcu function for ep_poll_callback use, but I left it unused by accident. ep->mtx cannot be held in ep_poll_callback, so RCU should be used here. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Rafael J

[PATCH] wfcqueue: functions for local append and enqueue

2013-03-21 Thread Eric Wong
-by: Eric Wong normalper...@yhbt.net Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com Cc: Lai Jiangshan la...@cn.fujitsu.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Stephen Hemminger shemmin...@vyatta.com Cc: Davide Libenzi davi...@xmailserver.org --- Benchmark for this coming

[RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-21 Thread Eric Wong
-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com --- fs/eventpoll.c | 615 ++--- 1

[RFC v3 2/2] epoll: use a local wfcq functions for Level Trigger

2013-03-21 Thread Eric Wong
and these new _local functions is large and outside of the margin of error. ref: http://www.xmailserver.org/epwbench.c Somewhat-tested-by: Eric Wong normalper...@yhbt.net Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-21 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: This is still not a proper commit, I've lightly tested this. Btw, full series here (already sent to LKML and some in -mm) http://yhbt.net/epoll-wfcqueue-v3.8.3-20130321.mbox (should apply cleanly to all 3.8/3.9 kernels)

[RFC v2 3/2] epoll: avoid using extra cache line on most 64-bit

2013-03-21 Thread Eric Wong
-by: Eric Wong normalper...@yhbt.net Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- fs/eventpoll.c | 27 +++ 1 file changed, 23 insertions(+), 4

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-21 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Thu, Mar 21, 2013 at 4:52 AM, Eric Wong normalper...@yhbt.net wrote: Changes since v2: * epi->state is no longer atomic, we only cmpxchg in ep_poll_callback now and rely on implicit barriers in other places for reading. * intermediate

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-22 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Thu, Mar 21, 2013 at 8:24 PM, Eric Wong normalper...@yhbt.net wrote: With EPOLLET and improper usage (not hitting EAGAIN), the event now has a larger window to be lost (as mentioned in my changelog). What about the case where EPOLLET is not set

Re: [PATCH] wfcqueue: functions for local append and enqueue

2013-03-22 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: /* + * __wfcq_append_local: append one local queue to another local queue + * + * No memory barriers are issued. Mutual exclusion is the responsibility + * of the caller

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-22 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Thu, Mar 21, 2013 at 8:24 PM, Eric Wong normalper...@yhbt.net wrote: With EPOLLET and improper usage (not hitting EAGAIN), the event now has a larger window to be lost (as mentioned in my changelog

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-22 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 22, 2013 at 3:31 AM, Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Thu, Mar 21, 2013 at 8:24 PM, Eric Wong normalper...@yhbt.net wrote: With EPOLLET and improper usage (not hitting EAGAIN), the event

Re: [RFC v3 1/2] epoll: avoid spinlock contention with wfcqueue

2013-03-23 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: At some point the level triggered event has to get cleared. As far as I can tell, your new code will drop new events that occur between revents = ep_item_poll(epi, pt); and epi->state = EP_STATE_IDLE

Re: [-next] fs/eventpoll.c:545 suspicious rcu_dereference_check() usage

2013-03-23 Thread Eric Wong
Sergey Senozhatsky sergey.senozhat...@gmail.com wrote: [3.163894] === [3.163895] [ INFO: suspicious RCU usage. ] [3.163897] 3.9.0-rc3-next-20130322-dbg-dirty #1 Not tainted [3.163898] --- [3.163900] fs/eventpoll.c:545

[PATCH v2] wfcqueue: functions for local append and enqueue

2013-03-23 Thread Eric Wong
-off-by: Eric Wong normalper...@yhbt.net Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com Cc: Lai Jiangshan la...@cn.fujitsu.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Stephen Hemminger shemmin...@vyatta.com Cc: Davide Libenzi davi...@xmailserver.org --- I noticed the original

[PATCH v3] wfcqueue: functions for local append and enqueue

2013-03-23 Thread Eric Wong
Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: * Eric Wong (normalper...@yhbt.net) wrote: /* + * ___wfcq_append: append one local queue to another local queue + * __wfcq_append() and ___wfcq_append() are meant to be private to wfcqueue.h. Therefore, the comment above should

Re: I/O blocked while dirty pages are being flushed

2013-03-24 Thread Eric Wong
Fredrik Tolf fred...@dolda2000.com wrote: It is worth noting, also, that this seems to be a situation introduced somewhere between 2.6.26 and 2.6.32, because I started noticing it when I upgraded from Debian 5.0 to 6.0. I've since tried it on 3.2.0, 3.5.4 and 3.7.1, and it appears in every

Re: epoll: possible bug from wakeup_source activation

2013-03-07 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Hi Arve, looking at commit 4d7e30d98939a0340022ccd49325a3d70f7e0238 (epoll: Add a flag, EPOLLWAKEUP, to prevent suspend ...) I think the reason for using ep->ws instead of epi->ws in the unlikely ovflist case applies to the likely rdllist case, too

Re: [PATCH 2/2] epoll: add tracepoints for epitem enqueue/dequeue

2013-03-07 Thread Eric Wong
Putting this on hold for now. I'm awaiting answers for 20130307112639.ga25...@dcvr.yhbt.net, (Subject: epoll: possible bug from wakeup_source activation) this patch may hide the possible bug I'm referring to in that email.

Re: epoll: possible bug from wakeup_source activation

2013-03-08 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Thu, Mar 7, 2013 at 5:30 PM, Eric Wong normalper...@yhbt.net wrote: Eric Wong normalper...@yhbt.net wrote: Hi Arve, looking at commit 4d7e30d98939a0340022ccd49325a3d70f7e0238 (epoll: Add a flag, EPOLLWAKEUP, to prevent suspend ...) I think

[PATCH] epoll: comment + BUILD_BUG_ON to prevent epitem bloat

2013-03-08 Thread Eric Wong
This will prevent us from accidentally introducing a memory bloat regression here in the future. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Andrew Morton a...@linux-foundation.org Cc: Davide Libenzi davi...@xmailserver.org, Cc: Al Viro v...@zeniv.linux.org.uk --- Andrew Morton

Re: epoll: possible bug from wakeup_source activation

2013-03-08 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 12:49 PM, Eric Wong normalper...@yhbt.net wrote: What happens if ep_modify calls ep_destroy_wakeup_source while __pm_stay_awake is running on the same epi->ws? Yes, that looks like a problem. I think calling

Re: epoll: possible bug from wakeup_source activation

2013-03-09 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 12:49 PM, Eric Wong normalper...@yhbt.net wrote: What happens if ep_modify calls ep_destroy_wakeup_source while __pm_stay_awake is running on the same epi->ws? Yes, that looks like

[PATCH] epoll: fix sparse error on RCU assignment

2013-03-10 Thread Eric Wong
: Davide Libenzi davi...@xmailserver.org Cc: Eric Dumazet eric.duma...@gmail.com Cc: Oleg Nesterov o...@redhat.com Signed-off-by: Eric Wong normalper...@yhbt.net --- Oleg: I found this error since I was working on an unrelated patch to convert wakeup_source users to RCU in epoll. This was introduced

[PATCH] epoll: use RCU protect wakeup_source in epitem

2013-03-10 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Sun, 2013-03-10 at 01:11 +, Eric Wong wrote: static void ep_destroy_wakeup_source(struct epitem *epi) { - wakeup_source_unregister(epi->ws); - epi->ws = NULL; + struct wakeup_source *ws = epi->ws; + + rcu_assign_pointer

wfcqueue (in Userspace RCU) for Linux kernel (for epoll)

2013-03-11 Thread Eric Wong
Hi, I'm looking to reduce contention for the ep->lock spin lock in epoll. I came across wfcqueue in Userspace RCU and am wondering if there's any reason (other than lack of developer time/users) it hasn't been adapted for the Linux kernel. I'd be happy to do the work if it's suitable (and omit

Re: epoll: possible bug from wakeup_source activation

2013-03-11 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 11:10 PM, Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 12:49 PM, Eric Wong normalper...@yhbt.net wrote: What happens if ep_modify calls ep_destroy_wakeup_source while

Re: epoll: possible bug from wakeup_source activation

2013-03-11 Thread Eric Wong
Arve Hjønnevåg a...@android.com wrote: On Mon, Mar 11, 2013 at 5:17 PM, Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 11:10 PM, Eric Wong normalper...@yhbt.net wrote: Arve Hjønnevåg a...@android.com wrote: On Fri, Mar 8, 2013 at 12

Re: [PATCH] epoll: preserve ordering of events from ovflist

2013-03-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Events arriving in ovflist are stored in LIFO order, so we should account for that when inserting them into rdllist. Fwiw, I noticed this oddity because I wanted to start tracing epitem readiness (to detect when my application is not calling epoll_wait

[PATCH 1/2] epoll: hoist out duplicated wake up logic

2013-03-03 Thread Eric Wong
This makes the kernel slightly smaller, and hopefully easier to follow. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- fs/eventpoll.c | 27 +++ 1

[PATCH 2/2] epoll: add tracepoints for epitem enqueue/dequeue

2013-03-03 Thread Eric Wong
-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- fs/eventpoll.c | 35 +++- include/linux/eventpoll.h| 8 +++ include/trace/events/eventpoll.h

Re: sendfile and EAGAIN

2013-03-04 Thread Eric Wong
Ulrich Drepper drep...@gmail.com wrote: On Mon, Feb 25, 2013 at 2:22 PM, Eric Dumazet eric.duma...@gmail.com wrote: I don't understand the issue. sendfile() returns -EAGAIN only if no bytes were copied to the socket. There is something wrong/unexpected/... I have a program which can

[PATCH] epoll: trim epitem by one cache line on x86_64

2013-03-04 Thread Eric Wong
instead of __attribute__((packed)) Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- fs/eventpoll.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs

Re: [PATCH] epoll: trim epitem by one cache line on x86_64

2013-03-07 Thread Eric Wong
@@ struct epoll_filefd { struct file *file; int fd; -} EPOLL_PACKED; +} __packed; Thanks for testing on ppc. Looks good to me. For what it's worth: Acked-by: Eric Wong normalper...@yhbt.net

epoll: possible bug from wakeup_source activation

2013-03-07 Thread Eric Wong
Hi Arve, looking at commit 4d7e30d98939a0340022ccd49325a3d70f7e0238 (epoll: Add a flag, EPOLLWAKEUP, to prevent suspend ...) I think the reason for using ep->ws instead of epi->ws in the unlikely ovflist case applies to the likely rdllist case, too. Since epi->ws is only protected by ep->mtx, it can

Re: splice() giving unexpected EOF in 3.7.3 and 3.8-rc4+

2013-02-07 Thread Eric Wong
David Miller da...@davemloft.net wrote: From: Eric Dumazet eric.duma...@gmail.com Date: Fri, 18 Jan 2013 22:13:16 -0800 On Fri, 2013-01-18 at 21:54 -0800, Eric Dumazet wrote: Hmm, this might be already fixed in net-next tree, could you try it ? Yes, running your program on

Re: [PATCH 1/1] eventfd: implementation of EFD_MASK flag

2013-02-08 Thread Eric Wong
Andy Lutomirski l...@amacapital.net wrote: On Thu, Feb 7, 2013 at 12:11 PM, Martin Sustrik sust...@250bpm.com wrote: On 07/02/13 20:12, Andy Lutomirski wrote: On 02/06/2013 10:41 PM, Martin Sustrik wrote: The value of 'events' should be any combination of event flags as defined by

Re: [PATCH 1/1] eventfd: implementation of EFD_MASK flag

2013-02-08 Thread Eric Wong
Martin Sustrik sust...@250bpm.com wrote: On 07/02/13 23:44, Andrew Morton wrote: That's a nice changelog but it omitted a critical thing: why do you think the kernel needs this feature? What's the value and use case for being able to poll these descriptors? To address the question, I've

Re: [PATCH 1/1] eventfd: implementation of EFD_MASK flag

2013-02-08 Thread Eric Wong
Martin Sustrik sust...@250bpm.com wrote: On 08/02/13 23:21, Eric Wong wrote: Martin Sustriksust...@250bpm.com wrote: To address the question, I've written down detailed description of the challenges of the network protocol development in user space and how the proposed feature addresses

Re: [PATCH 1/1] eventfd: implementation of EFD_MASK flag

2013-02-09 Thread Eric Wong
Martin Sustrik sust...@250bpm.com wrote: On 09/02/13 04:54, Eric Wong wrote: Using one eventfd per userspace socket still seems a bit wasteful. Wasteful in what sense? Occupying a slot in file descriptor table? That's the price for having the socket uniquely identified by the fd. Yes. I

[PATCH] epoll: preserve ordering of events from ovflist

2013-03-01 Thread Eric Wong
Events arriving in ovflist are stored in LIFO order, so we should account for that when inserting them into rdllist. Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Davide Libenzi davi...@xmailserver.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Andrew Morton a...@linux-foundation.org --- I

splice() giving unexpected EOF in 3.7.3 and 3.8-rc4+

2013-01-18 Thread Eric Wong
With the following flow, I'm sometimes getting an unexpected EOF on the pipe reader even though I never close the pipe writer: tcp_wr -write-> tcp_rd -splice-> pipe_wr -> pipe_rd -splice-> /dev/null I encounter this in 3.7.3, 3.8-rc3, and the latest from Linus

Re: splice() giving unexpected EOF in 3.7.3 and 3.8-rc4+

2013-01-18 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Fri, 2013-01-18 at 21:54 -0800, Eric Dumazet wrote: Hmm, this might be already fixed in net-next tree, could you try it ? Yes, running your program on net-next seems OK. David, we need the two following commits. commit

Re: New copyfile system call - discuss before LSF?

2013-02-21 Thread Eric Wong
Jeremy Allison j...@samba.org wrote: On Thu, Feb 21, 2013 at 01:51:53PM +, Myklebust, Trond wrote: On Thu, 2013-02-21 at 12:37 +0100, Ric Wheeler wrote: We have debated the need to have a system call to allow for offloading copy operations, for example to an NFS server (part to

Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue

2013-02-22 Thread Eric Wong
Phillip Susi ps...@ubuntu.com wrote: On Sat, Dec 15, 2012 at 12:54:48AM +, Eric Wong wrote: strace -T timing on an uncached, one gigabyte file: Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 2.484832 After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 0.61 It shouldn't

Re: New copyfile system call - discuss before LSF?

2013-02-22 Thread Eric Wong
Myklebust, Trond trond.mykleb...@netapp.com wrote: -Original Message- From: Zach Brown [mailto:z...@redhat.com] Sent: Thursday, February 21, 2013 5:25 PM To: Myklebust, Trond Cc: Paolo Bonzini; Ric Wheeler; Linux FS Devel; linux-kernel@vger.kernel.org; Chris L. Mason;

Re: epoll with ONESHOT possibly fails to deliver events

2012-12-17 Thread Eric Wong
Andreas Voellmy andreas.voel...@yale.edu wrote: There were a couple of errors in the code when I posted my last message. I have fixed those. The epoll bug still occurs. Sorry I haven't gotten around to this. Can you reproduce this with fewer cores? (I only have 4 at most). Have you tried the

Re: epoll with ONESHOT possibly fails to deliver events

2012-12-20 Thread Eric Wong
Andreas Voellmy andreas.voel...@yale.edu wrote: I wrote a C program that behaves similar to my original program and triggers the bug. The bug only arises when I use enough cores and threads (about 16). The program is here: https://github.com/AndreasVoellmy/epollbug/blob/master/epollbug.c I

Re: epoll with ONESHOT possibly fails to deliver events

2012-12-21 Thread Eric Wong
Junchang(Jason) Wang junchang.w...@yale.edu wrote: We still believe this is a bug in epoll system even though we can't prove that so far. Both Andi and I are very interested in this problem and helping you experts solve it. Just let us know if we can help. I'm just another epoll user,

[PATCH v2] fadvise: perform WILLNEED readahead asynchronously

2012-12-24 Thread Eric Wong
...@fromorbit.com Cc: Zheng Liu gnehzuil@gmail.com Signed-off-by: Eric Wong normalper...@yhbt.net --- I have not tested on NUMA (since I've no access to NUMA hardware) and do not know how the use of the workqueue affects RA performance. I'm only using WQ_UNBOUND on non-NUMA, though. I'm

ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
== pthread_join(rs, NULL)); assert(0 == pthread_join(r, NULL)); return 0; } 8 Any help/suggestions/test patches would be greatly appreciated. Thanks for reading! -- Eric Wong

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-27 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduce the issue after many minutes (1 hour). It might be easier to reproduce on a busy system while disk I/O is happening. Ugh, I

Re: ppoll() stuck on POLLIN while TCP peer is sending

2012-12-29 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduce the issue after many minutes (1 hour). It might be easier to reproduce on a busy

[PATCH] poll: prevent missed events if _qproc is NULL

2012-12-31 Thread Eric Wong
() barrier in poll_schedule_timeout() appears to be insufficient on my SMP x86-64 machine (as it's only an xchg()). This may also be related to the epoll issue described by Andreas Voellmy in http://thread.gmane.org/gmane.linux.kernel/1408782/ Signed-off-by: Eric Wong normalper...@yhbt.net Cc: Hans

Re: [PATCH] poll: prevent missed events if _qproc is NULL

2012-12-31 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: This patch seems to fix my issue with ppoll() being stuck on my SMP machine: http://article.gmane.org/gmane.linux.file-systems/70414 OK, it doesn't fix my issue, but it seems to make it harder-to-hit... The change to sock_poll_wait() in commit

Re: [PATCH] poll: prevent missed events if _qproc is NULL

2013-01-01 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Mon, 2012-12-31 at 13:21 +, Eric Wong wrote: This patch seems to fix my issue with ppoll() being stuck on my SMP machine: http://article.gmane.org/gmane.linux.file-systems/70414 The change to sock_poll_wait() in commit

Re: [PATCH] poll: prevent missed events if _qproc is NULL

2013-01-01 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Dumazet eric.duma...@gmail.com wrote: commit 626cf236608505d376e4799adb4f7eb00a8594af should not have this side effect, at least for poll()/select() functions. The epoll() changes I am not yet very confident. I have a better explanation

[PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

2013-01-01 Thread Eric Wong
it. --- 8 From 02f43757d04bb6f2786e79eecf1cfa82e6574379 Mon Sep 17 00:00:00 2001 From: Eric Wong normalper...@yhbt.net Date: Tue, 1 Jan 2013 21:20:27 + Subject: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD EPOLL_CTL_MOD sets the interest mask before

Re: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

2013-01-02 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: First, thanks for working on this issue. No problem! It seems the real problem is the epi->event.events = event->events; which is done without taking ep->lock Yes. I am hoping it is possible to do it without a lock there, but your change is more

Re: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

2013-01-02 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Wed, 2013-01-02 at 18:40 +, Eric Wong wrote: Eric Dumazet eric.duma...@gmail.com wrote: It seems the real problem is the epi->event.events = event->events; which is done without taking ep->lock Yes. I am hoping it is possible to do

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
(changing Cc:) Eric Wong normalper...@yhbt.net wrote: I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a local TCP socket. The isolated code below can reproduce the issue after many minutes (1 hour). It might be easier to reproduce on a busy system while disk I/O

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-02 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: [1] my full setup is very strange. Other than the FUSE component I forgot to mention, little depends on the kernel. With all this, the standalone toosleepy can get stuck. I'll try to reproduce it with less... I just confirmed my toosleepy

Re: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

2013-01-02 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Linus Torvalds torva...@linux-foundation.org wrote: Please document the barrier that this mb() pairs with, and then give an explanation for the fix in the commit message, and I'll happily take it. Even if it's just duplicating the comments above

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Dumazet eric.duma...@gmail.com wrote: On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote: Eric Wong normalper...@yhbt.net wrote: [1] my full setup is very strange. Other than the FUSE component I forgot to mention, little depends on the kernel. With all

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Dumazet eric.duma...@gmail.com wrote: With the following patch, I cant reproduce the 'apparent stuck' Right, the output is just an approximation and the logic there was bogus. Thanks for looking at this. I'm still able to reproduce the issue

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: I think this requires frequent dirtying/cycling of pages to reproduce. (from copying large files around) to interact with compaction. I'll see if I can reproduce the issue with read-only FS activity. Still successfully running the read-only test on my

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-03 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Eric Wong normalper...@yhbt.net wrote: I think this requires frequent dirtying/cycling of pages to reproduce. (from copying large files around) to interact with compaction. I'll see if I can reproduce the issue with read-only FS activity. Still

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: Instead, I disabled THP+compaction under v3.7.1 and I've been unable to reproduce the issue without THP+compaction. Implying that it's stuck in compaction somewhere. It could be the case

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-04 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote: On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote: Instead, I disabled THP+compaction under v3.7.1 and I've been unable to reproduce the issue without THP+compaction. Implying that it's stuck in compaction somewhere. It could be the case

Re: epoll with ONESHOT possibly fails to deliver events

2012-12-13 Thread Eric Wong
Andreas Voellmy andreas.voel...@yale.edu wrote: Using strace, I checked that my program is using epoll api as I described. Here is a fragment of the strace output that demonstrates my use: recvfrom(161, GET / HTTP/1.1\r\nHost: 10.12.0.1:..., 90, 0, NULL, NULL) = 90 sendto(161, HTTP/1.1 200

[PATCH] fadvise: perform WILLNEED readahead in a workqueue

2012-12-14 Thread Eric Wong
should not hurt existing applications. strace -T timing on an uncached, one gigabyte file: Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 2.484832 After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 0.61 Signed-off-by: Eric Wong normalper...@yhbt.net --- N.B.: I'm not sure if I'm

Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue

2012-12-15 Thread Eric Wong
Alan Cox a...@lxorguk.ukuu.org.uk wrote: On Sat, 15 Dec 2012 00:54:48 + Eric Wong normalper...@yhbt.net wrote: Applications streaming large files may want to reduce disk spinups and I/O latency by performing large amounts of readahead up front How does it compare benchmark wise

Re: resend--[PATCH] improve read ahead in kernel

2012-12-15 Thread Eric Wong
xtu4 xiaobing...@intel.com wrote: resend it, due to format error Subject: [PATCH] when system in low memory scenario, imaging there is a mp3 play, or a video play, we need to read mp3 or video file from memory to page cache, but when system lack of memory, page cache of mp3 or video file

Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue

2012-12-15 Thread Eric Wong
Dave Chinner da...@fromorbit.com wrote: On Sat, Dec 15, 2012 at 12:54:48AM +, Eric Wong wrote: Applications streaming large files may want to reduce disk spinups and I/O latency by performing large amounts of readahead up front. Applications also tend to read files soon after opening

Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue

2012-12-15 Thread Eric Wong
Eric Wong normalper...@yhbt.net wrote: Perhaps squashing something like the following will work? Last hunk should've had a return before skip_ra: --- a/mm/readahead.c +++ b/mm/readahead.c @@ -264,6 +266,10 @@ void wq_page_cache_readahead(struct address_space *mapping, struct file *filp

Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue

2012-12-15 Thread Eric Wong
Dave Chinner da...@fromorbit.com wrote: On Sun, Dec 16, 2012 at 12:25:49AM +, Eric Wong wrote: Alan Cox a...@lxorguk.ukuu.org.uk wrote: On Sat, 15 Dec 2012 00:54:48 + Eric Wong normalper...@yhbt.net wrote: Applications streaming large files may want to reduce disk spinups
