Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
On 02/17/2015 04:09 PM, Andy Lutomirski wrote:
> On Tue, Feb 17, 2015 at 12:33 PM, Jason Baron wrote:
>> On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
>>> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
>>>> When we are sharing a wakeup source among multiple epoll fds, we end up
>>>> with thundering herd wakeups, since there is currently no way to add to
>>>> the wakeup source exclusively. This series introduces 2 new epoll flags:
>>>> EPOLLEXCLUSIVE, for adding to a wakeup source exclusively, and
>>>> EPOLLROUNDROBIN, which is to be used in conjunction with EPOLLEXCLUSIVE
>>>> to evenly distribute the wakeups. This patch was originally motivated by
>>>> a desire to improve wakeup balance and cpu usage for a listen socket()
>>>> shared amongst multiple epoll fd sets.
>>>>
>>>> See: http://lwn.net/Articles/632590/ for previous test program and
>>>> testing results.
>>>>
>>>> Epoll manpage text:
>>>>
>>>> EPOLLEXCLUSIVE
>>>>     Provides exclusive wakeups when attaching multiple epoll fds to a
>>>>     shared wakeup source. Must be specified with an EPOLL_CTL_ADD
>>>>     operation.
>>>>
>>>> EPOLLROUNDROBIN
>>>>     Provides balancing for exclusive wakeups when attaching multiple
>>>>     epoll fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being
>>>>     set and must be specified with an EPOLL_CTL_ADD operation.
>>>>
>>>> Thanks,
>>> What permissions do you need on the file descriptor to do this? This
>>> will be the first case where a poll-like operation has side effects,
>>> and that's rather weird IMO.
>>>
>> So in the case where you have both non-exclusive and exclusive
>> waiters, all of the non-exclusive waiters will continue to get woken
>> up. However, I think you're getting at having multiple exclusive
>> waiters and potentially 'starving' out other exclusive waiters.
>>
>> In general, I think wait queues are associated with a 'struct file',
>> so I think unless you are sharing your fd table, this isn't an issue.
>> However, there may be cases where this is not true? In which
>> case, perhaps, we could limit this to CAP_SYS_ADMIN...
> There's also SCM_RIGHTS, which can be used in conjunction with file
> sealing and such.
>
> In general, I feel like this patch series solves a problem that isn't
> well understood and does it by adding a rather strange new mechanism.
> Is there really a problem that can't be addressed by more normal epoll
> features?
>
> --Andy

Hmm, so I dug through some of the Linux archives a bit, and this problem
seems to crop up every so often without resolution. So I do believe that
it's an issue that ppl are more generally interested in. See:

http://lkml.iu.edu/hypermail/linux/kernel/1201.1/02620.html
http://marc.info/?l=linux-kernel&m=128638781921073&w=2

In the latter thread, Linus suggests adding it to the "requested events"
field to poll:

http://marc.info/?l=linux-kernel&m=128639416832335&w=2

So, I think that this series at least moves in that suggested direction.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
On Tue, Feb 17, 2015 at 12:33 PM, Jason Baron wrote:
> On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
>> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
>>> When we are sharing a wakeup source among multiple epoll fds, we end up
>>> with thundering herd wakeups, since there is currently no way to add to
>>> the wakeup source exclusively. This series introduces 2 new epoll flags:
>>> EPOLLEXCLUSIVE, for adding to a wakeup source exclusively, and
>>> EPOLLROUNDROBIN, which is to be used in conjunction with EPOLLEXCLUSIVE
>>> to evenly distribute the wakeups. This patch was originally motivated by
>>> a desire to improve wakeup balance and cpu usage for a listen socket()
>>> shared amongst multiple epoll fd sets.
>>>
>>> See: http://lwn.net/Articles/632590/ for previous test program and
>>> testing results.
>>>
>>> Epoll manpage text:
>>>
>>> EPOLLEXCLUSIVE
>>>     Provides exclusive wakeups when attaching multiple epoll fds to a
>>>     shared wakeup source. Must be specified with an EPOLL_CTL_ADD
>>>     operation.
>>>
>>> EPOLLROUNDROBIN
>>>     Provides balancing for exclusive wakeups when attaching multiple
>>>     epoll fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being
>>>     set and must be specified with an EPOLL_CTL_ADD operation.
>>>
>>> Thanks,
>> What permissions do you need on the file descriptor to do this? This
>> will be the first case where a poll-like operation has side effects,
>> and that's rather weird IMO.
>>
>
> So in the case where you have both non-exclusive and exclusive
> waiters, all of the non-exclusive waiters will continue to get woken
> up. However, I think you're getting at having multiple exclusive
> waiters and potentially 'starving' out other exclusive waiters.
>
> In general, I think wait queues are associated with a 'struct file',
> so I think unless you are sharing your fd table, this isn't an issue.
> However, there may be cases where this is not true? In which
> case, perhaps, we could limit this to CAP_SYS_ADMIN...

There's also SCM_RIGHTS, which can be used in conjunction with file
sealing and such.

In general, I feel like this patch series solves a problem that isn't
well understood and does it by adding a rather strange new mechanism.
Is there really a problem that can't be addressed by more normal epoll
features?

--Andy

>
> Thanks,
>
> -Jason
>

--
Andy Lutomirski
AMA Capital Management, LLC
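To make the SCM_RIGHTS point concrete, here is a minimal sketch in Python (3.9 or later on a Unix system is assumed) showing that a descriptor passed over a Unix socket refers to the same open file as the original, even though the fd table entry is new. The helper name `pass_fd_over_socket` is made up for illustration, and both ends live in one process only to keep the sketch short; normally the receiver would be a different process.

```python
import os
import socket


def pass_fd_over_socket(fd):
    """Send `fd` over a Unix socket pair (SCM_RIGHTS) and return the copy.

    Hypothetical helper for illustration only. The returned descriptor is a
    new fd number that refers to the same open file as `fd`.
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # send_fds() attaches the descriptor as SCM_RIGHTS ancillary data.
        socket.send_fds(sender, [b"x"], [fd])
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        return fds[0]
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    received = pass_fd_over_socket(write_end)
    # File status flags live on the shared open file, not on the fd number,
    # so changing them through one descriptor is visible through the other.
    os.set_blocking(received, False)
    print("original now non-blocking:", not os.get_blocking(write_end))
    for fd in (read_end, write_end, received):
        os.close(fd)
```

Since both descriptors resolve to one open file, anything attached to that file, wait queue included, is shared with whoever received the descriptor, without any shared fd table.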
Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
>> When we are sharing a wakeup source among multiple epoll fds, we end up
>> with thundering herd wakeups, since there is currently no way to add to
>> the wakeup source exclusively. This series introduces 2 new epoll flags:
>> EPOLLEXCLUSIVE, for adding to a wakeup source exclusively, and
>> EPOLLROUNDROBIN, which is to be used in conjunction with EPOLLEXCLUSIVE
>> to evenly distribute the wakeups. This patch was originally motivated by
>> a desire to improve wakeup balance and cpu usage for a listen socket()
>> shared amongst multiple epoll fd sets.
>>
>> See: http://lwn.net/Articles/632590/ for previous test program and
>> testing results.
>>
>> Epoll manpage text:
>>
>> EPOLLEXCLUSIVE
>>     Provides exclusive wakeups when attaching multiple epoll fds to a
>>     shared wakeup source. Must be specified with an EPOLL_CTL_ADD
>>     operation.
>>
>> EPOLLROUNDROBIN
>>     Provides balancing for exclusive wakeups when attaching multiple
>>     epoll fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being
>>     set and must be specified with an EPOLL_CTL_ADD operation.
>>
>> Thanks,
> What permissions do you need on the file descriptor to do this? This
> will be the first case where a poll-like operation has side effects,
> and that's rather weird IMO.
>

So in the case where you have both non-exclusive and exclusive
waiters, all of the non-exclusive waiters will continue to get woken
up. However, I think you're getting at having multiple exclusive
waiters and potentially 'starving' out other exclusive waiters.

In general, I think wait queues are associated with a 'struct file',
so I think unless you are sharing your fd table, this isn't an issue.
However, there may be cases where this is not true? In which
case, perhaps, we could limit this to CAP_SYS_ADMIN...

Thanks,

-Jason
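The waiter semantics described above can be sketched with a toy model in Python. This is not kernel code: the `Waiter` and `WaitQueue` names are made up for illustration, and the model only assumes the usual wait-queue convention that non-exclusive waiters are queued ahead of exclusive ones and that a wakeup stops once `nr_exclusive` exclusive waiters have been woken.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    exclusive: bool = False


class WaitQueue:
    """Toy model of a wait queue; illustrative only, not the kernel's code."""

    def __init__(self):
        self.waiters = []

    def add(self, waiter):
        # Non-exclusive waiters go to the head and exclusive ones to the
        # tail, so every non-exclusive waiter is reached first.
        if waiter.exclusive:
            self.waiters.append(waiter)
        else:
            self.waiters.insert(0, waiter)

    def wake_up(self, nr_exclusive=1):
        """Return the names woken by one event.

        The walk stops after `nr_exclusive` exclusive waiters have been
        woken; nr_exclusive=0 never hits the limit and wakes everyone.
        """
        woken = []
        for waiter in self.waiters:
            woken.append(waiter.name)
            if waiter.exclusive:
                nr_exclusive -= 1
                if nr_exclusive == 0:
                    break
        return woken


if __name__ == "__main__":
    queue = WaitQueue()
    for waiter in (Waiter("plain-a"), Waiter("plain-b"),
                   Waiter("excl-1", True), Waiter("excl-2", True)):
        queue.add(waiter)
    for _ in range(3):
        print(queue.wake_up())
```

In this model every event wakes both non-exclusive waiters plus "excl-1", and "excl-2" is never reached, which is the starvation case among exclusive waiters raised in the reply above.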
Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
> When we are sharing a wakeup source among multiple epoll fds, we end up
> with thundering herd wakeups, since there is currently no way to add to
> the wakeup source exclusively. This series introduces 2 new epoll flags:
> EPOLLEXCLUSIVE, for adding to a wakeup source exclusively, and
> EPOLLROUNDROBIN, which is to be used in conjunction with EPOLLEXCLUSIVE
> to evenly distribute the wakeups. This patch was originally motivated by
> a desire to improve wakeup balance and cpu usage for a listen socket()
> shared amongst multiple epoll fd sets.
>
> See: http://lwn.net/Articles/632590/ for previous test program and
> testing results.
>
> Epoll manpage text:
>
> EPOLLEXCLUSIVE
>     Provides exclusive wakeups when attaching multiple epoll fds to a
>     shared wakeup source. Must be specified with an EPOLL_CTL_ADD
>     operation.
>
> EPOLLROUNDROBIN
>     Provides balancing for exclusive wakeups when attaching multiple epoll
>     fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being set and
>     must be specified with an EPOLL_CTL_ADD operation.
>
> Thanks,

What permissions do you need on the file descriptor to do this? This
will be the first case where a poll-like operation has side effects,
and that's rather weird IMO.

--Andy

>
> -Jason
>
>
> Jason Baron (2):
>   sched/wait: add round robin wakeup mode
>   epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN
>
>  fs/eventpoll.c                 | 25
>  include/linux/wait.h           | 11
>  include/uapi/linux/eventpoll.h |  6
>  kernel/sched/wait.c            | 10
>  4 files changed, 45 insertions(+), 7 deletions(-)
>
> --
> 1.8.2.rc2
>

--
Andy Lutomirski
AMA Capital Management, LLC
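As a rough illustration of what "evenly distribute the wakeups" could mean, the following Python sketch compares plain exclusive wakeups with a round robin variant. The thread does not show how the patch implements this, so moving the woken waiter to the tail of the queue is an assumption made here, and the names `wake_one` and `distribute` are hypothetical. The model also assumes every waiter is idle and ready to be woken for each event.

```python
from collections import deque


def wake_one(queue, round_robin=False):
    """Wake the exclusive waiter at the head of `queue` and return its name.

    With round_robin=True the woken waiter is moved to the tail, so the next
    event goes to a different waiter (an assumed policy, not the patch's).
    """
    if not queue:
        return None
    woken = queue[0]
    if round_robin:
        queue.rotate(-1)  # head moves to the tail
    return woken


def distribute(names, events, round_robin):
    """Count how many of `events` wakeups each named waiter receives."""
    queue = deque(names)
    counts = {name: 0 for name in names}
    for _ in range(events):
        counts[wake_one(queue, round_robin)] += 1
    return counts


if __name__ == "__main__":
    epoll_fds = ["ep1", "ep2", "ep3"]
    print("exclusive only:", distribute(epoll_fds, 9, round_robin=False))
    print("round robin:   ", distribute(epoll_fds, 9, round_robin=True))
```

Under these assumptions, exclusive-only wakeups all land on the first waiter, while the round robin variant spreads nine events as three per waiter.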