[Cluster-devel] Re: [GFS2] Fix runtime issue with UP kernels
From: Steven Whitehouse [EMAIL PROTECTED] Date: Fri, 16 Nov 2007 10:37:43 + Ah, all becomes clear :-) Thanks for chasing down this issue, the patch is in my git tree now. I guess I must have had some other option turned on when I did my UP build that caused this not to happen, Spinlock debugging...
[Cluster-devel] [PATCH] dlm: Handle application limited situations properly.
In the normal regime where an application uses non-blocking I/O
writes on a socket, they will handle -EAGAIN and use poll() to wait
for send space.

They don't actually sleep on the socket I/O write.

But kernel level RPC layers that do socket I/O operations directly
and key off of -EAGAIN on the write() to try again later don't use
poll(), they instead have their own sleeping mechanism and rely upon
->sk_write_space() to trigger the wakeup.

So they do effectively sleep on the write(), but this mechanism
alone does not let the socket layers know what's going on.

Therefore they must emulate what would have happened, otherwise
TCP cannot possibly see that the connection is application window
size limited.

Handle this, therefore, like SUNRPC by setting SOCK_NOSPACE and
bumping the ->sk_write_count as needed when we hit the send buffer
limits.

This should make TCP send buffer size auto-tuning and the
->sk_write_space() callback invocations actually happen.

Signed-off-by: David S. Miller da...@davemloft.net

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 37a34c2..77720f8 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -108,6 +108,7 @@ struct connection {
 #define CF_INIT_PENDING 4
 #define CF_IS_OTHERCON 5
 #define CF_CLOSE 6
+#define CF_APP_LIMITED 7
 	struct list_head writequeue;  /* List of outgoing writequeue_entries */
 	spinlock_t writequeue_lock;
 	int (*rx_action) (struct connection *);	/* What to do when active */
@@ -295,7 +296,17 @@ static void lowcomms_write_space(struct sock *sk)
 {
 	struct connection *con = sock2con(sk);
 
-	if (con && !test_and_set_bit(CF_WRITE_PENDING, &con->flags))
+	if (!con)
+		return;
+
+	clear_bit(SOCK_NOSPACE, &con->sock->flags);
+
+	if (test_and_clear_bit(CF_APP_LIMITED, &con->flags)) {
+		con->sock->sk->sk_write_pending--;
+		clear_bit(SOCK_ASYNC_NOSPACE, &con->sock->flags);
+	}
+
+	if (!test_and_set_bit(CF_WRITE_PENDING, &con->flags))
 		queue_work(send_workqueue, &con->swork);
 }
 
@@ -1319,6 +1330,15 @@ static void send_to_sock(struct connection *con)
 			ret = kernel_sendpage(con->sock, e->page, offset, len,
 					      msg_flags);
 			if (ret == -EAGAIN || ret == 0) {
+				if (ret == -EAGAIN &&
+				    test_bit(SOCK_ASYNC_NOSPACE, &con->sock->flags) &&
+				    !test_and_set_bit(CF_APP_LIMITED, &con->flags)) {
+					/* Notify TCP that we're limited by the
+					 * application window size.
+					 */
+					set_bit(SOCK_NOSPACE, &con->sock->flags);
+					con->sock->sk->sk_write_pending++;
+				}
 				cond_resched();
 				goto out;
 			}
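The flag bookkeeping in this patch can be followed with a small userspace model. This is only a sketch, not kernel code: the names model_sock, model_con, model_send_result and model_write_space are invented for illustration, plain booleans stand in for the kernel's atomic bit operations, and no real socket is involved. It assumes the lower layer has already set its SOCK_ASYNC_NOSPACE equivalent by the time the sender sees -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket-side state the patch touches. */
struct model_sock {
	bool nospace;        /* models SOCK_NOSPACE */
	bool async_nospace;  /* models SOCK_ASYNC_NOSPACE */
	int write_pending;   /* models sk->sk_write_pending */
};

/* Hypothetical stand-in for the dlm connection. */
struct model_con {
	struct model_sock sock;
	bool app_limited;    /* models CF_APP_LIMITED */
	bool write_queued;   /* models CF_WRITE_PENDING */
	int work_queued;     /* number of times send work was queued */
};

/* Models the branch added to send_to_sock(): the first -EAGAIN of a
 * limited episode marks the socket as out of space and bumps the
 * pending-writer count exactly once. */
static void model_send_result(struct model_con *con, int ret)
{
	if (ret == -EAGAIN && con->sock.async_nospace && !con->app_limited) {
		con->app_limited = true;
		con->sock.nospace = true;
		con->sock.write_pending++;
	}
}

/* Models the reworked lowcomms_write_space(): undo the bookkeeping
 * if it was done, then queue the send work unless already queued. */
static void model_write_space(struct model_con *con)
{
	if (con == NULL)
		return;

	con->sock.nospace = false;

	if (con->app_limited) {
		assert(con->sock.write_pending > 0);
		con->app_limited = false;
		con->sock.write_pending--;
		con->sock.async_nospace = false;
	}

	if (!con->write_queued) {
		con->write_queued = true;
		con->work_queued++;
	}
}
```

In this model, repeated -EAGAIN results while the connection is already marked app-limited do not increment write_pending again, and a single write-space callback brings the count back to where it started. That balance is the property the CF_APP_LIMITED flag appears to be there to guarantee.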
Re: [Cluster-devel] [PATCH 5/6] net: add paged frag destructor support to kernel_sendpage.
From: Ian Campbell ian.campb...@citrix.com Date: Thu, 5 Jan 2012 17:13:43 +

> -static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset,
> +static ssize_t do_tcp_sendpages(struct sock *sk,
> +				struct page **pages,
> +				struct skb_frag_destructor **destructors,
> +				int poffset,
>  			 size_t psize, int flags)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);

An array of destructors is madness, and the one call site that specifies this passes an address of a single entry.

This also would never even have to occur if you put the destructor inside of struct page instead.

Finally, except for the skb_shared_info() layout optimization in patch #1 which I already applied, this stuff is not baked enough for the 3.3 merge window.
Re: [Cluster-devel] [PATCH] dlm, sctp: Do not allocate a fd for peeloff
From: Vladislav Yasevich vladislav.yasev...@hp.com Date: Wed, 07 Mar 2012 11:29:20 -0500

> On 03/07/2012 10:41 AM, Benjamin Poirier wrote:
>> avoids allocating a fd that a) propagates to every kernel thread and
>> usermodehelper b) is not properly released.
>>
>> References:
>> http://article.gmane.org/gmane.linux.network.drbd/22529
>> Signed-off-by: Benjamin Poirier bpoir...@suse.de
>
> It might make sense to change sctp_do_peeloff to take the association id
> as the first argument and not do the mapping from id to association
> yourself. It's a bit ugly to expose internal sctp structures outside of
> SCTP.

Agreed.
Re: [Cluster-devel] [PATCH v2 1/2] sctp: Export sctp_do_peeloff
From: Vladislav Yasevich vladislav.yasev...@hp.com Date: Thu, 08 Mar 2012 15:07:40 -0500 On 03/08/2012 10:55 AM, Benjamin Poirier wrote: lookup sctp_association within sctp_do_peeloff() to enable its use outside of the sctp code with minimal knowledge of the former. Signed-off-by: Benjamin Poirier bpoir...@suse.de Acked-by: Vlad Yasevich vladislav.yasev...@hp.com Applied to net-next.
Re: [Cluster-devel] [PATCH v2 2/2] dlm: Do not allocate a fd for peeloff
From: Vladislav Yasevich vladislav.yasev...@hp.com Date: Thu, 08 Mar 2012 15:08:17 -0500 On 03/08/2012 10:55 AM, Benjamin Poirier wrote: avoids allocating a fd that a) propagates to every kernel thread and usermodehelper b) is not properly released. References: http://article.gmane.org/gmane.linux.network.drbd/22529 Signed-off-by: Benjamin Poirier bpoir...@suse.de Looks much better. Also applied to net-next, thanks.
Re: [Cluster-devel] GFS2: Pre-pull patch posting (merge window)
From: David Teigland teigl...@redhat.com Date: Fri, 23 Mar 2012 15:41:52 -0400

> Why does gfs2 Kconfig bother with SCTP at all? It seems that line should
> just be removed. I'll also remove EXPERIMENTAL.
>
> I don't understand the vagaries of Kconfig, so a dumb question, how could
> sctp_do_peeloff possibly be undefined if we're selecting SCTP.

GFS2=y SCTP=m
Re: [Cluster-devel] [patch for-3.8] net, sctp: remove CONFIG_EXPERIMENTAL
From: David Rientjes rient...@google.com Date: Tue, 12 Feb 2013 16:24:56 -0800 (PST) From: Kees Cook keesc...@chromium.org This config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it. Acked-by: David S. Miller da...@davemloft.net Acked-by: Vlad Yasevich vyasev...@gmail.com Signed-off-by: Kees Cook keesc...@chromium.org Signed-off-by: David Rientjes rient...@google.com Applied, thanks.
Re: [Cluster-devel] [PATCH net] sctp: label accepted/peeled off sockets
From: Marcelo Ricardo Leitner Date: Wed, 23 Dec 2015 16:44:09 -0200

> From: Marcelo Ricardo Leitner
>
> Accepted or peeled off sockets were missing a security label (e.g.
> SELinux) which means that socket was in "unlabeled" state.
>
> This patch clones the sock's label from the parent sock and resolves the
> issue (similar to AF_BLUETOOTH protocol family).
>
> Cc: Paul Moore
> Cc: David Teigland
> Signed-off-by: Marcelo Ricardo Leitner

Applied.
Re: [Cluster-devel] [RFC 0/7] netlink: Add allocation flag to netlink_unicast()
From: Masashi Honma Date: Wed, 6 Jul 2016 09:28:29 +0900

> Though currently such a use case was not found, to solve potential
> issue we will add an allocation flag to netlink_unicast().

We don't solve potential issues, we solve real issues. There is no reason to add the GFP parameter until it is actually needed.
Re: [Cluster-devel] [PATCH 1/3] rhashtable: Add rhashtable_lookup_get_insert_fast
From: Andreas Gruenbacher Date: Fri, 17 Mar 2017 15:18:27 +0100

> Dave,
>
> is it a problem if this commit goes in through the gfs2 tree?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git

With no review whatsoever on netdev or linux-kernel? I don't think so. Please repost with appropriate CC:'s added.
Re: [Cluster-devel] [PATCH] rhashtable: Add rhashtable_lookup_get_insert_fast
From: Andreas Gruenbacher Date: Sat, 18 Mar 2017 00:36:15 +0100

> Add rhashtable_lookup_get_insert_fast for fixed keys, similar to
> rhashtable_lookup_get_insert_key for explicit keys.
>
> Signed-off-by: Andreas Gruenbacher
> Acked-by: Herbert Xu

Applied to net-next, thanks.
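For readers unfamiliar with the lookup-or-insert pattern that the helper's name describes, here is a minimal userspace sketch. It is not the rhashtable implementation: ex_table, ex_obj and ex_lookup_get_insert are hypothetical names, the table is a fixed array of chained buckets, and there is no resizing or locking. The return convention shown (the existing object when the key is already present, NULL when the new object was inserted) is an assumption about how such a helper is typically used, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define EX_BUCKETS 8

/* Hypothetical object with the key embedded at a fixed place, which is
 * roughly what "fixed keys" suggests: the key is read from the object
 * itself rather than passed separately. */
struct ex_obj {
	int key;
	struct ex_obj *next;
};

struct ex_table {
	struct ex_obj *bucket[EX_BUCKETS];
};

/* Return the object already stored under obj->key, if any. Otherwise
 * link obj into its bucket and return NULL. Doing both steps in one
 * call means the caller never needs a separate lookup followed by an
 * insert that might find a duplicate. */
static struct ex_obj *ex_lookup_get_insert(struct ex_table *t,
					   struct ex_obj *obj)
{
	struct ex_obj *p;
	unsigned int idx;

	assert(t != NULL && obj != NULL);
	idx = (unsigned int)obj->key % EX_BUCKETS;

	for (p = t->bucket[idx]; p != NULL; p = p->next)
		if (p->key == obj->key)
			return p;

	obj->next = t->bucket[idx];
	t->bucket[idx] = obj;
	return NULL;
}
```

A caller would typically allocate a candidate object, call the helper, and free the candidate again if a non-NULL (already existing) object comes back.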
Re: [Cluster-devel] [PATCH 2/3] rhashtable: Add rhashtable_walk_curr
From: Andreas Gruenbacher Date: Tue, 19 Dec 2017 09:35:47 +0100

> By the way, git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
> master doesn't merge cleanly with current mainline.

Yes, this is the case more often than not :-)
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt
From: Christoph Hellwig Date: Wed, 13 May 2020 08:26:15 +0200

> Hi Dave,
>
> this series removes the kernel_setsockopt and kernel_getsockopt
> functions, and instead switches their users to small functions that
> implement setting (or in one case getting) a sockopt directly using
> a normal kernel function call with type safety and all the other
> benefits of not having a function call.
>
> In some cases these functions seem pretty heavy handed as they do
> a lock_sock even for just setting a single variable, but this mirrors
> the real setsockopt implementation - counter to that a few kernel
> drivers just set the fields directly already.
>
> Nevertheless the diffstat looks quite promising:
>
> 42 files changed, 721 insertions(+), 799 deletions(-)

Overall I'm fine with these changes, but three things need to happen before I can think about applying this:

1) Address David's feedback about the ip_mtu*() calls that can occur on ipv6 sockets too.

2) Handle the feedback about dlm now bringing in sctp even if sctp sockets are not even used because of the symbol dependency.

3) Add the rxrpc documentation requested by David.

Thank you.
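The idea of replacing a generic option call with small typed helpers can be illustrated in userspace. The sketch below is an analogy only, not the kernel code from the series: example_tcp_set_nodelay and example_tcp_get_nodelay are hypothetical names, and they wrap the ordinary setsockopt()/getsockopt() system calls instead of touching socket fields directly. What it shows is the interface difference: the caller passes a bool and cannot get the level, option name, value type or length wrong.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical typed helper: one function per option, with the value
 * expressed in its natural type instead of a (void *, length) pair. */
static int example_tcp_set_nodelay(int fd, bool on)
{
	int val = on ? 1 : 0;

	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, sizeof(val));
}

/* Hypothetical typed getter for the same option. Returns 0 on success
 * and stores the current setting in *on, or -1 on failure. */
static int example_tcp_get_nodelay(int fd, bool *on)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(on != NULL);
	if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
		return -1;

	*on = (val != 0);
	return 0;
}
```

A caller with a TCP socket descriptor fd would write example_tcp_set_nodelay(fd, true) rather than spelling out the option triple at every call site. Taking the value as a parameter, instead of hardcoding it, is a choice made for this sketch.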
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt
From: David Laight Date: Thu, 14 May 2020 10:26:41 +

> I doubt we are the one company with out-of-tree drivers
> that use the kernel_socket interface.

Not our problem.
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt
From: David Laight Date: Thu, 14 May 2020 08:29:30 +

> You need to export functions that do most of the socket options
> for all protocols.

If all in-tree users of this stuff are converted, there is no argument for keeping these routines.

You seemed to be concerned about out of tree stuff. If so, that's not of our concern.
Re: [Cluster-devel] [PATCH 31/33] sctp: add sctp_sock_set_nodelay
From: Marcelo Ricardo Leitner Date: Wed, 20 May 2020 20:10:01 -0300

> The duplication with sctp_setsockopt_nodelay() is quite silly/bad.
> Also, why have the 'true' hardcoded? It's what dlm uses, yes, but the
> API could be a bit more complete than that.

The APIs are being designed based upon what in-tree users actually make use of. We can expand things later if necessary.
Re: [Cluster-devel] remove kernel_getsockopt
From: Christoph Hellwig Date: Wed, 27 May 2020 20:22:27 +0200

> this series reduces scope from the last round and just removes
> kernel_getsockopt to avoid conflicting with the sctp cleanup series.

Series applied to net-next, thanks.
Re: [Cluster-devel] remove most callers of kernel_setsockopt v3
From: Christoph Hellwig Date: Thu, 28 May 2020 07:12:08 +0200

> this series removes most callers of the kernel_setsockopt functions, and
> instead switches their users to small functions that implement setting a
> sockopt directly using a normal kernel function call with type safety and
> all the other benefits of not having a function call.
>
> In some cases these functions seem pretty heavy handed as they do
> a lock_sock even for just setting a single variable, but this mirrors
> the real setsockopt implementation unlike a few drivers that just
> set the fields directly.
...

Series applied, thanks Christoph.
Re: [Cluster-devel] remove kernel_setsockopt v4
From: Christoph Hellwig Date: Fri, 29 May 2020 14:09:39 +0200

> now that only the dlm calls to sctp are left for kernel_setsockopt,
> while we haven't really made much progress with the sctp setsockopt
> refactoring, how about this small series that splits out a
> sctp_setsockopt_bindx_kernel that takes a kernel space address array
> to share more code as requested by Marcelo. This should fit in with
> whatever variant of the refactor of sctp setsockopt we go with, but
> just solves the immediate problem for now.
...

Series applied, thanks.