Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
From: Christoph Hellwig > Sent: 15 May 2020 16:25 > On Fri, May 15, 2020 at 04:20:02PM +0100, David Howells wrote: > > Christoph Hellwig wrote: > > > > > > The advantage on using kernel_setsockopt here is that sctp module will > > > > only be loaded if dlm actually creates a SCTP socket. With this > > > > change, sctp will be loaded on setups that may not be actually using > > > > it. It's a quite big module and might expose the system. > > > > > > True. Not that the intent is to kill kernel space callers of setsockopt, > > > as I plan to remove the set_fs address space override used for it. > > > > For getsockopt, does it make sense to have the core kernel load > > optval/optlen > > into a buffer before calling the protocol driver? Then the driver need not > > see the userspace pointer at all. > > > > Similar could be done for setsockopt - allocate a buffer of the size > > requested > > by the user inside the kernel and pass it into the driver, then copy the > > data > > back afterwards. > > I did look into that initially. The problem is that tons of sockopts > entirely ignore optlen and just use a fixed size. So I fear that there > could be tons of breakage if we suddently respect it. Otherwise that > would be a pretty nice way to handle the situation. I'd guess that most application use the correct size for setsockopt(). (Well, apart from using 4 instead of 1.) It is certainly possible to always try to read in 64 bytes regardless of the supplied length, but handle the EFAULT case by shortening the buffer. Historically getsockopt() only wrote the length back. Treating 0 and garbage as (say) 4k and letting the protocol code set a shorten the copy to user might work. All short transfers would want to use an on-stack buffer, so slight oversizes could also be allowed for. OTOH if i did a getsockopt() with too short a length I wouldn't want the kernel to trash my program memory. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
From: David Howells > Sent: 15 May 2020 16:20 > Christoph Hellwig wrote: > > > > The advantage on using kernel_setsockopt here is that sctp module will > > > only be loaded if dlm actually creates a SCTP socket. With this > > > change, sctp will be loaded on setups that may not be actually using > > > it. It's a quite big module and might expose the system. > > > > True. Not that the intent is to kill kernel space callers of setsockopt, > > as I plan to remove the set_fs address space override used for it. > > For getsockopt, does it make sense to have the core kernel load optval/optlen > into a buffer before calling the protocol driver? Then the driver need not > see the userspace pointer at all. > > Similar could be done for setsockopt - allocate a buffer of the size requested > by the user inside the kernel and pass it into the driver, then copy the data > back afterwards. Yes, it also simplifies all the compat code. And there is a BPF test in setsockopt that also wants to pass on a kernel buffer. I'm willing to sit and write the patch. Quoting from a post I made later on Friday. Basically: This patch sequence (to be written) does the following: Patch 1: Change __sys_setsockopt() to allocate a kernel buffer, copy the data into it then call set_fs(KERNEL_DS). An on-stack buffer (say 64 bytes) will be used for small transfers. Patch 2: The same for __sys_getsockopt(). Patch 3: Compat setsockopt. Patch 4: Compat getsockopt. Patch 5: Remove the user copies from the global socket options code. Patches 6 to n-1; Remove the user copies from the per-protocol code. Patch n: Remove the set_fs(KERNEL_DS) from the entry points. This should be bisectable. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
On Fri, May 15, 2020 at 04:20:02PM +0100, David Howells wrote: > Christoph Hellwig wrote: > > > > The advantage on using kernel_setsockopt here is that sctp module will > > > only be loaded if dlm actually creates a SCTP socket. With this > > > change, sctp will be loaded on setups that may not be actually using > > > it. It's a quite big module and might expose the system. > > > > True. Not that the intent is to kill kernel space callers of setsockopt, > > as I plan to remove the set_fs address space override used for it. > > For getsockopt, does it make sense to have the core kernel load optval/optlen > into a buffer before calling the protocol driver? Then the driver need not > see the userspace pointer at all. > > Similar could be done for setsockopt - allocate a buffer of the size requested > by the user inside the kernel and pass it into the driver, then copy the data > back afterwards. I did look into that initially. The problem is that tons of sockopts entirely ignore optlen and just use a fixed size. So I fear that there could be tons of breakage if we suddently respect it. Otherwise that would be a pretty nice way to handle the situation.
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
Christoph Hellwig wrote: > > The advantage on using kernel_setsockopt here is that sctp module will > > only be loaded if dlm actually creates a SCTP socket. With this > > change, sctp will be loaded on setups that may not be actually using > > it. It's a quite big module and might expose the system. > > True. Not that the intent is to kill kernel space callers of setsockopt, > as I plan to remove the set_fs address space override used for it. For getsockopt, does it make sense to have the core kernel load optval/optlen into a buffer before calling the protocol driver? Then the driver need not see the userspace pointer at all. Similar could be done for setsockopt - allocate a buffer of the size requested by the user inside the kernel and pass it into the driver, then copy the data back afterwards. David
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
From: Marcelo Ricardo Leitner > Sent: 13 May 2020 19:01 > On Wed, May 13, 2020 at 08:26:42AM +0200, Christoph Hellwig wrote: > > And call it directly from dlm instead of going through kernel_setsockopt. > > The advantage on using kernel_setsockopt here is that sctp module will > only be loaded if dlm actually creates a SCTP socket. With this > change, sctp will be loaded on setups that may not be actually using > it. It's a quite big module and might expose the system. > > I'm okay with the SCTP changes, but I'll defer to DLM folks to whether > that's too bad or what for DLM. I didn't see these sneak through. There is a big long list of SCTP socket options that are needed to make anything work. They all need exporting. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
On Wed, May 13, 2020 at 03:00:58PM -0300, Marcelo Ricardo Leitner wrote: > On Wed, May 13, 2020 at 08:26:42AM +0200, Christoph Hellwig wrote: > > And call it directly from dlm instead of going through kernel_setsockopt. > > The advantage on using kernel_setsockopt here is that sctp module will > only be loaded if dlm actually creates a SCTP socket. With this > change, sctp will be loaded on setups that may not be actually using > it. It's a quite big module and might expose the system. True. Not that the intent is to kill kernel space callers of setsockopt, as I plan to remove the set_fs address space override used for it. So if always pulling in sctp is not an option for the DLM maintainers we'd have to do tricks using symbol_get() or similar. The same would also apply for ipv6, although I'm not sure how common modular ipv6 is in practice.
Re: [Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
On Wed, May 13, 2020 at 08:26:42AM +0200, Christoph Hellwig wrote: > And call it directly from dlm instead of going through kernel_setsockopt. The advantage on using kernel_setsockopt here is that sctp module will only be loaded if dlm actually creates a SCTP socket. With this change, sctp will be loaded on setups that may not be actually using it. It's a quite big module and might expose the system. I'm okay with the SCTP changes, but I'll defer to DLM folks to whether that's too bad or what for DLM. > > Signed-off-by: Christoph Hellwig > --- > fs/dlm/lowcomms.c | 13 - > include/net/sctp/sctp.h | 3 +++ > net/sctp/socket.c | 5 +++-- > 3 files changed, 14 insertions(+), 7 deletions(-) > > diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c > index b722a09a7ca05..e4939d770df53 100644 > --- a/fs/dlm/lowcomms.c > +++ b/fs/dlm/lowcomms.c > @@ -1005,14 +1005,17 @@ static int sctp_bind_addrs(struct connection *con, > uint16_t port) > memcpy(&localaddr, dlm_local_addr[i], sizeof(localaddr)); > make_sockaddr(&localaddr, port, &addr_len); > > - if (!i) > + if (!i) { > result = kernel_bind(con->sock, >(struct sockaddr *)&localaddr, >addr_len); > - else > - result = kernel_setsockopt(con->sock, SOL_SCTP, > -SCTP_SOCKOPT_BINDX_ADD, > -(char *)&localaddr, > addr_len); > + } else { > + lock_sock(con->sock->sk); > + result = sctp_setsockopt_bindx(con->sock->sk, > + (struct sockaddr *)&localaddr, addr_len, > + SCTP_BINDX_ADD_ADDR); > + release_sock(con->sock->sk); > + } > > if (result < 0) { > log_print("Can't bind to %d addr number %d, %d.\n", > diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h > index 3ab5c6bbb90bd..f702b14d768ba 100644 > --- a/include/net/sctp/sctp.h > +++ b/include/net/sctp/sctp.h > @@ -615,4 +615,7 @@ static inline bool sctp_newsk_ready(const struct sock *sk) > return sock_flag(sk, SOCK_DEAD) || sk->sk_socket; > } > > +int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, > + int addrs_size, int op); > + > #endif /* __net_sctp_h__ */ > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > index 1c96b52c4aa28..30c981d9f6158 100644 > --- a/net/sctp/socket.c > +++ b/net/sctp/socket.c > @@ -979,8 +979,8 @@ int sctp_asconf_mgmt(struct sctp_sock *sp, struct > sctp_sockaddr_entry *addrw) > * > * Returns 0 if ok, <0 errno code on error. > */ > -static int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, > - int addrs_size, int op) > +int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, > + int addrs_size, int op) > { > int err; > int addrcnt = 0; > @@ -1032,6 +1032,7 @@ static int sctp_setsockopt_bindx(struct sock *sk, > struct sockaddr *kaddrs, > return -EINVAL; > } > } > +EXPORT_SYMBOL(sctp_setsockopt_bindx); > > static int sctp_connect_new_asoc(struct sctp_endpoint *ep, >const union sctp_addr *daddr, > -- > 2.26.2 >
[Cluster-devel] [PATCH 27/33] sctp: export sctp_setsockopt_bindx
And call it directly from dlm instead of going through kernel_setsockopt. Signed-off-by: Christoph Hellwig --- fs/dlm/lowcomms.c | 13 - include/net/sctp/sctp.h | 3 +++ net/sctp/socket.c | 5 +++-- 3 files changed, 14 insertions(+), 7 deletions(-) diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index b722a09a7ca05..e4939d770df53 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1005,14 +1005,17 @@ static int sctp_bind_addrs(struct connection *con, uint16_t port) memcpy(&localaddr, dlm_local_addr[i], sizeof(localaddr)); make_sockaddr(&localaddr, port, &addr_len); - if (!i) + if (!i) { result = kernel_bind(con->sock, (struct sockaddr *)&localaddr, addr_len); - else - result = kernel_setsockopt(con->sock, SOL_SCTP, - SCTP_SOCKOPT_BINDX_ADD, - (char *)&localaddr, addr_len); + } else { + lock_sock(con->sock->sk); + result = sctp_setsockopt_bindx(con->sock->sk, + (struct sockaddr *)&localaddr, addr_len, + SCTP_BINDX_ADD_ADDR); + release_sock(con->sock->sk); + } if (result < 0) { log_print("Can't bind to %d addr number %d, %d.\n", diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index 3ab5c6bbb90bd..f702b14d768ba 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -615,4 +615,7 @@ static inline bool sctp_newsk_ready(const struct sock *sk) return sock_flag(sk, SOCK_DEAD) || sk->sk_socket; } +int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, + int addrs_size, int op); + #endif /* __net_sctp_h__ */ diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 1c96b52c4aa28..30c981d9f6158 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -979,8 +979,8 @@ int sctp_asconf_mgmt(struct sctp_sock *sp, struct sctp_sockaddr_entry *addrw) * * Returns 0 if ok, <0 errno code on error. */ -static int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, -int addrs_size, int op) +int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, + int addrs_size, int op) { int err; int addrcnt = 0; @@ -1032,6 +1032,7 @@ static int sctp_setsockopt_bindx(struct sock *sk, struct sockaddr *kaddrs, return -EINVAL; } } +EXPORT_SYMBOL(sctp_setsockopt_bindx); static int sctp_connect_new_asoc(struct sctp_endpoint *ep, const union sctp_addr *daddr, -- 2.26.2