Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt v2
On Wed, May 20, 2020 at 09:54:36PM +0200, Christoph Hellwig wrote:
> Hi Dave,
>
> this series removes the kernel_setsockopt and kernel_getsockopt
> functions, and instead switches their users to small functions that
> implement setting (or in one case getting) a sockopt directly using
> a normal kernel function call with type safety and all the other
> benefits of not having a function call.
>
> In some cases these functions seem pretty heavy handed as they do
> a lock_sock even for just setting a single variable, but this mirrors
> the real setsockopt implementation unlike a few drivers that just
> set the fields directly.

Hi Dave and other maintainers,

can you take a look at and potentially merge patches 1-30 while we
discuss the sctp refactoring?  It would get a nice head start by removing
kernel_getsockopt and most kernel_setsockopt users, and for the next
follow-on I wouldn't need to spam lots of lists with 30+ patches again.
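[Editor's illustration] The quoted cover letter describes replacing a generic option call with small typed helpers that take the socket lock and set one field. The following is a minimal userspace sketch of that pattern, not code from the series: `struct mock_sock`, `mock_lock_sock()`, `mock_release_sock()` and both `mock_sock_set_*()` helpers are hypothetical stand-ins, and the "lock" is only a flag plus a counter so that the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct sock. */
struct mock_sock {
    bool locked;          /* true while the mock socket lock is held */
    unsigned lock_count;  /* how many times the lock has been taken */
    bool reuseaddr;
    int priority;
};

/* Plays the role of lock_sock(); this is not the kernel implementation. */
static void mock_lock_sock(struct mock_sock *sk)
{
    assert(sk != NULL && !sk->locked);
    sk->locked = true;
    sk->lock_count++;
}

/* Plays the role of release_sock(). */
static void mock_release_sock(struct mock_sock *sk)
{
    assert(sk != NULL && sk->locked);
    sk->locked = false;
}

/* One small helper per option: no optname, no void pointer, no length. */
static void mock_sock_set_reuseaddr(struct mock_sock *sk)
{
    mock_lock_sock(sk);
    sk->reuseaddr = true;
    mock_release_sock(sk);
}

/* The argument type is checked by the compiler at every call site. */
static void mock_sock_set_priority(struct mock_sock *sk, int priority)
{
    mock_lock_sock(sk);
    sk->priority = priority;
    mock_release_sock(sk);
}
```

A caller would write `mock_sock_set_priority(&sk, 6)` instead of passing an option name, a pointer and a size through a generic entry point. The lock is taken even for a single assignment, which is the "heavy handed" cost the cover letter mentions.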
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt v2
From: 'Christoph Hellwig'
> Sent: 21 May 2020 10:12
...
> > I worried about whether getsockopt() should read the entire
> > user buffer first. SCTP needs some of it often (including a
> > sockaddr_storage in one case), TCP needs it once.
> > However the cost of reading a few words is small, and a big
> > buffer probably needs setting to avoid leaking kernel
> > memory if the structure has holes or fields that don't get set.
> > Reading from userspace solves both issues.
>
> As mentioned in the thread on the last series: That was my first idea, but
> we have way too many sockopts, especially in obscure protocols that just
> hard code the size.  The chance of breaking userspace in a way that can't
> be fixed without going back to passing user pointers to get/setsockopt
> is way too high to commit to such a change unfortunately.

Right, the syscall stubs probably can't do it.
But the per-protocol ones can for the main protocols.

I posted a patch for SCTP yesterday that removes 800 lines
of source and 8k of object code.
Even that needs a horrid bodge for one request where the
length returned has to be less than the data copied!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt v2
On Thu, May 21, 2020 at 08:01:33AM +, David Laight wrote:
> How much does this increase the kernel code by?

 44 files changed, 660 insertions(+), 843 deletions(-)

> You are also replicating a lot of code making it more
> difficult to maintain.

No, I specifically don't.

> I don't think the performance of socket option code
> really matters - it is usually done once when a socket
> is initialised and the other costs of establishing a
> connection will dominate.
>
> Pulling the user copies outside the [gs]etsockopt switch
> statement not only reduces the code size (source and object)
> but also trivially allows kernel_[sg]etsockopt() to be added to
> the list of socket calls.
>
> It probably isn't possible to pull the usercopies right
> out into the syscall wrapper because of some broken
> requests.

Please read through the previous discussion of the rationale and the
options.  We've been there before.

> I worried about whether getsockopt() should read the entire
> user buffer first. SCTP needs some of it often (including a
> sockaddr_storage in one case), TCP needs it once.
> However the cost of reading a few words is small, and a big
> buffer probably needs setting to avoid leaking kernel
> memory if the structure has holes or fields that don't get set.
> Reading from userspace solves both issues.

As mentioned in the thread on the last series: That was my first idea, but
we have way too many sockopts, especially in obscure protocols that just
hard code the size.  The chance of breaking userspace in a way that can't
be fixed without going back to passing user pointers to get/setsockopt
is way too high to commit to such a change unfortunately.
Re: [Cluster-devel] remove kernel_setsockopt and kernel_getsockopt v2
From: Christoph Hellwig
> Sent: 20 May 2020 20:55
>
> this series removes the kernel_setsockopt and kernel_getsockopt
> functions, and instead switches their users to small functions that
> implement setting (or in one case getting) a sockopt directly using
> a normal kernel function call with type safety and all the other
> benefits of not having a function call.
>
> In some cases these functions seem pretty heavy handed as they do
> a lock_sock even for just setting a single variable, but this mirrors
> the real setsockopt implementation unlike a few drivers that just
> set the fields directly.

How much does this increase the kernel code by?

You are also replicating a lot of code making it more
difficult to maintain.

I don't think the performance of socket option code
really matters - it is usually done once when a socket
is initialised and the other costs of establishing a
connection will dominate.

Pulling the user copies outside the [gs]etsockopt switch
statement not only reduces the code size (source and object)
but also trivially allows kernel_[sg]etsockopt() to be added to
the list of socket calls.

It probably isn't possible to pull the usercopies right
out into the syscall wrapper because of some broken
requests.

I worried about whether getsockopt() should read the entire
user buffer first. SCTP needs some of it often (including a
sockaddr_storage in one case), TCP needs it once.
However the cost of reading a few words is small, and a big
buffer probably needs setting to avoid leaking kernel
memory if the structure has holes or fields that don't get set.
Reading from userspace solves both issues.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
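[Editor's illustration] The alternative argued for in this message is to copy the user buffer once, outside the per-option switch, so the option handlers only ever see kernel memory and any structure holes contain the caller's own bytes instead of stale kernel data. The following is a rough userspace sketch of that idea under stated assumptions. It is not the SCTP patch mentioned elsewhere in the thread: every `mock_*` and `MOCK_*` name is hypothetical, and plain `memcpy()` stands in for `copy_from_user()` and `copy_to_user()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_OPT_BUF_MAX 64

enum { MOCK_OPT_FLAG = 1, MOCK_OPT_PAIR = 2 };

/* Hypothetical option payload; most ABIs pad between tag and value. */
struct mock_pair {
    char tag;
    int value;
};

/*
 * Per-option handler: works on a kernel-side buffer only and never
 * touches "user" memory. Returns the result length, or -1 on error.
 */
static int mock_do_getsockopt(int optname, unsigned char *kbuf, size_t len)
{
    switch (optname) {
    case MOCK_OPT_FLAG: {
        int flag = 1;

        if (len < sizeof(flag))
            return -1;
        memcpy(kbuf, &flag, sizeof(flag));
        return (int)sizeof(flag);
    }
    case MOCK_OPT_PAIR: {
        char tag = 'x';
        int value = 42;

        if (len < sizeof(struct mock_pair))
            return -1;
        /* Only the named fields are written; the hole is left alone. */
        memcpy(kbuf + offsetof(struct mock_pair, tag), &tag, sizeof(tag));
        memcpy(kbuf + offsetof(struct mock_pair, value), &value, sizeof(value));
        return (int)sizeof(struct mock_pair);
    }
    default:
        return -1;
    }
}

/* Wrapper: one copy in, dispatch, one copy out. */
static int mock_getsockopt(int optname, void *ubuf, size_t ulen)
{
    unsigned char kbuf[MOCK_OPT_BUF_MAX];
    int ret;

    assert(ubuf != NULL);
    if (ulen > sizeof(kbuf))
        ulen = sizeof(kbuf);
    memcpy(kbuf, ubuf, ulen);        /* stands in for copy_from_user() */
    ret = mock_do_getsockopt(optname, kbuf, ulen);
    if (ret < 0)
        return ret;
    memcpy(ubuf, kbuf, (size_t)ret); /* stands in for copy_to_user() */
    return ret;
}
```

Because the kernel-side buffer starts out as a copy of the caller's bytes, whatever sits in the padding of `struct mock_pair` on the way out is what the caller passed in. The sketch does not model the awkward cases raised in the thread, such as options with hard-coded sizes or a returned length smaller than the data copied.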