Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22882 )

Change subject: [socket] Fix test failure caused due to unsupported socket 
option
......................................................................


Patch Set 2:

(2 comments)

> Patch Set 1:
>
> (2 comments)
>
> > > Patch Set 1:
>  > >
>  > > > > Patch Set 1:
>  > >  > >
>  > >  > > > Patch Set 1:
>  > >  > > >
>  > >  > > > (1 comment)
>  > >  > >
>  > >  > > The two tests are consistently failing on my local Ubuntu
>  > 20.04
>  > >  > machine for debug build (rebase 9e661cb34).
>  > >  > >
>  > >  > > It appears that in my case, for some reason, the socket
>  > shutdown
>  > >  > is failing consistently due to which subsequent operation of
>  > >  > setting SO_REUSEPORT option fails on unix domain socket. If
>  > socket
>  > >  > shutdown completes successfully, setting SO_REUSEPORT option
>  > may
>  > >  > not fail. I will need to validate this theory though, by
>  > running
>  > >  > some tests.
>  > >  > >
>  > >  > > Considering above explanation holds, we need to address two
>  > >  > issues:
>  > >  > > 1. Ensure proper socket cleanup is done successfully in
>  > order to
>  > >  > avoid reuseport failure.
>  > >  > > 2. Ensure unix domain sockets are not unknowingly used for
>  > >  > SO_REUSEPORT socket option and expected to work fine.
>  > >  >
>  > >  >
>  > >  > I ran a sample routine, that sets socket option(SO_REUSEPORT)
>  > on a
>  > >  > AF_UNIX type socket, on different linux kernel versions
>  > (5.15.0 and
>  > >  > 6.15-rc7) and it became clear that kernel versions containing
>  > >  > commit [1] are gracefully erroring out on non-inet sockets.
>  > The
>  > >  > change was introduced to fix a different issue however.
>  > >  >
>  > >  > [1]: 
> https://github.com/torvalds/linux/commit/5b0af621c3f6ef9261cf6067812f2fd9943acb4b
>  > >  >
>  > >  > Socket cleanup issue (mentioned above) turns out to be
>  > something
>  > >  > different and could be related to C++ version installed on my
>  > >  > ubuntu linux variant. I haven't figured that out yet.
>  > >  >
>  > >  > I still believe, we need to make sure we are not unknowingly
>  > using
>  > >  > SO_REUSEPORT option while setting sockops on a AF_UNIX type
>  > socket.
>  > >  > Current change should help in doing that.
>  > >
>  > > OK, but why to change socket.cc?
>  > >
>  > > Also, maybe update the description with your findings, since the
>  > statement that SO_REUSEPORT for a Unix domain socket isn't
>  > supported isn't true.
>  >
>  > SO_REUSEPORT macro has been available there since kernel 3.9 and it
>  > would be safe to assume that it is available in all the linux
>  > distros today, so there is no need to have this macro availability
>  > check, hence the change in socket.cc.
>
> This doesn't explain why this change is needed to fix the tests that you 
> reported as failing.  As I understand, the change in socket.cc in PS1 is 
> unrelated to changes needed to address the failing tests.  What benefit does 
> the change in socket.cc bring in?  Did it fix or improve anything, even in a 
> theoretical use case?  I don't see much of that, but I might be missing 
> something.  However, I have one concern with the change in socket.cc: it 
> might break compilation at some of the platforms listed as supported at 
> https://kudu.apache.org/docs/installation.html.
>
> Did you verify that Kudu source code still compiles with this update on all 
> the OSes listed there?  Even if you did or you know there is nothing to 
> break, I'd rather see the update on socket.cc published in a separate 
> changelist since it's not related to fixing the failing tests.  If you feel 
> strongly about this and prefer lumping it with the rest of the changes in 
> this patch, then I'm fine with keeping the update on socket.cc here as well, 
> but otherwise I encourage you to post the changes in socket.cc in a separate 
> patch.
>
>  >
>  > SO_REUSEPORT for a Unix domain socket is not supported is true
>  > because there is no proper handling of such a case in kernel. Also,
>  > there is no equivalent of sin_port in sockaddr_un structure. For
>  > the kernel versions (with [1] commit), setsockopt fails with
>  > appropriate error which is good because we will get to know the
>  > error before bind operation. However, if there is a wrong
>  > expectation of successfully setting reuseport option on unix domain
>  > socket, we can certainly correct that expectation with this check.
>  > The check will also handle the cases where [1] is not present in
>  > the kernel, potentially leading to undefined behaviour.
>
> The statement the commit description for PS1 is literally this:
>   > Socket API (SetSockOpt) doesn't support SO_REUSEPORT for a Unix domain 
> socket.
>
> SetSockOpt() is Kudu's API, so maybe that was the source of my confusion.  As 
> one can see, SetSockOpt() simply forwards everything down to the setsockopt() 
> syscall, but there is nothing specific about Unix domain sockets regarding 
> particular socket options in SetSockOpt() itself.  It seems you rather meant 
> setsockopt() syscall.  Indeed -- the port number isn't applicable for Unix 
> socket family, so SO_REUSEPORT isn't relevant for AF_UNIX sockets.  Please 
> consider updating the commit description accordingly.
>
> Also, please add the kernel version information for the platform where you 
> saw those two tests failing.  Adding the relevant information regarding the 
> change in setsockopt() behavior for SO_REUSEPORT in case of AF_UNIX sockets 
> should be great as well.
>
> Thank you!

I have added more detail in the commit message. Hope that clarifies the 
doubts/questions.

http://gerrit.cloudera.org:8080/#/c/22882/1/src/kudu/rpc/messenger.cc
File src/kudu/rpc/messenger.cc:

http://gerrit.cloudera.org:8080/#/c/22882/1/src/kudu/rpc/messenger.cc@262
PS1, Line 262: Unix
> nit: UNIX
Replaced with Unix, as is everywhere else in the code.


http://gerrit.cloudera.org:8080/#/c/22882/1/src/kudu/rpc/messenger.cc@263
PS1, Line 263: (reuseport_ && accept_addr.is_unix())
> Instead of silently ignoring the 'reuseport_' option for UNIX sockets, I'd
Done



--
To view, visit http://gerrit.cloudera.org:8080/22882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ice00b1c4fd1df78fa84c3dd2a79c968a4a91cc21
Gerrit-Change-Number: 22882
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 26 May 2025 08:10:43 +0000
Gerrit-HasComments: Yes

Reply via email to