[ 
https://issues.apache.org/jira/browse/HADOOP-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964047#comment-14964047
 ] 

Alan Burlison commented on HADOOP-12487:
----------------------------------------

As I explained in the original bug filing, POSIX states that shutdown() is only 
valid on a connected socket. A socket in accept() is not connected, so it is 
not valid to call shutdown() on it, no matter what Linux happens to do in that 
circumstance.
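
For anyone who wants to check this on their own platform, here is a minimal 
sketch. It is not the attached shutdown.c, and the helper names and the 
/tmp path are made up for illustration. It calls shutdown() on a listening 
AF_UNIX socket and reports the result: 0 if the call succeeded, otherwise the 
errno value.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening AF_UNIX socket.
 * The /tmp path is an assumption made for this sketch.
 * Returns the descriptor, or -1 if setup failed. */
static int make_listener(struct sockaddr_un *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    snprintf(addr->sun_path, sizeof(addr->sun_path),
             "/tmp/lsock-demo-%ld", (long)getpid());
    unlink(addr->sun_path);            /* remove a stale socket file, if any */

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) != 0 ||
        listen(fd, 1) != 0) {
        close(fd);
        unlink(addr->sun_path);
        return -1;
    }
    return fd;
}

/* Calls shutdown() on a socket that is listening but not connected.
 * Returns 0 if shutdown() succeeded, the errno value if it failed,
 * or -1 if the listener could not be set up. */
int shutdown_on_listener(void)
{
    struct sockaddr_un addr;
    int fd = make_listener(&addr);
    if (fd < 0)
        return -1;

    int result = (shutdown(fd, SHUT_RDWR) == 0) ? 0 : errno;

    close(fd);
    unlink(addr.sun_path);
    return result;
}
```

Going by the results in the bug description, I would expect 0 on Linux and 
ENOTCONN on Solaris.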

Your assertion that the Linux implementation is more useful is wrong. Calling 
close() is the correct way to close a socket and force all other concurrent 
uses of the socket to fail, which is exactly what happens on Solaris. If 
close() causes all concurrent uses of a file descriptor to fail and the 
application correctly detects that failure and discards the file descriptor 
then there is no way the reuse scenario you describe can happen. Any 
application code that uses a file descriptor without checking for errors and 
discarding it on error is simply broken.
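
The "check for errors and discard" discipline can be sketched roughly as 
follows. accept_or_discard() is a hypothetical helper, not Hadoop code, and 
treating every error other than EINTR as fatal is a simplification made for 
the sketch.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accept one connection on *listen_fd.
 * On success returns the connected descriptor.
 * On failure stores errno in *err_out, sets *listen_fd to -1 so the
 * caller cannot use the stale descriptor number again, and returns -1. */
int accept_or_discard(int *listen_fd, int *err_out)
{
    for (;;) {
        int conn = accept(*listen_fd, NULL, NULL);
        if (conn >= 0)
            return conn;
        if (errno == EINTR)
            continue;   /* interrupted by a signal, not a socket failure */

        *err_out = errno;
        /* Forget the descriptor rather than close() it here: if another
         * thread already closed it, a second close() could hit an
         * unrelated descriptor that has since reused the same number. */
        *listen_fd = -1;
        return -1;
    }
}
```

A caller that loops on accept_or_discard() stops as soon as the listening 
descriptor fails, so it never touches a descriptor number that may have been 
handed out again.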

If you look at the kernel.org bug you'll see that the problems on Linux are 
even deeper than the broken close() behaviour. If you use poll() on a socket in 
the listen state then the poll() call returns immediately even when there is no 
incoming connection request, with the (POLLOUT|POLLWRBAND) bits set, 
indicating that you can write to the poll() FD. That's broken because it means 
you can't use poll() to multiplex between a set of sockets that are listening, 
reading & writing. Also if you do actually try to write to the FD, Linux 
returns ENOTCONN. Although the immediate return from poll() is incorrect, the 
ENOTCONN error on a subsequent write() does confirm what I said above - 
listening sockets are *not* connected, even as far as Linux is concerned.
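
A rough way to observe both effects is sketched below. It is my own 
illustration, not code from the kernel.org bug. The struct, the function name 
and the /tmp path are assumptions. It polls a listening AF_UNIX socket that 
has no pending connections, records what poll() reports, and then tries to 
send one byte on the same descriptor.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* some libcs only expose POLLWRBAND with this */
#endif
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

struct listener_probe {
    int   poll_rc;     /* poll() result: ready count, 0 on timeout, -1 on error */
    short revents;     /* event bits poll() reported for the listener */
    int   send_errno;  /* errno from send(), or 0 if send() succeeded */
};

/* Hypothetical probe. Returns 0 if the listener was set up and probed,
 * -1 if setup failed. */
int probe_listener(struct listener_probe *out)
{
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path),
             "/tmp/lsock-poll-%ld", (long)getpid());
    unlink(addr.sun_path);             /* remove a stale socket file, if any */

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(fd, 1) != 0) {
        close(fd);
        unlink(addr.sun_path);
        return -1;
    }

    /* No client ever connects, so nothing should be ready to accept. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT | POLLWRBAND };
    out->poll_rc = poll(&pfd, 1, 100);   /* wait at most 100 ms */
    out->revents = pfd.revents;

    /* Try to write to the listening socket; MSG_NOSIGNAL avoids SIGPIPE. */
    char byte = 'x';
    out->send_errno = (send(fd, &byte, 1, MSG_NOSIGNAL) < 0) ? errno : 0;

    close(fd);
    unlink(addr.sun_path);
    return 0;
}
```

If poll_rc is positive and revents contains POLLOUT, the platform is 
reporting an unconnected listening socket as writable, which is the behaviour 
described above. What a given kernel version reports is for the reader to 
check.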

The only reason you have to use shutdown() on Linux is that the semantics of 
close() on listening sockets on Linux are broken. Using shutdown() on Linux is 
a hack to work around the broken close() behaviour, not a justification.
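
In code, the platform-specific handling could look something like the sketch 
below. close_listening_socket() is a hypothetical name, and this is not a 
proposed patch for DomainSocket. It is only meant to show the shape of the 
workaround.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close a listening socket so that a thread blocked
 * in accept() on it is woken up. Returns the result of close(). */
int close_listening_socket(int fd)
{
#ifdef __linux__
    /* Linux-only workaround: close() alone does not wake a thread blocked
     * in accept(), so call shutdown() first. The result is deliberately
     * ignored because the socket is not connected and the call is only
     * used for its side effect. */
    (void)shutdown(fd, SHUT_RDWR);
#endif
    /* On platforms where close() fails concurrent users of the descriptor,
     * this call is all that is needed. */
    return close(fd);
}
```

The #ifdef keeps the shutdown() call confined to the one platform that needs 
it, so the other platforms take the plain close() path.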

> DomainSocket.close() assumes incorrect Linux behaviour
> ------------------------------------------------------
>
>                 Key: HADOOP-12487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12487
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: net
>    Affects Versions: 2.7.1
>         Environment: Linux Solaris
>            Reporter: Alan Burlison
>            Assignee: Alan Burlison
>         Attachments: shutdown.c
>
>
> I'm getting a test failure in TestDomainSocket.java, in the 
> testSocketAcceptAndClose test. That test creates a socket which one thread 
> waits on in DomainSocket.accept() whilst a second thread sleeps for a short 
> time before closing the same socket with DomainSocket.close().
> DomainSocket.close() first calls shutdown0() on the socket before calling 
> close0() - both those are thin wrappers around the corresponding libc socket 
> calls. DomainSocket.close() contains the following comment, explaining the 
> logic involved:
> {code}
>           // Calling shutdown on the socket will interrupt blocking system
>           // calls like accept, write, and read that are going on in a
>           // different thread.
> {code}
> Unfortunately that relies on non-standards-compliant Linux behaviour. I've 
> written a simple C test case that replicates the scenario above:
> # ThreadA opens, binds, listens and accepts on a socket, waiting for 
> connections.
> # Some time later ThreadB calls shutdown on the socket ThreadA is waiting in 
> accept on.
> Here is what happens:
> On Linux, the shutdown call in ThreadB succeeds and the accept call in 
> ThreadA returns with EINVAL.
> On Solaris, the shutdown call in ThreadB fails and returns ENOTCONN. ThreadA 
> continues to wait in accept.
> Relevant POSIX manpages:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/accept.html
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/shutdown.html
> The POSIX shutdown manpage says:
> "The shutdown() function shall cause all or part of a full-duplex connection 
> on the socket associated with the file descriptor socket to be shut down."
> ...
> "\[ENOTCONN] The socket is not connected."
> Page 229 & 303 of "UNIX System V Network Programming" say:
> "shutdown can only be called on sockets that have been previously connected"
> "The socket \[passed to accept that] fd refers to does not participate in the 
> connection. It remains available to receive further connect indications"
> That is pretty clear: sockets being waited on with accept are not connected 
> by definition. Nor is the accept socket connected when a client connects 
> to it; it is the socket returned by accept that is connected to the client. 
> Therefore the Solaris behaviour of failing the shutdown call is correct.
> In order to get the required behaviour of ThreadB causing ThreadA to exit the 
> accept call with an error, the correct way is for ThreadB to call close on 
> the socket that ThreadA is waiting on in accept.
> On Solaris, calling close in ThreadB succeeds, and the accept call in ThreadA 
> fails and returns EBADF.
> On Linux, calling close in ThreadB succeeds but ThreadA continues to wait in 
> accept until there is an incoming connection. That accept returns 
> successfully. However subsequent accept calls on the same socket return EBADF.
> The Linux behaviour is fundamentally broken in three places:
> # Allowing shutdown to succeed on an unconnected socket is incorrect.  
> # Returning a successful accept on a closed file descriptor is incorrect, 
> especially as future accept calls on the same socket fail.
> # Once shutdown has been called on the socket, calling close on the socket 
> fails with EBADF. That is incorrect, shutdown should just prevent further IO 
> on the socket, it should not close it.
> The real issue though is that there's no single way of doing this that works 
> on both Solaris and Linux, so there will need to be platform-specific code in 
> Hadoop to cater for the Linux brokenness.


