DomainSocket issues on Solaris

Alan Burlison Wed, 30 Sep 2015 01:15:07 -0700

Now that the Hadoop native code builds on Solaris I've been chipping away at all the test failures. About 50% of the failures involve DomainSocket, either directly or indirectly. That seems to be mainly because the tests use DomainSocket to do single-node testing, whereas in production it seems that DomainSocket is less commonly used (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html).

The particular problem on Solaris is that socket read/write timeouts (the SO_SNDTIMEO and SO_RCVTIMEO socket options) are not supported for UNIX domain (PF_UNIX) sockets. Those options are however supported for PF_INET sockets. That's because the socket implementation on Solaris is split roughly into two parts, for inet sockets and for STREAMS sockets, and the STREAMS implementation lacks support for SO_SNDTIMEO and SO_RCVTIMEO. As an aside, performance of sockets that use loopback or the host's own IP is slightly better than that of UNIX domain sockets on Solaris.

I'm investigating getting timeout support for PF_UNIX sockets added to Solaris, but in the meantime I'm also looking at how this might be worked around in Hadoop. One way would be to implement timeouts by wrapping all the read/write/send/recv etc. calls in DomainSocket.c with either poll() or select().
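To make the poll() idea concrete, here is a rough sketch of what one such wrapper might look like. The name read_with_timeout and its signature are made up for illustration and are not taken from DomainSocket.c; it also assumes a negative timeout means "wait forever" and that a timeout should be reported as EAGAIN, the way SO_RCVTIMEO reports it.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not existing Hadoop code): wait up to timeout_ms
 * milliseconds for fd to become readable, then call read().
 * Returns the number of bytes read, 0 on EOF, or -1 with errno set.
 * A timeout is reported as -1 with errno == EAGAIN.
 * A negative timeout_ms blocks indefinitely, as poll() itself does.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Simplification: an interrupted poll() restarts with the full timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0) {          /* nothing arrived in time */
        errno = EAGAIN;
        return -1;
    }
    if (rc < 0) {           /* poll() itself failed, errno already set */
        return -1;
    }
    /* Readable, hung up, or in error: let read() report what happened. */
    return read(fd, buf, len);
}
```

A real version would probably track the remaining time across EINTR restarts rather than restarting the clock each time.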

The basic idea is to add two new fields to DomainSocket.c to hold the read/write timeouts. On platforms that support SO_SNDTIMEO and SO_RCVTIMEO these would be unused as setsockopt() would be used to set the socket timeouts. On platforms such as Solaris the JNI code would use the values to implement the timeouts appropriately.
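As a sketch of how the two fields and the setsockopt() path could sit side by side: everything named here (struct hd_socket, hd_set_recv_timeout, the HD_EMULATE_TIMEOUTS build flag) is hypothetical and only meant to show the shape, not how DomainSocket.c is actually laid out.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical per-socket state; the two timeout fields are the new part. */
struct hd_socket {
    int fd;
    int recv_timeout_ms;   /* only consulted when timeouts are emulated */
    int send_timeout_ms;   /* only consulted when timeouts are emulated */
};

/*
 * Set the receive timeout in milliseconds. Returns 0 on success,
 * -1 with errno set on failure.
 */
int hd_set_recv_timeout(struct hd_socket *s, int ms)
{
#ifdef HD_EMULATE_TIMEOUTS
    /* No SO_RCVTIMEO for PF_UNIX here: remember the value for the wrappers. */
    s->recv_timeout_ms = ms;
    return 0;
#else
    /* The platform does the work; the struct fields stay unused. */
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(s->fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
#endif
}
```

The send side would look the same with SO_SNDTIMEO and send_timeout_ms.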

To keep the code in DomainSocket.c from becoming an #ifdef hairball, the current socket I/O function calls such as accept(), send(), read() etc. would be replaced with macros such as HD_ACCEPT. On platforms that provide timeouts these would just expand to the normal socket functions; on platforms that don't support timeouts they would expand to wrappers that implement timeouts for them.
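Roughly, the macro layer might look like the following for the send() case. HD_SEND is named by analogy with HD_ACCEPT; the HD_EMULATE_TIMEOUTS flag, the hd_send_timeout wrapper and the extra timeout argument are assumptions made for the sake of the example.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifdef HD_EMULATE_TIMEOUTS   /* hypothetical flag, set where SO_SNDTIMEO is missing */

/*
 * Hypothetical wrapper: wait up to timeout_ms for fd to become writable,
 * then send(). A timeout is reported as -1 with errno == EAGAIN.
 * Note this only bounds the wait for writability, not the send() itself.
 */
ssize_t hd_send_timeout(int fd, const void *buf, size_t len,
                        int flags, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0) {
        errno = EAGAIN;
        return -1;
    }
    if (rc < 0) {
        return -1;
    }
    return send(fd, buf, len, flags);
}

#define HD_SEND(fd, buf, len, flags, tmo_ms) \
    hd_send_timeout((fd), (buf), (len), (flags), (tmo_ms))

#else   /* the platform honours SO_SNDTIMEO: use the plain call */

#define HD_SEND(fd, buf, len, flags, tmo_ms) \
    send((fd), (buf), (len), (flags))

#endif
```

Call sites would then be written once, e.g. HD_SEND(fd, buf, len, 0, timeout_ms), with the timeout argument simply dropped on platforms that don't need it.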

The only caveat is that all code that does anything to a PF_UNIX socket would *always* have to do so via DomainSocket. As far as I can tell that's not an issue, but it would have to be borne in mind if any changes were made in this area.


Before I set about doing this, does the approach seem reasonable?

Thanks,

--
Alan Burlison