[ 
https://issues.apache.org/jira/browse/HADOOP-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586472#comment-14586472
 ] 

Sangjin Lee commented on HADOOP-12090:
--------------------------------------

This is caused by the Kerberos authentication request being split across 
multiple TCP packets.

In the failing case, the Kerberos authentication request sent by the client 
is split into 2 packets even though it is tiny (e.g. 584 bytes). Here it is 
split into one packet with 570 bytes of data and another with 14 bytes. 
Tcpdump output:

{noformat}
10:30:32.358645 IP localhost.50199 > localhost.60538: Flags [S], seq 1804572222, win 32792, options [mss 16396,sackOK,TS val 566449661 ecr 0,nop,wscale 8], length 0
10:30:32.358661 IP localhost.60538 > localhost.50199: Flags [S.], seq 2381946627, ack 1804572223, win 1140, options [mss 16396,sackOK,TS val 566449661 ecr 566449661,nop,wscale 0], length 0
10:30:32.358672 IP localhost.50199 > localhost.60538: Flags [.], ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358788 IP localhost.50199 > localhost.60538: Flags [.], seq 1:571, ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 570
10:30:32.358796 IP localhost.60538 > localhost.50199: Flags [.], ack 571, win 570, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358801 IP localhost.50199 > localhost.60538: Flags [P.], seq 571:585, ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 14
{noformat}

It turns out there is a bug in ApacheDS (on which minikdc is based): the 
Kerberos message decoding fails with an NPE if the Kerberos message is not 
contained in a single TCP packet (DIRSERVER-2071).
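
To illustrate why a decoder cannot assume one read equals one message, here 
is a minimal sketch of reassembling a length-prefixed message from TCP 
chunks. It is not the ApacheDS code or its fix; FrameAccumulator and feed are 
hypothetical names. It assumes the 4-byte big-endian length prefix that, as 
far as I understand RFC 4120, Kerberos uses over TCP.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not ApacheDS code): reassembles length-prefixed messages. */
class FrameAccumulator {
    // Bytes received so far that do not yet form a complete message.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    /** Feed one chunk as read from the socket; returns the messages it completed. */
    List<byte[]> feed(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        byte[] buf = pending.toByteArray();
        List<byte[]> complete = new ArrayList<>();
        int pos = 0;
        // Assumed framing: 4-byte big-endian length, then that many payload bytes.
        while (buf.length - pos >= 4) {
            int len = ByteBuffer.wrap(buf, pos, 4).getInt();
            if (len < 0) {
                throw new IllegalStateException("negative length prefix: " + len);
            }
            if (buf.length - pos - 4 < len) {
                break; // payload incomplete: wait for more data instead of emitting null
            }
            byte[] msg = new byte[len];
            System.arraycopy(buf, pos + 4, msg, 0, len);
            complete.add(msg);
            pos += 4 + len;
        }
        // Keep only the unconsumed tail for the next call.
        pending.reset();
        pending.write(buf, pos, buf.length - pos);
        return complete;
    }
}
```

Feeding a 584-byte frame as a 570-byte chunk followed by a 14-byte chunk 
returns nothing for the first call and the whole message for the second, 
rather than handing an incomplete (null) message downstream.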

Furthermore, the TCP fragmentation itself is also related to ApacheDS. MINA, 
the underlying I/O framework for ApacheDS, sets a pretty small receive/send 
buffer size by default (1 KB). This has the effect of reducing the TCP window 
size significantly, as evidenced by the tcpdump output above (the server side 
advertises a window of only 1140 bytes with wscale 0), and that is what 
causes the fragmentation.
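
As a rough illustration of the mechanism, the sketch below uses plain 
java.net rather than MINA to request a 1 KB receive buffer on a listening 
socket and read back what the OS actually granted. ReceiveBufferDemo and 
effectiveReceiveBuffer are hypothetical names. The granted value is 
platform-dependent (the kernel may round or raise it), so this only shows 
where the knob is, not what window a given platform will advertise.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical demo (not MINA/ApacheDS code): requests a small SO_RCVBUF. */
class ReceiveBufferDemo {
    /** Returns the receive buffer size the OS reports after the request. */
    static int effectiveReceiveBuffer(int requestedBytes) {
        // Create the socket unbound so the option is set before bind();
        // a real server would call bind() and accept() afterwards.
        try (ServerSocket server = new ServerSocket()) {
            server.setReceiveBufferSize(requestedBytes);
            // The OS treats the request as a hint and may adjust it.
            return server.getReceiveBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Comparing effectiveReceiveBuffer(1024) with a larger request such as 
effectiveReceiveBuffer(65536) on an affected platform may help confirm 
whether the small default is what shrinks the advertised window there.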

> minikdc-related unit tests fail consistently on some platforms
> --------------------------------------------------------------
>
>                 Key: HADOOP-12090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12090
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.7.0
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>
> On some platforms all unit tests that use minikdc fail consistently. Those 
> tests include TestKMS, TestSaslDataTransfer, 
> TestTimelineAuthenticationFilter, etc.
> Typical failures on the unit tests:
> {noformat}
> java.lang.AssertionError: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Cannot get a KDC reply)
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.apache.hadoop.crypto.key.kms.server.TestKMS$8$4.run(TestKMS.java:1154)
>       at org.apache.hadoop.crypto.key.kms.server.TestKMS$8$4.run(TestKMS.java:1145)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1645)
>       at org.apache.hadoop.crypto.key.kms.server.TestKMS.doAs(TestKMS.java:261)
>       at org.apache.hadoop.crypto.key.kms.server.TestKMS.access$100(TestKMS.java:76)
> {noformat}
> The error behind this failure is a NullPointerException on the KDC server 
> inside minikdc:
> {noformat}
> org.apache.mina.filter.codec.ProtocolDecoderException: java.lang.NullPointerException: message (Hexdump: ...)
>       at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:234)
>       at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
>       at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:48)
>       at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:802)
>       at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:120)
>       at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
>       at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:426)
>       at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:604)
>       at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:564)
>       at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:553)
>       at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$400(AbstractPollingIoProcessor.java:57)
>       at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:892)
>       at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException: message
>       at org.apache.mina.filter.codec.AbstractProtocolDecoderOutput.write(AbstractProtocolDecoderOutput.java:44)
>       at org.apache.directory.server.kerberos.protocol.codec.MinaKerberosDecoder.decode(MinaKerberosDecoder.java:65)
>       at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:224)
>       ... 15 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
