[
https://issues.apache.org/jira/browse/HADOOP-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586472#comment-14586472
]
Sangjin Lee commented on HADOOP-12090:
--------------------------------------
This is caused by the Kerberos authentication request being fragmented across
multiple TCP packets.
In the failing case, the Kerberos authentication request sent by the
client is split into 2 packets even though it is tiny (e.g. 584
bytes): one packet with 570 bytes of data and another with
14 bytes. Tcpdump output:
{noformat}
10:30:32.358645 IP localhost.50199 > localhost.60538: Flags [S], seq 1804572222, win 32792, options [mss 16396,sackOK,TS val 566449661 ecr 0,nop,wscale 8], length 0
10:30:32.358661 IP localhost.60538 > localhost.50199: Flags [S.], seq 2381946627, ack 1804572223, win 1140, options [mss 16396,sackOK,TS val 566449661 ecr 566449661,nop,wscale 0], length 0
10:30:32.358672 IP localhost.50199 > localhost.60538: Flags [.], ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358788 IP localhost.50199 > localhost.60538: Flags [.], seq 1:571, ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 570
10:30:32.358796 IP localhost.60538 > localhost.50199: Flags [.], ack 571, win 570, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358801 IP localhost.50199 > localhost.60538: Flags [P.], seq 571:585, ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 14
{noformat}
It turns out there is a bug in ApacheDS (on which minikdc is based) where
Kerberos message decoding fails with an NPE if the Kerberos message is not
contained in a single TCP packet (DIRSERVER-2071).
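To illustrate what a decoder has to do to tolerate this, here is a minimal sketch of reassembling a length-prefixed message from arbitrary chunks. It relies on the fact that Kerberos over TCP prefixes each message with a 4-byte length in network byte order (RFC 4120). The class name LengthPrefixedAccumulator is hypothetical, this is not ApacheDS code or the DIRSERVER-2071 fix, and the 580-byte body is only an assumption chosen so that the frame adds up to the 584 bytes seen above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not ApacheDS code): reassembles length-prefixed messages. */
class LengthPrefixedAccumulator {
    // Bytes received so far that do not yet form a complete message (read mode).
    private ByteBuffer pending = ByteBuffer.allocate(0);

    /** Feeds one chunk as delivered by a single read and returns the messages it completes. */
    List<byte[]> feed(byte[] chunk) {
        // Append the new chunk to whatever was left over from earlier reads.
        ByteBuffer joined = ByteBuffer.allocate(pending.remaining() + chunk.length);
        joined.put(pending).put(chunk).flip();

        List<byte[]> messages = new ArrayList<>();
        while (joined.remaining() >= 4) {
            // Peek at the big-endian length prefix without consuming it.
            int length = joined.getInt(joined.position());
            if (length < 0) {
                throw new IllegalArgumentException("negative length prefix: " + length);
            }
            if (joined.remaining() - 4 < length) {
                break; // body incomplete: keep the bytes and wait for the next chunk
            }
            joined.getInt(); // consume the prefix
            byte[] body = new byte[length];
            joined.get(body);
            messages.add(body);
        }
        pending = joined.slice(); // keep the undecoded tail for the next call
        return messages;
    }

    public static void main(String[] args) {
        // Assumed frame: 4-byte prefix + 580-byte body = 584 bytes, split 570 + 14.
        byte[] frame = ByteBuffer.allocate(584).putInt(580).array();
        LengthPrefixedAccumulator acc = new LengthPrefixedAccumulator();
        System.out.println("after 570 bytes: "
            + acc.feed(Arrays.copyOfRange(frame, 0, 570)).size() + " message(s)");
        System.out.println("after 14 more bytes: "
            + acc.feed(Arrays.copyOfRange(frame, 570, 584)).size() + " message(s)");
    }
}
```

The point of the sketch is that the first chunk yields no message rather than a null one: the decoder only emits output once the whole body has arrived.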
Furthermore, the TCP fragmentation itself is also related to ApacheDS.
MINA, the underlying I/O framework for ApacheDS, sets a pretty small
receive/send buffer size by default (1 KB). This has the effect of
significantly reducing the TCP window size, as evidenced by the tcpdump output
above (win 1140 in the server's SYN-ACK). This is what causes the fragmentation.
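As a rough way to see how a requested receive buffer size translates into what the OS actually grants on a given platform, a small probe with plain java.net sockets can help. This is only a sketch: the class name ReceiveBufferProbe is hypothetical, it does not go through MINA or ApacheDS at all, and the OS treats the requested value as a hint, so the granted size can differ between platforms.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical probe (not MINA/ApacheDS code): reports the SO_RCVBUF the OS grants. */
class ReceiveBufferProbe {
    /** Opens a loopback listening socket with the requested SO_RCVBUF and returns the granted size. */
    static int grantedReceiveBuffer(int requestedBytes) throws IOException {
        try (ServerSocket server = new ServerSocket()) {
            // Set the size before bind() so it is in effect for accepted connections.
            server.setReceiveBufferSize(requestedBytes);
            // Port 0 lets the OS pick any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            return server.getReceiveBufferSize();
        }
    }

    public static void main(String[] args) throws IOException {
        // 1024 mirrors the 1 KB default mentioned above; 65536 is an arbitrary larger value.
        System.out.println("requested 1024, granted " + grantedReceiveBuffer(1024));
        System.out.println("requested 65536, granted " + grantedReceiveBuffer(65536));
    }
}
```

Comparing the two numbers on a failing platform and on a passing one may help show whether the small buffer is honored differently there.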
> minikdc-related unit tests fail consistently on some platforms
> --------------------------------------------------------------
>
> Key: HADOOP-12090
> URL: https://issues.apache.org/jira/browse/HADOOP-12090
> Project: Hadoop Common
> Issue Type: Bug
> Components: kms
> Affects Versions: 2.7.0
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
>
> On some platforms all unit tests that use minikdc fail consistently. Those
> tests include TestKMS, TestSaslDataTransfer,
> TestTimelineAuthenticationFilter, etc.
> Typical failures on the unit tests:
> {noformat}
> java.lang.AssertionError: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Cannot get a KDC reply)
> at org.junit.Assert.fail(Assert.java:88)
> at org.apache.hadoop.crypto.key.kms.server.TestKMS$8$4.run(TestKMS.java:1154)
> at org.apache.hadoop.crypto.key.kms.server.TestKMS$8$4.run(TestKMS.java:1145)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1645)
> at org.apache.hadoop.crypto.key.kms.server.TestKMS.doAs(TestKMS.java:261)
> at org.apache.hadoop.crypto.key.kms.server.TestKMS.access$100(TestKMS.java:76)
> {noformat}
> The corresponding error on the KDC server side (minikdc) is a
> NullPointerException:
> {noformat}
> org.apache.mina.filter.codec.ProtocolDecoderException: java.lang.NullPointerException: message (Hexdump: ...)
> at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:234)
> at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
> at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:48)
> at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:802)
> at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:120)
> at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
> at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:426)
> at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:604)
> at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:564)
> at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:553)
> at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$400(AbstractPollingIoProcessor.java:57)
> at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:892)
> at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException: message
> at org.apache.mina.filter.codec.AbstractProtocolDecoderOutput.write(AbstractProtocolDecoderOutput.java:44)
> at org.apache.directory.server.kerberos.protocol.codec.MinaKerberosDecoder.decode(MinaKerberosDecoder.java:65)
> at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:224)
> ... 15 more
> {noformat}