[ 
https://issues.apache.org/jira/browse/DIRSERVER-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586432#comment-14586432
 ] 

Sangjin Lee commented on DIRSERVER-2071:
----------------------------------------

Thanks [~akiran] for the latest fix! I can confirm that the latest does fix the 
issue. I'll defer a more rigorous code review to the directory server 
committers...

On a related note, I did identify why the TCP packets were being fragmented in 
the first place. It is due to the small receive buffer size. Mina's default 
receive buffer size (as well as the send buffer size) is 1 KB. These values 
affect the TCP window size during the TCP handshake, and on some platforms this 
results in consistently small TCP window sizes. One example (from the tcpdump):

{noformat}
10:30:32.358645 IP localhost.50199 > localhost.60538: Flags [S], seq 
1804572222, win 32792, options [mss 16396,sackOK,TS val 566449661 ecr 
0,nop,wscale 8], length 0
10:30:32.358661 IP localhost.60538 > localhost.50199: Flags [S.], seq 
2381946627, ack 1804572223, win 1140, options [mss 16396,sackOK,TS val 
566449661 ecr 566449661,nop,wscale 0], length 0
10:30:32.358672 IP localhost.50199 > localhost.60538: Flags [.], ack 1, win 
129, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358788 IP localhost.50199 > localhost.60538: Flags [.], seq 1:571, ack 
1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 570
10:30:32.358796 IP localhost.60538 > localhost.50199: Flags [.], ack 571, win 
570, options [nop,nop,TS val 566449661 ecr 566449661], length 0
10:30:32.358801 IP localhost.50199 > localhost.60538: Flags [P.], seq 571:585, 
ack 1, win 129, options [nop,nop,TS val 566449661 ecr 566449661], length 14
{noformat}

Although the client (50199) starts out with a large window size (32792), the 
server (KDC server at port 60538) immediately narrows it to 1140, and the 
resulting window size comes out to be 570 bytes. The 1140 bytes the server sets 
is largely a function of the receive buffer size. So it's doing a 570-byte 
packets on a *loopback* connection!

To me, mina's small default sizes for receive/send buffers strike me as 
outdated and inadequate. I see several options for eliminating this unnecessary 
fragmentation:
# change the default buffer sizes in TcpTransport to use more reasonable values 
(e.g. 32 KB or 64 KB)
# at least allow an option to set the default buffer sizes in TcpTransport so 
users of TcpTransport can configure the buffer size
# change the default buffer sizes in mina (affects all apps that use mina)

Although I suspect it might be fine to update the default buffer sizes in mina, 
that would potentially be a disruptive change as it changes behavior implicitly 
for everyone.

How about doing the first two? If you guys think it's a better way to go, I can 
file a separate DS JIRA for that issue.

> MinaKerberosDecoder fails with NullPointerException if the kerberos message 
> is split in multiple packets
> --------------------------------------------------------------------------------------------------------
>
>                 Key: DIRSERVER-2071
>                 URL: https://issues.apache.org/jira/browse/DIRSERVER-2071
>             Project: Directory ApacheDS
>          Issue Type: Bug
>          Components: asn1
>    Affects Versions: 2.0.0-M15
>            Reporter: Sangjin Lee
>            Assignee: Kiran Ayyagari
>             Fix For: 2.0.0-M21
>
>         Attachments: TestAsn1Decoder.java, TestAsn1Decoder.java
>
>
> On some machines (but not all), kerberos authentication to Apache DS is 
> consistently failing with a NullPointerException:
> {noformat}
> 2015-06-08 20:35:57,189 ERROR KerberosProtocolHandler - /127.0.0.1:54477 
> EXCEPTION
> org.apache.mina.filter.codec.ProtocolDecoderException: 
> java.lang.NullPointerException: message (Hexdump: ...)
>         at 
> org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:234)
>         at 
> org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
>         at 
> org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:48)
>         at 
> org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:802)
>         at 
> org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:120)
>         at 
> org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
>         at 
> org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:426)
>         at 
> org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:604)
>         at 
> org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:564)
>         at 
> org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:553)
>         at 
> org.apache.mina.core.polling.AbstractPollingIoProcessor.access$400(AbstractPollingIoProcessor.java:57)
>         at 
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:892)
>         at 
> org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException: message
>         at 
> org.apache.mina.filter.codec.AbstractProtocolDecoderOutput.write(AbstractProtocolDecoderOutput.java:44)
>         at 
> org.apache.directory.server.kerberos.protocol.codec.MinaKerberosDecoder.decode(MinaKerberosDecoder.java:65)
>         at 
> org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:224)
>         ... 15 more
> {noformat}
> I traced it down to a silent failure of Asn1Decoder. Basically the ASN.1 
> decoder thinks the byte buffer it is passed does not have the promised size 
> payload, and aborts parsing:
> {noformat}
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - 
> >>>==========================================
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --> Decoding a PDU
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - 
> >>>------------------------------------------
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --- State = TAG_STATE_START ---
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x6C
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - Tag 0x6C has been decoded
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --- State = LENGTH_STATE_START ---
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x82
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --- State = LENGTH_STATE_PENDING 
> ---
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x02
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x02
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x41
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --- State = LENGTH_STATE_END ---
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x30
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - Parent length : TLV expected 
> length stack :  - null
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - Root TLV[577]
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - Length 577 has been decoded
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - --- State = VALUE_STATE_START ---
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder -   current byte : 0x30
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - 
> <<<------------------------------------------
> 2015-06-08 20:35:57,187 DEBUG Asn1Decoder - <-- End decoding : TLV[ 0x6C, 
> 577, DATA[... ]]
> {noformat}
> Here's an interesting thing however. The reason the buffer has fewer bytes 
> than the length value indicates is that the TCP message is split into 2 
> packets due to a small TCP window size on this problematic machine. For the 
> kerberos message of size 585 bytes, the first 570 bytes are sent in the first 
> packet, and the remaining 15 bytes in another packet. The Asn1Decoder thinks 
> that it is missing the last 15 bytes and aborts decoding it. The following is 
> the gist of the tcpdump capture of this TCP conversation:
> {noformat}
> 13:35:57.186758 IP localhost.54477 > localhost.40829: Flags [S], seq 
> 1599306555, win 32792, options [mss 16396,sackOK,TS val 318374489 ecr 
> 0,nop,wscale 8], length 0
> 13:35:57.186771 IP localhost.40829 > localhost.54477: Flags [S.], seq 
> 7524064, ack 1599306556, win 1140, options [mss 16396,sackOK,TS val 318374489 
> ecr 318374489,nop,wscale 0], length 0
> 13:35:57.186781 IP localhost.54477 > localhost.40829: Flags [.], ack 1, win 
> 129, options [nop,nop,TS val 318374489 ecr 318374489], length 0
> 13:35:57.186903 IP localhost.54477 > localhost.40829: Flags [.], seq 1:571, 
> ack 1, win 129, options [nop,nop,TS val 318374490 ecr 318374489], length 570
> 13:35:57.186909 IP localhost.40829 > localhost.54477: Flags [.], ack 571, win 
> 570, options [nop,nop,TS val 318374490 ecr 318374490], length 0
> 13:35:57.186914 IP localhost.54477 > localhost.40829: Flags [P.], seq 
> 571:586, ack 1, win 129, options [nop,nop,TS val 318374490 ecr 318374489], 
> length 15
> 13:35:57.187519 IP localhost.40829 > localhost.54477: Flags [.], ack 571, win 
> 1140, options [nop,nop,TS val 318374490 ecr 318374490], length 0
> 13:35:57.190641 IP localhost.40829 > localhost.54477: Flags [F.], seq 1, ack 
> 571, win 1140, options [nop,nop,TS val 318374493 ecr 318374490], length 0
> 13:35:57.190687 IP localhost.54477 > localhost.40829: Flags [F.], seq 586, 
> ack 2, win 129, options [nop,nop,TS val 318374493 ecr 318374493], length 0
> 13:35:57.190701 IP localhost.40829 > localhost.54477: Flags [R], seq 7524066, 
> win 0, length 0
> {noformat}
> Having small TCP window sizes aside, this strikes me as a code issue. The 
> server should be able to handle the kerberos message even if it is split into 
> multiple packets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to