Hello Robbie,

After a couple of hours of stress runs with the proton-j change, we could 
still repro the Receive Stuck problem, although the bug fix below puts us in a 
much better state now.
Attached is a screenshot of object sizes on the heap, which correspond to the 
code path that you fixed (SimpleSslTransportWrapper).
Please see if this rings any bells. I am more than happy to share more details 
(proton traces and a dump; please let me know what you need).

Thx!
Sree

From: Garlapati Sreeram Kumar<mailto:sreer...@live.com>
Sent: Monday, April 11, 2016 11:50 AM
To: Robbie Gemmell<mailto:robbie.gemm...@gmail.com>; 
proton@qpid.apache.org<mailto:proton@qpid.apache.org>
Cc: SeongJoon Kwak (SJ)<mailto:sjk...@microsoft.com>; 
hm...@microsoft.com<mailto:hm...@microsoft.com>
Subject: RE: Proton-j Reactor - Receiver

Awesome.

To make it easy, I added you as a collaborator on my fork of Proton. Here’s the 
branch from which I submitted the PR: 
https://github.com/SreeramGarlapati/qpid-proton/tree/sg.recvstuck

Thx!
Sree

From: Robbie Gemmell<mailto:robbie.gemm...@gmail.com>
Sent: Monday, April 11, 2016 9:52 AM
To: proton@qpid.apache.org<mailto:proton@qpid.apache.org>
Cc: SeongJoon Kwak (SJ)<mailto:sjk...@microsoft.com>; 
hm...@microsoft.com<mailto:hm...@microsoft.com>
Subject: Re: Proton-j Reactor - Receiver

Ah, excellent. I had actually started on testing this myself a little
earlier, so I'll take a look and see what's what before continuing
tomorrow. On taking a better initial look at things, I think the
change itself may need to be augmented to account for some other
conditions too; I need to investigate further to be sure.

Robbie

On 11 April 2016 at 17:37, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
> Thanks a lot for the response, Robbie!
> Per your suggestion, I added the CIT to the pull request (and yes, as you 
> already said, this issue is being tracked via JIRA: PROTON-1171).
>
> Thanks a lot for the Wonderful Collaboration!
> Sree
>
> From: Robbie Gemmell<mailto:robbie.gemm...@gmail.com>
> Sent: Thursday, April 7, 2016 3:52 AM
> To: proton@qpid.apache.org<mailto:proton@qpid.apache.org>
> Cc: SeongJoon Kwak (SJ)<mailto:sjk...@microsoft.com>; 
> hm...@microsoft.com<mailto:hm...@microsoft.com>
> Subject: Re: Proton-j Reactor - Receiver
>
> Hi Sree,
>
> Thanks for the analysis and PR, I'll try to take a proper look soon.
> It's not an area of the code I'm familiar with so I'll need to have a
> bit of a dig myself to see if the change seems ok. I'd note that any
> not-insignificant bug fix such as this should probably have a test
> with it (and a JIRA, though I see you have since created one of those)
> :)
>
> Robbie
>
> On 6 April 2016 at 01:23, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
>> Hello Robbie,
>>
>> We are using the proton-j client with SSL, and many of our customers are 
>> hitting this issue.
>> Here are my findings after debugging through this issue:
>>
>> - When incoming bytes arrive on the SocketChannel, the proton-j client 
>> gets signaled by nio and unwinds the transport stack: each TransportInput 
>> implementation performs its task on the bytes read and hands off to the 
>> next layer in the stack (transport to ssl, ssl to frame parser, etc.).
>>
>> - While unwinding that stack, SimpleSslTransportWrapper.unwrapInput reads 
>> (16k bytes) from _inputBuffer and writes the decoded bytes to 
>> _decodedInputBuffer, an intermediate buffer.
>>
>> - It then flushes bytes from the intermediate buffer to the next layer and 
>> invokes _underlyingInput.process() to signal that it has bytes in its 
>> input buffer.
>>
>> - If the _underlyingInput (let's say FrameParser) buffer is small, let's 
>> say 4k, then _decodedInputBuffer is left with 12k bytes, and over time 
>> this accrues.
>>
>> The fix here is to flush _decodedInputBuffer to the next transport in the 
>> network stack and call _underlyingInput.process() until 
>> _decodedInputBuffer is empty. Here’s the pull request: 
>> https://github.com/apache/qpid-proton/pull/73
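
The accrual and the drain loop described above can be sketched in isolation.
This is a minimal simulation, not the proton-j code or the PR itself:
DrainSketch, SmallInput, pourOnce and drain are hypothetical names, and
SmallInput merely stands in for a next layer (such as the frame parser) whose
input buffer is smaller than the decoded data.

```java
import java.nio.ByteBuffer;

final class DrainSketch {

    /** Hypothetical stand-in for the next transport layer with a small input buffer. */
    static final class SmallInput {
        private final ByteBuffer buffer;
        private int consumed;

        SmallInput(int capacity) {
            buffer = ByteBuffer.allocate(capacity);
        }

        /** Buffer that the previous layer writes into. */
        ByteBuffer tail() {
            return buffer;
        }

        /** Consume everything written so far and make room for more. */
        void process() {
            buffer.flip();
            consumed += buffer.remaining();
            buffer.clear();
        }

        int consumed() {
            return consumed;
        }
    }

    /** One pass: copy as much as fits into the next layer, then let it process. */
    static int pourOnce(ByteBuffer decoded, SmallInput next) {
        ByteBuffer tail = next.tail();
        int n = Math.min(decoded.remaining(), tail.remaining());
        ByteBuffer chunk = decoded.duplicate();
        chunk.limit(chunk.position() + n);
        tail.put(chunk);
        decoded.position(decoded.position() + n);
        next.process();
        return n;
    }

    /** Repeat the pass until the decoded buffer is empty (or no progress is made). */
    static void drain(ByteBuffer decoded, SmallInput next) {
        while (decoded.hasRemaining()) {
            if (pourOnce(decoded, next) == 0) {
                break; // assumption of this sketch: stop rather than spin
            }
        }
    }

    public static void main(String[] args) {
        ByteBuffer once = ByteBuffer.allocate(16 * 1024);
        pourOnce(once, new SmallInput(4 * 1024));
        System.out.println("left after one pass: " + once.remaining());

        ByteBuffer all = ByteBuffer.allocate(16 * 1024);
        drain(all, new SmallInput(4 * 1024));
        System.out.println("left after drain: " + all.remaining());
    }
}
```

With a 16k decoded buffer and a 4k next layer, a single pass leaves 12k
behind, while the loop keeps pouring and processing until nothing remains.
The early exit when no bytes move is an assumption of this sketch, there to
avoid spinning if the next layer has no capacity.
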
>>
>> Please let me know if we need to do more to fix this issue comprehensively.
>>
>> Thx!
>> Sree
>>
>> From: Robbie Gemmell<mailto:robbie.gemm...@gmail.com>
>> Sent: Thursday, March 31, 2016 9:19 AM
>> To: proton@qpid.apache.org<mailto:proton@qpid.apache.org>
>> Subject: Re: Proton-j Reactor - Receiver
>>
>> On 31 March 2016 at 04:32, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
>>> Hello All!
>>>
>>> I am using Proton-J reactor API (Version 0.12.0) for receiving AMQP 
>>> Messages (from Microsoft Azure Event Hubs): 
>>> https://github.com/Azure/azure-event-hubs/blob/master/java/azure-eventhubs/src/main/java/com/microsoft/azure/servicebus/amqp/ReceiveLinkHandler.java#L124
>>>
>>> I am using the onDelivery(Event) callback to receive messages. I would 
>>> really appreciate your help with this issue/behavior:
>>>
>>> ISSUE: I noticed that the last few messages on the queue are not being 
>>> issued to the onDelivery(Event) callback by the Reactor.
>>> - I then enabled proton frame tracing (PN_TRACE_FRM=1) and discovered 
>>> that the Transfer frames corresponding to those messages were not even 
>>> delivered to the client. I then looked at our service's proton frames and 
>>> can clearly see that they are being delivered by the service, and other 
>>> AMQP clients (for example, the .NET client) can see the Transfer frames.
>>> - Is this a known behavior?
>>> Does the Reactor code path disable Nagle on the underlying socket? Could 
>>> this be related? Or is there any other configuration that we should be 
>>> setting to see all Transfer frames received on the socket?
>>>
>>> Please advise.
>>>
>>> Thanks a lot in Advance!
>>> Sree
>>>
>>
>> I'm not aware of anyone else reporting anything like that. I don't see
>> anything in the code suggesting the reactor sets TCP_NODELAY true on
>> the socket, but I wouldn't think that should matter here.
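
For reference, Nagle's algorithm is controlled per socket through the
standard TCP_NODELAY option. The sketch below uses only java.nio and is not
taken from the proton-j reactor; NoDelaySketch and enableNoDelay are
hypothetical names, and whether the reactor exposes its channel for this is
not something this thread establishes.

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

final class NoDelaySketch {

    /** Disable Nagle's algorithm on the channel and report the resulting setting. */
    static boolean enableNoDelay(SocketChannel channel) throws IOException {
        channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
        return channel.getOption(StandardSocketOptions.TCP_NODELAY);
    }

    public static void main(String[] args) throws IOException {
        // An unconnected channel is enough to set and read back the option.
        try (SocketChannel channel = SocketChannel.open()) {
            System.out.println("TCP_NODELAY: " + enableNoDelay(channel));
        }
    }
}
```
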
>>
>> The frame trace logging is done after the bytes are given to the
>> Transport and are processed into frames, so a lack of logging could
>> suggest various things, such as they didn't actually get there, they
>> weren't processed, something went wrong before they did/were, something
>> went wrong decoding them, etc. It's hard to say much more without more
>> info.
>>
>> Robbie
