Hello Robbie,

After a couple of hours of stress run with the proton-j change, we could
still repro the Receive Stuck problem. That said, the bug fix below places
us in a much better state now. Attached is a screenshot of object sizes on
the heap, which corresponds to the code path that you fixed
(SimpleSslTransportWrapper). Please see if this rings any bells. I am more
than happy to share more details (proton traces & dump; please suggest if
you will need more details).

Thx!
Sree

From: Garlapati Sreeram Kumar <sreer...@live.com>
Sent: Monday, April 11, 2016 11:50 AM
To: Robbie Gemmell <robbie.gemm...@gmail.com>; proton@qpid.apache.org
Cc: SeongJoon Kwak (SJ) <sjk...@microsoft.com>; hm...@microsoft.com
Subject: RE: Proton-j Reactor - Receiver

Awesome. To make it easy, I added you as a collaborator to my fork of
Proton, and here's the branch from which I submitted the PR:
https://github.com/SreeramGarlapati/qpid-proton/tree/sg.recvstuck

Thx!
Sree

From: Robbie Gemmell <robbie.gemm...@gmail.com>
Sent: Monday, April 11, 2016 9:52 AM
To: proton@qpid.apache.org
Cc: SeongJoon Kwak (SJ) <sjk...@microsoft.com>; hm...@microsoft.com
Subject: Re: Proton-j Reactor - Receiver

Ah, excellent. I had actually started on testing this myself a little
earlier, so I'll take a look and see what's what before continuing
tomorrow. On taking an initial better look at things I think the change
itself may need to be augmented to account for some other conditions too;
I need to investigate further to be sure.

Robbie

On 11 April 2016 at 17:37, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
> Thanks a lot for the response, Robbie!
> Per your suggestion, added the CIT to the pull request (& yes, as you already
> said, this issue is being tracked via JIRA - PROTON-1171).
>
> Thanks a lot for the wonderful collaboration!
> Sree
>
> From: Robbie Gemmell <robbie.gemm...@gmail.com>
> Sent: Thursday, April 7, 2016 3:52 AM
> To: proton@qpid.apache.org
> Cc: SeongJoon Kwak (SJ) <sjk...@microsoft.com>; hm...@microsoft.com
> Subject: Re: Proton-j Reactor - Receiver
>
> Hi Sree,
>
> Thanks for the analysis and PR, I'll try to take a proper look soon.
> It's not an area of the code I'm familiar with so I'll need to have a
> bit of a dig myself to see if the change seems ok. I'd note that any
> not-insignificant bug fix such as this should probably have a test
> with it (and a JIRA, though I see you have since created one of those)
> :)
>
> Robbie
>
> On 6 April 2016 at 01:23, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
>> Hello Robbie,
>>
>> We are using the proton-j client with SSL and many of our customers are
>> hitting this issue.
>> Here are my findings after debugging through this issue:
>>
>> - When incoming bytes arrive on the SocketChannel, the proton-j client
>>   gets signaled by nio and unwinds the transport stack: each
>>   TransportInput implementation performs its task on the bytes read and
>>   hands off to the next layer in the stack (transport to ssl, ssl to
>>   frame parser, etc.).
>>
>> - While unwinding that stack, SimpleSslTransportWrapper.unwrapInput
>>   reads (16k bytes) from _inputBuffer and writes the decoded bytes to
>>   _decodedInputBuffer, which serves as an intermediate buffer.
>>
>> - It then flushes bytes from the intermediate buffer to the next layer
>>   and invokes _underlyingInput.process() to signal that it has bytes in
>>   its input buffer.
>>
>> - If the underlyingInput (let's say FrameParser) buffer size is small,
>>   let's say 4k, then decodedInputBuffer will be left with 12k bytes,
>>   and over time this accrues.
>>
>> The fix here is to flush decodedInputBuffer to the next transport in the
>> network stack and call _underlyingInput.process() until
>> decodedInputBuffer is empty. Here's the pull request:
>> https://github.com/apache/qpid-proton/pull/73
>>
>> Please let me know if we need to do more to fix this issue
>> comprehensively.
>>
>> Thx!
>> Sree
>>
>> From: Robbie Gemmell <robbie.gemm...@gmail.com>
>> Sent: Thursday, March 31, 2016 9:19 AM
>> To: proton@qpid.apache.org
>> Subject: Re: Proton-j Reactor - Receiver
>>
>> On 31 March 2016 at 04:32, Garlapati Sreeram Kumar <sreer...@live.com> wrote:
>>> Hello All!
>>>
>>> I am using the Proton-J reactor API (version 0.12.0) for receiving AMQP
>>> messages (from Microsoft Azure Event Hubs):
>>> https://github.com/Azure/azure-event-hubs/blob/master/java/azure-eventhubs/src/main/java/com/microsoft/azure/servicebus/amqp/ReceiveLinkHandler.java#L124
>>>
>>> I am using the onDelivery(Event) callback to receive messages. I would
>>> really appreciate your help with this issue/behavior:
>>>
>>> ISSUE: I noticed that the last few messages on the queue are not being
>>> issued to the onDelivery(Event) callback by the Reactor.
>>> - I then enabled proton frame tracing (PN_TRACE_FRM=1) and discovered
>>> that the Transfer frames corresponding to those messages were not even
>>> delivered to the client. I then looked at our service's proton frames
>>> and can clearly see that they are being delivered by the service. Other
>>> AMQP clients (for example, the .NET client) can see the Transfer frames.
>>> - Is this a known behavior?
>>> Does the Reactor code path disable Nagle on the underlying socket, and
>>> could this be related? Or is there any other configuration that we
>>> should be setting to see all Transfer frames received on the socket?
>>>
>>> Please advise.
>>>
>>> Thanks a lot in advance!
>>> Sree
>>>
>>
>> I'm not aware of anyone else reporting anything like that. I don't see
>> anything in the code suggesting the reactor sets TCP_NODELAY true on
>> the socket, but I wouldn't think that should matter here.
>>
>> The frame trace logging is done after the bytes are given to the
>> Transport and are processed into frames, so a lack of logging could
>> suggest various things such as they didn't actually get there, they
>> weren't processed, something went wrong before they did/were, something
>> went wrong decoding them, etc. It's hard to say much more without more
>> info.
>>
>> Robbie
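
The buffering behaviour described in the 6 April message (a 16k decoded
buffer flushed into a next layer that only holds 4k) can be illustrated
with a small standalone sketch. This is not proton-j code: SmallInput,
DrainSketch, pour, flushOnce and flushUntilEmpty are hypothetical names
made up for illustration, and the real change is the one in the pull
request linked in the thread. Only the 16k and 4k sizes come from the
thread; everything else is an assumption.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for the next transport layer (e.g. a frame parser). */
final class SmallInput {
    private final ByteBuffer buffer;
    private long consumed = 0;

    SmallInput(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    /** Buffer that the upstream layer writes into. */
    ByteBuffer tail() {
        return buffer;
    }

    /** Consume everything currently buffered and make room for more. */
    void process() {
        buffer.flip();
        consumed += buffer.remaining();
        buffer.clear();
    }

    long consumed() {
        return consumed;
    }
}

final class DrainSketch {
    /** Copy as many bytes as fit from src into dst; returns the count copied. */
    static int pour(ByteBuffer src, ByteBuffer dst) {
        int n = Math.min(src.remaining(), dst.remaining());
        ByteBuffer slice = src.duplicate();
        slice.limit(slice.position() + n);
        dst.put(slice);
        src.position(src.position() + n);
        return n;
    }

    /** Single flush + process: bytes that do not fit stay behind in decoded. */
    static int flushOnce(ByteBuffer decoded, SmallInput next) {
        pour(decoded, next.tail());
        next.process();
        return decoded.remaining();
    }

    /** Repeat flush + process until the decoded buffer is empty. */
    static int flushUntilEmpty(ByteBuffer decoded, SmallInput next) {
        while (decoded.hasRemaining()) {
            int moved = pour(decoded, next.tail());
            next.process();
            if (moved == 0) {
                break; // no progress possible; avoid spinning forever
            }
        }
        return decoded.remaining();
    }

    public static void main(String[] args) {
        // 16k of decoded bytes, next layer holds only 4k (sizes from the thread).
        ByteBuffer once = ByteBuffer.allocate(16 * 1024);
        System.out.println("single flush leaves "
                + flushOnce(once, new SmallInput(4 * 1024)) + " bytes");

        ByteBuffer looped = ByteBuffer.allocate(16 * 1024);
        System.out.println("loop leaves "
                + flushUntilEmpty(looped, new SmallInput(4 * 1024)) + " bytes");
    }
}
```

With these sizes the single flush leaves 12288 bytes behind, matching the
"left with 12k bytes" observation, while the loop leaves 0. The early
break is a design choice of this sketch so that a next layer that cannot
accept any bytes does not cause an endless loop; whether the real
transport needs such a guard is not something the thread states.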