On Wed, Dec 4, 2019 at 12:44 PM Rémy Maucherat <r...@apache.org> wrote:

> On Wed, Dec 4, 2019 at 11:51 AM Mark Thomas <ma...@apache.org> wrote:
>
>> On 04/12/2019 10:25, Rémy Maucherat wrote:
>> > On Wed, Dec 4, 2019 at 11:11 AM Mark Thomas <m...@homeinbox.net
>> > <mailto:m...@homeinbox.net>> wrote:
>> >
>> >     On 04/12/2019 09:58, Mark Thomas wrote:
>> >     > On 04/12/2019 09:27, Mark Thomas wrote:
>> >     >> On 04/12/2019 07:55, Rémy Maucherat wrote:
>> >     >
>> >     > <snip/>
>> >     >
>> >     >>> Blocking is also used for HTTP/2 where it seems to work. If the
>> >     write
>> >     >>> buffer is full, 0 bytes are written and the socket is indeed
>> >     passed to
>> >     >>> the poller but until the operation is complete it is supposed
>> to be
>> >     >>> blocked here:
>> >     >>>
>> >
>> https://github.com/apache/tomcat/blob/master/java/org/apache/tomcat/util/net/SocketWrapperBase.java#L1430
>> >     >>
>> >     >> That helps a lot. I can try and see why things aren't blocking
>> there.
>> >     >
>> >     > Some progress. The thread does wait there but it is getting
>> >     released too
>> >     > early. I'm guessing something is calling notify(). I'm planning
>> to add
>> >     > more debug to figure out what / why.
>> >
>> >     Found it.
>> >
>> >     The wait on line 1430 (see link above) waits for the write/read
>> timeout.
>> >     The problem was that this wait was timing out, allowing the code to
>> >     continue before the Poller timed out a few milliseconds later and
>> called
>> >     the CompletionHander which ultimately sets the SendResult the
>> WebSocket
>> >     code is looking for.
>> >
>> >     I'm currently testing using:
>> >
>> >     state.wait()
>> >
>> >     rather than
>> >
>> >     state.wait(unit.toMillis(timeout));
>> >
>> >
>> > Well, it would remove the enforcement of the overall timeout since the
>> > write could be made up of multiple operations. But I think I get the
>> > idea that it's imperfect (once the IO is running, you cannot stop it
>> > this way).
>>
>> Could we record the start time of the write (or similar) and adjust the
>> timeout passed the Poller accordingly?
>>
>
> Yes, it's usually this kind of strategy, but it's complex :( I may keep
> the wait(timeout) instead but make the processing of the failed more robust.
>

It seemed unreasonably complex for what it is and the breakage risk was
really high too, so I preferred making the call of the "user" completion
handler robust. Let me know if it doesn't work well enough but it seems ok
with the test you gave (no null send result).

I will revisit this after the next build.

Rémy

Reply via email to