cweld510 opened a new pull request #224:
URL: https://github.com/apache/httpcomponents-core/pull/224


   I’ve observed a deadlock when writing to a ContentOutputStream obtained 
through AbstractClassicEntityProducer. We’ve looked through the layers at how 
ContentOutputStream’s SharedOutputBuffer coordinates between threads and traced 
this to a race condition in AbstractHttp1StreamDuplexer. The issue is that 
AbstractHttp1StreamDuplexer.requestSessionOutput can be called, but the event 
loop never calls onOutput, which then doesn’t call produceOutput.
   
   When a thread calls requestSessionOutput in the duplexer, it increments an 
AtomicInteger called outputRequests (used to track the current number of 
requests for output), and sets an OP_WRITE event. An event thread notices the 
OP_WRITE state, and responds to it by calling 
AbstractHttp1StreamDuplexer.onOutput(). This method then calls 
AbstractHttp1StreamDuplexer.produceOutput(). After the event thread returns, if 
requestSessionOutput has not been called since the event thread returned from 
produceOutput, the thread clears the event and sets outputRequests to zero. 
Otherwise, it decrements outputRequests after it finishes.
   
   How this is going wrong: Let’s say that 
AbstractHttp1StreamDuplexer.outputRequests starts with a value of 1 and that 
the event thread is in the middle of a call to onOutput and has just returned 
from calling produceOutput. At that moment, a call to requestSessionOutput 
increments the outputRequests counter to 2. Then line 350: “final int 
pendingOutputRequests = outputRequests.get();” pendingOutputRequests is set to 
2, and then we set pendingOutputRequeests back to zero and unset the OP_WRITE 
state in lines 356 and 357:
                   if (!outputPending && !outbuf.hasData() && 
outputRequests.compareAndSet(pendingOutputRequests, 0)) {
                       ioSession.clearEvent(SelectionKey.OP_WRITE);
   
   Because the OP_WRITE event was cleared, onOutput (and thus produceOutput) 
will not be called again, even though a caller requested one.
   
   The simplest way to fix this problem is to set pendingOutputRequests before 
the call to produceOutput; this makes the condition on line 356 false whenever 
requestSessionOutput is called directly after produceOutput. This change is 
included in my pull request.
   
   I wasn’t able to reproduce the issue in a unit test because 
AbstractHttp1StreamDuplexer has several dependencies (e.g. the behavior of 
event threads) that I couldn’t figure out how to mock well, and the class 
doesn’t have any existing unit tests. A test in my company’s codebase that 
reliably reproduced the race condition passes after the change, but failed 
before, and I have verified that the change works as expected by using my 
debugger. However, I’d be happy to create a unit test to verify the correctness 
of this change, if there is a way to do so effectively.
   
   Does this seem like the correct fix?
   
   Thank you! --Colin
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to