[ 
https://issues.apache.org/jira/browse/HADOOP-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567765#action_12567765
 ] 

Hadoop QA commented on HADOOP-2789:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12375254/HADOOP-2789.patch
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 3 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler 
warnings.

    release audit +1.  The applied patch does not generate any new release 
audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1773/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1773/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1773/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1773/console

This message is automatically generated.

> Race condition in ipc.Server prevents response being written back to client.
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2789
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2789
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.16.0
>            Reporter: Clint Morgan
>            Assignee: Raghu Angadi
>            Priority: Critical
>             Fix For: 0.16.1
>
>         Attachments: failure-with-patch.log, failure.log, 
> HADOOP-2789-Test.patch, HADOOP-2789.patch, HADOOP-2789.patch, 
> HADOOP-2789.patch, HADOOP-2789.patch, HADOOP-2789.patch, HADOOP-2789.patch, 
> success.log
>
>
> I encountered a race condition in ipc.Server when writing the response
> back to the socket. Sometimes the write SelectionKey is being canceled
> when it should not be, and thus the full response never gets
> written. This results in clients timing out on the socket while waiting for 
> the response.
> I am attaching a unit test that demonstrates the problem. It follows
> closely the TestIPC test, however the socket output buffer is set
> smaller than the result being sent back, so that partial writes
> occur. I also put a random sleep in the client to help provoke the race
> condition.
> On my machine this fails over half of the time.
> Looking at the code in ipc.Server.java, the problem manifests in
> Responder.doAsyncWrite(). If I comment out the key.cancel() line, then
> everything works fine.
> So we need to identify when to safely cancel the key.
> I tried the following:
> {noformat}
>     private void doAsyncWrite(SelectionKey key) throws IOException {
>       Call call = (Call)key.attachment();
>       if (call == null) {
>         return;
>       }
>       if (key.channel() != call.connection.channel) {
>         throw new IOException("doAsyncWrite: bad channel");
>       }
>       if (processResponse(call.connection.responseQueue)) {
>           synchronized(call.connection.responseQueue) {
>               if (call.connection.responseQueue.size() == 0) {
>                   LOG.info("Cancelling key for call "+call.toString()+ " key: "+ key.toString());
>                   key.cancel();          // remove item from selector.
>               } else {
>                   LOG.warn("NOT REALLY DONE: "+call.toString()+ " key: "+ key.toString());
>               }
>           }
>       }
>     }
> {noformat}
> And this does catch some of the cases (e.g., the LOG.warn message gets hit),
> but I still hit the race condition.
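
The description above asks when it is safe to cancel the write key. One way to think about it is sketched below, as a simplified model and not the attached patch or the actual ipc.Server code. It assumes that the handler side (which queues a response and arms write interest) and the responder side (which drains the queue and drops write interest) take the same lock, so the "queue is empty" check and the cancel cannot interleave with a new enqueue. The class ResponderSketch, its methods, and the boolean writeInterest (a stand-in for a registered OP_WRITE key) are all hypothetical names used only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ResponderSketch {
    private final Deque<String> responseQueue = new ArrayDeque<>();
    // Hypothetical stand-in for "an OP_WRITE key is registered with the selector".
    private boolean writeInterest = false;

    /** Handler side: queue a response and arm write interest in one critical section. */
    public void enqueue(String response) {
        synchronized (responseQueue) {
            responseQueue.addLast(response);
            writeInterest = true;
        }
    }

    /** Responder side: take one response; drop write interest only if nothing is left. */
    public String writeOne() {
        synchronized (responseQueue) {
            String next = responseQueue.pollFirst();
            if (responseQueue.isEmpty()) {
                // In this model, this is the only point where a cancel would be safe.
                writeInterest = false;
            }
            return next;
        }
    }

    public boolean hasWriteInterest() {
        synchronized (responseQueue) {
            return writeInterest;
        }
    }

    public int pending() {
        synchronized (responseQueue) {
            return responseQueue.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ResponderSketch s = new ResponderSketch();
        // One producer thread races against the consuming main thread.
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                s.enqueue("r" + i);
            }
        });
        producer.start();
        int written = 0;
        while (written < 1000) {
            if (s.writeOne() != null) {
                written++;
            }
        }
        producer.join();
        System.out.println("written=" + written + " pending=" + s.pending()
            + " writeInterest=" + s.hasWriteInterest());
    }
}
```

In this model the invariant is that writeInterest is false only when the queue is empty. The attempted change quoted above synchronizes only the check in doAsyncWrite(); if the code that queues responses and registers the key does not hold the same lock, a similar window may remain, which would be consistent with the race still being observed.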

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
