[ 
https://issues.apache.org/jira/browse/DERBY-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584520#action_12584520
 ] 

Øystein Grøvlen commented on DERBY-3567:
----------------------------------------

Thanks for the patch, Jørgen.  
Unless I am mistaken, with this patch there is a chance that threads waiting 
for a flush will not be notified.
Consider the case where two threads call forceFlush concurrently.  As far as I 
can tell, both may be waiting for a log send to complete, but only one thread 
will be notified when the send has completed.  Since forceFlushed is then 
false, no further notification will be made.  Hence, the second thread may 
time out and cause replication to be stopped.
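
To make the scenario concrete, here is a simplified model of the flag-guarded 
notification as I read it.  This is not the code from the patch: the class and 
method names are hypothetical, and only the forceFlushed flag is taken from 
the discussion above.

```java
// Simplified, hypothetical model of a flag-guarded notify.
// It is NOT the patch code; it only mirrors the behaviour described above.
class FlagGuardedNotifier {
    // Set by client threads that want an immediate send.
    private volatile boolean forceFlushed = false;

    /** A client thread requests a flush before it starts waiting. */
    void requestFlush() {
        forceFlushed = true;
    }

    /**
     * Called by the log shipper after a send.
     * Returns true if a notification was issued.
     */
    synchronized boolean sendCompleted() {
        if (!forceFlushed) {
            return false;       // shipper believes nobody is waiting
        }
        forceFlushed = false;   // flag is cleared for ALL waiters at once
        notify();               // but at most one waiting thread is woken
        return true;
    }
}
```

If two threads call requestFlush() before the first send completes, the first 
sendCompleted() clears the flag and wakes one of them; the next 
sendCompleted() sees the flag as false and notifies nobody.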

I do not think it will work to just switch to notifyAll, since it is not 
guaranteed that a single send frees enough space for all waiting threads to 
write their log records.  Instead, I suggest calling notify after every send, 
regardless of whether anyone is waiting.  I doubt the extra synchronization 
overhead will be significant, since the current solution already involves 
access to a volatile variable.
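
A minimal sketch of what I have in mind, assuming the buffer can be modelled 
as a simple byte count.  The class, its methods and the capacity bookkeeping 
are all made up for illustration and do not correspond to the actual 
AsynchronousLogShipper code.

```java
// Hypothetical sketch, not Derby code: all names and the byte-counting
// buffer model are assumptions made for illustration.
class ShipperSketch {
    private final int capacity;   // total buffer space, in bytes
    private int used;             // bytes currently buffered (guarded by "this")

    ShipperSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Log shipper thread: called after every send that freed `bytes`. */
    synchronized void sendCompleted(int bytes) {
        used = Math.max(0, used - bytes);
        notify();   // unconditional; harmless when nobody is waiting
    }

    /**
     * Client thread: waits until `bytes` fit in the buffer, then reserves them.
     * Returns false if the space did not become available in time.
     */
    synchronized boolean forceFlush(int bytes, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (used + bytes > capacity) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;   // caller would stop replication here
            }
            try {
                wait(remainingMillis);   // re-check the condition after waking
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        used += bytes;
        return true;
    }
}
```

A woken thread whose record still does not fit simply goes back to waiting 
for the rest of its timeout.  Since every send issues a notification, waiters 
keep getting chances to re-check for space as long as the shipper keeps 
sending.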



> AsynchronousLogShipper#forceFlush should time out
> -------------------------------------------------
>
>                 Key: DERBY-3567
>                 URL: https://issues.apache.org/jira/browse/DERBY-3567
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0, 10.5.0.0
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: derby-3567-1a.diff, derby-3567-1a.stat
>
>
> If the network connection to the slave is lost, 
> ObjectOutputStream#writeObject may be blocked for 2 minutes before failing 
> (not configurable TCP property). 
> Currently, ALS#forceFlush sends a chunk of log to the slave using the client 
> thread. The client thread cannot be blocked for 2 minutes before giving up. 
> Rather, it should notify the log shipper that it has to send log immediately, 
> and then wait for a short while (until notified or e.g. maximum 5 seconds). 
> If the log shipper has not been able to empty some space in the log buffer by 
> then, replication should be stopped.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
