[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135159#comment-13135159
 ] 

Ivan Kelly commented on BOOKKEEPER-93:
--------------------------------------

1) Yikes, that's a big oversight. There is actually a test for it, 
BookieReadWriteTest#testReadFromOpenLedger, but the @Test annotation is missing 
from it so it never gets run. Also, the actual checking code seems to be wrong, 
as it tries to read from lh, not lhOpen (line 861). Could you break the fix for 
this problem into a single patch along with the fix for the test and ill commit 
that as BOOKKEEPER-91. 

2) This is unrelated to 1) so should be in a separate JIRA. Also, im unsure the 
race you describe can occur. ReadLastConfirmedOp#readEntryComplete is already 
synchronized.

3) Actually this could go into BOOKKEEPER-91. However, I think a better 
solution may be to do a ReadLastConfirmedOp in the else part of 
LedgerOpenOp#processResult. 
{code}
        if(!unsafe) {
            lh.recover(new GenericCallback<Void>() {
            @Override
            public void operationComplete(int rc, Void result) {
                if (rc != BKException.Code.OK) {
                    cb.openComplete(BKException.Code.LedgerRecoveryException, 
null, LedgerOpenOp.this.ctx);
                } else {
                    cb.openComplete(BKException.Code.OK, lh, 
LedgerOpenOp.this.ctx);
                }
            }
       } else {
           lh.asyncReadLastConfirmed(new ReadLastConfirmedCallback() {
               void readLastConfirmedComplete(int rc, long lastConfirmed, 
Object ctx) {
                   lh.lastAddConfirmed = lh.lastAddPushed = lastConfirmed;
                   cb.complete(rc, LedgerOpenOp.this.ctx);
               }
           });
       }
{code}

This way, a non recovery ledger will be able to read entries up to the point it 
was opened and no further. I think this should be correct behaviour, as 
otherwise it could be possible for the ledger to read an entry which hasn't 
been confirmed to the writer. If it hasn't been confirmed to the writer and the 
writer closes at that point. Which means the reader can read more than the 
writer, which I don't think affects correctness, but is a little ugly.
                
> bookkeeper doesn't work correctly on OpenLedgerNoRecovery
> ---------------------------------------------------------
>
>                 Key: BOOKKEEPER-93
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-93
>             Project: Bookkeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.0
>            Reporter: Sijie Guo
>            Assignee: Sijie Guo
>             Fix For: 3.4.0
>
>         Attachments: bookkeeper-93.patch
>
>
> 1) bookkeeper hang when openLedgerNoRecovery, since LedgerOpenOp didn't 
> trigger callback when opening ledger no recovery.
> 2) race condition in ReadLastConfirmOp
> ReadLastConfirmOp callback on readEntryComplete.
> a) first decrement numResponsePending
> b) then increment validResponses
> c) check validResponses to callback with OK
> b) check numResponsePending to callback with LedgerRecoveryException
> support two callbacks returns on readEntryComplete: A, B. (quorum/ensemble 
> size : 2)
> a) A first decrement numResponsePending from 2 to 1.
> b) A increment validResponses from 0 to 1.
> c) B then decrement numResponsePending from 1 to 0.
> d) A check numResponsePending before B check validResponse, A found the 
> numResponsePending is 0 now. A will callback with exception. But the right 
> action is B check validResponse and callback with OK.
> 3) if an LegerHandle is opened by openLedgerNoRecovery, the lastAddConfirmed 
> will be set to -1. so all read requests will be failed since readEntry id > 
> lastAddConfirmed.
> so I suggested that if an LegerHandle is opened by openLegerNoRecovery, the 
> ledgerHandle is under unsafeRead mode. close/write operations will be failed, 
> read operations should not check condition entry_id > lastAddConfirmed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to