[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406641#comment-13406641
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-247:
------------------------------------------------

Hi Ivan,

Here is one boundary case I came across. When a client has written a single 
entry and is waiting, and one BK goes down at that point, the Ledger checker is 
not able to report that as an underReplicated fragment.

I think it should detect that as under-replicated; then I can wait in 
PendingReplicationWorker for the grace period and fence the ledger. If it is 
not able to detect it as underReplicated, we cannot know whether there really 
are no underReplicated fragments or someone else has already replicated them.

Here is a test to reproduce:

{code}
    /**
     * Tests that LedgerChecker should report one fragment as underReplicated
     * if there is an open ledger with a single entry written.
     */
    @Test(timeout = 3000)
    public void testShouldGetOneFragmentWithSingleEntryOpenedLedger()
            throws Exception {
        LedgerHandle lh = bkc.createLedger(3, 3, BookKeeper.DigestType.CRC32,
                TEST_LEDGER_PASSWORD);
        lh.addEntry(TEST_LEDGER_ENTRY_DATA);
        ArrayList<InetSocketAddress> firstEnsemble = lh.getLedgerMetadata()
                .getEnsembles().get(0L);
        // get(0) returns the first bookie of the ensemble.
        InetSocketAddress firstBookieFromEnsemble = firstEnsemble.get(0);
        LOG.info("Killing " + firstBookieFromEnsemble + " from ensemble="
                + firstEnsemble);
        killBookie(firstBookieFromEnsemble);

        startNewBookie();
        
        // Open the ledger separately for the Ledger checker.
        LedgerHandle lh1 = bkc.openLedgerNoRecovery(lh.getId(),
                BookKeeper.DigestType.CRC32, TEST_LEDGER_PASSWORD);
        
        Set<LedgerFragment> result = getUnderReplicatedFragments(lh1);
        assertNotNull("Result shouldn't be null", result);
        assertEquals("There should be 1 fragment. But returned fragments are "
                + result, 1, result.size());
    }

    private Set<LedgerFragment> getUnderReplicatedFragments(LedgerHandle lh)
            throws InterruptedException {
        LedgerChecker checker = new LedgerChecker(bkc);
        CheckerCallback cb = new CheckerCallback();
        checker.checkLedger(lh, cb);
        Set<LedgerFragment> result = cb.waitAndGetResult();
        return result;
    }
{code}

I think the problem is that when the ledger is not closed, getLastConfirmed may 
not give the real last entry; we get one less than the real last confirmed 
entry. Only if the ledger is closed can we get the real last entry. In this 
case too, only one entry was written and the ledger was still open, so the last 
confirmed may come back as nothing. As a result, no fragments from the ledger 
are detected as underReplicated.

If I write one extra entry, then the fragment is detected as underReplicated.
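
To make the off-by-one reasoning above concrete, here is a toy model in plain 
Java. It does not use any BookKeeper classes; LastConfirmedSketch, StoredEntry, 
writeEntries and visibleLastConfirmed are hypothetical names made up for this 
illustration. It assumes that each stored entry carries the last confirmed 
value the writer knew when it sent that entry, and that a reader of a 
still-open ledger can only learn the highest such value. That assumption would 
explain the behaviour seen in the test, but it is a sketch, not the actual 
client code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of last-confirmed lag on an open ledger. Not BookKeeper code. */
class LastConfirmedSketch {
    /** Sentinel meaning "no entry confirmed yet". */
    static final long NO_ENTRY = -1L;

    /** An entry as stored: its id plus the last confirmed id sent along with it. */
    static final class StoredEntry {
        final long entryId;
        final long piggybackedLastConfirmed;

        StoredEntry(long entryId, long piggybackedLastConfirmed) {
            this.entryId = entryId;
            this.piggybackedLastConfirmed = piggybackedLastConfirmed;
        }
    }

    /** Simulates a writer adding n entries, each acknowledged before the next is sent. */
    static List<StoredEntry> writeEntries(int n) {
        List<StoredEntry> stored = new ArrayList<>();
        long lastConfirmed = NO_ENTRY;
        for (long id = 0; id < n; id++) {
            // The entry carries the last confirmed id known at send time.
            stored.add(new StoredEntry(id, lastConfirmed));
            // Only after the acknowledgement does the writer advance its own value.
            lastConfirmed = id;
        }
        return stored;
    }

    /** What a reader of the still-open ledger can learn from the stored entries. */
    static long visibleLastConfirmed(List<StoredEntry> stored) {
        long lastConfirmed = NO_ENTRY;
        for (StoredEntry e : stored) {
            lastConfirmed = Math.max(lastConfirmed, e.piggybackedLastConfirmed);
        }
        return lastConfirmed;
    }

    public static void main(String[] args) {
        // One entry written: the reader sees nothing confirmed.
        System.out.println(visibleLastConfirmed(writeEntries(1))); // prints -1
        // Two entries written: the reader sees entry 0 as confirmed.
        System.out.println(visibleLastConfirmed(writeEntries(2))); // prints 0
    }
}
```

Under this model, a checker that only looks at entries up to the visible last 
confirmed id has nothing to check after a single write, which matches the test 
result above.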

Thanks
Uma
                
> Detection of under replication
> ------------------------------
>
>                 Key: BOOKKEEPER-247
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-247
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>         Attachments: BOOKKEEPER-247.diff, BOOKKEEPER-247.diff, 
> BOOKKEEPER-247.patch, BOOKKEEPER-247.patch
>
>
> This JIRA discusses how the bookkeeper system will detect underreplication of 
> ledger entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
