[
https://issues.apache.org/jira/browse/BOOKKEEPER-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406641#comment-13406641
]
Uma Maheswara Rao G commented on BOOKKEEPER-247:
------------------------------------------------
Hi Ivan,
Here is one boundary case I came across. When a client has written a single
entry and is waiting, and one BK goes down at that point, the LedgerChecker
does not report that fragment as underReplicated.
I think it should detect the fragment as underReplicated; then I can wait in
PendingReplicationWorker for the grace period and fence the ledger. If it
cannot detect it as underReplicated, we cannot know whether there really are
no underReplicated fragments or whether someone else has already replicated them.
Here is a test to reproduce it:
{code}
/**
 * Tests that LedgerChecker should report one fragment as underReplicated
 * if there is an open ledger with a single entry written.
 */
@Test(timeout = 3000)
public void testShouldGetOneFragmentWithSingleEntryOpenedLedger()
        throws Exception {
    LedgerHandle lh = bkc.createLedger(3, 3, BookKeeper.DigestType.CRC32,
            TEST_LEDGER_PASSWORD);
    lh.addEntry(TEST_LEDGER_ENTRY_DATA);
    ArrayList<InetSocketAddress> firstEnsemble = lh.getLedgerMetadata()
            .getEnsembles().get(0L);
    // Kill the first bookie of the ensemble while the ledger is still open.
    InetSocketAddress firstBookieFromEnsemble = firstEnsemble.get(0);
    LOG.info("Killing " + firstBookieFromEnsemble + " from ensemble="
            + firstEnsemble);
    killBookie(firstBookieFromEnsemble);
    startNewBookie();

    // Open the ledger separately for the LedgerChecker.
    LedgerHandle lh1 = bkc.openLedgerNoRecovery(lh.getId(),
            BookKeeper.DigestType.CRC32, TEST_LEDGER_PASSWORD);
    Set<LedgerFragment> result = getUnderReplicatedFragments(lh1);
    assertNotNull("Result shouldn't be null", result);
    assertEquals("There should be 1 fragment. But returned fragments are "
            + result, 1, result.size());
}

private Set<LedgerFragment> getUnderReplicatedFragments(LedgerHandle lh)
        throws InterruptedException {
    LedgerChecker checker = new LedgerChecker(bkc);
    CheckerCallback cb = new CheckerCallback();
    checker.checkLedger(lh, cb);
    Set<LedgerFragment> result = cb.waitAndGetResult();
    return result;
}
{code}
I think the problem is that when the ledger is not closed, getLastConfirmed may
not give the real last entry; we get one less than the real last confirmed
entry. Only when the ledger is closed can we get the real last entry. In this
case, only one entry was written and the ledger was still open, so the last
confirmed entry may come back as nothing. As a result, the checker does not
detect any fragments of the ledger as underReplicated.
If I write one more entry, then the fragment is detected as underReplicated.
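To make the suspected off-by-one concrete, here is a toy model of it. This is
only a sketch and not BookKeeper code: the class LacLagModel, its methods, and
the -1 sentinel are hypothetical, and it simply assumes that a reader of an
open ledger sees a last confirmed entry one behind the last entry written.

```java
/** Toy model (not BookKeeper code) of last-confirmed lag on an open ledger. */
public class LacLagModel {
    // Assumed sentinel meaning "no entry confirmed yet".
    static final long INVALID_ENTRY_ID = -1L;

    private long lastAdded = INVALID_ENTRY_ID;
    private boolean closed = false;

    /** Appends one entry and returns its id (0, 1, 2, ...). */
    long addEntry() {
        return ++lastAdded;
    }

    void close() {
        closed = true;
    }

    /**
     * Last confirmed entry as seen by a reader that opened the ledger
     * without recovery. Assumption: while the ledger is open the reader
     * lags one entry behind; once closed it sees the real last entry.
     */
    long readerVisibleLastConfirmed() {
        if (closed || lastAdded == INVALID_ENTRY_ID) {
            return lastAdded;
        }
        return lastAdded - 1;
    }

    /** Number of entries a checker bounded by the visible value can examine. */
    long checkableEntries() {
        return readerVisibleLastConfirmed() + 1;
    }

    public static void main(String[] args) {
        LacLagModel ledger = new LacLagModel();
        ledger.addEntry();
        System.out.println("open, 1 entry:   checkable = " + ledger.checkableEntries());
        ledger.addEntry();
        System.out.println("open, 2 entries: checkable = " + ledger.checkableEntries());
        ledger.close();
        System.out.println("closed:          checkable = " + ledger.checkableEntries());
    }
}
```

Under that assumption, an open ledger with a single entry gives the checker
nothing to examine, which would match both the failing test and the
observation that writing a second entry makes the fragment show up.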
Thanks
Uma
> Detection of under replication
> ------------------------------
>
> Key: BOOKKEEPER-247
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-247
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: bookkeeper-client, bookkeeper-server
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Attachments: BOOKKEEPER-247.diff, BOOKKEEPER-247.diff,
> BOOKKEEPER-247.patch, BOOKKEEPER-247.patch
>
>
> This JIRA discusses how the bookkeeper system will detect underreplication of
> ledger entries.