[
https://issues.apache.org/jira/browse/BOOKKEEPER-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630230#comment-13630230
]
Jiannan Wang commented on BOOKKEEPER-590:
-----------------------------------------
{quote}
+1 for retaining the old scan and compare. I don't see why it would hit a
problem going from 32bit - 64bit. The ledger ranges are defined as 2 longs, so
it should handle it fine.
{quote}
The problem is that we currently format the ledger id into a fixed-length
(10 bytes) string (see MSLedgerManagerFactory#ledgerId2Key). Once the ledger id
space is enlarged to 64 bits, the formatted string can be longer than 10 bytes,
which breaks the ordering when scanning ledger ids. For example, ledger id
"1111234567890" will be returned before ledger id "1234567890" under string
comparison. More work is needed to resolve this issue.
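To make the ordering problem concrete, here is a small sketch. It assumes the key is a zero-padded 10-digit decimal string; fixedWidthKey is a hypothetical stand-in, and the real ledgerId2Key may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LedgerKeyOrderDemo {
    // Hypothetical stand-in for MSLedgerManagerFactory#ledgerId2Key (assumption):
    // zero-pad to 10 digits. Ids with more than 10 digits are not truncated,
    // they simply produce a longer string.
    static String fixedWidthKey(long ledgerId) {
        return String.format("%010d", ledgerId);
    }

    // Return the ids in the order an ordered scan over the string keys would yield.
    static List<Long> scanOrder(List<Long> ids) {
        List<Long> sorted = new ArrayList<>(ids);
        sorted.sort(Comparator.comparing(LedgerKeyOrderDemo::fixedWidthKey));
        return sorted;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1234567890L, 1111234567890L, 42L);
        // The 13-digit id sorts before the smaller 10-digit id.
        System.out.println(scanOrder(ids)); // [42, 1111234567890, 1234567890]
    }
}
```

The numeric order would be 42, 1234567890, 1111234567890, so the scan order and the numeric order disagree as soon as one key exceeds the fixed width.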
Of course, we can work around it by simply adding a prefix character that is
larger than any ASCII digit for ledger ids longer than 10 bytes. For example,
since 'A' > '0'..'9', we can format 1111234567890 as
"A 000000001111234567890", and the scan result is still ordered. However, this
is very tricky, and I don't know whether there is a better solution.
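A rough sketch of the prefix workaround follows. The names, the 19-digit padding width for large ids, and the omission of the space after 'A' are my assumptions for illustration only, not what a patch would necessarily do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PrefixedLedgerKeyDemo {
    // Largest id that still fits in the legacy 10-digit key.
    static final long MAX_LEGACY_ID = 9_999_999_999L;

    // Hypothetical key format (assumption): legacy ids keep their 10-digit key,
    // larger ids get an 'A' prefix plus a 19-digit zero-padded value
    // (19 digits is enough for any non-negative long).
    static String prefixedKey(long ledgerId) {
        if (ledgerId < 0) {
            throw new IllegalArgumentException("negative ledger id: " + ledgerId);
        }
        if (ledgerId <= MAX_LEGACY_ID) {
            return String.format("%010d", ledgerId);
        }
        return "A" + String.format("%019d", ledgerId);
    }

    // Return the ids in the order an ordered scan over the string keys would yield.
    static List<Long> scanOrder(List<Long> ids) {
        List<Long> sorted = new ArrayList<>(ids);
        sorted.sort(Comparator.comparing(PrefixedLedgerKeyDemo::prefixedKey));
        return sorted;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1111234567890L, 1234567890L, 42L, Long.MAX_VALUE);
        System.out.println(scanOrder(ids));
    }
}
```

Because every legacy key starts with a digit and every new key starts with 'A', all old ids sort before all new ids, and within each group the fixed width keeps string order equal to numeric order. Existing keys do not need to be rewritten, which is the only attraction of the trick.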
Anyway, my concern is the extra effort needed to maintain the ordering for
MSLedgerManagerFactory backward compatibility. Is it worth doing this only for
a simple and specific GC implementation?
> Another Scan-And-Compare GC Implementation
> ------------------------------------------
>
> Key: BOOKKEEPER-590
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-590
> Project: Bookkeeper
> Issue Type: Improvement
> Components: bookkeeper-server
> Reporter: Jiannan Wang
> Assignee: Jiannan Wang
> Attachments: BOOKKEEPER-590.patch
>
>
> The idea of Scan-And-Compare GC is as follows:
> * Assume the ledger id list on the local bookie server is *LocalLedgers*
> * At the same time, the ledger id list in the metadata storage is *LiveLedgers*
> * Then the ledgers that require garbage collection are *LocalLedgers -
> LiveLedgers*
> Under the current implementation, a ledger id order guarantee is required when
> obtaining *LiveLedgers* from the metadata storage. However, this is unnecessary:
> we can take *LocalLedgers* and simply remove the elements that are in
> *LiveLedgers* one by one, in any order.
> What's more, without the order requirement when scanning all ledger ids, some
> things become simpler:
> * We don't even need a radix tree to maintain 64-bit ledger metadata; a
> hierarchical hash tree is enough (just as topic metadata management
> does).
> * It is easy to handle 64-bit ledger id backward compatibility for
> MSLedgerManager:
> ** Currently, for MSLedgerManager, we format the ledger id as a fixed-length
> (currently 10) digit string to enable ordered scans.
> ** When a 64-bit ledger id is used, we need to enlarge the fixed length,
> and backward compatibility with old ledger ids becomes troublesome if we
> require this order guarantee.
> For the above reasons, it would be better to remove the specific order
> requirement from the current Scan-And-Compare GC implementation.
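For reference, the order-free variant described in the issue above amounts to a plain set difference. A minimal sketch, assuming both id lists can simply be iterated (the class and method names are hypothetical, not BookKeeper APIs):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ScanAndCompareGcDemo {
    // Compute LocalLedgers - LiveLedgers. liveLedgers may arrive in any order,
    // since each live id is just removed from the candidate set.
    static Set<Long> findGarbage(Iterable<Long> localLedgers, Iterable<Long> liveLedgers) {
        Set<Long> garbage = new HashSet<>();
        for (Long id : localLedgers) {
            garbage.add(id);
        }
        for (Long id : liveLedgers) {
            garbage.remove(id); // no-op for live ledgers this bookie does not hold
        }
        return garbage;
    }

    public static void main(String[] args) {
        Set<Long> garbage = findGarbage(
                Arrays.asList(1L, 2L, 3L, 4L),   // ledgers stored locally
                Arrays.asList(4L, 2L, 99L));     // live ledgers, unordered
        System.out.println(garbage);
    }
}
```

The cost is holding the local ledger ids in memory while the live ids are scanned, but no ordering is needed from the metadata store.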