[
https://issues.apache.org/jira/browse/ZOOKEEPER-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973395#action_12973395
]
Flavio Junqueira commented on ZOOKEEPER-465:
--------------------------------------------
Hi Dhruba, When I wrote the description, I think I was referring to writing to
ZooKeeper once we close a ledger, so we wouldn't have to pay the price of a
ZooKeeper update upon each addEntry. However, thinking again about the problem,
this approach is not fault tolerant. If the client writer crashes before
closing the ledger and the byte count is kept only in volatile memory, then
we will lose it.
One way I see to overcome this problem is having each bookie keep the byte
count for its ledger fragment. Given the byte count B for a ledger fragment, we
can obtain an estimate of the total number of bytes by computing (B * n/r),
where n is the number of bookies storing the ledger and r is the replication
factor of each entry. This formula comes from the observation that each bookie
stores a fraction r/n of the entries of a ledger.
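To make the arithmetic concrete, here is a minimal sketch of the (B * n/r)
estimate. The function and parameter names are made up for illustration and
are not part of any BookKeeper API.

```python
def estimate_ledger_bytes(fragment_bytes: int, num_bookies: int, replication: int) -> float:
    """Estimate total ledger size from one bookie's fragment byte count.

    Hypothetical helper: fragment_bytes is B, num_bookies is n, and
    replication is r in the formula B * n / r.
    """
    if not 1 <= replication <= num_bookies:
        raise ValueError("expected 1 <= r <= n")
    return fragment_bytes * num_bookies / replication


# Example: 3 bookies, each entry stored on 2 of them, one bookie reports 4000 bytes.
print(estimate_ledger_bytes(4000, 3, 2))  # 6000.0
```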
This approach, however, does not provide a good estimate if the length of
entries varies significantly. A less efficient approach that doesn't have the
imbalance problem is reading the byte counts from all bookies, adding them up,
and dividing by the replication factor. This operation will only complete if no
bookie is faulty. If a bookie is faulty, we have a procedure to recover its
ledger fragments.
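As a rough sketch of the sum-and-divide approach, assuming every entry is
stored on exactly r bookies and every bookie answers (the names below are
hypothetical, not an existing API):

```python
def total_ledger_bytes(fragment_bytes_by_bookie: dict[str, int], replication: int) -> int:
    """Add up the per-bookie byte counts and divide by the replication factor.

    Assumes each entry is stored exactly `replication` times, so the sum over
    all bookies counts every byte of the ledger `replication` times.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return sum(fragment_bytes_by_bookie.values()) // replication


# Example: entries of 100, 300 and 500 bytes striped over 3 bookies with r = 2:
# entry 0 -> b0, b1; entry 1 -> b1, b2; entry 2 -> b2, b0.
counts = {"b0": 100 + 500, "b1": 100 + 300, "b2": 300 + 500}
print(total_ledger_bytes(counts, 2))  # 900
```

In this example the per-bookie counts differ (600, 400, 800), yet the sum
still recovers the true size of 900 bytes, which is why this variant does not
suffer from the imbalance problem.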
Assuming that there are bookies that have crashed and their fragments haven't
been replicated to new bookies, the best I can think of at this point is taking
the average byte count over the bookies that are up and applying the same
(B * n/r) computation to that average.
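A minimal sketch of this fallback, under the assumption that we know n and r
and can read the byte counts of the bookies that are still up (all names here
are hypothetical):

```python
def estimate_from_live_bookies(live_fragment_bytes: list[int],
                               num_bookies: int,
                               replication: int) -> float:
    """Average the byte counts of the reachable bookies, then scale by n / r."""
    if not live_fragment_bytes:
        raise ValueError("need at least one reachable bookie")
    if not 1 <= replication <= num_bookies:
        raise ValueError("expected 1 <= r <= n")
    average = sum(live_fragment_bytes) / len(live_fragment_bytes)
    return average * num_bookies / replication


# Example: n = 3, r = 2, one bookie is down; the two live ones report 600 and 800 bytes.
print(estimate_from_live_bookies([600, 800], 3, 2))  # 1050.0
```

The result is only an estimate: if the crashed bookie held a fragment of 400
bytes, the real ledger size would be (600 + 400 + 800) / 2 = 900 bytes.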
Any other option I'm missing?
> Ledger size in bytes
> --------------------
>
> Key: ZOOKEEPER-465
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-465
> Project: ZooKeeper
> Issue Type: New Feature
> Components: contrib-bookkeeper
> Reporter: Flavio Junqueira
>
> It is currently easy to know how many entries a ledger has, but there is no
> easy way to know the total number of bytes in a ledger. The idea of this jira
> is to add a method that gives the number of bytes in a closed ledger. My
> current idea is to simply have the writer count the number of bytes
> written and store it in ZooKeeper.