[bookkeeper] branch master updated: Introduce condensed encoding format for availabilityOfEntriesOfLedger

sijie Sat, 17 Nov 2018 09:12:01 -0800

This is an automated email from the ASF dual-hosted git repository.

sijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/bookkeeper.git



The following commit(s) were added to refs/heads/master by this push:
     new fe50b8c  Introduce condensed encoding format for 
availabilityOfEntriesOfLedger
fe50b8c is described below

commit fe50b8c975acce46a2ed680c82de62cba36b058e
Author: Charan Reddy Guttapalem <[email protected]>
AuthorDate: Sat Nov 17 09:11:31 2018 -0800

    Introduce condensed encoding format for availabilityOfEntriesOfLedger
    
    Descriptions of the changes in this PR:
    
    - this condensed encoding format will take fewer bytes than initially
    proposed bit vector.
    
    Reviewers: Sijie Guo <[email protected]>, Enrico Olivelli 
<[email protected]>
    
    This closes #1814 from reddycharan/updatebpdoc
---
 site/bps/BP-34-cluster-metadata-checker.md | 61 ++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 3 deletions(-)

diff --git a/site/bps/BP-34-cluster-metadata-checker.md 
b/site/bps/BP-34-cluster-metadata-checker.md
index 1992d4c..d022fb0 100644
--- a/site/bps/BP-34-cluster-metadata-checker.md
+++ b/site/bps/BP-34-cluster-metadata-checker.md
@@ -51,13 +51,68 @@ message GetListOfEntriesOfALedgerRequest {
 message GetListOfEntriesOfALedgerResponse {
        required StatusCode status = 1;
        required int64 ledgerId = 2;
-       optional bytes availabilityOfEntriesOfLedger = 3; // treated as array 
of bits indicating availability of entry at that particular index location      
          
+       optional bytes availabilityOfEntriesOfLedger = 3; // explained below
 }
 ```
 
-For the sake of future extensibility it would be helpful to add version info 
(and possibly some metadata in the future) at the beginning of the 
'availabilityOfEntriesOfLedger'. So the first 64 bytes will be considered 
reserved space, with the first byte specifying the version for now and the rest 
of the bytes in the reserved space will be 0's.
+For ‘availabilityOfEntriesOfLedger’ we can use following condensed encoding 
format, which helps in reducing the number of bytes needed to represent the 
list of entries.
 
-Here Bookie is expected to attain this information just from just LedgerCache 
(Index Files) - IndexPersistenceMgr and IndexInMemPageMgr but it doesn’t 
actually check the availability of this entry in Entrylogger. Since the 
intention of this checker is limited to do just metadata validation at cluster 
level.
+(note: following representation is used just for understanding purpose, but 
this is not protobuf (or any other) representation)
+
+```
+AvailabilityOfEntriesOfLedger {
+       OrderedCollection sequenceGroup;
+}
+
+SequenceGroup {
+       long firstSequenceStart;
+       long lastSequenceStart;
+       int sequenceSize;
+       int sequencePeriod;
+}
+```
+
+Nomenclature:
+
+  - Continuous entries are grouped as a ’Sequence’.
+  - Number of continuous entries in a ‘Sequence’ is called ‘sequenceSize’.
+  - Gap between Consecutive sequences is called ‘sequencePeriod’.
+  - Consecutive sequences with same sequenceSize and same sequencePeriod in 
between consecutive sequences are grouped as a SequenceGroup.
+  - ‘firstSequenceStart’ is the first entry in the first sequence of the 
SequenceGroup.
+  - ‘lastSequenceStart’ is the first entry in the last sequence of the 
SequenceGroup.
+  - Ordered collection of such SequenceGroups will represent entries of a 
ledger residing in a bookie.
+
+(in the best case scenario there will be only one SequenceGroup, 
‘sequencePeriod’ will be ensembleSize and ‘sequenceSize’ will be 
writeQuorumSize of the ledger).
+
+for example,
+
+example 1 (best case scenario):
+
+1, 2, 4, 5, 7, 8, 10, 11
+
+in this case (1, 2), (4, 5), (7, 8), (10, 11) are sequences and in this 
scenario there happens to be just one SequenceGroup, which can be represented 
like
+
+{ firstSequenceStart - 1, lastSequenceStart - 10, sequenceSize - 2, 
sequencePeriod - 3 }
+
+example 2 (an entry is missing):
+
+1, 2, 3, 6, 7, 8, 11, 13, 16, 17, 18, 21, 22
+
+(entry 12 is missing and in the last sequence there are only 2 entries 21, 22)
+in this case (1, 2, 3), (6, 7, 8), (11), (13), (16, 17, 18), (21, 22) are the 
sequences
+so the sequence groups are
+
+{ firstSequenceStart - 1, lastSequenceStart - 6, sequenceSize - 3, 
sequencePeriod - 5 }, { firstSequenceStart - 11, lastSequenceStart - 13, 
sequenceSize - 1, sequencePeriod - 2 }, { firstSequenceStart - 16, 
lastSequenceStart - 16, sequenceSize - 3, sequencePeriod - 0 }, { 
firstSequenceStart - 21, lastSequenceStart - 21, sequenceSize - 2, 
sequencePeriod - 0 }
+
+As you can notice to represent a SequenceGroup, two long values and two int 
values are needed, so each SequenceGroup can be represented with (2 * 8 + 2 * 4 
= 24 bytes).
+
+In the ‘availabilityOfEntriesOfLedger’ byte array, for the sake of future 
extensibility it would be helpful to have reserved space for metadata at the 
beginning. So the first 64 bytes will be used for metadata, with the first four 
bytes specifying the int version number, next four bytes specifying the number 
of entries for now and the rest of the bytes in the reserved space will be 0's. 
The encoded format will be represented after the first 64 bytes. The ordered 
collection of SequenceGro [...]
+
+So for a ledger having thousands of entries, this condensed encoded format 
would need one or two SequenceGroups (in the best case, with no holes and no 
overreplication) 24/48 bytes, which would be much less than what is needed to 
represent using bit vector (array of bits indicating availability of entry at 
that particular index location)
+
+Any encoded format needs encoder and decoder at the sending/receiving ends of 
the channel, so the encoding/decoding logic should be handled optimally from 
computation and memory perspective.
+
+Here Bookie is expected to just attain index information (say from LedgerCache 
(Index Files) - IndexPersistenceMgr and IndexInMemPageMgr, unflushed entries in 
EntryMemTable in SortedLedgerStorage case and from rocksdb database in 
DBLedgerStorage case) but it doesn’t actually check the availability of this 
entry in Entrylogger. Since the intention of this checker is limited to do just 
metadata validation at cluster level.
 
 ### Compatibility, Deprecation, and Migration Plan

[bookkeeper] branch master updated: Introduce condensed encoding format for availabilityOfEntriesOfLedger

Reply via email to