Luo Chen has posted comments on this change. Change subject: Avoid always merging old components in prefix policy ......................................................................
Patch Set 5: (7 comments) > (7 comments) > > some minor comments. https://asterix-gerrit.ics.uci.edu/#/c/1818/4/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java: Line 49: */ > maybe it's good to also put the source of this magic number? Done, it's taken from HBase compaction policy https://www.ngdata.com/visualizing-hbase-flushes-and-compactions/ Line 143: List<ILSMDiskComponent> immutableComponents = new ArrayList<>(index.getImmutableComponents()); > could you explain the reason of the reverse? In the original list, the components are ordered from newest to oldest. Reversing it makes the order from oldest to newest, and simplifies the logic of prefix policy. Line 256: } > I think it's better to move this big chunk of comment before the function a Done Line 266: * ratio * total size of the younger components in the sequence, schedule a merge of all sequences. > just curious, to make sure the state check is consistent, is this function The merge policy is called by the exitComponent method of the LSMHarness (when a new disk component is created because of flush/merge/load). The call of the merge policy is always guarded by the synchronized block with opTracker. Line 276: continue; > if it's a suspicious state, it's better to add a warning log. Hmm...I think some time this could happen (as a proper state). 1. When we call check isMergeLagging, we could see newer components are being merged; 2. totalSize is just an estimation of the size of the result component. It is possible that the result size < maxMergeableSize (but totalSize > maxMergeableSize...) Line 284: || immutableComponents.get(j).getState() != ComponentState.READABLE_UNWRITABLE) { > no need to use ternary? Done Line 289: if (startComponentSize == 0) { > it feels like we should treat two part of the && differently for the else c It makes sense to separate these two conditions. I also separated two canMerge conditions (>maxMergableSize or >maxToleranceCount) to make the logic more complete. -- To view, visit https://asterix-gerrit.ics.uci.edu/1818 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I464da3fed38cded0aee7b319a35664eae069a2ba Gerrit-PatchSet: 5 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Luo Chen <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Luo Chen <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
