Luo Chen has posted comments on this change.

Change subject: Avoid always merging old components in prefix policy
......................................................................


Patch Set 6:

(3 comments)

https://asterix-gerrit.ics.uci.edu/#/c/1818/6/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java
File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java:

PS6, Line 201: if (mergableIndexes != null) {
             :             return mergableIndexes.getRight() - mergableIndexes.getLeft() + 1;
             :         } else {
             :             return 0;
             :         }
> return mergeableIndexes == null? 0: mergableIndexes.getRight() - mergableIn
Done


PS6, Line 248: for (int i = startIndex; i <= endIndex; i++) {
             :             mergableComponents.add(immutableComponents.get(i));
             :         }
> mergableComponents.addAll(immutableComponents.subList(startIndex, endIndex+
Done


PS6, Line 273:  private Pair<Integer, Integer> getMergableComponentsIndex(List<ILSMDiskComponent> immutableComponents)
> Oh, we shouldn't have the resultFromFlush flag because we don't always have
I'm not sure I fully understand this, but the idea sounds like a 
specialization of a level-based merge policy, where we only merge disk 
components within the same level. For example, newly flushed components would 
be in level 1, components produced by one round of merging would be in level 2, 
and so on. Disk components would also be ordered by level. The level 
information could be stored in the component metadata after each flush/merge.
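
To make the level idea concrete, here is a rough sketch of how I read it. It 
is not code from this change: the Component class, the maxPerLevel parameter, 
and the "merge a level once it holds maxPerLevel components" trigger are all 
assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LevelMergeSketch {
    // Hypothetical stand-in for a disk component; not an AsterixDB class.
    static final class Component {
        final int level;
        final long size;

        Component(int level, long size) {
            this.level = level;
            this.size = size;
        }
    }

    // Components are ordered oldest to youngest. Returns the inclusive
    // {start, end} range of the youngest run of same-level components that
    // holds at least maxPerLevel components, or null if there is none.
    static int[] findMergeRange(List<Component> comps, int maxPerLevel) {
        int end = comps.size() - 1;
        while (end >= 0) {
            int level = comps.get(end).level;
            int start = end;
            while (start > 0 && comps.get(start - 1).level == level) {
                start--;
            }
            if (end - start + 1 >= maxPerLevel) {
                return new int[] { start, end };
            }
            end = start - 1;
        }
        return null;
    }

    // A flush adds a level-1 component, then same-level runs are merged
    // (assumed trigger) until no level holds maxPerLevel components.
    static void flush(List<Component> comps, long size, int maxPerLevel) {
        comps.add(new Component(1, size));
        int[] range;
        while ((range = findMergeRange(comps, maxPerLevel)) != null) {
            List<Component> run = comps.subList(range[0], range[1] + 1);
            int level = run.get(0).level;
            long total = 0;
            for (Component c : run) {
                total += c.size;
            }
            run.clear(); // removes the merged run from comps
            comps.add(range[0], new Component(level + 1, total));
        }
    }

    public static void main(String[] args) {
        List<Component> comps = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            flush(comps, 1L, 2);
        }
        for (Component c : comps) {
            System.out.println("level=" + c.level + " size=" + c.size);
        }
    }
}
```

With maxPerLevel = 2, four flushes of size 1 end up as a single level-3 
component of size 4, since each merge only ever combines components of the 
same level.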

Developing a new merge policy will probably take more time, and it definitely 
needs more experiments to see whether it works better and to understand the 
side effects of having extra disk components at each level. I'll probably take 
a detailed look at this issue from a research perspective next Fall quarter (as 
I'll be starting a summer internship soon).

Thus, this change is more of a temporary fix (the "one line" fix suggested by 
Mike). As for the complexity of finding a mergeable sequence, consider the 
layout of the disk components. Say that after a while the system has 100 disk 
components (ordered from oldest to youngest). It is then almost always the case 
that the first 90 components or so (depending on the parameters) are too large 
and will be ignored by the policy (which is also the behavior of the previous 
prefix policy). The policy then only examines the remaining 10 components or 
so, which wouldn't take much time.
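
The 100-component example can be sketched as below. This only illustrates the 
cost argument and is not the policy code: firstCandidateIndex, the sizes array, 
and the maxSize threshold are hypothetical names and values.

```java
class ScanCostSketch {
    // sizes is ordered oldest to youngest. Skips the leading components that
    // are larger than maxSize (assumed threshold) and returns the index of
    // the first component the policy would actually examine.
    static int firstCandidateIndex(long[] sizes, long maxSize) {
        int i = 0;
        while (i < sizes.length && sizes[i] > maxSize) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        long[] sizes = new long[100];
        for (int i = 0; i < sizes.length; i++) {
            // 90 old, large components followed by 10 young, small ones
            sizes[i] = i < 90 ? 1000L : 1L;
        }
        int first = firstCandidateIndex(sizes, 100L);
        System.out.println("examined: " + (sizes.length - first));
    }
}
```

Skipping a too-large component is a single size comparison, so the real work is 
limited to the handful of small components at the young end.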


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1818
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I464da3fed38cded0aee7b319a35664eae069a2ba
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes
