Luo Chen has posted comments on this change.
Change subject: Avoid always merging old components in prefix policy
......................................................................
Patch Set 7:
> A revised version that fixes a bug and skips looking ahead earlier:
>
> private Pair<Integer, Integer>
> getMergableComponentsIndex(List<ILSMDiskComponent>
> immutableComponents) {
> int numComponents = immutableComponents.size();
> for (int i = 0; i < numComponents; i++) {
> long startComponentSize = immutableComponents.get(i).getComponentSize();
> if (immutableComponents.get(i).getComponentSize() >
> maxMergableComponentSize
> || immutableComponents.get(i).getState() !=
> ComponentState.READABLE_UNWRITABLE)
> {
> continue;
> }
> long totalSize = startComponentSize;
> int j = i + 1;
> boolean mergeable = true;
> for (; j < numComponents; j++) {
> long componentSize = immutableComponents.get(j).getComponentSize();
> if (componentSize > maxMergableComponentSize
> || immutableComponents.get(j).getState() !=
> ComponentState.READABLE_UNWRITABLE)
> {
> // skip unmergeable components if any
> break;
> }
> totalSize += componentSize;
> mergeable =
> startComponentSize < MAX_MERGABLE_COMPONENT_SIZE_RATIO * (totalSize
> - startComponentSize);
> if (totalSize > maxMergableComponentSize && mergeable) {
> // If the ratio check passes and total size exceeds the threshold,
> we return this sequence
> return Pair.of(i, j);
> }
> // Stops search if the ratio threshold cannot be met.
> if (startComponentSize >= MAX_MERGABLE_COMPONENT_SIZE_RATIO
> * (totalSize + componentSize * (numComponents - j - 1) -
> startComponentSize)) {
> break;
> }
> }
> if (j != numComponents) {
> continue;
> }
> if (numComponents - i >= maxToleranceComponentCount && mergeable) {
> // If it's the last component, component count exceeds the
> threshold, the ratio check passes,
> // then we return this sequence.
> return Pair.of(i, numComponents - 1);
> }
> // if we reach the last component, but there are not enough
> components to merge,
> // then we don't need to check i+1 th component
> if (numComponents - i <= maxToleranceComponentCount) {
> return null;
> }
> }
> return null;
> }
Looks good to me, and I adopted this new change (though Sonar will again
complain about this complicated loop...)
BTW, the complexity of this policy is not as scary as O(n log n). Actually, n
is not the number of disk components, but rather the number of mergeable
components (and we'll first skip all unmergeable components). Once all
parameters are fixed, n would be a small constant, roughly calculated as
O(maxToleranceCount + log(maxCmpSize/flushCmpSize) ).
--
To view, visit https://asterix-gerrit.ics.uci.edu/1818
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I464da3fed38cded0aee7b319a35664eae069a2ba
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: No