abdullah alamoudi has posted comments on this change. Change subject: Implemented LSM disk components alignment ......................................................................
Patch Set 17: (24 comments) half way through this but I think this is enough for this patch... https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/CorrelatedPrefixMergePolicy.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/CorrelatedPrefixMergePolicy.java: PS17, Line 94: if (numComponents != lsmIndex.getImmutableComponents().size()) { : throw new IllegalStateException("Secondary index " + lsmIndex : + " not having the same number of disk components with primary index " + index); : } This will end up being thrown a lot. For example, if there is an secondary index on an open field, then a flush could produce a primary index only and not a secondary index!! https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetInfo.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetInfo.java: PS17, Line 79: public synchronized ILSMIndex getPrimaryIndex() { : Set<ILSMIndex> indexSet = getDatasetIndexes(); : for (ILSMIndex index : indexSet) { : if (index.isPrimaryIndex()) { : return index; : } : } : return null; : } What is the intended behavior of this method? can we add a comment? https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java: PS17, Line 411: Set<ILSMIndex> indexSet = dsInfo.getDatasetIndexes(); : ILSMIndex primaryIndex = dsInfo.getPrimaryIndex(); : if (primaryIndex.isCorrelated()) { : List<ILSMIOOperationGroup> operationGroups = : LSMIOOperationGroup.create(dsInfo.getDatasetID(), indexSet, new LSMIOOperationGroupCallback()); : for (ILSMIOOperationGroup operationGroup : operationGroups) { : primaryIndex.getIOScheduler().scheduleOperationGroup(operationGroup); : } : } else { : for (ILSMIndex lsmIndex : indexSet) { : ILSMIndexAccessor accessor = : lsmIndex.createAccessor(NoOpOperationCallback.INSTANCE, NoOpOperationCallback.INSTANCE); : accessor.scheduleFlush(lsmIndex.getIOOperationCallback()); : } This kind of code with if else is bad. essentially, the code should be the same for all the merge policies using the merge policy interface. The difference between correlated and not correlated should only be inside the merge policy code only. https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/PrimaryIndexOperationTracker.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/PrimaryIndexOperationTracker.java: PS17, Line 158: if (primaryIndex.isCorrelated()) { : List<ILSMIOOperationGroup> operationGroups = : LSMIOOperationGroup.create(dsInfo.getDatasetID(), indexSet, new LSMIOOperationGroupCallback()); : for (ILSMIOOperationGroup operationGroup : operationGroups) { : primaryIndex.getIOScheduler().scheduleOperationGroup(operationGroup); : } : } else { : for (ILSMIndex lsmIndex : indexSet) { : //get resource : ILSMIndexAccessor accessor = : lsmIndex.createAccessor(NoOpOperationCallback.INSTANCE, NoOpOperationCallback.INSTANCE); : //schedule flush after update : accessor.scheduleFlush(lsmIndex.getIOOperationCallback()); : } : } we are doing this again here. this is extremely unmaintainable. https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationGroupCallback.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationGroupCallback.java: PS17, Line 47: if (failed) { : operations.forEach(operation -> operation.setNewComponent(null)); : } we set the component to null but the artifacts are already in disk???? PS17, Line 50: LSMIOOpertionType type = operationGroup.getIOOpertionType(); : List<ILSMIndexAccessor> indexAccessors = operationGroup.getIndexAccessors(); : if (type == LSMIOOpertionType.FLUSH) { : for (int i = 0; i < indexes.size(); i++) { : ILSMIndexAccessor accessor = indexAccessors.get(i); : accessor.commitFlush(operations.get(i)); : } : } else if (type == LSMIOOpertionType.MERGE) { : for (int i = 0; i < indexes.size(); i++) { : ILSMIndexAccessor accessor = indexAccessors.get(i); : accessor.commitMerge(operations.get(i)); : } : } : : for (ILSMIndex index : indexes) { : if (index.isPrimaryIndex()) { : index.getMergePolicy().diskComponentAdded(index, false); : break; : } : } : } : } what if one is committed and a failure happened when committing the others? a simple failure would be a system crash. then? https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeLocalResourceMetadata.java: PS17, Line 64: partition() Storage in Hyracks has been designed to be be agnostic to storage partitions. partition info is in the resource info of the index and should be kept there! https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeWithBuddyLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeWithBuddyLocalResourceMetadata.java: PS17, Line 78: partition() same here!!! https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalRTreeLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalRTreeLocalResourceMetadata.java: PS17, Line 72: partition( same!! https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java: PS17, Line 92: partition( same https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java: PS17, Line 100: partition() same PS17, Line 109: partition() same https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java File asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java: PS17, Line 98: partition() same https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeDataflowHelper.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeDataflowHelper.java: PS17, Line 76: ctx.getTaskAttemptId().getTaskId().getPartition() same here and this is not correct. this is a task partition which is different from storage partition https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeWithBuddyDataflowHelper.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeWithBuddyDataflowHelper.java: PS17, Line 87: (IMetadataPageManagerFactory) opDesc.getPageManagerFactory(), : ctx.getTaskAttemptId().getTaskId().getPartition()); same here https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/LSMBTreeDataflowHelper.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/LSMBTreeDataflowHelper.java: PS17, Line 82: ctx.getTaskAttemptId().getTaskId().getPartition()); same here https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/ExternalBTree.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/ExternalBTree.java: PS17, Line 103: partition) same here! https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/LSMBTreeFlushOperation.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/LSMBTreeFlushOperation.java: PS17, Line 43: private volatile ILSMDiskComponent newComponent; : private volatile ILSMIOOperationGroup operationGroup; just out of curiosity, why are those volatile? https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroup.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroup.java: add javadocs? PS17, Line 31: public String getDatasetPartitionID(); doesn't belong to hyracks. This should be moved to asterixdb PS17, Line 37: public List<ILSMIndexAccessor> getIndexAccessors(); you can get those from the indexes https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroupCallback.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroupCallback.java: PS17, Line 26: afterOperationGroup( since it is a single call, can we rename it? https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndex.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndex.java: PS17, Line 140: boolean isCorrelated(); this is not right PS17, Line 144: int getPartition(); this is not right -- To view, visit https://asterix-gerrit.ics.uci.edu/1725 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I64bf34e255def72adc73b9f87cfa628a172ea694 Gerrit-PatchSet: 17 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Luo Chen <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Luo Chen <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
