abdullah alamoudi has posted comments on this change.

Change subject: Implemented LSM disk components alignment
......................................................................


Patch Set 17:

(24 comments)

half way through this but I think this is enough for this patch...

https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/CorrelatedPrefixMergePolicy.java
File 
asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/CorrelatedPrefixMergePolicy.java:

PS17, Line 94: if (numComponents != lsmIndex.getImmutableComponents().size()) {
             :                     throw new IllegalStateException("Secondary 
index " + lsmIndex
             :                             + " not having the same number of 
disk components with primary index " + index);
             :                 }
This will end up being thrown a lot. For example, if there is an secondary 
index on an open field, then a flush could produce a primary index only and not 
a secondary index!!


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetInfo.java
File 
asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetInfo.java:

PS17, Line 79: public synchronized ILSMIndex getPrimaryIndex() {
             :         Set<ILSMIndex> indexSet = getDatasetIndexes();
             :         for (ILSMIndex index : indexSet) {
             :             if (index.isPrimaryIndex()) {
             :                 return index;
             :             }
             :         }
             :         return null;
             :     }
What is the intended behavior of this method? can we add a comment?


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java
File 
asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java:

PS17, Line 411: Set<ILSMIndex> indexSet = dsInfo.getDatasetIndexes();
              :             ILSMIndex primaryIndex = dsInfo.getPrimaryIndex();
              :             if (primaryIndex.isCorrelated()) {
              :                 List<ILSMIOOperationGroup> operationGroups =
              :                         
LSMIOOperationGroup.create(dsInfo.getDatasetID(), indexSet, new 
LSMIOOperationGroupCallback());
              :                 for (ILSMIOOperationGroup operationGroup : 
operationGroups) {
              :                     
primaryIndex.getIOScheduler().scheduleOperationGroup(operationGroup);
              :                 }
              :             } else {
              :                 for (ILSMIndex lsmIndex : indexSet) {
              :                     ILSMIndexAccessor accessor =
              :                             
lsmIndex.createAccessor(NoOpOperationCallback.INSTANCE, 
NoOpOperationCallback.INSTANCE);
              :                     
accessor.scheduleFlush(lsmIndex.getIOOperationCallback());
              :                 }
This kind of code with if else is bad. essentially, the code should be the same 
for all the merge policies using the merge policy interface.

The difference between correlated and not correlated should only be inside the 
merge policy code only.


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/PrimaryIndexOperationTracker.java
File 
asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/PrimaryIndexOperationTracker.java:

PS17, Line 158: if (primaryIndex.isCorrelated()) {
              :             List<ILSMIOOperationGroup> operationGroups =
              :                     
LSMIOOperationGroup.create(dsInfo.getDatasetID(), indexSet, new 
LSMIOOperationGroupCallback());
              :             for (ILSMIOOperationGroup operationGroup : 
operationGroups) {
              :                 
primaryIndex.getIOScheduler().scheduleOperationGroup(operationGroup);
              :             }
              :         } else {
              :             for (ILSMIndex lsmIndex : indexSet) {
              :                 //get resource
              :                 ILSMIndexAccessor accessor =
              :                         
lsmIndex.createAccessor(NoOpOperationCallback.INSTANCE, 
NoOpOperationCallback.INSTANCE);
              :                 //schedule flush after update
              :                 
accessor.scheduleFlush(lsmIndex.getIOOperationCallback());
              :             }
              :         }
we are doing this again here. this is extremely unmaintainable.


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationGroupCallback.java
File 
asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationGroupCallback.java:

PS17, Line 47:   if (failed) {
             :             operations.forEach(operation -> 
operation.setNewComponent(null));
             :         }
we set the component to null but the artifacts are already in disk????


PS17, Line 50:         LSMIOOpertionType type = 
operationGroup.getIOOpertionType();
             :         List<ILSMIndexAccessor> indexAccessors = 
operationGroup.getIndexAccessors();
             :         if (type == LSMIOOpertionType.FLUSH) {
             :             for (int i = 0; i < indexes.size(); i++) {
             :                 ILSMIndexAccessor accessor = 
indexAccessors.get(i);
             :                 accessor.commitFlush(operations.get(i));
             :             }
             :         } else if (type == LSMIOOpertionType.MERGE) {
             :             for (int i = 0; i < indexes.size(); i++) {
             :                 ILSMIndexAccessor accessor = 
indexAccessors.get(i);
             :                 accessor.commitMerge(operations.get(i));
             :             }
             :         }
             : 
             :         for (ILSMIndex index : indexes) {
             :             if (index.isPrimaryIndex()) {
             :                 index.getMergePolicy().diskComponentAdded(index, 
false);
             :                 break;
             :             }
             :         }
             :     }
             : }
what if one is committed and a failure happened when committing the others? a 
simple failure would be a system crash. then?


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeLocalResourceMetadata.java:

PS17, Line 64:  partition()
Storage in Hyracks has been designed to be be agnostic to storage partitions. 
partition info is in the resource info of the index and should be kept there!


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeWithBuddyLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalBTreeWithBuddyLocalResourceMetadata.java:

PS17, Line 78: partition()
same here!!!


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalRTreeLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/ExternalRTreeLocalResourceMetadata.java:

PS17, Line 72:  partition(
same!!


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java:

PS17, Line 92: partition(
same


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java:

PS17, Line 100:  partition()
same


PS17, Line 109: partition()
same


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java
File 
asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java:

PS17, Line 98:  partition()
same


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeDataflowHelper.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeDataflowHelper.java:

PS17, Line 76: ctx.getTaskAttemptId().getTaskId().getPartition()
same here and this is not correct. this is a task partition which is different 
from storage partition


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeWithBuddyDataflowHelper.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/ExternalBTreeWithBuddyDataflowHelper.java:

PS17, Line 87: (IMetadataPageManagerFactory) opDesc.getPageManagerFactory(),
             :                 
ctx.getTaskAttemptId().getTaskId().getPartition());
same here


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/LSMBTreeDataflowHelper.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/dataflow/LSMBTreeDataflowHelper.java:

PS17, Line 82:  ctx.getTaskAttemptId().getTaskId().getPartition());
same here


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/ExternalBTree.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/ExternalBTree.java:

PS17, Line 103:  partition)
same here!


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/LSMBTreeFlushOperation.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/apache/hyracks/storage/am/lsm/btree/impls/LSMBTreeFlushOperation.java:

PS17, Line 43:  private volatile ILSMDiskComponent newComponent;
             :     private volatile ILSMIOOperationGroup operationGroup;
just out of curiosity, why are those volatile?


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroup.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroup.java:

add javadocs?


PS17, Line 31: public String getDatasetPartitionID();
doesn't belong to hyracks. This should be moved to asterixdb


PS17, Line 37: public List<ILSMIndexAccessor> getIndexAccessors();
you can get those from the indexes


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroupCallback.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationGroupCallback.java:

PS17, Line 26: afterOperationGroup(
since it is a single call, can we rename it?


https://asterix-gerrit.ics.uci.edu/#/c/1725/17/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndex.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndex.java:

PS17, Line 140: boolean isCorrelated();
this is not right


PS17, Line 144: int getPartition();
this is not right


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1725
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I64bf34e255def72adc73b9f87cfa628a172ea694
Gerrit-PatchSet: 17
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to