abdullah alamoudi has posted comments on this change. Change subject: [NO ISSUE][STO] Add consistency to flush lifecycle ......................................................................
Patch Set 16: (37 comments) https://asterix-gerrit.ics.uci.edu/#/c/2584/16/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/nc/NCAppRuntimeContext.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/app/nc/NCAppRuntimeContext.java: PS16, Line 185: AsynchronousScheduler > define as inner static class Done PS16, Line 188: warn > fatal or error Done PS16, Line 194: warn > fatal or error Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/metadata/MetadataTxnTest.java File asterixdb/asterix-app/src/test/java/org/apache/asterix/test/metadata/MetadataTxnTest.java: PS16, Line 284: final MetadataTransactionContext committedMdTxn = MetadataManager.INSTANCE.beginTransaction(); : MetadataManager.INSTANCE.addNodegroup(committedMdTxn, new NodeGroup(committedNodeGroup, ngNodes)); : MetadataManager.INSTANCE.commitTransaction(committedMdTxn); > why is this needed? Because otherwise, the primary index is not modified and so, the flush on line 288 will be ignored. https://asterix-gerrit.ics.uci.edu/#/c/2584/16/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/BaseOperationTracker.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/BaseOperationTracker.java: PS16, Line 40: : > Why FLUSH/MERGE is removed? They are now handled when the IO operation is scheduled, completed. we should probably do the same for replicate as well. https://asterix-gerrit.ics.uci.edu/#/c/2584/16/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationCallback.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIOOperationCallback.java: PS16, Line 54: public > private Done PS16, Line 55: public > private Done PS16, Line 65: ArrayDeque > Use Deque interface? Done PS16, Line 106: // check status of the flush > Remove Done PS16, Line 177: componentId > component LSN Done PS16, Line 181: ILSMComponent > ILSMDiskComponent Done PS16, Line 182: ILSMDiskComponent > Remove Done PS16, Line 215: // check status before updating > remove Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIndexIOOperationCallbackFactory.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/ioopcallbacks/LSMIndexIOOperationCallbackFactory.java: PS16, Line 38: protected > private Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperation.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperation.java: PS16, Line 140: @param operation > Remove Done PS16, Line 143: String getOperationGroupId(); > What's the usage of this method? More comments about this "group"? Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationCallback.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIOOperationCallback.java: PS16, Line 87: getOperationGroupId > is the group id always the index identifier? if so, I think we should remov This was here to allow for overriding the order enforcement and allow for example primary and secondary to be processed together. However, since it is not used. I am removing it. PS16, Line 87: operation > java docs Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndexOperationContext.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndexOperationContext.java: PS16, Line 112: void setMap(Map<String, Object> map); > I think we can still do something better explicitly for flush info in Aster Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractIoOperation.java: PS16, Line 87: getStatus > will state be accessed from anywhere that is not synchronized? Should be accessed by the IO operation thread only while it is running. Should be accessed by outside callers once it completes. PS16, Line 124: return; > why not rethrow? Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndex.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndex.java: PS16, Line 219: createAccessor(NoOpIndexAccessParameters.INSTANCE).scheduleFlush().sync(); > what if this triggers a merge? shouldn't we wait for it as well before deac Done. This behavior was there before... Not sure how we didn't see failures. PS16, Line 347: ScheduleFlushCalled > fix Done PS16, Line 363: Map<String, Object> map = ctx.getMap(); : if (map != null && !map.isEmpty()) { : flushCtx.setMap(new HashMap<>(map)); : } > refactor with the one below Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndexOperationContext.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndexOperationContext.java: PS16, Line 62: map > use a more meaning name, like parameterMap? Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMMemoryComponent.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMMemoryComponent.java: PS16, Line 126: case REPLICATE: : // why would replicate ever enter a memory component? > remove them Done PS16, Line 188: to allow for a re-attempt > maybe some day :) Done PS16, Line 245: // a pending schedule flush > replace this by when this can happen Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AsynchronousScheduler.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AsynchronousScheduler.java: PS16, Line 51: operation.getIOOpertionType() > make it a switch and it would be better to extract the handling for flush Done PS16, Line 68: PriorityQueue<ILSMIOOperation> q = new PriorityQueue<>(); > Why we need a priority queue here? It seems FIFO queue should be enough? That seems correct.... not sure why it was made a priority queue originally https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/IoOperationExecutor.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/IoOperationExecutor.java: PS16, Line 48: Runnable > remove Done PS16, Line 70: shutdown(); > why? Once a scheduler fails, it is unsafe to execute further operations. It is guarded against now by the callback.schedulerFailed call, but I think we should also stop accepting newer operations completely https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMComponentIdGenerator.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMComponentIdGenerator.java: PS16, Line 32: private int currentComponentIndex; > Why do we need to keep track of this? The idea is that primary and secondary components that are correlated/ have the same component id will always match the index of the component. This is now expected to hold at all times. It has the additional benefit that the flushed components sizes are predictable since primary and all secondaries will share the same virtual buffer cache at all times. This is to be used when creating a new index. (Think of the case where the existing indexes current memory component has index 1), and then a new index is added, this will be called and the new index will start with memory component with index 1. The only case this doesn't hold after this change is if there was a need to recover a flush operation but we can fix that later. https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/LSMIndexDiskComponentBulkLoader.java: PS16, Line 38: // Note that by using a flush target file name, we state that the : // new bulk loaded component is "newer" than any other merged component. > This comment can be removed? Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/NoOpIoOperationFailedCallback.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/NoOpIoOperationFailedCallback.java: PS16, Line 28: public > private Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java File hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/buffercache/BufferCache.java: PS16, Line 1469: return; > remove Done https://asterix-gerrit.ics.uci.edu/#/c/2584/16/hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java File hyracks-fullstack/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/ExitUtil.java: PS16, Line 40: EXIT_CODE_IO_SCHEDULER_FAILED > Either use EC_ or rename the rest Done -- To view, visit https://asterix-gerrit.ics.uci.edu/2584 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b Gerrit-PatchSet: 16 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: abdullah alamoudi <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Luo Chen <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
