Jianfeng Jia has posted comments on this change. Change subject: Implemented the memory-bounded HashGroupby and HashJoin for BigObject ......................................................................
Patch Set 10: (15 comments) https://asterix-gerrit.ics.uci.edu/#/c/398/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java File algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java: Line 252: //TODO(jianfeng) add the partial stage support > What does "partial stage" support mean here? I was taking this wrong. As we discussed earlier, it is directly supported by the given aggregation functions. Removed. Line 256: true); > why the group-by final stage bit is always true? ditto. removed. https://asterix-gerrit.ics.uci.edu/#/c/398/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/HybridHashJoinPOperator.java File algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/HybridHashJoinPOperator.java: Line 174: opDesc = new OptimizedHybridHashJoinOperatorDescriptor(spec, getMemSizeInFrames(), > Make this OptimizedHashJoin consistent with Hyrid one. Done https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/context/IHyracksFrameMgrContext.java File hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/context/IHyracksFrameMgrContext.java: Line 30: //TODO tobedeleted > Why is the default allocate() to be deleted? As we discussed previously, it's better to keep this interface. Comments removed. https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAccessor.java File hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAccessor.java: Line 151: sb.append("AException!"); > better error message. changed a little. This is not an exception message, it is used as a placeholder inside the record representation. Sometimes the deserialzation doesn't work because it didn't set previously. For debug purpose, we can just show the known part and bookmark the failed part will be OK. https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java: Line 24: public interface IDeletableFramePool extends IFramePool { > Change the name Done https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/ISpillableTable.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/ISpillableTable.java: Line 43: /** > why sortFrames() is gone? Yes, it uses recursively hash algorithm. https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupOperatorDescriptor.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupOperatorDescriptor.java: Line 66: ISpillableTableFactory spillableTableFactory, boolean isFinalStage) { > the parameter isOutputSorted is removed? Yes, the sorted option is removed. we are doing the hybrid hash algorithm for now. Later I will make a HashSort algorithm invented by Jarod into the system. The `isFinalStage` parameter is gone now. https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java: Line 118: public void join(IFrameTupleAccessor accessorProbe, int tid, IFrameWriter writer) throws HyracksDataException { > public-->private? It is also used by OptimizedHashJoin, so I change it to package private. https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractExternalSortRunGenerator.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractExternalSortRunGenerator.java: Line 67: switch (policy) { > have a FrameFreeSlotPolicyFactory to be a single place doing the dispatch? Done https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractFrameSorter.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractFrameSorter.java: Line 187: protected int compare(int tp1, int tp2) throws HyracksDataException { > protected --> protected final Done https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/ExternalSortRunGenerator.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/ExternalSortRunGenerator.java: Line 21: import java.nio.ByteBuffer; > unused import Done https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/RunAndMaxFrameSizePair.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/RunAndMaxFrameSizePair.java: Line 25: public IFrameReader run; > move maxFrameSize into RunAndMaxFrameSizePair Done. move the RunAndMaxFrameSizePair to GeneratedRunFileReader which implemented the RunFileReader https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/util/FrameTuplePairComparator.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/util/FrameTuplePairComparator.java: Line 65: public int compare(IFrameTupleAccessor accessor0, int tIndex0, ITupleBufferAccessor bufferAccessor) > why there is no tuple index for the bufferAccessor? There are some buffer managers whose frame structure is not as the common frame. (e.g.VariableDeletableTupleMemoryManager). It will create a special accessor, and reset the frame according to the given TuplePointer. I rename the ITupleBufferAccessor to ITupleBufferPointAccessor https://asterix-gerrit.ics.uci.edu/#/c/398/10/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/api/ITwoPCIndexBulkLoader.java File hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/api/ITwoPCIndexBulkLoader.java: Line 26: > ws Done -- To view, visit https://asterix-gerrit.ics.uci.edu/398 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I248f3a374fdacad7d57e49cf18d8233745e55460 Gerrit-PatchSet: 10 Gerrit-Project: hyracks Gerrit-Branch: master Gerrit-Owner: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Pouria Pirzadeh <[email protected]> Gerrit-Reviewer: Preston Carman <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-HasComments: Yes
