Yingyi Bu has posted comments on this change. Change subject: Implemented the memory-bounded HashGroupby and HashJoin for BigObject ......................................................................
Patch Set 18: (11 comments) https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAccessor.java File hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAccessor.java: Line 149: sb.append("FailedToDeserializeField" + j); "FailedToDeserializeField" -> "Failed to deserialize field " https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AtMostOneFrameForSpilledPartitionPolicy.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AtMostOneFrameForSpilledPartitionPolicy.java: Line 24: public class AtMostOneFrameForSpilledPartitionPolicy { It's a bit difficult for me to understand this class name. Line 45: // try to flush the in memory partition. "the" --> "an" Line 65: } It looks the code in findInMemPartitionWithMaxMemoryUsage and findSpilledPartitionWithMaxMemoryUsage can be shared by passing in a lamda expression? Maybe add an private method for that and the two methods call the private method? https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java: Line 93: * Insert the tuple from the tupleAccessor into the already spilled partition. The spilled ones can occupy at most one frame. What does the last sentence mean here? Line 132: * Clear the particular partition. "Clear" --> "Clear the memory occupation of " https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/VPartitionTupleBufferManager.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/VPartitionTupleBufferManager.java: Line 104: partitionArray[partitionId].reset(); Should we reserve one empty flushed page in partitionArray[partitionId].reset() as far as we can, to avoid thrashing? For example, if buffers.size()>0, remove buffers.size()-1 buffers and reserves 1. if buffers.size()==1, remove it. https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/HashSpillableTableFactory.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/HashSpillableTableFactory.java: Line 1: /* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file license header changed https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupWriteOperatorNodePushable.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupWriteOperatorNodePushable.java: Line 96: writer.fail(); In case there is a failure in its data consumer, we should still cleanup leftover run files if any. https://asterix-gerrit.ics.uci.edu/#/c/398/18/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/OptimizedHybridHashJoin.java File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/OptimizedHybridHashJoin.java: Line 101: private TuplePointer tempPtr = new TuplePointer(); // this is a reusable object to store the pointer,which is not used anywhere, WS Line 241: private void flushAndClearSpilledPartition(SIDE whichSide) throws HyracksDataException { "flushAndClearSpilledPartition"-->"closeAllSpilledPartitions"? -- To view, visit https://asterix-gerrit.ics.uci.edu/398 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I248f3a374fdacad7d57e49cf18d8233745e55460 Gerrit-PatchSet: 18 Gerrit-Project: hyracks Gerrit-Branch: master Gerrit-Owner: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Pouria Pirzadeh <[email protected]> Gerrit-Reviewer: Preston Carman <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-HasComments: Yes
