Yingyi Bu has posted comments on this change. Change subject: ASTERIXDB-1566, ASTERIXDB-1733: Hash Group By and Hash Join conform to the memory budget ......................................................................
Patch Set 60: (10 comments) https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java: Line 330: BigInteger groupByPrimeNumberTableSize = BigInteger.valueOf(finalGroupByTableCardinality).nextProbablePrime(); nextProbablePrime is very expensive. I tried on my local machine and it usually takes 5 to 10 ms. Can you just use finalGroupByTableCardinality? https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/HashSpillableTableFactory.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/HashSpillableTableFactory.java: Line 60: private static final int MIN_OUTPUT_FRAME_LIMT = 1; MIN_OUTPUT_FRAME_LIMIT =1 --> OUTPUT_FRAME_COUNT = 1 https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupOperatorDescriptor.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/external/ExternalGroupOperatorDescriptor.java: Line 70: * Minimum of 4 frames: 1 for input records, 1 for output, and 2 for hash table (1 header and 1 content) If minimum is 4, the condition should frameLimit <= 3. Correct? https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java: Line 211: LOGGER.fine("InMemoryHashJoin has finished using " + nFrames + " frames for Thread ID " Check logging level so that we don't always do string concatenation. https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/SimpleSerializableHashTable.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/SimpleSerializableHashTable.java: Line 490: bytes[offset++] = (byte) (value); > MAJOR SonarQube violation: Address this? Remove the last ++. https://asterix-gerrit.ics.uci.edu/#/c/1056/60/hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/unit/AbstractExternalGroupbyTest.java File hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/unit/AbstractExternalGroupbyTest.java: Line 180: int frameSize = 256; Is 50 too large? 101*8=808. CEIL(808/256) = 4. Line 189: int tableSize = 101; Is 50 too large? 101*8=808. CEIL(808/256) = 4. Line 192: int minDataSize = frameSize * 80; Is 80 too large? Line 202: int frameSize = 256; Is 50 too large? 101*8=808. CEIL(808/256) = 4. Line 203: int minDataSize = frameSize * 80; Is 80 too large? -- To view, visit https://asterix-gerrit.ics.uci.edu/1056 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2b323e9a2141b4c1dd1652a360d2d9354d3bc3f5 Gerrit-PatchSet: 60 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Taewoo Kim <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Michael Blow <[email protected]> Gerrit-Reviewer: Steven Jacobs <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Wenhai Li <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-HasComments: Yes
