Taewoo Kim has posted comments on this change. Change subject: ASTERIXDB-1566, ASTERIXDB-1733: Hash Group By and Hash Join conform to the memory budget ......................................................................
Patch Set 50: (21 comments) Thanks Yingyi. https://asterix-gerrit.ics.uci.edu/#/c/1056/50/asterixdb/asterix-app/src/main/resources/asterix-build-configuration.xml File asterixdb/asterix-app/src/main/resources/asterix-build-configuration.xml: Line 58: <value>640KB</value> > Why do you change the sort memory? I have reset the value to the original one. Line 63: </property> > Is the group memory too large? Based on the expected table size calculation for test cases, the minimum number is 8 (256KB). I have adjusted that size. Line 66: <value>640KB</value> > Is the join memory too large? Based on the expected table size calculation for test cases, the minimum number is 8 (256KB). I have adjusted that size. https://asterix-gerrit.ics.uci.edu/#/c/1056/50/asterixdb/asterix-common/src/test/java/org/apache/asterix/test/aql/TestExecutor.java File asterixdb/asterix-common/src/test/java/org/apache/asterix/test/aql/TestExecutor.java: Line 1055: // Temp: to be deleted > Should you delete the system.out.println? Sorry. Done. https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java: Line 298: public static int calculateGroupByTableEntrySize(int memoryBudgetInBytes, int numberOfGroupByColumns, > calculateGroupByTableEntrySize --> calculateGroupByTableCardinality Name: changed. For the second part of your comments: for the optimized hybrid hash join, the hash table cardinality will be given after the build phase. That is, the number is exact. Thus, it doesn't need to have to call this method. https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/InMemoryHashJoin.java: Line 48: private final IHyracksTaskContext ctx; > MAJOR SonarQube violation: Done Line 48: private final IHyracksTaskContext ctx; > MAJOR SonarQube violation: Done Line 48: private final IHyracksTaskContext ctx; > address this? Done https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/OptimizedHybridHashJoin.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/join/OptimizedHybridHashJoin.java: Line 234: // Temp: > Remove temporary code Done Line 649: // Temp: debug purpose only. > Remove the temporary code? This will be useful for the future debugging. So, it might be worth while to keep this method. Line 651: StringBuffer buf = new StringBuffer(); > MAJOR SonarQube violation: Done Line 651: StringBuffer buf = new StringBuffer(); > MAJOR SonarQube violation: Done Line 651: StringBuffer buf = new StringBuffer(); > address this. Done Line 719: buf = null; > MAJOR SonarQube violation: Done Line 719: buf = null; > MAJOR SonarQube violation: Done https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/SimpleSerializableHashTable.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/SimpleSerializableHashTable.java: Line 482: + ((bytes[offset + 2] & 0xff) << 8) + ((bytes[offset + 3] & 0xff) << 0); > CRITICAL SonarQube violation: Done Line 482: + ((bytes[offset + 2] & 0xff) << 8) + ((bytes[offset + 3] & 0xff) << 0); > CRITICAL SonarQube violation: Done Line 482: + ((bytes[offset + 2] & 0xff) << 8) + ((bytes[offset + 3] & 0xff) << 0); > address this. Done Line 490: bytes[offset++] = (byte) (value); > address this. I think this SonarQube comment is wrong unless we don't want to use (pos * 4 + 1). https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/integration/TPCHCustomerOptimizedHybridHashJoinTest.java File hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/integration/TPCHCustomerOptimizedHybridHashJoinTest.java: Line 166: OptimizedHybridHashJoinOperatorDescriptor join = new OptimizedHybridHashJoinOperatorDescriptor(spec, 24, 122, > Why does it have to be larger? You are correct. Restored the original number. https://asterix-gerrit.ics.uci.edu/#/c/1056/50/hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/unit/AbstractExternalGroupbyTest.java File hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/src/test/java/org/apache/hyracks/tests/unit/AbstractExternalGroupbyTest.java: Line 190: int numFrames = 50; > why does it have to be 50? Because the frame size is only 256. For the table cardinality 101, we need to have at least more than 20 frames to accommodate the hash table in memory. The number of frames in the previous version is 3 because it didn't need to consider the hash table footprint in memory. Also, the minimum data size is 80 frames. So, the spill must happen. -- To view, visit https://asterix-gerrit.ics.uci.edu/1056 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2b323e9a2141b4c1dd1652a360d2d9354d3bc3f5 Gerrit-PatchSet: 50 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Taewoo Kim <wangs...@yahoo.com> Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu> Gerrit-Reviewer: Jianfeng Jia <jianfeng....@gmail.com> Gerrit-Reviewer: Michael Blow <mb...@apache.org> Gerrit-Reviewer: Steven Jacobs <sjaco...@ucr.edu> Gerrit-Reviewer: Taewoo Kim <wangs...@yahoo.com> Gerrit-Reviewer: Till Westmann <ti...@apache.org> Gerrit-Reviewer: Wenhai Li <lwhaym...@yahoo.com> Gerrit-Reviewer: Yingyi Bu <buyin...@gmail.com> Gerrit-HasComments: Yes