[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507257#comment-14507257 ]

Hive QA commented on HIVE-9824:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727207/HIVE-9824.09.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3527/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3527/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3527/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727207 - PreCommit-HIVE-TRUNK-Build

LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
--
Key: HIVE-9824
URL: https://issues.apache.org/jira/browse/HIVE-9824
Project: Hive
Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch, HIVE-9824.09.patch

Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
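To make the issue description concrete, here is a minimal sketch of the difference between a pass-through operator and a native vectorized one. It is not Hive code: the class and method names are hypothetical, the batch is reduced to a single `long[]` key column, and a sorted array with binary search stands in for whatever primitive-keyed hash table the real operators use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; none of these names exist in Hive. */
final class PassThroughVsNativeSketch {

    /** Pass-through style: every batch row is first turned into an Object[] row. */
    static int countMatchesRowMode(long[] keyColumn, int size, Set<Long> smallTableKeys) {
        int matches = 0;
        for (int i = 0; i < size; i++) {
            // One array allocation and one boxed Long per row.
            Object[] row = new Object[] {keyColumn[i]};
            if (smallTableKeys.contains(row[0])) {
                matches++;
            }
        }
        return matches;
    }

    /** Native style: probe straight from the primitive column, no per-row objects. */
    static int countMatchesNative(long[] keyColumn, int size, long[] sortedSmallTableKeys) {
        int matches = 0;
        for (int i = 0; i < size; i++) {
            // Binary search is an assumed stand-in for a primitive hash lookup.
            if (Arrays.binarySearch(sortedSmallTableKeys, keyColumn[i]) >= 0) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        long[] keyColumn = {1, 2, 3, 2};
        Set<Long> boxedKeys = new HashSet<>(Arrays.asList(2L, 5L));
        long[] primitiveKeys = {2, 5};
        System.out.println(countMatchesRowMode(keyColumn, keyColumn.length, boxedKeys));
        System.out.println(countMatchesNative(keyColumn, keyColumn.length, primitiveKeys));
    }
}
```

Both methods compute the same answer; the point is only that the second one never leaves the column representation.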
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508014#comment-14508014 ]

Matt McCline commented on HIVE-9824:

Added 2 new JIRAs, as [~sershe] requested:

HIVE-10448: Consider replacing BytesBytesMultiHashMap with new fast hash table code of Native Vector Map Join
HIVE-10449: LLAP: Make new fast hash table for Native Vector Map Join work with Hybrid Grace
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504946#comment-14504946 ]

Hive QA commented on HIVE-9824:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726853/HIVE-9824.07.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3512/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726853 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505186#comment-14505186 ]

Vikram Dixit K commented on HIVE-9824:

[~mmccline] The latest patch doesn't apply on trunk anymore.
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505656#comment-14505656 ]

Matt McCline commented on HIVE-9824:

I switched over to using https://github.com/apache/hive from using git://git.apache.org/hive.git because of the "read error: Connection reset by peer" problem.

I did notice something odd when I generated the review board patch with this command line:
{noformat}
git diff --no-ext-diff HEAD^ > review_board_patch_07.txt
{noformat}
and the actual patch with this command line:
{noformat}
git diff --no-ext-diff --no-prefix HEAD^ > HIVE-9824.07.patch
{noformat}
The two files had the same length, when they usually have different lengths.
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505716#comment-14505716 ]

Sergey Shelukhin commented on HIVE-9824:

+1. Can you file follow-up JIRAs for replacing the hashtable, and also for making hybrid work in all cases (if still needed)?
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505696#comment-14505696 ]

Matt McCline commented on HIVE-9824:

Actually, they are exactly 1000 bytes different...
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506061#comment-14506061 ]

Hive QA commented on HIVE-9824:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726989/HIVE-9824.08.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726989 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501804#comment-14501804 ]

Matt McCline commented on HIVE-9824:

Unfortunately, the nature of vectorization is that the moment you try to abstract and encapsulate to reduce duplication, you heavily impact performance. Each of the cases needs to be expanded out for good performance.

Each of the vector join algorithms now has a match phase that collects equal key series and remembers the small table information, followed by a finish phase that outputs the join results. So the code is actually a lot cleaner than it used to be when those phases were wound together :)

I just coded making the join algorithms use the string templates in GenVectorCode. It is by far the goriest one I've seen. I'm not sure it is an improvement, so I'm holding it back...
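For readers following the thread, the match/finish split described above can be sketched roughly as below. This is a simplified illustration, not the patch's code: the class name, the `joinBatch` method, the `long`-only keys and the `Map<Long, long[]>` small table are all assumptions made for the example, and it covers only an inner join.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a two-phase inner join over one batch of long keys. */
final class TwoPhaseJoinSketch {

    /** Returns output rows as {bigTableRowIndex, smallTableValue}. */
    static List<long[]> joinBatch(long[] keys, int size, Map<Long, long[]> smallTable) {
        // Match phase: find series of equal adjacent keys, look each series up
        // once, and remember the small-table values for the series that hit.
        int[] seriesStart = new int[size];
        int[] seriesLength = new int[size];
        long[][] seriesValues = new long[size][];
        int seriesCount = 0;
        int i = 0;
        while (i < size) {
            int start = i;
            long key = keys[i];
            while (i < size && keys[i] == key) {
                i++;
            }
            long[] values = smallTable.get(key);
            if (values != null) {
                seriesStart[seriesCount] = start;
                seriesLength[seriesCount] = i - start;
                seriesValues[seriesCount] = values;
                seriesCount++;
            }
        }

        // Finish phase: emit join results for every remembered series.
        List<long[]> output = new ArrayList<>();
        for (int s = 0; s < seriesCount; s++) {
            int end = seriesStart[s] + seriesLength[s];
            for (int row = seriesStart[s]; row < end; row++) {
                for (long value : seriesValues[s]) {
                    output.add(new long[] {row, value});
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<Long, long[]> smallTable = new HashMap<>();
        smallTable.put(7L, new long[] {70, 71});
        smallTable.put(9L, new long[] {90});
        long[] keys = {7, 7, 8, 9};
        for (long[] r : joinBatch(keys, keys.length, smallTable)) {
            System.out.println("big-table row " + r[0] + " joins small-table value " + r[1]);
        }
    }
}
```

Keeping the lookup out of the output loop means a run of identical keys costs one probe, which is one plausible reason for separating the phases.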
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501138#comment-14501138 ]

Hive QA commented on HIVE-9824:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726343/HIVE-9824.04.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8743 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3492/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3492/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3492/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726343 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499497#comment-14499497 ]

Matt McCline commented on HIVE-9824:

Patch 02: Removed trailing whitespace and unneeded imports, changed ReduceRecordSource.processVectorGroup to copy the key by value instead of by reference, and removed accidentally left-in debugDisplayRow calls that affect performance.
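As a general illustration of why copying a key by value rather than by reference can matter, here is a small sketch. It assumes the usual hazard, a producer that reuses one buffer for every key it hands out; the thread does not say that this is what happened in ReduceRecordSource, and `ReusingReader` is a made-up class, not a Hive API.

```java
import java.util.Arrays;

/** Hypothetical illustration of aliasing a reused key buffer. */
final class KeyCopySketch {

    /** Made-up reader that returns the same backing array on every call. */
    static final class ReusingReader {
        private final byte[] buffer = new byte[1];
        private byte next = 1;

        byte[] nextKey() {
            buffer[0] = next++;
            return buffer; // the caller sees the reader's own array
        }
    }

    public static void main(String[] args) {
        ReusingReader reader = new ReusingReader();

        byte[] byReference = reader.nextKey();                           // aliases the buffer
        byte[] byValue = Arrays.copyOf(byReference, byReference.length); // private copy

        reader.nextKey(); // the reader overwrites its buffer with the next key

        System.out.println("held by reference: " + byReference[0]);
        System.out.println("held by value:     " + byValue[0]);
    }
}
```

The reference-held key silently changes when the reader advances, while the copied key keeps the value it had when it was saved.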
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499765#comment-14499765 ]

Matt McCline commented on HIVE-9824:

Well, to add to the mystery, when I run that query with vectorization on and on MR (i.e. no native vector map join, since we only do Tez), I get the following exception!

{noformat}
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:159)
    ... 58 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1037)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:995)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1162)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.init(VectorFilterOperator.java:54)
    ... 63 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1037)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:995)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1162)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1013)
    ... 67 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:290)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1017)
    ... 71 more
{noformat}
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499599#comment-14499599 ]

Gopal V commented on HIVE-9824:
---

A quick benchmark says that this makes simple map-joins ~5x faster: 280s -> 54s.
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499712#comment-14499712 ]

Gopal V commented on HIVE-9824:
---

[~mmccline]: Here's a simplified test-case

{code}
explain
select s_state, count(1)
from store_sales, store, date_dim
where store_sales.ss_sold_date_sk = date_dim.d_date_sk
  and store_sales.ss_store_sk = store.s_store_sk
  and store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
group by s_state
order by s_state
limit 100;
{code}
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499637#comment-14499637 ]

Hive QA commented on HIVE-9824:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726114/HIVE-9824.02.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8730 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3473/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3473/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3473/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12726114 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499707#comment-14499707 ]

Matt McCline commented on HIVE-9824:

[~gopalv] thanks for running quick performance tests.

As for the "Duplicate column 3 in ordered column map" error: the code assumes the big table retain output column mapping is unique. Maybe this is a wrong assumption.
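To make the uniqueness assumption concrete, here is a minimal sketch of an ordered column map that rejects a column retained twice. It is hypothetical throughout: OrderedColumnMapSketch, add and orderedInputColumns are illustration names, not the class in the patch, and keying the map by big-table input column is an assumption. Only the error text is taken from the comment above.

```java
import java.util.Arrays;
import java.util.TreeMap;

/** Hypothetical sketch of an ordered column map; not the class from the patch. */
class OrderedColumnMapSketch {
    // Assumption: keyed by big-table input column, so iteration is in column order.
    private final TreeMap<Integer, Integer> inputToOutput = new TreeMap<>();

    /** Records inputColumn -> outputColumn; a repeated input column is rejected. */
    void add(int inputColumn, int outputColumn) {
        if (inputToOutput.containsKey(inputColumn)) {
            throw new IllegalStateException(
                "Duplicate column " + inputColumn + " in ordered column map");
        }
        inputToOutput.put(inputColumn, outputColumn);
    }

    /** Returns the retained input columns in ascending order. */
    int[] orderedInputColumns() {
        int[] result = new int[inputToOutput.size()];
        int i = 0;
        for (int column : inputToOutput.keySet()) {
            result[i++] = column;
        }
        return result;
    }

    public static void main(String[] args) {
        OrderedColumnMapSketch map = new OrderedColumnMapSketch();
        map.add(3, 0);
        map.add(1, 1);
        System.out.println(Arrays.toString(map.orderedInputColumns()));
        try {
            map.add(3, 2); // the same input column retained for a second output column
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the same big-table column can legitimately be retained more than once, a map from one input column to a list of output columns would be one way to relax the assumption; whether the patch does that is not stated in this thread.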
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500379#comment-14500379 ]

Sergey Shelukhin commented on HIVE-9824:

Never mind, I see it: https://reviews.apache.org/r/33281/
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500535#comment-14500535 ]

Mostafa Mokhtar commented on HIVE-9824:

[~mmccline] These failures are unrelated:

{code}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{code}
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501033#comment-14501033 ]

Sergey Shelukhin commented on HIVE-9824:

I'd like to re-read the patch with (3) above, too, if possible... Also, I didn't review the correctness of the join type code at all.
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501032#comment-14501032 ]

Sergey Shelukhin commented on HIVE-9824:

Skimmed the entire patch. Aside from minor comments, another thing is that there's a lot of duplicated code across many classes... Is it possible to do some cleanup?
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501036#comment-14501036 ]

Sergey Shelukhin commented on HIVE-9824:

By the way, it would be good to document the formats too.
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501035#comment-14501035 ]

Sergey Shelukhin commented on HIVE-9824:

Also, I wonder what prevents the BytesBytes hashtable from being replaced by another permutation of the fast... tables (or an existing one that I missed?). I see that the fast tables have similar concepts and lots of similar code, a somewhat different binary format, and improvements like a separate key and value store. Is there anything missing?
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500994#comment-14500994 ]

Sergey Shelukhin commented on HIVE-9824:

I'm slowly getting through the patch. I am skimming lots of the parts of the code that look almost the same... 3 big questions so far:
1) What is the plan for hybrid join? I see this patch disables the hybrid join in code.
2) Someone like [~vikram.dixit] or [~ashutoshc] should probably review (or skim) the join variations. My brain quickly gave out trying to understand the code for all the options.
3) Can you write a short description of what code is where and in what order it is best read? The patch is huge...
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346389#comment-14346389 ]

Gunther Hagleitner commented on HIVE-9824:

like the title :-)