[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507257#comment-14507257
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727207/HIVE-9824.09.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3527/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3527/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3527/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727207 - PreCommit-HIVE-TRUNK-Build

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, 
 HIVE-9824.08.patch, HIVE-9824.09.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.
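
 To illustrate the difference the description is getting at, here is a minimal
 sketch in plain Java. It is not Hive code: LongKeyBatch, PassThroughVsNative,
 and both methods are hypothetical names, and the sorted-array probe merely
 stands in for whatever hash table the real operators use. The first method
 boxes every row into an Object[] the way a pass-through operator would; the
 second works on the primitive key column directly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a vectorized row batch holding one long key column. */
final class LongKeyBatch {
    final long[] keys; // primitive column vector of join keys
    final int size;    // number of valid rows in this batch

    LongKeyBatch(long[] keys, int size) {
        this.keys = keys;
        this.size = size;
    }
}

final class PassThroughVsNative {
    /** Pass-through style: each row is boxed into an Object[] before the row-mode lookup. */
    static int countMatchesRowMode(LongKeyBatch batch, Set<Long> smallTableKeys) {
        int matches = 0;
        for (int i = 0; i < batch.size; i++) {
            Object[] row = new Object[] { batch.keys[i] }; // allocation and boxing per row
            if (smallTableKeys.contains(row[0])) {
                matches++;
            }
        }
        return matches;
    }

    /** Native style: probe straight from the primitive column, with no per-row objects. */
    static int countMatchesBatchMode(LongKeyBatch batch, long[] sortedSmallTableKeys) {
        int matches = 0;
        for (int i = 0; i < batch.size; i++) {
            if (Arrays.binarySearch(sortedSmallTableKeys, batch.keys[i]) >= 0) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        LongKeyBatch batch = new LongKeyBatch(new long[] {1L, 2L, 3L, 4L}, 4);
        Set<Long> boxedKeys = new HashSet<>(Arrays.asList(2L, 4L));
        System.out.println(countMatchesRowMode(batch, boxedKeys));
        System.out.println(countMatchesBatchMode(batch, new long[] {2L, 4L}));
    }
}
```

 Both methods return the same count; the point of the sketch is only that the
 second one avoids the per-row Object[] allocation and boxing.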



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-22 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508014#comment-14508014
 ] 

Matt McCline commented on HIVE-9824:


Added two new JIRAs, as [~sershe] requested:

HIVE-10448: Consider replacing BytesBytesMultiHashMap with new fast hash table 
code of Native Vector Map Join

HIVE-10449: LLAP: Make new fast hash table for Native Vector Map Join work with 
Hybrid Grace




[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504946#comment-14504946
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726853/HIVE-9824.07.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3512/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726853 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505186#comment-14505186
 ] 

Vikram Dixit K commented on HIVE-9824:
--

[~mmccline] The latest patch doesn't apply on trunk anymore.



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505656#comment-14505656
 ] 

Matt McCline commented on HIVE-9824:


I switched from git://git.apache.org/hive.git to https://github.com/apache/hive 
because of the "read error: Connection reset by peer" problem.

I did notice something odd. I generated the Review Board patch with this 
command line:
{noformat}
git diff --no-ext-diff HEAD^ > review_board_patch_07.txt
{noformat}

and the actual patch with this command line:
{noformat}
git diff --no-ext-diff --no-prefix HEAD^ > HIVE-9824.07.patch
{noformat}

The two files had the same length, although they usually have different lengths.



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505716#comment-14505716
 ] 

Sergey Shelukhin commented on HIVE-9824:


+1. Can you file follow-up JIRAs for replacing the hashtable, and also for 
making hybrid work in all cases (if still needed)?



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505696#comment-14505696
 ] 

Matt McCline commented on HIVE-9824:


Actually, they differ by exactly 1000 bytes...



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506061#comment-14506061
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726989/HIVE-9824.08.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8750 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726989 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501804#comment-14501804
 ] 

Matt McCline commented on HIVE-9824:


Unfortunately, the nature of vectorization is that the moment you try to 
abstract and encapsulate to reduce duplication, you heavily impact performance. 
Each of the cases needs to be expanded out for good performance.

Each of the vector join algorithms now has a match phase that collects equal 
key series and remembers the small table information, followed by a finish 
phase that outputs the join results. So the code is actually a lot cleaner than 
it used to be when those phases were wound together :)

I just coded a version of the join algorithms that uses the string templates in 
GenVectorCode. It is by far the goriest one I've seen. I'm not sure it is an 
improvement, so I'm holding it back...
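
The match/finish split described above can be pictured with a small sketch. 
This is an illustration under assumptions, not the patch's code: 
MatchFinishSketch, Series, match, and finish are hypothetical names, the small 
table is modeled as a java.util.Map, and only an inner join over one long key 
column is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class MatchFinishSketch {
    /** A run of adjacent big-table rows with the same key, plus the small-table values found for it. */
    static final class Series {
        final int start;                // index of the first row in the run
        final int length;               // number of rows in the run
        final List<String> smallValues; // small-table values remembered for the key

        Series(int start, int length, List<String> smallValues) {
            this.start = start;
            this.length = length;
            this.smallValues = smallValues;
        }
    }

    /** Match phase: group equal adjacent keys and look each distinct run up once. */
    static List<Series> match(long[] keys, int size, Map<Long, List<String>> smallTable) {
        List<Series> matched = new ArrayList<>();
        int i = 0;
        while (i < size) {
            int start = i;
            long key = keys[i];
            while (i < size && keys[i] == key) {
                i++; // extend the equal-key series
            }
            List<String> values = smallTable.get(key);
            if (values != null) {
                matched.add(new Series(start, i - start, values));
            }
        }
        return matched;
    }

    /** Finish phase: emit one output row per (big-table row, small-table value) pair. */
    static List<String> finish(long[] keys, List<Series> matched) {
        List<String> rows = new ArrayList<>();
        for (Series s : matched) {
            for (int r = s.start; r < s.start + s.length; r++) {
                for (String v : s.smallValues) {
                    rows.add(keys[r] + "," + v);
                }
            }
        }
        return rows;
    }
}
```

Keeping the lookup in one loop and the output in another means the hash probe 
happens once per run of equal keys rather than once per row, which is one 
plausible reason for separating the phases.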



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501138#comment-14501138
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726343/HIVE-9824.04.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8743 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3492/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3492/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3492/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726343 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499497#comment-14499497
 ] 

Matt McCline commented on HIVE-9824:


Patch 02: Removed trailing whitespace and unneeded imports, changed 
ReduceRecordSource.processVectorGroup to copy the key by value instead of by 
reference, and removed accidentally left-in debugDisplayRow calls that affect 
performance.
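
For readers unfamiliar with why copying a key by value can matter, here is a 
generic Java sketch of the aliasing hazard. It is an assumption about the kind 
of problem involved, not the actual ReduceRecordSource.processVectorGroup code: 
KeyCopySketch, ReusingReader, and both remember methods are hypothetical.

```java
final class KeyCopySketch {
    /** Hypothetical reader that reuses a single buffer for every key it returns. */
    static final class ReusingReader {
        private final byte[] buffer = new byte[1];
        private final byte[] source;
        private int pos = 0;

        ReusingReader(byte[] source) {
            this.source = source;
        }

        /** Overwrites the shared buffer with the next key and returns that same buffer. */
        byte[] next() {
            buffer[0] = source[pos++];
            return buffer;
        }
    }

    /** Keeps a reference to the first key; the second next() call silently overwrites it. */
    static byte[] rememberByReference(ReusingReader reader) {
        byte[] firstKey = reader.next();
        reader.next();
        return firstKey;
    }

    /** Copies the first key, so the second next() call cannot change it. */
    static byte[] rememberByValue(ReusingReader reader) {
        byte[] firstKey = reader.next().clone();
        reader.next();
        return firstKey;
    }
}
```

With a source of {10, 20}, the by-reference version ends up holding the second 
key, while the by-value version still holds the first.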



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499765#comment-14499765
 ] 

Matt McCline commented on HIVE-9824:


Well, to add to the mystery, when I run that query with vectorization on and on 
MR (i.e., no native vector map join, since we only do Tez), I get the following 
exception!

{noformat}
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:159)
    ... 58 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1037)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:995)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1162)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.init(VectorFilterOperator.java:54)
    ... 63 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1037)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:995)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1162)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1013)
    ... 67 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:290)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1017)
    ... 71 more
{noformat}



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499599#comment-14499599
 ] 

Gopal V commented on HIVE-9824:
---

A quick benchmark says that this makes simple map-joins ~5x faster: 280s -> 54s.



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499712#comment-14499712
 ] 

Gopal V commented on HIVE-9824:
---

[~mmccline]: Here's a simplified test-case

{code}
explain select
s_state, count(1)
 from store_sales,
 store,
 date_dim
 where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
   store_sales.ss_store_sk = store.s_store_sk and
   store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
 group by s_state
 order by s_state
 limit 100;
{code}



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499637#comment-14499637
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726114/HIVE-9824.02.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8730 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3473/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3473/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3473/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726114 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499707#comment-14499707
 ] 

Matt McCline commented on HIVE-9824:


[~gopalv] thanks for running the quick performance tests.

As for "Duplicate column 3 in ordered column map": the code assumes the big 
table retain output column mapping is unique.  Maybe that assumption is wrong.
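
To make the assumption concrete, here is a minimal sketch of how an ordered 
column map can fail when one source column is mapped twice. It is not the code 
from the patch: OrderedColumnMapSketch, buildStrict and buildTolerant are 
hypothetical names, and only the error message text is taken from the report 
above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration only; not the class used in the patch. */
class OrderedColumnMapSketch {

    /** Strict variant: assumes every source column appears at most once. */
    static TreeMap<Integer, Integer> buildStrict(int[] sourceCols, int[] outputCols) {
        TreeMap<Integer, Integer> map = new TreeMap<>();
        for (int i = 0; i < sourceCols.length; i++) {
            // put() returns the previous value, so non-null means a duplicate key.
            if (map.put(sourceCols[i], outputCols[i]) != null) {
                throw new IllegalStateException(
                    "Duplicate column " + sourceCols[i] + " in ordered column map");
            }
        }
        return map;
    }

    /** Tolerant variant: one source column may feed several output columns. */
    static TreeMap<Integer, List<Integer>> buildTolerant(int[] sourceCols, int[] outputCols) {
        TreeMap<Integer, List<Integer>> map = new TreeMap<>();
        for (int i = 0; i < sourceCols.length; i++) {
            map.computeIfAbsent(sourceCols[i], k -> new ArrayList<>()).add(outputCols[i]);
        }
        return map;
    }

    public static void main(String[] args) {
        int[] sourceCols = { 3, 3, 5 };   // column 3 is retained twice
        int[] outputCols = { 0, 1, 2 };

        System.out.println(buildTolerant(sourceCols, outputCols));
        try {
            buildStrict(sourceCols, outputCols);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the retain mapping can legitimately repeat a column, something along the 
lines of the tolerant variant would be needed; whether that is the right fix 
here is an open question.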



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500379#comment-14500379
 ] 

Sergey Shelukhin commented on HIVE-9824:


Never mind, I see it: https://reviews.apache.org/r/33281/



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500535#comment-14500535
 ] 

Mostafa Mokhtar commented on HIVE-9824:
---

[~mmccline]

These failures are unrelated:
{code}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{code}



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501033#comment-14501033
 ] 

Sergey Shelukhin commented on HIVE-9824:


I'd like to re-read the patch with (3) above, too, if possible...
Also, I didn't review the correctness of the join type code at all.



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501032#comment-14501032
 ] 

Sergey Shelukhin commented on HIVE-9824:


Skimmed the entire patch. Aside from minor comments, another thing is that 
there's a lot of duplicated code in many classes... is it possible to do some 
cleanup?



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501036#comment-14501036
 ] 

Sergey Shelukhin commented on HIVE-9824:


Btw it would be good to document the formats too



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501035#comment-14501035
 ] 

Sergey Shelukhin commented on HIVE-9824:


Also, I wonder what prevents the BytesBytes hashtable from being replaced by 
another (or an existing one that I missed?) permutation of the fast... tables. 
I see that the fast tables have similar concepts and lots of similar code, a 
somewhat different binary format, and improvements like separate key and value 
stores. Is there anything missing?



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500994#comment-14500994
 ] 

Sergey Shelukhin commented on HIVE-9824:


I'm slowly getting through the patch. I am skimming a lot of the code that 
looks almost the same...
3 big questions so far:
1) What is the plan for hybrid join? I see this disables the hybrid join in 
code.
2) Someone like [~vikram.dixit] or [~ashutoshc] should probably review (or 
skim) the join variations. My brain quickly gave out trying to understand the 
code for all the options. 
3) Can you write a short description of what code is where and in what order 
it's best to read it? The patch is huge...



[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-03-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346389#comment-14346389
 ] 

Gunther Hagleitner commented on HIVE-9824:
--

like the title :-)
