[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2
{code}

After copying the diff into JIRA, the difference is no longer visible, but when 
copied from the original JUnit XML, the two lines differ in the whitespace 
character between 1 (x31) and 4 (x34), shown below as hex values. See [^diff].
(x09 is a horizontal tab, x20 is a space)

{code}
20 31 *20* 34 09 32
20 31 *09* 34 09 32
{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2
{code}

After copying the diff into JIRA, the difference is no longer visible, but when 
copied from the original JUnit XML, the two lines differ in the whitespace 
character between 1 (x31) and 4 (x34), shown below as hex values. See the attachment.

{code}
20 31 20 34 09 32
20 31 09 34 09 32
{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml,
>  diff
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}
> After copying the diff into JIRA, the difference is no longer visible, but 
> when copied from the original JUnit XML, the two lines differ in the 
> whitespace character between 1 (x31) and 4 (x34), shown below as hex values. 
> See [^diff].
> (x09 is a horizontal tab, x20 is a space)
> {code}
> 20 31 *20* 34 09 32
> 20 31 *09* 34 09 32
> {code}
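
A whitespace-only difference like the one described above can be made visible by dumping each line as hex bytes. The following is a minimal Python sketch, not part of the Hive test tooling; the helper names `hex_bytes` and `first_difference` are hypothetical, and the two sample strings are reconstructed from the hex values quoted in the description.

```python
def hex_bytes(line: str) -> str:
    """Render each byte of the line as two hex digits separated by spaces."""
    return " ".join(f"{b:02x}" for b in line.encode("utf-8"))


def first_difference(a: str, b: str):
    """Return the index of the first differing character, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string may be a prefix of the other.
    return None if len(a) == len(b) else min(len(a), len(b))


# Sample lines rebuilt from the hex values in the description (assumption:
# nothing else differs between the two lines).
with_space = " 1 4\t2"  # x20 between 1 and 4
with_tab = " 1\t4\t2"   # x09 between 1 and 4

print(hex_bytes(with_space))                   # 20 31 20 34 09 32
print(hex_bytes(with_tab))                     # 20 31 09 34 09 32
print(first_difference(with_space, with_tab))  # 2
```

Printing `repr(line)` is a quicker alternative when only tabs and spaces are suspected, since tabs appear as `\t`.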



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2
{code}

After copying the diff into JIRA, the difference is no longer visible, but when 
copied from the original JUnit XML, the two lines differ in the whitespace 
character between 1 (x31) and 4 (x34), shown below as hex values. See [^diff]. 
The original golden file contains a horizontal tab.
(x09 is a horizontal tab, x20 is a space)

{code}
20 31 *20* 34 09 32
20 31 *09* 34 09 32
{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2
{code}

After copying the diff into JIRA, the difference is no longer visible, but when 
copied from the original JUnit XML, the two lines differ in the whitespace 
character between 1 (x31) and 4 (x34), shown below as hex values. See [^diff].
(x09 is a horizontal tab, x20 is a space)

{code}
20 31 *20* 34 09 32
20 31 *09* 34 09 32
{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml,
>  diff
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}
> After copying the diff into JIRA, the difference is no longer visible, but 
> when copied from the original JUnit XML, the two lines differ in the 
> whitespace character between 1 (x31) and 4 (x34), shown below as hex values. 
> See [^diff]. The original golden file contains a horizontal tab.
> (x09 is a horizontal tab, x20 is a space)
> {code}
> 20 31 *20* 34 09 32
> 20 31 *09* 34 09 32
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Attachment: diff

> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml,
>  diff
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}
> After copying the diff into JIRA, the difference is no longer visible, but 
> when copied from the original JUnit XML, the two lines differ in the 
> whitespace character between 1 (x31) and 4 (x34), shown below as hex values. 
> See the attachment.
> {code}
> 20 31 20 34 09 32
> 20 31 09 34 09 32
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2
{code}

After copying the diff into JIRA, the difference is no longer visible, but when 
copied from the original JUnit XML, the two lines differ in the whitespace 
character between 1 (x31) and 4 (x34), shown below as hex values. See the attachment.

{code}
20 31 20 34 09 32
20 31 09 34 09 32
{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml,
>  diff
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}
> After copying the diff into JIRA, the difference is no longer visible, but 
> when copied from the original JUnit XML, the two lines differ in the 
> whitespace character between 1 (x31) and 4 (x34), shown below as hex values. 
> See the attachment.
> {code}
> 20 31 20 34 09 32
> 20 31 09 34 09 32
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Attachment: 
171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt

> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Attachment: 
TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml

> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
> Attachments: 
> 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt,
>  
> TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml
>
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1  4 2 
--- 
> 1 4 2

{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Ashutosh Bapat (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695553#comment-16695553
 ] 

Ashutosh Bapat commented on HIVE-20953:
---

[~maheshk114], AFAIU, functions are loaded as part of the _functions directory, 
which is loaded first, so changing the name of the function does not change the 
load sequence.

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.
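
The effect of a fixed, name-based ordering can be illustrated with a small Python sketch. It assumes the ordering is a plain lexicographic sort of directory names; the table directory names below are hypothetical, and only `_functions` is named in this thread. It does not use Hive's code.

```python
def load_order(dump_dirs):
    """Return directory names in plain lexicographic (code point) order."""
    return sorted(dump_dirs)


# Hypothetical top-level directories of a dump; only "_functions" comes
# from the discussion above, the table names are made up for illustration.
dump_dirs = ["t_partitioned", "_functions", "another_table"]

print(load_order(dump_dirs))
# ['_functions', 'another_table', 't_partitioned']

# The result does not depend on the order in which the directories were listed.
print(load_order(list(reversed(dump_dirs))) == load_order(dump_dirs))  # True
```

In this ordering the underscore (x5f) sorts before lowercase letters (x61 and above), which is why `_functions` comes first here; names starting with uppercase letters would sort ahead of it, so the sketch assumes lowercase directory names.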





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695545#comment-16695545
 ] 

slim bouguerra commented on HIVE-20932:
---

[~nishantbangarwa] any more comments?

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.
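
The "wrapper around the existing Record Reader" idea can be sketched as a generator that pulls rows from a row-at-a-time source and hands them on in batches of up to 1024. This is an illustrative Python sketch only, not the patch itself; `read_batches` and `BATCH_SIZE` are hypothetical names, and the real reader fills Hive's vectorized row batches rather than Python lists.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

BATCH_SIZE = 1024  # upper bound on rows per batch, as stated in the description


def read_batches(rows: Iterable[T], batch_size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Wrap a row-at-a-time source and yield lists of up to batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # underlying reader is exhausted
        yield batch


# A stand-in for the existing row-by-row reader: 2500 dummy rows.
sizes = [len(batch) for batch in read_batches(range(2500))]
print(sizes)  # [1024, 1024, 452]
```

The last batch is simply shorter when the source runs out, so downstream operators need to read the batch length rather than assume 1024 rows.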





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Fix Version/s: 4.0.0

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Component/s: Druid integration

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695543#comment-16695543
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949138/HIVE-20932.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15538 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15038/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15038/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15038/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949138 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695538#comment-16695538
 ] 

mahesh kumar behera commented on HIVE-20953:


[~ashutosh.bapat]

Can the function name be changed so that the function is loaded before the 
partitions?

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695527#comment-16695527
 ] 

Hive QA commented on HIVE-20932:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 25s{color} | {color:blue} druid-handler in master has 4 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15038/dev-support/hive-personality.sh |
| git revision | master / ddf3b6c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql druid-handler U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15038/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.8.patch

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695508#comment-16695508
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949127/HIVE-20932.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 15538 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup
 (batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession
 (batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics 
(batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse 
(batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError 
(batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=232)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=232)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15037/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15037/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15037/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949127 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
> Issue Type: Improvement
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time.
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) directly to vector primitive types.
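
To illustrate the first-cut approach in the description above (a wrapper that pulls up to 1024 rows at a time from an existing row-at-a-time reader), here is a minimal Python sketch. It is not the patch's code: `read_batches` and the plain iterable standing in for the Record Reader are hypothetical, and only the batch size of 1024 comes from the description.

```python
from itertools import islice

def read_batches(rows, batch_size=1024):
    """Group rows from a row-at-a-time source into lists of at most batch_size rows."""
    it = iter(rows)  # stands in for the existing record reader (assumption)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # source exhausted
        yield batch

# Example: 2500 rows arrive as two full batches and one partial batch.
sizes = [len(b) for b in read_batches(range(2500))]
print(sizes)
```

The wrapper leaves the underlying reader untouched, which matches the "first cut" framing: the per-row conversion cost is still paid, and only the downstream operators see batches.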



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695481#comment-16695481
 ] 

Hive QA commented on HIVE-20932:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
40s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} druid-handler in master has 4 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15037/dev-support/hive-personality.sh
 |
| git revision | master / ddf3b6c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql druid-handler U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15037/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695458#comment-16695458
 ] 

Hive QA commented on HIVE-20958:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949126/HIVE-20958.patch

{color:green}SUCCESS:{color} +1 due to 34 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15538 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15036/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15036/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15036/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949126 - PreCommit-HIVE-Build

> Cleaning of code at Hive-common using automatic inspection tool.
> 
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
> Issue Type: Improvement
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Minor
> Attachments: HIVE-20958.patch
>
>
> Mostly cleaning up wildcard imports (like .*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.
> Using lambdas where possible.
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695432#comment-16695432
 ] 

Hive QA commented on HIVE-20958:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
29s{color} | {color:red} common: The patch generated 74 new + 1799 unchanged - 
181 fixed = 1873 total (was 1980) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
38s{color} | {color:red} common generated 1 new + 56 unchanged - 9 fixed = 57 
total (was 65) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:common |
|  |  Possible null pointer dereference of codahaleReporter in 
org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.initMetricsReporter()
 on exception path  Dereferenced at CodahaleMetrics.java:[line 455] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  
checkstyle  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15036/dev-support/hive-personality.sh
 |
| git revision | master / ddf3b6c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15036/yetus/diff-checkstyle-common.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15036/yetus/new-findbugs-common.html
 |
| modules | C: common U: common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15036/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695419#comment-16695419
 ] 

Hive QA commented on HIVE-20953:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949122/HIVE-20953.02

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15538 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeOnTezEdges (batchId=324)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15035/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15035/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15035/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949122 - PreCommit-HIVE-Build

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
> Issue Type: Bug
> Components: Tests
> Affects Versions: 4.0.0
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads them into 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Depending on the order in which the objects 
> get loaded, if the function is queued after the table, it will not be 
> available in the replica after the load failure. But if it is queued before 
> the table, it will be available in the replica even after the load failure. 
> The test assumes the latter case, which may not always be true.
> Hence, fix the testcase to load the objects in a fixed order. When 
> hive.in.repl.test.files.sorted is set to true, the objects are ordered by 
> their directory names. This ordering is available with minimal changes for 
> testing, hence we use it. With this ordering a function gets loaded before a 
> table. So the test is changed to not expect the function to be available 
> after the failed load, but to be available after the retry.
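
The ordering dependence described above can be sketched with a toy model in Python. This is not Hive code: `load_objects` and the object names are hypothetical, and the simulated failure simply stops the load at one object. The sketch only shows that what survives a failed load depends on the listing order, and that sorting by name removes that dependence.

```python
def load_objects(names, fail_on, sort_first):
    """Load objects in order and stop at the first failure; return what was loaded."""
    order = sorted(names) if sort_first else list(names)
    loaded = []
    for name in order:
        if name == fail_on:
            break  # simulated load failure; later objects are never attempted
        loaded.append(name)
    return loaded

# Two listings of the same (hypothetical) dump directories, in different orders.
listing_one = ["obj_a", "obj_b", "obj_c"]
listing_two = ["obj_c", "obj_b", "obj_a"]

# Without sorting, the objects present after the failure depend on listing order.
unsorted_one = load_objects(listing_one, fail_on="obj_b", sort_first=False)
unsorted_two = load_objects(listing_two, fail_on="obj_b", sort_first=False)

# With sorting, both listings leave the same objects behind.
sorted_one = load_objects(listing_one, fail_on="obj_b", sort_first=True)
sorted_two = load_objects(listing_two, fail_on="obj_b", sort_first=True)

print(unsorted_one, unsorted_two, sorted_one, sorted_two)
```

A test can only assert on the post-failure state once the order is fixed, which is the role the sorted-files setting plays in the description.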



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695371#comment-16695371
 ] 

Hive QA commented on HIVE-20953:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} itests/hive-unit: The patch generated 2 new + 125 
unchanged - 3 fixed = 127 total (was 128) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15035/dev-support/hive-personality.sh
 |
| git revision | master / ddf3b6c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15035/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: itests/hive-unit U: itests/hive-unit |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15035/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Comment Edited] (HIVE-16725) Support recursive CTEs

2018-11-21 Thread Antoine CARME (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695355#comment-16695355
 ] 

Antoine CARME edited comment on HIVE-16725 at 11/21/18 11:29 PM:
-

It would be nice to have this feature in Hive and Impala. Recursive CTEs are 
useful for translating recurrent neural networks into SQL.

Some successful uses of recursive CTEs are available here:

[https://github.com/antoinecarme/keras2sql/issues/2]

This works on many databases; support is missing in Hive/Impala and MonetDB.

I know this is not a very standard usage, but it does the job in a very elegant 
way.

Some code for PostgreSQL:

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/pgsql/demo3_keras_KerasClassifier_SimpleRNN_pgsql.sql]

The same for SQLite:

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/sqlite/demo3_keras_KerasClassifier_SimpleRNN_sqlite.sql]

etc.


was (Author: antoinecarme):
Would be nice to have this feature in Hive and Impala. Recursive CTEs are 
useful to translate recurrent neural networks into SQL.

Some successful usage of recursive CTEs is available here :(

[https://github.com/antoinecarme/keras2sql/issues/2]

For a lot of databases, it is OK. Hive/Impala and MonetDB are missing.

I know this is not a very standard usage, but it does the job in a very elegant 
way.

 

Some code for postgresql :

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/pgsql/demo3_keras_KerasClassifier_SimpleRNN_pgsql.sql]

The same for SQLite :

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/sqlite/demo3_keras_KerasClassifier_SimpleRNN_sqlite.sql]

etc ...

> Support recursive CTEs
> --
>
> Key: HIVE-16725
> URL: https://issues.apache.org/jira/browse/HIVE-16725
> Project: Hive
> Issue Type: Sub-task
> Components: SQL
> Reporter: Carter Shanklin
> Priority: Major
>
> Hive introduced non-recursive CTEs in HIVE-1180.
> Recursive CTEs are commonly used to navigate hierarchies stored in relational 
> tables where a parent ID column "foreign key" refers to another "primary key" 
> field within the same table. In this context recursive CTEs are used to 
> traverse hierarchies, determine parents / children, measure depths, build 
> paths and so on.
> Recursive CTEs are constructed similarly to basic CTEs but include 2 queries 
> at a minimum: first a root query which is combined via UNION / UNION ALL to 
> additional queries that can refer to the CTE's table name.
> Support should include:
> * Basic recursive CTE support: i.e. allow the CTE's table name to be referred 
> to in the table subquery after a UNION or UNION ALL.
> * Recursive CTEs should be supported as basic queries, in views, or in 
> subqueries.
> * Loop detection is highly desirable. If a loop is detected the query should 
> fail at runtime. Hive is commonly used in shared clusters where it is 
> difficult to track down rogue queries.
> * To ease portability, we suggest not requiring the recursive keyword. It 
> could be made optional.
> * To ease portability, "with column list", i.e. with t(col1, col2) as ( ... ) 
> should be supported.
> Example (Postgres compatible):
> {code}
> create table hierarchy (id integer, parent integer);
> insert into hierarchy values (1, null), (2, 1), (3, 2);
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent is null
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
>  id | parent
> ----+--------
>   1 |
>   2 |  1
>   3 |  2
> (3 rows)
> update hierarchy set parent = 3 where id = 1;
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent = 1
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
> [ Query runs forever ]
> {code}
> Implementation Notes:
> The SQL standard requires use of the "recursive" keyword for recursive CTEs. 
> However, major commercial databases including Oracle, SQL Server and DB2 do 
> not require, or in some cases, don't even allow the "recursive" keyword. 
> Postgres requires the "recursive" keyword.
> If Oracle detects a loop it fails with this message: ORA-32044: cycle 
> detected while executing recursive WITH query
> If Postgres encounters a loop in a recursive CTE, the query runs forever and 
> must be killed.
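
The hierarchy example above can be tried out with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: it runs on SQLite, not on Hive or Postgres, and the second query swaps UNION ALL for UNION so that it terminates on the looped data. That is a different technique from the loop detection requested above, which would make the query fail instead of silently stopping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table hierarchy (id integer, parent integer);
    insert into hierarchy values (1, null), (2, 1), (3, 2);
""")

# Acyclic data: walk the hierarchy from the root, as in the example above.
acyclic_rows = conn.execute("""
    with recursive t(id, parent) as (
      select id, parent from hierarchy where parent is null
      union all
      select hierarchy.id, hierarchy.parent from hierarchy, t
      where t.id = hierarchy.parent
    ) select * from t
""").fetchall()

# Introduce the same loop as above: 1's parent is 3, 3's is 2, 2's is 1.
conn.execute("update hierarchy set parent = 3 where id = 1")

# With UNION instead of UNION ALL, a row that was already produced is not
# queued again, so the recursion stops once the loop closes.
cyclic_rows = conn.execute("""
    with recursive t(id, parent) as (
      select id, parent from hierarchy where parent = 1
      union
      select hierarchy.id, hierarchy.parent from hierarchy, t
      where t.id = hierarchy.parent
    ) select * from t
""").fetchall()

print(sorted(acyclic_rows, key=lambda r: r[0]))
print(sorted(cyclic_rows))
```

The UNION variant only terminates because every row on the loop repeats exactly; a query that also carries a depth or path column would produce new rows on each pass and still run forever, which is why explicit loop detection remains desirable.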



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.8.patch



[jira] [Comment Edited] (HIVE-16725) Support recursive CTEs

2018-11-21 Thread Antoine CARME (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695355#comment-16695355
 ] 

Antoine CARME edited comment on HIVE-16725 at 11/21/18 11:27 PM:
-

Would be nice to have this feature in Hive and Impala. Recursive CTEs are 
useful to translate recurrent neural networks into SQL.

Some successful usage of recursive CTEs is available here :(

[https://github.com/antoinecarme/keras2sql/issues/2]

For a lot of databases, it is OK. Hive/Impala and MonetDB are missing.

I know this is not a very standard usage, but it does the job in a very elegant 
way.

 

Some code for postgresql :

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/pgsql/demo3_keras_KerasClassifier_SimpleRNN_pgsql.sql]

The same for SQLite :

[https://github.com/antoinecarme/keras2sql/blob/master/demo/KerasClassifier_SimpleRNN/iris/sqlite/demo3_keras_KerasClassifier_SimpleRNN_sqlite.sql]

etc ...


was (Author: antoinecarme):
Would be nice to have this feature in Hive and Impala. Recursive CTEs are 
useful to translate recurrent neural networks into SQL.

Some successful usage of recursive CTEs is available here :(

[https://github.com/antoinecarme/keras2sql/issues/2]

For a lot of databases, it is OK. Hive/Impala and MonetDB are missing.

I know this is not a very standard usage, but it does the job in a very elegant 
way.



[jira] [Commented] (HIVE-16725) Support recursive CTEs

2018-11-21 Thread Antoine CARME (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695355#comment-16695355
 ] 

Antoine CARME commented on HIVE-16725:
--

Would be nice to have this feature in Hive and Impala. Recursive CTEs are 
useful to translate recurrent neural networks into SQL.

Some successful usage of recursive CTEs is available here :(

[https://github.com/antoinecarme/keras2sql/issues/2]

For a lot of databases, it is OK. Hive/Impala and MonetDB are missing.

I know this is not a very standard usage, but it does the job in a very elegant 
way.

> Support recursive CTEs
> --
>
> Key: HIVE-16725
> URL: https://issues.apache.org/jira/browse/HIVE-16725
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Priority: Major
>
> Hive introduced non-recursive CTEs in HIVE-1180.
> Recursive CTEs are commonly used to navigate hierarchies stored in relational 
> tables where a parent ID column "foreign key" refers to another "primary key" 
> field within the same table. In this context recursive CTEs are used to 
> traverse hierarchies, determine parents / children, measure depths, build 
> paths and so on.
> Recursive CTEs are constructed similarly to basic CTEs but include 2 queries 
> at a minimum: first a root query which is combined via UNION / UNION ALL to 
> additional queries that can refer to the CTE's table name.
> Support should include:
> * Basic recursive CTE support: i.e. allow the CTE's table name to be referred 
> to in the table subquery after a UNION or UNION ALL.
> * Recursive CTEs should be supported as basic queries, in views, or in 
> subqueries.
> * Loop detection is highly desirable. If a loop is detected the query should 
> fail at runtime. Hive is commonly used in shared clusters where it is 
> difficult to track down rogue queries.
> * To ease portability, suggest not requiring the recursive keyword. It 
> could be made optional.
> * To ease portability, "with column list", i.e. with t(col1, col2) as ( ... ) 
> should be supported.
> Example (Postgres compatible):
> {code}
> create table hierarchy (id integer, parent integer);
> insert into hierarchy values (1, null), (2, 1), (3, 2);
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent is null
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
>  id | parent
> ----+--------
>   1 |
>   2 |  1
>   3 |  2
> (3 rows)
> update hierarchy set parent = 3 where id = 1;
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent = 1
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
> [ Query runs forever ]
> {code}
> Implementation Notes:
> The SQL standard requires use of the "recursive" keyword for recursive CTEs. 
> However, major commercial databases including Oracle, SQL Server and DB2 do 
> not require, or in some cases, don't even allow the "recursive" keyword. 
> Postgres requires the "recursive" keyword.
> If Oracle detects a loop it fails with this message: ORA-32044: cycle 
> detected while executing recursive WITH query
> If Postgres encounters a loop in a recursive CTE, the query runs forever and 
> must be killed.
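
The loop-detection requirement in the description above can be sketched outside of SQL. The following is only an illustration in Python, not Hive code: `walk_hierarchy` and `CycleDetected` are hypothetical names, and the sketch assumes a single-parent hierarchy with non-overlapping root rows, so that revisiting an id can only mean a loop. It evaluates the root query once, repeats the recursive step against the rows found in the previous round, and fails instead of running forever.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, Optional[int]]  # (id, parent), as in the hierarchy table


class CycleDetected(RuntimeError):
    """Hypothetical error type raised when the recursion revisits an id."""


def walk_hierarchy(rows: List[Row], is_root: Callable[[Row], bool]) -> List[Row]:
    # Root query: the rows that seed the recursion.
    frontier = [r for r in rows if is_root(r)]
    result = list(frontier)
    seen = {r[0] for r in frontier}
    while frontier:
        parent_ids = {r[0] for r in frontier}
        # Recursive step: join the table against the rows found last round.
        frontier = [r for r in rows if r[1] in parent_ids]
        for r in frontier:
            if r[0] in seen:
                # Assumption: single-parent hierarchy, so a repeat means a loop.
                raise CycleDetected(f"cycle detected at id {r[0]}")
            seen.add(r[0])
        result.extend(frontier)
    return result


# Acyclic data from the example: 1 is the root, 2 -> 1, 3 -> 2.
tree = [(1, None), (2, 1), (3, 2)]
print(walk_hierarchy(tree, lambda r: r[1] is None))

# After "update hierarchy set parent = 3 where id = 1" the data forms a loop.
looped = [(1, 3), (2, 1), (3, 2)]
try:
    walk_hierarchy(looped, lambda r: r[1] == 1)
except CycleDetected as exc:
    print("query failed:", exc)
```

Tracking visited ids is the simplest possible check. A real engine would more likely compare a row against its own ancestors, which also handles rows that are legitimately reachable by more than one path.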





[jira] [Assigned] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2018-11-21 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20960:
-


> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> Now that we have HIVE-20941, we know if a dir is produced by compactor from 
> the name and {{CompactorMR.createCompactorMarker()}} can be removed.
>  
>  





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695347#comment-16695347
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949117/HIVE-20932.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15546 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=197)
[druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15034/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15034/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15034/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949117 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going through the old reader and convert the 
> JSON (Smile format) directly to vector primitive types.
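
The wrapper idea in the description above can be sketched as follows. This is a language-neutral illustration in Python rather than the patch itself: `read_batches`, the column names, and the dictionary-based rows are all assumptions made for the example. It pulls up to 1024 rows at a time from an existing row-at-a-time reader and pivots each chunk into one list per column.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

BATCH_SIZE = 1024  # upper bound on rows per batch, as stated in the description


def read_batches(rows: Iterable[Dict[str, object]],
                 columns: List[str],
                 batch_size: int = BATCH_SIZE) -> Iterator[Dict[str, list]]:
    """Wrap a row-at-a-time reader and yield column-oriented batches."""
    it = iter(rows)
    while True:
        # Take at most batch_size rows from the underlying reader.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # Pivot the row-oriented records into one list per column.
        yield {c: [row.get(c) for row in chunk] for c in columns}


# Hypothetical input: 2500 rows with made-up "ts" and "value" columns.
rows = ({"ts": i, "value": i * 2} for i in range(2500))
sizes = [len(batch["ts"]) for batch in read_batches(rows, ["ts", "value"])]
print(sizes)
```

The last batch is simply shorter than the others, which is why the description says "up to" 1024 rows.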





[jira] [Updated] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20958:
--
Description: 
Mostly cleaning up wildcard imports (.*).

Reordering imports.

Removing unused imports.

Some logic simplification.

Using lambdas where possible.

 

  was:
mostly cleaning imports  like .* 

re ordering.

remove the unused ones

some logic simplification.


> Cleaning of code at Hive-common using automatic inspection tool.
> 
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: HIVE-20958.patch
>
>
> Mostly cleaning up wildcard imports (.*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.
> Using lambdas where possible.
>  





[jira] [Updated] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20958:
--
Attachment: HIVE-20958.patch

> Cleaning of code at Hive-common using automatic inspection tool.
> 
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: HIVE-20958.patch
>
>
> Mostly cleaning up wildcard imports (.*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.
> Using lambdas where possible.
>  





[jira] [Updated] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20958:
--
Status: Patch Available  (was: Open)

> Cleaning of code at Hive-common using automatic inspection tool.
> 
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: HIVE-20958.patch
>
>
> Mostly cleaning up wildcard imports (.*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.
> Using lambdas where possible.
>  





[jira] [Updated] (HIVE-20958) Cleaning of code at Hive-common using automatic inspection tool.

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20958:
--
Summary: Cleaning of code at Hive-common using automatic inspection tool.  
(was: Cleaning of imports at Hive-common)

> Cleaning of code at Hive-common using automatic inspection tool.
> 
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
>
> Mostly cleaning up wildcard imports (.*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.





[jira] [Updated] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20953:

Attachment: HIVE-20953.02

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Laszlo Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.
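
The point of the fix described above is that a sorted load order no longer depends on the order in which the objects happen to be listed. A minimal sketch in Python, not the Hive implementation: `load_order` and the directory names are hypothetical, chosen only to show that sorting by name gives the same order for every input permutation.

```python
import random
from typing import List


def load_order(dir_names: List[str]) -> List[str]:
    """Return a fixed load order by sorting the directory names."""
    return sorted(dir_names)


# Hypothetical directory names; the real dump layout may differ.
names = ["t1", "_functions", "t2"]

shuffled = names[:]
random.shuffle(shuffled)

# Whatever order the listing came back in, the load order is the same.
print(load_order(shuffled) == load_order(names))
```

With a fixed order, the test can assert exactly which objects exist after the failed load and which appear only after the retry.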





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1  4 2 
--- 
> 1 4 2

{code}

  was:
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1  4 2 
> --- 
> > 1 4 2
> {code}
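
When two diffed lines look identical once pasted, as above, the difference is usually in the whitespace. A small sketch, assuming the lines differ only by a space versus a horizontal tab: the helper names `to_hex` and `first_difference` and the two sample strings are made up for illustration. Printing the bytes as hex makes such a difference visible.

```python
from typing import Optional


def to_hex(line: str) -> str:
    """Render each byte of the line as two hex digits, space separated."""
    return " ".join(f"{b:02x}" for b in line.encode("utf-8"))


def first_difference(a: str, b: str) -> Optional[int]:
    """Return the index of the first differing byte, or None if equal."""
    raw_a, raw_b = a.encode("utf-8"), b.encode("utf-8")
    for i, (x, y) in enumerate(zip(raw_a, raw_b)):
        if x != y:
            return i
    # One line may be a prefix of the other.
    return None if len(raw_a) == len(raw_b) else min(len(raw_a), len(raw_b))


# Assumed sample lines: x20 is a space, x09 is a horizontal tab.
line_a = " 1 4\t2"
line_b = " 1\t4\t2"

print(to_hex(line_a))
print(to_hex(line_b))
print("first differing byte index:", first_difference(line_a, line_b))
```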





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code:java}
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 
11c11
< 1 4 2 
--- 
> 1 4 2

{code}

  was:
{code}

Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 11c11 < 1 4 2 --- > 1 4 2

{code}


> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
>
> {code:java}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 
> 11c11
> < 1 4 2 
> --- 
> > 1 4 2
> {code}





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695326#comment-16695326
 ] 

Hive QA commented on HIVE-20932:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 50s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 26s{color} | {color:blue} druid-handler in master has 4 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 28s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15034/dev-support/hive-personality.sh |
| git revision | master / f5b14fc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql druid-handler U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15034/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going through the old reader and convert the 
> JSON (Smile format) directly to vector primitive types.





[jira] [Updated] (HIVE-20959) TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20959:

Description: 
{code}

Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_rp_limit.q 11c11 < 1 4 2 --- > 1 4 2

{code}

> TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] is flaky
> ---
>
> Key: HIVE-20959
> URL: https://issues.apache.org/jira/browse/HIVE-20959
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Priority: Major
>
> {code}
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_rp_limit.q 11c11 < 1 4 2 --- > 1 4 2
> {code}





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695318#comment-16695318
 ] 

Laszlo Bodor commented on HIVE-20953:
-

The cbo_rp_limit failure is flaky and unrelated to this patch; uploaded the 02 patch.

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.





[jira] [Assigned] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20953:
---

Assignee: Ashutosh Bapat  (was: Laszlo Bodor)

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01, HIVE-20953.02
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.





[jira] [Assigned] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20953:
---

Assignee: Laszlo Bodor  (was: Ashutosh Bapat)

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Laszlo Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads those to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Based on the order in which the objects get 
> loaded, if the function is queued after the table, it will not be available 
> in replica after the load failure. But if it's queued before the table, it 
> will be available in replica even after the load failure. The test assumes 
> the latter case, which may not always be true.
> Hence fix the testcase to order the objects by a fixed ordering. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by the 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> changed the test to not expect the function to be available after the failed 
> load, but be available after the retry.





[jira] [Commented] (HIVE-20823) Make Compactor run in a transaction

2018-11-21 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695314#comment-16695314
 ] 

Vaibhav Gumashta commented on HIVE-20823:
-

+1

> Make Compactor run in a transaction
> ---
>
> Key: HIVE-20823
> URL: https://issues.apache.org/jira/browse/HIVE-20823
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20823.01.patch, HIVE-20823.03.patch, 
> HIVE-20823.04.patch, HIVE-20823.05.patch, HIVE-20823.07.patch, 
> HIVE-20823.08.patch, HIVE-20823.09.patch, HIVE-20823.10.patch, 
> HIVE-20823.11.patch, HIVE-20823.11.patch, HIVE-20823.12.patch, 
> HIVE-20823.13.patch, HIVE-20823.14.patch
>
>
> Have compactor open a transaction and run the job in that transaction.
> # make compactor produced base/delta include this txn id in the folder name, 
> e.g. base_7_c17 where 17 is the txnid.
> # add {{CQ_TXN_ID bigint}} to COMPACTION_QUEUE and COMPLETED_COMPACTIONS to 
> record this txn id
> # make sure {{AcidUtils.getAcidState()}} pays attention to this transaction 
> on read and ignores this dir if this txn id is not committed in the current 
> snapshot
> ## this means not only validWriteIdList but ValidTxnIdList should be passed 
> along in config (if it isn't yet)
> # once this is done, {{CompactorMR.createCompactorMarker()}} can be 
> eliminated and {{AcidUtils.isValidBase}} modified accordingly
> # modify Cleaner so that it doesn't clean old files until new file is visible 
> to all readers
> # 
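
The naming scheme and visibility rule from the list above can be sketched like this. It is an illustration in Python, not Hive's `AcidUtils` code: `parse_base_dir`, `is_visible`, and the reading of the first number as a write id are assumptions; only the `base_7_c17` example, where 17 is the txn id, comes from the description. A reader ignores a compactor-produced directory until its transaction is committed in the reader's snapshot.

```python
import re
from typing import NamedTuple, Optional, Set

# base_<n> optionally followed by _c<txnid>, e.g. base_7_c17.
_BASE_DIR = re.compile(r"^base_(\d+)(?:_c(\d+))?$")


class BaseDir(NamedTuple):
    write_id: int                    # assumption: the first number is a write id
    compactor_txn_id: Optional[int]  # None if the name carries no txn suffix


def parse_base_dir(name: str) -> Optional[BaseDir]:
    """Parse a base directory name; return None if it does not match."""
    m = _BASE_DIR.match(name)
    if m is None:
        return None
    txn = int(m.group(2)) if m.group(2) is not None else None
    return BaseDir(int(m.group(1)), txn)


def is_visible(name: str, committed_txns: Set[int]) -> bool:
    """Decide whether a reader with this snapshot should use the directory."""
    parsed = parse_base_dir(name)
    if parsed is None:
        return False
    if parsed.compactor_txn_id is None:
        # No txn suffix: not marked as compactor output, nothing to check.
        return True
    return parsed.compactor_txn_id in committed_txns


print(parse_base_dir("base_7_c17"))
print(is_visible("base_7_c17", committed_txns={17}))
print(is_visible("base_7_c17", committed_txns={16}))
```

Because the txn id is recoverable from the name alone, no separate marker file is needed to tell that a directory was produced by the compactor.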





[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-21 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695305#comment-16695305
 ] 

Ashutosh Chauhan commented on HIVE-20775:
-

+1

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.





[jira] [Updated] (HIVE-20958) Cleaning of imports at Hive-common

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20958:
--
Description: 
Mostly cleaning up wildcard imports (.*).

Reordering imports.

Removing unused imports.

Some logic simplification.

> Cleaning of imports at Hive-common
> --
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
>
> Mostly cleaning up wildcard imports (.*).
> Reordering imports.
> Removing unused imports.
> Some logic simplification.





[jira] [Assigned] (HIVE-20958) Cleaning of imports at Hive-common

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-20958:
-


> Cleaning of imports at Hive-common
> --
>
> Key: HIVE-20958
> URL: https://issues.apache.org/jira/browse/HIVE-20958
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
>






[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695302#comment-16695302
 ] 

Hive QA commented on HIVE-20775:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949114/HIVE-20775.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 15546 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction]
 (batchId=159)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] 
(batchId=273)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query95] 
(batchId=273)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query64]
 (batchId=273)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15033/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15033/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15033/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949114 - PreCommit-HIVE-Build

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.8.patch

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid to 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top 
> level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) straight to Vector primitive types. 
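
As a rough illustration of the "first cut" described above, the following sketch wraps a row-at-a-time source and hands out batches of up to 1024 rows. It uses a plain `Iterator` as a stand-in for the existing record reader; the class and method names are hypothetical and do not come from the patch, which would fill Hive's vectorized batch type rather than a `List`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch, not the patch itself: groups rows from a row-at-a-time
// reader into batches of at most maxBatchSize rows.
class BatchingReaderSketch {
    static final int MAX_BATCH_SIZE = 1024;

    static <T> Iterator<List<T>> batches(Iterator<T> rows, int maxBatchSize) {
        return new Iterator<List<T>>() {
            @Override
            public boolean hasNext() {
                return rows.hasNext();
            }

            @Override
            public List<T> next() {
                if (!rows.hasNext()) {
                    throw new NoSuchElementException();
                }
                List<T> batch = new ArrayList<>(maxBatchSize);
                // Pull rows one at a time from the wrapped reader until the batch is full.
                while (batch.size() < maxBatchSize && rows.hasNext()) {
                    batch.add(rows.next());
                }
                return batch;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        Iterator<List<Integer>> it = batches(rows.iterator(), MAX_BATCH_SIZE);
        while (it.hasNext()) {
            System.out.println(it.next().size()); // prints 1024, 1024, 452
        }
    }
}
```

The wrapper still pays the per-row cost of the old reader; it only lets downstream operators consume rows in batches, which matches the future work noted in the description.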





[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695266#comment-16695266
 ] 

Hive QA commented on HIVE-20775:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 4 new + 123 unchanged - 2 
fixed = 127 total (was 125) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
57s{color} | {color:red} ql generated 2 new + 2310 unchanged - 2 fixed = 2312 
total (was 2312) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to tsRowSize in 
org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator,
 ExprNodeDesc, Statistics, ExprNodeDesc)  At 
TezCompiler.java:org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator,
 ExprNodeDesc, Statistics, ExprNodeDesc)  At TezCompiler.java:[line 1455] |
|  |  Should org.apache.hadoop.hive.ql.parse.TezCompiler$SemijoinOperatorInfo 
be a _static_ inner class?  At TezCompiler.java:inner class?  At 
TezCompiler.java:[lines 1656-1670] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15033/dev-support/hive-personality.sh
 |
| git revision | master / f5b14fc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15033/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15033/yetus/new-findbugs-ql.html
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15033/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions 

[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695251#comment-16695251
 ] 

Jesus Camacho Rodriguez commented on HIVE-20775:


[~ashutoshc], I have separated this code from HIVE-20783, I have added more 
comments as you mentioned in RB, and I have created HIVE-20957 as a follow-up.

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695252#comment-16695252
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949099/HIVE-20932.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15536 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q]
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=196)

[druidmini_dynamic_partition.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_test_insert.q,druidkafkamini_delimited.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15032/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15032/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15032/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949099 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid to 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top 
> level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) straight to Vector primitive types. 





[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695248#comment-16695248
 ] 

Ashutosh Chauhan commented on HIVE-20954:
-

Looks like a valid failure.

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}
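
The explain output above does not show the hash functions themselves, so the following is only an illustration of the general problem named in the title: a key-to-partition mapping that is not uniform for the keys at hand concentrates rows on a few reducers. The class name, the identity mapping, and the SplitMix64-style mixer are assumptions chosen for the example; they are not the functions used by VectorReduceSinkLongOperator or VectorReduceSinkObjectHashOperator.

```java
import java.util.Arrays;

// Hypothetical sketch, not Hive code: compares partition histograms for bigint keys
// with and without a mixing step before the modulo.
class PartitionSkewSketch {

    // Bit mixer (SplitMix64 finalizer), used here only as an example of a mixing hash.
    static long mix64(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Counts how many keys land in each partition.
    static int[] histogram(long[] keys, int numPartitions, boolean mix) {
        int[] counts = new int[numPartitions];
        for (long key : keys) {
            long h = mix ? mix64(key) : key;
            counts[(int) Math.floorMod(h, (long) numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] keys = new long[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = 8L * i; // keys that share a factor with the partition count
        }
        System.out.println(Arrays.toString(histogram(keys, 8, false)));
        System.out.println(Arrays.toString(histogram(keys, 8, true)));
    }
}
```

Without mixing, every key in this example falls into partition 0; with mixing, the keys spread across the partitions.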





[jira] [Updated] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20775:
---
Attachment: HIVE-20775.05.patch

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.





[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695229#comment-16695229
 ] 

Igor Kryvenko commented on HIVE-20941:
--

How is it related to producing the delete_delta directory? As far as I 
understand, in {{AcidUtils.getAcidState()}} we just get the delta directories 
and add them to the {{JobConf}}. Or did I misunderstand you?

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer
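
One way to read the suggestion above is to create the delete event writer lazily, on the first delete event, so that no delete output exists when the input has none. The sketch below shows that pattern with stand-in types; `RecordingWriter` and the `map` signature are hypothetical and are not the actual CompactorMR.CompactorMap API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hive code: writers are created on first use, so a
// compaction with no delete events never creates a delete writer.
class LazyDeleteWriterSketch {

    // Stand-in for a writer whose creation would materialize an output directory.
    static class RecordingWriter {
        final List<String> rows = new ArrayList<>();

        void write(String row) {
            rows.add(row);
        }
    }

    RecordingWriter insertWriter; // stays null until the first insert event
    RecordingWriter deleteWriter; // stays null until the first delete event

    void map(String row, boolean isDelete) {
        if (isDelete) {
            if (deleteWriter == null) {
                deleteWriter = new RecordingWriter();
            }
            deleteWriter.write(row);
        } else {
            if (insertWriter == null) {
                insertWriter = new RecordingWriter();
            }
            insertWriter.write(row);
        }
    }

    public static void main(String[] args) {
        LazyDeleteWriterSketch compactor = new LazyDeleteWriterSketch();
        compactor.map("row-1", false);
        compactor.map("row-2", false);
        System.out.println(compactor.deleteWriter == null); // prints true
    }
}
```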





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695220#comment-16695220
 ] 

Hive QA commented on HIVE-20932:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
39s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
27s{color} | {color:blue} druid-handler in master has 4 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} druid-handler: The patch generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15032/dev-support/hive-personality.sh
 |
| git revision | master / f5b14fc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15032/yetus/diff-checkstyle-druid-handler.txt
 |
| modules | C: ql druid-handler U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15032/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid to 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top 
> level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) straight to Vector primitive types. 





[jira] [Commented] (HIVE-20939) Branch 3.1 is unstable

2018-11-21 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695215#comment-16695215
 ] 

Igor Kryvenko commented on HIVE-20939:
--

[~ashutoshc] Hi Ashutosh. Is it ready to be committed? The {{branch-3.1}} is 
still broken. 

> Branch 3.1 is unstable
> --
>
> Key: HIVE-20939
> URL: https://issues.apache.org/jira/browse/HIVE-20939
> Project: Hive
>  Issue Type: Bug
>Reporter: Igor Kryvenko
>Assignee: Igor Kryvenko
>Priority: Blocker
> Attachments: HIVE-20939-branch-3.1.patch, 
> HIVE-20939.1-branch-3.1.patch
>
>
> The latest Travis build from {{branch-3.1}} failed. 
> It seems the dependency versions of the {{standalone-metastore}} and 
> {{upgrade-acid}} modules were not changed from {{3.1.1}} to 
> {{3.1.2-SNAPSHOT}}.





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.7.patch

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid to 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top 
> level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) straight to Vector primitive types. 





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695156#comment-16695156
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949082/HIVE-20932.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15546 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeOnTezEdges (batchId=324)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15031/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15031/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15031/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949082 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid to 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top 
> level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) straight to Vector primitive types. 





[jira] [Comment Edited] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695124#comment-16695124
 ] 

Eugene Koifman edited comment on HIVE-20941 at 11/21/18 7:14 PM:
-

Actually, I just realized that there may be some assumptions about delta and 
delete delta write ID ranges matching.  Let me check on that first.

Specifically somewhere in {{AcidUtils.getAcidState()}} or somewhere in 
{{CompactorMR}} where it figures out what files to include in compaction.


was (Author: ekoifman):
Actually, I just realized that there may be some assumptions about delta and 
delete delta write ID ranges matching.  Let me check on that first.

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Updated] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20949:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks [~ashutoshc]

> Improve PKFK cardinality estimation in physical planning
> 
>
> Key: HIVE-20949
> URL: https://issues.apache.org/jira/browse/HIVE-20949
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20949.01.patch, HIVE-20949.patch
>
>
> Missing case for cartesian product and full outer joins.





[jira] [Updated] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20949:
---
Fix Version/s: 4.0.0

> Improve PKFK cardinality estimation in physical planning
> 
>
> Key: HIVE-20949
> URL: https://issues.apache.org/jira/browse/HIVE-20949
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20949.01.patch, HIVE-20949.patch
>
>
> Missing case for cartesian product and full outer joins.





[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695124#comment-16695124
 ] 

Eugene Koifman commented on HIVE-20941:
---

Actually, I just realized that there may be some assumptions about delta and 
delete delta write ID ranges matching.  Let me check on that first.

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695102#comment-16695102
 ] 

Eugene Koifman commented on HIVE-20941:
---

+1

 

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695101#comment-16695101
 ] 

Hive QA commented on HIVE-20932:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 49s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 27s{color} | {color:blue} druid-handler in master has 4 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 11s{color} | {color:red} druid-handler generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 22s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15031/dev-support/hive-personality.sh |
| git revision | master / 9389a5a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-15031/yetus/diff-javadoc-javadoc-druid-handler.txt |
| modules | C: ql druid-handler U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15031/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 
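
For readers following the thread, the "wrapper around the existing Record Reader" idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patch: the class name BatchingReader is hypothetical, a plain java.util.Iterator stands in for Hive's record reader, and a java.util.List stands in for a vectorized row batch. Only the batch limit of 1024 comes from the description.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical sketch: wraps a row-at-a-time source and hands out rows in
 * batches of at most batchSize. The Iterator stands in for the existing
 * record reader; the List stands in for a vectorized row batch.
 */
final class BatchingReader<T> implements Iterator<List<T>> {
    /** Batch limit mentioned in the issue description. */
    static final int DEFAULT_BATCH_SIZE = 1024;

    private final Iterator<T> rows;
    private final int batchSize;

    BatchingReader(Iterator<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.rows = rows;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // A batch is available as long as the underlying reader has rows left.
        return rows.hasNext();
    }

    @Override
    public List<T> next() {
        if (!rows.hasNext()) {
            throw new NoSuchElementException();
        }
        // Pull rows one by one from the old reader until the batch is full
        // or the input is exhausted; the last batch may be shorter.
        List<T> batch = new ArrayList<>();
        while (batch.size() < batchSize && rows.hasNext()) {
            batch.add(rows.next());
        }
        return batch;
    }
}
```

The per-row cost of the old reader is still paid inside the loop; the wrapper only changes how rows are handed to downstream operators, which matches the "first cut" framing in the description.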



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695051#comment-16695051
 ] 

Hive QA commented on HIVE-20949:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949075/HIVE-20949.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15546 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15030/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15030/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15030/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949075 - PreCommit-HIVE-Build

> Improve PKFK cardinality estimation in physical planning
> 
>
> Key: HIVE-20949
> URL: https://issues.apache.org/jira/browse/HIVE-20949
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20949.01.patch, HIVE-20949.patch
>
>
> Missing case for cartesian product and full outer joins.





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.6.patch

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 





[jira] [Commented] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694995#comment-16694995
 ] 

Hive QA commented on HIVE-20949:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 39s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 37s{color} | {color:red} ql: The patch generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 33s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15030/dev-support/hive-personality.sh |
| git revision | master / 9389a5a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15030/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15030/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improve PKFK cardinality estimation in physical planning
> 
>
> Key: HIVE-20949
> URL: https://issues.apache.org/jira/browse/HIVE-20949
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20949.01.patch, HIVE-20949.patch
>
>
> Missing case for cartesian product and full outer joins.





[jira] [Updated] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20949:
---
Attachment: HIVE-20949.01.patch

> Improve PKFK cardinality estimation in physical planning
> 
>
> Key: HIVE-20949
> URL: https://issues.apache.org/jira/browse/HIVE-20949
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20949.01.patch, HIVE-20949.patch
>
>
> Missing case for cartesian product and full outer joins.





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694922#comment-16694922
 ] 

Hive QA commented on HIVE-20932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949064/HIVE-20932.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15546 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=197)
[druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15029/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15029/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15029/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949064 - PreCommit-HIVE-Build

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694920#comment-16694920
 ] 

ASF GitHub Bot commented on HIVE-20932:
---

GitHub user b-slim opened a pull request:

https://github.com/apache/hive/pull/493

HIVE-20932 Adding Vectorize code to Druid storage handler (Slim B)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/b-slim/hive HIVE-20932

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/493.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #493


commit cbf555f2b54a3b5eaca0c98c7067b420ad488c08
Author: Slim Bouguerra 
Date:   2018-11-20T22:49:45Z

HIVE-20932 Adding Vectorize code to Druid storage handler (Slim B)

Change-Id: I1a95bbe0f1d0e3a5452111cd1d2262d4253dbdcb




> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-20932:
--
Labels: pull-request-available  (was: )

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694883#comment-16694883
 ] 

Hive QA commented on HIVE-20932:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 25s{color} | {color:blue} druid-handler in master has 4 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m  9s{color} | {color:red} druid-handler: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 11s{color} | {color:red} druid-handler generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15029/dev-support/hive-personality.sh |
| git revision | master / 9389a5a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15029/yetus/diff-checkstyle-druid-handler.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-15029/yetus/diff-javadoc-javadoc-druid-handler.txt |
| modules | C: ql druid-handler U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15029/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch adds support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to bypass the old reader and convert the JSON (Smile 
> format) directly to vector primitive types. 

[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException

2018-11-21 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694877#comment-16694877
 ] 

slim bouguerra commented on HIVE-20955:
---

FYI, this is true for any other Hive table as well.

 

> Calcite Rule HiveExpandDistinctAggregatesRule seems throwing 
> IndexOutOfBoundsException
> --
>
> Key: HIVE-20955
> URL: https://issues.apache.org/jira/browse/HIVE-20955
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: slim bouguerra
>Priority: Major
>
>  
> Adding the following query to the Druid test 
> ql/src/test/queries/clientpositive/druidmini_expressions.q
> {code}
> select count(distinct `__time`, cint) from (select * from 
> druid_table_alltypesorc) as src;
> {code}
> leads to the error
> {code}
> 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client 
> execution failed with error code = 4 running "
> {code}
> with the exception stack 
> {code}
> 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41)
>  ~[guava-19.0.jar:?]
>  at 
> org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) 
> ~[?:?]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) 
> ~[?:?]
>  at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:478)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12296)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358)
>  

[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException

2018-11-21 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694862#comment-16694862
 ] 

slim bouguerra commented on HIVE-20955:
---

cc [~vgarg] / [~ashutoshc], any ideas?

 

> Calcite Rule HiveExpandDistinctAggregatesRule seems throwing 
> IndexOutOfBoundsException
> --
>
> Key: HIVE-20955
> URL: https://issues.apache.org/jira/browse/HIVE-20955
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: slim bouguerra
>Priority: Major
>
>  
> Adding the following query to the Druid test 
> ql/src/test/queries/clientpositive/druidmini_expressions.q
> {code}
> select count(distinct `__time`, cint) from (select * from 
> druid_table_alltypesorc) as src;
> {code}
> leads to the error
> {code}
> 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client 
> execution failed with error code = 4 running "
> {code}
> with the exception stack 
> {code}
> 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41)
>  ~[guava-19.0.jar:?]
>  at 
> org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) 
> ~[?:?]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) 
> ~[?:?]
>  at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:478)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12296)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358)
>  

[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694849#comment-16694849
 ] 

Igor Kryvenko commented on HIVE-20941:
--

Failures are not related.

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> See the example in HIVE-20901.
>  
> The fix is probably to change the logic in CompactorMR.CompactorMap.map() 
> that creates the delete event writer.
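
One common way to avoid producing an empty output when there is nothing to write is to create the writer lazily, on the first delete event, instead of up front. The sketch below illustrates that general pattern only; it is not the logic in CompactorMR or in the attached patches. The class LazyDeleteWriter, its methods, and the List-based sink are all hypothetical stand-ins chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/**
 * Hypothetical sketch of lazy writer creation. The Supplier stands in for
 * whatever opens the delete-delta output; it is only invoked once a delete
 * event is actually seen, so input without deletes creates no output.
 */
final class LazyDeleteWriter {
    private final Supplier<List<String>> sinkFactory;
    private List<String> sink; // stays null until the first delete event

    LazyDeleteWriter(Supplier<List<String>> sinkFactory) {
        this.sinkFactory = sinkFactory;
    }

    /** Handles one input event; only delete events touch the sink. */
    void onEvent(boolean isDelete, String row) {
        if (!isDelete) {
            return; // insert events never trigger creation of the delete sink
        }
        if (sink == null) {
            sink = sinkFactory.get(); // created on first use only
        }
        sink.add(row);
    }

    /** True if at least one delete event caused the sink to be created. */
    boolean created() {
        return sink != null;
    }

    public static void main(String[] args) {
        LazyDeleteWriter writer = new LazyDeleteWriter(ArrayList::new);
        writer.onEvent(false, "insert-1");
        System.out.println("created after insert only: " + writer.created());
        writer.onEvent(true, "delete-1");
        System.out.println("created after a delete: " + writer.created());
    }
}
```

Whether this pattern fits the compactor depends on details not shown in this thread, such as how the writer is closed and committed.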





[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694847#comment-16694847
 ] 

Hive QA commented on HIVE-20941:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949037/HIVE-20941.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15528 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=196)
[druidmini_dynamic_partition.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_test_insert.q,druidkafkamini_delimited.q]
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=154)
[intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15028/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15028/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15028/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949037 - PreCommit-HIVE-Build

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694846#comment-16694846
 ] 

slim bouguerra commented on HIVE-20932:
---

[~nishantbangarwa] This should enable the vectorized pipeline, which means less object 
creation and SIMD on arithmetic, and therefore faster execution, at least in theory :D 

Working on a real benchmark.

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top-level 
> operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader to 
> read up to 1024 rows at a time. 
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) straight to vector primitive types. 
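
The "wrapper around the existing Record Reader" idea can be sketched as a loop that pulls rows one at a time from the old reader and hands them on in batches of at most 1024. This is only an illustration: a plain `Iterator` stands in for the record reader, and `nextBatch` and `batchSizes` are hypothetical helpers, not the API used in the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchingReaderSketch {
    /** Batch size taken from the issue description. */
    static final int MAX_BATCH_SIZE = 1024;

    /** Pulls up to maxBatch rows from a row-at-a-time reader; an empty list means the input is exhausted. */
    static <T> List<T> nextBatch(Iterator<T> rowReader, int maxBatch) {
        List<T> batch = new ArrayList<>(maxBatch);
        while (batch.size() < maxBatch && rowReader.hasNext()) {
            batch.add(rowReader.next());
        }
        return batch;
    }

    /** Reads totalRows synthetic rows and returns the size of each batch produced. */
    static List<Integer> batchSizes(int totalRows) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            rows.add(i);
        }
        Iterator<Integer> reader = rows.iterator();
        List<Integer> sizes = new ArrayList<>();
        List<Integer> batch = nextBatch(reader, MAX_BATCH_SIZE);
        while (!batch.isEmpty()) {
            sizes.add(batch.size());
            batch = nextBatch(reader, MAX_BATCH_SIZE);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 2500 rows are delivered as two full batches and one partial batch.
        System.out.println(batchSizes(2500)); // prints [1024, 1024, 452]
    }
}
```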





[jira] [Updated] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-21 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20932:
--
Attachment: HIVE-20932.5.patch

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the top-level 
> operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader to 
> read up to 1024 rows at a time. 
> Future work will be to avoid going through the old reader and to convert the 
> JSON (Smile format) straight to vector primitive types. 





[jira] [Commented] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694801#comment-16694801
 ] 

Hive QA commented on HIVE-20941:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 47s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 1 new + 577 unchanged - 2 fixed = 578 total (was 579) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 15s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 169 unchanged - 2 fixed = 170 total (was 171) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15028/dev-support/hive-personality.sh |
| git revision | master / 9389a5a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15028/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15028/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15028/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694762#comment-16694762
 ] 

Hive QA commented on HIVE-20954:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949035/HIVE-20954.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 15549 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[keep_uniform] (batchId=78)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_smb_mapjoin_14] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_9] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_exists] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_in] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_not_in] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cluster] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3] (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[keep_uniform] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[partialdhj] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in_having] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_select] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[table_access_keys_stats] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_reduce_side] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=166)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15027/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15027/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15027/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949035 - PreCommit-HIVE-Build

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 

[jira] [Commented] (HIVE-20952) Cleaning VectorizationContext.java

2018-11-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694753#comment-16694753
 ] 

ASF GitHub Bot commented on HIVE-20952:
---

Github user b-slim closed the pull request at:

https://github.com/apache/hive/pull/490


> Cleaning VectorizationContext.java
> --
>
> Key: HIVE-20952
> URL: https://issues.apache.org/jira/browse/HIVE-20952
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20952.patch, HIVE-20952.patch
>
>






[jira] [Commented] (HIVE-20951) LLAP: Set Xms to 50% always

2018-11-21 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694739#comment-16694739
 ] 

slim bouguerra commented on HIVE-20951:
---

[~gopalv] I am wondering whether it is possible to make those off-heap structures part of a 
fixed-size pool that gets allocated eagerly at start time?

> LLAP: Set Xms to 50% always 
> 
>
> Key: HIVE-20951
> URL: https://issues.apache.org/jira/browse/HIVE-20951
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>
> The lack of GC pauses is killing LLAP containers whenever a significant 
> amount of memory is consumed by the off-heap structures, which aren't cleaned 
> up automatically until the GC runs.
> There's a java.nio.DirectByteBuffer.Deallocator which runs when the direct 
> buffers are garbage collected, which actually does the cleanup of the 
> underlying off-heap buffers.
> The lack of garbage collection activity for several hours while responding to 
> queries triggers a build-up of these off-heap structures, which ends up forcing 
> YARN to kill the process instead.
> It is better to hit a GC pause occasionally rather than to lose a node every 
> few hours.
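
To make the off-heap build-up described above observable, the sketch below allocates a direct buffer and reads the JVM's "direct" buffer pool through the standard `BufferPoolMXBean`. The class and method names are illustrative only; the point is that this memory sits outside the Java heap and, as the description says, is released only after the owning buffer object has been garbage collected.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectBufferSketch {
    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if the pool is not reported. */
    static long directPoolBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** Allocates an off-heap buffer; the native memory is not part of the Java heap. */
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        long before = directPoolBytesUsed();
        ByteBuffer buffer = allocate(16 * 1024 * 1024);
        long after = directPoolBytesUsed();
        System.out.println("isDirect=" + buffer.isDirect()
            + " poolBefore=" + before + " poolAfter=" + after);
        // Dropping the reference does not free the native memory by itself:
        // it is reclaimed only once a later GC cycle collects the buffer object.
    }
}
```

With little heap pressure such a collection may not happen for a long time, which is the situation the issue describes.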





[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694709#comment-16694709
 ] 

Hive QA commented on HIVE-20954:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 44s{color} | {color:blue} ql in master has 2318 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 36s{color} | {color:red} ql: The patch generated 10 new + 22 unchanged - 1 fixed = 32 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  6s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15027/dev-support/hive-personality.sh |
| git revision | master / 61a57e9 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15027/yetus/diff-checkstyle-ql.txt |
| modules | C: ql itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15027/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE 

[jira] [Commented] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694673#comment-16694673
 ] 

Hive QA commented on HIVE-20897:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949015/HIVE-20897.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15546 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15026/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15026/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15026/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949015 - PreCommit-HIVE-Build

> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can record whether the query has a result set. However, in the current 
> code, the response for the query is generated without checking whether the 
> result set field has been set. 
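
The race in the description can be pictured with a three-state flag: unset while the asynchronous prepare is still running, then true or false. The sketch below is purely illustrative; `OperationState` and `describe` are hypothetical names and do not correspond to the HiveServer2 classes or to the attached patches.

```java
public class AsyncResultSetFlagSketch {
    /** Hypothetical operation state shared between the driver thread and the response path. */
    static final class OperationState {
        // null means "not yet known": async prepare has not finished.
        private volatile Boolean hasResultSet;

        void setHasResultSet(boolean value) {
            hasResultSet = value;
        }

        Boolean hasResultSetOrNull() {
            return hasResultSet;
        }
    }

    /** Builds a response value, checking first whether the flag was set at all. */
    static String describe(OperationState state) {
        Boolean flag = state.hasResultSetOrNull(); // read once to avoid a torn check
        if (flag == null) {
            return "UNKNOWN"; // caller should ask again later instead of assuming false
        }
        return flag ? "HAS_RESULT_SET" : "NO_RESULT_SET";
    }

    public static void main(String[] args) {
        OperationState state = new OperationState();
        System.out.println(describe(state)); // prints UNKNOWN
        state.setHasResultSet(true);
        System.out.println(describe(state)); // prints HAS_RESULT_SET
    }
}
```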





[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20954:
--
Status: Patch Available  (was: Open)

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-20954.1.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}
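
As a generic illustration of why the choice of hash function matters for row distribution, the sketch below partitions the same keys with two different functions. Both functions are stand-ins chosen for the example (a plain `Long.hashCode` and a 64-bit bit-mixing finalizer); they are assumptions and are not claimed to be the functions used by VectorReduceSinkLongOperator or VectorReduceSinkObjectHashOperator.

```java
public class HashSkewSketch {
    /** Partition by the plain long hash; regular key patterns can collapse into few partitions. */
    static int partitionPlain(long key, int numPartitions) {
        return Math.floorMod(Long.hashCode(key), numPartitions);
    }

    /** Partition after a 64-bit bit-mixing step so that regular key patterns are spread out. */
    static int partitionMixed(long key, int numPartitions) {
        long h = key;
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return (int) Math.floorMod(h, (long) numPartitions);
    }

    /** Counts how many partitions receive at least one of the keys 0, stride, 2*stride, ... */
    static int usedPartitions(boolean mixed, int numPartitions, int numKeys, int stride) {
        boolean[] seen = new boolean[numPartitions];
        int used = 0;
        for (int i = 0; i < numKeys; i++) {
            long key = (long) i * stride;
            int p = mixed ? partitionMixed(key, numPartitions) : partitionPlain(key, numPartitions);
            if (!seen[p]) {
                seen[p] = true;
                used++;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        // Keys that are multiples of 16 all land in partition 0 under the plain hash.
        System.out.println("plain: " + usedPartitions(false, 16, 10000, 16));
        System.out.println("mixed: " + usedPartitions(true, 16, 10000, 16));
    }
}
```

If two branches of a join partition the same key column with different functions, a pattern that is harmless for one can be heavily skewed for the other.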





[jira] [Updated] (HIVE-20941) Compactor produces a delete_delta_x_y even if there are no input delete events

2018-11-21 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-20941:
-
Attachment: HIVE-20941.02.patch

> Compactor produces a delete_delta_x_y even if there are no input delete events
> --
>
> Key: HIVE-20941
> URL: https://issues.apache.org/jira/browse/HIVE-20941
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20941.01.patch, HIVE-20941.02.patch
>
>
> see example in HIVE-20901
>  
> Probably change logic in CompactorMR.CompactorMap.map() which creates delete 
> event writer





[jira] [Commented] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694609#comment-16694609
 ] 

Hive QA commented on HIVE-20897:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  6s{color} | {color:green} The patch service-rpc passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} service: The patch generated 0 new + 60 unchanged - 1 fixed = 60 total (was 61) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 36s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15026/dev-support/hive-personality.sh |
| git revision | master / 61a57e9 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-15026/yetus/whitespace-eol.txt |
| modules | C: service-rpc service U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15026/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can record whether the query has a result set. However, in the current 
> code, the response for the query is generated without checking whether the 
> result set field has been set. 





[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694601#comment-16694601
 ] 

ASF GitHub Bot commented on HIVE-20954:
---

GitHub user pudidic opened a pull request:

https://github.com/apache/hive/pull/492

HIVE-20954: Vector RS operator is not using uniform hash function for…

… TPC-DS query 95 (Teddy Choi)

Change-Id: Ia23b5ddefc2b35cda9ed7d817bdbd767ec7f7671

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pudidic/hive HIVE-20954

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/492.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #492


commit 17dd49c160eeea7b5a511e9a6801e2e9b298ac1a
Author: Teddy Choi 
Date:   2018-11-21T11:55:04Z

HIVE-20954: Vector RS operator is not using uniform hash function for 
TPC-DS query 95 (Teddy Choi)

Change-Id: Ia23b5ddefc2b35cda9ed7d817bdbd767ec7f7671




> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}
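
To illustrate why a non-uniform hash function can skew row distribution, here is a small Python sketch. It does not reproduce what VectorReduceSinkObjectHashOperator or VectorReduceSinkLongOperator actually compute; `identity_partition` and `mixed_partition` are hypothetical partitioners chosen only to show the effect, and the key pattern (multiples of 8) is an assumption.

```python
from collections import Counter

NUM_PARTITIONS = 8
MASK64 = (1 << 64) - 1


def identity_partition(key: int) -> int:
    # Hypothetical non-mixing partitioner: uses the key value directly.
    return key % NUM_PARTITIONS


def mixed_partition(key: int) -> int:
    # Hypothetical mixing partitioner (multiplicative hashing): multiply by
    # a 64-bit odd constant and keep the top 3 bits (2**3 == 8 partitions).
    h = (key * 0x9E3779B97F4A7C15) & MASK64
    return h >> 61


def partition_sizes(keys, partition_fn):
    """Count how many keys land in each partition."""
    counts = Counter(partition_fn(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Assumed key pattern: every key is a multiple of the partition count.
keys = [8 * i for i in range(1000)]
identity_sizes = partition_sizes(keys, identity_partition)
mixed_sizes = partition_sizes(keys, mixed_partition)
```

With these keys the non-mixing partitioner sends every row to a single partition, while the mixing one spreads them across all eight.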





[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-20954:
--
Labels: pull-request-available  (was: )

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch
>
>





[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20954:
--
Attachment: HIVE-20954.1.patch

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-20954.1.patch
>
>





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694589#comment-16694589
 ] 

Hive QA commented on HIVE-20953:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949007/HIVE-20953.01

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15546 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit]
 (batchId=171)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15025/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15025/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15025/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949007 - PreCommit-HIVE-Build

> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads them to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Depending on the order in which the objects 
> get loaded, if the function is queued after the table, it will not be 
> available in the replica after the load failure. But if it is queued before 
> the table, it will be available in the replica even after the load failure. 
> The test assumes the latter case, which may not always be true.
> Hence fix the testcase to load the objects in a fixed order. By setting 
> hive.in.repl.test.files.sorted to true, the objects are ordered by their 
> directory names. This ordering is available with minimal changes for testing, 
> hence we use it. With this ordering a function gets loaded before a table. So 
> the test is changed to not expect the function to be available after the 
> failed load, but to be available after the retry.
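
The order dependence described above can be sketched in a few lines of Python. This is a toy model, not Hive's REPL LOAD code: `load_until_failure` and the directory names `func_dir` and `table_dir` are hypothetical, and the failure is simulated by stopping at the table.

```python
from itertools import permutations


def load_until_failure(load_order, failing_object):
    """Load objects in order and stop at the first one that fails."""
    loaded = []
    for name in load_order:
        if name == failing_object:
            break
        loaded.append(name)
    return loaded


# Hypothetical dump directory names: one function and one partitioned table.
dump_dirs = ["func_dir", "table_dir"]

# Without a fixed order, the outcome depends on the listing order.
unsorted_results = {
    tuple(load_until_failure(order, "table_dir"))
    for order in permutations(dump_dirs)
}

# Sorting by directory name gives the same outcome for every listing order.
sorted_results = {
    tuple(load_until_failure(sorted(order), "table_dir"))
    for order in permutations(dump_dirs)
}
```

`unsorted_results` contains two different outcomes, while `sorted_results` contains exactly one, which is what lets a test assert on a single expected state after the failed load.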





[jira] [Updated] (HIVE-20924) Property 'hive.driver.parallel.compilation.global.limit' should be immutable at runtime

2018-11-21 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20924:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

Thanks for the patch [~dkuzmenko]!

> Property 'hive.driver.parallel.compilation.global.limit' should be immutable 
> at runtime
> ---
>
> Key: HIVE-20924
> URL: https://issues.apache.org/jira/browse/HIVE-20924
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20924.1.patch, HIVE-20924.2.patch, 
> HIVE-20924.3.patch, HIVE-20924.4.patch, HIVE-20924.5.patch
>
>






[jira] [Assigned] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-21 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi reassigned HIVE-20954:
-


> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>





[jira] [Commented] (HIVE-20953) Fix testcase TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions to not depend upon the order in which objects get loaded

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694550#comment-16694550
 ] 

Hive QA commented on HIVE-20953:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} itests/hive-unit: The patch generated 2 new + 125 
unchanged - 3 fixed = 127 total (was 128) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15025/dev-support/hive-personality.sh
 |
| git revision | master / 59d85d7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15025/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: itests/hive-unit U: itests/hive-unit |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15025/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix testcase 
> TestReplicationScenariosAcrossInstances#testBootstrapReplLoadRetryAfterFailureForPartitions
>  to not depend upon the order in which objects get loaded
> -
>
> Key: HIVE-20953
> URL: https://issues.apache.org/jira/browse/HIVE-20953
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20953.01
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The testcase is intended to test REPL LOAD with retry. The test creates a 
> partitioned table and a function in the source database and loads them to 
> the replica. The first attempt to load a dump is intended to fail while 
> loading one of the partitions. Depending on the order in which the objects 
> get loaded, if the function is queued after the table, it will not be 
> available in the replica after the load failure. But if it is queued before 
> the table, it will be available in the replica even after the load failure. 
> The test assumes the latter case, which may not always be true.
> Hence fix the testcase to load the objects in a fixed order. By setting 

[jira] [Commented] (HIVE-20915) Make dynamic sort partition optimization available to HoS and MR

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694483#comment-16694483
 ] 

Hive QA commented on HIVE-20915:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948993/HIVE-20915.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 55 failed/errored test(s), 15545 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] 
(batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_1] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] 
(batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_6] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_8] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_partition_insert]
 (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_partial]
 (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[implicit_cast_during_insert]
 (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_dyn_part]
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_into6] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part10] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part1] 
(batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part3] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part4] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part8] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part9] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge3] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge4] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition2]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition3]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition4]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition5]
 (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_int_type_promotion] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge10] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge1] (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge2] (batchId=97)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_diff_fs] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats2] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats4] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_empty_dyn_part] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_part2] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_17] 
(batchId=75)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_incompat2]
 (batchId=192)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part10] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part1] 
(batchId=149)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part3] 
(batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part4] 
(batchId=139)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part5] 
(batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part8] 
(batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part9] 
(batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[stats2] 
(batchId=138)

[jira] [Updated] (HIVE-20818) Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-21 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20818:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

Thanks for the patch [~klcopp]!

> Views created with a WHERE subquery will regard views referenced in the 
> subquery as direct input
> 
>
> Key: HIVE-20818
> URL: https://issues.apache.org/jira/browse/HIVE-20818
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20818.2.patch, HIVE-20818.3.patch, 
> HIVE-20818.4.patch, HIVE-20818.5.patch, HIVE-20818.6.patch, HIVE-20818.patch
>
>
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view (view') that the 
> user has no access to, the user cannot access the view because view' is 
> considered direct input.
> For example:
> {code:java}
> create database db1;
> create database db2;
> create database db3;
>  
> create table db1.table1 (cola string, colb string, colc string);
> insert into db1.table1 values ('a','b','c');
> insert into db1.table1 values ('x','y','z');
> CREATE VIEW db2.view1 AS SELECT cola, colb, colc FROM db1.table1 WHERE 
> cola="x"; 
> CREATE VIEW db2.view2 AS SELECT table1.cola, table1.colb, table1.colc FROM 
> db1.table1 WHERE table1.cola NOT IN (SELECT view1.cola FROM db2.view1); 
> create view db3.view3 as select * from db2.view2;
> {code}
>  If test_user has read permission for only db3 (but not db1 or db2), their 
> query
> {code:java}
> select * from db3.view3;{code}
> will fail with:
> {code:java}
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges User test_user does not have privileges for QUERY The required 
> privileges: Server=server1->Db=db2->Table=view1->action=select; {code}
> WHERE IN and WHERE EXISTS cause the same issue.
> Cascading views created with no WHERE clauses (i.e. with simple SELECTs and 
> FROM clauses) work fine.
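
The difference between direct and indirect input can be illustrated with a toy model in Python. It is only a sketch of the behavior the report expects, not Hive's or Sentry's API: `VIEW_REFERENCES`, `transitive_inputs`, and `can_read` are hypothetical names, and permissions are simplified to a set of readable databases.

```python
# Hypothetical view graph taken from the example above: each view maps to
# the objects its definition references.
VIEW_REFERENCES = {
    "db2.view1": ["db1.table1"],
    "db2.view2": ["db1.table1", "db2.view1"],
    "db3.view3": ["db2.view2"],
}


def transitive_inputs(obj):
    """Return obj plus everything reachable through view definitions."""
    seen = set()
    stack = [obj]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(VIEW_REFERENCES.get(current, []))
    return seen


def can_read(readable_dbs, objects):
    """True if every object lives in a database the user may read."""
    return all(name.split(".")[0] in readable_dbs for name in objects)


test_user_dbs = {"db3"}

# Only the object named in the query is treated as direct input.
direct_only = can_read(test_user_dbs, {"db3.view3"})

# Views referenced inside the definitions are (wrongly) checked as well.
with_indirect = can_read(test_user_dbs, transitive_inputs("db3.view3"))
```

In this model `direct_only` succeeds while `with_indirect` fails on the db1 and db2 objects, which mirrors the privilege error quoted above.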





[jira] [Commented] (HIVE-19098) Hive: impossible to insert data in a parquet's table with "union all" in the select query

2018-11-21 Thread Ke Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694476#comment-16694476
 ] 

Ke Zhang commented on HIVE-19098:
-

This bug may be related to HIVE-16958, which shows similar failure behavior 
(in Hive 2.2 and 2.3), and may be fixed in Hive 3.0. 

> Hive: impossible to insert data in a parquet's table with "union all" in the 
> select query
> -
>
> Key: HIVE-19098
> URL: https://issues.apache.org/jira/browse/HIVE-19098
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Hive
>Affects Versions: 2.3.2
>Reporter: ACOSS
>Assignee: Peter Vary
>Priority: Minor
>
> Hello
> We have a Parquet table.
> We want to insert data into the table with a query like this:
> "insert into my_table select * from my_select_table_1 union all select * from 
> my_select_table_2"
> It fails with the error:
> 2018-04-03 15:49:28,898 FATAL [IPC Server handler 2 on 38465] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
> attempt_1522749003448_0028_m_00_0 - exited : java.io.IOException: 
> java.lang.reflect.InvocationTargetException
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:217)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:345)
>  at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:695)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
>  ... 11 more
> Caused by: java.lang.NullPointerException
>  at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ProjectionPusher.pushProjectionsAndFilters(ProjectionPusher.java:118)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ProjectionPusher.pushProjectionsAndFilters(ProjectionPusher.java:189)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:60)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:99)
>  ... 16 more
>  
> Scenario:
> create table t1 (col1 string);
> create table t2 (col1 string);
> insert into t2 values ('2017');
> insert into t1 values ('2017');
> create table t3 (col1 string) STORED AS PARQUETFILE;
>  INSERT into t3 select col1 from t1 union all select col1 from t2; 
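
The innermost frame of the trace above is java.util.AbstractCollection.addAll, which throws a NullPointerException when the collection passed to it is null. The following is a minimal sketch of that failure mode and of one possible guard. The class NullSafeAddAll, the method addAllNullSafe, and the variable names are hypothetical illustrations; this is not the ProjectionPusher code and not necessarily how the issue was fixed.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not taken from the Hive code base.
final class NullSafeAddAll {

    // Adds every element of source to target, treating a null source as empty.
    static <T> void addAllNullSafe(Collection<T> target, Collection<? extends T> source) {
        if (source != null) {
            target.addAll(source);
        }
    }

    public static void main(String[] args) {
        // HashSet inherits addAll from AbstractCollection, as in the trace.
        Set<String> neededColumns = new HashSet<>();
        // Assumption: stands in for a column list that was never populated.
        List<String> missing = null;

        try {
            neededColumns.addAll(missing);
        } catch (NullPointerException e) {
            System.out.println("addAll(null) threw NullPointerException");
        }

        // The guarded variant leaves the target unchanged instead of throwing.
        addAllNullSafe(neededColumns, missing);
        System.out.println("size after guarded call: " + neededColumns.size());
    }
}
```

Whether skipping a null list is the right behaviour depends on why the list is null in the first place; the guard only shows where the exception originates.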



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20915) Make dynamic sort partition optimization available to HoS and MR

2018-11-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694419#comment-16694419
 ] 

Hive QA commented on HIVE-20915:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 45s{color} | {color:blue} ql in master has 2318 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 8 new + 45 unchanged - 0 fixed = 53 total (was 45) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15024/dev-support/hive-personality.sh |
| git revision | master / 0d0721f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15024/yetus/diff-checkstyle-ql.txt |
| modules | C: ql itests itests/hive-blobstore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15024/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Make dynamic sort partition optimization available to HoS and MR
> 
>
> Key: HIVE-20915
> URL: https://issues.apache.org/jira/browse/HIVE-20915
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
> Attachments: HIVE-20915.1.patch
>
>
> HIVE-20703 put the dynamic sort partition optimization under a cost-based 
> decision, but it also made the optimization available only to Tez. 
> hive.optimize.sort.dynamic.partition has worked with other execution engines 
> for a long time, so we should keep the optimization available to them. 





[jira] [Commented] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694412#comment-16694412
 ] 

Laszlo Bodor commented on HIVE-20897:
-

[~maheshk114] : I uploaded 04.patch to get clean results

[~sankarh] : could you please review?

> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can set whether the query has a result set. But in the current code, 
> the response for the query is generated without checking whether the result 
> set field has been set. 
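
The race described above can be sketched as a status object whose "has result set" flag stays unset until a background step fills it in, so that the code building the response has to check for the unset case. This is only an illustration under stated assumptions: AsyncStatusSketch, OperationState, and describeForResponse are hypothetical names, not HiveServer2 classes, and the attached patches may solve the problem differently.

```java
import java.util.Optional;

// Hypothetical types for illustration; these are not HiveServer2 classes.
final class AsyncStatusSketch {

    // State that a background "prepare" step fills in after control has
    // already returned to the client.
    static final class OperationState {
        // null means the driver has not decided yet.
        private volatile Boolean hasResultSet;

        void setHasResultSet(boolean value) {
            hasResultSet = value;
        }

        // Exposes the flag as Optional so callers must handle the unset case.
        Optional<Boolean> hasResultSet() {
            return Optional.ofNullable(hasResultSet);
        }
    }

    // Builds the text placed into a response; it never assumes the flag is set.
    static String describeForResponse(OperationState state) {
        return state.hasResultSet()
                .map(flag -> "hasResultSet=" + flag)
                .orElse("hasResultSet not set yet");
    }

    public static void main(String[] args) {
        OperationState state = new OperationState();
        // Response requested before the driver has set the flag.
        System.out.println(describeForResponse(state));
        // Response requested after the driver has set the flag.
        state.setHasResultSet(true);
        System.out.println(describeForResponse(state));
    }
}
```

A client that receives the "not set yet" case would have to ask again later; reading the flag without the check is what the description identifies as the problem.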





[jira] [Updated] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20897:

Attachment: HIVE-20897.04.patch

> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: Laszlo Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can set whether the query has a result set. But in the current code, 
> the response for the query is generated without checking whether the result 
> set field has been set. 





[jira] [Assigned] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20897:
---

Assignee: Laszlo Bodor  (was: mahesh kumar behera)

> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: Laszlo Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can set whether the query has a result set. But in the current code, 
> the response for the query is generated without checking whether the result 
> set field has been set. 





[jira] [Assigned] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error

2018-11-21 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20897:
---

Assignee: mahesh kumar behera  (was: Laszlo Bodor)

> TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
> 
>
> Key: HIVE-20897
> URL: https://issues.apache.org/jira/browse/HIVE-20897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20897.01.patch, HIVE-20897.02.patch, 
> HIVE-20897.03.patch, HIVE-20897.04.patch
>
>
> If async prepare is enabled, control is returned to the client before the 
> driver can set whether the query has a result set. But in the current code, 
> the response for the query is generated without checking whether the result 
> set field has been set. 




