[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363599#comment-16363599
 ] 

Deepak Jaiswal commented on HIVE-18622:
---

I had a brief chat with Matt about this. Here is the problem as he explained 
it to me; I hope it helps, [~sershe].

 

The problem is with ColumnVector reuse.
For an intermediate calculation we grab a (statically) allocated ColumnVector, 
generate output into it, and pass it on to another vector expression; that 
ColumnVector is then implicitly returned so it is available to another vector 
expression.

The pattern people were using was outputColVector.noNulls = 
inputColVector.noNulls.
The problem is that the ColumnVector.reset() method *assumes* that if noNulls 
is true, all isNull entries are false. And a huge amount of code was assuming 
the same thing: that it did not have to set isNull entries if noNulls is true. 
The crux of the issue, though, is that the outputColVector.noNulls flag is 
basically corrupted if you set it from the inputColVector.
So if vector expression #1 sets one row as NULL by doing 
outputColVector.isNull[batchIndex] = true and outputColVector.noNulls = false, 
that works for the current expression. But if the next vector expression #2 
reuses outputColVector and sets noNulls to true, we have an isNull array with a 
true lurking in it. The output of #2 for that row will appear to other code as 
NULL, which is wrong.
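
To make the sequence above concrete, here is a minimal Java sketch. ScratchLongColumn is a simplified, hypothetical stand-in and not Hive's actual ColumnVector; its reset() only models the assumption described above (isNull is cleared only when noNulls is false). The names exprOne and exprTwo are likewise made up for illustration.

```java
import java.util.Arrays;

/** Shows a stale isNull entry surviving reuse of a scratch column. */
final class StaleIsNullDemo {

  /** Expression #1: produces a NULL in row 1 of the scratch column. */
  static void exprOne(ScratchLongColumn out) {
    out.vector[0] = 10L;
    out.isNull[1] = true;
    out.noNulls = false;
  }

  /**
   * Expression #2 (the risky pattern): copies noNulls from its input
   * and never writes to the output's isNull array.
   */
  static void exprTwo(ScratchLongColumn in, ScratchLongColumn out, int size) {
    out.noNulls = in.noNulls;
    for (int i = 0; i < size; i++) {
      out.vector[i] = in.vector[i] + 1L;
    }
  }

  public static void main(String[] args) {
    ScratchLongColumn input = new ScratchLongColumn(2);   // contains no NULLs
    ScratchLongColumn scratch = new ScratchLongColumn(2); // reused output column

    exprOne(scratch);
    exprTwo(input, scratch, 2);
    System.out.println("noNulls=" + scratch.noNulls
        + " isNull=" + Arrays.toString(scratch.isNull));

    scratch.reset();
    System.out.println("after reset: isNull=" + Arrays.toString(scratch.isNull));
  }
}

/** Simplified stand-in for a long column vector (assumption, not Hive's class). */
final class ScratchLongColumn {
  final long[] vector;
  final boolean[] isNull;
  boolean noNulls = true;

  ScratchLongColumn(int size) {
    vector = new long[size];
    isNull = new boolean[size];
  }

  /** Models the assumption: isNull is only cleared when noNulls is false. */
  void reset() {
    if (!noNulls) {
      Arrays.fill(isNull, false);
    }
    noNulls = true;
  }
}
```

After exprTwo runs, noNulls is true but isNull[1] is still true, and reset() does not clear it because it trusts the flag. Any later code that looks at isNull directly will see row 1 as NULL.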

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true, which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false, and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> And vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And all places that assign a column value need to set noNulls to false when 
> the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
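
The rules in the description above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the attached patches: Col, addOne and isRowNull are hypothetical names, and Col is a simplified stand-in for a scratch column rather than Hive's real class.

```java
/** Sketch of an expression that never trusts stale null state in its output. */
final class CarefulNullsSketch {

  /** Hypothetical stand-in for a reusable scratch column (not Hive's class). */
  static final class Col {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;

    Col(int size) {
      vector = new long[size];
      isNull = new boolean[size];
    }
  }

  /** Computes out[i] = in[i] + 1 and writes isNull for every row it produces. */
  static void addOne(Col in, Col out, int size) {
    if (in.noNulls) {
      for (int i = 0; i < size; i++) {
        // Explicitly overwrite whatever a previous user left behind.
        out.isNull[i] = false;
        out.vector[i] = in.vector[i] + 1L;
      }
      // out.noNulls is deliberately NOT set to true here.
    } else {
      // Carefully handle NULLs: flag first, then a branch-free loop.
      out.noNulls = false;
      for (int i = 0; i < size; i++) {
        out.isNull[i] = in.isNull[i];
        out.vector[i] = in.vector[i] + 1L; // value is ignored for NULL rows
      }
    }
  }

  /** A reader that consults both the flag and the per-row entry. */
  static boolean isRowNull(Col col, int row) {
    return !col.noNulls && col.isNull[row];
  }

  public static void main(String[] args) {
    Col in = new Col(2);
    in.vector[0] = 5L;
    in.vector[1] = 7L;

    Col out = new Col(2);
    out.isNull[1] = true;   // left over from an earlier expression
    out.noNulls = false;

    addOne(in, out, 2);
    System.out.println("row 1 null? " + isRowNull(out, 1)
        + ", value=" + out.vector[1]);
  }
}
```

The design choice here is that the output's isNull entry is written for every row regardless of the input's flag, so the result does not depend on what an earlier expression left in the scratch column.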



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363589#comment-16363589
 ] 

Hive QA commented on HIVE-18622:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910466/HIVE-18622.096.patch

{color:green}SUCCESS:{color} +1 due to 29 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 13174 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9205/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9205/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9205/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910466 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363581#comment-16363581
 ] 

Hive QA commented on HIVE-18622:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} storage-api: The patch generated 0 new + 57 unchanged - 77 fixed = 57 total (was 134) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  3m 15s{color} | {color:red} root: The patch generated 193 new + 5695 unchanged - 399 fixed = 5888 total (was 6094) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} The patch llap-server passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 40s{color} | {color:red} ql: The patch generated 191 new + 5256 unchanged - 322 fixed = 5447 total (was 5578) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} vector-code-gen: The patch generated 2 new + 308 unchanged - 0 fixed = 310 total (was 308) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  1s{color} | {color:red} The patch has 20 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 10s{color} | {color:red} storage-api generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  5m 48s{color} | {color:red} root generated 1 new + 336 unchanged - 0 fixed = 337 total (was 336) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 27s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fedefeb |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/diff-checkstyle-root.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/diff-checkstyle-vector-code-gen.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/whitespace-eol.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/diff-javadoc-javadoc-storage-api.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-9205/yetus/diff-javadoc-javadoc-root.txt |
| asflicense | 

[jira] [Commented] (HIVE-18672) Printed state in RemoteSparkJobMonitor is ambiguous

2018-02-13 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363562#comment-16363562
 ] 

Peter Vary commented on HIVE-18672:
---

+1, pending tests

> Printed state in RemoteSparkJobMonitor is ambiguous
> ---
>
> Key: HIVE-18672
> URL: https://issues.apache.org/jira/browse/HIVE-18672
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Attachments: HIVE-18672.1.patch, HIVE-18672.2.patch, 
> HIVE-18672.3.patch
>
>
> There are a few places in {{RemoteSparkJobMonitor}} (e.g. when the Spark job 
> is in state QUEUED) where the state of the Spark job is printed, but the info 
> is ambiguous (no reference to HoS, or the id of the Spark job).





[jira] [Updated] (HIVE-18713) Optimize: Transform IN clauses to = when there's only one element

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18713:
---
Attachment: HIVE-18713.1.patch

> Optimize: Transform IN clauses to = when there's only one element
> -
>
> Key: HIVE-18713
> URL: https://issues.apache.org/jira/browse/HIVE-18713
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Major
> Attachments: HIVE-18713.1.patch
>
>
> (col1) IN (col2) can be transformed to (col1) = (col2), to avoid the hash-set 
> implementation.
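
As a rough illustration of the rewrite described above, the following Java sketch collapses a one-element IN list into an equality. It works on plain strings purely for demonstration; SingleElementInRewrite and render are hypothetical names, and the attached patch presumably operates on Hive's internal expression representation, which this thread does not show.

```java
import java.util.Arrays;
import java.util.List;

/** Toy rewrite: (left) IN (x) becomes (left) = (x) when the list has one element. */
final class SingleElementInRewrite {

  /** Renders the predicate, using "=" when the IN list has exactly one element. */
  static String render(String left, List<String> inList) {
    if (inList.isEmpty()) {
      throw new IllegalArgumentException("IN list must not be empty");
    }
    if (inList.size() == 1) {
      // A single candidate needs no set lookup, only one comparison.
      return "(" + left + ") = (" + inList.get(0) + ")";
    }
    return "(" + left + ") IN (" + String.join(", ", inList) + ")";
  }

  public static void main(String[] args) {
    System.out.println(render("col1", Arrays.asList("col2")));
    System.out.println(render("col1", Arrays.asList("1", "2", "3")));
  }
}
```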





[jira] [Updated] (HIVE-18713) Optimize: Transform IN clauses to = when there's only one element

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18713:
---
Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363538#comment-16363538
 ] 

Hive QA commented on HIVE-16491:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910463/HIVE-16491.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 13135 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_join] (batchId=45)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9204/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9204/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9204/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910463 - PreCommit-HIVE-Build

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 2.3.0
> Reporter: Ashutosh Chauhan
> Assignee: Miklos Gergely
> Priority: Major
> Attachments: HIVE-16491.patch
>
>
> Chokes on query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}





[jira] [Commented] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363500#comment-16363500
 ] 

Hive QA commented on HIVE-16491:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 14s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fedefeb |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9204/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9204/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Assigned] (HIVE-18713) Optimize: Transform IN clauses to = when there's only one element

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-18713:
--

Assignee: Gopal V



[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363486#comment-16363486
 ] 

Hive QA commented on HIVE-18622:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910466/HIVE-18622.096.patch

{color:green}SUCCESS:{color} +1 due to 29 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 31 failed/errored test(s), 13174 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[metadata_empty_table] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_drop_partition] (batchId=175)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.client.TestDropPartitions.testDropPartition[Embedded] (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.common.TestHiveClientCache.testCloseAllClients (batchId=200)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9203/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9203/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9203/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 31 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910466 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363467#comment-16363467
 ] 

Hive QA commented on HIVE-18622:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} storage-api: The patch generated 0 new + 57 unchanged - 77 fixed = 57 total (was 134) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  3m 11s{color} | {color:red} root: The patch generated 193 new + 5695 unchanged - 399 fixed = 5888 total (was 6094) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} The patch llap-server passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 41s{color} | {color:red} ql: The patch generated 191 new + 5256 unchanged - 322 fixed = 5447 total (was 5578) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 13s{color} | {color:red} vector-code-gen: The patch generated 2 new + 308 unchanged - 0 fixed = 310 total (was 308) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 20 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 10s{color} | {color:red} storage-api generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  5m 38s{color} | {color:red} root generated 1 new + 336 unchanged - 0 fixed = 337 total (was 336) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fedefeb |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/diff-checkstyle-root.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/diff-checkstyle-vector-code-gen.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/whitespace-eol.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/diff-javadoc-javadoc-storage-api.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-9203/yetus/diff-javadoc-javadoc-root.txt |
| asflicense | 

[jira] [Updated] (HIVE-18573) Use proper Calcite operator instead of UDFs

2018-02-13 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-18573:
--
Attachment: HIVE-18573.8.patch

> Use proper Calcite operator instead of UDFs
> ---
>
> Key: HIVE-18573
> URL: https://issues.apache.org/jira/browse/HIVE-18573
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-18573.2.patch, HIVE-18573.3.patch, 
> HIVE-18573.4.patch, HIVE-18573.5.patch, HIVE-18573.6.patch, 
> HIVE-18573.7.patch, HIVE-18573.7.patch, HIVE-18573.8.patch, HIVE-18573.patch
>
>
> Currently, Hive mostly uses user-defined black-box SQL operators during 
> query planning. It would be more beneficial to use proper Calcite operators; 
> this prepares the ground for pushing complex expressions to the 
> Druid-Calcite adapter.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363446#comment-16363446
 ] 

Sergey Shelukhin edited comment on HIVE-18622 at 2/14/18 3:45 AM:
--

I looked at most of iteration 7 on RB (pages 10-13 still remain to go through), 
and at the 7-8 diff.
I found one correctness issue, a 0/i mixup.
Some removals of isNull-setting are not clear to me.
My main concern is that either I'm missing the big picture, or there is no 
unified semantic approach to noNulls, and to the pattern of setting and 
unsetting it.
The idea of noNulls is to avoid looking at isNull, so it's not clear why 
certain places in the code that fill the meaningful parts of the batch with 
non-nulls still don't set noNulls (I commented on one or two; there are many). 
It seems like noNulls should be set every time there are no nulls, and the 
isNull array should be set correctly by whoever sets noNulls to false.
So, an example uniform approach could be:
0) The only parts of the batch that matter are those within the batch size.
1) Set isNull to false every time we set a non-null value and noNulls is not 
true.
2) When flipping noNulls to false, make sure that isNull is correct; when 
flipping it as part of a larger loop that always sets isNull unconditionally 
(e.g. in TreeReader::nextVector in ORC), no additional action is necessary; when 
looping through a run of non-nulls and finding a null, it may be necessary to 
backfill false into the preceding entries and rely on (1) to fill the 
following entries once noNulls is flipped; when setting individual elements 
without context, it may be necessary to fill the array entirely (done only once, 
when actually flipping noNulls).

Right now it seems like some places are too conservative and fill isNull even 
when noNulls is true, while others actually remove isNull-setting when setting 
values to non-nulls, without checking noNulls (so, presumably, if noNulls is 
true isNull could be incorrect, and it's not clear that the next set that 
happens to be a null doesn't flip noNulls and render the previous value invalid).
Or at least the pattern in these approaches is not clear to me. That could be 
because the patch is so large; perhaps it should be described somewhere in one 
place.
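
As an illustration of points 0) to 2) above, here is a minimal Java sketch. It uses a hypothetical stand-in class, NullFlagSketch, and not Hive's actual ColumnVector classes; only the field names noNulls and isNull come from the discussion, while the method names setValue, setNull and isNullAt are assumptions made for this example. It shows the "fill the array once when flipping noNulls" variant of point 2.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a long column; NOT Hive's LongColumnVector. */
class NullFlagSketch {
  final long[] vector;
  final boolean[] isNull;
  boolean noNulls = true;

  NullFlagSketch(int capacity) {
    vector = new long[capacity];
    isNull = new boolean[capacity];
  }

  /** Point 1: a non-null write clears isNull[i] only while isNull is being consulted. */
  void setValue(int i, long value) {
    vector[i] = value;
    if (!noNulls) {
      isNull[i] = false;
    }
  }

  /** Point 2: before flipping noNulls, make isNull correct for the live rows (point 0). */
  void setNull(int i, int batchSize) {
    if (noNulls) {
      Arrays.fill(isNull, 0, batchSize, false); // done once per flip
      noNulls = false;
    }
    isNull[i] = true;
  }

  /** Readers look at isNull only when noNulls is false. */
  boolean isNullAt(int i) {
    return !noNulls && isNull[i];
  }

  public static void main(String[] args) {
    NullFlagSketch col = new NullFlagSketch(4);
    col.isNull[2] = true; // stale entry left over from an earlier use of the column
    col.setValue(0, 10L);
    col.setNull(1, 4);
    col.setValue(2, 30L);
    col.setValue(3, 40L);
    for (int i = 0; i < 4; i++) {
      System.out.println("row " + i + " null=" + col.isNullAt(i));
    }
  }
}
```

In this sketch the stale entry at row 2 is harmless: it is ignored while noNulls is true and is cleared by the one-time fill when the first null is written.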







> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, 

[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363446#comment-16363446
 ] 

Sergey Shelukhin commented on HIVE-18622:
-

I looked at most of iteration 7 on RB (pages 10-13 still remain to go through), 
and at the 7-8 diff.
I found one correctness issue, a 0/i mixup.
Some removals of isNull-setting are not clear to me.
My main concern is that either I'm missing the big picture, or there is no 
unified semantic approach to noNulls, and to the pattern of setting and 
unsetting it.
The idea of noNulls is to avoid looking at isNull, so it's not clear why 
certain places in the code that fill the meaningful parts of the batch with 
non-nulls still don't set noNulls (I commented on one or two; there are many). 
It seems like noNulls should be set every time there are no nulls, and the 
isNull array should be set correctly by whoever sets noNulls to false.
So, the approach could be:
1) Set isNull to false every time we set a non-null value and noNulls is not 
true.
2) When flipping noNulls to false, make sure that isNull is correct; when 
flipping it as part of a larger loop that always sets isNull unconditionally 
(e.g. in TreeReader::nextVector in ORC), nothing more is necessary; when looping 
through a run of non-nulls and finding a null, it may be necessary to backfill 
false into the preceding entries and rely on (1) to fill the following entries 
once noNulls is flipped; when setting individual elements without context, it 
may be necessary to fill the array entirely (done only once, when actually 
flipping noNulls).

Right now it seems like some places are too conservative and fill isNull even 
when noNulls is true, while others actually remove isNull-setting when setting 
values to non-nulls, without checking noNulls (so, presumably, if noNulls is 
true isNull could be incorrect, and it's not clear that the next set that 
happens to be a null doesn't flip noNulls and render the previous value invalid).
Or at least the pattern in these approaches is not clear to me. That could be 
because the patch is so large; perhaps it should be described somewhere in one 
place.



> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true, which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false, and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And, all places that assign a column value need to set noNulls to false when 
> the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
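
The scratch-column reuse problem described above can be reproduced in a few lines. The following is only a sketch under stated assumptions: ScratchColumn is a hypothetical three-row stand-in, not Hive's ColumnVector, and the method names are invented for the example. The second reuse method follows one reading of the description (set noNulls to false and write every isNull entry); it is not taken from the patch.

```java
class ScratchReuseSketch {
  static final int BATCH_SIZE = 3;

  /** Hypothetical scratch column; only the field names come from the ticket. */
  static final class ScratchColumn {
    final long[] vector = new long[BATCH_SIZE];
    final boolean[] isNull = new boolean[BATCH_SIZE];
    boolean noNulls = true;
  }

  /** First expression: produces NULL for row 1. */
  static void firstUse(ScratchColumn out) {
    out.noNulls = false;
    for (int i = 0; i < BATCH_SIZE; i++) {
      out.isNull[i] = (i == 1);
      out.vector[i] = i;
    }
  }

  /** Problematic reuse: copies the input flag and leaves isNull untouched. */
  static void reuseCopyingFlag(ScratchColumn out, boolean inputNoNulls) {
    out.noNulls = inputNoNulls;
    for (int i = 0; i < BATCH_SIZE; i++) {
      out.vector[i] = 100 + i;
    }
  }

  /** Reuse that sets noNulls to false and writes every isNull entry. */
  static void reuseWritingIsNull(ScratchColumn out) {
    out.noNulls = false;
    for (int i = 0; i < BATCH_SIZE; i++) {
      out.isNull[i] = false;
      out.vector[i] = 100 + i;
    }
  }

  /** Counts isNull entries that are still true. */
  static int staleNullCount(ScratchColumn col) {
    int count = 0;
    for (int i = 0; i < BATCH_SIZE; i++) {
      if (col.isNull[i]) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    ScratchColumn scratch = new ScratchColumn();
    firstUse(scratch);
    reuseCopyingFlag(scratch, true);
    System.out.println("copied flag: noNulls=" + scratch.noNulls
        + ", isNull entries still true=" + staleNullCount(scratch));
    reuseWritingIsNull(scratch);
    System.out.println("explicit isNull: noNulls=" + scratch.noNulls
        + ", isNull entries still true=" + staleNullCount(scratch));
  }
}
```

After the flag-copying reuse, the column claims noNulls while isNull[1] is still true, so any code that reads isNull directly, or that later flips noNulls back to false, would see row 1 as NULL.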





[jira] [Commented] (HIVE-17871) Add non nullability flag to druid time column

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363440#comment-16363440
 ] 

Hive QA commented on HIVE-17871:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897126/HIVE-17871.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9202/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9202/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9202/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-14 03:30:39.145
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9202/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-14 03:30:39.149
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at fedefeb HIVE-18673: ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT isn't 
formatted correctly (Sahil Takiar, reviewed by Chao Sun)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at fedefeb HIVE-18673: ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT isn't 
formatted correctly (Sahil Takiar, reviewed by Chao Sun)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-14 03:30:42.309
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9202/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java:287
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java'
 with conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java:112
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:2413
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java' with 
conflicts.
Going to apply patch with: git apply -p0
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java:287
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java'
 with conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java:112
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:2413
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java' with 
conflicts.
/data/hiveptest/working/scratch/build.patch:383: new blank line at EOF.
+
U 
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
U ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
warning: 1 line adds whitespace errors.
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897126 - PreCommit-HIVE-Build

> Add non nullability flag to druid time column
> -
>
> 

[jira] [Commented] (HIVE-18433) Upgrade version of com.fasterxml.jackson

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363438#comment-16363438
 ] 

Hive QA commented on HIVE-18433:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910446/HIVE-18433.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 13100 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestLocalSparkCliDriver.org.apache.hadoop.hive.cli.TestLocalSparkCliDriver
 (batchId=251)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz]
 (batchId=249)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv] 
(batchId=249)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
 (batchId=249)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
 (batchId=249)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=182)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.druid.TestHiveDruidQueryBasedInputFormat.testTimeZone 
(batchId=257)
org.apache.hadoop.hive.druid.serde.TestDruidSerDe.testDruidDeserializer 
(batchId=257)
org.apache.hadoop.hive.druid.serde.TestDruidSerDe.testDruidObjectDeserializer 
(batchId=257)
org.apache.hadoop.hive.druid.serde.TestDruidSerDe.testDruidObjectSerializer 
(batchId=257)
org.apache.hadoop.hive.metastore.client.TestAlterPartitions.testAlterPartitionWithEnvironmentCtx[Embedded]
 (batchId=212)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveBackKill 
(batchId=236)
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles (batchId=298)
org.apache.hive.spark.client.TestSparkClient.testCounters (batchId=298)
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection (batchId=298)
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob (batchId=298)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9201/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9201/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9201/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message 

[jira] [Commented] (HIVE-18433) Upgrade version of com.fasterxml.jackson

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363427#comment-16363427
 ] 

Hive QA commented on HIVE-18433:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  6s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 52 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fedefeb |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9201/yetus/patch-asflicense-problems.txt |
| modules | C: . druid-handler hcatalog/core hcatalog/server-extensions hcatalog/webhcat/svr ql testutils/ptest2 U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9201/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade version of com.fasterxml.jackson
> 
>
> Key: HIVE-18433
> URL: https://issues.apache.org/jira/browse/HIVE-18433
> Project: Hive
>  Issue Type: Task
>Reporter: Sahil Takiar
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-18433.1.patch, HIVE-18433.2.patch
>
>
> Let's upgrade to version 2.9.2





[jira] [Updated] (HIVE-18595) UNIX_TIMESTAMP UDF fails when type is Timestamp with local timezone

2018-02-13 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-18595:
--
Attachment: HIVE-18595.3.patch

> UNIX_TIMESTAMP  UDF fails when type is Timestamp with local timezone
> 
>
> Key: HIVE-18595
> URL: https://issues.apache.org/jira/browse/HIVE-18595
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18595.3.patch, HIVE-18595.patch, HIVE-18595.patch
>
>
> {code}
> 2018-01-31T12:59:45,464 ERROR [10e97c86-7f90-406b-a8fa-38be5d3529cc main] 
> ql.Driver: FAILED: SemanticException [Error 10014]: Line 3:456 Wrong 
> arguments ''-MM-dd HH:mm:ss'': The function UNIX_TIMESTAMP takes only 
> string/date/timestamp types
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 3:456 Wrong arguments 
> ''-MM-dd HH:mm:ss'': The function UNIX_TIMESTAMP takes only 
> string/date/timestamp types
>  at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1394)
>  at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>  at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>  at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>  at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>  at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>  at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:235)
>  at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:181)
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11847)
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11780)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genGBLogicalPlan(CalcitePlanner.java:3140)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4330)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1407)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1354)
>  at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1159)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1175)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:422)
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11393)
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:304)
>  at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
>  at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:163)
>  at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
>  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:639)
>  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1504)
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1632)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1395)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1382)
>  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:240)
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343)
>  at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1331)
>  at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1305)
>  at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:173)
>  at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
>  at 
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> 

[jira] [Commented] (HIVE-18712) Design HMS Api v2

2018-02-13 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363425#comment-16363425
 ] 

Alexander Kolbasov commented on HIVE-18712:
---

A few things to consider:
 # What parts of the current API work really well
 # What parts are obsolete, do not work or work poorly

There are several other considerations:
 # Transport/Encoding:
 ** Thrift
 ** HTTP
 ** gRPC
 ** Something else?
 # Security
 ** Kerberos
 ** SSL
 ** Something else?
 # Compatibility
 # API evolution

It is also worth establishing higher-level design principles. I would argue 
that the following are useful:
 * The API should be able to deal efficiently with very large numbers of objects.
 * It should have a clear failure model.
 * It should not be tailored to any specific implementation language and should 
work reasonably well with at least 3 different ones.
 * It should support the metadata needs of existing known consumers.

 

Please treat the items above as a starting point for discussion - I am sure there 
will be many different opinions on the subject.

> Design HMS Api v2
> -
>
> Key: HIVE-18712
> URL: https://issues.apache.org/jira/browse/HIVE-18712
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Priority: Major
>
> This is an umbrella Jira covering the design of Hive Metastore API v2.
> It is supposed to be a placeholder for discussion and design documents.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-16125) Split work between reducers.

2018-02-13 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357920#comment-16357920
 ] 

slim bouguerra edited comment on HIVE-16125 at 2/14/18 2:28 AM:


To fix this, a new table property was added that the user can set as an extra 
hashing salt to further split the reduce sink.

For instance, in the create statement the user can add the property
{code:java}
"druid.segment.targetShardsPerGranularity"="6"{code}
to add random keys between 0 and 5, so that each segment granularity will have 
up to 6 reducers. 

FYI, it is still unclear whether insert statements will see the same benefit. 
When using this feature, the user has to choose the target number of shards 
per segment granularity wisely. If the number is too high, the segments will be 
too small. If the number is too low, the segments will be huge. A further 
improvement could be to use statistics, or to add an extra shuffle/reduce stage 
that counts the rows and partitions them according to some partition size. 
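
The salting idea described above can be sketched as follows. This is only an illustration in Python, not the Hive or Druid storage handler code: the function name {{assign_reducer_key}}, the sample rows and the fixed seed are assumptions made for the example. Each row keeps its segment granularity and gets a random salt in the range 0 to targetShardsPerGranularity - 1, so one granularity is spread over at most that many reducer keys.

```python
import random
from collections import defaultdict


def assign_reducer_key(segment_granularity, target_shards, rng):
    """Return a (granularity, salt) key; salt is a random integer in [0, target_shards)."""
    salt = rng.randrange(target_shards)
    return (segment_granularity, salt)


# Hypothetical input: 1000 rows that all fall into the same segment granularity.
rng = random.Random(42)
rows = [("2018-02-13", i) for i in range(1000)]

# Count how many rows land on each salted key (one key per reducer).
groups = defaultdict(int)
for day, _ in rows:
    groups[assign_reducer_key(day, 6, rng)] += 1
```

With a target of 6, the single granularity above is split over at most 6 keys instead of 1, which is also why a very high target produces many small segments and a very low one produces few large segments.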


was (Author: bslim):
To fix this, added a new table property that the user can use as an extra 
hashing salt to split further the reduce sink.

For instance, during the create statement use can add the property
{code:java}
"druid.segment.targetShardPerGranularity"="6"{code}
to add some random keys between 0 and 5, thus per segment granularity will have 
up to 6 reducers. 

FYI still unsure about the insert statements if such benefit will occur as 
well.  The user has to make sure when using this feature to choose wisely the 
target number of shards per segment granularity. If the number is too high the 
segments will be too small. If the number is too high the segments will be 
huge. Further improvement can be using statistics or add an extra shuffle 
reduce stage that counts and partition the rows according to some partition 
size. 

> Split work between reducers.
> 
>
> Key: HIVE-16125
> URL: https://issues.apache.org/jira/browse/HIVE-16125
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-16125.4.patch, HIVE-16125.5.patch, 
> HIVE-16125.6.patch, HIVE-16125.patch
>
>
> Split work between reducers.
> Currently we have one reducer per segment granularity even if the interval 
> will be partitioned over multiple partitions.





[jira] [Commented] (HIVE-18686) Installation on Postgres and Oracle broken

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363377#comment-16363377
 ] 

Hive QA commented on HIVE-18686:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910438/HIVE-18686.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 13174 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.TestMarkPartition.testMarkingPartitionSet (batchId=215)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9200/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9200/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9200/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910438 - PreCommit-HIVE-Build

> Installation on Postgres and Oracle broken
> --
>
> Key: HIVE-18686
> URL: https://issues.apache.org/jira/browse/HIVE-18686
> Project: Hive
>  Issue Type: Bug
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18686.2.patch, HIVE-18686.patch
>
>
> HIVE-18614 broke the installation and upgrade on Postgres and Oracle.  It 
> calls Connection.setSchema in the JDBC driver.  But the JDBC drivers for 
> these databases don't support that call.





[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363373#comment-16363373
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


+1 (pending tests) LGTM.

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> Schema evolution includes the following points:
> 1. column changes
>    column reorder
>    column add, column delete
>    column rename
> 2. type conversion
>    low precision to high precision
>    type to String
> For the first type, the code currently does not support the column addition 
> operation. The detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | tinyint    |          |
> | t2        | tinyint    |          |
> | i1        | int        |          |
> | i2        | int        |          |
> +-----------+------------+----------+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> The following exception is seen in the logs:
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> 

[jira] [Commented] (HIVE-18710) extend inheritPerms to ACID in Hive 2.X

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363365#comment-16363365
 ] 

Sergey Shelukhin commented on HIVE-18710:
-

[~ashutoshc] branch-2 only patch. Can you take a look?

> extend inheritPerms to ACID in Hive 2.X
> ---
>
> Key: HIVE-18710
> URL: https://issues.apache.org/jira/browse/HIVE-18710
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18710-branch-2.patch
>
>






[jira] [Updated] (HIVE-18689) restore inheritPerms functionality and extend it to ACID

2018-02-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18689:

Status: Open  (was: Patch Available)

> restore inheritPerms functionality and extend it to ACID
> 
>
> Key: HIVE-18689
> URL: https://issues.apache.org/jira/browse/HIVE-18689
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18689.patch
>
>
> This functionality was removed for no clear reason (if it doesn't apply to 
> some use case it can just be disabled).
> It's still in use; in fact, it should be extended to ACID table 
> subdirectories.
> This patch restores the functionality with some cleanup (to not access config 
> everywhere, mostly), disables it by default, and extends it to ACID tables.
> There's a coming HDFS feature that will automatically inherit permissions. 
> When that is shipped in a non-beta version and stabilized a bit, we can 
> remove this functionality... however I dunno if that is good for other 
> potential use cases, like non-HDFS file systems that do have a concept of a 
> directory (Isilon?)





[jira] [Updated] (HIVE-18710) extend inheritPerms to ACID in Hive 2.X

2018-02-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18710:

Assignee: Sergey Shelukhin
  Status: Patch Available  (was: Open)

> extend inheritPerms to ACID in Hive 2.X
> ---
>
> Key: HIVE-18710
> URL: https://issues.apache.org/jira/browse/HIVE-18710
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18710-branch-2.patch
>
>






[jira] [Updated] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-18553:

Attachment: HIVE-18553.11.patch

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> Schema evolution includes the following points:
> 1. column changes
>    column reorder
>    column add, column delete
>    column rename
> 2. type conversion
>    low precision to high precision
>    type to String
> For the first type, the code currently does not support the column addition 
> operation. The detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | tinyint    |          |
> | t2        | tinyint    |          |
> | i1        | int        |          |
> | i2        | int        |          |
> +-----------+------------+----------+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> The following exception is seen in the logs:
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]

[jira] [Commented] (HIVE-18672) Printed state in RemoteSparkJobMonitor is ambiguous

2018-02-13 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363361#comment-16363361
 ] 

Sahil Takiar commented on HIVE-18672:
-

Attaching updated patch:

* Realized that when {{JobHandle#State = QUEUED}} the Spark job id isn't known 
yet and is always -1, so I removed it
* Changed {{SparkJobMonitor}} to create a new {{LogHelper}} rather than use the 
default one in {{SessionState}}; this makes the Hive logs much clearer; 
anything printed from {{SparkJobMonitor}} will now be displayed in the logs as 
coming from {{SparkJobMonitor}} rather than {{SessionState}}
* Added Spark job id to logs printed when in the {{CANCELLED}} state (added in 
HIVE-18671)
* Added Spark job to the logs printed when in the {{SUCCEEDED}} state

> Printed state in RemoteSparkJobMonitor is ambiguous
> ---
>
> Key: HIVE-18672
> URL: https://issues.apache.org/jira/browse/HIVE-18672
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18672.1.patch, HIVE-18672.2.patch, 
> HIVE-18672.3.patch
>
>
> There are a few places in {{RemoteSparkJobMonitor}} (e.g. when the Spark job 
> is in state QUEUED) where the state of the Spark job is printed, but the info 
> is ambiguous (no reference to HoS, or the id of the Spark job).





[jira] [Commented] (HIVE-18711) Add percentile_cont and percentile_disc udaf

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363359#comment-16363359
 ] 

Ashutosh Chauhan commented on HIVE-18711:
-

References:

https://docs.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql

[https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions110.htm]

https://www.postgresql.org/docs/9.4/static/functions-aggregate.html

> Add percentile_cont and percentile_disc udaf
> 
>
> Key: HIVE-18711
> URL: https://issues.apache.org/jira/browse/HIVE-18711
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Ashutosh Chauhan
>Priority: Major
>
> The most common way to implement this is via ordered aggregates, which allow 
> users to specify a sort specification with the group by clause. Some 
> implementations also allow these to be used with window functions.
> Since Hive doesn't have a concept of ordered aggregates yet, one possibility is 
> to support these only for window functions, where the sort specification is 
> also taken from the window clause.





[jira] [Updated] (HIVE-18659) add acid version marker to acid files

2018-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18659:
--
Attachment: HIVE-18659.04.patch

> add acid version marker to acid files
> -
>
> Key: HIVE-18659
> URL: https://issues.apache.org/jira/browse/HIVE-18659
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18659.01.patch, HIVE-18659.04.patch
>
>
> add acid version marker to acid files so that we know which version of acid 
> wrote the file





[jira] [Commented] (HIVE-18711) Add percentile_cont and percentile_disc udaf

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363357#comment-16363357
 ] 

Ashutosh Chauhan commented on HIVE-18711:
-

create table employees (name string, department_id tinyint, salary int);

SELECT name, salary, department_id, PERCENTILE_CONT(0.5) OVER (PARTITION BY 
department_id ORDER BY salary DESC) FROM employees;
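
For readers unfamiliar with the two functions, the sketch below shows the usual SQL definitions on a plain Python list. It illustrates the semantics only and says nothing about how Hive would implement the UDAFs; the function names mirror the SQL names, the salary values are made up, and ascending order is assumed. PERCENTILE_CONT interpolates linearly between the two neighbouring rows, while PERCENTILE_DISC returns the first actual value whose cumulative distribution reaches the requested fraction.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile: linear interpolation between neighbouring sorted values."""
    ordered = sorted(values)
    if not ordered:
        return None
    # Zero-based fractional row position, equivalent to RN = 1 + fraction * (N - 1).
    pos = fraction * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_disc(values, fraction):
    """Discrete percentile: first sorted value whose cumulative distribution >= fraction."""
    ordered = sorted(values)
    if not ordered:
        return None
    idx = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[idx]


# Hypothetical salaries for one department.
salaries = [10, 20, 30, 40]
median_cont = percentile_cont(salaries, 0.5)
median_disc = percentile_disc(salaries, 0.5)
```

The continuous variant can return a value that does not occur in the data, whereas the discrete variant always returns one of the input values.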

> Add percentile_cont and percentile_disc udaf
> 
>
> Key: HIVE-18711
> URL: https://issues.apache.org/jira/browse/HIVE-18711
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Ashutosh Chauhan
>Priority: Major
>
> The most common way to implement this is via ordered aggregates, which allow 
> users to specify a sort specification with the group by clause. Some 
> implementations also allow these to be used with window functions.
> Since Hive doesn't have a concept of ordered aggregates yet, one possibility is 
> to support these only for window functions, where the sort specification is 
> also taken from the window clause.





[jira] [Updated] (HIVE-18672) Printed state in RemoteSparkJobMonitor is ambiguous

2018-02-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18672:

Attachment: HIVE-18672.3.patch

> Printed state in RemoteSparkJobMonitor is ambiguous
> ---
>
> Key: HIVE-18672
> URL: https://issues.apache.org/jira/browse/HIVE-18672
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18672.1.patch, HIVE-18672.2.patch, 
> HIVE-18672.3.patch
>
>
> There are a few places in {{RemoteSparkJobMonitor}} (e.g. when the Spark job 
> is in state QUEUED) where the state of the Spark job is printed, but the info 
> is ambiguous (no reference to HoS, or the id of the Spark job).





[jira] [Commented] (HIVE-18387) Minimize time that REBUILD locks the materialized view

2018-02-13 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363344#comment-16363344
 ] 

Jesus Camacho Rodriguez commented on HIVE-18387:


Applied changes based on review comments and rebased patch. Uploaded for a new 
ptests run.

> Minimize time that REBUILD locks the materialized view
> --
>
> Key: HIVE-18387
> URL: https://issues.apache.org/jira/browse/HIVE-18387
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18387.01.patch, HIVE-18387.02.patch, 
> HIVE-18387.03.patch, HIVE-18387.04.patch, HIVE-18387.05.patch, 
> HIVE-18387.patch
>
>
> Currently, REBUILD will block the materialized view while the final move task 
> is being executed. The idea for this improvement is to create the new 
> materialization in a new folder (new version) and then just flip the pointer 
> to the folder in the MV definition in the metastore. REBUILD operations for a 
> given MV should get an exclusive lock though, i.e., they cannot be executed 
> concurrently.
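
The "build in a new folder, then flip the pointer" idea from the description can be sketched roughly as below. This is an in-memory Python illustration under stated assumptions, not the patch itself: the class {{MaterializedViewPointer}}, the {{build_fn}} callback and the paths are hypothetical, and the metastore is reduced to a single attribute. Readers keep seeing the old location while the slow rebuild runs; only the final assignment changes what they see, and a lock keeps two rebuilds of the same view from running concurrently.

```python
import threading


class MaterializedViewPointer:
    """Holds the location of the current materialization of one view."""

    def __init__(self, location):
        self._location = location
        self._version = 0
        # Exclusive lock: only one REBUILD of this view at a time.
        self._rebuild_lock = threading.Lock()

    def current_location(self):
        # Readers never take the rebuild lock, so they are not blocked.
        return self._location

    def rebuild(self, build_fn):
        with self._rebuild_lock:
            new_version = self._version + 1
            # Slow step: write the new materialization into its own folder.
            new_location = build_fn(new_version)
            # Fast step: flip the pointer to the new folder.
            self._location = new_location
            self._version = new_version
            return new_location


mv = MaterializedViewPointer("/warehouse/mv/v0")
new_location = mv.rebuild(lambda version: "/warehouse/mv/v%d" % version)
```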





[jira] [Updated] (HIVE-18387) Minimize time that REBUILD locks the materialized view

2018-02-13 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18387:
---
Attachment: HIVE-18387.05.patch

> Minimize time that REBUILD locks the materialized view
> --
>
> Key: HIVE-18387
> URL: https://issues.apache.org/jira/browse/HIVE-18387
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18387.01.patch, HIVE-18387.02.patch, 
> HIVE-18387.03.patch, HIVE-18387.04.patch, HIVE-18387.05.patch, 
> HIVE-18387.patch
>
>
> Currently, REBUILD will block the materialized view while the final move task 
> is being executed. The idea for this improvement is to create the new 
> materialization in a new folder (new version) and then just flip the pointer 
> to the folder in the MV definition in the metastore. REBUILD operations for a 
> given MV should get an exclusive lock though, i.e., they cannot be executed 
> concurrently.





[jira] [Updated] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-18553:

Description: 
Schema evolution includes the following points:
1. column changes
   column reorder
   column add, column delete
   column rename
2. type conversion
   low precision to high precision
   type to String
For the first type, the code currently does not support the column addition 
operation. The detailed error is as follows:
{code}
0: jdbc:hive2://localhost:1/default> desc test_p;
+-----------+------------+----------+
| col_name  | data_type  | comment  |
+-----------+------------+----------+
| t1        | tinyint    |          |
| t2        | tinyint    |          |
| i1        | int        |          |
| i2        | int        |          |
+-----------+------------+----------+
0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
0: jdbc:hive2://localhost:1/default> set 
hive.vectorized.execution.enabled=true;
0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
timestamp);
0: jdbc:hive2://localhost:1/default> select * from test_p;
Error: Error while processing statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
{code}

The following exception is seen in the logs:

{code}
Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the store: 
[[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
at 
org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
 ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
 ~[hadoop-mapreduce-client-common-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) 

[jira] [Commented] (HIVE-18686) Installation on Postgres and Oracle broken

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363327#comment-16363327
 ] 

Hive QA commented on HIVE-18686:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fedefeb |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9200/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9200/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Installation on Postgres and Oracle broken
> --
>
> Key: HIVE-18686
> URL: https://issues.apache.org/jira/browse/HIVE-18686
> Project: Hive
>  Issue Type: Bug
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18686.2.patch, HIVE-18686.patch
>
>
> HIVE-18614 broke the installation and upgrade on Postgres and Oracle.  It 
> calls Connection.setSchema in the JDBC driver.  But the JDBC drivers for 
> these databases don't support that call.
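
As a rough illustration of the failure mode described above, the sketch below probes whether a connection accepts Connection.setSchema. It is not the HIVE-18686 patch. The class SetSchemaProbeSketch, its methods, and the proxy-based stub connection are hypothetical, and how a real Postgres or Oracle driver signals the missing support (SQLFeatureNotSupportedException, AbstractMethodError, or something else) is an assumption here.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical sketch; none of these names come from Hive.
final class SetSchemaProbeSketch {

    /** Returns true if setSchema was accepted, false if the driver reports no support. */
    static boolean trySetSchema(Connection conn, String schema) {
        try {
            conn.setSchema(schema);
            return true;
        } catch (SQLFeatureNotSupportedException | AbstractMethodError e) {
            // Assumption: an unsupporting driver fails in one of these two ways.
            return false;
        } catch (SQLException e) {
            throw new IllegalStateException("setSchema failed for another reason", e);
        }
    }

    /** Builds a stub Connection that optionally rejects setSchema; every other call is a no-op. */
    static Connection stubConnection(boolean supportsSetSchema) {
        return (Connection) Proxy.newProxyInstance(
            SetSchemaProbeSketch.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            (proxy, method, args) -> {
                if (!supportsSetSchema && method.getName().equals("setSchema")) {
                    throw new SQLFeatureNotSupportedException("setSchema unsupported (stub)");
                }
                return null;
            });
    }

    public static void main(String[] args) {
        System.out.println("supporting stub:   " + trySetSchema(stubConnection(true), "hive"));
        System.out.println("unsupporting stub: " + trySetSchema(stubConnection(false), "hive"));
    }
}
```

A caller that gets false back would need a database-specific way to select the schema. Which one the actual patch uses is not stated in this thread.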



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363310#comment-16363310
 ] 

Matt McCline commented on HIVE-18622:
-

No test failures are related.

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>* For better performance on LONG/DOUBLE we don't want the conditional
>* statements inside the for loop.
>*/
>   outputColVector.noNulls = false;
>  {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And, all places that assign a column value need to set noNulls to false when 
> the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
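
The scratch-column problem described above can be reproduced in miniature. The sketch below is a toy model, not Hive code: SketchColumn is a hypothetical stand-in for a column vector with a value array, an isNull array, and a noNulls flag, and the reader convention in isNullAt is an assumption made for illustration.

```java
// Hypothetical sketch; SketchColumn only mimics the noNulls/isNull fields discussed above.
final class ScratchReuseSketch {

    /** Writes in[i] + 1 and claims "no NULLs" without touching isNull[] (the risky pattern). */
    static void addOneRisky(long[] in, SketchColumn out) {
        out.noNulls = true;
        for (int i = 0; i < in.length; i++) {
            out.vector[i] = in[i] + 1;
        }
    }

    /** Writes in[i] + 1 and clears the isNull entry of every row it produces. */
    static void addOneCareful(long[] in, SketchColumn out) {
        for (int i = 0; i < in.length; i++) {
            out.vector[i] = in[i] + 1;
            out.isNull[i] = false;
        }
    }

    /** Marks one row NULL: set the row's flag, then set noNulls to false. */
    static void markNull(SketchColumn out, int row) {
        out.isNull[row] = true;
        out.noNulls = false;
    }

    /** Reuses one scratch column and reports whether row 1 wrongly reads as NULL. */
    static boolean staleNullVisible(boolean careful) {
        SketchColumn scratch = new SketchColumn(3);
        markNull(scratch, 1);                 // an earlier use left row 1 NULL
        long[] in = {10, 20, 30};
        if (careful) {
            addOneCareful(in, scratch);
        } else {
            addOneRisky(in, scratch);
        }
        markNull(scratch, 0);                 // a later step legitimately nulls row 0
        return scratch.isNullAt(1);           // row 1 was never NULL in this use
    }

    public static void main(String[] args) {
        System.out.println("risky:   row 1 reads as NULL = " + staleNullVisible(false));
        System.out.println("careful: row 1 reads as NULL = " + staleNullVisible(true));
    }
}

/** Hypothetical stand-in for a long column vector; not the real Hive class. */
final class SketchColumn {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;

    SketchColumn(int size) {
        vector = new long[size];
        isNull = new boolean[size];
    }

    /** Assumed reader convention: a row is NULL only if noNulls is false and its flag is set. */
    boolean isNullAt(int row) {
        return !noNulls && isNull[row];
    }
}
```

In the risky variant the stale isNull entry stays hidden while noNulls is true and resurfaces as soon as any later step flips noNulls back to false. The careful variant trades an extra write per row for a consistent isNull array.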





[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363303#comment-16363303
 ] 

Ferdinand Xu commented on HIVE-18553:
-

[~vihangk1], they're related to the change in boolean type handling. A fix is 
included in the 10th patch.

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.2.patch, 
> HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, 
> HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.8.patch, 
> HIVE-18553.9.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to read from a 
> Parquet table to which new columns have been added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> 

[jira] [Updated] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-18553:

Attachment: HIVE-18553.10.patch

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.2.patch, 
> HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, 
> HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.8.patch, 
> HIVE-18553.9.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to read from a 
> Parquet table to which new columns have been added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> 

[jira] [Commented] (HIVE-18448) Drop Support For Indexes From Apache Hive

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363301#comment-16363301
 ] 

Ashutosh Chauhan commented on HIVE-18448:
-

+1

> Drop Support For Indexes From Apache Hive
> -
>
> Key: HIVE-18448
> URL: https://issues.apache.org/jira/browse/HIVE-18448
> Project: Hive
>  Issue Type: Improvement
>  Components: Indexing
>Reporter: BELUGA BEHR
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-18448.01.patch, HIVE-18448.01wip02.patch, 
> HIVE-18448.01wip03.patch, HIVE-18448.01wip04.patch, HIVE-18448.01wip05.patch
>
>
> If a user needs to look up a small subset of records quickly, they can use 
> Apache HBase, if they need fast retrieval of larger sets of data, or fast 
> joins, aggregations, they can use Apache Impala.  It seems to me that Hive 
> indexes do not serve much of a role in the future of Hive.
> Even without moving workloads to other products, columnar file formats with 
> their statistics achieve similar goals as Hive indexes.
> Please consider dropping Indexes from the Apache Hive project.





[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363298#comment-16363298
 ] 

Hive QA commented on HIVE-18622:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910466/HIVE-18622.096.patch

{color:green}SUCCESS:{color} +1 due to 29 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 13174 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] 
(batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=232)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9199/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9199/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9199/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910466 - PreCommit-HIVE-Build

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>* For better performance on LONG/DOUBLE we don't want the conditional
>* statements inside the for loop.
>*/
>   outputColVector.noNulls = false;
>  {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is 

[jira] [Commented] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363284#comment-16363284
 ] 

Hive QA commented on HIVE-18622:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} storage-api: The patch generated 0 new + 57 
unchanged - 77 fixed = 57 total (was 134) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  3m  
8s{color} | {color:red} root: The patch generated 193 new + 5695 unchanged - 
399 fixed = 5888 total (was 6094) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} The patch llap-server passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
44s{color} | {color:red} ql: The patch generated 191 new + 5256 unchanged - 322 
fixed = 5447 total (was 5578) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} vector-code-gen: The patch generated 2 new + 308 
unchanged - 0 fixed = 310 total (was 308) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 20 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
9s{color} | {color:red} storage-api generated 1 new + 26 unchanged - 0 fixed = 
27 total (was 26) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  5m 
39s{color} | {color:red} root generated 1 new + 336 unchanged - 0 fixed = 337 
total (was 336) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / cf4114e |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/diff-checkstyle-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/diff-checkstyle-vector-code-gen.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/whitespace-eol.txt 
|
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/diff-javadoc-javadoc-storage-api.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9199/yetus/diff-javadoc-javadoc-root.txt
 |
| asflicense | 

[jira] [Commented] (HIVE-18695) fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]

2018-02-13 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363282#comment-16363282
 ] 

Anthony Hsu commented on HIVE-18695:


Hi [~kgyrtkirk], could you include an example stack trace from, or a link to, 
a test failure?

> fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]
> -
>
> Key: HIVE-18695
> URL: https://issues.apache.org/jira/browse/HIVE-18695
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Priority: Major
>
> seems to be broken by HIVE-15680





[jira] [Updated] (HIVE-18710) extend inheritPerms to ACID in Hive 2.X

2018-02-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18710:

Attachment: HIVE-18710-branch-2.patch

> extend inheritPerms to ACID in Hive 2.X
> ---
>
> Key: HIVE-18710
> URL: https://issues.apache.org/jira/browse/HIVE-18710
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18710-branch-2.patch
>
>






[jira] [Commented] (HIVE-18672) Printed state in RemoteSparkJobMonitor is ambiguous

2018-02-13 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363279#comment-16363279
 ] 

Sahil Takiar commented on HIVE-18672:
-

[~pvary] {{LogHelper#printInfo(String)}} will check the value of 
{{LogHelper#getIsSilent()}} before printing to the info stream. 
{{getIsSilent()}} checks the {{SessionState}} for its current value of 
{{isSilent}}. So I think it should be ok. We follow the same pattern (using 
{{LogHelper(LOG)}}) in {{SparkTask.java}}.
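
A minimal sketch of the pattern described above, assuming nothing about the real classes beyond what this comment says: SketchSession and SketchLogHelper are hypothetical stand-ins, and the only point shown is that the silent flag is read from the session on every call rather than cached when the helper is built.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical sketch; these are not the Hive SessionState or LogHelper classes.
final class SilentLoggingSketch {

    /** Stand-in for session state holding the current silent flag. */
    static final class SketchSession {
        volatile boolean silent;
    }

    /** Stand-in for a log helper that consults the session before each print. */
    static final class SketchLogHelper {
        private final SketchSession session;
        private final PrintStream info;

        SketchLogHelper(SketchSession session, PrintStream info) {
            this.session = session;
            this.info = info;
        }

        boolean isSilent() {
            return session.silent;
        }

        void printInfo(String message) {
            if (!isSilent()) {
                info.println(message);
            }
        }
    }

    /** Prints one message with the given silent setting and returns what reached the stream. */
    static String capture(boolean silent) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        SketchSession session = new SketchSession();
        session.silent = silent;
        SketchLogHelper console = new SketchLogHelper(session, new PrintStream(buffer, true));
        console.printInfo("Spark job state = QUEUED");
        return buffer.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("silent=false -> [" + capture(false) + "]");
        System.out.println("silent=true  -> [" + capture(true) + "]");
    }
}
```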

> Printed state in RemoteSparkJobMonitor is ambiguous
> ---
>
> Key: HIVE-18672
> URL: https://issues.apache.org/jira/browse/HIVE-18672
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18672.1.patch, HIVE-18672.2.patch
>
>
> There are a few places in {{RemoteSparkJobMonitor}} (e.g. when the Spark job 
> is in state QUEUED) where the state of the Spark job is printed, but the info 
> is ambiguous (no reference to HoS, or the id of the Spark job).





[jira] [Comment Edited] (HIVE-18689) restore inheritPerms functionality and extend it to ACID

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363271#comment-16363271
 ] 

Sergey Shelukhin edited comment on HIVE-18689 at 2/14/18 12:20 AM:
---

Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on a new HDFS feature (ACLs) for the 
inherit-permissions functionality in 3.0, since Hive 3 will only work with 
Hadoop 3.X anyway.
I don't think it is wise but I don't care enough about this functionality to 
argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (for anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it; the kind of fall-back we had, and often used and still use, for every 
significant new feature in Hive from major e.g. CBO, Tez, LLAP, Vectorization, 
etc., to small optimizations.

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.


was (Author: sershe):
Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on a new HDFS feature (ACLs) for the 
inherit-permissions functionality in 3.0, since Hive 3 will only work with 
Hadoop 3.X anyway.
I don't think it is wise but I don't care enough about this functionality to 
argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (for anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it; the kind of fall-back we had, and often used and still use, for every 
significant new feature in Hive from major e.g. CBO, Tez, LLAP, Vectorization, 
etc., to small optimizations).

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.

> restore inheritPerms functionality and extend it to ACID
> 
>
> Key: HIVE-18689
> URL: https://issues.apache.org/jira/browse/HIVE-18689
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18689.patch
>
>
> This functionality was removed for no clear reason (if it doesn't apply to 
> some use case it can just be disabled).
> It's still in use; in fact, it should be extended to ACID table 
> subdirectories.
> This patch restores the functionality with some cleanup (to not access config 
> everywhere, mostly), disables it by default, and extends it to ACID tables.
> There's a coming HDFS feature that will automatically inherit permissions. 
> When that is shipped in a non-beta version and stabilized a bit, we can 
> remove this functionality... however I dunno if that is good for other 
> potential use cases, like non-HDFS file systems that do have a concept of a 
> directory (Isilon?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-18689) restore inheritPerms functionality and extend it to ACID

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363271#comment-16363271
 ] 

Sergey Shelukhin edited comment on HIVE-18689 at 2/14/18 12:19 AM:
---

Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on new HDFS feature (ACLs) for inherit permissions 
functionality in 3.0, since Hive 3 will anyway only work with Hadoop 3.X.
I don't think it is wise but I don't care enough about this functionality to 
argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (for anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it; the kind of fall-back we had, and often used and still use, for every 
significant new feature in Hive from major e.g. CBO, Tez, LLAP, Vectorization, 
etc., to small optimizations.

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.


was (Author: sershe):
Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on new HDFS feature (ACLs) for inherit permissions 
functionality in 3.0, since Hive 3 will anyway only work with Hadoop 3.X.
I don't think it is wise but I don't care enough about this functionality to 
argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (to anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it (which we have or had for every significant new feature in Hive from 
major e.g. CBO, Tez, LLAP, Vectorization, etc., to small optimizations).

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.

> restore inheritPerms functionality and extend it to ACID
> 
>
> Key: HIVE-18689
> URL: https://issues.apache.org/jira/browse/HIVE-18689
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18689.patch
>
>
> This functionality was removed for no clear reason (if it doesn't apply to 
> some use case it can just be disabled).
> It's still in use; in fact, it should be extended to ACID table 
> subdirectories.
> This patch restores the functionality with some cleanup (to not access config 
> everywhere, mostly), disables it by default, and extends it to ACID tables.
> There's a coming HDFS feature that will automatically inherit permissions. 
> When that is shipped in a non-beta version and stabilized a bit, we can 
> remove this functionality... however I dunno if that is good for other 
> potential use cases, like non-HDFS file systems that do have a concept of a 
> directory (Isilon?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18689) restore inheritPerms functionality and extend it to ACID

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363271#comment-16363271
 ] 

Sergey Shelukhin commented on HIVE-18689:
-

Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on new HDFS feature (ACLs) for inherit permissions 
functionality in 3.0, since Hive 3 will anyway only work with Hadoop 3.X.
I don't think it is wise but I don't care enough to argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (to anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it (which we have or had for every significant new feature in Hive from 
major e.g. CBO, Tez, LLAP, Vectorization, etc., to small optimizations).

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.

> restore inheritPerms functionality and extend it to ACID
> 
>
> Key: HIVE-18689
> URL: https://issues.apache.org/jira/browse/HIVE-18689
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18689.patch
>
>
> This functionality was removed for no clear reason (if it doesn't apply to 
> some use case it can just be disabled).
> It's still in use; in fact, it should be extended to ACID table 
> subdirectories.
> This patch restores the functionality with some cleanup (to not access config 
> everywhere, mostly), disables it by default, and extends it to ACID tables.
> There's a coming HDFS feature that will automatically inherit permissions. 
> When that is shipped in a non-beta version and stabilized a bit, we can 
> remove this functionality... however I dunno if that is good for other 
> potential use cases, like non-HDFS file systems that do have a concept of a 
> directory (Isilon?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-18689) restore inheritPerms functionality and extend it to ACID

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363271#comment-16363271
 ] 

Sergey Shelukhin edited comment on HIVE-18689 at 2/14/18 12:18 AM:
---

Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on new HDFS feature (ACLs) for inherit permissions 
functionality in 3.0, since Hive 3 will anyway only work with Hadoop 3.X.
I don't think it is wise but I don't care enough about this functionality to 
argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (to anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it (which we have or had for every significant new feature in Hive from 
major e.g. CBO, Tez, LLAP, Vectorization, etc., to small optimizations).

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.


was (Author: sershe):
Discussed with [~ashutoshc] who doesn't want to restore this feature to master. 
Our current plan is to rely on new HDFS feature (ACLs) for inherit permissions 
functionality in 3.0, since Hive 3 will anyway only work with Hadoop 3.X.
I don't think it is wise but I don't care enough to argue more about this.

We are basically making a conscious decision to rely on a brand new, never 
executed in production (and never even included in a released version as of 
now, as far as I know) feature for this crucial (to anyone using storage based 
auth) functionality with no possibility of fallback in case there's some issue 
with it (which we have or had for every significant new feature in Hive from 
major e.g. CBO, Tez, LLAP, Vectorization, etc., to small optimizations).

I will just leave this patch here for the record (and in case it's needed 
for forward ports) and handle ACID separately on the 2.X branch in HIVE-18710.

> restore inheritPerms functionality and extend it to ACID
> 
>
> Key: HIVE-18689
> URL: https://issues.apache.org/jira/browse/HIVE-18689
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18689.patch
>
>
> This functionality was removed for no clear reason (if it doesn't apply to 
> some use case it can just be disabled).
> It's still in use; in fact, it should be extended to ACID table 
> subdirectories.
> This patch restores the functionality with some cleanup (to not access config 
> everywhere, mostly), disables it by default, and extends it to ACID tables.
> There's a coming HDFS feature that will automatically inherit permissions. 
> When that is shipped in a non-beta version and stabilized a bit, we can 
> remove this functionality... however I dunno if that is good for other 
> potential use cases, like non-HDFS file systems that do have a concept of a 
> directory (Isilon?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18708) Vectorization: Delay out-of-tree fixups till whole work is vectorized

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18708:
---
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Vectorization: Delay out-of-tree fixups till whole work is vectorized
> -
>
> Key: HIVE-18708
> URL: https://issues.apache.org/jira/browse/HIVE-18708
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-18708.1.patch, HIVE-18708.2.patch
>
>
> The vectorization validation codepath should treat the existing operator tree 
> as immutable, so that the VectorizerCannotVectorizeException does not have to 
> undo any changes to the operator tree when caught.
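
The idea in the description can be sketched as "plan on the side, swap only on success". The following is a minimal illustration, not the Vectorizer's actual code: the operator tree is reduced to a list of names, and `CannotVectorizeException`, `planVectorized`, `vectorizeOrKeep`, and the "UDF" rule are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ValidateThenApply {
    /** Hypothetical stand-in for VectorizerCannotVectorizeException. */
    static final class CannotVectorizeException extends Exception {
        CannotVectorizeException(String op) {
            super("cannot vectorize " + op);
        }
    }

    /** Builds the vectorized plan on the side; it only reads 'ops'. */
    static List<String> planVectorized(List<String> ops) throws CannotVectorizeException {
        List<String> planned = new ArrayList<>(ops.size());
        for (String op : ops) {
            // Assumed "unsupported operator" rule, purely for illustration.
            if (op.startsWith("UDF")) {
                throw new CannotVectorizeException(op);
            }
            planned.add("Vector" + op);
        }
        return planned;
    }

    /** Returns the new plan on success, or the untouched original on failure. */
    static List<String> vectorizeOrKeep(List<String> ops) {
        try {
            // The read-only view makes any accidental write fail fast.
            return planVectorized(Collections.unmodifiableList(ops));
        } catch (CannotVectorizeException e) {
            return ops; // nothing to undo: the original was only read
        }
    }

    public static void main(String[] args) {
        System.out.println(vectorizeOrKeep(Arrays.asList("Select", "Filter")));
        System.out.println(vectorizeOrKeep(Arrays.asList("Select", "UDFBridge")));
    }
}
```

Because the original list is never written during validation, the catch block has no cleanup to perform, which is the property the issue asks for.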



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18708) Vectorization: Delay out-of-tree fixups till whole work is vectorized

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18708:
---
Attachment: HIVE-18708.2.patch

> Vectorization: Delay out-of-tree fixups till whole work is vectorized
> -
>
> Key: HIVE-18708
> URL: https://issues.apache.org/jira/browse/HIVE-18708
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-18708.1.patch, HIVE-18708.2.patch
>
>
> The vectorization validation codepath should treat the existing operator tree 
> as immutable, so that the VectorizerCannotVectorizeException does not have to 
> undo any changes to the operator tree when caught.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18673) ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT isn't formatted correctly

2018-02-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18673:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Chao for the review.

> ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT isn't formatted correctly
> 
>
> Key: HIVE-18673
> URL: https://issues.apache.org/jira/browse/HIVE-18673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18673.1.patch
>
>
> {{ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT}} doesn't format the amount of time 
> waited correctly. Mainly because Java's {{MessageFormat}} class requires 
> escaping single quotes with another single quote.
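
The quoting rule mentioned above can be reproduced with a short sketch. The message text below is hypothetical and is not the actual `ErrorMsg.SPARK_JOB_MONITOR_TIMEOUT` string; only `java.text.MessageFormat` itself is real. In a `MessageFormat` pattern a lone single quote starts a quoted section, so the apostrophe disappears and the `{0}` placeholder after it is treated as literal text.

```java
import java.text.MessageFormat;

final class MessageFormatQuotes {
    /** Unescaped apostrophe: the quote opens a literal section, so {0} is not substituted. */
    static String unescaped(long seconds) {
        return MessageFormat.format("Job hasn't been submitted after {0}s", seconds);
    }

    /** Doubled apostrophe: renders one literal quote and {0} is substituted. */
    static String escaped(long seconds) {
        return MessageFormat.format("Job hasn''t been submitted after {0}s", seconds);
    }

    public static void main(String[] args) {
        System.out.println(unescaped(61));
        System.out.println(escaped(61));
    }
}
```

With the unescaped pattern the timeout value never appears in the output, which matches the symptom described in the issue.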



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18436) Upgrade to Spark 2.3.0

2018-02-13 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363253#comment-16363253
 ] 

Sahil Takiar commented on HIVE-18436:
-

Had to upgrade the Hive netty-all version, but it looks like the basic tests 
I'm running locally are passing. Attached a patch to get full feedback from 
Hive QA.

> Upgrade to Spark 2.3.0
> --
>
> Key: HIVE-18436
> URL: https://issues.apache.org/jira/browse/HIVE-18436
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18436.1.patch
>
>
> Branching has been completed. Release candidates should be published soon. 
> Might be a while before the actual release, but at least we get to identify 
> any issues early.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18436) Upgrade to Spark 2.3.0

2018-02-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18436:

Status: Patch Available  (was: Open)

> Upgrade to Spark 2.3.0
> --
>
> Key: HIVE-18436
> URL: https://issues.apache.org/jira/browse/HIVE-18436
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18436.1.patch
>
>
> Branching has been completed. Release candidates should be published soon. 
> Might be a while before the actual release, but at least we get to identify 
> any issues early.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18436) Upgrade to Spark 2.3.0

2018-02-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18436:

Attachment: HIVE-18436.1.patch

> Upgrade to Spark 2.3.0
> --
>
> Key: HIVE-18436
> URL: https://issues.apache.org/jira/browse/HIVE-18436
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18436.1.patch
>
>
> Branching has been completed. Release candidates should be published soon. 
> Might be a while before the actual release, but at least we get to identify 
> any issues early.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16125) Split work between reducers.

2018-02-13 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-16125:
--
Attachment: HIVE-16125.6.patch

> Split work between reducers.
> 
>
> Key: HIVE-16125
> URL: https://issues.apache.org/jira/browse/HIVE-16125
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-16125.4.patch, HIVE-16125.5.patch, 
> HIVE-16125.6.patch, HIVE-16125.patch
>
>
> Split work between reducers.
> Currently we have one reducer per segment granularity even if the interval 
> will be partitioned over multiple partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18702) INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting

2018-02-13 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-18702:

Status: In Progress  (was: Patch Available)

> INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
> ---
>
> Key: HIVE-18702
> URL: https://issues.apache.org/jira/browse/HIVE-18702
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0, 2.3.3
>
> Attachments: HIVE-18702.1.patch
>
>
> Enable Hive on TEZ. (MR works fine).
> *STEP 1. Create test data*
> {code}
> nano /home/test/users.txt
> {code}
> Add to file:
> {code}
> Peter,34
> John,25
> Mary,28
> {code}
> {code}
> hadoop fs -mkdir /bug
> hadoop fs -copyFromLocal /home/test/users.txt /bug
> hadoop fs -ls /bug
> {code}
> *EXPECTED RESULT:*
> {code}
> Found 2 items 
>   
> -rwxr-xr-x   3 root root 25 2015-10-15 16:11 /bug/users.txt
> {code}
> *STEP 2. Upload data to hive*
> {code}
> create external table bug(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug';
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Peter   34
> John    25
> Mary    28
> {code}
> {code}
> create external table bug1(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug1';
> insert overwrite table bug select * from bug1;
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Time taken: 0.097 seconds
> {code}
> *ACTUAL RESULT:*
> {code}
> hive>  select * from bug;
> OK
> Peter 34
> John  25
> Mary  28
> Time taken: 0.198 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18702) INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting

2018-02-13 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-18702:

Status: Patch Available  (was: In Progress)

> INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
> ---
>
> Key: HIVE-18702
> URL: https://issues.apache.org/jira/browse/HIVE-18702
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0, 2.3.3
>
> Attachments: HIVE-18702.1.patch
>
>
> Enable Hive on TEZ. (MR works fine).
> *STEP 1. Create test data*
> {code}
> nano /home/test/users.txt
> {code}
> Add to file:
> {code}
> Peter,34
> John,25
> Mary,28
> {code}
> {code}
> hadoop fs -mkdir /bug
> hadoop fs -copyFromLocal /home/test/users.txt /bug
> hadoop fs -ls /bug
> {code}
> *EXPECTED RESULT:*
> {code}
> Found 2 items 
>   
> -rwxr-xr-x   3 root root 25 2015-10-15 16:11 /bug/users.txt
> {code}
> *STEP 2. Upload data to hive*
> {code}
> create external table bug(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug';
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Peter   34
> John    25
> Mary    28
> {code}
> {code}
> create external table bug1(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug1';
> insert overwrite table bug select * from bug1;
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Time taken: 0.097 seconds
> {code}
> *ACTUAL RESULT:*
> {code}
> hive>  select * from bug;
> OK
> Peter 34
> John  25
> Mary  28
> Time taken: 0.198 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18702) INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363229#comment-16363229
 ] 

Hive QA commented on HIVE-18702:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910402/HIVE-18702.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9198/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9198/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9198/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-13 23:34:41.053
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9198/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-13 23:34:41.058
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at cf4114e HIVE-17627 : Use druid scan query instead of the select 
query. (Nishant Bangarwa via Slim B, Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at cf4114e HIVE-17627 : Use druid scan query instead of the select 
query. (Nishant Bangarwa via Slim B, Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-13 23:34:44.680
+ rm -rf ../yetus
rm: cannot remove ‘../yetus/ql/target’: Directory not empty
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910402 - PreCommit-HIVE-Build

> INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
> ---
>
> Key: HIVE-18702
> URL: https://issues.apache.org/jira/browse/HIVE-18702
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0, 2.3.3
>
> Attachments: HIVE-18702.1.patch
>
>
> Enable Hive on TEZ. (MR works fine).
> *STEP 1. Create test data*
> {code}
> nano /home/test/users.txt
> {code}
> Add to file:
> {code}
> Peter,34
> John,25
> Mary,28
> {code}
> {code}
> hadoop fs -mkdir /bug
> hadoop fs -copyFromLocal /home/test/users.txt /bug
> hadoop fs -ls /bug
> {code}
> *EXPECTED RESULT:*
> {code}
> Found 2 items 
>   
> -rwxr-xr-x   3 root root 25 2015-10-15 16:11 /bug/users.txt
> {code}
> *STEP 2. Upload data to hive*
> {code}
> create external table bug(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug';
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Peter   34
> John    25
> Mary    28
> {code}
> {code}
> create external table bug1(name string, age int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug1';
> insert overwrite table bug select * from bug1;
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Time taken: 0.097 seconds
> {code}
> *ACTUAL RESULT:*
> {code}
> hive>  select * from bug;
> OK
> Peter 34
> John  25
> Mary  28
> Time taken: 0.198 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18448) Drop Support For Indexes From Apache Hive

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363227#comment-16363227
 ] 

Hive QA commented on HIVE-18448:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910386/HIVE-18448.01.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 13100 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=179)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=121)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=205)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=224)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=187)
org.apache.hive.hcatalog.pig.TestSequenceFileHCatStorer.testWriteTimestamp 
(batchId=192)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9197/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9197/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9197/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910386 - PreCommit-HIVE-Build

> Drop Support For Indexes From Apache Hive
> -
>
> Key: HIVE-18448
> URL: https://issues.apache.org/jira/browse/HIVE-18448
> Project: Hive
>  Issue Type: Improvement
>  Components: Indexing
>Reporter: BELUGA BEHR
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-18448.01.patch, HIVE-18448.01wip02.patch, 
> HIVE-18448.01wip03.patch, HIVE-18448.01wip04.patch, HIVE-18448.01wip05.patch
>
>
> If a user needs to look up a small subset of records quickly, they can use 
> Apache HBase; if they need fast retrieval of larger sets of data, or fast 
> joins and aggregations, they can use Apache Impala.  It seems to me that Hive 
> indexes do not serve much of a role in the future of Hive.
> Even without moving workloads to other products, columnar file formats with 
> their statistics achieve similar goals as Hive indexes.
> Please consider dropping Indexes from the Apache Hive project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16125) Split work between reducers.

2018-02-13 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-16125:
--
Attachment: HIVE-16125.5.patch

> Split work between reducers.
> 
>
> Key: HIVE-16125
> URL: https://issues.apache.org/jira/browse/HIVE-16125
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-16125.4.patch, HIVE-16125.5.patch, HIVE-16125.patch
>
>
> Split work between reducers.
> Currently we have one reducer per segment granularity even if the interval 
> will be partitioned over multiple partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Attachment: HIVE-18622.096.patch

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
>  {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And, all places that assign a column value need to set noNulls to false when 
> the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
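
The scratch-column hazard described above can be shown with a small self-contained sketch. It does not use Hive's `ColumnVector` classes: `LongVec`, `nullIfNegative`, `addOneBuggy`, `addOneCareful`, and `run` are hypothetical names, and the two "expressions" are invented for illustration. The buggy variant copies `noNulls` from its input and never touches `isNull`, so a `true` left behind by the previous user of the scratch column survives; the careful variant writes the `isNull` entry for every row it produces and never sets `noNulls` to true.

```java
final class ScratchColumnReuse {
    /** Hypothetical stand-in for a long column vector; not Hive's class. */
    static final class LongVec {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;

        LongVec(long... values) {
            vector = values.clone();
            isNull = new boolean[values.length];
        }
    }

    /** Expression #1: copies the input and marks negative values as NULL. */
    static void nullIfNegative(LongVec in, LongVec out) {
        for (int i = 0; i < in.vector.length; i++) {
            if (in.vector[i] < 0) {
                out.isNull[i] = true;
                out.noNulls = false;
            } else {
                out.isNull[i] = false;
                out.vector[i] = in.vector[i];
            }
        }
    }

    /** Expression #2, buggy: trusts the input flag and leaves isNull untouched. */
    static void addOneBuggy(LongVec in, LongVec out) {
        out.noNulls = in.noNulls;
        for (int i = 0; i < in.vector.length; i++) {
            out.vector[i] = in.vector[i] + 1;
        }
    }

    /** Expression #2, careful: sets the isNull entry for every row it writes. */
    static void addOneCareful(LongVec in, LongVec out) {
        for (int i = 0; i < in.vector.length; i++) {
            if (!in.noNulls && in.isNull[i]) {
                out.isNull[i] = true;
                out.noNulls = false;
            } else {
                out.isNull[i] = false;
                out.vector[i] = in.vector[i] + 1;
            }
        }
    }

    /** Runs expression #1 and then expression #2 against the same scratch column. */
    static LongVec run(boolean careful) {
        LongVec scratch = new LongVec(0, 0, 0);
        nullIfNegative(new LongVec(1, -2, 3), scratch); // row 1 becomes NULL
        LongVec second = new LongVec(10, 20, 30);       // no NULLs in this input
        if (careful) {
            addOneCareful(second, scratch);
        } else {
            addOneBuggy(second, scratch);
        }
        return scratch;
    }

    public static void main(String[] args) {
        System.out.println("buggy:   stale isNull[1] = " + run(false).isNull[1]);
        System.out.println("careful: stale isNull[1] = " + run(true).isNull[1]);
    }
}
```

In the buggy run the scratch column reports `noNulls == true` while `isNull[1]` is still `true`, so any downstream code that reads `isNull` directly sees a NULL that the second expression never produced.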



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Attachment: (was: HIVE-18622.096.patch)

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And all places that assign a column value need to set noNulls to false when the 
> value is NULL.
> Almost all cases where noNulls is set to true are incorrect.





[jira] [Commented] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Miklos Gergely (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363169#comment-16363169
 ] 

Miklos Gergely commented on HIVE-16491:
---

[~ashutoshc] patch is here, it sets the flag allowIndexExpr in JoinTypeCheckCtx 
to true.

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Miklos Gergely
>Priority: Major
> Attachments: HIVE-16491.patch
>
>
> Chokes on query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}





[jira] [Assigned] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Miklos Gergely (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-16491:
-

Assignee: Miklos Gergely

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Miklos Gergely
>Priority: Major
> Attachments: HIVE-16491.patch
>
>
> Chokes on query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}





[jira] [Updated] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Miklos Gergely (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-16491:
--
Attachment: HIVE-16491.patch

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Miklos Gergely
>Priority: Major
> Attachments: HIVE-16491.patch
>
>
> Chokes on query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}





[jira] [Updated] (HIVE-16491) CBO cant handle join involving complex types in on condition

2018-02-13 Thread Miklos Gergely (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-16491:
--
Status: Patch Available  (was: Open)

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Miklos Gergely
>Priority: Major
> Attachments: HIVE-16491.patch
>
>
> Chokes on query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}





[jira] [Comment Edited] (HIVE-18619) Verification of temporary Micromanaged table atomicity is needed

2018-02-13 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361859#comment-16361859
 ] 

Steve Yeom edited comment on HIVE-18619 at 2/13/18 10:20 PM:
-

Using the flag by Eugene we can roll back a transaction in the local TxnManager 
instance inside the Driver (as Hadoop and Metastore client) 
in a JUnit based unit test. 
The same basic scenario I mentioned 5 days ago works fine. 
Each reader from SELECT gets ValidTxnList that includes aborted transaction id 
so as to figure out aborted delta directories.
I will add a simple test as a patch for this jira.

Gopal mentioned we need to delete the global transaction entries for the 
temporary table commands which are left over after Hive-18192 is checked in.
The current IMetaStoreClient implementation for temporary tables, 
SessionHiveMetaStoreClient, extends HiveMetaStoreClient, so the rollback() 
method that updates the MetaStore DB is available. Whether the transaction 
state machine ever reaches a state in which that function is called is 
questionable. 



was (Author: steveyeom2017):
Using the flag by Eugene we can roll back a transaction in the local TxnManager 
instance inside the Driver (as Hadoop and Metastore client) 
in a JUnit based unit test. 
The same basic scenario I mentioned 5 days ago works fine. 
Each reader from SELECT gets ValidTxnList that includes aborted transaction id 
so as to figure out aborted delta directories.
I will add a simple test as a patch for this jira.

Gopal mentioned we need to delete the global transaction entries for the 
temporary table commands which are left over after Hive-18192 is checked in.
The current IMetaStoreClient implementation for temporary table does not have 
rollback() method to update the MetaStore DB.


> Verification of temporary Micromanaged table atomicity is needed 
> -
>
> Key: HIVE-18619
> URL: https://issues.apache.org/jira/browse/HIVE-18619
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18619.01.patch
>
>
> Session based temporary table by HIVE-7090 had no consideration of 
> Micromanaged table 
> (MM) since there was no insert-only ACID table at its creation time. 
> HIVE-18599 addressed the issue of no writes during CTTAS (Create Temporary 
> Table As Select)
> on Micro-Managed table. But atomicity of temporary MM table is not verified. 





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2018-02-13 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363134#comment-16363134
 ] 

Misha Dmitriev commented on HIVE-6430:
--

Thank you [~akolb]! This is nice work, of the kind I wish I could do more of :)

> MapJoin hash table has large memory overhead
> 
>
> Key: HIVE-6430
> URL: https://issues.apache.org/jira/browse/HIVE-6430
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 0.14.0
>
> Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
> HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
> HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
> HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
> HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
> HIVE-6430.14.patch, HIVE-6430.patch
>
>
> Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
> for row) can take several hundred bytes, which is ridiculous. I am reducing 
> the size of MJKey and MJRowContainer in other jiras, but in general we don't 
> need to have java hash table there.  We can either use primitive-friendly 
> hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
> primitive keys to single row storage structure without an object per row 
> (similar to vectorization).
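
To make the proposal above concrete, here is a minimal sketch of a primitive-friendly hash table: parallel primitive arrays with linear probing, so no object is allocated per entry. The class name LongToRowOffsetMap, the fixed power-of-two capacity and the idea of storing an int offset into a separate row buffer are all assumptions made for illustration; this is neither HPPC's API nor the structure the patch actually implements.

```java
/** Hypothetical open-addressing map from long keys to int row offsets. */
class LongToRowOffsetMap {
    static final int MISSING = -1;

    private final long[] keys;
    private final int[] rowOffsets;
    private final boolean[] used;
    private final int mask;
    private int size;

    LongToRowOffsetMap(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacity];
        rowOffsets = new int[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    /** Linear probing; terminates because put() always leaves one slot empty. */
    private int findSlot(long key) {
        // Multiply by a large odd constant to spread nearby keys across slots.
        int i = Long.hashCode(key * 0x9E3779B97F4A7C15L) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    /** Inserts the key or overwrites the offset stored for an existing key. */
    void put(long key, int rowOffset) {
        int i = findSlot(key);
        if (!used[i]) {
            if (size == mask) {
                throw new IllegalStateException("map is full");
            }
            used[i] = true;
            keys[i] = key;
            size++;
        }
        rowOffsets[i] = rowOffset;
    }

    /** Returns the stored offset, or MISSING if the key is absent. */
    int get(long key) {
        int i = findSlot(key);
        return used[i] ? rowOffsets[i] : MISSING;
    }

    public static void main(String[] args) {
        LongToRowOffsetMap map = new LongToRowOffsetMap(8);
        map.put(42L, 0);
        map.put(-7L, 16);
        System.out.println(map.get(42L));
        System.out.println(map.get(5L));
    }
}
```

Per entry this layout costs a long, an int and a boolean, with no per-entry object header or boxed key. A real implementation would also need resizing and a way to chain multiple rows per key, which the sketch leaves out.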





[jira] [Commented] (HIVE-18698) Fix TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363122#comment-16363122
 ] 

Hive QA commented on HIVE-18698:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910376/HIVE-18698.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 13174 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9196/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9196/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9196/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910376 - PreCommit-HIVE-Build

> Fix TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]
> --
>
> Key: HIVE-18698
> URL: https://issues.apache.org/jira/browse/HIVE-18698
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-18698.01.patch
>
>
> HIVE-18416 made some extra, unrelated stat updates to the q.out





[jira] [Updated] (HIVE-17627) Use druid scan query instead of the select query.

2018-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17627:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Nishant!

> Use druid scan query instead of the select query.
> -
>
> Key: HIVE-17627
> URL: https://issues.apache.org/jira/browse/HIVE-17627
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17627.1.patch, HIVE-17627.2.patch, HIVE-17627.patch
>
>
> The biggest difference between the select query and the scan query is that the 
> scan query doesn't retain all rows in memory before rows can be returned to 
> the client.
> The select query causes memory pressure when it requires too many rows.
> The scan query doesn't have this issue.
> The scan query can also return all rows without issuing another pagination 
> query, which is extremely useful when querying a historical or realtime node 
> directly.





[jira] [Assigned] (HIVE-18709) Enable Compaction to work on more than one partition per job

2018-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-18709:
-


> Enable Compaction to work on more than one partition per job
> 
>
> Key: HIVE-18709
> URL: https://issues.apache.org/jira/browse/HIVE-18709
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> Currently compaction launches 1 MR job per partition that needs to be 
> compacted.
> The number of tasks is equal to the number of buckets in the table (or the 
> number of writers in the 'widest' write).
> The number of AMs in a cluster is usually limited to a small percentage of 
> the nodes.  This limits how much compaction can be done in parallel.
> Investigate what it would take for a single job to be able to handle multiple 
> partitions.  





[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Attachment: HIVE-18622.096.patch

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And all places that assign a column value need to set noNulls to false when the 
> value is NULL.
> Almost all cases where noNulls is set to true are incorrect.





[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Attachment: (was: HIVE-18622.096.patch)

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And all places that assign a column value need to set noNulls to false when the 
> value is NULL.
> Almost all cases where noNulls is set to true are incorrect.





[jira] [Commented] (HIVE-17871) Add non nullability flag to druid time column

2018-02-13 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363080#comment-16363080
 ] 

slim bouguerra commented on HIVE-17871:
---

I need to remove the testing code since it is part of Calcite, though. Let me 
double check. 

> Add non nullability flag to druid time column
> -
>
> Key: HIVE-17871
> URL: https://issues.apache.org/jira/browse/HIVE-17871
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17871.2.patch, HIVE-17871.patch
>
>
> Druid time column is non null all the time.
> Adding the non nullability flag will enable extra calcite goodness  like 
> transforming 
> {code} select count(`__time`) from table {code} to {code} select count(*) 
> from table {code}





[jira] [Commented] (HIVE-17627) Use druid scan query instead of the select query.

2018-02-13 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363077#comment-16363077
 ] 

slim bouguerra commented on HIVE-17627:
---

This can go in; I will submit a follow-up once the Calcite release is done. 

> Use druid scan query instead of the select query.
> -
>
> Key: HIVE-17627
> URL: https://issues.apache.org/jira/browse/HIVE-17627
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17627.1.patch, HIVE-17627.2.patch, HIVE-17627.patch
>
>
> The biggest difference between the select query and the scan query is that the 
> scan query doesn't retain all rows in memory before rows can be returned to 
> the client.
> The select query causes memory pressure when it requires too many rows.
> The scan query doesn't have this issue.
> The scan query can also return all rows without issuing another pagination 
> query, which is extremely useful when querying a historical or realtime node 
> directly.





[jira] [Updated] (HIVE-18708) Vectorization: Delay out-of-tree fixups till whole work is vectorized

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18708:
---
Attachment: HIVE-18708.1.patch

> Vectorization: Delay out-of-tree fixups till whole work is vectorized
> -
>
> Key: HIVE-18708
> URL: https://issues.apache.org/jira/browse/HIVE-18708
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-18708.1.patch
>
>
> The vectorization validation codepath should treat the existing operator tree 
> as immutable, so that the VectorizerCannotVectorizeException does not have to 
> undo any changes to the operator tree when caught.





[jira] [Assigned] (HIVE-18708) Vectorization: Delay out-of-tree fixups till whole work is vectorized

2018-02-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-18708:
--

Assignee: Gopal V

> Vectorization: Delay out-of-tree fixups till whole work is vectorized
> -
>
> Key: HIVE-18708
> URL: https://issues.apache.org/jira/browse/HIVE-18708
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>
> The vectorization validation codepath should treat the existing operator tree 
> as immutable, so that the VectorizerCannotVectorizeException does not have to 
> undo any changes to the operator tree when caught.





[jira] [Commented] (HIVE-17627) Use druid scan query instead of the select query.

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363062#comment-16363062
 ] 

Ashutosh Chauhan commented on HIVE-17627:
-

[~bslim] You were reviewing this. Is this ready for commit or does this need 
more work?

> Use druid scan query instead of the select query.
> -
>
> Key: HIVE-17627
> URL: https://issues.apache.org/jira/browse/HIVE-17627
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17627.1.patch, HIVE-17627.2.patch, HIVE-17627.patch
>
>
> The biggest difference between the select query and the scan query is that the 
> scan query doesn't retain all rows in memory before rows can be returned to 
> the client.
> The select query causes memory pressure when it requires too many rows.
> The scan query doesn't have this issue.
> The scan query can also return all rows without issuing another pagination 
> query, which is extremely useful when querying a historical or realtime node 
> directly.





[jira] [Commented] (HIVE-17871) Add non nullability flag to druid time column

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363058#comment-16363058
 ] 

Ashutosh Chauhan commented on HIVE-17871:
-

[~bslim] Does this need more work or is this ready ?

> Add non nullability flag to druid time column
> -
>
> Key: HIVE-17871
> URL: https://issues.apache.org/jira/browse/HIVE-17871
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17871.2.patch, HIVE-17871.patch
>
>
> Druid time column is non null all the time.
> Adding the non nullability flag will enable extra calcite goodness  like 
> transforming 
> {code} select count(`__time`) from table {code} to {code} select count(*) 
> from table {code}





[jira] [Updated] (HIVE-18569) Hive Druid indexing not dealing with decimals in correct way.

2018-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18569:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Nishant!

> Hive Druid indexing not dealing with decimals in correct way.
> -
>
> Key: HIVE-18569
> URL: https://issues.apache.org/jira/browse/HIVE-18569
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18569.1.patch, HIVE-18569.patch
>
>
> Currently, a decimal column is indexed as double in druid.
> This should not happen and either the user has to add an explicit cast or we 
> can add a flag to enable approximation.





[jira] [Updated] (HIVE-18433) Upgrade version of com.fasterxml.jackson

2018-02-13 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-18433:
---
Attachment: HIVE-18433.2.patch

> Upgrade version of com.fasterxml.jackson
> 
>
> Key: HIVE-18433
> URL: https://issues.apache.org/jira/browse/HIVE-18433
> Project: Hive
>  Issue Type: Task
>Reporter: Sahil Takiar
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-18433.1.patch, HIVE-18433.2.patch
>
>
> Let's upgrade to version 2.9.2





[jira] [Commented] (HIVE-17735) ObjectStore.addNotificationEvent is leaking queries

2018-02-13 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363054#comment-16363054
 ] 

Aihua Xu commented on HIVE-17735:
-

Pushed to master. Thanks [~ychena] for reviewing.

> ObjectStore.addNotificationEvent is leaking queries
> ---
>
> Key: HIVE-17735
> URL: https://issues.apache.org/jira/browse/HIVE-17735
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-17735.1.patch, HIVE-17735.2.patch
>
>
> In ObjectStore.addNotificationEvent():
> {code}
>   Query objectQuery = pm.newQuery(MNotificationNextId.class);
>   Collection ids = (Collection) objectQuery.execute();
> {code}
> The query is never closed.
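
A minimal sketch of the leak and of one way to avoid it follows. TrackedQuery is a hypothetical stand-in that only counts open instances; it is not the query class used by ObjectStore, and the names nextIdLeaky and nextIdClosed are invented for the example. The actual patch may release the query differently, for instance in a finally block.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a query object that holds resources until closed. */
class TrackedQuery implements AutoCloseable {
    /** Number of queries that have been created but not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    TrackedQuery() {
        OPEN.incrementAndGet();
    }

    List<Long> execute() {
        return Collections.singletonList(1L);
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

class QueryLeakSketch {
    /** Mirrors the reported pattern: the query is executed and then forgotten. */
    static long nextIdLeaky() {
        TrackedQuery query = new TrackedQuery();
        return query.execute().get(0);
    }

    /** Closes the query on every path, including when execute() throws. */
    static long nextIdClosed() {
        try (TrackedQuery query = new TrackedQuery()) {
            return query.execute().get(0);
        }
    }

    public static void main(String[] args) {
        nextIdClosed();
        System.out.println("open after closed variant: " + TrackedQuery.OPEN.get());
        nextIdLeaky();
        System.out.println("open after leaky variant:  " + TrackedQuery.OPEN.get());
    }
}
```

Each call to the leaky variant leaves one more query open, which is how a method invoked for every notification event can accumulate resources over time.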





[jira] [Updated] (HIVE-18586) Upgrade Derby to 10.14.1.0

2018-02-13 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18586:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~janulatha] for the work.

> Upgrade Derby to 10.14.1.0
> --
>
> Key: HIVE-18586
> URL: https://issues.apache.org/jira/browse/HIVE-18586
> Project: Hive
>  Issue Type: Improvement
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18586.1.patch, HIVE-18586.2.patch, 
> HIVE-18586.3.patch, HIVE-18586.4.patch, HIVE-18586.5.patch
>
>






[jira] [Commented] (HIVE-18686) Installation on Postgres and Oracle broken

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363053#comment-16363053
 ] 

Ashutosh Chauhan commented on HIVE-18686:
-

+1 pending tests

> Installation on Postgres and Oracle broken
> --
>
> Key: HIVE-18686
> URL: https://issues.apache.org/jira/browse/HIVE-18686
> Project: Hive
>  Issue Type: Bug
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18686.2.patch, HIVE-18686.patch
>
>
> HIVE-18614 broke the installation and upgrade on Postgres and Oracle.  It 
> calls Connection.setSchema in the JDBC driver.  But the JDBC drivers for 
> these databases don't support that call.





[jira] [Commented] (HIVE-18387) Minimize time that REBUILD locks the materialized view

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363049#comment-16363049
 ] 

Ashutosh Chauhan commented on HIVE-18387:
-

+1 some minor comments on RB.

> Minimize time that REBUILD locks the materialized view
> --
>
> Key: HIVE-18387
> URL: https://issues.apache.org/jira/browse/HIVE-18387
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18387.01.patch, HIVE-18387.02.patch, 
> HIVE-18387.03.patch, HIVE-18387.04.patch, HIVE-18387.patch
>
>
> Currently, REBUILD will block the materialized view while the final move task 
> is being executed. The idea for this improvement is to create the new 
> materialization in a new folder (new version) and then just flip the pointer 
> to the folder in the MV definition in the metastore. REBUILD operations for a 
> given MV should get an exclusive lock though, i.e., they cannot be executed 
> concurrently.





[jira] [Commented] (HIVE-18698) Fix TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363032#comment-16363032
 ] 

Hive QA commented on HIVE-18698:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 54s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 8cf36e7 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9196/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9196/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]
> --
>
> Key: HIVE-18698
> URL: https://issues.apache.org/jira/browse/HIVE-18698
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-18698.01.patch
>
>
> HIVE-18416 has made some extra stat updates to the q.out which are unrelated





[jira] [Commented] (HIVE-18686) Installation on Postgres and Oracle broken

2018-02-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363030#comment-16363030
 ] 

Alan Gates commented on HIVE-18686:
---

To set the record straight, HIVE-18614 did not break the tests, my poor merging 
of that change into mine did.  Thanks to [~mgergely] for helping me find the 
error.  I've attached a new patch that fixes the problem correctly.

> Installation on Postgres and Oracle broken
> --
>
> Key: HIVE-18686
> URL: https://issues.apache.org/jira/browse/HIVE-18686
> Project: Hive
>  Issue Type: Bug
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18686.2.patch, HIVE-18686.patch
>
>
> HIVE-18614 broke the installation and upgrade on Postgres and Oracle.  It 
> calls Connection.setSchema in the JDBC driver.  But the JDBC drivers for 
> these databases don't support that call.





[jira] [Updated] (HIVE-18686) Installation on Postgres and Oracle broken

2018-02-13 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-18686:
--
Attachment: HIVE-18686.2.patch

> Installation on Postgres and Oracle broken
> --
>
> Key: HIVE-18686
> URL: https://issues.apache.org/jira/browse/HIVE-18686
> Project: Hive
>  Issue Type: Bug
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18686.2.patch, HIVE-18686.patch
>
>
> HIVE-18614 broke the installation and upgrade on Postgres and Oracle.  It 
> calls Connection.setSchema in the JDBC driver.  But the JDBC drivers for 
> these databases don't support that call.





[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363018#comment-16363018
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910349/HIVE-18553.9.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 13154 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testExecuteJDOQL (batchId=226)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testListFSRoot (batchId=226)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testUpdateFSRootLocation 
(batchId=226)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9195/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9195/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9195/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910349 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to read from a 
> Parquet table to which new columns have been added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | 

[jira] [Commented] (HIVE-18635) Generalize hook dispatch logics in Driver

2018-02-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363003#comment-16363003
 ] 

Ashutosh Chauhan commented on HIVE-18635:
-

+1

> Generalize hook dispatch logics in Driver
> -
>
> Key: HIVE-18635
> URL: https://issues.apache.org/jira/browse/HIVE-18635
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-18635.01.patch, HIVE-18635.02.patch
>
>
> Currently it is only possible to "add" new hooks by either hard-coding them 
> or "pasting" the class name into the hiveconf value. It would be good to make 
> it possible to add hooks through an API as well.





[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Status: Patch Available  (was: In Progress)

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
> Many vector expression classes set noNulls to true, which does not work if 
> the column in the VRB is a scratch column being reused. The previous use may 
> have set noNulls to false, leaving some rows marked as NULL in the isNull array. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> Vector expressions also need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false, and every place that assigns a column value 
> needs to set noNulls to false when the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
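
The scratch-column problem described above can be reproduced in a few lines. The sketch below uses a simplified stand-in class rather than Hive's real column vector classes, and the method names are made up for illustration; it only shows how a stale isNull entry survives when a later expression sets noNulls without touching the array.

```java
/** Simplified stand-in for a reusable scratch column; not Hive's ColumnVector. */
final class ScratchLongVector {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;

    ScratchLongVector(int size) {
        vector = new long[size];
        isNull = new boolean[size];
    }
}

final class StaleIsNullDemo {
    /** Expression #1: produces NULL for row 1 and flags the column accordingly. */
    static void writeWithNull(ScratchLongVector out) {
        out.vector[0] = 10;
        out.isNull[0] = false;
        out.isNull[1] = true;
        out.noNulls = false;
    }

    /** Expression #2, buggy: copies the input's flag and never touches isNull. */
    static void reuseUnsafely(ScratchLongVector out, boolean inputNoNulls) {
        out.noNulls = inputNoNulls;
        for (int i = 0; i < out.vector.length; i++) {
            out.vector[i] = 42;
        }
    }

    /** Expression #2, careful: sets the isNull entry of every row it writes. */
    static void reuseSafely(ScratchLongVector out) {
        for (int i = 0; i < out.vector.length; i++) {
            out.vector[i] = 42;
            out.isNull[i] = false;
        }
        // noNulls is deliberately left alone here rather than forced to true.
    }

    /** Models downstream code that looks at isNull without checking noNulls. */
    static boolean looksNull(ScratchLongVector v, int row) {
        return v.isNull[row];
    }
}
```

After writeWithNull followed by reuseUnsafely, row 1 holds a valid value but still looks NULL to any reader that consults isNull directly; reuseSafely clears the stale entry.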





[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Attachment: HIVE-18622.096.patch

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
> Many vector expression classes set noNulls to true, which does not work if 
> the column in the VRB is a scratch column being reused. The previous use may 
> have set noNulls to false, leaving some rows marked as NULL in the isNull array. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> Vector expressions also need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false, and every place that assigns a column value 
> needs to set noNulls to false when the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.





[jira] [Updated] (HIVE-18622) Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly

2018-02-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18622:

Status: In Progress  (was: Patch Available)

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> -
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch
>
>
>  
> Many vector expression classes set noNulls to true, which does not work if 
> the column in the VRB is a scratch column being reused. The previous use may 
> have set noNulls to false, leaving some rows marked as NULL in the isNull array. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>   // Carefully handle NULLs...
>   /*
>    * For better performance on LONG/DOUBLE we don't want the conditional
>    * statements inside the for loop.
>    */
>   outputColVector.noNulls = false;
> {code}
> Vector expressions also need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false, and every place that assigns a column value 
> needs to set noNulls to false when the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2018-02-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362992#comment-16362992
 ] 

Sergey Shelukhin commented on HIVE-6430:


This has since been superseded by vectorized mapjoin, which improves the 
hashtable further and specializes it for Java types and special cases.

> MapJoin hash table has large memory overhead
> 
>
> Key: HIVE-6430
> URL: https://issues.apache.org/jira/browse/HIVE-6430
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 0.14.0
>
> Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
> HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
> HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
> HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
> HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
> HIVE-6430.14.patch, HIVE-6430.patch
>
>
> Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
> for row) can take several hundred bytes, which is ridiculous. I am reducing 
> the size of MJKey and MJRowContainer in other jiras, but in general we don't 
> need to have a Java hash table there.  We can either use a primitive-friendly 
> hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
> primitive keys to a single row storage structure without an object per row 
> (similar to vectorization).
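
To make the "primitive-friendly hashtable" suggestion above concrete, here is a minimal open-addressing sketch that maps long keys to int row offsets without allocating an object per entry. It is neither HPPC's API nor Hive's eventual implementation; the class name, the linear probing scheme and the 0.5 load factor are assumptions chosen to keep the example short.

```java
/** Hypothetical long -> int map backed by flat primitive arrays. */
final class LongIntOpenHashMap {
    private long[] keys = new long[16];
    private int[] values = new int[16];
    private boolean[] used = new boolean[16];
    private int size;

    void put(long key, int value) {
        // Keep the load factor at or below 0.5 so probing always finds a free slot.
        if ((size + 1) * 2 > keys.length) {
            grow();
        }
        int slot = findSlot(keys, used, key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    int getOrDefault(long key, int defaultValue) {
        int slot = findSlot(keys, used, key);
        return used[slot] ? values[slot] : defaultValue;
    }

    int size() {
        return size;
    }

    /** Linear probing: returns the slot holding the key, or the first free slot. */
    private static int findSlot(long[] keys, boolean[] used, long key) {
        int mask = keys.length - 1; // array length is always a power of two
        int slot = Long.hashCode(key * 0x9E3779B97F4A7C15L) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    /** Doubles the arrays and re-inserts every existing entry. */
    private void grow() {
        long[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                int slot = findSlot(keys, used, oldKeys[i]);
                used[slot] = true;
                keys[slot] = oldKeys[i];
                values[slot] = oldValues[i];
            }
        }
    }
}
```

Each entry costs 13 bytes of array space per slot (a long, an int and a boolean), compared with a boxed key, a boxed value and an entry object in java.util.HashMap.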





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2018-02-13 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362986#comment-16362986
 ] 

Alexander Kolbasov commented on HIVE-6430:
--

[~mi...@cloudera.com] FYI.

> MapJoin hash table has large memory overhead
> 
>
> Key: HIVE-6430
> URL: https://issues.apache.org/jira/browse/HIVE-6430
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 0.14.0
>
> Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
> HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
> HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
> HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
> HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
> HIVE-6430.14.patch, HIVE-6430.patch
>
>
> Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
> for row) can take several hundred bytes, which is ridiculous. I am reducing 
> the size of MJKey and MJRowContainer in other jiras, but in general we don't 
> need to have a Java hash table there.  We can either use a primitive-friendly 
> hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
> primitive keys to a single row storage structure without an object per row 
> (similar to vectorization).





[jira] [Commented] (HIVE-18685) Add catalogs to metastore

2018-02-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362969#comment-16362969
 ] 

Alan Gates commented on HIVE-18685:
---

Finally, the comments on the Thrift changes:

I am in full agreement that we need to make a v2 of the API.  I would like to 
start that discussion by asking whether we should stick with Thrift or move to 
something else.  But I don't want to tie this (or any other) feature to that, 
as it will be a many-month project with a complex migration plan.

I also want this to be 100% backwards compatible, meaning old clients with no 
knowledge of catalogs should still be able to work.  So I don't want to change 
existing calls like get_table() to add the catalog name.

Alexander's idea of re-using the existing Thrift calls by jamming the catalog 
name into the dbname is very interesting.  It avoids duplicating 75% of the 
existing Thrift calls.  I would only need to add Thrift calls for 
createCatalog, getCatalog, etc.  I'll explore this and see if it's viable.  I 
think I will likely still change HiveMetaStoreClient to add methods with an 
explicit catalog name, but that is much easier than adding Thrift methods.  And 
in HiveMetaStoreClient I can explicitly deprecate the old methods, giving users 
a warning not to continue using them.  This will also hide the hackery from 95% 
of our users.
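
As a rough illustration of what "jamming the catalog name into the dbname" could look like, here is a small sketch. The '@' marker, the '#' separator and the class and method names are all assumptions made for the example, since no encoding has been settled in this thread; the point is only that a plain database name from an old client still decodes to the default 'hive' catalog.

```java
/** Hypothetical encoding of "catalog + database" into the existing dbname field. */
final class CatalogDbName {
    static final String DEFAULT_CATALOG = "hive";
    private static final char MARKER = '@';    // assumed prefix flagging an encoded name
    private static final char SEPARATOR = '#'; // assumed catalog/database separator

    /** New clients send the catalog along inside the dbname argument. */
    static String encode(String catalog, String db) {
        return MARKER + catalog + SEPARATOR + db;
    }

    /** Returns {catalog, database}; names without the marker come from old clients. */
    static String[] decode(String dbNameField) {
        if (dbNameField.isEmpty() || dbNameField.charAt(0) != MARKER) {
            return new String[] {DEFAULT_CATALOG, dbNameField};
        }
        int sep = dbNameField.indexOf(SEPARATOR);
        if (sep < 0) {
            throw new IllegalArgumentException("Missing separator in " + dbNameField);
        }
        return new String[] {dbNameField.substring(1, sep), dbNameField.substring(sep + 1)};
    }
}
```

A scheme like this only works if the marker and separator characters are not legal in database names, which ties in with the question about naming restrictions elsewhere in this thread.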

> Add catalogs to metastore
> -
>
> Key: HIVE-18685
> URL: https://issues.apache.org/jira/browse/HIVE-18685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Attachments: HMS Catalog Design Doc.pdf
>
>
> SQL supports two levels of namespaces, called in the spec catalogs and 
> schemas (with schema being equivalent to Hive's database).  I propose to add 
> the upper level of catalog.  The attached design doc covers the use cases, 
> requirements, and brief discussion of how it will be implemented in a 
> backwards compatible way.





[jira] [Commented] (HIVE-18685) Add catalogs to metastore

2018-02-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362962#comment-16362962
 ] 

Alan Gates commented on HIVE-18685:
---

This comment deals with all of the database questions.  These are the changes 
to the Derby upgrade script that I made:
{code:sql}
-- Create new Catalog table
CREATE TABLE "APP"."CTLGS" (
    "CTLG_ID" BIGINT NOT NULL,
    "NAME" VARCHAR(256) UNIQUE,
    "DESC" VARCHAR(4000),
    "LOCATION_URI" VARCHAR(4000) NOT NULL);

ALTER TABLE "APP"."CTLGS" ADD CONSTRAINT "CTLGS_PK" PRIMARY KEY ("CTLG_ID");

-- Insert a default value. The location is TBD. Hive will fix this when it starts.
-- The name must match the value written to DBS.CTLG_NAME below ('hive'),
-- otherwise the foreign key cannot be added.
INSERT INTO "APP"."CTLGS" VALUES (1, 'hive', 'Default catalog for Hive', 'TBD');

-- Drop the unique index on DBS
DROP INDEX "APP"."UNIQUE_DATABASE";

-- Add the new column to the DBS table, can't put in the not null constraint yet
ALTER TABLE "APP"."DBS" ADD COLUMN "CTLG_NAME" VARCHAR(256);

-- Update all records in the DBS table to point to the Hive catalog
UPDATE "APP"."DBS"
  SET "CTLG_NAME" = 'hive';

-- Add the not null constraint
--ALTER TABLE "APP"."DBS" ADD CONSTRAINT "DBS_CTLG_NN" NOT NULL ("CTLG_NAME");
ALTER TABLE "APP"."DBS" ALTER COLUMN "CTLG_NAME" NOT NULL;

-- Put back the unique index
CREATE UNIQUE INDEX "APP"."UNIQUE_DATABASE" ON "APP"."DBS" ("NAME", "CTLG_NAME");

-- Add the foreign key
ALTER TABLE "APP"."DBS" ADD CONSTRAINT "DBS_FK1" FOREIGN KEY ("CTLG_NAME")
  REFERENCES "APP"."CTLGS" ("NAME") ON DELETE NO ACTION ON UPDATE NO ACTION;
{code}
Regarding the location, we need to store that because we need to use it when we 
create databases in a catalog. Currently Hive creates database locations by 
adding a directory named after the database under the default warehouse 
location (from the config file). But that won't work once we have multiple 
catalogs, because two databases of the same name may exist in separate 
catalogs. So my plan is for each catalog to have a location (by default an HDFS 
directory, though of course it could be an S3 bucket or whatever) where 
database directories will be created. For the default 'hive' catalog that 
location will be the default warehouse location from the config file. I don't 
think there's any need to tie the catalog name and HDFS location together. 
Unlike databases and tables, I am not planning to allow the location to default 
to something; the user must specify it when creating a catalog.
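
A tiny sketch of the path resolution this implies, under stated assumptions: the helper name is made up, and the ".db" directory suffix is assumed to carry over from Hive's existing warehouse layout rather than being something decided here.

```java
/** Hypothetical helper: resolves a database directory under its catalog's location. */
final class CatalogLocations {
    static String databaseDir(String catalogLocation, String dbName) {
        // Drop a single trailing slash so the joined path has no empty segment.
        String base = catalogLocation.endsWith("/")
                ? catalogLocation.substring(0, catalogLocation.length() - 1)
                : catalogLocation;
        return base + "/" + dbName + ".db"; // ".db" suffix is an assumption
    }
}
```

With this, a database named sales in two catalogs located at hdfs://nn/warehouse and s3a://bucket/other resolves to two distinct directories, which is the collision the single warehouse location cannot avoid.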
{quote}Is there a need to explicitly create 'hive' catalog - can catalogs be 
created on demand?
{quote}
Yes, because of the constraints being added to the RDBMS each database will 
have to be associated with a catalog.  Plus it seems cleaner to explicitly have 
everything in a catalog.
{quote} * When the administrator defines the security model, how is it 
stored/retrieved? Maybe it should be catalog-level information
 * It might be difficult to have the same user base / security model working 
for every connecting application, especially with transient clusters - maybe it 
is not an immediate concern, but it might be good to keep in mind.{quote}
My plan is to store the security model for the catalog in the CTLGS table, 
though as you see above I haven't added that yet.  I haven't finished the 
design on the security piece yet, and I agree that having varying security 
models inside the system, especially once we allow users to do cross catalog 
operations, will be challenging.  But I believe it is a compelling enough 
feature that we will want it.

> Add catalogs to metastore
> -
>
> Key: HIVE-18685
> URL: https://issues.apache.org/jira/browse/HIVE-18685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Attachments: HMS Catalog Design Doc.pdf
>
>
> SQL supports two levels of namespaces, called in the spec catalogs and 
> schemas (with schema being equivalent to Hive's database).  I propose to add 
> the upper level of catalog.  The attached design doc covers the use cases, 
> requirements, and brief discussion of how it will be implemented in a 
> backwards compatible way.





[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362958#comment-16362958
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} ql: The patch generated 0 new + 68 unchanged - 230 fixed = 68 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 8cf36e7 |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9195/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9195/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to read from a 
> Parquet table to which new columns have been added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | tinyint    |          |
> | t2        | tinyint    |          |
> | i1        | int        |          |
> | i2        | int        |          |
> +-----------+------------+----------+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> 

[jira] [Commented] (HIVE-18685) Add catalogs to metastore

2018-02-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362951#comment-16362951
 ] 

Alan Gates commented on HIVE-18685:
---

{quote}Your document doesn't describe any restrictions on catalog name - should 
it be limited by size or allowed character set?
{quote}
I was assuming the same sorts of restrictions we have on database and table 
names, but I can make that explicit in the document.

> Add catalogs to metastore
> -
>
> Key: HIVE-18685
> URL: https://issues.apache.org/jira/browse/HIVE-18685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Attachments: HMS Catalog Design Doc.pdf
>
>
> SQL supports two levels of namespaces, called in the spec catalogs and 
> schemas (with schema being equivalent to Hive's database).  I propose to add 
> the upper level of catalog.  The attached design doc covers the use cases, 
> requirements, and brief discussion of how it will be implemented in a 
> backwards compatible way.





[jira] [Commented] (HIVE-18685) Add catalogs to metastore

2018-02-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362948#comment-16362948
 ] 

Alan Gates commented on HIVE-18685:
---

{quote}Do you envision catalog having some kind of properties (outside of 
security model) that can be shared by all databases within a catalog?
{quote}
Not at the moment, but we might think of others in the future.

> Add catalogs to metastore
> -
>
> Key: HIVE-18685
> URL: https://issues.apache.org/jira/browse/HIVE-18685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Attachments: HMS Catalog Design Doc.pdf
>
>
> SQL supports two levels of namespaces, called in the spec catalogs and 
> schemas (with schema being equivalent to Hive's database).  I propose to add 
> the upper level of catalog.  The attached design doc covers the use cases, 
> requirements, and brief discussion of how it will be implemented in a 
> backwards compatible way.




