[jira] [Commented] (HIVE-22255) Hive don't trigger Major Compaction automatically if table contains only base files

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956690#comment-16956690
 ] 

Hive QA commented on HIVE-22255:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 15s{color} | {color:blue} standalone-metastore/metastore-server in master has 171 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 59s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 51s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19099/dev-support/hive-personality.sh |
| git revision | master / 40cd40d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19099/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Hive don't trigger Major Compaction automatically if table contains only base 
> files 
> 
>
> Key: HIVE-22255
> URL: https://issues.apache.org/jira/browse/HIVE-22255
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 3.1.2
> Environment: Hive-3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-22255.01.patch, HIVE-22255.patch
>
>
> A user may run into this issue when a table consists only of base files and 
> has no delta files: the following condition then evaluates to false and 
> automatic major compaction is skipped.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
>  
> Steps to Reproduce:
>  # Create an ACID table:
> {code:sql}
> create table myacid(id int);
> {code}
>  # Run multiple insert overwrite statements:
> {code:sql}
> insert overwrite table myacid values(1);
> insert overwrite table myacid values(2),(3),(4);
> {code}
>  # DFS ls output
> 
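
The condition described in the HIVE-22255 report above can be illustrated with a minimal sketch. It assumes the initiator decides on a major compaction by comparing the delta-to-base size ratio against a threshold; the 0.1 value is assumed to mirror the default of hive.compactor.delta.pct.threshold, and the class and method names are hypothetical. This is not the code in Initiator.java.

```java
// Hypothetical stand-in for a ratio-based major-compaction check; not Hive's Initiator.
class CompactionCheckSketch {
    // Assumed threshold, used here for illustration only.
    static final float DELTA_PCT_THRESHOLD = 0.1f;

    /** Returns true when the deltas are large enough, relative to the base, to justify a major compaction. */
    static boolean needsMajorCompaction(long baseSize, long deltaSize) {
        if (baseSize == 0) {
            // No base at all: any delta data calls for a major compaction.
            return deltaSize > 0;
        }
        return (float) deltaSize / (float) baseSize > DELTA_PCT_THRESHOLD;
    }

    public static void main(String[] args) {
        // Only base files (for example, written by repeated INSERT OVERWRITE), no deltas:
        // the ratio is 0, so the check is false however many base directories pile up.
        System.out.println(needsMajorCompaction(3_000L, 0L));   // false
        // Same base size with deltas worth 20% of it: the check passes.
        System.out.println(needsMajorCompaction(3_000L, 600L)); // true
    }
}
```

Under these assumptions, a check that looks only at the delta ratio can never fire for a table without deltas, which matches the behaviour the report describes.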

[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956685#comment-16956685
 ] 

Gopal Vijayaraghavan commented on HIVE-22315:
-

LGTM - +1 tests pending

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch, 
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.6.patch, 
> HIVE-22315.7.patch, HIVE-22315.8.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. 
> This Jira will add support for dividing a Decimal64 column by a Decimal64 
> scalar.
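
As a rough illustration of what dividing a Decimal64 column by a Decimal64 scalar involves, the sketch below treats each value as a long unscaled number with a fixed scale. The class and method names, the half-up rounding, and the caller-supplied result scale are all assumptions made for the example; Hive's vectorized expressions and its rules for deriving the result type are not reproduced here.

```java
import java.util.Arrays;

// Hypothetical sketch: decimals held as long unscaled values with a fixed scale.
class Decimal64DivideSketch {
    /** Returns 10^n, failing loudly if the result does not fit in a long. */
    static long pow10(int n) {
        long p = 1;
        for (int i = 0; i < n; i++) {
            p = Math.multiplyExact(p, 10L);
        }
        return p;
    }

    /** Divides a (scale scaleA) by b (scale scaleB), returning an unscaled value at resultScale, rounded half up. */
    static long divide(long a, int scaleA, long b, int scaleB, int resultScale) {
        int shift = scaleB + resultScale - scaleA;
        if (shift < 0) {
            throw new IllegalArgumentException("sketch only handles a non-negative shift");
        }
        long scaled = Math.multiplyExact(a, pow10(shift)); // throws on overflow
        long q = scaled / b;                               // throws ArithmeticException when b == 0
        long r = scaled % b;
        // Round half away from zero; assumes |b| is small enough that doubling |r| cannot overflow.
        if (Math.abs(r) * 2 >= Math.abs(b)) {
            q += ((scaled < 0) != (b < 0)) ? -1 : 1;
        }
        return q;
    }

    /** Applies the scalar division to every entry of a column. */
    static long[] divideColumn(long[] column, int scaleA, long b, int scaleB, int resultScale) {
        long[] out = new long[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = divide(column[i], scaleA, b, scaleB, resultScale);
        }
        return out;
    }

    public static void main(String[] args) {
        // 1.00, 2.00 and -2.00 divided by 3.00, all at scale 2.
        long[] result = divideColumn(new long[] {100, 200, -200}, 2, 300, 2, 2);
        System.out.println(Arrays.toString(result)); // [33, 67, -67]
    }
}
```

Keeping the arithmetic in long values is the point of a Decimal64 path; the overflow checks mark where a real implementation would need a fallback.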



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956677#comment-16956677
 ] 

Hive QA commented on HIVE-21303:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983670/HIVE-21303.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 17545 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_bigdata] (batchId=129)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[input14] (batchId=149)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[input17] (batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[input18] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[insert_into1] (batchId=123)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[insert_into2] (batchId=154)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[insert_into3] (batchId=125)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mapreduce1] (batchId=144)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mapreduce2] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_transform] (batchId=148)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[rcfile_bigdata] (batchId=123)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[script_env_var1] (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[script_env_var2] (batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[script_pipe] (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[scriptfile1] (batchId=151)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform1] (batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform2] (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr1] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr2] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union23] (batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_script] (batchId=144)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testCliDriver[spark_job_max_tasks] (batchId=301)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testCliDriver[spark_stage_max_tasks] (batchId=301)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testCliDriver[spark_task_failure] (batchId=301)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19098/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19098/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19098/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983670 - PreCommit-HIVE-Build

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch, 
> HIVE-21303.2.patch
>
>
> Remove use of the deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}.
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled. Look it up once and cache the value instead.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.
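
The "look it up once and cache the value" suggestion in the description above could look roughly like the following. This is a minimal sketch, not the HIVE-21303 patch: a plain Map stands in for the Hive configuration, and the class name, the configuration key, and the unescaping rule are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader; a plain Map stands in for the Hive configuration object.
class CachedFlagReaderSketch {
    // Illustrative key only, not a real Hive setting.
    static final String ESCAPE_KEY = "example.script.escape";

    private final boolean escapeEnabled; // resolved once in the constructor, then reused
    int lookups = 0;                     // counts configuration reads, for demonstration only

    CachedFlagReaderSketch(Map<String, String> conf) {
        lookups++;
        this.escapeEnabled = Boolean.parseBoolean(conf.getOrDefault(ESCAPE_KEY, "false"));
    }

    /** Processes one row using the cached flag; the configuration map is not consulted again. */
    String next(String row) {
        // Toy unescaping rule: turn the two characters backslash + 't' into a tab.
        return escapeEnabled ? row.replace("\\t", "\t") : row;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ESCAPE_KEY, "true");
        CachedFlagReaderSketch reader = new CachedFlagReaderSketch(conf);
        for (int i = 0; i < 3; i++) {
            reader.next("a\\tb");
        }
        System.out.println(reader.lookups); // 1
    }
}
```

The trade-off is that a flag changed after construction is not picked up, which is usually acceptable for a per-task reader.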





[jira] [Commented] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread Shubham Chaurasia (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956652#comment-16956652
 ] 

Shubham Chaurasia commented on HIVE-22360:
--

[~belugabehr]
{quote}
Shubham Chaurasia Please take a look at the work I did in HIVE-22337. I have 
addressed this there in a way that can apply to all text-based SerDe.
{quote}
Thanks [~belugabehr], great work there! I see that scenarios like this are 
handled nicely in one common place there.
I also see that {{org.apache.hadoop.hive.serde2.text.DelimitedTextSerde}} is 
capable of handling all kinds of delimiters.

But if someone still wants to use {{MultiSerDe}}, we will need this patch.



> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps:
> Input file:
> {code}
> 1^,1^,^,0^,0^,0 
> 2^,1^,^,0^,1^,0 
> 3^,1^,^,0^,0^,0 
> 4^,1^,^,0^,1^,0
> {code}
> Queries:
> {code}
> CREATE TABLE  n2(colA int, colB tinyint, colC timestamp, colD smallint, colE 
> smallint) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="^,")STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' 
> OVERWRITE INTO TABLE n2;
>  select * from n2;
> // wrong last column results here.
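> +----------+----------+----------+----------+----------+
> | n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
> +----------+----------+----------+----------+----------+
> | 1        | 1        | NULL     | 0        | NULL     |
> | 2        | 1        | NULL     | 0        | NULL     |
> | 3        | 1        | NULL     | 0        | NULL     |
> | 4        | 1        | NULL     | 0        | NULL     |
> +----------+----------+----------+----------+----------+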
> {code}
> Cause:
> In multi-serde parsing, the total length calculation here: 
> https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
>  does not take extra fields into account.
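
The effect described in the report above can be sketched in isolation. The helper below is a hypothetical illustration, not the LazyStruct code: it splits a row on a multi-character delimiter and stops after the number of columns in the schema, so surplus fields in the file do not bleed into the last column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; shows one way to bound the last field, not Hive's implementation.
class MultiDelimSplitSketch {
    /** Splits row on delim, keeping at most maxFields fields; any surplus fields are dropped. */
    static List<String> split(String row, String delim, int maxFields) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        while (fields.size() < maxFields) {
            int end = row.indexOf(delim, start);
            if (end < 0) {
                fields.add(row.substring(start)); // last field in the row
                break;
            }
            // The field ends at the next delimiter, even when it is the last column we keep.
            fields.add(row.substring(start, end));
            start = end + delim.length();
        }
        return fields;
    }

    public static void main(String[] args) {
        // Six fields in the file, five columns in the table schema.
        System.out.println(split("1^,1^,^,0^,0^,0", "^,", 5)); // [1, 1, , 0, 0]
    }
}
```

If the last kept field instead ran to the end of the row, it would contain "0^,0", which cannot be parsed as a smallint; that would be consistent with the NULL values shown in the last column of the report.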





[jira] [Commented] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956651#comment-16956651
 ] 

Hive QA commented on HIVE-21303:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  1s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19098/dev-support/hive-personality.sh |
| git revision | master / 40cd40d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19098/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch, 
> HIVE-21303.2.patch
>
>
> Remove use of the deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}.
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled. Look it up once and cache the value instead.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-22343) Fix incorrect spelling of 'artifectId' in pom.xml

2019-10-21 Thread ice bai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ice bai updated HIVE-22343:
---
Attachment: (was: HIVE-22343.patch)

> Fix incorrect spelling of 'artifectId' in pom.xml
> -
>
> Key: HIVE-22343
> URL: https://issues.apache.org/jira/browse/HIVE-22343
> Project: Hive
>  Issue Type: Improvement
>Reporter: ice bai
>Assignee: ice bai
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-22343.patch
>
>
> There are some incorrect spellings of 'artifectId' in pom.xml or xxx/pom.xml.
> Such as:
> 
>  





[jira] [Commented] (HIVE-22378) Remove code duplicatoins from return path handling

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956639#comment-16956639
 ] 

Hive QA commented on HIVE-22378:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983620/HIVE-22378.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17545 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testNonAsciiStrings (batchId=284)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testNonAsciiStrings (batchId=287)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19097/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19097/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19097/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983620 - PreCommit-HIVE-Build

> Remove code duplicatoins from return path handling
> --
>
> Key: HIVE-22378
> URL: https://issues.apache.org/jira/browse/HIVE-22378
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22378.01.patch
>
>
> Return path handling has some code duplications; they should be removed.





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Labels: performance  (was: )

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.
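
The saving described above comes from replacing one round trip per partition with a single batched request. The sketch below shows the pattern with a hypothetical in-memory store that counts round trips; none of these names belong to the Hive metastore API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-item versus batched lookups.
class BatchLookupSketch {
    /** In-memory stand-in for a backing database that counts round trips. */
    static class PartitionStore {
        private final Map<String, String> data = new LinkedHashMap<>();
        int roundTrips = 0;

        void put(String name, String details) {
            data.put(name, details);
        }

        /** One round trip per partition. */
        String getPartition(String name) {
            roundTrips++;
            return data.get(name);
        }

        /** One round trip for the whole list. */
        Map<String, String> getPartitions(List<String> names) {
            roundTrips++;
            Map<String, String> out = new LinkedHashMap<>();
            for (String name : names) {
                if (data.containsKey(name)) {
                    out.put(name, data.get(name));
                }
            }
            return out;
        }
    }

    static Map<String, String> lookUpOneByOne(PartitionStore store, List<String> names) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String name : names) {
            String details = store.getPartition(name);
            if (details != null) {
                out.put(name, details);
            }
        }
        return out;
    }

    static Map<String, String> lookUpInBatch(PartitionStore store, List<String> names) {
        return store.getPartitions(names);
    }

    public static void main(String[] args) {
        PartitionStore store = new PartitionStore();
        List<String> names = Arrays.asList("ds=1", "ds=2", "ds=3");
        for (String name : names) {
            store.put(name, "details of " + name);
        }
        lookUpOneByOne(store, names);
        System.out.println(store.roundTrips); // 3
        store.roundTrips = 0;
        lookUpInBatch(store, names);
        System.out.println(store.roundTrips); // 1
    }
}
```

With many dynamic partitions, the per-item variant grows linearly in round trips while the batched variant stays constant, at the cost of a larger single response.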





[jira] [Commented] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956622#comment-16956622
 ] 

Rajesh Balamohan commented on HIVE-22379:
-

Observed a 10+ second difference in a local cluster setup with dynamic 
partition loading.

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Updated] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22381:

Attachment: HIVE-22381.1.patch
Status: Patch Available  (was: Open)

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22381.1.patch, errors.xml
>
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Updated] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22381:

Attachment: (was: HIVE-22381.1.patch)

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22381.1.patch, errors.xml
>
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Commented] (HIVE-22378) Remove code duplicatoins from return path handling

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956616#comment-16956616
 ] 

Hive QA commented on HIVE-22378:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  1s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19097/dev-support/hive-personality.sh |
| git revision | master / 40cd40d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19097/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove code duplicatoins from return path handling
> --
>
> Key: HIVE-22378
> URL: https://issues.apache.org/jira/browse/HIVE-22378
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22378.01.patch
>
>
> Return path handling has some code duplications; they should be removed.





[jira] [Updated] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22381:

Status: Open  (was: Patch Available)

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22381.1.patch, errors.xml
>
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Attachment: (was: HIVE-22379.1.patch)

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Status: Patch Available  (was: Open)

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Attachment: HIVE-22379.1.patch

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Attachment: HIVE-22379.1.patch

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22379.1.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Updated] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22381:

Attachment: HIVE-22381.1.patch
Status: Patch Available  (was: Open)

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22381.1.patch, errors.xml
>
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956595#comment-16956595
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983639/HIVE-22363.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17545 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19096/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19096/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19096/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983639 - PreCommit-HIVE-Build

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a FIL in this case) between the sink and the gby, the
> removal may not happen
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}
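
For illustration only, a toy Java model of the situation described above: a check that looks at just the first parent misses a GroupBy operator once a Filter sits in between, while a walk up the single-parent chain still finds it. OpSketch and ParentSearchSketch are hypothetical names; this is not the code in CorrelationUtilities and not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy operator node; a stand-in for illustration, not Hive's Operator class.
final class OpSketch {
    final String kind; // e.g. "RS", "FIL", "GBY"
    final List<OpSketch> parents = new ArrayList<>();

    OpSketch(String kind) {
        this.kind = kind;
    }

    OpSketch withParent(OpSketch p) {
        parents.add(p);
        return this;
    }
}

final class ParentSearchSketch {
    // Checks only the immediate first parent, so an operator in between hides the target.
    static OpSketch findDirectParent(OpSketch start, String kind) {
        if (start.parents.isEmpty()) {
            return null;
        }
        OpSketch p = start.parents.get(0);
        return p.kind.equals(kind) ? p : null;
    }

    // Walks up through single-parent operators until the target kind is found.
    static OpSketch findAncestor(OpSketch start, String kind) {
        OpSketch cur = start;
        while (cur.parents.size() == 1) {
            cur = cur.parents.get(0);
            if (cur.kind.equals(kind)) {
                return cur;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Chain: GBY -> FIL -> RS (each operator lists the one above it as its parent).
        OpSketch gby = new OpSketch("GBY");
        OpSketch fil = new OpSketch("FIL").withParent(gby);
        OpSketch rs = new OpSketch("RS").withParent(fil);

        System.out.println("direct parent finds GBY: " + (findDirectParent(rs, "GBY") != null));
        System.out.println("ancestor walk finds GBY: " + (findAncestor(rs, "GBY") != null));
    }
}
```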





[jira] [Work logged] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22363?focusedWorklogId=331751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331751
 ]

ASF GitHub Bot logged work on HIVE-22363:
-

Author: ASF GitHub Bot
Created on: 22/Oct/19 01:09
Start Date: 22/Oct/19 01:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on issue #819: HIVE-22363 
ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
URL: https://github.com/apache/hive/pull/819#issuecomment-544769886
 
 
   @kgyrtkirk, I left some comments. I still do not understand the problem we
were having with the Filter operator. Could you describe how the plan looked
before and after, and what we are trying to accomplish? Were we producing
incorrect results? Were we missing a possible optimization?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331751)
Time Spent: 0.5h  (was: 20m)

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a FIL in this case) between the sink and the gby, the
> removal may not happen
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Work logged] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22363?focusedWorklogId=331749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331749
 ]

ASF GitHub Bot logged work on HIVE-22363:
-

Author: ASF GitHub Bot
Created on: 22/Oct/19 01:07
Start Date: 22/Oct/19 01:07
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #819: HIVE-22363 
ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
URL: https://github.com/apache/hive/pull/819#discussion_r337300628
 
 

 ##
 File path: ql/src/test/results/clientpositive/llap/explainuser_1.q.out
 ##
 @@ -4990,38 +4990,44 @@ Vertex dependency in root stage
 Reducer 2 <- Map 1 (SIMPLE_EDGE)
 Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
 Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
+Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
 
 Review comment:
   The patch seems to cause regressions (additional shuffle phases), unless
the previous plan was incorrect?
 



Issue Time Tracking
---

Worklog Id: (was: 331749)
Time Spent: 20m  (was: 10m)

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a FIL in this case) between the sink and the gby, the
> removal may not happen
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Work logged] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22363?focusedWorklogId=331750&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331750
 ]

ASF GitHub Bot logged work on HIVE-22363:
-

Author: ASF GitHub Bot
Created on: 22/Oct/19 01:07
Start Date: 22/Oct/19 01:07
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #819: HIVE-22363 
ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
URL: https://github.com/apache/hive/pull/819#discussion_r337302372
 
 

 ##
 File path: ql/src/test/results/clientpositive/multi_insert_gby2.q.out
 ##
 @@ -37,9 +37,11 @@ POSTHOOK: Output: default@e2_n0
 STAGE DEPENDENCIES:
   Stage-2 is a root stage
   Stage-0 depends on stages: Stage-2
-  Stage-3 depends on stages: Stage-0
+  Stage-3 depends on stages: Stage-0, Stage-4, Stage-6
 
 Review comment:
   Additional stages?
 



Issue Time Tracking
---

Worklog Id: (was: 331750)
Time Spent: 20m  (was: 10m)

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a FIL in this case) between the sink and the gby, the
> removal may not happen
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Work logged] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22363?focusedWorklogId=331748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331748
 ]

ASF GitHub Bot logged work on HIVE-22363:
-

Author: ASF GitHub Bot
Created on: 22/Oct/19 01:07
Start Date: 22/Oct/19 01:07
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #819: HIVE-22363 
ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
URL: https://github.com/apache/hive/pull/819#discussion_r337300331
 
 

 ##
 File path: ql/src/test/results/clientpositive/autoColumnStats_7.q.out
 ##
 @@ -48,7 +48,7 @@ STAGE PLANS:
   Reduce Output Operator
 key expressions: _col0 (type: string), _col1 (type: string)
 sort order: ++
-Map-reduce partition columns: _col0 (type: string)
+Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
 
 Review comment:
   Why do we partition by two keys now? We are only grouping by one key so this 
may be incorrect?
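
For illustration only, a small Java sketch of why the partition columns matter here: if rows are routed by a hash of both columns, two rows that share the grouping key can land on different reducers. PartitionKeySketch and reducerFor are hypothetical; the hash is Java's list hash, not the function Hive actually uses.

```java
import java.util.Arrays;

final class PartitionKeySketch {
    // Picks a reducer from the hash of the given partition columns.
    // Illustrative only; Hive's actual hash function is not reproduced here.
    static int reducerFor(int numReducers, String... partitionCols) {
        return Math.floorMod(Arrays.asList(partitionCols).hashCode(), numReducers);
    }

    public static void main(String[] args) {
        int n = 2;
        // Two rows that belong to the same group when grouping by _col0 only.
        String[] row1 = {"a", "x"};
        String[] row2 = {"a", "y"};

        System.out.println("partition by _col0: "
            + reducerFor(n, row1[0]) + " vs " + reducerFor(n, row2[0]));
        System.out.println("partition by _col0, _col1: "
            + reducerFor(n, row1[0], row1[1]) + " vs " + reducerFor(n, row2[0], row2[1]));
    }
}
```

With these inputs the single-column routing sends both rows to the same reducer, while the two-column routing splits them, so a group-by on _col0 alone would see only part of the group on each reducer.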
 



Issue Time Tracking
---

Worklog Id: (was: 331748)
Time Spent: 20m  (was: 10m)

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a FIL in this case) between the sink and the gby, the
> removal may not happen
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Updated] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22381:

Attachment: errors.xml

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: errors.xml
>
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956575#comment-16956575
 ] 

Hive QA commented on HIVE-22363:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
17s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 
total (was 1545) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p in 
org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:[line 328] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19096/dev-support/hive-personality.sh
 |
| git revision | master / 40cd40d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19096/yetus/new-findbugs-ql.html
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19096/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.
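
For illustration only: FindBugs reports a "dead store" when a value assigned to a local variable is never read afterwards. The Java sketch below shows the pattern in general terms. DeadStoreSketch is a hypothetical class and is not the code at ReduceSinkDeDuplication.java line 328, which the report does not show.

```java
final class DeadStoreSketch {
    // Contains a dead store: the first value written to p is never read.
    static int sumWithDeadStore(int[] xs) {
        int p = xs.length; // dead store: overwritten below before any read
        p = 0;
        for (int x : xs) {
            p += x;
        }
        return p;
    }

    // Same behavior without the unused assignment.
    static int sum(int[] xs) {
        int p = 0;
        for (int x : xs) {
            p += x;
        }
        return p;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3};
        System.out.println(sumWithDeadStore(xs) + " " + sum(xs));
    }
}
```

Removing the unused assignment does not change the result, which is why such warnings are usually cleared by deleting the store or by using the value.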



> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch,
> HIVE-22363.03.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if there is some
> other operator (a 

[jira] [Commented] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956569#comment-16956569
 ] 

Ramesh Kumar Thangarajan commented on HIVE-22381:
-

FindBugs produces a huge XML file as output for jmh-tests, and hence it is not
able to convert the XML to text. I can confirm that the output file on my
machine shows no errors.
{code:java}

{code}

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Assigned] (HIVE-22382) Support Decimal64 column division with decimal64 Column

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22382:
---

Assignee: Ramesh Kumar Thangarajan

> Support Decimal64 column division with decimal64 Column
> ---
>
> Key: HIVE-22382
> URL: https://issues.apache.org/jira/browse/HIVE-22382
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
>
> Support Decimal64 column division with decimal64 Column





[jira] [Assigned] (HIVE-22381) JMH tests for Decimal64 and Decimal division on a column over scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22381:
---

Assignee: Ramesh Kumar Thangarajan

> JMH tests for Decimal64 and Decimal division on a column over scalar
> 
>
> Key: HIVE-22381
> URL: https://issues.apache.org/jira/browse/HIVE-22381
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
>
> JMH tests for Decimal64 and Decimal division on a column over scalar





[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Attachment: HIVE-22315.8.patch
Status: Patch Available  (was: Open)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch,
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.6.patch,
> HIVE-22315.7.patch, HIVE-22315.8.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This
> Jira adds support for dividing a Decimal64 column by a Decimal64 scalar.
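
For illustration only, a Java sketch of column-by-scalar division on decimals held as unscaled 64-bit longs, which is what "Decimal64" is assumed to mean here. Decimal64DivideSketch is a hypothetical class, not the vectorized expression added by the patch. It assumes the column, the scalar, and the result share one scale, and it truncates toward zero; Hive's actual scale and rounding rules may differ.

```java
final class Decimal64DivideSketch {
    // Divides each unscaled value by an unscaled scalar; inputs and result share `scale`.
    // Truncates toward zero; throws ArithmeticException on overflow or a zero divisor.
    static long[] divideColumnByScalar(long[] column, long scalar, int scale) {
        long factor = 1;
        for (int i = 0; i < scale; i++) {
            factor = Math.multiplyExact(factor, 10L);
        }
        long[] out = new long[column.length];
        for (int i = 0; i < column.length; i++) {
            // (a / 10^s) / (b / 10^s) = a / b, so multiply by 10^s to keep s fractional digits.
            out[i] = Math.multiplyExact(column[i], factor) / scalar;
        }
        return out;
    }

    public static void main(String[] args) {
        // Scale 2: 1000 means 10.00, 250 means 2.50, 400 means 4.00.
        long[] result = divideColumnByScalar(new long[] {1000L, 250L}, 400L, 2);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

At scale 2, 10.00 / 4.00 comes out as 250 (2.50) and 2.50 / 4.00 as 62 (0.62, truncated from 0.625).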





[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Attachment: (was: HIVE-22315.4.patch)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch,
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.6.patch,
> HIVE-22315.7.patch, HIVE-22315.8.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This
> Jira adds support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Status: Open  (was: Patch Available)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch,
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.4.patch,
> HIVE-22315.6.patch, HIVE-22315.7.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This
> Jira adds support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Commented] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956553#comment-16956553
 ] 

Hive QA commented on HIVE-22217:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983628/HIVE-22217.2.branch-3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 149 failed/errored test(s), 14428 tests 
executed
*Failed tests:*
{noformat}
TestAddPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestAddPartitionsFromPartSpec - did not produce a TEST-*.xml file (likely timed 
out) (batchId=230)
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestAggregateStatsCache - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestAlterPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestAppendPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=276)
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogNonDefaultClient - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestCatalogNonDefaultSvr - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogOldClient - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestCatalogs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestChainFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestCheckConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestCloseableThreadLocal - did not produce a TEST-*.xml file (likely timed out) 
(batchId=335)
TestCustomQueryFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=239)
TestDatabases - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestDefaultConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDropPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=276)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestExchangePartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestFMSketchSerialization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestForeignKey - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestFunctions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestGetPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestGetPartitionsUsingProjectionAndFilterSpecs - did not produce a TEST-*.xml 
file (likely timed out) (batchId=230)
TestGetTableMeta - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestGroupFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestHLLNoBias - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHLLSerialization - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHdfsUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHiveAlterHandler - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=239)
TestHiveMetaStorePartitionSpecs - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestHiveMetaStoreSchemaMethods - did not produce a TEST-*.xml file (likely 
timed out) (batchId=237)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=234)
TestHiveMetaToolCommandLine - did not produce a TEST-*.xml file (likely timed 
out) (batchId=232)
TestHiveMetastoreCli - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestHmsServerAuthorization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=237)
TestHyperLogLog - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)
TestHyperLogLogDense - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHyperLogLogMerge - did not produce a 

[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Status: Open  (was: Patch Available)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch, 
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.4.patch, 
> HIVE-22315.6.patch, HIVE-22315.7.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. 
> This Jira adds support for dividing a decimal64 column by a decimal64 
> scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Attachment: HIVE-22315.4.patch
Status: Patch Available  (was: Open)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch, 
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.4.patch, 
> HIVE-22315.6.patch, HIVE-22315.7.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. 
> This Jira adds support for dividing a decimal64 column by a decimal64 
> scalar.





[jira] [Updated] (HIVE-22255) Hive don't trigger Major Compaction automatically if table contains only base files

2019-10-21 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-22255:
--
Attachment: HIVE-22255.01.patch
Status: Patch Available  (was: Open)

> Hive don't trigger Major Compaction automatically if table contains only base 
> files 
> 
>
> Key: HIVE-22255
> URL: https://issues.apache.org/jira/browse/HIVE-22255
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 3.1.2
> Environment: Hive-3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-22255.01.patch, HIVE-22255.patch
>
>
> A user may run into this issue if the table consists of only base files and 
> no deltas: the following condition then yields false and automatic major 
> compaction is skipped.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
>  
> Steps to Reproduce:
>  # Create an ACID table 
> {code:sql}
> create table myacid(id int);
> {code}
>  # Run multiple insert overwrite statements 
> {code:sql}
> insert overwrite table myacid values(1);
> insert overwrite table myacid values(2),(3),(4);
> {code}
>  # DFS ls output
> {code:java}
> // dfs -ls -R /warehouse/tablespace/managed/hive/myacid;
> ++
> |                     DFS Output                     |
> ++
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        610 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/bucket_0 |
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        633 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/bucket_0 |
> ++{code}
>  
> You will see that major compaction is not triggered until you run alter 
> table compact MAJOR.
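
To make the scenario concrete, the sketch below classifies directory names like those in the DFS listing above and flags the layout this issue describes: more than one base directory and no delta directory. The class and method names are hypothetical, and this is not the logic in Initiator.java; it only illustrates why a check keyed solely on deltas would never fire for such a table.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Hive. */
final class BaseOnlyLayout {
    /** True when the table directory holds several base_* dirs and no delta dirs. */
    static boolean hasOnlyMultipleBases(List<String> dirNames) {
        long bases = dirNames.stream()
                .filter(n -> n.startsWith("base_"))
                .count();
        long deltas = dirNames.stream()
                .filter(n -> n.startsWith("delta_") || n.startsWith("delete_delta_"))
                .count();
        // Older bases are superseded by the newest one, yet there is no delta to react to.
        return bases > 1 && deltas == 0;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("base_001", "base_002");
        System.out.println(hasOnlyMultipleBases(dirs));
    }
}
```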





[jira] [Updated] (HIVE-22255) Hive don't trigger Major Compaction automatically if table contains only base files

2019-10-21 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-22255:
--
Status: Open  (was: Patch Available)

> Hive don't trigger Major Compaction automatically if table contains only base 
> files 
> 
>
> Key: HIVE-22255
> URL: https://issues.apache.org/jira/browse/HIVE-22255
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 3.1.2
> Environment: Hive-3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-22255.patch
>
>
> A user may run into this issue if the table consists of only base files and 
> no deltas: the following condition then yields false and automatic major 
> compaction is skipped.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
>  
> Steps to Reproduce:
>  # Create an ACID table 
> {code:sql}
> create table myacid(id int);
> {code}
>  # Run multiple insert overwrite statements 
> {code:sql}
> insert overwrite table myacid values(1);
> insert overwrite table myacid values(2),(3),(4);
> {code}
>  # DFS ls output
> {code:java}
> // dfs -ls -R /warehouse/tablespace/managed/hive/myacid;
> ++
> |                     DFS Output                     |
> ++
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        610 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/bucket_0 |
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        633 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/bucket_0 |
> ++{code}
>  
> You will see that major compaction is not triggered until you run alter 
> table compact MAJOR.





[jira] [Commented] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956516#comment-16956516
 ] 

Hive QA commented on HIVE-22217:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 14s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-19095/patches/PreCommit-HIVE-Build-19095.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19095/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Commented] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956512#comment-16956512
 ] 

Hive QA commented on HIVE-22330:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983622/HIVE-22330.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17545 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestPartitionManagement.testPartitionDiscoveryTransactionalTable
 (batchId=223)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19094/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19094/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19094/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983622 - PreCommit-HIVE-Build

> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22330.01.patch, HIVE-22330.02.patch, 
> HIVE-22330.03.patch
>
>
> When BytesColumnVector is populated with values, it either creates a new 
> (byte[]) buffer object to hold the values or, if the values array is 
> <=1MB, reuses a single "smallBuffer" instead of creating a new buffer. 
> Every time the smallBuffer is too small for the data we want to store there, 
> its size is doubled; once the size is larger than 1 GB (or 
> Integer.MAX_VALUE / 2), the next attempt to double it overflows and an 
> error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.
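
The overflow described above, and the clamping idea behind the quick fix, can be sketched as follows. This is a minimal illustration under assumptions: `SmallBufferGrowth` and `nextCapacity` are hypothetical names, not the BytesColumnVector code or the attached patch.

```java
/** Hypothetical helper for illustration only; not Hive's BytesColumnVector. */
final class SmallBufferGrowth {
    /** Doubles the capacity until it fits `needed`, clamping at Integer.MAX_VALUE. */
    static int nextCapacity(int current, int needed) {
        if (needed < 0) {
            throw new IllegalArgumentException("negative size");
        }
        long capacity = Math.max(current, 1);  // long arithmetic, so doubling cannot wrap
        while (capacity < needed) {
            capacity *= 2;
        }
        return (int) Math.min(capacity, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int overOneGiB = (1 << 30) + 1;
        // Naive int doubling wraps around to a negative value.
        System.out.println(overOneGiB * 2);
        // The clamped version stays at a valid non-negative size.
        System.out.println(nextCapacity(overOneGiB, overOneGiB + 1));
    }
}
```

Doing the doubling in `long` and clamping once at the end keeps the growth policy unchanged below 1 GB and only alters the behaviour at the boundary.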





[jira] [Updated] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21303:
--
Status: Patch Available  (was: Open)

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch, 
> HIVE-21303.2.patch
>
>
> Remove use of Deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled.  Just look it up once and cache the value.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21303:
--
Attachment: HIVE-21303.2.patch

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch, 
> HIVE-21303.2.patch
>
>
> Remove use of Deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled.  Just look it up once and cache the value.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21303:
--
Status: Open  (was: Patch Available)

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch
>
>
> Remove use of Deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled.  Just look it up once and cache the value.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21303:
--
Status: Patch Available  (was: Open)

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch
>
>
> Remove use of Deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled.  Just look it up once and cache the value.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-21303) Update TextRecordReader

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21303:
--
Status: Open  (was: Patch Available)

> Update TextRecordReader
> ---
>
> Key: HIVE-21303
> URL: https://issues.apache.org/jira/browse/HIVE-21303
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-21303.1.patch, HIVE-21303.2.patch
>
>
> Remove use of Deprecated 
> {{org.apache.hadoop.mapred.LineRecordReader.LineReader}}
> For every call to {{next}}, the code dives into the configuration map to see 
> if this feature is enabled.  Just look it up once and cache the value.
> {code:java}
> public int next(Writable row) throws IOException {
> ...
> if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVESCRIPTESCAPE)) {
>   return HiveUtils.unescapeText((Text) row);
> }
> return bytesConsumed;
> }
> {code}
> Other clean up.





[jira] [Updated] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21426:
--
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Patch committed.  Thanks [~pvary]!

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html
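
As a minimal sketch of the replacement described above, the following obtains random values through `ThreadLocalRandom.current()` instead of a shared static `Random`. `RandomIds` and `nextId` are hypothetical names, not the code in Hive's Utilities class.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical utility for illustration only; not Hive's Utilities. */
final class RandomIds {
    /** Returns a value in [0, bound) from the calling thread's own generator. */
    static int nextId(int bound) {
        // Call current() on the thread that uses the value; do not cache it in a static field.
        return ThreadLocalRandom.current().nextInt(bound);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () ->
                System.out.println(Thread.currentThread().getName() + " " + nextId(1000));
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```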





[jira] [Commented] (HIVE-22111) Materialized view based on replicated table might not get refreshed

2019-10-21 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956488#comment-16956488
 ] 

Jesus Camacho Rodriguez commented on HIVE-22111:


I have just seen this issue again. If we are replicating the creation metadata 
for the materialized views, then we cannot just set the flag to 'N'; indeed, as 
a workaround, we could set it to 'Y' even if we fall back to a full rebuild the 
first time we rebuild the materialized view in the cluster with the replica.
AFAIK, materialized view replication has other problems right now, see 
HIVE-18621 and HIVE-20543.

> Materialized view based on replicated table might not get refreshed
> ---
>
> Key: HIVE-22111
> URL: https://issues.apache.org/jira/browse/HIVE-22111
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, repl
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
>
> Consider the following scenario:
> * create a base table which we replicate
> * create a materialized view in the target hive based on the base table
> * modify (delete/update) the base table in the source hive
> * replicate the changes (delete/update) to the target hive
> * query the materialized view in the target hive
>  
> We do not refresh the data, since when the transaction is created by 
> replication we set ctc_update_delete to 'N'.





[jira] [Commented] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956474#comment-16956474
 ] 

Hive QA commented on HIVE-22330:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
23s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19094/dev-support/hive-personality.sh
 |
| git revision | master / 72094da |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: storage-api U: storage-api |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19094/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22330.01.patch, HIVE-22330.02.patch, 
> HIVE-22330.03.patch
>
>
> When BytesColumnVector is populated with values, it either creates a new 
> (byte[]) buffer object to hold the values or, if the values array is 
> <=1MB, reuses a single "smallBuffer" instead of creating a new buffer. 
> Every time the smallBuffer is too small for the data we want to store there, 
> its size is doubled; once the size is larger than 1 GB (or 
> Integer.MAX_VALUE / 2), the next attempt to double it overflows and an 
> error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.





[jira] [Updated] (HIVE-22378) Remove code duplicatoins from return path handling

2019-10-21 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22378:
--
Status: Patch Available  (was: Open)

> Remove code duplicatoins from return path handling
> --
>
> Key: HIVE-22378
> URL: https://issues.apache.org/jira/browse/HIVE-22378
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22378.01.patch
>
>
> Return path handling has some code duplication that should be removed.





[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956461#comment-16956461
 ] 

Hive QA commented on HIVE-22238:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19093/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19093/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19093/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983619 - PreCommit-HIVE-Build

> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, 
> HIVE-22238.03.patch
>
>
> At [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's row count is scaled according to pkfkselectivity; 
> however, [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount counts in an estimation that was already used when 
> parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimation.
> What happened was:
> * the optimization was able to push the filter to the other side of the join
> * as a result, the incoming data was already filtered
> * the scaling down by the PK selectivity had already been applied, but a new 
> "scaling" happened on top of it
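
A small worked example may help show how applying the same selectivity twice shrinks the estimate. The numbers and the `scale` helper below are purely illustrative assumptions; they are not taken from StatsRulesProcFactory or from the failing query.

```java
/** Illustrative arithmetic only; names and numbers are hypothetical. */
final class DoubleScalingExample {
    /** Applies a selectivity factor to a row-count estimate, never going below one row. */
    static long scale(long rows, double selectivity) {
        return Math.max(1L, Math.round(rows * selectivity));
    }

    public static void main(String[] args) {
        long fkRows = 1_000_000L;
        double pkSelectivity = 0.01;

        // The pushed-down filter already reduced the incoming rows once.
        long parentRows = scale(fkRows, pkSelectivity);
        // Scaling the already-filtered estimate again compounds the factor.
        long scaledAgain = scale(parentRows, pkSelectivity);

        System.out.println(parentRows);
        System.out.println(scaledAgain);
    }
}
```

With one upstream join the estimate is off by one factor of the selectivity; each further join that repeats the scaling multiplies the error again.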





[jira] [Commented] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956458#comment-16956458
 ] 

Hive QA commented on HIVE-21426:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983625/HIVE-21426.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17545 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19092/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19092/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19092/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983625 - PreCommit-HIVE-Build

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html





[jira] [Commented] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956432#comment-16956432
 ] 

Hive QA commented on HIVE-21426:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
4s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} ql: The patch generated 0 new + 135 unchanged - 1 
fixed = 135 total (was 136) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
27s{color} | {color:green} ql generated 0 new + 1545 unchanged - 2 fixed = 1545 
total (was 1547) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19092/dev-support/hive-personality.sh
 |
| git revision | master / 72094da |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19092/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html
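
For readers unfamiliar with the change described in the issue, the following is a minimal sketch of the two patterns. It is not the code from the attached patches; the class and method names (`RandomUsageSketch`, `nextShared`, `nextThreadLocal`) are hypothetical and only `java.util.Random` and `java.util.concurrent.ThreadLocalRandom` are real JDK classes.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

class RandomUsageSketch {
    // Pattern being removed: one Random instance shared by every thread.
    // Concurrent callers all update the same internal seed.
    static final Random SHARED = new Random();

    static int nextShared(int bound) {
        return SHARED.nextInt(bound);
    }

    // Replacement pattern: ask for the calling thread's generator at the point of use.
    // The result of current() should not be cached in a static field.
    static int nextThreadLocal(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 5; i++) {
                int v = nextThreadLocal(100);
                if (v < 0 || v >= 100) {
                    throw new IllegalStateException("value out of range: " + v);
                }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("done");
    }
}
```

Calling `ThreadLocalRandom.current()` at each use site keeps the generator tied to the calling thread, which is the property the quoted Javadoc passage relies on.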





[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956422#comment-16956422
 ] 

Jesus Camacho Rodriguez commented on HIVE-22238:


[~kgyrtkirk], `getSelectivitySimpleTree` looks for the TS that is below that 
operator. Does it find it or do we go into logic for multiple operators? If it 
does, maybe we should skip the predicates that have already been accounted for 
on PK side (filter conditions on join keys) from the estimate. Does that make 
sense? Skipping any reduction performed by a join seems too radical (for 
instance, if we filter by year but joined by any other key, we will not predict 
any reduction due to join).

> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, 
> HIVE-22238.03.patch
>
>
> at [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies selectivity that was already accounted for 
> when parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimation.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was already reflected there, but a new 
> "scaling" happened on top of it
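
The double counting described in the issue can be illustrated with a small arithmetic sketch. The row count and selectivity below are made-up numbers, and `PkFkScalingSketch` and `scale` are hypothetical names; this is not the logic in `StatsRulesProcFactory`.

```java
class PkFkScalingSketch {
    // Row count after applying a selectivity in [0, 1].
    static long scale(long rows, double selectivity) {
        return Math.round(rows * selectivity);
    }

    public static void main(String[] args) {
        long fkRows = 1_000_000L;    // hypothetical FK-side row count
        double pkSelectivity = 0.1;  // hypothetical selectivity of the PK-side filter

        // The filter was pushed across the join, so the parent statistics
        // already reflect it.
        long parentRows = scale(fkRows, pkSelectivity);

        // Scaling the already-filtered parent by the same selectivity
        // counts the filter twice.
        long doubleScaled = scale(parentRows, pkSelectivity);

        System.out.println("parent estimate:        " + parentRows);   // 100000
        System.out.println("after a second scaling: " + doubleScaled); // 10000
    }
}
```

With these numbers the second scaling gives an estimate ten times too low, and each further upstream join that repeats the scaling would compound the error.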





[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956408#comment-16956408
 ] 

Hive QA commented on HIVE-22238:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17545 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=16)
org.apache.hadoop.hive.metastore.security.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
 (batchId=292)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19091/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19091/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19091/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983619 - PreCommit-HIVE-Build

> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, 
> HIVE-22238.03.patch
>
>
> at [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies selectivity that was already accounted for 
> when parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimation.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was already reflected there, but a new 
> "scaling" happened on top of it





[jira] [Updated] (HIVE-22240) Function percentile_cont fails when array parameter passed

2019-10-21 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22240:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks [~kkasa]!

> Function percentile_cont fails when array parameter passed
> --
>
> Key: HIVE-22240
> URL: https://issues.apache.org/jira/browse/HIVE-22240
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22240.1.patch, HIVE-22240.2.patch, 
> HIVE-22240.3.patch, HIVE-22240.3.patch, HIVE-22240.4.patch, 
> HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch
>
>
> {code}
> SELECT
> percentile_cont(array(0.2, 0.5, 0.9)) WITHIN GROUP (ORDER BY value)
> FROM t_test;
> {code}
> hive.log:
> {code}
> 2019-09-24T21:00:43,203 ERROR [LocalJobRunner Map Task Executor #0] 
> mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
> org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:793)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
>   ... 11 more
> Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast 
> to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$PercentileContEvaluator.iterate(GenericUDAFPercentileCont.java:259)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:214)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:639)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:814)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:720)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:788)
>   ... 17 more
> {code}
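
The stack trace shows an `ArrayList` being cast unconditionally to a single decimal value. As a rough illustration of the kind of type check that avoids this class of failure, here is a self-contained sketch. It uses plain Java types rather than Hive's writables and object inspectors, it is not the committed fix, and `PercentileParamSketch` and `toFractions` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PercentileParamSketch {
    // Accept either a single number or a list of numbers and
    // always return a list of percentile fractions.
    static List<Double> toFractions(Object param) {
        List<Double> out = new ArrayList<>();
        if (param instanceof List) {
            for (Object o : (List<?>) param) {
                out.add(((Number) o).doubleValue());
            }
        } else if (param instanceof Number) {
            out.add(((Number) param).doubleValue());
        } else {
            throw new IllegalArgumentException("unsupported percentile parameter: " + param);
        }
        return out;
    }

    public static void main(String[] args) {
        // Scalar form, e.g. percentile_cont(0.5)
        System.out.println(toFractions(0.5));
        // Array form, e.g. percentile_cont(array(0.2, 0.5, 0.9))
        System.out.println(toFractions(Arrays.asList(0.2, 0.5, 0.9)));
    }
}
```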





[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956387#comment-16956387
 ] 

Hive QA commented on HIVE-22238:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19091/dev-support/hive-personality.sh
 |
| git revision | master / c9850b4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19091/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, 
> HIVE-22238.03.patch
>
>
> at [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies selectivity that was already accounted for 
> when parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimation.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was already reflected there, but a new 
> "scaling" happened on top of it





[jira] [Commented] (HIVE-22367) Transaction type not retrieved from OpenTxnRequest

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956371#comment-16956371
 ] 

Hive QA commented on HIVE-22367:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983611/HIVE-22367.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17546 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthUDFBlacklist.testBlackListedUdfUsage
 (batchId=287)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19090/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19090/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19090/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983611 - PreCommit-HIVE-Build

> Transaction type not retrieved from OpenTxnRequest
> --
>
> Key: HIVE-22367
> URL: https://issues.apache.org/jira/browse/HIVE-22367
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22367.1.patch
>
>
> When opening a transaction, its type should be extracted from OpenTxnRequest 
> object. Currently it's hardcoded with TxnType.DEFAULT.
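
The difference between the hardcoded behaviour and the requested one can be sketched as follows. The `TxnType` enum and `OpenTxnRequest` class here are simplified stand-ins written for this example, not Hive's generated Thrift classes, and falling back to `DEFAULT` when the request carries no type is an assumption.

```java
class TxnTypeSketch {
    // Stand-in for the transaction type enum; not Hive's real class.
    enum TxnType { DEFAULT, READ_ONLY }

    // Stand-in for the request object; the type is optional, so it may be null.
    static final class OpenTxnRequest {
        private final TxnType txnType;

        OpenTxnRequest(TxnType txnType) {
            this.txnType = txnType;
        }

        TxnType getTxnType() {
            return txnType;
        }
    }

    // Behaviour described as the bug: the type on the request is ignored.
    static TxnType hardcoded(OpenTxnRequest rqst) {
        return TxnType.DEFAULT;
    }

    // Behaviour the issue asks for: use the request's type,
    // falling back to DEFAULT when none is set (assumed fallback).
    static TxnType fromRequest(OpenTxnRequest rqst) {
        TxnType t = rqst.getTxnType();
        return t != null ? t : TxnType.DEFAULT;
    }

    public static void main(String[] args) {
        OpenTxnRequest readOnly = new OpenTxnRequest(TxnType.READ_ONLY);
        System.out.println(hardcoded(readOnly) + " vs " + fromRequest(readOnly));
    }
}
```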





[jira] [Commented] (HIVE-18922) Hive is not cleaning up staging directories

2019-10-21 Thread Anant Mittal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956357#comment-16956357
 ] 

Anant Mittal commented on HIVE-18922:
-

The issue is still seen on newer Hive versions: EMR 5.26 and MapR 6.1 both 
show it.

> Hive is not cleaning up staging directories
> 
>
> Key: HIVE-18922
> URL: https://issues.apache.org/jira/browse/HIVE-18922
> Project: Hive
>  Issue Type: Bug
>Reporter: Anant Mittal
>Priority: Major
>
> Hive is creating hdfs folders with format 
> /.hive-staging_hive__-xx/-ext-x
> These are not cleaned up even after a long time. The folder is used to load 
> data into the table. Example:
> Loading data to table default.tablename from 
> hdfs://clustermachine/apps/hive/warehouse/tablename/.hive-staging_hive_2018-01-31_11-45-14_005_1129336997995057804-51/-ext-1
>  
> This might be covered to some extent by HIVE-11940, but I want to make sure 
> all cases are addressed.
> Update: It seems HIVE-11940 did not cover this, as the issue is seen in later 
> versions too.





[jira] [Updated] (HIVE-18922) Hive is not cleaning up staging directories

2019-10-21 Thread Anant Mittal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anant Mittal updated HIVE-18922:

Description: 
Hive is creating hdfs folders with format 
/.hive-staging_hive__-xx/-ext-x

These are not cleaned up even after a long time. The folder is used to load 
data into the table. Example:

Loading data to table default.tablename from 
hdfs://clustermachine/apps/hive/warehouse/tablename/.hive-staging_hive_2018-01-31_11-45-14_005_1129336997995057804-51/-ext-1

 

This might be covered to some extent by HIVE-11940, but I want to make sure 
all cases are addressed.

Update: It seems HIVE-11940 did not cover this, as the issue is seen in later 
versions too.

  was:
Hive is creating hdfs folders with format 
/.hive-staging_hive__-xx/-ext-x

These are not being cleaned up even after long duration. The folder is used to 
load to the table. Example:

Loading data to table default.tablename from 
hdfs://clustermachine/apps/hive/warehouse/tablename/.hive-staging_hive_2018-01-31_11-45-14_005_1129336997995057804-51/-ext-1

 

This might be covered to some extent by HIVE-11940 but, want to make sure all 
cases are addressed.


> Hive is not cleaning up staging directories
> 
>
> Key: HIVE-18922
> URL: https://issues.apache.org/jira/browse/HIVE-18922
> Project: Hive
>  Issue Type: Bug
>Reporter: Anant Mittal
>Priority: Major
>
> Hive is creating hdfs folders with format 
> /.hive-staging_hive__-xx/-ext-x
> These are not cleaned up even after a long time. The folder is used to load 
> data into the table. Example:
> Loading data to table default.tablename from 
> hdfs://clustermachine/apps/hive/warehouse/tablename/.hive-staging_hive_2018-01-31_11-45-14_005_1129336997995057804-51/-ext-1
>  
> This might be covered to some extent by HIVE-11940, but I want to make sure 
> all cases are addressed.
> Update: It seems HIVE-11940 did not cover this, as the issue is seen in later 
> versions too.





[jira] [Commented] (HIVE-22367) Transaction type not retrieved from OpenTxnRequest

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956313#comment-16956313
 ] 

Hive QA commented on HIVE-22367:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
20s{color} | {color:blue} standalone-metastore/metastore-server in master has 
171 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19090/dev-support/hive-personality.sh
 |
| git revision | master / c9850b4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19090/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Transaction type not retrieved from OpenTxnRequest
> --
>
> Key: HIVE-22367
> URL: https://issues.apache.org/jira/browse/HIVE-22367
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22367.1.patch
>
>
> When opening a transaction, its type should be extracted from OpenTxnRequest 
> object. Currently it's hardcoded with TxnType.DEFAULT.





[jira] [Commented] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956294#comment-16956294
 ] 

Hive QA commented on HIVE-22360:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983599/HIVE-22360.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17546 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19089/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19089/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19089/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983599 - PreCommit-HIVE-Build

> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps:
> Input file:
> {code}
> 1^,1^,^,0^,0^,0 
> 2^,1^,^,0^,1^,0 
> 3^,1^,^,0^,0^,0 
> 4^,1^,^,0^,1^,0
> {code}
> Queries:
> {code}
> CREATE TABLE n2(colA int, colB tinyint, colC timestamp, colD smallint, colE 
> smallint) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="^,") STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' 
> OVERWRITE INTO TABLE n2;
> select * from n2;
> -- wrong last column results here.
> +----------+----------+----------+----------+----------+
> | n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
> +----------+----------+----------+----------+----------+
> | 1        | 1        | NULL     | 0        | NULL     |
> | 2        | 1        | NULL     | 0        | NULL     |
> | 3        | 1        | NULL     | 0        | NULL     |
> | 4        | 1        | NULL     | 0        | NULL     |
> +----------+----------+----------+----------+----------+
> {code}
> Cause:
> In multi-serde parsing, the total length calculation here: 
> https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
>  does not take extra fields into account.
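
To make the repro concrete, here is a minimal sketch of splitting a row on a multi-character delimiter while ignoring fields beyond the table schema. It is a simplified illustration in plain Java, not the `LazyStruct` code or the attached patch, and `MultiDelimitSplitSketch` and `splitFields` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class MultiDelimitSplitSketch {
    // Split a row on a multi-character delimiter, keeping at most
    // schemaColumns fields. Fields beyond the schema are ignored instead
    // of being folded into the last column.
    static List<String> splitFields(String row, String delim, int schemaColumns) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        while (fields.size() < schemaColumns) {
            int end = row.indexOf(delim, start);
            if (end < 0) {
                // Last field in the row: take everything that remains.
                fields.add(row.substring(start));
                return fields;
            }
            fields.add(row.substring(start, end));
            start = end + delim.length();
        }
        return fields;
    }

    public static void main(String[] args) {
        // First row of the input file from the repro: six fields, five-column schema.
        System.out.println(splitFields("1^,1^,^,0^,0^,0", "^,", 5));
    }
}
```

In this sketch the fifth field of the first input row is `0` and the sixth is dropped. A row with fewer fields than the schema simply yields a shorter list, which a caller could treat as NULL for the missing columns.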





[jira] [Commented] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956290#comment-16956290
 ] 

Hive QA commented on HIVE-22360:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
53s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} serde in master has 199 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
16s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19089/dev-support/hive-personality.sh
 |
| git revision | master / c9850b4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: serde ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19089/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps:
> Input file:
> {code}
> 1^,1^,^,0^,0^,0 
> 2^,1^,^,0^,1^,0 
> 3^,1^,^,0^,0^,0 
> 4^,1^,^,0^,1^,0
> {code}
> Queries:
> {code}
> CREATE TABLE  n2(colA int, colB tinyint, colC timestamp, colD smallint, colE 
> smallint) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="^,")STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' 
> OVERWRITE INTO TABLE n2;
>  select * from n2;
> // wrong last column results here.
> 

[jira] [Updated] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22363:

Attachment: HIVE-22363.03.patch

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.1.2
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch, 
> HIVE-22363.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if some other 
> operator (a FIL in this case) sits between the sink and the gby, the 
> removal may not happen 
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956205#comment-16956205
 ] 

Hive QA commented on HIVE-19653:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12927286/HIVE-19653.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 17547 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_grouping_sets_pushdown1]
 (batchId=185)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets4]
 (batchId=170)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_grouping_sets_pushdown1]
 (batchId=151)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19088/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19088/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19088/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12927286 - PreCommit-HIVE-Build

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('', '', 123456);
> SELECT * FROM (
> SELECT a, b, sum(s)
> FROM T1
> GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query will 
> output:
> {code:java}
> NULL  NULL  123456
> NULL        123456
>       NULL  123456
>             123456
> {code}
> We can see the predicate "a IS NOT NULL" has no effect, which is incorrect.
> When performing PPD optimization for a GBY operator, we should make sure that 
> every grouping set contains the expression being processed before pushing it 
> down. Otherwise, the value of the expression changes after the GBY and the 
> result is wrong.
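
The condition described above can be sketched in Java as follows. This is only an illustration of the check, not the code in the attached patch: the class name GroupingSetPushdownSketch, the method canPushDown, and the representation of grouping sets as sets of column names are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hive: models grouping sets as sets of column names.
final class GroupingSetPushdownSketch {

    /**
     * Returns true only if every grouping set contains every column that the
     * predicate reads. If any grouping set omits one of those columns, the GBY
     * emits NULL for it in that set's rows, so filtering before the GBY would
     * not be equivalent to filtering after it.
     */
    static boolean canPushDown(Set<String> predicateColumns, List<Set<String>> groupingSets) {
        for (Set<String> groupingSet : groupingSets) {
            if (!groupingSet.containsAll(predicateColumns)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // GROUPING SETS ((), (a), (b), (a, b)) from the query in the description.
        List<Set<String>> sets = Arrays.asList(
            Collections.<String>emptySet(),
            new HashSet<>(Arrays.asList("a")),
            new HashSet<>(Arrays.asList("b")),
            new HashSet<>(Arrays.asList("a", "b")));
        System.out.println(canPushDown(Collections.singleton("a"), sets));
    }
}
```

For the query in the description, canPushDown returns false for a predicate on a, because the grouping sets () and (b) do not contain a.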





[jira] [Updated] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-10-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22379:

Description: {{HiveAlterHandler::alterPartitions}} could look up all 
partition details via a single call instead of multiple lookups.  (was: 
{\{HiveAlterHandler::alterPartitions}} could lookup all partition details via 
single \{{getPartition}} call instead of multiple calls.)

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.





[jira] [Commented] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956171#comment-16956171
 ] 

Hive QA commented on HIVE-19653:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
57s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19088/dev-support/hive-personality.sh
 |
| git revision | master / c9850b4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19088/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('', '', 123456);
> SELECT * FROM (
> SELECT a, b, sum(s)
> FROM T1
> GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query will 
> output:
> {code:java}
> NULL  NULL  123456
> NULL        123456
>       NULL  123456
>             123456
> {code}
> We can see the predicate "a IS NOT NULL" has no effect, which is incorrect.
> When performing PPD optimization for a GBY operator, we should make sure that 
> every grouping set contains the expression being processed before pushing it 
> down. Otherwise, the value of the expression changes after the GBY and the 
> result is wrong.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956146#comment-16956146
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983588/HIVE-22363.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 17545 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby2] 
(batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=194)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby10] 
(batchId=141)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby11] 
(batchId=146)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2] 
(batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8] 
(batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies]
 (batchId=138)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983588 - PreCommit-HIVE-Build

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.1.2
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if some other 
> operator (a FIL in this case) sits between the sink and the gby, the 
> removal may not happen 
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Status: Patch Available  (was: Open)

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.3.6, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Attachment: HIVE-22217.2.branch-3.patch

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Attachment: (was: HIVE-22217.2.branch-3.patch)

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Status: Open  (was: Patch Available)

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.3.6, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Attachment: HIVE-22217.2.branch-3.patch

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.01.branch-3.patch, HIVE-22217.1.patch, 
> HIVE-22217.2.branch-3.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.





[jira] [Commented] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-21 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956118#comment-16956118
 ] 

Peter Vary commented on HIVE-22330:
---

+1 pending tests

> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22330.01.patch, HIVE-22330.02.patch, 
> HIVE-22330.03.patch
>
>
> When BytesColumnVector is populated with values, it either creates a new 
> (byte[]) buffer object to hold the values or, if the values array is 
> <=1MB, reuses a single "smallBuffer" instead of creating a new buffer. 
> Every time the smallBuffer is too small for the data we want to store 
> there, its size is doubled; once the size is larger than 1 GB 
> (or Integer.MAX_VALUE / 2), the next attempt to double it overflows and 
> an error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.
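
The overflow and the proposed cap can be illustrated with a small Java sketch. It is not the BytesColumnVector code and not the attached patch: the class BufferGrowthSketch and the method nextCapacity are hypothetical names, and the sketch only computes sizes without allocating any buffer.

```java
// Hypothetical helper, not part of Hive: shows doubling with a cap instead of int overflow.
final class BufferGrowthSketch {

    /**
     * Doubles the current capacity until it covers the required size.
     * The arithmetic is done in long so that doubling a value above
     * Integer.MAX_VALUE / 2 cannot wrap around; the result is then
     * capped at Integer.MAX_VALUE, as the description suggests.
     */
    static int nextCapacity(int current, int required) {
        if (current <= 0 || required < 0) {
            throw new IllegalArgumentException("current must be positive and required non-negative");
        }
        long candidate = current;
        while (candidate < required) {
            candidate *= 2;
        }
        return (int) Math.min(candidate, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30;
        // Naive int doubling wraps around to a negative number.
        System.out.println(oneGiB * 2);
        // The capped version stays at the largest representable int.
        System.out.println(nextCapacity(oneGiB, oneGiB + 1));
    }
}
```

With a current size of 1 GB, the naive doubling yields Integer.MIN_VALUE, while nextCapacity returns Integer.MAX_VALUE.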





[jira] [Commented] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956108#comment-16956108
 ] 

Peter Vary commented on HIVE-21426:
---

+1 pending tests

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html
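
As a rough illustration of the change described above, the following Java sketch contrasts a shared Random with ThreadLocalRandom. The class RandomUsageSketch and its method names are hypothetical and do not come from Utilities.java or from the attached patches.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration, not Hive code.
final class RandomUsageSketch {

    // Pattern being removed: a single Random instance shared by all threads,
    // so concurrent callers contend on its internal seed.
    static final Random SHARED = new Random();

    static int sharedNext(int bound) {
        return SHARED.nextInt(bound);
    }

    // Replacement pattern: each thread uses its own generator, obtained on
    // demand through ThreadLocalRandom.current(), so no instance is shared.
    static int threadLocalNext(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }

    public static void main(String[] args) {
        System.out.println(sharedNext(100));
        System.out.println(threadLocalNext(100));
    }
}
```

Note that ThreadLocalRandom.current() should be called at the point of use rather than stored in a static field, since the returned generator belongs to the calling thread.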





[jira] [Updated] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21426:
--
Status: Open  (was: Patch Available)

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html





[jira] [Updated] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21426:
--
Attachment: (was: HIVE-21426.1.patch)

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html





[jira] [Updated] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21426:
--
Status: Patch Available  (was: Open)

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html





[jira] [Updated] (HIVE-21426) Remove Utilities Global Random

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21426:
--
Attachment: HIVE-21426.2.patch

> Remove Utilities Global Random
> --
>
> Key: HIVE-21426
> URL: https://issues.apache.org/jira/browse/HIVE-21426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21426.1.patch, HIVE-21426.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253
> Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.
> {quote}
> ThreadLocalRandom is initialized with an internally generated seed that may 
> not otherwise be modified. When applicable, use of ThreadLocalRandom rather 
> than shared Random objects in concurrent programs will typically encounter 
> much less overhead and contention.
> {quote}
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html





[jira] [Updated] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21246:
--
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~abstractdog]

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21246.1.patch, HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, I'm trying 
> to make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass in the required SerDe from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.
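
The difference between the two calling styles can be sketched in Java as follows. This does not reproduce the PlanUtils methods: the class SerDeSelectionSketch, both method names, and the placeholder com.example.SomeSerDe are hypothetical, and SerDes are represented by class-name strings only to keep the example self-contained.

```java
// Hypothetical illustration, not the PlanUtils API.
final class SerDeSelectionSketch {

    // Class name taken from the issue text; everything else here is an assumption.
    static final String DELIMITED_JSON = "org.apache.hadoop.hive.serde2.DelimitedJSONSerDe";

    /** Flag-based style: the caller's SerDe is silently replaced when the flag is set. */
    static String resolveWithFlag(String requestedSerDe, boolean useDelimitedJson) {
        return useDelimitedJson ? DELIMITED_JSON : requestedSerDe;
    }

    /** Direct style: the caller names the SerDe it wants and nothing overrides it. */
    static String resolveDirect(String serDe) {
        return serDe;
    }

    public static void main(String[] args) {
        String requested = "com.example.SomeSerDe"; // hypothetical placeholder
        // The requested SerDe is discarded because the flag is true.
        System.out.println(resolveWithFlag(requested, true));
        // The caller states the intended SerDe up front.
        System.out.println(resolveDirect(DELIMITED_JSON));
    }
}
```

In the direct style, the call site alone shows which SerDe is used, which is the readability gain the description argues for.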





[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956080#comment-16956080
 ] 

Hive QA commented on HIVE-22363:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 10s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 18s{color} | {color:red} ql generated 1 new + 1547 unchanged - 0 fixed = 1548 total (was 1547) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  4s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p in org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator, GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At ReduceSinkDeDuplication.java:[line 328] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19087/dev-support/hive-personality.sh |
| git revision | master / 1866d7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19087/yetus/new-findbugs-ql.html |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19087/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.1.2
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> But the removal logic only removes the first parent, so if there is some 
> other operator (a FIL in this case) between 

[jira] [Updated] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-21 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-22330:
-
Status: Open  (was: Patch Available)

> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22330.01.patch, HIVE-22330.02.patch, 
> HIVE-22330.03.patch
>
>
> When BytesColumnVector is populated with values, it normally creates a new 
> (byte[]) buffer object to hold the values, but if the values array is 
> <=1MB, then instead of creating a new buffer it reuses a single 
> "smallBuffer". Every time the smallBuffer is too small for the data we want 
> to store there, the size is doubled; when the size ends up larger than 1 GB 
> (or Integer.MAX_VALUE / 2) then the next time we try to double the size, 
> overflow occurs and an error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.
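
The overflow described above can be reproduced in isolation. The following is only a minimal sketch, not the actual BytesColumnVector code; the class and method names (SmallBufferGrowthSketch, naiveDoubledSize, clampedDoubledSize) are hypothetical and chosen for illustration.

```java
/** Hypothetical sketch of buffer-size doubling; not the actual BytesColumnVector code. */
final class SmallBufferGrowthSketch {

    /** Naive doubling: wraps to a negative value once the size exceeds Integer.MAX_VALUE / 2. */
    static int naiveDoubledSize(int currentSize) {
        return currentSize * 2;
    }

    /** Doubling that falls back to Integer.MAX_VALUE instead of overflowing. */
    static int clampedDoubledSize(int currentSize) {
        if (currentSize > Integer.MAX_VALUE / 2) {
            return Integer.MAX_VALUE;
        }
        return currentSize * 2;
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30; // 1073741824, just above Integer.MAX_VALUE / 2
        System.out.println(naiveDoubledSize(oneGiB));   // negative because of int overflow
        System.out.println(clampedDoubledSize(oneGiB)); // Integer.MAX_VALUE
    }
}
```

Whether an array of exactly Integer.MAX_VALUE bytes can actually be allocated depends on the JVM, so a real fix may need a slightly lower cap.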





[jira] [Updated] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-21 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-22330:
-
Attachment: HIVE-22330.03.patch
Status: Patch Available  (was: Open)

> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22330.01.patch, HIVE-22330.02.patch, 
> HIVE-22330.03.patch
>
>
> When BytesColumnVector is populated with values, it normally creates a new 
> (byte[]) buffer object to hold the values, but if the values array is 
> <=1MB, then instead of creating a new buffer it reuses a single 
> "smallBuffer". Every time the smallBuffer is too small for the data we want 
> to store there, the size is doubled; when the size ends up larger than 1 GB 
> (or Integer.MAX_VALUE / 2) then the next time we try to double the size, 
> overflow occurs and an error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.





[jira] [Updated] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-21 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21246:
--
Attachment: (was: HIVE-21246.1.patch)

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten with 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do. Just pass in the required SerDe from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.
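
The two calling conventions contrasted above could be sketched as follows. This is a hypothetical illustration only: SerDeSelectionSketch, resolveWithFlag and resolveExplicit are made-up names, not the PlanUtils API, and SerDes are represented by their class names as plain strings.

```java
/** Hypothetical sketch; the names below are illustrative, not the actual PlanUtils API. */
final class SerDeSelectionSketch {

    static final String DELIMITED_JSON = "org.apache.hadoop.hive.serde2.DelimitedJSONSerDe";

    /** Pattern being criticized: the flag silently replaces the SerDe the caller asked for. */
    static String resolveWithFlag(String requestedSerDe, boolean useDelimitedJson) {
        return useDelimitedJson ? DELIMITED_JSON : requestedSerDe;
    }

    /** Proposed direction: the caller passes the SerDe it actually wants; nothing is overridden. */
    static String resolveExplicit(String requestedSerDe) {
        return requestedSerDe;
    }

    public static void main(String[] args) {
        String lazySimple = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe";
        // With the flag set, the requested SerDe is discarded.
        System.out.println(resolveWithFlag(lazySimple, true));
        // With the explicit variant, the caller's choice is what gets used.
        System.out.println(resolveExplicit(DELIMITED_JSON));
    }
}
```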





[jira] [Updated] (HIVE-22378) Remove code duplications from return path handling

2019-10-21 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22378:
--
Attachment: HIVE-22378.01.patch

> Remove code duplications from return path handling
> --
>
> Key: HIVE-22378
> URL: https://issues.apache.org/jira/browse/HIVE-22378
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22378.01.patch
>
>
> Return path handling has some code duplications; they should be removed.





[jira] [Updated] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22238:

Attachment: HIVE-22238.03.patch

> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, 
> HIVE-22238.03.patch
>
>
> At [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however, [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies an estimation that was already used when 
> parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimations.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was actually already there, but a new 
> "scaling" happened
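
The double scaling described above can be shown with plain arithmetic. This is a sketch under made-up numbers (one million rows, selectivity 0.1); PkFkScalingSketch and scale are hypothetical names and do not come from StatsRulesProcFactory.

```java
/** Hypothetical arithmetic sketch of applying the same selectivity twice. */
final class PkFkScalingSketch {

    /** Rows expected to survive a filter with the given selectivity. */
    static long scale(long rows, double selectivity) {
        return Math.round(rows * selectivity);
    }

    public static void main(String[] args) {
        long rawParentRows = 1_000_000L; // assumed row count before any filtering
        double pkSelectivity = 0.1;      // assumed PK-side selectivity

        // The pushed-down filter is already reflected in the parent statistics.
        long parentStats = scale(rawParentRows, pkSelectivity);

        // Scaling the already-filtered estimate again undercounts by another factor of 10.
        long scaledAgain = scale(parentStats, pkSelectivity);

        System.out.println(parentStats); // 100000
        System.out.println(scaledAgain); // 10000
    }
}
```

With more upstream joins the same factor can be applied repeatedly, which is how the estimate ends up severely too low.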





[jira] [Commented] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956056#comment-16956056
 ] 

David Mollitor commented on HIVE-22360:
---

[~ShubhamChaurasia] Please take a look at the work I did in [HIVE-22337].  I 
have addressed this there in a way that can apply to all text-based SerDes.

> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps:
> Input file:
> {code}
> 1^,1^,^,0^,0^,0 
> 2^,1^,^,0^,1^,0 
> 3^,1^,^,0^,0^,0 
> 4^,1^,^,0^,1^,0
> {code}
> Queries:
> {code}
> CREATE TABLE n2(colA int, colB tinyint, colC timestamp, colD smallint, colE smallint)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe'
> WITH SERDEPROPERTIES ("field.delim"="^,") STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' OVERWRITE INTO TABLE n2;
> select * from n2;
> // wrong last column results here.
> +----------+----------+----------+----------+----------+
> | n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
> +----------+----------+----------+----------+----------+
> | 1        | 1        | NULL     | 0        | NULL     |
> | 2        | 1        | NULL     | 0        | NULL     |
> | 3        | 1        | NULL     | 0        | NULL     |
> | 4        | 1        | NULL     | 0        | NULL     |
> +----------+----------+----------+----------+----------+
> {code}
> Cause:
> In multi-serde parsing, the total length calculation here: 
> https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
>  does not take extra fields into account.
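
For reference, the behaviour one would expect for the repro above (extra trailing fields ignored, last schema column still populated) can be sketched with plain string splitting. This is not the LazyStruct implementation; MultiDelimitSplitSketch and parseRow are hypothetical names used only to illustrate the expected result.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Hypothetical sketch of splitting on a multi-character delimiter; not the LazyStruct code. */
final class MultiDelimitSplitSketch {

    /** Splits a row on a literal delimiter and keeps at most the schema's number of columns. */
    static String[] parseRow(String row, String delimiter, int schemaColumns) {
        // Pattern.quote treats "^," literally; limit -1 keeps trailing empty fields.
        String[] fields = row.split(Pattern.quote(delimiter), -1);
        return Arrays.copyOf(fields, Math.min(fields.length, schemaColumns));
    }

    public static void main(String[] args) {
        // Six fields in the file, five columns in the table schema.
        String[] parsed = parseRow("1^,1^,^,0^,0^,0", "^,", 5);
        System.out.println(Arrays.toString(parsed)); // [1, 1, , 0, 0]
    }
}
```

Here the fifth column keeps its value and the sixth field is simply dropped, rather than the fifth column turning into NULL.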





[jira] [Assigned] (HIVE-22378) Remove code duplications from return path handling

2019-10-21 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-22378:
-


> Remove code duplications from return path handling
> --
>
> Key: HIVE-22378
> URL: https://issues.apache.org/jira/browse/HIVE-22378
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
>
> Return path handling has some code duplications; they should be removed.





[jira] [Assigned] (HIVE-22377) Refactor CalcitePlanner

2019-10-21 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-22377:
-


> Refactor CalcitePlanner
> ---
>
> Key: HIVE-22377
> URL: https://issues.apache.org/jira/browse/HIVE-22377
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
>
> CalcitePlanner is a 5000+ line class that tries to do too many 
> things on its own. It extends SemanticAnalyzer, though it is not a 
> SemanticAnalyzer. It should have a better design.





[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=331401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331401
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 21/Oct/19 13:00
Start Date: 21/Oct/19 13:00
Worklog Time Spent: 10m 
  Work Description: richox commented on issue #354: HIVE-19653: Incorrect 
predicate pushdown for groupby with grouping sets
URL: https://github.com/apache/hive/pull/354#issuecomment-544503441
 
 
   > Now I'm facing this problem and I wonder why this pull request is still unmerged.
   
   I'm not interested in Hive any more... maybe you can try setting hive.cbo.enable to true and using the new CBO optimizer; this bug won't happen with CBO.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331401)
Time Spent: 20m  (was: 10m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('', '', 123456);
> SELECT * FROM (
> SELECT a, b, sum(s)
> FROM T1
> GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query will 
> output:
> {code:java}
> NULL  NULL  123456
> NULL        123456
>       NULL  123456
>             123456
> {code}
> We can see the predicate "a IS NOT NULL" has no effect, which is incorrect.
> When performing PPD optimization for a GBY operator, we should make sure all 
> grouping sets contain the expr being processed before pushdown. Otherwise the 
> expr value after GBY is changed and the result is wrong.
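
The check proposed above can be sketched as a small predicate. This is a simplified illustration, not the code of the patch: GroupingSetPushdownSketch and canPushDown are hypothetical names, and the sketch assumes the filter references a single column identified by name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the "all grouping sets contain the column" pushdown check. */
final class GroupingSetPushdownSketch {

    /** A filter on the column may be pushed below the GBY only if every grouping set has it. */
    static boolean canPushDown(String column, List<Set<String>> groupingSets) {
        for (Set<String> groupingSet : groupingSets) {
            if (!groupingSet.contains(column)) {
                // In this grouping set the column becomes NULL after the GBY,
                // so filtering before the GBY would change the result.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // GROUPING SETS ((), (a), (b), (a, b)) from the query above.
        List<Set<String>> sets = new ArrayList<>();
        sets.add(Collections.<String>emptySet());
        sets.add(Collections.singleton("a"));
        sets.add(Collections.singleton("b"));
        sets.add(new HashSet<String>(Arrays.asList("a", "b")));

        System.out.println(canPushDown("a", sets)); // false
    }
}
```

For the query in this issue the check returns false, so "a IS NOT NULL" would have to stay above the GBY operator.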





[jira] [Commented] (HIVE-22240) Function percentile_cont fails when array parameter passed

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956029#comment-16956029
 ] 

Hive QA commented on HIVE-22240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983589/HIVE-22240.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17545 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19086/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19086/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19086/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983589 - PreCommit-HIVE-Build

> Function percentile_cont fails when array parameter passed
> --
>
> Key: HIVE-22240
> URL: https://issues.apache.org/jira/browse/HIVE-22240
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22240.1.patch, HIVE-22240.2.patch, 
> HIVE-22240.3.patch, HIVE-22240.3.patch, HIVE-22240.4.patch, 
> HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch
>
>
> {code}
> SELECT
> percentile_cont(array(0.2, 0.5, 0.9)) WITHIN GROUP (ORDER BY value)
> FROM t_test;
> {code}
> hive.log:
> {code}
> 2019-09-24T21:00:43,203 ERROR [LocalJobRunner Map Task Executor #0] mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:793)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>   at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
>   ... 11 more
> Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$PercentileContEvaluator.iterate(GenericUDAFPercentileCont.java:259)
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:214)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:639)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:814)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:720)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:788)
>   ... 17 more
> {code}





[jira] [Updated] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22238:

Attachment: HIVE-22238.02.patch

> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch
>
>
> At [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however, [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies an estimation that was already used when 
> parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimations.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was actually already there, but a new 
> "scaling" happened





[jira] [Updated] (HIVE-22367) Transaction type not retrieved from OpenTxnRequest

2019-10-21 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22367:
--
Status: Patch Available  (was: Open)

> Transaction type not retrieved from OpenTxnRequest
> --
>
> Key: HIVE-22367
> URL: https://issues.apache.org/jira/browse/HIVE-22367
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22367.1.patch
>
>
> When opening a transaction, its type should be extracted from the OpenTxnRequest 
> object. Currently it's hardcoded to TxnType.DEFAULT.
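
The intended behaviour could look roughly like the sketch below. It uses stand-in types only: TxnTypeSketch, its nested TxnType enum and OpenTxnRequestStub are hypothetical simplifications, not the Thrift-generated metastore classes, and the assumption that an unset type falls back to DEFAULT is mine.

```java
/** Hypothetical sketch with stand-in types; not the metastore's generated classes. */
final class TxnTypeSketch {

    /** Stand-in for the transaction type enum. */
    enum TxnType { DEFAULT, REPL_CREATED, READ_ONLY }

    /** Stand-in for the open-transaction request; txnType is null when the client did not set it. */
    static final class OpenTxnRequestStub {
        final TxnType txnType;

        OpenTxnRequestStub(TxnType txnType) {
            this.txnType = txnType;
        }
    }

    /** Uses the type carried by the request and falls back to DEFAULT only when it is unset. */
    static TxnType resolveTxnType(OpenTxnRequestStub request) {
        return request.txnType != null ? request.txnType : TxnType.DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(resolveTxnType(new OpenTxnRequestStub(TxnType.READ_ONLY))); // READ_ONLY
        System.out.println(resolveTxnType(new OpenTxnRequestStub(null)));              // DEFAULT
    }
}
```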





[jira] [Updated] (HIVE-22367) Transaction type not retrieved from OpenTxnRequest

2019-10-21 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22367:
--
Attachment: HIVE-22367.1.patch

> Transaction type not retrieved from OpenTxnRequest
> --
>
> Key: HIVE-22367
> URL: https://issues.apache.org/jira/browse/HIVE-22367
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22367.1.patch
>
>
> When opening a transaction, its type should be extracted from the OpenTxnRequest 
> object. Currently it's hardcoded to TxnType.DEFAULT.





[jira] [Updated] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-10-21 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22238:

Description: 
At [this 
point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
 the parent operator's rownum is scaled according to pkfkselectivity;

however, [pkfkselectivity is 
computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
 on a whole subtree.

Scaling by that amount re-applies an estimation that was already used when 
parentstats was calculated, so depending on the number of upstream joins 
this may lead to severe underestimations.

What happened was:
* optimization was able to push the filter to the other side of the join
* as a result the incoming data was already filtered
* scaling down by the PK selectivity was actually already there, but a new 
"scaling" happened

  was:
At [this 
point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
 the parent operator's rownum is scaled according to pkfkselectivity;

however, [pkfkselectivity is 
computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
 on a whole subtree.

Scaling by that amount re-applies an estimation that was already used when 
parentstats was calculated, so depending on the number of upstream joins 
this may lead to severe underestimations.


> PK/FK selectivity estimation underscales estimations
> 
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22238.01.patch
>
>
> At [this 
> point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
>  the parent operator's rownum is scaled according to pkfkselectivity;
> however, [pkfkselectivity is 
> computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
>  on a whole subtree.
> Scaling by that amount re-applies an estimation that was already used when 
> parentstats was calculated, so depending on the number of upstream joins 
> this may lead to severe underestimations.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was actually already there, but a new 
> "scaling" happened





[jira] [Commented] (HIVE-22240) Function percentile_cont fails when array parameter passed

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956001#comment-16956001
 ] 

Hive QA commented on HIVE-22240:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  5s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  4s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19086/dev-support/hive-personality.sh |
| git revision | master / 1866d7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19086/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Function percentile_cont fails when array parameter passed
> --
>
> Key: HIVE-22240
> URL: https://issues.apache.org/jira/browse/HIVE-22240
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22240.1.patch, HIVE-22240.2.patch, 
> HIVE-22240.3.patch, HIVE-22240.3.patch, HIVE-22240.4.patch, 
> HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch, HIVE-22240.4.patch
>
>
> {code}
> SELECT
> percentile_cont(array(0.2, 0.5, 0.9)) WITHIN GROUP (ORDER BY value)
> FROM t_test;
> {code}
> hive.log:
> {code}
> 2019-09-24T21:00:43,203 ERROR [LocalJobRunner Map Task Executor #0] mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> 

[jira] [Commented] (HIVE-22375) ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of error

2019-10-21 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955996#comment-16955996
 ] 

Peter Vary commented on HIVE-22375:
---

+1

> ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of 
> error
> ---
>
> Key: HIVE-22375
> URL: https://issues.apache.org/jira/browse/HIVE-22375
> Project: Hive
> Issue Type: Bug
> Components: Standalone Metastore
> Affects Versions: 4.0.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Minor
> Attachments: HIVE-22375.1.patch
>
>
> In the ObjectStore.lockNotificationSequenceForUpdate method, the query is not
> closed if an error occurs:
> {noformat}
>   private void lockNotificationSequenceForUpdate() throws MetaException {
>     if (sqlGenerator.getDbProduct() == DatabaseProduct.DERBY && directSql != null) {
>       // Derby doesn't allow FOR UPDATE to lock the row being selected (See https://db.apache
>       // .org/derby/docs/10.1/ref/rrefsqlj31783.html) . So lock the whole table. Since there's
>       // only one row in the table, this shouldn't cause any performance degradation.
>       new RetryingExecutor(conf, () -> {
>         directSql.lockDbTable("NOTIFICATION_SEQUENCE");
>       }).run();
>     } else {
>       String selectQuery = "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\"";
>       String lockingQuery = sqlGenerator.addForUpdateClause(selectQuery);
>       new RetryingExecutor(conf, () -> {
>         prepareQuotes();
>         Query query = pm.newQuery("javax.jdo.query.SQL", lockingQuery);
>         query.setUnique(true);
>         // only need to execute it to get db Lock
>         query.execute();
>         query.closeAll();
>       }).run();
>     }
>   }
> {noformat}
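
The leak described above comes from calling closeAll() only on the success path: if execute() throws, the close is skipped. A minimal sketch of the usual remedy, releasing the query in a finally block, is shown here. FakeQuery is a hypothetical stand-in for the JDO query so that the example compiles without DataNucleus on the classpath; this is not the content of HIVE-22375.1.patch, only an illustration of the pattern.

```java
/** Hypothetical stand-in for a JDO query; models only the calls used in the report. */
final class FakeQuery {
    private final boolean failOnExecute;
    private boolean closed = false;

    FakeQuery(boolean failOnExecute) {
        this.failOnExecute = failOnExecute;
    }

    /** Simulates running the locking query; optionally fails like a database error would. */
    void execute() {
        if (failOnExecute) {
            throw new IllegalStateException("simulated database error");
        }
    }

    /** Simulates releasing the resources held by the query. */
    void closeAll() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class QueryCloseSketch {
    /** Executes the query and closes it in a finally block, so a failure cannot leak it. */
    static void executeAndClose(FakeQuery query) {
        try {
            // Only executed to take the database lock, as in the reported method.
            query.execute();
        } finally {
            // Runs on both the success path and the exception path.
            query.closeAll();
        }
    }
}
```

With this shape the exception still propagates to the caller (for example to a retrying wrapper), but the query has been closed by the time it does.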



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22375) ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of error

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955969#comment-16955969
 ] 

Hive QA commented on HIVE-22375:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983587/HIVE-22375.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17545 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19085/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19085/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19085/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983587 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-22375) ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of error

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955941#comment-16955941
 ] 

Hive QA commented on HIVE-22375:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 21s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 11s{color} | {color:blue} standalone-metastore/metastore-server in master has 171 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 40s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19085/dev-support/hive-personality.sh |
| git revision | master / 1866d7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server U: standalone-metastore/metastore-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19085/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955930#comment-16955930
 ] 

Hive QA commented on HIVE-22315:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983585/HIVE-22315.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17547 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorArithmetic.testDecimal64 (batchId=342)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19084/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19084/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19084/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983585 - PreCommit-HIVE-Build

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22315.1.patch, HIVE-22315.2.patch, 
> HIVE-22315.3.patch, HIVE-22315.4.patch, HIVE-22315.6.patch, HIVE-22315.7.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This
> Jira adds support for dividing a Decimal64 column by a Decimal64 scalar.
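
To make the requested operation concrete, here is a rough sketch of dividing a column of scaled longs by a scaled-long scalar. It assumes that a Decimal64 value is an unscaled long paired with a scale, which is not stated in the issue text. The class name, the explicit resultScale parameter and the HALF_UP rounding are illustrative choices, not Hive's vectorized expression code or its result-type rules.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class Decimal64DivideSketch {
    /**
     * Divides every value of a scaled-long column by one scaled-long scalar.
     * Inputs and outputs are unscaled longs; the scales say where the decimal point sits.
     * The result scale and HALF_UP rounding are assumptions made for this sketch.
     */
    static long[] divideColumnByScalar(long[] column, int columnScale,
                                       long scalar, int scalarScale,
                                       int resultScale) {
        BigDecimal divisor = BigDecimal.valueOf(scalar, scalarScale);
        long[] result = new long[column.length];
        for (int i = 0; i < column.length; i++) {
            BigDecimal dividend = BigDecimal.valueOf(column[i], columnScale);
            // divide(...) throws ArithmeticException for a zero divisor;
            // longValueExact() throws if the quotient does not fit in a long.
            result[i] = dividend.divide(divisor, resultScale, RoundingMode.HALF_UP)
                                .unscaledValue()
                                .longValueExact();
        }
        return result;
    }
}
```

For example, a column holding 10.00 and 2.50 (unscaled 1000 and 250 at scale 2) divided by 4.0 (unscaled 40 at scale 1) with a result scale of 3 gives unscaled 2500 and 625, that is 2.500 and 0.625. Going through BigDecimal keeps the sketch short; a vectorized implementation would presumably stay in long arithmetic.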





[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955927#comment-16955927
 ] 

Hive QA commented on HIVE-22315:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m  8s{color} | {color:red} branch/itests/hive-jmh cannot run convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 45s{color} | {color:red} ql: The patch generated 2 new + 386 unchanged - 1 fixed = 388 total (was 387) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  2s{color} | {color:red} root: The patch generated 2 new + 731 unchanged - 1 fixed = 733 total (was 732) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 12s{color} | {color:red} patch/itests/hive-jmh cannot run convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19084/dev-support/hive-personality.sh |
| git revision | master / 1866d7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19084/yetus/branch-findbugs-itests_hive-jmh.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19084/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19084/yetus/diff-checkstyle-root.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19084/yetus/patch-findbugs-itests_hive-jmh.txt |
| modules | C: vector-code-gen ql . itests itests/hive-jmh U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19084/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22360:
-
Description: 
Repro steps:

Input file:
{code}
1^,1^,^,0^,0^,0 
2^,1^,^,0^,1^,0 
3^,1^,^,0^,0^,0 
4^,1^,^,0^,1^,0
{code}

Queries:
{code}
CREATE TABLE n2(colA int, colB tinyint, colC timestamp, colD smallint, colE smallint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="^,") STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' OVERWRITE INTO TABLE n2;

select * from n2;
// wrong last column results here.
+----------+----------+----------+----------+----------+
| n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
+----------+----------+----------+----------+----------+
| 1        | 1        | NULL     | 0        | NULL     |
| 2        | 1        | NULL     | 0        | NULL     |
| 3        | 1        | NULL     | 0        | NULL     |
| 4        | 1        | NULL     | 0        | NULL     |
+----------+----------+----------+----------+----------+
{code}

Cause:
In multi-serde parsing, the total length calculation here: 
https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
 does not take extra fields into account.

  was:
Repro steps:

Input file:
{code}
1^,1^,^,0^,0^,0 
2^,1^,^,0^,1^,0 
3^,1^,^,0^,0^,0 
4^,1^,^,0^,1^,0
{code}

Queries:
{code}
CREATE TABLE n2(colA int, colB tinyint, colC timestamp, colD smallint, colE smallint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="^,") STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' OVERWRITE INTO TABLE n2;

select * from n2;
// wrong last column results here.
+----------+----------+----------+----------+----------+
| n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
+----------+----------+----------+----------+----------+
| 1        | 1        | NULL     | 0        | NULL     |
| 2        | 1        | NULL     | 0        | NULL     |
| 3        | 1        | NULL     | 0        | NULL     |
| 4        | 1        | NULL     | 0        | NULL     |
+----------+----------+----------+----------+----------+
{code}

Cause:
In multi-serde parsing, the total length calculation here: 
https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
 does not take extra fields into account.
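
As a rough illustration of the behaviour the report presumably expects, the sketch below splits a row on a multi-character delimiter and keeps only as many fields as the table schema declares, so a surplus trailing field cannot disturb the last declared column. The class and method names are hypothetical, and this is not the LazyStruct code linked above or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiDelimSplitSketch {
    /**
     * Splits a row on a multi-character delimiter and returns exactly columnCount fields.
     * Fields beyond columnCount are ignored; missing fields are returned as null.
     */
    static List<String> splitRow(String row, String delimiter, int columnCount) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> fields = new ArrayList<>();
        int start = 0;
        while (fields.size() < columnCount) {
            int end = row.indexOf(delimiter, start);
            if (end < 0) {
                // No further delimiter: the rest of the row is the final field.
                fields.add(row.substring(start));
                break;
            }
            // The field ends where the delimiter begins, whatever follows it.
            fields.add(row.substring(start, end));
            start = end + delimiter.length();
        }
        // Rows with fewer fields than the schema are padded with nulls.
        while (fields.size() < columnCount) {
            fields.add(null);
        }
        return fields;
    }
}
```

Applied to the second input row with the `^,` delimiter and five columns, the fifth field comes out as `1` and the sixth value in the file is dropped. In Hive the empty third field would still have to be converted to NULL for the timestamp column, which this sketch leaves out.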


> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 4.0.0
> Reporter: Shubham Chaurasia
> Assignee: Shubham Chaurasia
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h





[jira] [Updated] (HIVE-22360) MultiSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2019-10-21 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22360:
-
Attachment: HIVE-22360.1.patch
Status: Patch Available  (was: Open)

> MultiSerDe returns wrong results in last column when the loaded file has more 
> columns than those in table schema
> 
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 4.0.0
> Reporter: Shubham Chaurasia
> Assignee: Shubham Chaurasia
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22360.1.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h




