[jira] [Comment Edited] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan edited comment on HIVE-22315 at 10/10/19 5:58 AM:
---

-The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.-

(I need more sleep)


was (Author: gopalv):
The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.



> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan edited comment on HIVE-22315 at 10/10/19 5:58 AM:
---

-The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.-

(I need more sleep)


was (Author: gopalv):
-The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.-

(I need more sleep)

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Comment Edited] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan edited comment on HIVE-22315 at 10/10/19 5:56 AM:
---

The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.




was (Author: gopalv):
The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.

For (12,2) operands 3.95 and -728, here are the round-up/down values when converting to scale 6:

{code}
scala> ((395 * 1) / -728)
res1: Int = -5426
{code}

That value at scale 6 is -0.005426, which matches the DecimalWritable output.

However, the upscaled version doesn't match:

{code}
scala>  ((395 * 1000_000) / (-728*100))
res4: Int = -5425
{code}

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Updated] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22239:
---
Attachment: HIVE-22239.05.patch

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.03.patch, HIVE-22239.04.patch, HIVE-22239.04.patch, 
> HIVE-22239.05.patch, HIVE-22239.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
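A back-of-the-envelope sketch of the uniform-distribution idea (the class and method names here are illustrative, not Hive's actual optimizer API): assuming column values are spread evenly across [min, max], the selectivity of a predicate like `col <= c` is the fraction of the range it covers, rather than the fixed 1/3 heuristic.

```java
// Illustrative only: estimate selectivity of "col <= c" assuming values
// are uniformly distributed in [min, max]; the result is clamped to [0, 1].
public class RangeSelectivity {
    public static double lessOrEqual(double min, double max, double c) {
        if (max <= min) {
            return 1.0;  // degenerate or unknown range: no scaling possible
        }
        double fraction = (c - min) / (max - min);
        return Math.min(1.0, Math.max(0.0, fraction));
    }

    public static void main(String[] args) {
        // col in [0, 100], filter col <= 25 keeps about 25% of rows
        System.out.println(lessOrEqual(0, 100, 25));
    }
}
```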





[jira] [Comment Edited] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan edited comment on HIVE-22315 at 10/10/19 5:50 AM:
---

The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.

For (12,2) operands 3.95 and -728, here are the round-up/down values when converting to scale 6:

{code}
scala> ((395 * 1) / -728)
res1: Int = -5426
{code}

That value at scale 6 is -0.005426, which matches the DecimalWritable output.

However, the upscaled version doesn't match:

{code}
scala>  ((395 * 100) / (-728*100))
res4: Int = -5425
{code}
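The claim that round-up/down math works for integer fixed points can be sketched directly: applying a half-away-from-zero correction to the truncated integer quotient reproduces the -5426 that the row-mode DecimalWritable path produces. This helper is a hypothetical illustration of the technique, not Hive's actual code:

```java
// Hypothetical sketch: round-half-away-from-zero division on scaled longs.
public class FixedPointDiv {
    public static long divideHalfUp(long n, long d) {
        long q = n / d;  // Java integer division truncates toward zero
        long r = n % d;
        if (Math.abs(r) * 2 >= Math.abs(d)) {
            // |remainder| is at least half of |divisor|: round away from zero
            q += ((n ^ d) < 0) ? -1 : 1;
        }
        return q;
    }

    public static void main(String[] args) {
        // 3.95 upscaled to scale 6 is 3_950_000; divide by -728
        System.out.println(3_950_000L / -728L);              // truncated: -5425
        System.out.println(divideHalfUp(3_950_000L, -728L)); // rounded:   -5426
    }
}
```

With the correction, the scale-6 integer result is -5426, i.e. -0.005426, matching the row-mode output instead of the truncated -5425.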


was (Author: gopalv):
The math for rounding up/down works even for integer fixedpoints. I'll dig up 
some docs on it right now.

But for (12,2) operands 3.95 and -728, here are the round-up/down values:

{code}
scala> ((395 * 1 / -728))
res1: Int = -5426
{code}



> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Comment Edited] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan edited comment on HIVE-22315 at 10/10/19 5:50 AM:
---

The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.

For (12,2) operands 3.95 and -728, here are the round-up/down values when converting to scale 6:

{code}
scala> ((395 * 1) / -728)
res1: Int = -5426
{code}

That value at scale 6 is -0.005426, which matches the DecimalWritable output.

However, the upscaled version doesn't match:

{code}
scala>  ((395 * 1000_000) / (-728*100))
res4: Int = -5425
{code}


was (Author: gopalv):
The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.

For (12,2) operands 3.95 and -728, here are the round-up/down values when converting to scale 6:

{code}
scala> ((395 * 1) / -728)
res1: Int = -5426
{code}

That value at scale 6 is -0.005426, which matches the DecimalWritable output.

However, the upscaled version doesn't match:

{code}
scala>  ((395 * 100) / (-728*100))
res4: Int = -5425
{code}

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948225#comment-16948225
 ] 

Gopal Vijayaraghavan commented on HIVE-22315:
-

The math for rounding up/down works even for integer fixed-point values. I'll dig up some docs on it right now.

But for (12,2) operands 3.95 and -728, here are the round-up/down values:

{code}
scala> ((395 * 1 / -728))
res1: Int = -5426
{code}



> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948212#comment-16948212
 ] 

Gopal Vijayaraghavan commented on HIVE-22315:
-

{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
(key / 0)
 at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:149)
 at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
 at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:126)
 at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
 at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:864)
 ... 19 more
Caused by: java.lang.ArithmeticException: / by zero
{code}
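For the `/ by zero` path itself: Hive SQL returns NULL for x / 0 rather than raising an error, so a column/scalar kernel normally guards the scalar divisor once up front instead of letting the JVM throw per row. A minimal sketch of that convention (illustrative only, not the actual VectorExpression code):

```java
import java.util.Arrays;

// Sketch of the null-on-zero-divisor convention for a column / scalar kernel.
public class DivideColScalar {
    public static void divide(long[] col, long scalar, long[] out, boolean[] isNull) {
        if (scalar == 0) {
            // x / 0 is NULL in Hive SQL, not an ArithmeticException
            Arrays.fill(isNull, true);
            return;
        }
        for (int i = 0; i < col.length; i++) {
            out[i] = col[i] / scalar;
            isNull[i] = false;
        }
    }

    public static void main(String[] args) {
        long[] out = new long[3];
        boolean[] isNull = new boolean[3];
        divide(new long[]{10, 20, 30}, 0, out, isNull);
        System.out.println(Arrays.toString(isNull));
    }
}
```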

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948203#comment-16948203
 ] 

Gopal Vijayaraghavan commented on HIVE-22315:
-

{code}
Row 1 typeName1 decimal(12,2) typeName2 decimal(12,2) outputTypeName 
decimal(16,6) DIVIDE VECTOR_EXPRESSION COLUMN_SCALAR result -0.005425 
(HiveDecimalWritable) does not match row-mode expected result -0.005426 
(HiveDecimalWritable) row values [3.95] scalar2 -728
{code}

Running it in bc:

{code}
scale=6
3.95/-728
-.005425
scale=7
3.95/-728
-.0054258
{code}

so this is a rounding issue.
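The same diagnosis in BigDecimal terms (an illustrative sketch, assuming the row-mode path rounds half-up the way HiveDecimal does, while the failing vectorized path effectively truncates toward zero):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingDiff {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("3.95");
        BigDecimal b = new BigDecimal("-728");
        // Truncating at scale 6 reproduces the wrong vectorized result
        System.out.println(a.divide(b, 6, RoundingMode.DOWN));
        // Rounding half-up reproduces the expected row-mode result
        System.out.println(a.divide(b, 6, RoundingMode.HALF_UP));
    }
}
```

The exact quotient is -0.0054258..., so the seventh decimal digit (8) decides the rounding: truncation keeps -0.005425 while half-up rounding yields -0.005426.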

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Updated] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-22313:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[^HIVE-22313.01.patch] committed to master. Thanks [~ashutosh.bapat] for review.

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22313.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948201#comment-16948201
 ] 

mahesh kumar behera commented on HIVE-22313:


+1

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22313.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948199#comment-16948199
 ] 

Hive QA commented on HIVE-22315:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982625/HIVE-22315.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 17520 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal64_div_decimal64scalar]
 (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_col_scalar_division]
 (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_expressions]
 (batchId=58)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_expressions]
 (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf]
 (batchId=187)
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorArithmetic.testDecimal64
 (batchId=342)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18932/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18932/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18932/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982625 - PreCommit-HIVE-Build

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently, the division operation is not supported for Decimal64 columns. This 
> Jira will add support for dividing a Decimal64 column by a Decimal64 scalar.





[jira] [Updated] (HIVE-22316) Avoid hostname resolution in LlapInputFormat

2019-10-09 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22316:

Affects Version/s: 4.0.0

> Avoid hostname resolution in LlapInputFormat
> 
>
> Key: HIVE-22316
> URL: https://issues.apache.org/jira/browse/HIVE-22316
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: Screenshot 2019-10-10 at 10.13.48 AM.png
>
>
> Attaching profiler output, which showed up when running a short query. It would 
> be good to make the hostname a static final field.
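A minimal sketch of the caching suggested above (hypothetical class and field names, not the actual LlapInputFormat change): resolve the local hostname once in a static final field and reuse it, instead of calling `InetAddress.getLocalHost()` per record reader.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch: resolve the local hostname once, at class load time.
public class HostName {
    private static final String LOCAL_HOST_NAME = resolve();

    private static String resolve() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";  // conservative fallback if resolution fails
        }
    }

    public static String get() {
        return LOCAL_HOST_NAME;
    }

    public static void main(String[] args) {
        System.out.println(HostName.get());
    }
}
```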





[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948198#comment-16948198
 ] 

Hive QA commented on HIVE-22315:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
28s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
24s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
45s{color} | {color:red} branch/itests/hive-jmh cannot run convertXmlToText 
from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 2 new + 386 unchanged - 1 
fixed = 388 total (was 387) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
6s{color} | {color:red} root: The patch generated 10 new + 731 unchanged - 1 
fixed = 741 total (was 732) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} itests/hive-jmh: The patch generated 8 new + 15 
unchanged - 0 fixed = 23 total (was 15) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
12s{color} | {color:red} patch/itests/hive-jmh cannot run convertXmlToText from 
findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18932/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/branch-findbugs-itests_hive-jmh.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/diff-checkstyle-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/diff-checkstyle-itests_hive-jmh.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/patch-findbugs-itests_hive-jmh.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus/patch-asflicense-problems.txt
 |
| modules | C: vector-code-gen ql . itests itests/hive-jmh U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18932/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support Decimal64 column division with decimal64 scalar
> 

[jira] [Commented] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948156#comment-16948156
 ] 

Hive QA commented on HIVE-22239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982617/HIVE-22239.04.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 107 failed/errored test(s), 17518 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status]
 (batchId=88)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status_disable_bitvector]
 (batchId=88)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[confirm_initial_tbl_stats]
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constprog_type] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[foldts] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_arithmetic] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge5] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge6] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat1] 
(batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=93)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_11]
 (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_12]
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_13]
 (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_14]
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_15]
 (batchId=96)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_16]
 (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_17]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_2] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_3] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_5] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_7] 
(batchId=94)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_8] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_9] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamp_ints_casts] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_cast] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_mapjoin] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_non_constant_in_expr]
 (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] 
(batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_trunc] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_10] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_11] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_12] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_14] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] 
(batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_16] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_17] 
(batchId=97)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_2] 
(batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_3] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_5] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] 
(batchId=48)

[jira] [Commented] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948136#comment-16948136
 ] 

Hive QA commented on HIVE-22239:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
10s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
26s{color} | {color:red} ql generated 1 new + 1549 unchanged - 1 fixed = 1550 
total (was 1550) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
17s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to uniformWithinRange in 
org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(Node,
 Stack, NodeProcessorCtx, Object[])  At 
StatsRulesProcFactory.java:org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(Node,
 Stack, NodeProcessorCtx, Object[])  At StatsRulesProcFactory.java:[line 2025] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18931/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18931/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18931/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18931/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.03.patch, 

[jira] [Commented] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948118#comment-16948118
 ] 

Hive QA commented on HIVE-22314:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982618/HIVE-22314.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18930/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18930/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18930/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12982618/HIVE-22314.01.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982618 - PreCommit-HIVE-Build

> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --------------------------------------------------------------------------------------
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.01.patch, HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948117#comment-16948117
 ] 

Hive QA commented on HIVE-21414:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982619/HIVE-21414.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17518 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=111)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18929/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18929/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18929/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982619 - PreCommit-HIVE-Build

> Hive JSON SerDe Does Not Properly Handle Field Comments
> -------------------------------------------------------
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21414.1.patch
>
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948099#comment-16948099
 ] 

Hive QA commented on HIVE-21414:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
37s{color} | {color:blue} standalone-metastore/metastore-common in master has 
37 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18929/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18929/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-common U: 
standalone-metastore/metastore-common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18929/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Hive JSON SerDe Does Not Properly Handle Field Comments
> -------------------------------------------------------
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21414.1.patch
>
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948089#comment-16948089
 ] 

Hive QA commented on HIVE-22314:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982618/HIVE-22314.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 17518 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats5] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[count_dist_rewrite] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_11] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[keep_uniform] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullgroup4] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_count] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] 
(batchId=26)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[count_dist_rewrite]
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[keep_uniform]
 (batchId=181)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[nullgroup4] 
(batchId=124)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=301)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query28] 
(batchId=301)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=301)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query95] 
(batchId=301)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query28] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query28] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query95] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query28]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query16]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query28]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query94]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query95]
 (batchId=299)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18928/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18928/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18928/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982618 - PreCommit-HIVE-Build

> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --------------------------------------------------------------------------------------
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.01.patch, HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21806) Renaming a table to another db does not handle Impala stats in partitioned tables

2019-10-09 Thread Dinesh Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Garg reassigned HIVE-21806:
----------------------------------

Assignee: Thejas Nair

> Renaming a table to another db does not handle Impala stats in partitioned 
> tables
> --------------------------------------------------------------------------------
>
> Key: HIVE-21806
> URL: https://issues.apache.org/jira/browse/HIVE-21806
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Csaba Ringhofer
>Assignee: Thejas Nair
>Priority: Major
>
> The issue seems similar to HIVE-9720, but only happens with partitioned 
> tables. I only tested with Hive 3.1 in Impala dev environment.
> The issue can be reproduced by the following statements (it doesn't matter 
> whether Hive or Impala runs them, with the exception of compute/show stats).
> create database db1;
> create database db2;
> create table db1.t(i int) partitioned by (p int);
> insert into db1.t partition (p=1) values (2);
> compute stats db1.t; -- needs Impala
> alter table db1.t rename to db2.t;
> show column stats db2.t; -- needs Impala; null count and ndv will incorrectly 
> be -1
> drop table db2.t; -- this will hang



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Attachment: HIVE-14302.5.patch
Status: Patch Available  (was: In Progress)

Forgot to add llap/mapjoin_decimal_vectorized.q.out before; adding it now.

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> -------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.5.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, the joins now don't have any issues with decimal precision because 
> they cast to common.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> See cast up to Decimal(11, 2) in the plan, which normalizes both sides of the 
> join to be able to compare HiveDecimal as-is.
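
The scale sensitivity described above can be illustrated with a short sketch: the same decimal value stored as a fixed-point unscaled long differs between scales, so raw longs only compare correctly once both join sides are cast to a common type, as the plan does with decimal(11,2). This is an illustrative Python sketch, not Hive's actual HiveDecimal implementation.

```python
from decimal import Decimal

def unscaled(value, scale):
    """Fixed-point view of a decimal: the value as a long at `scale`."""
    return int(Decimal(value).scaleb(scale))

# "1.0" stored at two different scales yields two different longs, so a
# hashtable keyed on the raw long representation would miss the match:
print(unscaled("1.0", 1))   # 10  (as decimal(10,1))
print(unscaled("1.0", 2))   # 100 (as decimal(10,2))

# Once both join keys are cast up to decimal(11,2), equal values produce
# equal longs and can be compared as-is:
print(unscaled("1.0", 2) == unscaled("1.00", 2))   # True
```

This is the reason the optimized hashtable can support DECIMAL keys once the planner has normalized both sides to the same precision and scale.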



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Status: In Progress  (was: Patch Available)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> -------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, the joins now don't have any issues with decimal precision because 
> they cast to common.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> See cast up to Decimal(11, 2) in the plan, which normalizes both sides of the 
> join to be able to compare HiveDecimal as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948067#comment-16948067
 ] 

Hive QA commented on HIVE-22314:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 1 new + 5 unchanged - 0 fixed 
= 6 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18928/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18928/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18928/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18928/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --------------------------------------------------------------------------------------
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.01.patch, HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948065#comment-16948065
 ] 

Ramesh Kumar Thangarajan commented on HIVE-22315:
-------------------------------------------------

[~gopalv] 

I made a couple of changes to the code:
 # The decimal64 column and the decimal64 scalar are both already scaled up to max(scale1, scale2) (the maximum of the two scales), so I changed the final equation to multiply only by 10 ^ (outputColumn's scale) – (a * 10^(outputColumn's scale)) / b.
 # The second check we discussed, P + (z - x + y) <= 18, is the same as checking R <= 18. This is because we calculate R and z as below:
{code:java}
// Precision: P - x + y + max(6, x + Q + 1) -- R
// Scale: max(6, x + Q + 1)                 -- Z{code}
which is the same as R = P + (z - x + y). Since we already check the condition R <= 18, I have not added this check. Please let me know what you think.

Note: Decimal64 column – a(P,x); Decimal64 scalar – b(Q,y); output column – c(R,z)

Thanks!
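
The scheme in the two points above can be sketched end to end: both operands are brought to the common scale max(x, y), the output scale is z = max(6, x + Q + 1), and the quotient is computed on the scaled longs as (a * 10^z) / b. A minimal Python sketch under those assumptions (the function names are illustrative, not Hive's vectorized expression API):

```python
def scale_up(unscaled, from_scale, to_scale):
    """Rescale a fixed-point long to a larger scale."""
    return unscaled * 10 ** (to_scale - from_scale)

def div64(a_unscaled, x, b_unscaled, y, q):
    """Divide decimal64 column value a(P, x) by decimal64 scalar b(Q, y).

    Returns the unscaled result and the output scale z = max(6, x + Q + 1),
    computed as (a * 10**z) / b once both operands share scale max(x, y).
    """
    s = max(x, y)
    a = scale_up(a_unscaled, x, s)
    b = scale_up(b_unscaled, y, s)
    z = max(6, x + q + 1)                # output scale from the formula above
    out = (a * 10 ** z + b // 2) // b    # round half up (positive operands)
    return out, z

# 7.50 (decimal(4,2)) / 2.5 (decimal(3,1)) -> 3.000000 at scale 6
print(div64(750, 2, 25, 1, 3))   # (3000000, 6)
```

In real 64-bit arithmetic the intermediate a * 10^z is where overflow would occur, which is presumably what the R <= 18 check guards against.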

> Support Decimal64 column division with decimal64 scalar
> -------------------------------------------------------
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently division operation is not supported for Decimal64 column. This Jira 
> will take care of supporting decimal64 column division with a decimal64 
> scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948059#comment-16948059
 ] 

Ramesh Kumar Thangarajan commented on HIVE-22315:
-------------------------------------------------

Hi [~gopalv] [~rzhappy],

I have created a pull request [https://github.com/apache/hive/pull/808] so it 
will be easier to review; please share your comments. I will work on them and 
create the final patch. Thanks!

> Support Decimal64 column division with decimal64 scalar
> -------------------------------------------------------
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently division operation is not supported for Decimal64 column. This Jira 
> will take care of supporting decimal64 column division with a decimal64 
> scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-09 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin updated HIVE-22274:

Attachment: HIVE-22274.5.patch

> Upgrade Calcite version to 1.21.0
> ---------------------------------
>
> Key: HIVE-22274
> URL: https://issues.apache.org/jira/browse/HIVE-22274
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
> Attachments: HIVE-22274.1.patch, HIVE-22274.2.patch, 
> HIVE-22274.3.patch, HIVE-22274.4.patch, HIVE-22274.5.patch, HIVE-22274.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948055#comment-16948055
 ] 

Hive QA commented on HIVE-14302:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982604/HIVE-14302.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17520 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mapjoin_decimal_vectorized]
 (batchId=172)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18927/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18927/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18927/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982604 - PreCommit-HIVE-Build

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> -------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, the joins now don't have any issues with decimal precision because 
> they cast to common.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> See cast up to Decimal(11, 2) in the plan, which normalizes both sides of the 
> join to be able to compare HiveDecimal as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22315:

Attachment: HIVE-22315.1.patch
Status: Patch Available  (was: In Progress)

> Support Decimal64 column division with decimal64 scalar
> -------------------------------------------------------
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22315.1.patch
>
>
> Currently division operation is not supported for Decimal64 column. This Jira 
> will take care of supporting decimal64 column division with a decimal64 
> scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948028#comment-16948028
 ] 

Hive QA commented on HIVE-14302:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18927/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18927/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18927/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> ---
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
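The scale subtlety behind this issue can be seen outside Hive with plain `java.math.BigDecimal` (a standalone sketch, not Hive's hashtable code): `equals` treats 1.0 and 1.00 as different values because their scales differ, while `compareTo` sees them as numerically equal. Byte-level comparison of serialized decimal keys is therefore only safe when both sides share the same precision and scale.

```java
import java.math.BigDecimal;

public class DecimalKeyEquality {

    // equals() is scale-sensitive: "1.0" (scale 1) and "1.00" (scale 2)
    // are distinct representations even though they are numerically equal.
    static boolean sameRepresentation(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    // compareTo() ignores scale and compares the numeric value only.
    static boolean numericallyEqual(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }
}
```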
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not 

[jira] [Commented] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948019#comment-16948019
 ] 

Hive QA commented on HIVE-22313:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982600/HIVE-22313.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17518 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18926/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18926/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18926/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982600 - PreCommit-HIVE-Build

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22313.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22315:
---

Assignee: Ramesh Kumar Thangarajan

> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Currently division operation is not supported for Decimal64 column. This Jira 
> will take care of supporting decimal64 column division with a decimal64 
> scalar.
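For intuition only, the fixed-point math involved can be sketched as follows. This is a hypothetical illustration assuming nonnegative inputs and no long overflow, not the patch's implementation; Hive's vectorized Decimal64 expressions must additionally handle signs, divide-by-zero, and overflow beyond 18 digits.

```java
public class Decimal64Division {

    // A "decimal64"-style value is an unscaled long plus a scale:
    // 12.34 at scale 2 is stored as 1234L (value = unscaled / 10^scale).
    // Computes (dividend / divisor) expressed at resultScale, rounding half-up.
    // Assumes nonnegative inputs, divisor != 0, and no long overflow.
    static long divide(long dividend, int dividendScale,
                       long divisor, int divisorScale, int resultScale) {
        long scaled = dividend;
        // Multiply by 10^(divisorScale + resultScale - dividendScale) so that
        // scaled / divisor is the quotient's unscaled value at resultScale.
        for (int i = dividendScale; i < divisorScale + resultScale; i++) {
            scaled *= 10;
        }
        return (scaled + divisor / 2) / divisor;  // round half-up
    }
}
```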



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-22315) Support Decimal64 column division with decimal64 scalar

2019-10-09 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22315 started by Ramesh Kumar Thangarajan.
---
> Support Decimal64 column division with decimal64 scalar
> ---
>
> Key: HIVE-22315
> URL: https://issues.apache.org/jira/browse/HIVE-22315
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Currently division operation is not supported for Decimal64 column. This Jira 
> will take care of supporting decimal64 column division with a decimal64 
> scalar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21344) CBO: Reduce compilation time in presence of materialized views

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21344?focusedWorklogId=325910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325910
 ]

ASF GitHub Bot logged work on HIVE-21344:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 20:46
Start Date: 09/Oct/19 20:46
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #749: HIVE-21344
URL: https://github.com/apache/hive/pull/749#discussion_r33393
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -1612,6 +1612,150 @@ public Table apply(org.apache.hadoop.hive.metastore.api.Table table) {
 }
   }
 
+  /**
+   * Get the materialized views that have been enabled for rewriting from the
+   * cache (registry). It will preprocess them to discard those that are
+   * outdated and augment those that need to be augmented, e.g., if incremental
+   * rewriting is enabled.
+   *
+   * @return the list of materialized views available for rewriting from the registry
+   * @throws HiveException
+   */
+  public List<RelOptMaterialization> getPreprocessedMaterializedViewsFromRegistry(
+      List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    // From cache
+    List<RelOptMaterialization> materializedViews =
+        HiveMaterializedViewsRegistry.get().getRewritingMaterializedViews();
+    if (materializedViews.isEmpty()) {
+      // Bail out: empty list
+      return new ArrayList<>();
+    }
+    // Add to final result
+    return filterAugmentMaterializedViews(materializedViews, tablesUsed, txnMgr);
+  }
+
+  private List<RelOptMaterialization> filterAugmentMaterializedViews(List<RelOptMaterialization> materializedViews,
+      List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    final String validTxnsList = conf.get(ValidTxnList.VALID_TXNS_KEY);
+    final ValidTxnWriteIdList currentTxnWriteIds = txnMgr.getValidWriteIds(tablesUsed, validTxnsList);
+    final boolean tryIncrementalRewriting =
+        HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_INCREMENTAL);
+    final long defaultTimeWindow =
+        HiveConf.getTimeVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW,
+            TimeUnit.MILLISECONDS);
+    try {
+      // Final result
+      List<RelOptMaterialization> result = new ArrayList<>();
+      for (RelOptMaterialization materialization : materializedViews) {
+        final RelNode viewScan = materialization.tableRel;
+        final Table materializedViewTable;
+        if (viewScan instanceof Project) {
+          // There is a Project on top (due to nullability)
+          materializedViewTable = ((RelOptHiveTable) viewScan.getInput(0).getTable()).getHiveTableMD();
+        } else {
+          materializedViewTable = ((RelOptHiveTable) viewScan.getTable()).getHiveTableMD();
+        }
+        final Boolean outdated = isOutdatedMaterializedView(materializedViewTable, currentTxnWriteIds,
+            defaultTimeWindow, tablesUsed, false);
+        if (outdated == null) {
+          continue;
+        }
+
+        final CreationMetadata creationMetadata = materializedViewTable.getCreationMetadata();
+        if (outdated) {
+          // The MV is outdated, see whether we should consider it for rewriting or not
+          if (!tryIncrementalRewriting) {
+            LOG.debug("Materialized view " + materializedViewTable.getFullyQualifiedName() +
+                " ignored for rewriting as its contents are outdated");
+            continue;
+          }
+          // We will rewrite it to include the filters on transaction list
+          // so we can produce partial rewritings.
+          // This would be costly since we are doing it for every materialized view
+          // that is outdated, but it only happens for more than one materialized view
+          // if rewriting with outdated materialized views is enabled (currently
+          // disabled by default).
+          materialization = augmentMaterializationWithTimeInformation(
+              materialization, validTxnsList, new ValidTxnWriteIdList(
+                  creationMetadata.getValidTxnList()));
+        }
+        result.add(materialization);
+      }
+      return result;
+    } catch (Exception e) {
+      throw new HiveException(e);
+    }
+  }
+
+  /**
+   * Validate that the materialized views retrieved from registry are still up-to-date.
+   * For those that are not, the method loads them from the metastore into the registry.
+   *
+   * @return true if they are up-to-date, otherwise false
+   * @throws HiveException
+   */
+  public boolean validateMaterializedViewsFromRegistry(List<Table> cachedMaterializedViewTables,
+      List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    final long defaultTimeWindow =
+        HiveConf.getTimeVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW,
+            TimeUnit.MILLISECONDS);
+    final String validTxnsList = 

[jira] [Work logged] (HIVE-21344) CBO: Reduce compilation time in presence of materialized views

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21344?focusedWorklogId=325909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325909
 ]

ASF GitHub Bot logged work on HIVE-21344:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 20:46
Start Date: 09/Oct/19 20:46
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #749: HIVE-21344
URL: https://github.com/apache/hive/pull/749#discussion_r333172647
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
 ##
 @@ -159,15 +165,36 @@ public void run() {
 SessionState ss = new SessionState(db.getConf());
 ss.setIsHiveServerQuery(true); // All is served from HS2, we do not need e.g. Tez sessions
 SessionState.start(ss);
-final boolean cache = !db.getConf()
-    .get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname).equals("DUMMY");
-for (Table mv : db.getAllMaterializedViewObjectsForRewriting()) {
-  addMaterializedView(db.getConf(), mv, OpType.LOAD, cache);
+if (initialized.get()) {
+  for (Table mvTable : db.getAllMaterializedViewObjectsForRewriting()) {
 
 Review comment:
   I suggest we add an info/perf LOG here to mark the beginning of 
refresh/initialization. Since we already LOG when it is finished, this will give 
us an idea of how long the initialization/refresh took. It will potentially 
help debug performance-related issues in the future. 
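The begin/end logging pattern suggested here can be sketched generically (a hypothetical helper for illustration, not Hive's actual logging code):

```java
import java.util.function.Consumer;

public class TimedRefresh {

    // Runs a refresh with begin/end log markers and returns the elapsed
    // time in milliseconds, so slow initializations show up in the logs.
    static long timeRefresh(Runnable refresh, Consumer<String> log) {
        log.accept("Materialized view registry refresh/initialization started");
        long start = System.nanoTime();
        refresh.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        log.accept("Materialized view registry refresh/initialization finished in "
            + elapsedMs + " ms");
        return elapsedMs;
    }
}
```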
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325909)
Time Spent: 20m  (was: 10m)

> CBO: Reduce compilation time in presence of materialized views
> --
>
> Key: HIVE-21344
> URL: https://issues.apache.org/jira/browse/HIVE-21344
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 4.0.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21344.01.patch, HIVE-21344.02.patch, 
> HIVE-21344.03.patch, HIVE-21344.04.patch, HIVE-21344.patch, 
> calcite-planner-after-fix.svg.zip, mv-get-from-remote.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For every query, {{getAllValidMaterializedViews}} still requires a call to 
> metastore to verify that the materializations exist, whether they are 
> outdated or not, etc. Since this is only useful for active-active HS2 
> deployments, we could take a less aggressive approach and check this 
> information only after rewriting has been triggered. In addition, we could 
> refresh the information in the HS2 registry periodically in a background 
> thread.
> {code}
> // This is not a rebuild, we retrieve all the materializations. In turn, we 
> do not need
> // to force the materialization contents to be up-to-date, as this is not a 
> rebuild, and
> // we apply the user parameters 
> (HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW) instead.
> materializations = db.getAllValidMaterializedViews(getTablesUsed(basePlan), 
> false, getTxnMgr());
> {code}
> !mv-get-from-remote.png!
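The background-refresh idea from the description can be sketched with a `ScheduledExecutorService` (a minimal illustration; the class name, the `Runnable` hook, and the period are hypothetical, not Hive's implementation):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class RegistryRefresher {

    private final AtomicLong refreshCount = new AtomicLong();
    private final Runnable loadFromMetastore;  // e.g. reload valid materializations

    RegistryRefresher(Runnable loadFromMetastore) {
        this.loadFromMetastore = loadFromMetastore;
    }

    // One refresh pass: reload the registry contents and bump the counter.
    void refreshOnce() {
        loadFromMetastore.run();
        refreshCount.incrementAndGet();
    }

    // Schedule periodic refreshes on a single daemon thread so query
    // compilation can consult the registry without a metastore round trip.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "mv-registry-refresh");
            t.setDaemon(true);
            return t;
        });
        pool.scheduleAtFixedRate(this::refreshOnce, periodSeconds, periodSeconds,
            TimeUnit.SECONDS);
        return pool;
    }

    long refreshCount() {
        return refreshCount.get();
    }
}
```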



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947970#comment-16947970
 ] 

Hive QA commented on HIVE-22313:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
43s{color} | {color:blue} standalone-metastore/metastore-common in master has 
37 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18926/dev-support/hive-personality.sh
 |
| git revision | master / c104e8b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18926/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-common U: 
standalone-metastore/metastore-common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18926/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22313.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor reassigned HIVE-21414:
-

Assignee: David Mollitor

> Hive JSON SerDe Does Not Properly Handle Field Comments
> ---
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21414.1.patch
>
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21414:
--
Attachment: HIVE-21414.1.patch

> Hive JSON SerDe Does Not Properly Handle Field Comments
> ---
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21414.1.patch
>
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21414:
--
Status: Patch Available  (was: Open)

> Hive JSON SerDe Does Not Properly Handle Field Comments
> ---
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21414.1.patch
>
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-10-09 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21414:
--
Description: Field comments are handed to the JSON SerDe from HMS and then 
are ignored.  The result is that all field comments are 'from deserializer' and 
cannot be changed.  (was: Field comments are handed to the JSON SerDe from HMS 
and then are ignored.  The result is that all field comments are 'from 
deserializer' and cannot be changed.

For example, Avro SerDe handles comments:

https://github.com/apache/hive/blob/release-1.1.0/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java#L133)

> Hive JSON SerDe Does Not Properly Handle Field Comments
> ---
>
> Key: HIVE-21414
> URL: https://issues.apache.org/jira/browse/HIVE-21414
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Priority: Minor
>
> Field comments are handed to the JSON SerDe from HMS and then are ignored.  
> The result is that all field comments are 'from deserializer' and cannot be 
> changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22314:
---
Attachment: HIVE-22314.01.patch

> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.01.patch, HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22239:
---
Attachment: HIVE-22239.04.patch

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.03.patch, HIVE-22239.04.patch, HIVE-22239.04.patch, 
> HIVE-22239.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
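The uniform-distribution estimate described above reduces to simple interpolation over [min, max]. A sketch of the idea (not Hive's stats-annotation code; the fallback constant mirrors the 1/3 heuristic mentioned in the description):

```java
public class RangeSelectivity {

    // Estimated fraction of rows satisfying (col <= value), assuming column
    // values are uniformly distributed over [min, max]. Falls back to the
    // old 1/3 heuristic when the range is unknown or degenerate.
    static double lessThanOrEqualSelectivity(double value, double min, double max) {
        if (Double.isNaN(min) || Double.isNaN(max) || max <= min) {
            return 1.0 / 3.0;
        }
        if (value < min) {
            return 0.0;  // the filter removes every row
        }
        if (value >= max) {
            return 1.0;  // the filter removes no rows
        }
        return (value - min) / (max - min);
    }
}
```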



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22239:
---
Attachment: HIVE-22239.04.patch

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.03.patch, HIVE-22239.04.patch, HIVE-22239.04.patch, 
> HIVE-22239.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947959#comment-16947959
 ] 

Hive QA commented on HIVE-22239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982599/HIVE-22239.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18925/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18925/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18925/

Messages:
{noformat}
 This message was trimmed, see log for full details 
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_2.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out: 
does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_rewrite_no_join_opt_2.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_rewrite_part_1.q.out:
 does not exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_llap.q.out: does not exist 
in index
error: a/ql/src/test/results/clientpositive/llap/orc_llap_nonvector.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_merge5.q.out: does not 
exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_merge6.q.out: does not 
exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_merge7.q.out: does not 
exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_merge_incompat1.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_merge_incompat2.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/orc_predicate_pushdown.q.out: 
does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/parquet_predicate_pushdown.q.out: 
does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/results_cache_with_masking.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/retry_failure_reorder.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/runtime_stats_hs2.q.out: does 
not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_orc_nonvec_part_all_primitive.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_orc_nonvec_part_all_primitive_llap_io.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_orc_vec_part_all_primitive.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_orc_vec_part_all_primitive_llap_io.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_nonvec_part_all_primitive.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_nonvec_part_all_primitive_llap_io.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_vec_part_all_primitive.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_vec_part_all_primitive_llap_io.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_vecrow_part_all_primitive.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/schema_evol_text_vecrow_part_all_primitive_llap_io.q.out:
 does not exist in index
error: a/ql/src/test/results/clientpositive/llap/semijoin.q.out: does not exist 
in index
error: a/ql/src/test/results/clientpositive/llap/smb_mapjoin_14.q.out: does not 
exist in index
error: a/ql/src/test/results/clientpositive/llap/subquery_in.q.out: does not 
exist in index
error: a/ql/src/test/results/clientpositive/llap/subquery_select.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/tez_dynpart_hashjoin_1.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/tez_dynpart_hashjoin_2.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/tez_dynpart_hashjoin_3.q.out: 
does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/tez_vector_dynpart_hashjoin_1.q.out: 
does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/tez_vector_dynpart_hashjoin_2.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/llap/vector_aggregate_9.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/vector_between_in.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out: 
does not exist in index
error: 

[jira] [Commented] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947955#comment-16947955
 ] 

Hive QA commented on HIVE-22274:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982588/HIVE-22274.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 17518 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_SortUnionTransposeRule] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[inputwherefalse] (batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit0] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_limit] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[plan_json] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=40)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions] (batchId=198)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_extractTime] (batchId=198)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_floorTime] (batchId=198)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table_perf] (batchId=185)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_only_empty_query] (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_join_transpose] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown] (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_ANY] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_limit] (batchId=171)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=145)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_limit] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=121)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query72] (batchId=299)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18924/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18924/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18924/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982588 - PreCommit-HIVE-Build

> Upgrade Calcite version to 1.21.0
> -
>
> Key: HIVE-22274
> URL: https://issues.apache.org/jira/browse/HIVE-22274
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
> Attachments: HIVE-22274.1.patch, HIVE-22274.2.patch, 
> HIVE-22274.3.patch, HIVE-22274.4.patch, HIVE-22274.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947951#comment-16947951
 ] 

Hive QA commented on HIVE-22274:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
2s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 31 new + 369 unchanged - 11 
fixed = 400 total (was 380) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
0s{color} | {color:red} root: The patch generated 31 new + 369 unchanged - 11 
fixed = 400 total (was 380) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
18s{color} | {color:red} ql generated 7 new + 1548 unchanged - 2 fixed = 1555 
total (was 1550) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m  
1s{color} | {color:red} ql generated 4 new + 96 unchanged - 4 fixed = 100 total 
(was 100) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  6m 
44s{color} | {color:red} root generated 4 new + 333 unchanged - 4 fixed = 337 
total (was 337) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to joinInfo in 
org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveJoinFactoryImpl.createJoin(RelNode,
 RelNode, RexNode, Set, JoinRelType, boolean)  At 
HiveRelFactories.java:org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveJoinFactoryImpl.createJoin(RelNode,
 RelNode, RexNode, Set, JoinRelType, boolean)  At HiveRelFactories.java:[line 
166] |
|  |  Dead store to joinInfo in 
org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveSemiJoinFactoryImpl.createSemiJoin(RelNode,
 RelNode, RexNode)  At 
HiveRelFactories.java:org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveSemiJoinFactoryImpl.createSemiJoin(RelNode,
 RelNode, RexNode)  At HiveRelFactories.java:[line 183] |
|  |  Dead store to rightKeys in 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At 
HiveRelDecorrelator.java:org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At HiveRelDecorrelator.java:[line 1465] |
|  |  Dead store to leftKeys in 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At 

[jira] [Updated] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22314:
---
Status: Patch Available  (was: In Progress)

> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22314:
---
Attachment: HIVE-22314.patch

> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22314 started by Jesus Camacho Rodriguez.
--
> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-22314.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22314) Disable count distinct rewrite in Hive optimizer if it is already rewritten by Calcite

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-22314:
--


> Disable count distinct rewrite in Hive optimizer if it is already rewritten 
> by Calcite
> --
>
> Key: HIVE-22314
> URL: https://issues.apache.org/jira/browse/HIVE-22314
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947918#comment-16947918
 ] 

Hive QA commented on HIVE-21407:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982585/HIVE-21407.5.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17532 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18923/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18923/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18923/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982585 - PreCommit-HIVE-Build

> Parquet predicate pushdown is not working correctly for char column types
> -
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Attachments: HIVE-21407.2.patch, HIVE-21407.3.patch, 
> HIVE-21407.4.patch, HIVE-21407.5.patch, HIVE-21407.patch
>
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate 
> is not pushed to parquet, so the filtering only happens within Hive. If the 
> parameter is true, the filter is pushed to parquet, but for a char type, the 
> value which is pushed to Parquet will be padded with spaces:
> {noformat}
>   @Override
>   public void setValue(String val, int len) {
> super.setValue(HiveBaseChar.getPaddedValue(val, len), -1);
>   }
> {noformat} 
> So if we have a char(10) column which contains the value "apple" and the 
> where condition looks like 'where c='apple'', the value pushed to Paquet will 
> be 'apple' followed by 5 spaces. But the stored values are not padded, so no 
> rows will be returned from Parquet.
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 
> 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> ++++
> | ppd.c  | ppd.v  | ppd.i  |
> ++++
> ++++
> $ set hive.optimize.index.filter=false; or set 
> hive.optimize.ppd.storage=false;
> $ select * from ppd where c='apple';
> +-+++
> |ppd.c| ppd.v  | ppd.i  |
> +-+++
> | apple   | bee| 1  |
> | apple   | tree   | 2  |
> +-+++
> {noformat}
> The issue surfaced after the fix for 
> [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded 
> upstream. Before the HIVE-21327 fix, setting the parameter 
> 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q 
> test hid this issue.
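The padding mismatch described above can be illustrated with a short, self-contained sketch. The `getPaddedValue` helper below paraphrases what Hive's `HiveBaseChar.getPaddedValue` does for a char(n) literal; it is not the actual Hive class.

```java
public class CharPaddingDemo {
    // Mimics HiveBaseChar.getPaddedValue for a char(n) column:
    // pad the literal with trailing spaces up to the declared length.
    static String getPaddedValue(String val, int len) {
        StringBuilder sb = new StringBuilder(val);
        while (sb.length() < len) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "apple";                      // what Parquet actually stores (unpadded)
        String pushed = getPaddedValue("apple", 10);  // what the pushed-down predicate compares against

        System.out.println("stored=[" + stored + "] pushed=[" + pushed + "]");
        // The padded literal never equals the unpadded stored value,
        // so the pushed-down filter drops every row.
        System.out.println("matches: " + stored.equals(pushed)); // prints "matches: false"
    }
}
```

This is why the where clause `c='apple'` on a `char(10)` column returns no rows once the filter is pushed to Parquet: the comparison happens against `'apple'` plus five trailing spaces.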



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22312) MapJoinCounterHook doesnot work for tez

2019-10-09 Thread Pulkit Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947846#comment-16947846
 ] 

Pulkit Sharma edited comment on HIVE-22312 at 10/9/19 6:28 PM:
---

Thanks [~jcamachorodriguez]. I am working on the changes and will raise a PR soon.


was (Author: pulkits):
[~jcamachorodriguez] thanks. I am working on changes and will attach PR soon.

> MapJoinCounterHook doesnot work for tez
> ---
>
> Key: HIVE-22312
> URL: https://issues.apache.org/jira/browse/HIVE-22312
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions
>Reporter: Pulkit Sharma
>Priority: Major
>
> In [HIVE-1792|https://issues.apache.org/jira/browse/HIVE-1792], the 
> MapJoinCounterHook hook was added to track joins that get converted to map 
> joins. The hook gets the list of tasks from hookContext and checks the Tag 
> associated with each task. For mr, we create Conditional tasks for joins and 
> add tags for the respective join conversions. This does not work for tez, as 
> we only create a TezTask (no Conditional Task is created), which can handle 
> multiple joins, in contrast to one Conditional Task per join in mr.
> The current approach would fail even if we added a tag to the TezTask, since 
> it can contain multiple joins of the same type, each of which needs a counter.
> One possible solution for tez is to parse the query plan (available from 
> hookContext) after query completion to obtain the work graph. Using the work 
> graph, we can walk the operator tree to find join conversions.
>  If this approach looks good, I can raise a pull request.
> cc [~ashutoshc] [~jcamachorodriguez] [~pxiong] 
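The tree-walk idea above can be sketched in a few lines. `Op` here is a toy stand-in for Hive's `Operator<?>`; the real hook would traverse the work graph obtained from hookContext, but the counting logic is the same.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class JoinConversionCounter {
    // Minimal stand-in for a plan operator node.
    static final class Op {
        final String type;       // e.g. "MAPJOIN", "TS", "RS"
        final List<Op> children;
        Op(String type, Op... children) {
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // Depth-first walk counting operators of the given type.
    static int count(Op root, String type) {
        int n = 0;
        Deque<Op> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if (op.type.equals(type)) {
                n++;
            }
            op.children.forEach(stack::push);
        }
        return n;
    }

    public static void main(String[] args) {
        // One TezTask can contain several map joins, so counting nodes in the
        // operator tree works where a single per-task tag could not.
        Op plan = new Op("TS",
            new Op("MAPJOIN", new Op("RS")),
            new Op("MAPJOIN"));
        System.out.println(count(plan, "MAPJOIN")); // prints 2
    }
}
```

A per-task tag can record only one conversion, whereas the walk naturally counts every map join inside a single TezTask.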



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22306) Use nonblocking thrift server for metastore

2019-10-09 Thread Alan Gates (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947915#comment-16947915
 ] 

Alan Gates commented on HIVE-22306:
---

This is a good idea, but it will need a lot of testing.  HMS depends on 
thread-local variables in a number of places, so we will need to make sure any 
changes don't cause issues there.

> Use nonblocking thrift server for metastore
> ---
>
> Key: HIVE-22306
> URL: https://issues.apache.org/jira/browse/HIVE-22306
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Qinghui Xu
>Priority: Major
>
> Currently hive metastore's threads block on network io (it uses 
> `TThreadPoolServer` behind the scenes), which means that with increasing use 
> cases (in our tech stack, different services rely on it: hiveserver2, spark, 
> presto, and more, all with a significant number of users), handling all 
> connections needs either a big thread pool or many instances with smaller 
> thread pools. Often those metastores see their thread pool saturated while 
> cpu usage is still quite low, simply because most connections stay idle and 
> only run a query from time to time. This is a significant waste of compute 
> resources.
> I therefore propose using a non-blocking threading model and running 
> computation asynchronously. 
>  
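The core of the non-blocking model is that one selector thread watches many registered connections and only does work for the ones that are actually ready, instead of parking a pool thread on each idle connection. The sketch below illustrates this with plain `java.nio` pipes; it is only an analogy for what a non-blocking Thrift server (such as Thrift's `TThreadedSelectorServer`) would do with metastore connections.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectorDemo {
    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe idle = Pipe.open();     // nothing written: stays idle
            Pipe active = Pipe.open();   // one message written: becomes readable

            idle.source().configureBlocking(false);
            active.source().configureBlocking(false);
            idle.source().register(selector, SelectionKey.OP_READ);
            active.source().register(selector, SelectionKey.OP_READ);

            active.sink().write(ByteBuffer.wrap("ping".getBytes()));

            // Blocks until at least one channel is ready; only the channel
            // with pending data is selected. The idle one costs nothing:
            // no thread is parked on it.
            int ready = selector.select();
            System.out.println("ready channels: " + ready + " of 2 registered");
        }
    }
}
```

With thousands of mostly-idle metastore clients, this is the difference between thousands of blocked pool threads and a handful of selector threads plus a worker pool sized for actual query load.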



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947881#comment-16947881
 ] 

Hive QA commented on HIVE-21407:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 18 new + 260 unchanged - 12 
fixed = 278 total (was 272) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
31s{color} | {color:red} ql generated 1 new + 1549 unchanged - 1 fixed = 1550 
total (was 1550) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.hive.ql.io.parquet.LeafFilterFactory$BooleanFilterPredicateLeafBuilder.buildPredict(PredicateLeaf$Operator,
 Object, String, TypeInfo)  At LeafFilterFactory.java:then immediately reboxed 
in 
org.apache.hadoop.hive.ql.io.parquet.LeafFilterFactory$BooleanFilterPredicateLeafBuilder.buildPredict(PredicateLeaf$Operator,
 Object, String, TypeInfo)  At LeafFilterFactory.java:[line 138] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18923/dev-support/hive-personality.sh
 |
| git revision | master / 7ae6756 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18923/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18923/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18923/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18923/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Parquet predicate pushdown is not working correctly for char column types
> -
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Attachments: HIVE-21407.2.patch, HIVE-21407.3.patch, 
> HIVE-21407.4.patch, HIVE-21407.5.patch, HIVE-21407.patch
>
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate 
> is not pushed to parquet, so the filtering only happens within Hive. If the 
> parameter is true, the filter is pushed to parquet, but for a char 

[jira] [Updated] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21924:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to master!
Thanks [~mustafaiman] for the patch!

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> This piece of code makes CSV files (or any text files with a header or 
> footer) non-splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we can read a few KBs 
> at the end of the file and find the last line break offset, then use that to 
> determine the data range usable for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  
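The boundary computation proposed above can be sketched as follows. `HeaderFooterBounds` is a hypothetical helper, not the actual patch: it finds the first byte after the header lines and the last byte before the footer lines, assuming the file ends with a line break.

```java
import java.nio.charset.StandardCharsets;

public class HeaderFooterBounds {
    // Offset of the first byte after `headerCount` line breaks.
    static long dataStart(byte[] file, int headerCount) {
        int seen = 0;
        for (int i = 0; i < file.length; i++) {
            if (file[i] == '\n' && ++seen == headerCount) {
                return i + 1;
            }
        }
        return file.length; // fewer lines than the header: no data at all
    }

    // Offset just past the last data byte, excluding `footerCount` trailing
    // lines (assumes the file ends with a line break).
    static long dataEnd(byte[] file, int footerCount) {
        int seen = 0;
        for (int i = file.length - 1; i >= 0; i--) {
            if (file[i] == '\n' && ++seen == footerCount + 1) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        byte[] csv = "id,name\n1,a\n2,b\nfooter1\nfooter2\n"
                .getBytes(StandardCharsets.UTF_8);
        long start = dataStart(csv, 1);  // skip 1 header line
        long end = dataEnd(csv, 2);      // trim 2 footer lines
        // The range [start, end) covers only "1,a\n2,b\n" and can be split freely.
        System.out.print(new String(csv, (int) start, (int) (end - start),
                StandardCharsets.UTF_8));
    }
}
```

In the real input format, `dataEnd` would only read a small tail of the file rather than the whole byte array, which is the "few reads during split generation" trade-off the description mentions.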



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread Sankar Hariappan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947861#comment-16947861
 ] 

Sankar Hariappan commented on HIVE-21924:
-

+1

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> This piece of code makes CSV files (or any text files with a header or 
> footer) non-splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we can read a few KBs 
> at the end of the file and find the last line break offset, then use that to 
> determine the data range usable for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325827
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 17:11
Start Date: 09/Oct/19 17:11
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333131008
 
 

 ##
 File path: 
ql/src/test/queries/clientpositive/file_with_header_footer_aggregation.q
 ##
 @@ -0,0 +1,94 @@
+set hive.mapred.mode=nonstrict;
+
+dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir};
+dfs -copyFromLocal ../../data/files/header_footer_table_4  
${system:test.tmp.dir}/header_footer_table_4;
+
+CREATE TABLE numbrs (numbr int);
+INSERT INTO numbrs VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), 
(11), (12), (NULL);
+CREATE EXTERNAL TABLE header_footer_table_4 (header_int int, header_name 
string, header_choice varchar(10)) ROW FORMAT DELIMITED FIELDS TERMINATED BY 
',' LOCATION '${system:test.tmp.dir}/header_footer_table_4' tblproperties 
("skip.header.line.count"="1", "skip.footer.line.count"="2");
 
 Review comment:
   ok... got it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325827)
Time Spent: 4.5h  (was: 4h 20m)

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> This piece of code makes CSV files (or any text files with a header or 
> footer) non-splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we can read a few KBs 
> at the end of the file and find the last line break offset, then use that to 
> determine the data range usable for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22303) TestObjectStore starts some deadline timers which are never stopped

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947854#comment-16947854
 ] 

Hive QA commented on HIVE-22303:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982571/HIVE-22303.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17516 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18922/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18922/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18922/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982571 - PreCommit-HIVE-Build

> TestObjectStore starts some deadline timers which are never stopped
> ---
>
> Key: HIVE-22303
> URL: https://issues.apache.org/jira/browse/HIVE-22303
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22303.01.patch, HIVE-22303.01.patch
>
>
> Because these timers are not stopped, they may linger as a thread-local and 
> eventually time out, since the disarm logic is missing...
> https://github.com/apache/hive/blob/d907dfe68ed84714d62a22e5191efa616eab2b24/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java#L373



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22312) MapJoinCounterHook doesnot work for tez

2019-10-09 Thread Pulkit Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947846#comment-16947846
 ] 

Pulkit Sharma commented on HIVE-22312:
--

[~jcamachorodriguez] thanks. I am working on changes and will attach PR soon.

> MapJoinCounterHook doesnot work for tez
> ---
>
> Key: HIVE-22312
> URL: https://issues.apache.org/jira/browse/HIVE-22312
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions
>Reporter: Pulkit Sharma
>Priority: Major
>
> In [HIVE-1792|https://issues.apache.org/jira/browse/HIVE-1792], the 
> MapJoinCounterHook hook was added to track joins that get converted to map 
> joins. This hook gets the list of Tasks from hookContext and checks the Tag 
> associated with each task. For MR, we create Conditional tasks for joins and 
> add tags for the respective join conversions. This does not work for Tez, as 
> we only create a TezTask (no Conditional Task is created), which can handle 
> multiple joins, in contrast to one Conditional Task per join in MR.
> The current approach will fail even if we add a tag to TezTask, as it can 
> have multiple joins of the same type, each of which would require a counter.
> One possible solution for Tez is to parse the query plan after query 
> completion (available from hookContext) to get the work graph. Using the 
> work graph, we can walk the Operator Tree to find join conversions.
>  If this approach looks good, I can raise a Pull Request.
> cc [~ashutoshc] [~jcamachorodriguez] [~pxiong] 
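The plan-walking idea can be sketched generically. The node type below is hypothetical; Hive's actual operator classes and work-graph API differ. The point is only the shape of the traversal: walk the operator DAG once after query completion and tally each join conversion by operator type.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical operator-DAG walk; not Hive's actual Operator/Work classes.
public class JoinCounterSketch {

    static class Op {
        final String type;                       // e.g. "MAPJOIN", "MERGEJOIN"
        final List<Op> children = new ArrayList<>();
        Op(String type) { this.type = type; }
    }

    static Map<String, Integer> countJoins(Op root) {
        Map<String, Integer> counts = new HashMap<>();
        Deque<Op> stack = new ArrayDeque<>();
        Set<Op> seen = new HashSet<>();          // the DAG may share sub-trees
        stack.push(root);
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if (!seen.add(op)) {
                continue;                        // already visited
            }
            if (op.type.endsWith("JOIN")) {
                counts.merge(op.type, 1, Integer::sum);
            }
            op.children.forEach(stack::push);
        }
        return counts;
    }

    public static void main(String[] args) {
        Op root = new Op("SELECT");
        root.children.add(new Op("MAPJOIN"));
        root.children.add(new Op("MAPJOIN"));
        System.out.println(countJoins(root));
    }
}
```

Because the walk aggregates per join type, it side-steps the problem noted above of a single TezTask containing several joins of the same type.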



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325808
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:31
Start Date: 09/Oct/19 16:31
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333113836
 
 

 ##
 File path: 
ql/src/test/queries/clientpositive/file_with_header_footer_aggregation.q
 ##
 @@ -0,0 +1,94 @@
+set hive.mapred.mode=nonstrict;
+
+dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir};
+dfs -copyFromLocal ../../data/files/header_footer_table_4  
${system:test.tmp.dir}/header_footer_table_4;
+
+CREATE TABLE numbrs (numbr int);
+INSERT INTO numbrs VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), 
(11), (12), (NULL);
+CREATE EXTERNAL TABLE header_footer_table_4 (header_int int, header_name 
string, header_choice varchar(10)) ROW FORMAT DELIMITED FIELDS TERMINATED BY 
',' LOCATION '${system:test.tmp.dir}/header_footer_table_4' tblproperties 
("skip.header.line.count"="1", "skip.footer.line.count"="2");
 
 Review comment:
   see `skiphf_aggr2.q`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325808)
Time Spent: 4h 20m  (was: 4h 10m)

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> this piece of code makes CSV files (or any text files with a header/footer) 
> not splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we may read a few KBs of 
> data at the end and find the last line-break offset, then use that to 
> determine the data range available for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Status: In Progress  (was: Patch Available)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> ---
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, joins now don't have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> See the cast up to Decimal(11,2) in the plan, which normalizes both sides of 
> the join so that HiveDecimal values can be compared as-is.
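The scale sensitivity behind this cast can be reproduced with `java.math.BigDecimal`, which behaves analogously to HiveDecimal here. This is an illustration of the concept, not Hive code: values of different scale are numerically equal but not byte-wise equal until both are rescaled to the common type.

```java
import java.math.BigDecimal;

// Why join keys of differing decimal scales need a cast to a common type
// before a direct (scale-sensitive) comparison.
public class DecimalScaleDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.00"); // column typed decimal(10,2)
        BigDecimal b = new BigDecimal("1.0");  // column typed decimal(10,1)

        // Numerically equal, but equals() is scale-sensitive (analogous to
        // comparing serialized decimal bytes as-is):
        System.out.println(a.compareTo(b) == 0); // true
        System.out.println(a.equals(b));         // false

        // After normalizing both sides to the common scale (as the plan's
        // cast to decimal(11,2) does), direct comparison succeeds:
        System.out.println(a.setScale(2).equals(b.setScale(2))); // true
    }
}
```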



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-09 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Attachment: HIVE-14302.4.patch
Status: Patch Available  (was: In Progress)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> ---
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, 
> HIVE-14302.4.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was decided on the basis of the 
> fact that Decimal(10,1) == Decimal(10, 2) when both contain "1.0" and "1.00".
> However, joins now don't have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> See the cast up to Decimal(11,2) in the plan, which normalizes both sides of 
> the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325793
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:07
Start Date: 09/Oct/19 16:07
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333102646
 
 

 ##
 File path: 
ql/src/test/queries/clientpositive/file_with_header_footer_aggregation.q
 ##
 @@ -0,0 +1,94 @@
+set hive.mapred.mode=nonstrict;
+
+dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir};
+dfs -copyFromLocal ../../data/files/header_footer_table_4  
${system:test.tmp.dir}/header_footer_table_4;
+
+CREATE TABLE numbrs (numbr int);
+INSERT INTO numbrs VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), 
(11), (12), (NULL);
+CREATE EXTERNAL TABLE header_footer_table_4 (header_int int, header_name 
string, header_choice varchar(10)) ROW FORMAT DELIMITED FIELDS TERMINATED BY 
',' LOCATION '${system:test.tmp.dir}/header_footer_table_4' tblproperties 
("skip.header.line.count"="1", "skip.footer.line.count"="2");
+
+SELECT * FROM header_footer_table_4;
+
+SELECT * FROM header_footer_table_4 ORDER BY header_int LIMIT 8;
+
+-- should return nothing as title is correctly skipped
+SELECT * FROM header_footer_table_4 WHERE header_choice = 'header_choice';
 
 Review comment:
   this is covered.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325793)
Time Spent: 4h 10m  (was: 4h)

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> this piece of code makes CSV files (or any text files with a header/footer) 
> not splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we may read a few KBs of 
> data at the end and find the last line-break offset, then use that to 
> determine the data range available for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325791
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:06
Start Date: 09/Oct/19 16:06
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333102183
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/SkippingTextInputFormat.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.FileSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.TextInputFormat;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * SkippingInputFormat is a header/footer aware input format. It truncates
+ * splits identified by TextInputFormat. Header and footers are removed
+ * from the splits.
+ */
+public class SkippingTextInputFormat extends TextInputFormat {
+
+  private final Map<Path, Long> startIndexMap = new ConcurrentHashMap<>();
+  private final Map<Path, Long> endIndexMap = new ConcurrentHashMap<>();
+  private JobConf conf;
+  private int headerCount;
+  private int footerCount;
+
+  @Override
+  public void configure(JobConf conf) {
+this.conf = conf;
+super.configure(conf);
+  }
+
+  public void configure(JobConf conf, int headerCount, int footerCount) {
+configure(conf);
+this.headerCount = headerCount;
+this.footerCount = footerCount;
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] 
hosts) {
+return makeSplitInternal(file, start, length, hosts, null);
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] 
hosts, String[] inMemoryHosts) {
+return makeSplitInternal(file, start, length, hosts, inMemoryHosts);
+  }
+
+  private FileSplit makeSplitInternal(Path file, long start, long length, 
String[] hosts, String[] inMemoryHosts) {
+long cachedStart;
+long cachedEnd;
+try {
+  cachedStart = getCachedStartIndex(file);
+  cachedEnd = getCachedEndIndex(file);
+} catch (IOException e) {
+  LOG.warn("Could not detect header/footer", e);
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start + length) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start) {
+  length = length - (cachedStart - start);
+  start = cachedStart;
+}
+if (cachedEnd < start) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedEnd < start + length) {
+  length = cachedEnd - start;
+}
+if (inMemoryHosts == null) {
+  return super.makeSplit(file, start, length, hosts);
+} else {
+  return super.makeSplit(file, start, length, hosts, inMemoryHosts);
+}
+  }
+
+  private long getCachedStartIndex(Path path) throws IOException {
+if (headerCount == 0) {
+  return 0;
+}
+Long startIndexForFile = startIndexMap.get(path);
+if (startIndexForFile == null) {
+  FileSystem fileSystem;
+  FSDataInputStream fis = null;
+  fileSystem = path.getFileSystem(conf);
+  try {
+fis = fileSystem.open(path);
+for (int j = 0; j < headerCount; j++) {
+  if (fis.readLine() == null) {
+startIndexMap.put(path, Long.MAX_VALUE);
+return Long.MAX_VALUE;
+  }
+}
+// back 1 byte because readers skip the entire first row if split 
start is not 0
+startIndexForFile = fis.getPos() - 1;
 
 Review comment:
   ok. make sense.
 

This is an automated message from the Apache Git Service.
To respond to the 

[jira] [Commented] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread Alexey Diomin (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947811#comment-16947811
 ] 

Alexey Diomin commented on HIVE-19261:
--

Updated patch:
 # improve readability and prevent assigning the instance variable multiple times
 # remove the comments about computeIfAbsent; we can't use it here because 
makeInstance can throw a checked exception

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19261.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In HIVE-16175, upstream made a patch to fix the thread-safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized can be expensive in a highly concurrent environment 
> like Spark, as multiple threads need to synchronize on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread-safety issue: making the 
> underlying map of InstanceCache a ConcurrentHashMap. Ideally, we could use 
> the atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> Since computeIfAbsent is only available in Java 8 and Java 7 is still 
> supported in Hive, we use a pattern to simulate the behavior of 
> computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires Java 8.
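The pattern alluded to above can be sketched roughly as follows. The class and method names are illustrative, not the actual patch: a lock-free ConcurrentHashMap read on the fast path, with a synchronized, double-checked slow path, because the checked exception thrown by the factory rules out computeIfAbsent.

```java
import java.util.concurrent.ConcurrentHashMap;

// Rough sketch of the described approach; names are illustrative,
// not the actual Hive InstanceCache code.
public class InstanceCacheSketch<K, V> {

    public interface Factory<K, V> {
        V make(K key) throws Exception; // checked exception blocks computeIfAbsent
    }

    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Factory<K, V> factory;

    public InstanceCacheSketch(Factory<K, V> factory) {
        this.factory = factory;
    }

    public V retrieve(K key) throws Exception {
        V instance = cache.get(key);       // lock-free fast path
        if (instance == null) {
            synchronized (this) {
                instance = cache.get(key); // re-check under the lock
                if (instance == null) {
                    instance = factory.make(key);
                    cache.put(key, instance);
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) throws Exception {
        InstanceCacheSketch<String, Object> cache =
                new InstanceCacheSketch<>(key -> new Object());
        System.out.println(cache.retrieve("a") == cache.retrieve("a"));
    }
}
```

Unlike computeIfAbsent, this double-checked variant can, under contention, serialize misses behind one lock, but hits never block, which is the performance property the proposal is after.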



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325790
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:05
Start Date: 09/Oct/19 16:05
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333101701
 
 

 ##
 File path: 
ql/src/test/queries/clientpositive/file_with_header_footer_aggregation.q
 ##
 @@ -0,0 +1,94 @@
+set hive.mapred.mode=nonstrict;
+
+dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir};
+dfs -copyFromLocal ../../data/files/header_footer_table_4  
${system:test.tmp.dir}/header_footer_table_4;
+
+CREATE TABLE numbrs (numbr int);
+INSERT INTO numbrs VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), 
(11), (12), (NULL);
+CREATE EXTERNAL TABLE header_footer_table_4 (header_int int, header_name 
string, header_choice varchar(10)) ROW FORMAT DELIMITED FIELDS TERMINATED BY 
',' LOCATION '${system:test.tmp.dir}/header_footer_table_4' tblproperties 
("skip.header.line.count"="1", "skip.footer.line.count"="2");
 
 Review comment:
   how about this?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325790)
Time Spent: 3h 50m  (was: 3h 40m)

> Split text files even if header/footer exists
> -
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.4.0, 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.3.patch, 
> HIVE-21924.4.patch, HIVE-21924.5.patch, HIVE-21924.6.patch, HIVE-21924.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>  
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
> // Input file has header or footer, cannot be splitted.
> HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, 
> Long.MAX_VALUE);
>   }
> }
> {code}
> this piece of code makes CSV files (or any text files with a header/footer) 
> not splittable whenever a header or footer is present. 
> If only a header is present, we can find the offset after the first line 
> break and use that to split. Similarly for a footer, we may read a few KBs of 
> data at the end and find the last line-break offset, then use that to 
> determine the data range available for splitting. A few reads during split 
> generation are cheaper than not splitting the file at all.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19261?focusedWorklogId=325789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325789
 ]

ASF GitHub Bot logged work on HIVE-19261:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:04
Start Date: 09/Oct/19 16:04
Worklog Time Spent: 10m 
  Work Description: xhumanoid commented on pull request #807: HIVE-19261: 
Avro SerDe's InstanceCache should not be synchronized on retrieve
URL: https://github.com/apache/hive/pull/807
 
 
   Removing synchronization on cache retrieve has a big performance impact.
   It would be great to have this in branch-3 also.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325789)
Remaining Estimate: 0h
Time Spent: 10m

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19261.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In HIVE-16175, upstream made a patch to fix the thread-safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized can be expensive in a highly concurrent environment 
> like Spark, as multiple threads need to synchronize on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread-safety issue: making the 
> underlying map of InstanceCache a ConcurrentHashMap. Ideally, we could use 
> the atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> Since computeIfAbsent is only available in Java 8 and Java 7 is still 
> supported in Hive, we use a pattern to simulate the behavior of 
> computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires Java 8.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-19261:
--
Labels: pull-request-available  (was: )

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19261.1.patch
>
>
> In HIVE-16175, upstream made a patch to fix the thread-safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized can be expensive in a highly concurrent environment 
> like Spark, as multiple threads need to synchronize on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread-safety issue: making the 
> underlying map of InstanceCache a ConcurrentHashMap. Ideally, we could use 
> the atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> Since computeIfAbsent is only available in Java 8 and Java 7 is still 
> supported in Hive, we use a pattern to simulate the behavior of 
> computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires Java 8.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325788&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325788
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:04
Start Date: 09/Oct/19 16:04
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333100958
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/SkippingTextInputFormat.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.FileSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.TextInputFormat;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * SkippingInputFormat is a header/footer aware input format. It truncates
+ * splits identified by TextInputFormat. Header and footers are removed
+ * from the splits.
+ */
+public class SkippingTextInputFormat extends TextInputFormat {
+
+  private final Map<Path, Long> startIndexMap = new ConcurrentHashMap<>();
+  private final Map<Path, Long> endIndexMap = new ConcurrentHashMap<>();
+  private JobConf conf;
+  private int headerCount;
+  private int footerCount;
+
+  @Override
+  public void configure(JobConf conf) {
+this.conf = conf;
+super.configure(conf);
+  }
+
+  public void configure(JobConf conf, int headerCount, int footerCount) {
+configure(conf);
+this.headerCount = headerCount;
+this.footerCount = footerCount;
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] 
hosts) {
+return makeSplitInternal(file, start, length, hosts, null);
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] 
hosts, String[] inMemoryHosts) {
+return makeSplitInternal(file, start, length, hosts, inMemoryHosts);
+  }
+
+  private FileSplit makeSplitInternal(Path file, long start, long length, 
String[] hosts, String[] inMemoryHosts) {
+long cachedStart;
+long cachedEnd;
+try {
+  cachedStart = getCachedStartIndex(file);
+  cachedEnd = getCachedEndIndex(file);
+} catch (IOException e) {
+  LOG.warn("Could not detect header/footer", e);
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start + length) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start) {
+  length = length - (cachedStart - start);
+  start = cachedStart;
+}
+if (cachedEnd < start) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedEnd < start + length) {
+  length = cachedEnd - start;
+}
+if (inMemoryHosts == null) {
+  return super.makeSplit(file, start, length, hosts);
+} else {
+  return super.makeSplit(file, start, length, hosts, inMemoryHosts);
+}
+  }
+
+  private long getCachedStartIndex(Path path) throws IOException {
+if (headerCount == 0) {
+  return 0;
+}
+Long startIndexForFile = startIndexMap.get(path);
+if (startIndexForFile == null) {
+  FileSystem fileSystem;
+  FSDataInputStream fis = null;
+  fileSystem = path.getFileSystem(conf);
+  try {
+fis = fileSystem.open(path);
+for (int j = 0; j < headerCount; j++) {
+  if (fis.readLine() == null) {
+startIndexMap.put(path, Long.MAX_VALUE);
+return Long.MAX_VALUE;
+  }
+}
+// back 1 byte because readers skip the entire first row if split 
start is not 0
+startIndexForFile = fis.getPos() - 1;
+  } finally {
+if (fis != null) {
+  fis.close();
+}
+  }
+  startIndexMap.put(path, startIndexForFile);
+}
+return startIndexForFile;
+  }

[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325787
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:03
Start Date: 09/Oct/19 16:03
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333100647
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/SkippingTextInputFormat.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.FileSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.TextInputFormat;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * SkippingInputFormat is a header/footer aware input format. It truncates
+ * splits identified by TextInputFormat. Header and footers are removed
+ * from the splits.
+ */
+public class SkippingTextInputFormat extends TextInputFormat {
+
+  private final Map<Path, Long> startIndexMap = new ConcurrentHashMap<>();
+  private final Map<Path, Long> endIndexMap = new ConcurrentHashMap<>();
+  private JobConf conf;
+  private int headerCount;
+  private int footerCount;
+
+  @Override
+  public void configure(JobConf conf) {
+this.conf = conf;
+super.configure(conf);
+  }
+
+  public void configure(JobConf conf, int headerCount, int footerCount) {
+configure(conf);
+this.headerCount = headerCount;
+this.footerCount = footerCount;
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] hosts) {
+return makeSplitInternal(file, start, length, hosts, null);
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] hosts, String[] inMemoryHosts) {
+return makeSplitInternal(file, start, length, hosts, inMemoryHosts);
+  }
+
+  private FileSplit makeSplitInternal(Path file, long start, long length, String[] hosts, String[] inMemoryHosts) {
+long cachedStart;
+long cachedEnd;
+try {
+  cachedStart = getCachedStartIndex(file);
+  cachedEnd = getCachedEndIndex(file);
+} catch (IOException e) {
+  LOG.warn("Could not detect header/footer", e);
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start + length) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start) {
+  length = length - (cachedStart - start);
+  start = cachedStart;
+}
+if (cachedEnd < start) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedEnd < start + length) {
+  length = cachedEnd - start;
+}
+if (inMemoryHosts == null) {
+  return super.makeSplit(file, start, length, hosts);
+} else {
+  return super.makeSplit(file, start, length, hosts, inMemoryHosts);
+}
+  }
+
+  private long getCachedStartIndex(Path path) throws IOException {
+if (headerCount == 0) {
+  return 0;
+}
+Long startIndexForFile = startIndexMap.get(path);
+if (startIndexForFile == null) {
+  FileSystem fileSystem;
+  FSDataInputStream fis = null;
+  fileSystem = path.getFileSystem(conf);
+  try {
+fis = fileSystem.open(path);
+for (int j = 0; j < headerCount; j++) {
+  if (fis.readLine() == null) {
+startIndexMap.put(path, Long.MAX_VALUE);
+return Long.MAX_VALUE;
+  }
+}
+// back 1 byte because readers skip the entire first row if split start is not 0
+startIndexForFile = fis.getPos() - 1;
+  } finally {
+if (fis != null) {
+  fis.close();
+}
+  }
+  startIndexMap.put(path, startIndexForFile);
+}
+return startIndexForFile;
+  }
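The split-truncation arithmetic in makeSplitInternal above can be isolated into a standalone sketch. This is a hypothetical extraction (class and method names are mine, not Hive's): given a raw split [start, start+length) and the cached data region [cachedStart, cachedEnd) that excludes the header and footer bytes, it returns the clamped split, or null when the split falls entirely inside the skipped region.

```java
public class SplitClamp {
    // Clamp a raw split to the data region between header and footer.
    // Returns {newStart, newLength}, or null when nothing of the split survives.
    public static long[] clamp(long start, long length, long cachedStart, long cachedEnd) {
        if (cachedStart > start + length) {
            return null; // split lies entirely within the header region
        }
        if (cachedStart > start) {
            length = length - (cachedStart - start); // trim header bytes from the front
            start = cachedStart;
        }
        if (cachedEnd < start) {
            return null; // split lies entirely within the footer region
        }
        if (cachedEnd < start + length) {
            length = cachedEnd - start; // trim footer bytes from the back
        }
        return new long[] {start, length};
    }

    public static void main(String[] args) {
        // split [0,100) with the header ending at byte 10: clamped to [10, 90 bytes)
        long[] r = clamp(0, 100, 10, Long.MAX_VALUE);
        assert r[0] == 10 && r[1] == 90;
        // split [0,100) where the footer starts at byte 80: length clamped to 80
        r = clamp(0, 100, 0, 80);
        assert r[0] == 0 && r[1] == 80;
        // split entirely inside the header: dropped
        assert clamp(0, 50, 60, Long.MAX_VALUE) == null;
    }
}
```

In the real input format the null case corresponds to returning a NullRowsInputFormat.DummyInputSplit for the file.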

[jira] [Work logged] (HIVE-21924) Split text files even if header/footer exists

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21924?focusedWorklogId=325786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325786
 ]

ASF GitHub Bot logged work on HIVE-21924:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:02
Start Date: 09/Oct/19 16:02
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #791: HIVE-21924
URL: https://github.com/apache/hive/pull/791#discussion_r333100168
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/SkippingTextInputFormat.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.FileSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.TextInputFormat;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * SkippingInputFormat is a header/footer aware input format. It truncates
+ * splits identified by TextInputFormat. Header and footers are removed
+ * from the splits.
+ */
+public class SkippingTextInputFormat extends TextInputFormat {
+
+  private final Map<Path, Long> startIndexMap = new ConcurrentHashMap<>();
+  private final Map<Path, Long> endIndexMap = new ConcurrentHashMap<>();
+  private JobConf conf;
+  private int headerCount;
+  private int footerCount;
+
+  @Override
+  public void configure(JobConf conf) {
+this.conf = conf;
+super.configure(conf);
+  }
+
+  public void configure(JobConf conf, int headerCount, int footerCount) {
+configure(conf);
+this.headerCount = headerCount;
+this.footerCount = footerCount;
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] hosts) {
+return makeSplitInternal(file, start, length, hosts, null);
+  }
+
+  @Override
+  protected FileSplit makeSplit(Path file, long start, long length, String[] hosts, String[] inMemoryHosts) {
+return makeSplitInternal(file, start, length, hosts, inMemoryHosts);
+  }
+
+  private FileSplit makeSplitInternal(Path file, long start, long length, String[] hosts, String[] inMemoryHosts) {
+long cachedStart;
+long cachedEnd;
+try {
+  cachedStart = getCachedStartIndex(file);
+  cachedEnd = getCachedEndIndex(file);
+} catch (IOException e) {
+  LOG.warn("Could not detect header/footer", e);
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start + length) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedStart > start) {
+  length = length - (cachedStart - start);
+  start = cachedStart;
+}
+if (cachedEnd < start) {
+  return new NullRowsInputFormat.DummyInputSplit(file);
+}
+if (cachedEnd < start + length) {
+  length = cachedEnd - start;
+}
+if (inMemoryHosts == null) {
+  return super.makeSplit(file, start, length, hosts);
+} else {
+  return super.makeSplit(file, start, length, hosts, inMemoryHosts);
+}
+  }
+
+  private long getCachedStartIndex(Path path) throws IOException {
+if (headerCount == 0) {
+  return 0;
+}
+Long startIndexForFile = startIndexMap.get(path);
+if (startIndexForFile == null) {
+  FileSystem fileSystem;
+  FSDataInputStream fis = null;
+  fileSystem = path.getFileSystem(conf);
+  try {
+fis = fileSystem.open(path);
+for (int j = 0; j < headerCount; j++) {
+  if (fis.readLine() == null) {
+startIndexMap.put(path, Long.MAX_VALUE);
+return Long.MAX_VALUE;
+  }
+}
+// back 1 byte because readers skip the entire first row if split start is not 0
+startIndexForFile = fis.getPos() - 1;
+  } finally {
+if (fis != null) {
+  fis.close();
+}
+  }
+  startIndexMap.put(path, startIndexForFile);
+}
+return startIndexForFile;
+  }

[jira] [Commented] (HIVE-22303) TestObjectStore starts some deadline timers which are never stopped

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947806#comment-16947806
 ] 

Hive QA commented on HIVE-22303:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
15s{color} | {color:blue} standalone-metastore/metastore-server in master has 
170 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 4 new + 42 unchanged - 0 fixed = 46 total (was 42) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 2 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18922/dev-support/hive-personality.sh
 |
| git revision | master / 7ae6756 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18922/yetus/diff-checkstyle-standalone-metastore_metastore-server.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18922/yetus/whitespace-tabs.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18922/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18922/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TestObjectStore starts some deadline timers which are never stopped
> ---
>
> Key: HIVE-22303
> URL: https://issues.apache.org/jira/browse/HIVE-22303
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22303.01.patch, HIVE-22303.01.patch
>
>
> because these timers are not stopped; they may stay there as a threadlocal; 
> and eventually time out since the disarm logic is missing...
> https://github.com/apache/hive/blob/d907dfe68ed84714d62a22e5191efa616eab2b24/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java#L373



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread Alexey Diomin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Diomin updated HIVE-19261:
-
Comment: was deleted

(was: Update patch
 # readability and prevent multiple time assignment to instance variable
 # remove comments about computeIfAbsent, we can't use it here, because 
makeInstance can throw checked exception)

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
> Attachments: HIVE-19261.1.patch
>
>
> In HIVE-16175, upstream made a patch to fix the thread safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized for the cache can be expensive in highly concurrent environment 
> like Spark, as multiple threads need to be synchronized on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread safety issue by making the 
> underlying map of InstanceCache as ConcurrentHashMap. Ideally, we can use 
> atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> While computeIfAbsent is only available on java 8 and java 7 is still 
> supported in Hive,
> we use a pattern to simulate the behavior of computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires java 8.
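The "simulated computeIfAbsent" pattern described in the issue can be sketched as follows. This is a minimal illustration, not the actual Hive patch (class and method names are mine): an unsynchronized fast-path read on a ConcurrentHashMap, with a synchronized slow path that re-checks before calling makeInstance, whose checked exception is the reason ConcurrentHashMap.computeIfAbsent cannot be used directly.

```java
import java.util.concurrent.ConcurrentHashMap;

public class InstanceCacheSketch<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    // Subclasses build the cached value; the checked exception here is what
    // rules out a lambda passed to ConcurrentHashMap.computeIfAbsent.
    protected V makeInstance(K key) throws Exception {
        throw new UnsupportedOperationException("override makeInstance");
    }

    public V retrieve(K key) throws Exception {
        V instance = cache.get(key); // fast path: no lock when the value exists
        if (instance == null) {
            synchronized (this) {
                instance = cache.get(key); // re-check under the lock
                if (instance == null) {
                    instance = makeInstance(key);
                    cache.put(key, instance);
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        InstanceCacheSketch<String, Integer> lengths =
                new InstanceCacheSketch<String, Integer>() {
                    @Override
                    protected Integer makeInstance(String key) {
                        return key.length();
                    }
                };
        try {
            if (lengths.retrieve("avro") != 4) {
                throw new AssertionError("unexpected cached value");
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Concurrent readers of an already-cached key never contend on the monitor; only concurrent first-time builders of a missing key serialize.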



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread Alexey Diomin (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947801#comment-16947801
 ] 

Alexey Diomin commented on HIVE-19261:
--

Update patch
 # readability and prevent multiple time assignment to instance variable
 # remove comments about computeIfAbsent, we can't use it here, because 
makeInstance can throw checked exception

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
> Attachments: HIVE-19261.1.patch
>
>
> In HIVE-16175, upstream made a patch to fix the thread safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized for the cache can be expensive in highly concurrent environment 
> like Spark, as multiple threads need to be synchronized on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread safety issue by making the 
> underlying map of InstanceCache as ConcurrentHashMap. Ideally, we can use 
> atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> While computeIfAbsent is only available on java 8 and java 7 is still 
> supported in Hive,
> we use a pattern to simulate the behavior of computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires java 8.





[jira] [Issue Comment Deleted] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread Alexey Diomin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Diomin updated HIVE-19261:
-
Comment: was deleted

(was: Update patch
 # readability and prevent multiple time assignment to instance variable
 # remove comments about computeIfAbsent, we can't use it here, because 
makeInstance can throw checked exception)

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
> Attachments: HIVE-19261.1.patch
>
>
> In HIVE-16175, upstream made a patch to fix the thread safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized for the cache can be expensive in highly concurrent environment 
> like Spark, as multiple threads need to be synchronized on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread safety issue by making the 
> underlying map of InstanceCache as ConcurrentHashMap. Ideally, we can use 
> atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> While computeIfAbsent is only available on java 8 and java 7 is still 
> supported in Hive,
> we use a pattern to simulate the behavior of computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires java 8.





[jira] [Commented] (HIVE-22038) Fix memory related sideeffects of opening/closing sessions

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947796#comment-16947796
 ] 

Hive QA commented on HIVE-22038:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982568/HIVE-22038.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17516 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.stats.TestStatsUpdaterThread.testQueueingWithThreads 
(batchId=321)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18921/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18921/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18921/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982568 - PreCommit-HIVE-Build

> Fix memory related sideeffects of opening/closing sessions
> --
>
> Key: HIVE-22038
> URL: https://issues.apache.org/jira/browse/HIVE-22038
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22038.01.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch
>
>






[jira] [Commented] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve

2019-10-09 Thread Alexey Diomin (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947795#comment-16947795
 ] 

Alexey Diomin commented on HIVE-19261:
--

Update patch
 # readability and prevent multiple time assignment to instance variable
 # remove comments about computeIfAbsent, we can't use it here, because 
makeInstance can throw checked exception

> Avro SerDe's InstanceCache should not be synchronized on retrieve
> -
>
> Key: HIVE-19261
> URL: https://issues.apache.org/jira/browse/HIVE-19261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Fangshi Li
>Assignee: Fangshi Li
>Priority: Major
> Attachments: HIVE-19261.1.patch
>
>
> In HIVE-16175, upstream made a patch to fix the thread safety issue in 
> AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache 
> synchronized. While it should make InstanceCache thread-safe, making retrieve 
> synchronized for the cache can be expensive in highly concurrent environment 
> like Spark, as multiple threads need to be synchronized on entering the 
> entire retrieve method.
> We are proposing another way to fix this thread safety issue by making the 
> underlying map of InstanceCache as ConcurrentHashMap. Ideally, we can use 
> atomic computeIfAbsent in the retrieve method to avoid synchronizing the 
> entire method.
> While computeIfAbsent is only available on java 8 and java 7 is still 
> supported in Hive,
> we use a pattern to simulate the behavior of computeIfAbsent. In the future, 
> we should move to computeIfAbsent when Hive requires java 8.





[jira] [Updated] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread Ashutosh Bapat (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-22313:
--
Attachment: HIVE-22313.01.patch
Status: Patch Available  (was: Open)

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22313.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22313:
--
Labels: pull-request-available  (was: )

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Work logged] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22313?focusedWorklogId=325751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325751
 ]

ASF GitHub Bot logged work on HIVE-22313:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 15:14
Start Date: 09/Oct/19 15:14
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #806: 
HIVE-22313 : Some of the HMS auth LDAP hive config names do not start with 
"hive."
URL: https://github.com/apache/hive/pull/806
 
 
   @maheshk114 , can you please review the change?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325751)
Remaining Estimate: 0h
Time Spent: 10m

> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22239:
---
Attachment: HIVE-22239.03.patch

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.03.patch, HIVE-22239.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
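The uniform-distribution heuristic described above can be illustrated with a toy function (names are hypothetical, not Hive's API): for a predicate `col < value` with known column range [minValue, maxValue], scale the row count by the fraction of the range the predicate covers, instead of the fixed 1/3 guess.

```java
public class RangeSelectivity {
    // Estimated number of rows satisfying col < value, assuming column values
    // are uniformly distributed across [minValue, maxValue].
    public static long estimateLessThan(long value, long minValue, long maxValue,
                                        long numRows) {
        if (value <= minValue) {
            return 0;        // filter falls below the range: no rows pass
        }
        if (value > maxValue) {
            return numRows;  // filter falls above the range: all rows pass
        }
        // fraction of the value range covered by the predicate
        return Math.round(((double) (value - minValue) / (maxValue - minValue)) * numRows);
    }

    public static void main(String[] args) {
        // half the range selected -> half the rows estimated
        assert estimateLessThan(50, 0, 100, 1000) == 500;
        assert estimateLessThan(0, 0, 100, 1000) == 0;
        assert estimateLessThan(200, 0, 100, 1000) == 1000;
    }
}
```

The out-of-range branches mirror what the optimizer already did; only the in-range branch replaces the 1/3 heuristic.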





[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325750&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325750
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 15:08
Start Date: 09/Oct/19 15:08
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r333070056
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 ##
 @@ -967,13 +979,23 @@ private long evaluateComparator(Statistics stats, AnnotateStatsProcCtx aspCtx, E
   if (minValue > value) {
 return 0;
   }
+  if (uniformWithinRange) {
+// Assuming uniform distribution, we can use the range to calculate
+// new estimate for the number of rows
+return Math.round(((double) (value - minValue) / (maxValue - minValue)) * numRows);
 
 Review comment:
   Good catch. I fixed that in latest patch.
 



Issue Time Tracking
---

Worklog Id: (was: 325750)
Time Spent: 4h 20m  (was: 4h 10m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.





[jira] [Assigned] (HIVE-22313) Some of the HMS auth LDAP hive config names do not start with "hive."

2019-10-09 Thread Ashutosh Bapat (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat reassigned HIVE-22313:
-


> Some of the HMS auth LDAP hive config names do not start with "hive."
> -
>
> Key: HIVE-22313
> URL: https://issues.apache.org/jira/browse/HIVE-22313
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>






[jira] [Commented] (HIVE-22038) Fix memory related sideeffects of opening/closing sessions

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947759#comment-16947759
 ] 

Hive QA commented on HIVE-22038:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
7s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18921/dev-support/hive-personality.sh
 |
| git revision | master / 7ae6756 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18921/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18921/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Fix memory related sideeffects of opening/closing sessions
> --
>
> Key: HIVE-22038
> URL: https://issues.apache.org/jira/browse/HIVE-22038
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22038.01.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch, HIVE-22038.02.patch, 
> HIVE-22038.02.patch, HIVE-22038.02.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22308) Add missing support of Azure Blobstore schemes

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22308:
--
Labels: pull-request-available  (was: )

> Add missing support of Azure Blobstore schemes
> --
>
> Key: HIVE-22308
> URL: https://issues.apache.org/jira/browse/HIVE-22308
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22308.patch
>
>
> Azure has been used as a filesystem for Hive, but its various schemes aren't 
> registered under
> {{HiveConf.HIVE_BLOBSTORE_SUPPORTED_SCHEMES.}}
> Found the list of elements in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemUriSchemes.java
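A minimal sketch of the kind of check such a scheme list feeds into: matching a table location's URI scheme against the configured blobstore schemes. The class, method, and the exact scheme entries below are illustrative assumptions (the Azure entries are taken from the linked FileSystemUriSchemes class), not the committed Hive list.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

public class BlobstoreSchemes {
    // Assumed scheme list: S3 schemes plus the Azure ones the patch proposes.
    static final List<String> SUPPORTED =
        Arrays.asList("s3a", "s3n", "wasb", "wasbs", "abfs", "abfss");

    // Returns true when the location's URI scheme is a known blobstore scheme.
    static boolean isBlobstorePath(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && SUPPORTED.contains(scheme.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(isBlobstorePath(
            "abfs://container@account.dfs.core.windows.net/warehouse")); // true
        System.out.println(isBlobstorePath(
            "hdfs://namenode:8020/warehouse")); // false
    }
}
```

Without the Azure entries in the list, an `abfs://` location would fall through the check exactly as the issue describes.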



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22308) Add missing support of Azure Blobstore schemes

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22308?focusedWorklogId=325734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325734
 ]

ASF GitHub Bot logged work on HIVE-22308:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 14:40
Start Date: 09/Oct/19 14:40
Worklog Time Spent: 10m 
  Work Description: dlavati commented on pull request #805: HIVE-22308 Add 
missing support of Azure Blobstore schemes
URL: https://github.com/apache/hive/pull/805
 
 
   Change-Id: I5e2fe35e5cb3118fe49c2ac3eb27bd185ea7de2a
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325734)
Remaining Estimate: 0h
Time Spent: 10m

> Add missing support of Azure Blobstore schemes
> --
>
> Key: HIVE-22308
> URL: https://issues.apache.org/jira/browse/HIVE-22308
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22308.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Azure has been used as a filesystem for Hive, but its various schemes aren't 
> registered under
> {{HiveConf.HIVE_BLOBSTORE_SUPPORTED_SCHEMES.}}
> Found the list of elements in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemUriSchemes.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-09 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin updated HIVE-22274:

Attachment: HIVE-22274.4.patch

> Upgrade Calcite version to 1.21.0
> -
>
> Key: HIVE-22274
> URL: https://issues.apache.org/jira/browse/HIVE-22274
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
> Attachments: HIVE-22274.1.patch, HIVE-22274.2.patch, 
> HIVE-22274.3.patch, HIVE-22274.4.patch, HIVE-22274.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types

2019-10-09 Thread Marta Kuczora (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947729#comment-16947729
 ] 

Marta Kuczora commented on HIVE-21407:
--

Reattach the patch to rerun the tests.

> Parquet predicate pushdown is not working correctly for char column types
> -
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Attachments: HIVE-21407.2.patch, HIVE-21407.3.patch, 
> HIVE-21407.4.patch, HIVE-21407.5.patch, HIVE-21407.patch
>
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate 
> is not pushed to parquet, so the filtering only happens within Hive. If the 
> parameter is true, the filter is pushed to parquet, but for a char type, the 
> value which is pushed to Parquet will be padded with spaces:
> {noformat}
>   @Override
>   public void setValue(String val, int len) {
> super.setValue(HiveBaseChar.getPaddedValue(val, len), -1);
>   }
> {noformat} 
> So if we have a char(10) column which contains the value "apple" and the 
> where condition looks like 'where c='apple'', the value pushed to Parquet will 
> be 'apple' followed by 5 spaces. But the stored values are not padded, so no 
> rows will be returned from Parquet.
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 
> 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> ++++
> | ppd.c  | ppd.v  | ppd.i  |
> ++++
> ++++
> $ set hive.optimize.index.filter=false; or set 
> hive.optimize.ppd.storage=false;
> $ select * from ppd where c='apple';
> +-+++
> |ppd.c| ppd.v  | ppd.i  |
> +-+++
> | apple   | bee| 1  |
> | apple   | tree   | 2  |
> +-+++
> {noformat}
> The issue surfaced after the fix for 
> [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded 
> upstream. Before the HIVE-21327 fix, setting the parameter 
> 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q 
> test hid this issue.
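The padding mismatch described above can be reproduced in isolation. The `getPaddedValue` method below mirrors what the char writable does for a `char(n)` literal before pushdown; the "stored" string stands in for the unpadded bytes Parquet keeps on disk. Both names are illustrative, not Hive's actual API surface.

```java
public class CharPadMismatch {
    // Pad a value with trailing spaces to the declared char(n) length,
    // as happens to the literal before it is pushed down to Parquet.
    static String getPaddedValue(String val, int len) {
        StringBuilder sb = new StringBuilder(val);
        while (sb.length() < len) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "apple";                     // unpadded, as written by Parquet
        String pushed = getPaddedValue("apple", 10); // "apple" plus 5 trailing spaces
        // The pushed-down literal no longer matches the stored bytes, so the
        // Parquet-side filter rejects every row, as in the repro above.
        System.out.println(pushed.equals(stored));   // false
        System.out.println(pushed.length());         // 10
    }
}
```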



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types

2019-10-09 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-21407:
-
Attachment: HIVE-21407.5.patch

> Parquet predicate pushdown is not working correctly for char column types
> -
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Attachments: HIVE-21407.2.patch, HIVE-21407.3.patch, 
> HIVE-21407.4.patch, HIVE-21407.5.patch, HIVE-21407.patch
>
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate 
> is not pushed to parquet, so the filtering only happens within Hive. If the 
> parameter is true, the filter is pushed to parquet, but for a char type, the 
> value which is pushed to Parquet will be padded with spaces:
> {noformat}
>   @Override
>   public void setValue(String val, int len) {
> super.setValue(HiveBaseChar.getPaddedValue(val, len), -1);
>   }
> {noformat} 
> So if we have a char(10) column which contains the value "apple" and the 
> where condition looks like 'where c='apple'', the value pushed to Parquet will 
> be 'apple' followed by 5 spaces. But the stored values are not padded, so no 
> rows will be returned from Parquet.
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 
> 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> ++++
> | ppd.c  | ppd.v  | ppd.i  |
> ++++
> ++++
> $ set hive.optimize.index.filter=false; or set 
> hive.optimize.ppd.storage=false;
> $ select * from ppd where c='apple';
> +-+++
> |ppd.c| ppd.v  | ppd.i  |
> +-+++
> | apple   | bee| 1  |
> | apple   | tree   | 2  |
> +-+++
> {noformat}
> The issue surfaced after the fix for 
> [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded 
> upstream. Before the HIVE-21327 fix, setting the parameter 
> 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q 
> test hid this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22308) Add missing support of Azure Blobstore schemes

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947712#comment-16947712
 ] 

Hive QA commented on HIVE-22308:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982562/HIVE-22308.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17516 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18920/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18920/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18920/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982562 - PreCommit-HIVE-Build

> Add missing support of Azure Blobstore schemes
> --
>
> Key: HIVE-22308
> URL: https://issues.apache.org/jira/browse/HIVE-22308
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
> Attachments: HIVE-22308.patch
>
>
> Azure has been used as a filesystem for Hive, but its various schemes aren't 
> registered under
> {{HiveConf.HIVE_BLOBSTORE_SUPPORTED_SCHEMES.}}
> Found the list of elements in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemUriSchemes.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325714
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 14:04
Start Date: 09/Oct/19 14:04
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r333032465
 
 

 ##
 File path: ql/src/test/results/clientpositive/llap/subquery_select.q.out
 ##
 @@ -3918,14 +3918,14 @@ STAGE PLANS:
   Statistics: Num rows: 26 Data size: 208 Basic stats: 
COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: p_partkey BETWEEN 1 AND 2 (type: 
boolean)
-Statistics: Num rows: 8 Data size: 64 Basic stats: 
COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
Column stats: COMPLETE
 Select Operator
   expressions: p_size (type: int)
   outputColumnNames: p_size
-  Statistics: Num rows: 8 Data size: 64 Basic stats: 
COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE
   Group By Operator
 aggregations: max(p_size)
-minReductionHashAggr: 0.875
+minReductionHashAggr: 0.0
 
 Review comment:
   Range is `15103`-`195606` for `p_partkey` column, out of 26 rows. Hence, the 
estimate of `1` seems right.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325714)
Time Spent: 4h 10m  (was: 4h)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
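The uniform-distribution heuristic described above can be sketched as a small selectivity function. The names are illustrative stand-ins, not Hive's actual StatsUtils API; the fallback for a degenerate range is an assumption.

```java
public class RangeSelectivity {
    /**
     * Fraction of rows expected to pass a predicate lo <= col <= hi,
     * assuming column values are uniformly distributed in [min, max].
     */
    public static double estimate(double min, double max, double lo, double hi) {
        if (max <= min) {
            return 1.0; // degenerate stats range: assume nothing is filtered
        }
        // Length of the overlap between the predicate range and the data range.
        double overlap = Math.min(hi, max) - Math.max(lo, min);
        if (overlap < 0) {
            return 0.0; // predicate falls entirely outside [min, max]
        }
        return overlap / (max - min);
    }

    public static void main(String[] args) {
        // p_partkey in [15103, 195606] with predicate BETWEEN 1 AND 2 (as in the
        // subquery_select.q.out discussion): no overlap, so the row estimate
        // collapses toward 1 instead of the old flat 1/3 heuristic.
        System.out.println(estimate(15103, 195606, 1, 2)); // 0.0
        // A predicate covering half the data range selects ~50% of rows.
        System.out.println(estimate(0, 100, 0, 50));       // 0.5
    }
}
```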



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22312) MapJoinCounterHook doesnot work for tez

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947667#comment-16947667
 ] 

Jesus Camacho Rodriguez commented on HIVE-22312:


[~pulkits], thanks. I think the approach you described makes sense.

> MapJoinCounterHook doesnot work for tez
> ---
>
> Key: HIVE-22312
> URL: https://issues.apache.org/jira/browse/HIVE-22312
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions
>Reporter: Pulkit Sharma
>Priority: Major
>
> In [HIVE-1792|https://issues.apache.org/jira/browse/HIVE-1792], the 
> MapJoinCounterHook hook was added to track joins that get converted to map 
> joins. This hook gets the list of Tasks from hookContext and checks the Tag 
> associated with each task. For mr, we create Conditional tasks in case of 
> joins and add tags for the respective join conversions. This does not work 
> for tez, as we only create a TezTask (no Conditional Task is created), which 
> can handle multiple joins, in contrast to one Conditional Task per join in mr.
> The current approach will fail even if we add a tag to the TezTask, as it can 
> have multiple joins of the same type, each of which would require a counter.
> One possible solution for tez is to parse the query plan after query 
> completion, which we get from hookContext, to obtain the work graph. Using the 
> work graph, we can walk through the Operator Tree to find join conversions.
>  If this approach looks good, I can raise a Pull Request
> cc [~ashutoshc] [~jcamachorodriguez] [~pxiong] 
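The proposed post-completion walk can be sketched as a plain tree traversal that counts converted joins. `Op` is a hypothetical stand-in for Hive's Operator hierarchy; a real implementation would start from the work graph recovered from the hook context rather than a hand-built tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class JoinConversionCounter {
    // Hypothetical minimal operator node: a type label plus children.
    static final class Op {
        final String type;                 // e.g. "MAPJOIN", "FILTER", "TS"
        final List<Op> children = new ArrayList<>();
        Op(String type) { this.type = type; }
        Op child(Op c) { children.add(c); return this; }
    }

    // Iterative DFS over the operator tree, bumping a counter per map join.
    static int countMapJoins(Op root) {
        int count = 0;
        Deque<Op> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if ("MAPJOIN".equals(op.type)) {
                count++;
            }
            stack.addAll(op.children);
        }
        return count;
    }

    public static void main(String[] args) {
        // A single Tez vertex can contain several converted joins, which is
        // exactly why a single per-task tag is not enough.
        Op plan = new Op("TS").child(new Op("MAPJOIN").child(new Op("MAPJOIN")));
        System.out.println(countMapJoins(plan)); // 2
    }
}
```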



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325679
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 13:10
Start Date: 09/Oct/19 13:10
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r333003637
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -856,8 +856,15 @@ public static ColStatistics 
getColStatistics(ColumnStatisticsObj cso, String tab
 } else if (colTypeLowerCase.equals(serdeConstants.BINARY_TYPE_NAME)) {
   cs.setAvgColLen(csd.getBinaryStats().getAvgColLen());
   cs.setNumNulls(csd.getBinaryStats().getNumNulls());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
 
 Review comment:
   I think it is a good idea and we are not in a hurry... Let's do the right 
thing.
   I have created https://issues.apache.org/jira/browse/HIVE-22311.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325679)
Time Spent: 4h  (was: 3h 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22311) Propagate min/max column values from statistics to the optimizer for timestamp type

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-22311:
--


> Propagate min/max column values from statistics to the optimizer for 
> timestamp type
> ---
>
> Key: HIVE-22311
> URL: https://issues.apache.org/jira/browse/HIVE-22311
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently stats annotation does not consider timestamp type e.g. for 
> estimates with range predicates.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22239:
---
Description: Currently, min/max values for columns are only used to 
determine whether a certain range filter falls out of range and thus filters 
all rows or none at all. If it does not, we just use a heuristic that the 
condition will filter 1/3 of the input rows. Instead of using that heuristic, 
we can use another one that assumes that data will be uniformly distributed 
across that range, and calculate the selectivity for the condition accordingly. 
 (was: Currently, min/max values for columns are only used to determine whether 
a certain range filter falls out of range and thus filters all rows or none at 
all. If it does not, we just use a heuristic that the condition will filter 1/3 
of the input rows. Instead of using that heuristic, we can use another one that 
assumes that data will be uniformly distributed across that range, and 
calculate the selectivity for the condition accordingly.

This patch also includes the propagation of min/max column values from 
statistics to the optimizer for timestamp type.)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22235?focusedWorklogId=325678&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325678
 ]

ASF GitHub Bot logged work on HIVE-22235:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 13:04
Start Date: 09/Oct/19 13:04
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #784: 
HIVE-22235 CommandProcessorResponse should not be an exception
URL: https://github.com/apache/hive/pull/784
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325678)
Time Spent: 1h 10m  (was: 1h)

> CommandProcessorResponse should not be an exception
> ---
>
> Key: HIVE-22235
> URL: https://issues.apache.org/jira/browse/HIVE-22235
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22235.01.patch, HIVE-22235.02.patch, 
> HIVE-22235.03.patch, HIVE-22235.04.patch, HIVE-22235.05.patch, 
> HIVE-22235.06.patch, HIVE-22235.07.patch, HIVE-22235.08.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The CommandProcessorResponse class extends Exception. This may be convenient, 
> but it is wrong, as a response is not an exception.
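The separation the sub-task argues for can be sketched as follows: the response stays a plain value object, and failures travel through a dedicated exception type that merely carries a response. All class and field names here are illustrative, not Hive's actual refactoring.

```java
public class ResponseVsException {
    // Plain value object: a response is data, not a throwable.
    static final class CommandResponse {
        final int code;
        final String message;
        CommandResponse(int code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    // Failures get their own exception type that wraps the failed response.
    static final class CommandException extends RuntimeException {
        final CommandResponse response;
        CommandException(CommandResponse response) {
            super(response.message);
            this.response = response;
        }
    }

    static CommandResponse run(boolean ok) {
        if (!ok) {
            throw new CommandException(new CommandResponse(1, "FAILED"));
        }
        return new CommandResponse(0, "OK"); // success path returns, never throws
    }

    public static void main(String[] args) {
        System.out.println(run(true).code); // 0
        try {
            run(false);
        } catch (CommandException e) {
            System.out.println(e.response.code); // 1
        }
    }
}
```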



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-09 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22235:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> CommandProcessorResponse should not be an exception
> ---
>
> Key: HIVE-22235
> URL: https://issues.apache.org/jira/browse/HIVE-22235
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22235.01.patch, HIVE-22235.02.patch, 
> HIVE-22235.03.patch, HIVE-22235.04.patch, HIVE-22235.05.patch, 
> HIVE-22235.06.patch, HIVE-22235.07.patch, HIVE-22235.08.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The CommandProcessorResponse class extends Exception. This may be convenient, 
> but it is wrong, as a response is not an exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22308) Add missing support of Azure Blobstore schemes

2019-10-09 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947636#comment-16947636
 ] 

Hive QA commented on HIVE-22308:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18920/dev-support/hive-personality.sh
 |
| git revision | master / 8bcf7b9a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18920/yetus/patch-asflicense-problems.txt
 |
| modules | C: common U: common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18920/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Add missing support of Azure Blobstore schemes
> --
>
> Key: HIVE-22308
> URL: https://issues.apache.org/jira/browse/HIVE-22308
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
> Attachments: HIVE-22308.patch
>
>
> Azure has been used as a filesystem for Hive, but its various schemes aren't 
> registered under
> {{HiveConf.HIVE_BLOBSTORE_SUPPORTED_SCHEMES.}}
> Found the list of elements in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemUriSchemes.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22303) TestObjectStore starts some deadline timers which are never stopped

2019-10-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22303:

Attachment: HIVE-22303.01.patch

> TestObjectStore starts some deadline timers which are never stopped
> ---
>
> Key: HIVE-22303
> URL: https://issues.apache.org/jira/browse/HIVE-22303
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22303.01.patch, HIVE-22303.01.patch
>
>
> Because these timers are never stopped, they may linger as a threadlocal and 
> eventually time out, since the disarm logic is missing...
> https://github.com/apache/hive/blob/d907dfe68ed84714d62a22e5191efa616eab2b24/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java#L373
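The disarm pattern the report asks for amounts to clearing the thread-local in a finally block, so a timer started by a test never outlives it. `Deadline` semantics are approximated here with a hypothetical `DeadlineSketch` class; the real metastore class differs.

```java
public class DeadlineSketch {
    // One deadline per thread, as in the metastore's Deadline timer.
    private static final ThreadLocal<Long> DEADLINE = new ThreadLocal<>();

    static void startTimer(long timeoutMs) {
        DEADLINE.set(System.currentTimeMillis() + timeoutMs);
    }

    static void stopTimer() {
        DEADLINE.remove(); // the missing "disarm" step from the issue
    }

    static boolean armed() {
        return DEADLINE.get() != null;
    }

    // Wrap a body so the timer is always disarmed, even on exceptions.
    static void runWithDeadline(long timeoutMs, Runnable body) {
        startTimer(timeoutMs);
        try {
            body.run();
        } finally {
            stopTimer();
        }
    }

    public static void main(String[] args) {
        runWithDeadline(1000, () -> { /* test body */ });
        System.out.println(armed()); // false: no timer left in the threadlocal
    }
}
```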



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325677&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325677
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 09/Oct/19 12:52
Start Date: 09/Oct/19 12:52
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332980654
 
 

 ##
 File path: 
ql/src/test/results/clientpositive/llap/retry_failure_stat_changes.q.out
 ##
 @@ -139,25 +139,25 @@ Stage-0
   PARTITION_ONLY_SHUFFLE [RS_12]
 Group By Operator [GBY_11] (rows=1/1 width=8)
   Output:["_col0"],aggregations:["sum(_col0)"]
-  Select Operator [SEL_9] (rows=1/3 width=8)
+  Select Operator [SEL_9] (rows=4/3 width=8)
 Output:["_col0"]
-Merge Join Operator [MERGEJOIN_30] (rows=1/3 width=8)
+Merge Join Operator [MERGEJOIN_30] (rows=4/3 width=8)
   Conds:RS_6._col0=RS_7._col0(Inner),Output:["_col0","_col1"]
 <-Map 1 [SIMPLE_EDGE] llap
   SHUFFLE [RS_6]
 PartitionCols:_col0
-Select Operator [SEL_2] (rows=1/5 width=4)
+Select Operator [SEL_2] (rows=7/5 width=4)
   Output:["_col0"]
-  Filter Operator [FIL_18] (rows=1/5 width=4)
+  Filter Operator [FIL_18] (rows=7/5 width=4)
 
 Review comment:
   Since it was by design, I have disabled the uniform stats for this specific 
test.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325677)
Time Spent: 3h 50m  (was: 3h 40m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls outside the range and thus filters all rows or 
> none at all. If it does not, we just use a heuristic that the condition will 
> filter 1/3 of the input rows. Instead, we can assume that data is uniformly 
> distributed across that range and calculate the selectivity for the 
> condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.
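Under the uniform-distribution assumption, the selectivity of a range predicate is just the fraction of the column's [min, max] interval that the predicate keeps. The sketch below illustrates the arithmetic for a `col <= bound` filter; the class and method names are illustrative, not Hive's actual optimizer API.

```java
public class RangeSelectivity {
    // Selectivity of "col <= bound" for a column with stats [min, max],
    // assuming values are uniformly distributed across that range.
    static double lessThanOrEqualSelectivity(double min, double max, double bound) {
        if (bound < min) {
            return 0.0;   // filter falls below the range: keeps no rows
        }
        if (bound >= max) {
            return 1.0;   // filter covers the whole range: keeps all rows
        }
        return (bound - min) / (max - min);
    }
}
```

For example, `col <= 25` on a column ranging over [0, 100] keeps an estimated quarter of the rows, rather than the fixed 1/3 the old heuristic would report.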





  1   2   >