[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Tristan Stevens (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900854#comment-15900854
 ] 

Tristan Stevens commented on HIVE-11266:


If Hive is still serving results directly from the stats, then with external 
tables it cannot guarantee their accuracy.

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns a wrong count result on an external table with table statistics 
> if I change the table's data files.
> This is the scenario in detail:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in the 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this 
> problem, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table
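
A minimal, runnable sketch of the scenario above, with the workaround at the 
end (the table schema, location, and data format are illustrative placeholders):

{code:sql}
-- 1) external table over files that can change outside of Hive
CREATE EXTERNAL TABLE my_table (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/my_location';

-- 2) persist the row count in the metastore
ANALYZE TABLE my_table COMPUTE STATISTICS;

-- 3) files are now added/removed in /data/my_location outside of Hive

-- 4) answered from the stale metastore stats; no MR job is launched
SELECT count(*) FROM my_table;

-- workaround: force the count to scan the data instead of the stats
SET hive.compute.query.using.stats=false;
SELECT count(*) FROM my_table;
{code}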



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900851#comment-15900851
 ] 

Pengcheng Xiong commented on HIVE-11266:


I see. We have changed a lot since then. This should already be fixed in 
recent Hive versions.

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns a wrong count result on an external table with table statistics 
> if I change the table's data files.
> This is the scenario in detail:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in the 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this 
> problem, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Simone (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900832#comment-15900832
 ] 

Simone commented on HIVE-11266:
---

It was Hive 1.1.0 in the CDH distribution.

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns a wrong count result on an external table with table statistics 
> if I change the table's data files.
> This is the scenario in detail:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in the 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this 
> problem, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15468) Enhance the vectorized execution engine to support complex types

2017-03-07 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi reassigned HIVE-15468:
-

Assignee: Teddy Choi

> Enhance the vectorized execution engine to support complex types
> 
>
> Key: HIVE-15468
> URL: https://issues.apache.org/jira/browse/HIVE-15468
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Chao Sun
>Assignee: Teddy Choi
>
> Currently Hive's vectorized execution engine only supports scalar types, as 
> documented here: 
> https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution.
> To be complete, we should add support for complex types as well.
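
A short illustration of the current limitation (the ORC table and columns are 
hypothetical; {{hive.vectorized.execution.enabled}} is the switch documented 
on the page above):

{code:sql}
SET hive.vectorized.execution.enabled=true;

-- scalar columns only: this scan can run vectorized
SELECT id, amount FROM orders_orc WHERE amount > 100;

-- a complex-typed column forces the row-mode fallback today
SELECT id, line_items   -- line_items ARRAY<STRUCT<sku:STRING, qty:INT>>
FROM orders_orc;
{code}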



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15001) Remove showConnectedUrl from command line help

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900826#comment-15900826
 ] 

Hive QA commented on HIVE-15001:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834160/HIVE-15001.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10332 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4011/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4011/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4011/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834160 - PreCommit-HIVE-Build

> Remove showConnectedUrl from command line help
> --
>
> Key: HIVE-15001
> URL: https://issues.apache.org/jira/browse/HIVE-15001
> Project: Hive
>  Issue Type: Sub-task
>  Components: Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Trivial
> Attachments: HIVE-15001.2.patch, HIVE-15001.3.patch, HIVE-15001.patch
>
>
> As discussed with [~nemon], the showConnectedUrl command-line parameter has 
> not been working since an erroneous merge. Instead, beeline always prints the 
> currently connected URL. Since that is useful for everyone, no extra 
> parameter is needed to turn this feature on.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-10494) hive metastore server can't release its java heap with no work on it

2017-03-07 Thread Zhaofei Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900798#comment-15900798
 ] 

Zhaofei Meng edited comment on HIVE-10494 at 3/8/17 6:40 AM:
-

Try adjusting the JVM parameters and setting CMSInitiatingOccupancyFraction to 
60 (i.e. -XX:CMSInitiatingOccupancyFraction=60).


was (Author: 5feixiang):
Try adjusting the JVM parameters and setting CMSInitiatingOccupancyFraction 
smaller.

> hive metastore server can't release its java heap with no work on it
> 
>
> Key: HIVE-10494
> URL: https://issues.apache.org/jira/browse/HIVE-10494
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
> Environment: cloudera cdh 5.2.0
> 10 nodes
>  128G ram, 10T disk, 32core CPU for each node
> using impala for data analysis
>Reporter: liqida
>
> I use impala for data analysis.
> After running for a long time, impala DDL statements need a long time to 
> complete the "Planning finished" and "DML Metastore update finished" steps.
> Both of them take 50 seconds or more.
> I found that the HMS java heap affected this greatly, and after I restarted 
> the hive metastore server the problem was fixed.
> The HMS java opts look like this:
> -XX:+UseParNewGC 
> -XX:+UseConcMarkSweepGC 
> -XX:-CMSConcurrentMTEnabled
> -XX:CMSInitiatingOccupancyFraction=70 
> -XX:+CMSParallelRemarkEnabled 
> -XX:+UseCMSCompactAtFullCollection 
> -XX:CMSFullGCsBeforeCompaction=0 
> -XX:SurvivorRatio=1
> and the total heap size is 3GB.
> After 3 days or less, I found the old generation was full, and no matter 
> what kind of GC I tried, it never worked.
> Then, after the whole workload was done, I ran "jmap -F -histo PID" and 
> found this:
> Object Histogram:
> num  #instances  #bytes      Class description
> --
> 1:   3955457     696160432   com.mysql.jdbc.JDBC4ResultSet
> 2:   3942714     630834240   com.mysql.jdbc.StatementImpl
> 3:   4051520     194472960   java.util.HashMap
> 4:   4714330     150858560   java.util.HashMap$Entry
> 5:   3990264     63844224    java.util.HashSet
> 6:   3978657     63658512    java.util.HashMap$KeySet
> 7:   3955458     63463696    com.mysql.jdbc.Field[]
> 8:   3964025     63424400    java.util.concurrent.atomic.AtomicBoolean
> 9:   3961293     63380688    java.lang.Object
> I think this is the cause.
> So, what can I do about this? Should I change some configuration or do 
> something to fix it, or does HMS have any cache? Thanks.
> BTW: Hive version 0.13.0, I only use impala.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-10494) hive metastore server can't release its java heap with no work on it

2017-03-07 Thread Zhaofei Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900798#comment-15900798
 ] 

Zhaofei Meng commented on HIVE-10494:
-

Try adjusting the JVM parameters and setting CMSInitiatingOccupancyFraction 
smaller.

> hive metastore server can't release its java heap with no work on it
> 
>
> Key: HIVE-10494
> URL: https://issues.apache.org/jira/browse/HIVE-10494
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
> Environment: cloudera cdh 5.2.0
> 10 nodes
>  128G ram, 10T disk, 32core CPU for each node
> using impala for data analysis
>Reporter: liqida
>
> I use impala for data analysis.
> After running for a long time, impala DDL statements need a long time to 
> complete the "Planning finished" and "DML Metastore update finished" steps.
> Both of them take 50 seconds or more.
> I found that the HMS java heap affected this greatly, and after I restarted 
> the hive metastore server the problem was fixed.
> The HMS java opts look like this:
> -XX:+UseParNewGC 
> -XX:+UseConcMarkSweepGC 
> -XX:-CMSConcurrentMTEnabled
> -XX:CMSInitiatingOccupancyFraction=70 
> -XX:+CMSParallelRemarkEnabled 
> -XX:+UseCMSCompactAtFullCollection 
> -XX:CMSFullGCsBeforeCompaction=0 
> -XX:SurvivorRatio=1
> and the total heap size is 3GB.
> After 3 days or less, I found the old generation was full, and no matter 
> what kind of GC I tried, it never worked.
> Then, after the whole workload was done, I ran "jmap -F -histo PID" and 
> found this:
> Object Histogram:
> num  #instances  #bytes      Class description
> --
> 1:   3955457     696160432   com.mysql.jdbc.JDBC4ResultSet
> 2:   3942714     630834240   com.mysql.jdbc.StatementImpl
> 3:   4051520     194472960   java.util.HashMap
> 4:   4714330     150858560   java.util.HashMap$Entry
> 5:   3990264     63844224    java.util.HashSet
> 6:   3978657     63658512    java.util.HashMap$KeySet
> 7:   3955458     63463696    com.mysql.jdbc.Field[]
> 8:   3964025     63424400    java.util.concurrent.atomic.AtomicBoolean
> 9:   3961293     63380688    java.lang.Object
> I think this is the cause.
> So, what can I do about this? Should I change some configuration or do 
> something to fix it, or does HMS have any cache? Thanks.
> BTW: Hive version 0.13.0, I only use impala.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16123) Let user pick the granularity of bucketing and max in row memory

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900778#comment-15900778
 ] 

Hive QA commented on HIVE-16123:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1285/HIVE-16123.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10332 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4010/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4010/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4010/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1285 - PreCommit-HIVE-Build

> Let user pick the granularity of bucketing and max in row memory
> 
>
> Key: HIVE-16123
> URL: https://issues.apache.org/jira/browse/HIVE-16123
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16123.2.patch, HIVE-16123.patch
>
>
> Currently we index the data with a granularity of NONE, which puts a lot of 
> pressure on the indexer.
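
A sketch of what user-picked granularity could look like at table-creation 
time, assuming the {{druid.segment.granularity}} and {{druid.query.granularity}} 
table properties of the Druid storage handler (names and values are 
illustrative):

{code:sql}
CREATE TABLE druid_events
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "DAY",    -- segment bucketing, instead of NONE
  "druid.query.granularity"   = "MINUTE"  -- finest granularity kept per row
)
AS
SELECT CAST(ts AS timestamp) AS `__time`, page, added
FROM events_src;
{code}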



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Status: Patch Available  (was: Open)

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch, 
> HIVE-15903.03.patch, HIVE-15903.04.patch, HIVE-15903.05.patch, 
> HIVE-15903.06.patch, HIVE-15903.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Status: Open  (was: Patch Available)

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch, 
> HIVE-15903.03.patch, HIVE-15903.04.patch, HIVE-15903.05.patch, 
> HIVE-15903.06.patch, HIVE-15903.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Attachment: HIVE-15903.07.patch

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch, 
> HIVE-15903.03.patch, HIVE-15903.04.patch, HIVE-15903.05.patch, 
> HIVE-15903.06.patch, HIVE-15903.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900728#comment-15900728
 ] 

Rui Li commented on HIVE-16071:
---

Hi [~xuefuz], let me summarise my point: we're talking about two issues here, 
detecting the disconnection and reacting to it. I think the root cause of your 
example is that we don't react properly (i.e. we don't fail the future) on 
disconnection.
Regarding detecting the disconnection, I suppose we can rely on netty. The 
cancelTask is a kind of further insurance in case netty fails (or takes too 
long) to detect it.
bq. let cancelTask fail the Future so that Hive stops waiting
As I mentioned in my proposal, I think SaslHandler is in a better place to do 
this. SaslHandler is intended for the SASL handshake, and it removes itself 
from the pipeline once the handshake finishes. Therefore, if SaslHandler 
detects a disconnection, it means the channel was closed before the handshake 
finished, and thus we should fail the Future. Do you think it makes sense to 
open another JIRA for this?

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000ms) used by the remote 
> driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}
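
Until that is fixed, a session-level mitigation is to give the handshake more 
headroom by raising the client-side timeout (the 30s/90s values below are 
arbitrary examples):

{code:sql}
-- default is 1000ms, which the remote driver also (mis)uses for the handshake
SET hive.spark.client.connect.timeout=30000ms;
-- timeout the RPC server applies while waiting for the remote driver
SET hive.spark.client.server.connect.timeout=90000ms;
{code}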



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14550) HiveServer2: enable ThriftJDBCBinarySerde use by default

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900724#comment-15900724
 ] 

Hive QA commented on HIVE-14550:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856658/HIVE-14550.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10332 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable (batchId=216)
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData (batchId=216)
org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant (batchId=216)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testEscapedStrings (batchId=218)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=218)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testNonAsciiStrings (batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4009/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4009/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4009/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856658 - PreCommit-HIVE-Build

> HiveServer2: enable ThriftJDBCBinarySerde use by default
> 
>
> Key: HIVE-14550
> URL: https://issues.apache.org/jira/browse/HIVE-14550
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Ziyang Zhao
> Attachments: HIVE-14550.1.patch, HIVE-14550.1.patch, 
> HIVE-14550.2.patch
>
>
> We've covered all items in HIVE-12427 and created HIVE-14549 for part 2 of 
> the effort. Before closing the umbrella jira, we should enable this feature 
> by default.
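
For reference, a sketch of flipping the feature on for a session, assuming the 
flag introduced under the HIVE-12427 work (verify the property name against 
your build):

{code:sql}
-- serialize result sets in tasks using ThriftJDBCBinarySerde
SET hive.server2.thrift.resultset.serialize.in.tasks=true;
{code}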



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-03-07 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900706#comment-15900706
 ] 

Sankar Hariappan commented on HIVE-16006:
-

Thank you [~sushanth] for the commit!

> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Fix For: 2.2.0
>
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch, 
> HIVE-16006.03.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. INSERT INTO default.tbl values (10, 20);
> 2. REPL DUMP default FROM 52;
> 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.
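
With the fix in place, a quick way to verify the behavior described in the 
steps above (a sketch; the dump path is the one from step 3):

{code:sql}
REPL LOAD replDb FROM '/tmp/dump/1487588522621';

-- the inserted row should appear in the target database...
SELECT * FROM replDb.tbl;
-- ...while default.tbl is left untouched by the load
{code}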



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900703#comment-15900703
 ] 

Xuefu Zhang commented on HIVE-16071:


Hi [~lirui], thanks for your input and experiment. I think we are making some 
progress toward drawing a conclusion.
{quote}
If no SaslMessage is sent, Hive will still wait for 
hive.spark.client.server.connect.timeout, even if cancelTask closes the channel 
after 1s.
{quote}
I'm particularly concerned about cases where Hive takes longer than it needs 
to detect a problem and return the error to the user. In this case, Hive 
should know within 1s that the Sasl handshake didn't complete. It doesn't make 
sense to let the user learn of the failure only after 1 hr. (The 1 hr is set 
to accommodate resource availability, not connection establishment.)

{quote}
the cancelTask only closes the channel, it doesn't set failure to the Future.
{quote}
This is a good observation. Is this another bug that we should fix? That is, 
let cancelTask fail the Future so that Hive stops waiting, instead of blocking 
until server.connect.timeout elapses.

Any further thoughts?

[~ctang.ma], to answer your question, I don't think we need another property. 
We should use client.connect.timeout, as it's also used on the driver side. If 
the default value is too low, we can bump it up.


> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000ms) used by the remote 
> driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Attachment: (was: HIVE-12274.patch)

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were put in place by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for the {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond that using 
> "varchar(max)", with the same limitation as MySQL of 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Could these columns not use CLOB-like types, as used for example by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata: {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?
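
As a concrete example, the widening proposed above for a MySQL-backed 
metastore could be applied with DDL along these lines (a sketch only; back up 
the metastore schema first, and use the per-platform equivalents listed above 
for other backends):

{code:sql}
ALTER TABLE COLUMNS_V2   MODIFY TYPE_NAME   MEDIUMTEXT;
ALTER TABLE TABLE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SD_PARAMS    MODIFY PARAM_VALUE MEDIUMTEXT;
{code}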



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Attachment: (was: HIVE-12274.2.patch)

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were put in place by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for the {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond that using 
> "varchar(max)", with the same limitation as MySQL of 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Could these columns not use CLOB-like types, as used for example by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata: {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Attachment: (was: HIVE-12274.patch)

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were put in place by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for the {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond that using 
> "varchar(max)", with the same limitation as MySQL of 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Could these columns not use CLOB-like types, as used for example by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata: {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16115) Stop printing progress info from operation logs with beeline progress bar

2017-03-07 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-16115:
---
Attachment: HIVE-16115.3.patch

Attaching a new patch after rebasing on upstream master, as the merge was 
failing in the previous Apache build.

> Stop printing progress info from operation logs with beeline progress bar
> -
>
> Key: HIVE-16115
> URL: https://issues.apache.org/jira/browse/HIVE-16115
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16115.1.patch, HIVE-16115.2.patch, 
> HIVE-16115.3.patch
>
>
> When the in-place progress bar is enabled, we should not print the progress 
> information via the operation logs.
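
The in-place progress bar this refers to is toggled by a HiveServer2 setting; 
a sketch, assuming the {{hive.server2.in.place.progress}} property:

{code:sql}
-- render an in-place progress bar in beeline instead of streaming log lines
SET hive.server2.in.place.progress=true;
{code}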



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau

2017-03-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887496#comment-15887496
 ] 

Lefty Leverenz edited comment on HIVE-12492 at 3/8/17 4:12 AM:
---

Doc note:  This adds *hive.auto.convert.join.hashtable.max.entries* to 
HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Added a TODOC2.2 label.

Edit (7/Mar/17):  HIVE-16137 changes the default value to 40,000,000 so that's 
what should be documented in the wiki.

Typo alert:  In the parameter description, "does not take affect" should be 
"does not take effect."  This can be corrected in the wiki.


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.auto.convert.join.hashtable.max.entries* to 
HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Added a TODOC2.2 label.

Typo alert:  In the parameter description, "does not take affect" should be 
"does not take effect."  This can be corrected in the wiki.

> MapJoin: 4 million unique integers seems to be a probe plateau
> --
>
> Key: HIVE-12492
> URL: https://issues.apache.org/jira/browse/HIVE-12492
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-12492.01.patch, HIVE-12492.02.patch, 
> HIVE-12492.patch
>
>
> After 4 million keys, the map-join implementation seems to suffer from a 
> performance degradation. 
> The hashtable build & probe time makes this very inefficient, even if the 
> data is very compact (i.e. 2 ints).
> Falling back to the shuffle join or bucket map-join is useful after 2^22 
> items.
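
The threshold added by this change can be tuned per session; a sketch using 
the default that HIVE-16137 later settled on:

{code:sql}
-- skip the map-join conversion once the estimated key count exceeds this
SET hive.auto.convert.join.hashtable.max.entries=40000000;
{code}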



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16137) Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m

2017-03-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900666#comment-15900666
 ] 

Lefty Leverenz commented on HIVE-16137:
---

Doc note:  This changes the default value of 
*hive.auto.convert.join.hashtable.max.entries*, which was created by HIVE-12492 
(also for release 2.2.0) and is not yet documented in the wiki.  I'll update 
the doc note on HIVE-12492.

Added a TODOC2.2 label.

> Default value of hive config hive.auto.convert.join.hashtable.max.entries 
> should be set to 40m instead of 4m
> 
>
> Key: HIVE-16137
> URL: https://issues.apache.org/jira/browse/HIVE-16137
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-16137.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16137) Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m

2017-03-07 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16137:
--
Labels: TODOC2.2  (was: )

> Default value of hive config hive.auto.convert.join.hashtable.max.entries 
> should be set to 40m instead of 4m
> 
>
> Key: HIVE-16137
> URL: https://issues.apache.org/jira/browse/HIVE-16137
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-16137.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-03-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900650#comment-15900650
 ] 

Lefty Leverenz commented on HIVE-16064:
---

Doc note:  This should be documented in the wiki, with version information.

* [LanguageManualSelect -- ALL and DISTINCT Clauses | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-ALLandDISTINCTClauses]
* [Hive Operators and UDFs -- Built-in Aggregate Functions (UDAF) | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]

Added a TODOC2.2 label.

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch
>
>
> SQL:2011 allows ALL with aggregate functions, which is equivalent to the 
> aggregate function without ALL.
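
Concretely, the two forms below must produce identical results once ALL is 
accepted (a sketch with a hypothetical table {{t}}):

{code:sql}
SELECT count(ALL c1), sum(ALL c2) FROM t;
-- equivalent to the unqualified form:
SELECT count(c1), sum(c2) FROM t;
{code}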



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-13567) Auto-gather column stats - phase 2

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900646#comment-15900646
 ] 

Hive QA commented on HIVE-13567:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856659/HIVE-13567.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 393 failed/errored test(s), 10332 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=220)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert]
 (batchId=220)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition]
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join14] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join17] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19_inclause] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join1] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join26] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join2] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join3] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join4] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join5] (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join6] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join9] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_13] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[binary_output_format] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket1] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket2] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket3] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark1] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] 
(batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark3] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin13] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin5] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative2] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_1]
 (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_3]
 (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_4]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_5]
 (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_8]
 (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[case_sensitivity] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cast1] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby]
 (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join17] 
(batchId=24)

[jira] [Commented] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-03-07 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900642#comment-15900642
 ] 

Sushanth Sowmyan commented on HIVE-16006:
-

(I forgot to mention explicitly in my previous comment: this patch has my +1. 
:) )

> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Fix For: 2.2.0
>
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch, 
> HIVE-16006.03.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. INSERT INTO default.tbl values (10, 20);
> 2. REPL DUMP default FROM 52;
> 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16066) NPE in ExplainTask

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-16066.

Resolution: Fixed

> NPE in ExplainTask
> --
>
> Key: HIVE-16066
> URL: https://issues.apache.org/jira/browse/HIVE-16066
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
>Priority: Minor
>
> {noformat}
> 2017-02-28T20:05:13,412  WARN [ATS Logger 0] hooks.ATSHook: Failed to submit 
> plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:251) 
> 

[jira] [Assigned] (HIVE-16066) NPE in ExplainTask

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16066:
--

Assignee: Pengcheng Xiong

> NPE in ExplainTask
> --
>
> Key: HIVE-16066
> URL: https://issues.apache.org/jira/browse/HIVE-16066
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Pengcheng Xiong
>Priority: Minor
>
> {noformat}
> 2017-02-28T20:05:13,412  WARN [ATS Logger 0] hooks.ATSHook: Failed to submit 
> plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:251) 
> 

[jira] [Updated] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-03-07 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-16006:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks, [~sankarh]

> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Fix For: 2.2.0
>
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch, 
> HIVE-16006.03.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. INSERT INTO default.tbl values (10, 20);
> 2. REPL DUMP default FROM 52;
> 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Status: Patch Available  (was: Open)

I am attaching a new patch that adds a couple more fixes: one to 
MetaStoreUtils, where it checks the length of the column type name; this check 
needs to be removed. The second fix is to the QTestUtil file that bulk-loads 
some data into Derby.

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, 
> HIVE-12274.3.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, 
> HIVE-12274.patch, HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were introduced by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?
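As an aside, the proposed type change is small as DDL. Here is a hedged sketch 
of what the metastore upgrade might look like on MySQL, using the mapping from 
the proposal above (the other databases would use text/CLOB/LONG VARCHAR per 
that same mapping; this is illustrative, not the actual upgrade script):

{code}
ALTER TABLE COLUMNS_V2   MODIFY TYPE_NAME   MEDIUMTEXT;
ALTER TABLE TABLE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SD_PARAMS    MODIFY PARAM_VALUE MEDIUMTEXT;
{code}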



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Attachment: HIVE-12274.3.patch

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, 
> HIVE-12274.3.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, 
> HIVE-12274.patch, HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were introduced by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12274:
-
Status: Open  (was: Patch Available)

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, 
> HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were introduced by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900633#comment-15900633
 ] 

Naveen Gangam commented on HIVE-12274:
--

I am currently investigating the test failures above from TestPerfCliDriver. 
All of them seem to stem from differences in the qtest output, and I have 
narrowed down the cause: the expected output was generated with CBO enabled, 
while the actual output is the result of CBO being disabled because the test is 
unable to bulk-load data into the HMS metastore. This bulk loader uses a 
little-known Derby feature to import the data from a text file. Since we are 
changing the type of TABLE_PARAMS.PARAM_VALUE to CLOB, the format of the data 
needs to be different: looking at the code, the CLOB column data needs to be 
separated into its own file, and the original data file needs to carry the 
filename, offset, and data length to read from. This is my understanding based 
on reading the code at:
http://people.apache.org/~kristwaa/jacoco/org.apache.derby.impl.load/ImportLobFile.java.html

I have been able to get past the initial failure, but CBO fails further along 
without a clear message. [~thejas], who can I approach to understand these CBO 
failures? Thanks
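For context, a hedged sketch of the Derby external-LOB import described above, 
based on my reading of the linked ImportLobFile code (the procedure arguments 
and reference format shown are illustrative, and the file names hypothetical):

{code}
-- the main data file would carry a "lobFileName.offset.length/" reference per
-- CLOB cell, e.g.:  1,"transient_lastDdlTime","table_params_lobs.dat.0.10/"
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE_LOBS_FROM_EXTFILE(
    'APP', 'TABLE_PARAMS', 'table_params.dat', null, null, null, 0);
{code}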


> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, 
> HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were introduced by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or 

[jira] [Updated] (HIVE-15212) merge branch into master

2017-03-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Attachment: HIVE-15212.patch

The branch is not ready to merge because the ACID merge is still in progress... 
To make some preliminary progress, I'm attaching the current branch patch to 
see which non-MM (or MM) tests would need to be fixed after fixing all the MM 
issues discovered in HIVE-14990.
cc [~wzheng] fyi

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.patch
>
>
> Filing the JIRA now; accidentally attached the merge patch somewhere, so I 
> will post the test results analysis here. We will re-run the tests here later.
> Relevant q file failures:
> load_dyn_part1, autoColumnStats_2 and _1, escape2, load_dyn_part2, 
> dynpart_sort_opt_vectorization, orc_createas1, combine3, update_tmp_table, 
> delete_where_non_partitioned, delete_where_no_match, update_where_no_match, 
> update_where_non_partitioned, update_all_types
> I suspect many ACID failures are due to the incomplete ACID type patch.
> Also need to revert the pom change from the Spark test pom, which seems to 
> break Spark tests. I had it temporarily to get rid of the long non-Maven 
> download in all cases (there's a separate JIRA for that).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900627#comment-15900627
 ] 

Rui Li commented on HIVE-16071:
---

Hi [~xuefuz], in your example, if the SASL handshake doesn't finish in time, 
the client side will exit after 1s. Even if netty can't detect the 
disconnection immediately, I don't think it takes an hour to detect it. 
Besides, the cancelTask only closes the channel; it doesn't set a failure on 
the Future. Therefore we can't really rely on the cancelTask to stop the 
waiting. My 
proposal is:
# We need to reliably detect disconnection. I think netty is good enough for 
this (maybe with some reasonable delay), but I'm also OK with keeping the 
cancelTask to close the channel ourselves.
# We need to reliably cancel the Future when disconnection is detected. This 
can be done in the SaslHandler, which monitors the channel-inactive event.

I also did some tests to verify this. I modified the client code so that it 
makes the connection but doesn't finish the SASL handshake. I tried two ways to 
do this: in one, the client never sends the SaslMessage; in the other, the 
client sends the SaslMessage and then just exits. The tests were done in 
yarn-cluster mode.
# If no SaslMessage is sent, Hive will still wait for 
{{hive.spark.client.server.connect.timeout}}, even if cancelTask closes the 
channel after 1s.
# If SaslMessage is sent, SaslHandler will detect the disconnection and cancel 
the Future, no matter whether the cancelTask fires or not. Of course, this 
requires netty to detect the disconnection.

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms) used by the remote driver for 
> handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15212) merge branch into master

2017-03-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15212:
---

Assignee: Sergey Shelukhin

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Filing the JIRA now; accidentally attached the merge patch somewhere, so I 
> will post the test results analysis here. We will re-run the tests here later.
> Relevant q file failures:
> load_dyn_part1, autoColumnStats_2 and _1, escape2, load_dyn_part2, 
> dynpart_sort_opt_vectorization, orc_createas1, combine3, update_tmp_table, 
> delete_where_non_partitioned, delete_where_no_match, update_where_no_match, 
> update_where_non_partitioned, update_all_types
> I suspect many ACID failures are due to incomplete ACID type patch.
> Also need to revert the pom change from spark test pom, that seems to break 
> Spark tests. I had it temporarily to get rid of the long non-maven download 
> in all cases (there's a separate JIRA for that)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16038) MM tables: fix (or disable) inferring buckets

2017-03-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-16038.
-
   Resolution: Fixed
Fix Version/s: hive-14535

This would be very easy to fix for a particular MM ID, but there's no guarantee 
that other MM IDs would conform to the inferred buckets, so I added comments 
and warnings and let it continue to fail (by discarding the inferred data, as 
it already does when the job doesn't produce the requisite number of files for 
a partition; see the _dyn_part test).
I suspect similar issues may affect ACID tables and any other nested-directory 
cases (and some overwrites?).

If somebody cares about this feature, it should be easy to fix based on the 
comment added in the patch.
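Since bucket inference is an opt-in feature, a user who hits this limitation 
with MM tables can simply leave it off. A minimal sketch, assuming the usual 
config names (the second setting only matters when inference is on):

{code}
SET hive.exec.infer.bucket.sort=false;
SET hive.exec.infer.bucket.sort.num.buckets.power.two=false;
{code}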

> MM tables: fix (or disable) inferring buckets
> -
>
> Key: HIVE-16038
> URL: https://issues.apache.org/jira/browse/HIVE-16038
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
>
> The following tests on minimr produce diffs if all tables are changed to MM:
> {noformat}
> infer_bucket_sort_dyn_part
> infer_bucket_sort_num_buckets
> infer_bucket_sort_merge
> infer_bucket_sort_reducers_power_two
> {noformat}
> Some of these disable strict checks for bucketing load, which wouldn't work 
> by design; the rest should work. Either that, or we should disable this for 
> MM tables - seems like an obscure feature.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException

2017-03-07 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900619#comment-15900619
 ] 

Vaibhav Gumashta commented on HIVE-16107:
-

[~daijy] [~sushanth] Can you please take a look? Thanks

> JDBC: HttpClient should retry one more time on NoHttpResponseException
> --
>
> Key: HIVE-16107
> URL: https://issues.apache.org/jira/browse/HIVE-16107
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.1, 2.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16107.1.patch
>
>
> Hive's JDBC client in HTTP transport mode doesn't retry on 
> NoHttpResponseException. We've seen the exception thrown to the JDBC end user 
> when Knox is used as the proxy: Knox upgraded its Jetty version, which has a 
> smaller value for the Jetty connector idle timeout and as a result closes the 
> HTTP connection on the server side. The next JDBC query on the client then 
> throws a NoHttpResponseException. Subsequent queries reconnect, but the JDBC 
> driver should ideally handle this by retrying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException

2017-03-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-16107:

Attachment: HIVE-16107.1.patch

> JDBC: HttpClient should retry one more time on NoHttpResponseException
> --
>
> Key: HIVE-16107
> URL: https://issues.apache.org/jira/browse/HIVE-16107
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.1, 2.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16107.1.patch
>
>
> Hive's JDBC client in HTTP transport mode doesn't retry on 
> NoHttpResponseException. We've seen the exception thrown to the JDBC end user 
> when Knox is used as the proxy: Knox upgraded its Jetty version, which has a 
> smaller value for the Jetty connector idle timeout and as a result closes the 
> HTTP connection on the server side. The next JDBC query on the client then 
> throws a NoHttpResponseException. Subsequent queries reconnect, but the JDBC 
> driver should ideally handle this by retrying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException

2017-03-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-16107:

Status: Patch Available  (was: Open)

> JDBC: HttpClient should retry one more time on NoHttpResponseException
> --
>
> Key: HIVE-16107
> URL: https://issues.apache.org/jira/browse/HIVE-16107
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 2.1.1, 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16107.1.patch
>
>
> Hive's JDBC client in HTTP transport mode doesn't retry on 
> NoHttpResponseException. We've seen the exception thrown to the JDBC end user 
> when Knox is used as the proxy: Knox upgraded its Jetty version, which has a 
> smaller value for the Jetty connector idle timeout and as a result closes the 
> HTTP connection on the server side. The next JDBC query on the client then 
> throws a NoHttpResponseException. Subsequent queries reconnect, but the JDBC 
> driver should ideally handle this by retrying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-07 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900607#comment-15900607
 ] 

Chaoyu Tang commented on HIVE-16071:


I agree with [~xuefuz] that we need a timeout for SASL handshaking on the RPC 
server side for the case he raised. This timeout should be shorter than the 
client.server.connect.timeout used by RegisterClient, but ideally I think it 
should be a little longer than the client.connect.timeout used for RemoteDriver 
handshaking, so that we can avoid a handshake timeout initiated by the server, 
given that starting a RemoteDriver is quite expensive. If so, I would suggest 
introducing a new configuration such as hive.spark.rpc.handshake.server.timeout 
and renaming hive.spark.client.connect.timeout to 
hive.spark.rpc.handshake.client.timeout (though it is also used as the socket 
connect timeout on the RemoteDriver side, as now). The 
hive.spark.client.server.connect.timeout could also be renamed to something 
like hive.spark.register.remote.driver.timeout if necessary. What do you guys 
think about it?
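For reference while the renaming is discussed, a sketch of the two existing 
knobs as a user would set them today (the values shown are the defaults as I 
understand them, for illustration only, not recommendations):

{code}
SET hive.spark.client.connect.timeout=1000ms;          -- socket connect; today also (mis)used for the driver-side handshake
SET hive.spark.client.server.connect.timeout=90000ms;  -- used by the RPC server for handshake/registration
{code}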

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms) used by the remote driver for 
> handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900603#comment-15900603
 ] 

Rajesh Balamohan commented on HIVE-16142:
-

Is HIVE-16066 similar to this one? 

> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16142.01.patch
>
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16144) CompactionInfo doesn't have equals/hashCode but used in Set

2017-03-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-16144:
-


> CompactionInfo doesn't have equals/hashCode but used in Set
> ---
>
> Key: HIVE-16144
> URL: https://issues.apache.org/jira/browse/HIVE-16144
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> CompactionTxnHandler.findPotentialCompactions() uses a Set of CompactionInfo 
> objects, but CompactionInfo doesn't have equals/hashCode.
> These should be consistent with CompactionInfo.compareTo().



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15903) Compute table stats when user computes column stats

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900596#comment-15900596
 ] 

Hive QA commented on HIVE-15903:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856657/HIVE-15903.06.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 10332 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_table_stats_orc] 
(batchId=6)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=138)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_stats] 
(batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] 
(batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_table_invalidate_column_stats]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_1]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[column_table_stats]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[drop_partition_with_stats]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[extrapolate_part_stats_partial_ndv]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries_with_filters]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_stats]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_only_null]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_remove_26]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join1]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join2]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join3]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join4]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join5]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction2]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=225)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=120)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4007/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4007/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4007/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856657 - PreCommit-HIVE-Build

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch, 
> HIVE-15903.03.patch, HIVE-15903.04.patch, HIVE-15903.05.patch, 
> HIVE-15903.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Open  (was: Patch Available)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, 
> HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, 
> HIVE-15160.07.patch, HIVE-15160.08.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}
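A stripped-down sketch of the same complaint (table and column names borrowed 
from the query above), which may help when comparing engines:

{code}
-- Hive rejects the ORDER BY on the unselected grouping key; Postgres accepts it
SELECT i_item_desc
FROM item
GROUP BY i_item_id, i_item_desc
ORDER BY i_item_id;
{code}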



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Patch Available  (was: Open)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, 
> HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, 
> HIVE-15160.07.patch, HIVE-15160.08.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Attachment: HIVE-15160.08.patch

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, 
> HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, 
> HIVE-15160.07.patch, HIVE-15160.08.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900574#comment-15900574
 ] 

Sergey Shelukhin commented on HIVE-16104:
-

RB: https://reviews.apache.org/r/57405/ for review, since the whitespace 
changes make the patch too verbose here.

> LLAP: preemption may be too aggressive if the pre-empted task doesn't die 
> immediately
> -
>
> Key: HIVE-16104
> URL: https://issues.apache.org/jira/browse/HIVE-16104
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900571#comment-15900571
 ] 

Sergey Shelukhin edited comment on HIVE-16104 at 3/8/17 2:03 AM:
-

Some things that look like refactoring are not actually refactoring. The lock 
in trySchedule is unnecessary, so I removed it and renamed the method; 
preemption was surrounded by a loop because previously, if the first task in 
the queue was finishable, it would bail without preempting anything even if 
there were more tasks.
I can change updateQueueMetric back to being copy-pasted in 3 places... Also, 
one if statement was refactored because it had lots of repetitive code.
Another method was added because something that was previously called in one 
place is now called in 2 places, and I didn't want to copy-paste it. 


was (Author: sershe):
Some things that look like refactoring are not actually refactoring. The lock 
in trySchedule is unnecessary, so I removed it and renamed the method; 
preemption was surrounded by a loop because previously, if the first task in 
the queue was finishable, it would bail without preempting anything even if 
there were more tasks.
I can change updateQueueMetric back to being copy-pasted in 3 places... Also, 
one if statement was refactored because it had lots of repetitive code.

> LLAP: preemption may be too aggressive if the pre-empted task doesn't die 
> immediately
> -
>
> Key: HIVE-16104
> URL: https://issues.apache.org/jira/browse/HIVE-16104
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900571#comment-15900571
 ] 

Sergey Shelukhin commented on HIVE-16104:
-

Some things that look like refactoring are not actually refactoring. The lock 
in trySchedule is unnecessary, so I removed it and renamed the method; 
preemption was surrounded by a loop because previously, if the first task in 
the queue was finishable, it would bail without preempting anything even if 
there were more tasks.
I can merge updateQueueMetric back into being copy-pasted in 3 places... also, 
one if statement was refactored because it had lots of repetitive code.

> LLAP: preemption may be too aggressive if the pre-empted task doesn't die 
> immediately
> -
>
> Key: HIVE-16104
> URL: https://issues.apache.org/jira/browse/HIVE-16104
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately

2017-03-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900560#comment-15900560
 ] 

Siddharth Seth edited comment on HIVE-16104 at 3/8/17 1:56 AM:
---

Looking. Can you please remove the unnecessary parts of the patch - formatting 
changes, refactored sections, refactored if/else statements? Those make the 
patch difficult to review, and are not required.
More than half the patch seems like a refactor.


was (Author: sseth):
Looking. Can you please remove the unnecessary parts of the patch - formatting 
changes, refactored if/else statements? Those make the patch difficult to 
review, and are not required.

> LLAP: preemption may be too aggressive if the pre-empted task doesn't die 
> immediately
> -
>
> Key: HIVE-16104
> URL: https://issues.apache.org/jira/browse/HIVE-16104
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately

2017-03-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900560#comment-15900560
 ] 

Siddharth Seth commented on HIVE-16104:
---

Looking. Can you please remove the unnecessary parts of the patch - formatting 
changes, refactored if/else statements? Those make the patch difficult to 
review, and are not required.

> LLAP: preemption may be too aggressive if the pre-empted task doesn't die 
> immediately
> -
>
> Key: HIVE-16104
> URL: https://issues.apache.org/jira/browse/HIVE-16104
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16123) Let user pick the granularity of bucketing and max in row memory

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900543#comment-15900543
 ] 

Hive QA commented on HIVE-16123:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1285/HIVE-16123.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10330 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4006/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4006/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4006/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1285 - PreCommit-HIVE-Build

> Let user pick the granularity of bucketing and max in row memory
> 
>
> Key: HIVE-16123
> URL: https://issues.apache.org/jira/browse/HIVE-16123
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16123.2.patch, HIVE-16123.patch
>
>
> Currently we index the data with a granularity of NONE, which puts a lot of 
> pressure on the indexer.
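> A possible usage sketch (the table properties and their names here are 
> assumptions, not the final design):
> {code}
> create external table druid_t (`__time` timestamp, page string, c_added int)
> stored by 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> tblproperties ("druid.segment.granularity" = "DAY");
> {code}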



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16098) Describe table doesn't show stats for partitioned tables

2017-03-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16098:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Describe table doesn't show stats for partitioned tables
> 
>
> Key: HIVE-16098
> URL: https://issues.apache.org/jira/browse/HIVE-16098
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.2.0
>
> Attachments: HIVE-16098.1.patch, HIVE-16098.2.patch, 
> HIVE-16098.3.patch, HIVE-16098.4.patch, HIVE-16098.5.patch, HIVE-16098.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-03-07 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16064:
--
Labels: TODOC2.2  (was: )

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch
>
>
> SQL:2011 allows the set quantifier ALL with aggregate functions; an aggregate 
> with ALL is equivalent to the same aggregate without it.
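> For instance (hypothetical table t), the following two queries are equivalent:
> {code}
> select count(ALL c1) from t;
> select count(c1) from t;
> {code}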



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16102) Grouping sets do not conform to SQL standard

2017-03-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900520#comment-15900520
 ] 

Lefty Leverenz commented on HIVE-16102:
---

Does the wiki need to be updated?  If so, please add a TODOC2.2 label.

* [GroupBy -- Grouping Sets, Cubes, Rollups, and the GROUPING__ID Function | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy#LanguageManualGroupBy-GroupingSets,Cubes,Rollups,andtheGROUPING__IDFunction]
* [Enhanced Aggregation, Cube, Grouping and Rollup | 
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup]

> Grouping sets do not conform to SQL standard
> 
>
> Key: HIVE-16102
> URL: https://issues.apache.org/jira/browse/HIVE-16102
> Project: Hive
>  Issue Type: Bug
>  Components: Operators, Parser
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-16102.01.patch, HIVE-16102.02.patch, 
> HIVE-16102.patch
>
>
> [~ashutoshc] realized that the implementation of GROUPING__ID in Hive was not 
> returning values as specified by SQL standard and other execution engines.
> After digging into this, I found out that the implementation was bogus: 
> internally it was switching between big-endian and little-endian 
> representations of GROUPING__ID inconsistently, and in some cases the 
> conversions in the two directions were cancelling each other out.
> In the documentation in 
> https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup
>  we can already find the problem, even if we did not spot it at first.
> {quote}
> The following query: SELECT key, value, GROUPING__ID, count(\*) from T1 GROUP 
> BY key, value WITH ROLLUP
> will have the following results.
> | NULL | NULL | 0 | 6 |
> | 1 | NULL | 1 | 2 |
> | 1 | NULL | 3 | 1 |
> | 1 | 1 | 3 | 1 |
> ...
> {quote}
> Observe that the value for GROUPING__ID in the first row should be `3`, while 
> for the third and fourth rows it should be `0`.
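> As a sketch of the expected encoding (assuming GROUP BY key, value, with the 
> leftmost grouping column in the most significant bit), each bit of 
> GROUPING__ID is 1 when the corresponding column is aggregated away:
> {code}
> -- grouping set        GROUPING__ID
> -- (key, value)   ->   0   (binary 00)
> -- (key)          ->   1   (binary 01)
> -- ()             ->   3   (binary 11)
> {code}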



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15929) Fix HiveDecimalWritable to be compatible with Hive 2.1

2017-03-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900500#comment-15900500
 ] 

Lefty Leverenz commented on HIVE-15929:
---

Status nudge:  This was committed to master on Feb. 16 with commits 
74c50452c5c644a3898bce2738ee040e625caa01, 
a9c429e637cf366b90a87cc5c1f3c2b4e60ae0c8, and 
e732aa27efec014302af41fb77c0b1c5197c4b90.

[~owen.omalley], please update the status and fix version.

> Fix HiveDecimalWritable to be compatible with Hive 2.1
> --
>
> Key: HIVE-15929
> URL: https://issues.apache.org/jira/browse/HIVE-15929
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-15929.patch
>
>
> HIVE-15335 broke compatibility with Hive 2.1 by making 
> HiveDecimalWritable.getInternalStorage() throw an exception when called on an 
> unset value. It is easy to instead return an empty array, which will allow 
> the old code to allocate a new array.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2017-03-07 Thread TAK LON WU (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900499#comment-15900499
 ] 

TAK LON WU commented on HIVE-12274:
---

+1. Any update on this?

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>  Labels: metastore
> Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, 
> HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, 
> HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were introduced by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
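> As an illustration of this proposal, a sketch for MySQL only (the other 
> databases would need analogous DDL):
> {code}
> ALTER TABLE COLUMNS_V2 MODIFY TYPE_NAME MEDIUMTEXT;
> ALTER TABLE TABLE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
> ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
> ALTER TABLE SD_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
> {code}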
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns.
> * So the limitation could be raised to 32672 bytes, with the caveat that 
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also 
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16124) Drop the segments data as soon it is pushed to HDFS

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900468#comment-15900468
 ] 

Hive QA commented on HIVE-16124:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856653/16124.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10330 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4005/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4005/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4005/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856653 - PreCommit-HIVE-Build

> Drop the segments data as soon it is pushed to HDFS
> ---
>
> Key: HIVE-16124
> URL: https://issues.apache.org/jira/browse/HIVE-16124
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: 16124.patch
>
>
> Drop the pushed segments from the indexer as soon as the HDFS push is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16143) Improve msck repair batching

2017-03-07 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16143:
--


> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Currently, the {{msck repair table}} command batches the partitions created 
> in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. The 
> following snippet shows the batching logic. There are a couple of possible 
> improvements to it:
> {noformat} 
> int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
> if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>   int counter = 0;
>   for (CheckResult.PartitionResult part : partsNotInMs) {
>     counter++;
>     apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>     repairOutput.add("Repair: Added partition to metastore "
>         + msckDesc.getTableName() + ':' + part.getPartitionName());
>     if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>       db.createPartitions(apd);
>       apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>     }
>   }
> } else {
>   for (CheckResult.PartitionResult part : partsNotInMs) {
>     apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>     repairOutput.add("Repair: Added partition to metastore "
>         + msckDesc.getTableName() + ':' + part.getPartitionName());
>   }
>   db.createPartitions(apd);
> }
> // the enclosing try block is elided above; on any failure the code falls
> // back to adding the partitions one by one:
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. It is easily possible 
> that users increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to adding 
> partitions one by one. Users are then expected to determine the tuned batch 
> size which works well for their environment. I think the code could handle 
> this situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were successfully added or 
> not. If we need to fall back to one by one, we should at least skip the ones 
> which we know for sure are already added successfully.
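> For reference, a usage sketch (assuming {{HIVE_MSCK_REPAIR_BATCH_SIZE}} maps 
> to the property name hive.msck.repair.batch.size):
> {code}
> set hive.msck.repair.batch.size=100;
> msck repair table my_table;
> {code}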



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16110) Vectorization: Support 2 Value CASE WHEN instead of fall back to VectorUDFAdaptor

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900422#comment-15900422
 ] 

Sergey Shelukhin commented on HIVE-16110:
-

+1; however, the test has failed (case).

> Vectorization: Support 2 Value CASE WHEN instead of fall back to 
> VectorUDFAdaptor
> -
>
> Key: HIVE-16110
> URL: https://issues.apache.org/jira/browse/HIVE-16110
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16110.01.patch, HIVE-16110.02.patch, 
> HIVE-16110.03.patch
>
>
> Vectorize more queries by converting a GenericUDFWhen that has 2 values that 
> are either a column or a constant into a GenericUDFIf, which has vectorized 
> classes. This eliminates one case, so to speak, where we use the 
> VectorUDFAdaptor.
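> A sketch of the conversion (hypothetical table t):
> {code}
> select case when c1 > 0 then c2 else 0 end from t;  -- 2-value CASE, previously hit the adaptor
> select if(c1 > 0, c2, 0) from t;                    -- equivalent IF form, which vectorizes
> {code}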



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16078) improve abort checking in Tez/LLAP

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900401#comment-15900401
 ] 

Sergey Shelukhin commented on HIVE-16078:
-

It completes for me with PPD disabled (PPD causes a massive slowdown due to 
ORC-148). However, even with PPD enabled I cannot repro the original condition.

> improve abort checking in Tez/LLAP
> --
>
> Key: HIVE-16078
> URL: https://issues.apache.org/jira/browse/HIVE-16078
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16078.01.patch, HIVE-16078.02.patch, 
> HIVE-16078.03.patch, HIVE-16078.patch
>
>
> Sometimes, a fragment can run for a long time after a query fails. It looks 
> from logs like the abort/interrupt were called correctly on the thread, yet 
> the thread hangs around minutes after, doing the below. Other tasks for the 
> same job appear to have exited correctly, after the same abort logic (at 
> least, the same log lines, fwiw)
> {noformat}
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByValue(VectorCopyRow.java:317)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:263)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628)
>   at 
> 

[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900390#comment-15900390
 ] 

Hive QA commented on HIVE-16122:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10331 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4004/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4004/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4004/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build

> NPE Hive Druid split introduced by HIVE-15928
> -
>
> Key: HIVE-16122
> URL: https://issues.apache.org/jira/browse/HIVE-16122
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, 
> HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16098) Describe table doesn't show stats for partitioned tables

2017-03-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900391#comment-15900391
 ] 

Pengcheng Xiong commented on HIVE-16098:


Patch LGTM, +1. I think we can open new JIRAs for the follow-up work.

> Describe table doesn't show stats for partitioned tables
> 
>
> Key: HIVE-16098
> URL: https://issues.apache.org/jira/browse/HIVE-16098
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-16098.1.patch, HIVE-16098.2.patch, 
> HIVE-16098.3.patch, HIVE-16098.4.patch, HIVE-16098.5.patch, HIVE-16098.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900364#comment-15900364
 ] 

Pengcheng Xiong edited comment on HIVE-11266 at 3/7/17 11:13 PM:
-

Hello there, which version of hive are you using? I saw you put a label 1.1.0 
as affected versions. Does that mean that you are using Hive 1.1? Thanks.


was (Author: pxiong):
Hello there, which version of hive are you using? I saw you put a lable 1.1.0 
as affected versions. Does that mean that you are using Hive 1.1? Thanks.

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns wrong count result on an external table with table statistics if 
> I change table data files.
> This is the scenario in details:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, with "hive.compute.query.using.stats" set to FALSE this problem 
> doesn't occur, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is also related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table
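> The workaround mentioned above, as a usage sketch:
> {code}
> set hive.compute.query.using.stats=false;
> select count(*) from my_table;  -- now scans the current files instead of answering from stats
> {code}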



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16131) Hive building with Hadoop 3 - additional stuff broken recently

2017-03-07 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900365#comment-15900365
 ] 

Wei Zheng commented on HIVE-16131:
--

OK I see. +1 pending test

> Hive building with Hadoop 3 - additional stuff broken recently
> --
>
> Key: HIVE-16131
> URL: https://issues.apache.org/jira/browse/HIVE-16131
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16131.01.patch, HIVE-16131.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900364#comment-15900364
 ] 

Pengcheng Xiong commented on HIVE-11266:


Hello there, which version of hive are you using? I saw you put a label 1.1.0 
as affected versions. Does that mean that you are using Hive 1.1? Thanks.

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns wrong count result on an external table with table statistics if 
> I change table data files.
> This is the scenario in details:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, with "hive.compute.query.using.stats" set to FALSE this problem 
> doesn't occur, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is also related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-11266) count(*) wrong result based on table statistics

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-11266:
--

Assignee: Pengcheng Xiong

> count(*) wrong result based on table statistics
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Hive returns wrong count result on an external table with table statistics if 
> I change table data files.
> This is the scenario in details:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on the data files.
> Obviously, with "hive.compute.query.using.stats" set to FALSE this problem 
> doesn't occur, but the default value of this property is TRUE.
> I think that this post on stackoverflow, which shows another type of bug in 
> the case of multiple inserts, is also related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16131) Hive building with Hadoop 3 - additional stuff broken recently

2017-03-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16131:

Attachment: HIVE-16131.01.patch

[~wzheng] that doesn't compile with hadoop2. Fixed for both for now.

> Hive building with Hadoop 3 - additional stuff broken recently
> --
>
> Key: HIVE-16131
> URL: https://issues.apache.org/jira/browse/HIVE-16131
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16131.01.patch, HIVE-16131.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15282) Different modification times are used when an index is built and when its staleness is checked

2017-03-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15282:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Marta!

> Different modification times are used when an index is built and when its 
> staleness is checked
> --
>
> Key: HIVE-15282
> URL: https://issues.apache.org/jira/browse/HIVE-15282
> Project: Hive
>  Issue Type: Bug
>  Components: Indexing
>Affects Versions: 2.2.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Fix For: 2.2.0
>
> Attachments: HIVE-15282.2.patch, HIVE-15282.patch
>
>
> The index_auto_mult_tables and index_auto_mult_tables_compact q tests are 
> failing from time to time with the following error:
> {noformat}
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
> Failing for the past 1 build (Since Failed#16 )
> Took 16 sec.
> Error Message
> Unexpected exception junit.framework.AssertionFailedError: Client Execution 
> results failed with error code = 1
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
>  at junit.framework.Assert.fail(Assert.java:57)
>  at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2001)
>  at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:194)
>  at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables(TestCliDriver.java:142)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>  at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>  at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>  at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
> Stacktrace
> junit.framework.AssertionFailedError: Unexpected exception 
> junit.framework.AssertionFailedError: Client Execution results failed with 
> error code = 1
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
>   at junit.framework.Assert.fail(Assert.java:57)
>   at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2001)
>   at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:194)
>   at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables(TestCliDriver.java:142)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Commented] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-03-07 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900323#comment-15900323
 ] 

Matt McCline commented on HIVE-15857:
-

[~sershe] Thank you for your review!!

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, 
> HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.
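> A sketch of the kind of query affected (hypothetical table t with string 
> column s):
> {code}
> select cast(s as int) from t;  -- UDFToInteger on a string column
> {code}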



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-03-07 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15857:

Status: Patch Available  (was: In Progress)

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, 
> HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-03-07 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15857:

Attachment: HIVE-15857.05.patch

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, 
> HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-03-07 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15857:

Status: In Progress  (was: Patch Available)

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, 
> HIVE-15857.03.patch, HIVE-15857.04.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900312#comment-15900312
 ] 

Hive QA commented on HIVE-16122:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10330 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=186)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4003/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4003/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4003/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build

> NPE Hive Druid split introduced by HIVE-15928
> -
>
> Key: HIVE-16122
> URL: https://issues.apache.org/jira/browse/HIVE-16122
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, 
> HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900313#comment-15900313
 ] 

Ashutosh Chauhan commented on HIVE-16142:
-

+1

> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16142.01.patch
>
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list

2017-03-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-16141.
--
Resolution: Invalid

Closing it as invalid. 

> SQL auth whitelist configs should have a config for appending to existing list
> --
>
> Key: HIVE-16141
> URL: https://issues.apache.org/jira/browse/HIVE-16141
> Project: Hive
>  Issue Type: Bug
>  Components: SQLStandardAuthorization
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>
> SQL auth whitelist configs can be added to 
> hive.security.authorization.sqlstd.confwhitelist, but this will replace the 
> default whitelist patterns. If users want the default whitelist configs and 
> would like to append to them, this can get complicated. We can have a 
> separate config that lets users append to the default whitelist. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list

2017-03-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900288#comment-15900288
 ] 

Prasanth Jayachandran commented on HIVE-16141:
--

This is probably invalid. I can see the append being applied in 
SettableConfigUpdater. 

> SQL auth whitelist configs should have a config for appending to existing list
> --
>
> Key: HIVE-16141
> URL: https://issues.apache.org/jira/browse/HIVE-16141
> Project: Hive
>  Issue Type: Bug
>  Components: SQLStandardAuthorization
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>
> SQL auth whitelist configs can be added to 
> hive.security.authorization.sqlstd.confwhitelist, but this will replace the 
> default whitelist patterns. If users want the default whitelist configs and 
> would like to append to them, this can get complicated. We can have a 
> separate config that lets users append to the default whitelist. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900281#comment-15900281
 ] 

Pengcheng Xiong commented on HIVE-16142:


[~ashutoshc], could you take a look? Thanks.

> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16142.01.patch
>
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16142:
---
Attachment: HIVE-16142.01.patch

> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16142.01.patch
>
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16142:
---
Status: Patch Available  (was: Open)

> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16142.01.patch
>
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16142) ATSHook NPE via LLAP

2017-03-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16142:
--


> ATSHook NPE via LLAP
> 
>
> Key: HIVE-16142
> URL: https://issues.apache.org/jira/browse/HIVE-16142
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Exceptions in the log of the form:
> 2017-03-06T15:42:30,046 WARN  [ATS Logger 0]: hooks.ATSHook 
> (ATSHook.java:run(318)) - Failed to submit to ATS for 
> hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) 
> ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list

2017-03-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900279#comment-15900279
 ] 

Prasanth Jayachandran commented on HIVE-16141:
--

Just noticed: there is actually a config for appending, 
hive.security.authorization.sqlstd.confwhitelist.append, but it is not used in 
HiveConf by default.

> SQL auth whitelist configs should have a config for appending to existing list
> --
>
> Key: HIVE-16141
> URL: https://issues.apache.org/jira/browse/HIVE-16141
> Project: Hive
>  Issue Type: Bug
>  Components: SQLStandardAuthorization
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>
> SQL auth whitelist configs can be added to 
> hive.security.authorization.sqlstd.confwhitelist, but this will replace the 
> default whitelist patterns. If users want the default whitelist configs and 
> would like to append to them, this can get complicated. We can have a 
> separate config that lets users append to the default whitelist. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list

2017-03-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900274#comment-15900274
 ] 

Prasanth Jayachandran commented on HIVE-16141:
--

cc [~thejas] [~sushanth]


> SQL auth whitelist configs should have a config for appending to existing list
> --
>
> Key: HIVE-16141
> URL: https://issues.apache.org/jira/browse/HIVE-16141
> Project: Hive
>  Issue Type: Bug
>  Components: SQLStandardAuthorization
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>
> SQL auth whitelist configs can be added to 
> hive.security.authorization.sqlstd.confwhitelist, but this will replace the 
> default whitelist patterns. If users want the default whitelist configs and 
> would like to append to them, this can get complicated. We can have a 
> separate config that lets users append to the default whitelist. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-03-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16064:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch
>
>
> SQL:2011 allows the set quantifier ALL with aggregate functions; an aggregate 
> with ALL is equivalent to the same aggregate without it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16140) Stabilize few randomly failing tests

2017-03-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-16140:
---


> Stabilize few randomly failing tests
> 
>
> Key: HIVE-16140
> URL: https://issues.apache.org/jira/browse/HIVE-16140
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure, Tests
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> Golden file update for the vector_between_in test and sort_before_diff for a 
> couple of Perf tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16090) Addendum to HIVE-16014

2017-03-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-16090:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~vihangk1]. I committed this to master.

> Addendum to HIVE-16014
> --
>
> Key: HIVE-16090
> URL: https://issues.apache.org/jira/browse/HIVE-16090
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16090.01.patch, HIVE-16090.02.patch, 
> HIVE-16090.03.patch
>
>
> HIVE-16014 changed the HiveMetastoreChecker to use 
> {{METASTORE_FS_HANDLER_THREADS_COUNT}} for the pool size. Some of the tests in 
> TestHiveMetastoreChecker still use {{HIVE_MOVE_FILES_THREAD_COUNT}}, which 
> leads to incorrect test behavior.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-03-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900211#comment-15900211
 ] 

Ashutosh Chauhan commented on HIVE-16064:
-

+1

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch
>
>
> SQL:2011 allows ALL with aggregate functions, which is equivalent to the 
> aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900206#comment-15900206
 ] 

Hive QA commented on HIVE-16122:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10330 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
org.apache.hive.hcatalog.pig.TestRCFileHCatStorer.testWriteDecimalXY 
(batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteSmallint 
(batchId=173)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4002/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4002/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4002/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build

> NPE Hive Druid split introduced by HIVE-15928
> -
>
> Key: HIVE-16122
> URL: https://issues.apache.org/jira/browse/HIVE-16122
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, 
> HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16054) AMReporter should use application token instead of ugi.getCurrentUser

2017-03-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16054:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master

> AMReporter should use application token instead of ugi.getCurrentUser
> -
>
> Key: HIVE-16054
> URL: https://issues.apache.org/jira/browse/HIVE-16054
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-16054.1.patch, HIVE-16054.1.patch
>
>
> During the initial creation of the ugi we use appId, but later we use the 
> user who submitted the request. Although this doesn't matter as long as the 
> job tokens are set correctly, it is good to keep it consistent. 
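
A minimal sketch of the consistent approach, assuming Hadoop's 
UserGroupInformation API (the method and parameter names are illustrative, not 
the actual AMReporter code):

{noformat}
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class AppUgiSketch {
  // Build the ugi from the application id both initially and later, instead
  // of switching to the submitting user; the job token is what actually
  // authenticates the calls.
  static UserGroupInformation ugiForApp(String appId,
      Token<? extends TokenIdentifier> jobToken) {
    UserGroupInformation taskUgi = UserGroupInformation.createRemoteUser(appId);
    taskUgi.addToken(jobToken);
    return taskUgi;
  }
}
{noformat}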



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16136) LLAP: Before SIGKILL, collect diagnostic information before daemon goes down

2017-03-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16136:
-
Summary: LLAP: Before SIGKILL, collect diagnostic information before daemon 
goes down  (was: LLAP: Before SIGKILL and collect diagnostic information before 
daemon goes down)

> LLAP: Before SIGKILL, collect diagnostic information before daemon goes down
> 
>
> Key: HIVE-16136
> URL: https://issues.apache.org/jira/browse/HIVE-16136
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Sometimes daemons can get killed by YARN's pmem monitor, which issues a kill 
> followed by kill -9 after 250ms. This is a really short window in which to 
> collect anything useful. 
> There is no clean way to trap SIGKILL in Java.  
> One option is to increase the time between kill and kill -9 in YARN; during 
> that time we can have a shutdown hook handler collect all diagnostic 
> information like heap dump, jstack, jmx output etc. in a non-container 
> directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900148#comment-15900148
 ] 

Sergey Shelukhin commented on HIVE-15857:
-

+1 with one small comment. There is some repetitive code that is handled less 
repetitively elsewhere in vectorization, iirc (e.g. selectedInUse, where it does 
int index = sIU ? sel[i] : i instead of having 2 separate loops). I am assuming 
this is due to vectorization requirements that they are all handled separately :)
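
For reference, a sketch of the single-loop pattern mentioned above (names are 
illustrative, not the actual patch code):

{noformat}
public class SelectedInUseSketch {
  // One loop body covers both cases: when selectedInUse is true, only the
  // row indices listed in sel[] are live; otherwise rows 0..n-1 are processed.
  static void toLong(long[] out, String[] in,
      boolean selectedInUse, int[] sel, int n) {
    for (int i = 0; i < n; i++) {
      int index = selectedInUse ? sel[i] : i;
      out[index] = Long.parseLong(in[index]); // the string conversion case
    }
  }
}
{noformat}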

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, 
> HIVE-15857.03.patch, HIVE-15857.04.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16090) Addendum to HIVE-16014

2017-03-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900124#comment-15900124
 ] 

Sergio Peña commented on HIVE-16090:


Looks good.
+1

> Addendum to HIVE-16014
> --
>
> Key: HIVE-16090
> URL: https://issues.apache.org/jira/browse/HIVE-16090
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16090.01.patch, HIVE-16090.02.patch, 
> HIVE-16090.03.patch
>
>
> HIVE-16014 changed the HiveMetastoreChecker to use 
> {{METASTORE_FS_HANDLER_THREADS_COUNT}} for the pool size. Some of the tests in 
> TestHiveMetastoreChecker still use {{HIVE_MOVE_FILES_THREAD_COUNT}}, which 
> leads to incorrect test behavior.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16076) LLAP packaging - include aux libs

2017-03-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16076:

Attachment: HIVE-16076.02.patch

Updated the patch.

> LLAP packaging - include aux libs 
> --
>
> Key: HIVE-16076
> URL: https://issues.apache.org/jira/browse/HIVE-16076
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16076.01.patch, HIVE-16076.02.patch, 
> HIVE-16076.patch
>
>
> The old auxlibs (or whatever) should be packaged by default, if present.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16114) NullPointerException in TezSessionPoolManager when getting the session

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900128#comment-15900128
 ] 

Sergey Shelukhin commented on HIVE-16114:
-

testGetNonDefaultSession failure may be related

> NullPointerException in TezSessionPoolManager when getting the session
> --
>
> Key: HIVE-16114
> URL: https://issues.apache.org/jira/browse/HIVE-16114
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
> Attachments: HIVE-16114.1.patch, HIVE-16114.patch
>
>
> Hive version: apache-hive-2.1.1 
> We use Hue (3.11.0) connecting to HiveServer2. When Hue starts up, it works 
> with no problems; after a few hours pass, running the same SQL produces an 
> exception about being unable to execute a TezTask.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16076) LLAP packaging - include aux libs

2017-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900106#comment-15900106
 ] 

Sergey Shelukhin commented on HIVE-16076:
-

[~prasanth_j] ADDEDJARS is actually set via Utilities.getResourceFiles(conf, 
SessionState.ResourceType.JAR). Fixing the rest.

> LLAP packaging - include aux libs 
> --
>
> Key: HIVE-16076
> URL: https://issues.apache.org/jira/browse/HIVE-16076
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16076.01.patch, HIVE-16076.patch
>
>
> The old auxlibs (or whatever) should be packaged by default, if present.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15515) Remove the docs directory

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900103#comment-15900103
 ] 

Hive QA commented on HIVE-15515:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856646/HIVE-15515.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4001/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4001/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4001/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-03-07 20:31:46.522
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-4001/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-03-07 20:31:46.524
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 9368fec HIVE-15920 Implement a blocking version of a command to 
compact (Eugene Koifman, reviewed by Wei Zheng)
+ git clean -f -d
Removing druid-handler/src/test/org/apache/hadoop/hive/druid/io/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 9368fec HIVE-15920 Implement a blocking version of a command to 
compact (Eugene Koifman, reviewed by Wei Zheng)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-03-07 20:31:47.603
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
fatal: git diff header lacks filename information when removing 0 leading 
pathname components (line 523)
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856646 - PreCommit-HIVE-Build

> Remove the docs directory
> -
>
> Key: HIVE-15515
> URL: https://issues.apache.org/jira/browse/HIVE-15515
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Lefty Leverenz
>Assignee: Akira Ajisaka
> Attachments: HIVE-15515.01.patch
>
>
> Hive xdocs have not been used since 2012.  The docs directory only holds six 
> xml documents, and their contents are in the wiki.
> It's past time to remove the docs directory from the Hive code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928

2017-03-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900096#comment-15900096
 ] 

Hive QA commented on HIVE-16122:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10330 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
(batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4000/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4000/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4000/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build

> NPE Hive Druid split introduced by HIVE-15928
> -
>
> Key: HIVE-16122
> URL: https://issues.apache.org/jira/browse/HIVE-16122
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
> Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, 
> HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15001) Remove showConnectedUrl from command line help

2017-03-07 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900054#comment-15900054
 ] 

Naveen Gangam commented on HIVE-15001:
--

Makes sense to remove the dead code, so +1 from me.

> Remove showConnectedUrl from command line help
> --
>
> Key: HIVE-15001
> URL: https://issues.apache.org/jira/browse/HIVE-15001
> Project: Hive
>  Issue Type: Sub-task
>  Components: Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Trivial
> Attachments: HIVE-15001.2.patch, HIVE-15001.3.patch, HIVE-15001.patch
>
>
> As discussed with [~nemon], the showConnectedUrl command line parameter has 
> not been working since an erroneous merge. Instead, Beeline always prints the 
> currently connected URL. Since this is good for everyone, no extra parameter 
> is needed to turn the feature on.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16136) LLAP: Before SIGKILL and collect diagnostic information before daemon goes down

2017-03-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900051#comment-15900051
 ] 

Prasanth Jayachandran commented on HIVE-16136:
--

bq. The shell scripts are probably where we can trap signals and dump 
/proc/<pid>/smaps & /proc/<pid>/stat? Bash has a "trap" feature for this.

Yeah, we could get these as well. But I think this can only be triggered via the 
OOM on-error hook or another JVM fatal error (although I was not able to make 
this work with the OnError hook + a stack overflow exception). This won't work 
for SIGTERM or SIGKILL. 

bq. This is pretty easy to increase, but it is a cluster-wide config.

If it is a cluster-wide config, then shorter intervals will not be enough for a 
full heap dump. We could add separate shutdown hooks to collect jstack, jmx, 
/proc/* etc. and let HeapDumpOnOOM handle the heap dump. We could probably add a 
web endpoint for manual heap dumps if that's useful. 
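
A minimal sketch of such a shutdown hook (the output directory and the set of 
collected files are assumptions; this runs on SIGTERM, and SIGKILL still cannot 
be trapped):

{noformat}
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiagShutdownHookSketch {
  public static void install() {
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      try {
        // RuntimeMXBean name is "pid@hostname" on typical JVMs.
        String pid =
            ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
        Path outDir = Paths.get("/tmp/llap-diag"); // assumed non-container dir
        Files.createDirectories(outDir);
        Files.copy(Paths.get("/proc/" + pid + "/smaps"),
            outDir.resolve("smaps." + pid));
        Files.copy(Paths.get("/proc/" + pid + "/stat"),
            outDir.resolve("stat." + pid));
      } catch (Exception e) {
        // best effort only; the daemon is going down anyway
      }
    }));
  }
}
{noformat}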



> LLAP: Before SIGKILL and collect diagnostic information before daemon goes 
> down
> ---
>
> Key: HIVE-16136
> URL: https://issues.apache.org/jira/browse/HIVE-16136
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Sometimes daemons can get killed by YARN's pmem monitor, which issues a kill 
> followed by kill -9 after 250ms. This is a really short window in which to 
> collect anything useful. 
> There is no clean way to trap SIGKILL in Java.  
> One option is to increase the time between kill and kill -9 in YARN; during 
> that time we can have a shutdown hook handler collect all diagnostic 
> information like heap dump, jstack, jmx output etc. in a non-container 
> directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-03-07 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900049#comment-15900049
 ] 

Chaoyu Tang commented on HIVE-15997:


LGTM, +1

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leaks:
> {noformat}
> 2017-02-02 06:23:25,410 WARN  hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>   at com.sun.proxy.$Proxy25.delete(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>   at com.sun.proxy.$Proxy26.delete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>   at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
>   at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
>   at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
>   at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
>   at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
>   at 
