[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-28 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thank you [~jdere] for the review.

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Fix For: 2.2.0
>
> Attachments: HIVE-1555.7.patch, HIVE-1555.8.patch, HIVE-1555.9.patch, 
> JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers, I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people likely want to perform HiveQL joins, 
> etc. against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16045) Print progress bar along with operation log

2017-02-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889713#comment-15889713
 ] 

Thejas M Nair commented on HIVE-16045:
--

Note that testQueryProgress has been failing in some other jiras because the 
progress bar sometimes gets printed before the rest of the logs.
This change might make that test more reliable.


> Print progress bar along with operation log
> ---
>
> Key: HIVE-16045
> URL: https://issues.apache.org/jira/browse/HIVE-16045
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 2.2.0
>
> Attachments: HIVE-16045.1.patch, HIVE-16045.2.patch, 
> HIVE-16045.3.patch, HIVE-16045.4.patch
>
>
> Allow printing of the operation logs and progress bar such that: the 
> operation logs output data once -> block them -> start the progress bar 
> -> finish the progress bar -> unblock the operation logs -> finish the 
> operation logs -> print the query results.
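The gating sequence described above can be sketched as follows. This is a minimal single-threaded illustration, not HiveServer2's actual implementation; the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the ordering: the operation log flushes, the
// progress bar takes a gate that blocks further log output, renders, then
// releases the gate before remaining log lines and query results print.
public class LogGateSketch {
    private final ReentrantLock gate = new ReentrantLock();
    final List<String> output = new ArrayList<>();

    void log(String line) {
        gate.lock(); // blocks while the progress bar holds the gate
        try {
            output.add(line);
        } finally {
            gate.unlock();
        }
    }

    void runProgressBar() {
        gate.lock(); // block further operation-log output
        try {
            output.add("[progress 100%]");
        } finally {
            gate.unlock(); // unblock the operation log
        }
    }

    public static void main(String[] args) {
        LogGateSketch s = new LogGateSketch();
        s.log("operation log: query started");
        s.runProgressBar();
        s.log("operation log: query finished");
        s.output.add("query results");
        System.out.println(s.output);
    }
}
```

In the real server the log writer and the progress renderer run on different threads; the lock here only illustrates the block/unblock ordering.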





[jira] [Commented] (HIVE-16045) Print progress bar along with operation log

2017-02-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889709#comment-15889709
 ] 

Thejas M Nair commented on HIVE-16045:
--

testQueryProgressParallel is failing because the test progress bar didn't get 
printed in that case.
One theory I have about the failure is that the log printing in beeline 
happened only after the query was complete.
Another possibility is that there is an issue with a different ThreadLocal 
SessionState being used for task execution when it is run in a different 
thread (in the case of parallel task execution).
I think we can remove the check for "Elapsed Time" in this test case, which I 
added in another recent patch, and then have a follow-up patch to get the 
beeline progress bar to work reliably with hive.exec.parallel=true.
"hive.exec.parallel=true" is not a reliable mode of execution and is not the 
default.


> Print progress bar along with operation log
> ---
>
> Key: HIVE-16045
> URL: https://issues.apache.org/jira/browse/HIVE-16045
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 2.2.0
>
> Attachments: HIVE-16045.1.patch, HIVE-16045.2.patch, 
> HIVE-16045.3.patch, HIVE-16045.4.patch
>
>
> Allow printing of the operation logs and progress bar such that: the 
> operation logs output data once -> block them -> start the progress bar 
> -> finish the progress bar -> unblock the operation logs -> finish the 
> operation logs -> print the query results.





[jira] [Commented] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889701#comment-15889701
 ] 

Hive QA commented on HIVE-16072:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855289/HIVE-16072.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[skewjoinopt4] 
(batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct]
 (batchId=106)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3860/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3860/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3860/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855289 - PreCommit-HIVE-Build

> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16072.1.patch
>
>
> It will be helpful for debugging to expose some metrics, like buffer pool 
> usage and file descriptors, that are not exposed via Hadoop's JvmMetrics. We 
> already have a /jmx endpoint that gives out this info, but we don't know the 
> timestamp of allocations or the number of file descriptors to correlate with 
> the logs. This will be better suited for graphing tools.
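As a rough sketch of the kind of data involved (not the patch's actual code; the class and method names here are made up), the standard platform MXBeans already expose buffer-pool usage and open file descriptors, which a metrics source could sample together with a timestamp:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.List;

// Hypothetical sampler for the metrics mentioned above: buffer pools and
// open file descriptors, paired with a timestamp for log correlation.
public class JvmMetricsSample {
    // Total bytes used across JVM buffer pools ("direct" and "mapped").
    public static long bufferPoolBytes() {
        long total = 0;
        List<BufferPoolMXBean> pools =
            ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            total += pool.getMemoryUsed();
        }
        return total;
    }

    // Open file descriptor count, or -1 where the platform bean is absent.
    public static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                .getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(System.currentTimeMillis()
            + " bufferPoolBytes=" + bufferPoolBytes()
            + " openFds=" + openFileDescriptors());
    }
}
```

A real hadoop-metrics2 source would register these as gauges rather than printing them, but the underlying MXBean calls are the same.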





[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889695#comment-15889695
 ] 

Rui Li commented on HIVE-16071:
---

My understanding is that the timeout on the SparkClient side is longer because 
it needs to wait for the RemoteDriver to launch. The timeout on the 
RemoteDriver side should be shorter because the SparkClient is already running 
when the RemoteDriver starts, and it usually won't take long to just connect 
back and finish the SASL handshake, although the default 1000ms may be a 
little too short.

Looking at the stack trace in the description, we detect that the channel is 
closed and eventually get a {{SaslException}} instead of a 
{{TimeoutException}}. I wonder why the channel is closed before the handshake 
finishes. [~ctang.ma], is it possible that your HS2 ran into some issue?

Another question (may be irrelevant to this JIRA) to [~vanzin]: we use the 
server side timeout in two places:
# [Constructing 
RpcServer|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L108]
# [Registering 
client|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L162]

I understand 2 needs the long timeout because it includes the time to launch 
the RemoteDriver. But does 1 also need that timeout? I think 1 only needs to 
take care of the SASL handshake, which should take much less time.

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But 
> currently it is also used by the remote driver for the RPC client/server 
> handshake, which is not right. Instead, hive.spark.client.server.connect.timeout 
> should be used; it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000ms) used by the remote 
> driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}





[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889641#comment-15889641
 ] 

Hive QA commented on HIVE-16071:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855287/HIVE-16071.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3859/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3859/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3859/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855287 - PreCommit-HIVE-Build

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But 
> currently it is also used by the remote driver for the RPC client/server 
> handshake, which is not right. Instead, hive.spark.client.server.connect.timeout 
> should be used; it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000ms) used by the remote 
> driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}





[jira] [Updated] (HIVE-16065) Vectorization: Wrong Key/Value information used by Vectorizer

2017-02-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16065:

Attachment: HIVE-16065.01.patch

> Vectorization: Wrong Key/Value information used by Vectorizer
> -
>
> Key: HIVE-16065
> URL: https://issues.apache.org/jira/browse/HIVE-16065
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16065.01.patch
>
>
> Make Vectorizer class get reducer key/value information the same way 
> ExecReducer/ReduceRecordProcessor do.





[jira] [Updated] (HIVE-16065) Vectorization: Wrong Key/Value information used by Vectorizer

2017-02-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16065:

Status: Patch Available  (was: Open)

> Vectorization: Wrong Key/Value information used by Vectorizer
> -
>
> Key: HIVE-16065
> URL: https://issues.apache.org/jira/browse/HIVE-16065
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16065.01.patch
>
>
> Make Vectorizer class get reducer key/value information the same way 
> ExecReducer/ReduceRecordProcessor do.





[jira] [Commented] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889567#comment-15889567
 ] 

Hive QA commented on HIVE-15820:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853949/HIVE-15820.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3858/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3858/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3858/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-03-01 06:06:45.302
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-3858/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-03-01 06:06:45.305
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at a9de1cd HIVE-16047: Shouldn't try to get KeyProvider unless 
encryption is enabled (Rui reviewed by Xuefu and Ferdinand)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at a9de1cd HIVE-16047: Shouldn't try to get KeyProvider unless 
encryption is enabled (Rui reviewed by Xuefu and Ferdinand)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-03-01 06:06:46.294
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: beeline/src/java/org/apache/hive/beeline/Commands.java:795
error: beeline/src/java/org/apache/hive/beeline/Commands.java: patch does not 
apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853949 - PreCommit-HIVE-Build

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command should be all rows of test_table 
> (the same as when run in beeline interactive mode), but it does not output 
> anything.
> The cause is that the -e option reads the commands as one string, and the 
> method dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split the command by '\n' into a 
> command list before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> 
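The snippet above is truncated in the archive. A self-contained sketch of the second approach might look like the following; this is a hypothetical completion, not the patch's actual code, and like the original it does not handle "--" inside string literals:

```java
// Hypothetical completion of the removeComments idea: drop everything from
// a leading "--" to the end of that line, keeping SQL on the other lines.
public class CommentStripper {
    public static String removeComments(String line) {
        if (line == null || line.isEmpty()) {
            return line;
        }
        StringBuilder builder = new StringBuilder();
        for (int index = 0; index < line.length(); index++) {
            if (index < line.length() - 1
                    && line.charAt(index) == '-' && line.charAt(index + 1) == '-') {
                // the comment runs to the end of the current line
                int eol = line.indexOf('\n', index + 1);
                if (eol == -1) {
                    break; // no SQL after this comment, so just stop
                }
                index = eol; // loop increment skips past the newline
            } else {
                builder.append(line.charAt(index));
            }
        }
        return builder.toString().trim();
    }

    public static void main(String[] args) {
        // mirrors the -e example in the report above
        System.out.println(removeComments(
            "\n--asdfasdfasdfasdf\nselect * from test_table;\n"));
        // prints: select * from test_table;
    }
}
```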

[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889565#comment-15889565
 ] 

Hive QA commented on HIVE-16043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855284/HIVE-16043.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=217)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3857/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3857/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3857/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855284 - PreCommit-HIVE-Build

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use the constant consistently.





[jira] [Commented] (HIVE-15884) Optimize not between for vectorization

2017-02-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889531#comment-15889531
 ] 

Pengcheng Xiong commented on HIVE-15884:


[~ashutoshc], could u take a look? Thanks.

> Optimize not between for vectorization
> --
>
> Key: HIVE-15884
> URL: https://issues.apache.org/jira/browse/HIVE-15884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15884.01.patch
>
>






[jira] [Updated] (HIVE-15884) Optimize not between for vectorization

2017-02-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15884:
---
Status: Patch Available  (was: Open)

> Optimize not between for vectorization
> --
>
> Key: HIVE-15884
> URL: https://issues.apache.org/jira/browse/HIVE-15884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15884.01.patch
>
>






[jira] [Updated] (HIVE-15884) Optimize not between for vectorization

2017-02-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15884:
---
Attachment: HIVE-15884.01.patch

> Optimize not between for vectorization
> --
>
> Key: HIVE-15884
> URL: https://issues.apache.org/jira/browse/HIVE-15884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15884.01.patch
>
>






[jira] [Commented] (HIVE-16067) LLAP: send out container complete messages after a fragment completes

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889512#comment-15889512
 ] 

Hive QA commented on HIVE-16067:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855275/HIVE-16067.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3856/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3856/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3856/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855275 - PreCommit-HIVE-Build

> LLAP: send out container complete messages after a fragment completes
> -
>
> Key: HIVE-16067
> URL: https://issues.apache.org/jira/browse/HIVE-16067
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16067.01.patch
>
>






[jira] [Commented] (HIVE-16068) BloomFilter expectedEntries not always using NDV when it's available during runtime filtering

2017-02-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889487#comment-15889487
 ] 

Gunther Hagleitner commented on HIVE-16068:
---

+1

> BloomFilter expectedEntries not always using NDV when it's available during 
> runtime filtering
> -
>
> Key: HIVE-16068
> URL: https://issues.apache.org/jira/browse/HIVE-16068
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16068.1.patch
>
>
> The current logic only uses NDV if it's the only ColumnStat available, but it 
> looks like there can sometimes be other ColStats in the semijoin Select 
> operator.





[jira] [Commented] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-02-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889479#comment-15889479
 ] 

Sushanth Sowmyan commented on HIVE-16006:
-

Code-wise, this version fixes the issue correctly, and it is the kind of 
change to ReplicationSpec that I thought was necessary. :)

However, I see that the test that was added succeeds even without these 
changes; i.e., if I pick up only the test change and nothing else, the test 
still succeeds, which it should not, since this patch is what fixes the bug. 
Could you look into the test to see why that is?

> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch
>
>
> During "Incremental Load", it does not consider the database name given on 
> the command line, hence the load doesn't happen. At the same time, the 
> database with the original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.





[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889478#comment-15889478
 ] 

Xuefu Zhang commented on HIVE-16071:


[~ctang.ma], thanks for looking into this. It seems that your patch basically 
reverts the change made in HIVE-15671. There is a long discussion in that 
JIRA, and you might want to revisit it. The short summary is that 
hive.spark.client.connect.timeout is for the SASL handshake, which happens 
after the remote driver is launched. This timeout doesn't need to be long. On 
the other hand, hive.spark.client.server.connect.timeout is the timeout for 
Hive to wait for the remote driver to connect back. Because launching the 
remote driver takes time, not to mention possible resource constraints, it 
needs to be much longer. I understand this is a frequent confusion that 
bothers many of us, so it's not surprising that it has come back again.

The particular problem you saw might not be attributable to a bug. SASL 
usually takes little time to establish. If the default value is too short for 
your network, you might consider increasing the timeout value a bit.

CC: [~vanzin]
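The distinction between the two timeouts can be seen in a small stand-alone demo. This is illustrative only: the constants mirror the two Hive properties by name, but nothing here is Hive's actual RPC code. Awaiting a slow "driver launch" with the short handshake-style timeout fails, while the longer server-connect-style timeout succeeds:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutDemo {
    static final long SHORT_TIMEOUT_MS = 50;   // stands in for hive.spark.client.connect.timeout
    static final long LONG_TIMEOUT_MS = 5000;  // stands in for hive.spark.client.server.connect.timeout

    /** Run a task that takes launchMs, wait for it with timeoutMs; true on success. */
    public static boolean simulate(long launchMs, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> launch = pool.submit(() -> {
                try { Thread.sleep(launchMs); } catch (InterruptedException ignored) { }
            });
            launch.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // waited with the wrong (too short) timeout
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A 200ms "driver launch" awaited with the short, handshake-style
        // timeout fails; with the long, server-connect-style timeout it works.
        System.out.println("short timeout ok: " + simulate(200, SHORT_TIMEOUT_MS));
        System.out.println("long timeout ok:  " + simulate(200, LONG_TIMEOUT_MS));
    }
}
```

Using the long timeout only where the slow step (driver launch) is actually awaited, and the short one for the handshake, is the separation the comment describes.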

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But 
> currently it is also used by the remote driver for the RPC client/server 
> handshake, which is not right. Instead, hive.spark.client.server.connect.timeout 
> should be used; it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000ms) used by the remote 
> driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889474#comment-15889474
 ] 

Rui Li commented on HIVE-16071:
---

[~ctang.ma], you might want to take a look at HIVE-15671. I'll also have 
another look at these timeouts.

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But currently 
> it is also used by the remote driver for the RPC client/server handshake, 
> which is not right. Instead, hive.spark.client.server.connect.timeout should 
> be used, as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms), used by the remote driver 
> for the handshake, is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-02-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889475#comment-15889475
 ] 

Siddharth Seth commented on HIVE-16072:
---

+1

> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16072.1.patch
>
>
> It will be helpful for debugging to expose some metrics like buffer pools, 
> file descriptors etc. that are not exposed via Hadoop's JvmMetrics. We 
> already have a /jmx endpoint that gives out this info, but we don't know the 
> timestamp of allocations or the number of file descriptors to correlate with 
> the logs. This will be better suited for graphing tools. 
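The metrics in question are all reachable through the platform MXBeans; a 
minimal standalone sketch of reading them follows (this is only illustrative — 
it is not the patch's actual wiring into hadoop-metrics2, and the class name is 
made up):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.List;

public class JvmMetricsSketch {
    public static void main(String[] args) {
        // Direct and mapped byte-buffer pools, the "buffer pool" metrics above.
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.println(pool.getName() + " used=" + pool.getMemoryUsed()
                    + " capacity=" + pool.getTotalCapacity());
        }
        // Open file descriptors are only exposed on Unix-like JVMs.
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            System.out.println("openFds="
                    + ((com.sun.management.UnixOperatingSystemMXBean) os)
                            .getOpenFileDescriptorCount());
        }
    }
}
```

Sampling these periodically and tagging each sample with a timestamp is what 
makes them usable for graphing, which a one-shot /jmx dump cannot provide.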



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15898) add Type2 SCD merge tests

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889473#comment-15889473
 ] 

Hive QA commented on HIVE-15898:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855271/HIVE-15898.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10299 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_type2_scd]
 (batchId=139)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3855/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3855/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3855/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855271 - PreCommit-HIVE-Build

> add Type2 SCD merge tests
> -
>
> Key: HIVE-15898
> URL: https://issues.apache.org/jira/browse/HIVE-15898
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, 
> HIVE-15898.03.patch, HIVE-15898.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16030) LLAP: All rolled over logs should be compressed

2017-02-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889467#comment-15889467
 ] 

Prasanth Jayachandran commented on HIVE-16030:
--

This will still consume disk space on the daemons, right? Does YARN log 
aggregation also remove the files once they are aggregated?

> LLAP: All rolled over logs should be compressed
> ---
>
> Key: HIVE-16030
> URL: https://issues.apache.org/jira/browse/HIVE-16030
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> When we roll over the logs we don't compress them. I have seen 256MB of 
> uncompressed logs go down to 20MB after compression. This can significantly 
> save disk space. 
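For reference, with a Log4j 2 rolling appender (which Hive 2.x daemons use), 
rollover compression only requires the filePattern to end in a compression 
suffix such as .gz. A sketch with illustrative file names and sizes, not the 
actual LLAP configuration:

```xml
<RollingRandomAccessFile name="llap" fileName="logs/llap-daemon.log"
    filePattern="logs/llap-daemon-%d{yyyy-MM-dd}-%i.log.gz">
  <PatternLayout pattern="%d %-5p [%t] %c{2}: %m%n"/>
  <Policies>
    <!-- Roll when the active log reaches the size limit. -->
    <SizeBasedTriggeringPolicy size="256 MB"/>
  </Policies>
  <!-- Keep a bounded number of rolled, gzip-compressed files. -->
  <DefaultRolloverStrategy max="10"/>
</RollingRandomAccessFile>
```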



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16054) AMReporter should use application token instead of ugi.getCurrentUser

2017-02-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889468#comment-15889468
 ] 

Siddharth Seth commented on HIVE-16054:
---

+1

> AMReporter should use application token instead of ugi.getCurrentUser
> -
>
> Key: HIVE-16054
> URL: https://issues.apache.org/jira/browse/HIVE-16054
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16054.1.patch
>
>
> During the initial creation of the ugi we user appId but later we user the 
> user who submitted the request. Although this doesn't matter as long as the 
> job tokens are set correctly. It is good to keep it consistent. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-02-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16072:
-
Status: Patch Available  (was: Open)

> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16072.1.patch
>
>
> It will be helpful for debugging to expose some metrics like buffer pool, 
> file descriptors etc. that are not exposed via Hadoop's JvmMetrics. We 
> already a /jmx endpoint that gives out these info but we don't know the 
> timestamp of allocations, number file descriptors to correlated with the 
> logs. This will better suited for graphing tools. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-02-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16072:
-
Attachment: HIVE-16072.1.patch

[~sseth] can you please take a look?

> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16072.1.patch
>
>
> It will be helpful for debugging to expose some metrics like buffer pool, 
> file descriptors etc. that are not exposed via Hadoop's JvmMetrics. We 
> already a /jmx endpoint that gives out these info but we don't know the 
> timestamp of allocations, number file descriptors to correlated with the 
> logs. This will better suited for graphing tools. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread muxin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889443#comment-15889443
 ] 

muxin edited comment on HIVE-15820 at 3/1/17 4:08 AM:
--

thanks [~vihangk1], this issue occurs only when using the -e option, and I 
noticed that the test cases in the Hive project are all SQL scripts, which will 
not reproduce this issue. Please let me know if there is a way to add shell 
test cases. The patch is already submitted.


was (Author: muxin):
this issue will only reproduce when using the -e option

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command should be all rows of test_table 
> (the same as when run in beeline interactive mode), but it does not output 
> anything.
> The cause is that the -e option reads the commands as one string, and the 
> method dispatch(String line) first calls isComment(String line), which uses 
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. In initArgs(String[] args), split the command by '\n' into a command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. In dispatch(String line), remove comments using this:
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         //find \n as the end of comment
>         index = line.indexOf('\n', index + 1);
>         //there is no sql after this comment, so just break out
>         if (-1 == index) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> The second way can be a general solution to remove all comments starting with 
> '--' in a sql.
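The second approach can be exercised on its own; below is a small standalone 
harness around the removeComments method from the description (the class name 
and demo inputs are illustrative, not part of the patch):

```java
public class RemoveCommentsDemo {
    // The method proposed in the description, reformatted.
    static String removeComments(String line) {
        if (line == null || line.isEmpty()) {
            return line;
        }
        StringBuilder builder = new StringBuilder();
        int escape = -1;
        for (int index = 0; index < line.length(); index++) {
            if (index < line.length() - 1
                    && line.charAt(index) == line.charAt(index + 1)) {
                if (escape == -1 && line.charAt(index) == '-') {
                    // Skip to the newline that ends this "--" comment.
                    index = line.indexOf('\n', index + 1);
                    if (-1 == index) {
                        // No SQL after the comment, so just break out.
                        break;
                    }
                }
            }
            char letter = line.charAt(index);
            if (letter == escape) {
                escape = -1; // closing quote: turn escape off
            } else if (escape == -1 && (letter == '\'' || letter == '"')) {
                escape = letter; // opening quote: turn escape on
            }
            builder.append(letter);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // A leading "--" comment is dropped, the statement survives.
        System.out.println(removeComments("--asdf\nselect * from test_table;").trim());
        // "--" inside a quoted string literal is left alone.
        System.out.println(removeComments("select '--keep';"));
    }
}
```

The quote tracking via the escape variable is what keeps this general: a "--" 
sequence only starts a comment when it appears outside single or double quotes.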



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-02-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-16072:



> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> It will be helpful for debugging to expose some metrics like buffer pool, 
> file descriptors etc. that are not exposed via Hadoop's JvmMetrics. We 
> already a /jmx endpoint that gives out these info but we don't know the 
> timestamp of allocations, number file descriptors to correlated with the 
> logs. This will better suited for graphing tools. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread muxin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Comment: was deleted

(was: add tests)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command should be all rows of test_table 
> (the same as when run in beeline interactive mode), but it does not output 
> anything.
> The cause is that the -e option reads the commands as one string, and the 
> method dispatch(String line) first calls isComment(String line), which uses 
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. In initArgs(String[] args), split the command by '\n' into a command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. In dispatch(String line), remove comments using this:
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         //find \n as the end of comment
>         index = line.indexOf('\n', index + 1);
>         //there is no sql after this comment, so just break out
>         if (-1 == index) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> The second way can be a general solution to remove all comments starting with 
> '--' in a sql.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread muxin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Status: Patch Available  (was: Open)

this issue will only reproduce when using the -e option

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 1.2.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command should be all rows of test_table 
> (the same as when run in beeline interactive mode), but it does not output 
> anything.
> The cause is that the -e option reads the commands as one string, and the 
> method dispatch(String line) first calls isComment(String line), which uses 
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. In initArgs(String[] args), split the command by '\n' into a command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. In dispatch(String line), remove comments using this:
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         //find \n as the end of comment
>         index = line.indexOf('\n', index + 1);
>         //there is no sql after this comment, so just break out
>         if (-1 == index) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> The second way can be a general solution to remove all comments starting with 
> '--' in a sql.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16071:
---
Attachment: HIVE-16071.patch

[~xuefuz], [~lirui], could you review the code to see if the change makes 
sense? Thanks.

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But currently 
> it is also used by the remote driver for the RPC client/server handshake, 
> which is not right. Instead, hive.spark.client.server.connect.timeout should 
> be used, as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms), used by the remote driver 
> for the handshake, is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16071:
---
Status: Patch Available  (was: Open)

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But currently 
> it is also used by the remote driver for the RPC client/server handshake, 
> which is not right. Instead, hive.spark.client.server.connect.timeout should 
> be used, as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms), used by the remote driver 
> for the handshake, is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-02-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16071:
--


> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout for the spark remote 
> driver to make a socket connection (channel) to the RPC server. But currently 
> it is also used by the remote driver for the RPC client/server handshake, 
> which is not right. Instead, hive.spark.client.server.connect.timeout should 
> be used, as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms), used by the remote driver 
> for the handshake, is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889438#comment-15889438
 ] 

Fei Hui commented on HIVE-16043:


[~Ferd] thanks, I found it on Jenkins.

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>   private TezSessionState getSession(HiveConf conf, boolean doOpen)
>       throws Exception {
>     String queueName = conf.get("tez.queue.name");
>     ...
>   }
>   private TezSessionState getNewSessionState(HiveConf conf,
>       String queueName, boolean doOpen) throws Exception {
>     TezSessionPoolSession retTezSessionState =
>         createAndInitSession(queueName, false);
>     if (queueName != null) {
>       conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
>     }
>     ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name, so I think we 
> should use it consistently.
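The change itself is mechanical: read the queue name through the same constant 
that is used to write it. A schematic sketch of the idea, with a plain map 
standing in for HiveConf and every name other than the tez.queue.name key being 
illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class QueueNameSketch {
    // Mirrors TezConfiguration.TEZ_QUEUE_NAME, whose value is "tez.queue.name".
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Write the queue name through the constant...
        conf.put(TEZ_QUEUE_NAME, "etl");
        // ...and read it back through the same constant, instead of a
        // duplicated string literal that can silently drift out of sync.
        String queueName = conf.get(TEZ_QUEUE_NAME);
        System.out.println(queueName);
    }
}
```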



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-02-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-16047:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Xuefu and Ferdinand for the review.

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread muxin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Status: Open  (was: Patch Available)

add tests

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 1.2.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command is all rows of test_table (the same 
> as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as a single string, and 
> dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. In initArgs(String[] args), split the command by '\n' into a command list 
> before dispatch when cl.getOptionValues('e') != null.
> 2. In dispatch(String line), remove comments using this:
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         // find \n as the end of the comment
>         index = line.indexOf('\n', index + 1);
>         // there is no SQL after this comment, so just break out
>         if (index == -1) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> The second way can serve as a general solution for removing all comments that 
> start with '--' from a SQL string.
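The removeComments helper proposed above can be sanity-checked against the failing input from this report; the harness below is a hypothetical sketch for illustration, not part of the patch.

```java
public class RemoveCommentsDemo {
    // The helper proposed in the issue description, tidied so it compiles as-is.
    static String removeComments(String line) {
        if (line == null || line.isEmpty()) {
            return line;
        }
        StringBuilder builder = new StringBuilder();
        int escape = -1;
        for (int index = 0; index < line.length(); index++) {
            if (index < line.length() - 1
                    && line.charAt(index) == line.charAt(index + 1)
                    && escape == -1 && line.charAt(index) == '-') {
                // "--" outside quotes starts a comment; skip to the end of line.
                index = line.indexOf('\n', index + 1);
                if (index == -1) {
                    break; // no SQL after this comment
                }
            }
            char letter = line.charAt(index);
            if (letter == escape) {
                escape = -1; // closing quote
            } else if (escape == -1 && (letter == '\'' || letter == '"')) {
                escape = letter; // opening quote
            }
            builder.append(letter);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // The input from the bug report: a leading "--" comment swallowed the query.
        String sql = "\n--asdfasdfasdfasdf\nselect * from test_table;\n";
        System.out.print(removeComments(sql));  // the select statement survives
        // "--" inside quotes is preserved:
        System.out.println(removeComments("select '--not a comment' from t;"));
    }
}
```

Note that the quote-tracking (escape) state is checked before the "--" test, so comment markers inside string literals are left intact.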





[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889426#comment-15889426
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855251/HIVE-15844.07.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 125 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_mapjoin1] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_simple] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_expressions]
 (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] 
(batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic]
 (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_non_string_partition]
 (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join2] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] 
(batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce1] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce2] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce3] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal]
 (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_simple] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_div0] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_offset_limit]
 (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs]
 (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_adaptor_usage_mode]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_columns]
 (batchId=151)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread muxin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Status: Patch Available  (was: Open)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 1.2.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command is all rows of test_table (the same 
> as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as a single string, and 
> dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. In initArgs(String[] args), split the command by '\n' into a command list 
> before dispatch when cl.getOptionValues('e') != null.
> 2. In dispatch(String line), remove comments using this:
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         // find \n as the end of the comment
>         index = line.indexOf('\n', index + 1);
>         // there is no SQL after this comment, so just break out
>         if (index == -1) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> The second way can serve as a general solution for removing all comments that 
> start with '--' from a SQL string.





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-02-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889420#comment-15889420
 ] 

Ferdinand Xu commented on HIVE-16047:
-

+1

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931





[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889418#comment-15889418
 ] 

Ferdinand Xu commented on HIVE-16043:
-

@Fei Hui you can check whether the Jenkins job has been triggered by visiting 
https://builds.apache.org/job/PreCommit-HIVE-Build/. If it has not, you can 
start a build with parameters by providing your attachment ID and issue ID 
after logging in.

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.
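The change is essentially literal-vs-constant hygiene. The toy illustration below shows why the named constant is preferable; the TEZ_QUEUE_NAME field here is a stand-in for Tez's real TezConfiguration constant, and the Map stands in for HiveConf.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueNameDemo {
    // Stand-in for TezConfiguration.TEZ_QUEUE_NAME (the real constant lives in Tez).
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // With the patch, reads and writes both go through the named constant,
    // so a typo in a duplicated literal can no longer silently break queue routing.
    static String getQueue(Map<String, String> conf) {
        return conf.get(TEZ_QUEUE_NAME);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "etl");
        System.out.println(getQueue(conf)); // etl
    }
}
```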





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-02-28 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889415#comment-15889415
 ] 

Rui Li commented on HIVE-16047:
---

Failure with age 1 can't be reproduced.

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931





[jira] [Updated] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-16043:
---
Attachment: HIVE-16043.1.patch

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Updated] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-16043:
---
Status: Patch Available  (was: Open)

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Updated] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-16043:
---
Status: Open  (was: Patch Available)

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Updated] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-16043:
---
Attachment: (was: HIVE-16043.1.patch)

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889408#comment-15889408
 ] 

Fei Hui commented on HIVE-16043:


[~sershe] thanks. I will resubmit it. I'm confused about why the tests were not run.

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Assigned] (HIVE-16070) MERGE etc should be a reserved SQL keyword

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-16070:
-


> MERGE etc should be a reserved SQL keyword
> --
>
> Key: HIVE-16070
> URL: https://issues.apache.org/jira/browse/HIVE-16070
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>






[jira] [Resolved] (HIVE-16069) test case for beeline -e ''

2017-02-28 Thread muxin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin resolved HIVE-16069.
--
Resolution: Invalid

> test case for beeline -e ''
> ---
>
> Key: HIVE-16069
> URL: https://issues.apache.org/jira/browse/HIVE-16069
> Project: Hive
>  Issue Type: Test
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
> Environment: bash
>Reporter: muxin
>
> test case for issue 15820





[jira] [Commented] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2017-02-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889397#comment-15889397
 ] 

Eugene Koifman commented on HIVE-13335:
---

no related failures
[~wzheng], could you review please?

> get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
> 
>
> Key: HIVE-13335
> URL: https://issues.apache.org/jira/browse/HIVE-13335
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13335.01.patch
>
>
> look for usages - it's no longer useful; in fact may be a perf hit
> made obsolete by HIVE-12439





[jira] [Updated] (HIVE-16068) BloomFilter expectedEntries not always using NDV when it's available during runtime filtering

2017-02-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16068:
--
Attachment: HIVE-16068.1.patch

[~hagleitn] [~prasanth_j] can you review?

> BloomFilter expectedEntries not always using NDV when it's available during 
> runtime filtering
> -
>
> Key: HIVE-16068
> URL: https://issues.apache.org/jira/browse/HIVE-16068
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16068.1.patch
>
>
> The current logic only uses NDV if it's the only ColumnStat available, but it 
> looks like there can sometimes be other ColStats in the semijoin Select 
> operator.
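The sizing rule at stake here can be summarized in a few lines. The sketch below illustrates the intent described above (prefer the NDV stat whenever one is present); it is an assumption-laden simplification, not the actual patch.

```java
public class BloomSizingDemo {
    // Prefer the NDV (number-of-distinct-values) column stat for expectedEntries;
    // fall back to the row count only when no NDV stat is available.
    static long expectedEntries(Long ndv, long rowCount) {
        return (ndv != null && ndv > 0) ? ndv : rowCount;
    }

    public static void main(String[] args) {
        // A million rows but only 5,000 distinct keys: sizing by NDV yields a
        // much smaller (and cheaper) bloom filter than sizing by row count.
        System.out.println(expectedEntries(5000L, 1_000_000L));  // 5000
        System.out.println(expectedEntries(null, 1_000_000L));   // 1000000
    }
}
```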





[jira] [Assigned] (HIVE-16068) BloomFilter expectedEntries not always using NDV when it's available during runtime filtering

2017-02-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-16068:
-


> BloomFilter expectedEntries not always using NDV when it's available during 
> runtime filtering
> -
>
> Key: HIVE-16068
> URL: https://issues.apache.org/jira/browse/HIVE-16068
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> The current logic only uses NDV if it's the only ColumnStat available, but it 
> looks like there can sometimes be other ColStats in the semijoin Select 
> operator.





[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889372#comment-15889372
 ] 

Sergey Shelukhin commented on HIVE-16043:
-

I was waiting for the tests to run. Perhaps the patch needs to be resubmitted. 
I can commit after that.

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Commented] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889369#comment-15889369
 ] 

Hive QA commented on HIVE-13335:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855248/HIVE-13335.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10256 tests 
executed
*Failed tests:*
{noformat}
TestCommandProcessorFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=271)
TestDbTxnManager - did not produce a TEST-*.xml file (likely timed out) 
(batchId=271)
TestDummyTxnManager - did not produce a TEST-*.xml file (likely timed out) 
(batchId=271)
TestHiveInputSplitComparator - did not produce a TEST-*.xml file (likely timed 
out) (batchId=271)
TestIndexType - did not produce a TEST-*.xml file (likely timed out) 
(batchId=271)
TestSplitFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=271)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3853/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3853/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3853/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855248 - PreCommit-HIVE-Build

> get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
> 
>
> Key: HIVE-13335
> URL: https://issues.apache.org/jira/browse/HIVE-13335
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13335.01.patch
>
>
> look for usages - it's no longer useful; in fact may be a perf hit
> made obsolete by HIVE-12439





[jira] [Updated] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-02-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-16006:

Description: 
During "Incremental Load", the database name given on the command line is not 
taken into account, so the load does not happen there. At the same time, the 
database with the original name gets modified.
Steps:
1. REPL DUMP default FROM 52;
2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
– This step modifies the default Db instead of replDb.


==

Additional note - this is happening for INSERT events, not other events.

  was:
During "Incremental Load", the database name given on the command line is not 
taken into account, so the load does not happen there. At the same time, the 
database with the original name gets modified.
Steps:
1. REPL DUMP default FROM 52;
2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
– This step modifies the default Db instead of replDb.


> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch
>
>
> During "Incremental Load", the database name given on the command line is not 
> taken into account, so the load does not happen there. At the same time, the 
> database with the original name gets modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.





[jira] [Updated] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.

2017-02-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-16006:

Summary: Incremental REPL LOAD Inserts doesn't operate on the target 
database if name differs from source database.  (was: Incremental REPL LOAD 
doesn't operate on the target database if name differs from source database.)

> Incremental REPL LOAD Inserts doesn't operate on the target database if name 
> differs from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch
>
>
> During "Incremental Load", the database name given on the command line is not 
> taken into account, so the load does not happen there. At the same time, the 
> database with the original name gets modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.





[jira] [Commented] (HIVE-16043) TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name

2017-02-28 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889358#comment-15889358
 ] 

Fei Hui commented on HIVE-16043:


CC [~Ferd] could you please give any suggestions?

> TezConfiguration.TEZ_QUEUE_NAME instead of tez.queue.name
> -
>
> Key: HIVE-16043
> URL: https://issues.apache.org/jira/browse/HIVE-16043
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-16043.1.patch
>
>
> I see the following source in hive
> {code:title=TezSessionPoolManager.java|borderStyle=solid}
>private TezSessionState getSession(HiveConf conf, boolean doOpen)
>throws Exception {
>  String queueName = conf.get("tez.queue.name");
>  ...
>}
>   private TezSessionState getNewSessionState(HiveConf conf,
>   String queueName, boolean doOpen) throws Exception {
> TezSessionPoolSession retTezSessionState = 
> createAndInitSession(queueName, false);
> if (queueName != null) {
>   conf.set(TezConfiguration.TEZ_QUEUE_NAME, queueName);
> }
>   ...
>   }
> {code}
> TezConfiguration.TEZ_QUEUE_NAME is the same as tez.queue.name; I think we 
> should use it consistently.





[jira] [Commented] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-28 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889345#comment-15889345
 ] 

Rui Li commented on HIVE-15882:
---

Thanks [~mi...@cloudera.com] for the update. +1

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, HIVE-15882.02.patch, 
> HIVE-15882.03.patch, HIVE-15882.04.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be highly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we may likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicate, since for each Partition each 
> concurrently running query creates its own copy of Partion, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.
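The interning/deduplication idea in points 1 and 3 can be sketched with a small 
canonicalizing map. This is a hypothetical illustration, not the actual Hive 
patch; Properties inherits content-based equals()/hashCode() from Hashtable, 
which is what makes map-based canonicalization work:

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch of the deduplication idea from the heap analysis above: keep one
// canonical copy of each distinct Properties object, so N queries over M
// partitions share a handful of instances instead of holding N*M copies.
// Canonicalized instances must be treated as read-only afterwards, since
// mutating a map key would corrupt the canonical map.
public class PropertiesDeduper {
  private static final Map<Properties, Properties> CANON = new ConcurrentHashMap<>();

  static Properties canonicalize(Properties p) {
    Properties prev = CANON.putIfAbsent(p, p);
    return prev != null ? prev : p; // return the first equal instance seen
  }

  public static void main(String[] args) {
    // Two content-equal Properties, as two concurrent queries would build
    // for the same partition.
    Properties a = new Properties();
    a.setProperty("columns", "id,value");
    Properties b = new Properties();
    b.setProperty("columns", "id,value");
    // After canonicalization both queries hold the same instance.
    System.out.println(canonicalize(a) == canonicalize(b)); // prints true
  }
}
```

The same pattern applies to duplicate strings, where String.intern() already 
provides the canonical map.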





[jira] [Commented] (HIVE-16066) NPE in ExplainTask

2017-02-28 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889337#comment-15889337
 ] 

Rajesh Balamohan commented on HIVE-16066:
-

This is not from explain formatted or explain extended. I got this exception as 
part of the ATSHook.

> NPE in ExplainTask
> --
>
> Key: HIVE-16066
> URL: https://issues.apache.org/jira/browse/HIVE-16066
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> {noformat}
> 2017-02-28T20:05:13,412  WARN [ATS Logger 0] hooks.ATSHook: Failed to submit 
> plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-16067) LLAP: send out container complete messages after a fragment completes

2017-02-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889336#comment-15889336
 ] 

Sergey Shelukhin commented on HIVE-16067:
-

+1; why was getContext().containerStopRequested removed?

> LLAP: send out container complete messages after a fragment completes
> -
>
> Key: HIVE-16067
> URL: https://issues.apache.org/jira/browse/HIVE-16067
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16067.01.patch
>
>






[jira] [Assigned] (HIVE-15870) MM tables - parquet_join test fails

2017-02-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15870:
---

Assignee: Sergey Shelukhin

> MM tables - parquet_join test fails
> ---
>
> Key: HIVE-15870
> URL: https://issues.apache.org/jira/browse/HIVE-15870
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Both single-join queries produce results, but not the last query.
> Looking at MM logs, it looks like the inputs are read correctly. Must be 
> something parquet-specific w.r.t. multiple files in a table.
> {noformat}
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> select tbl1.id, t1.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t1 ON tbl1.id=t1.id;
> select tbl1.id, t1.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id;
> select tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}





[jira] [Comment Edited] (HIVE-16051) MM tables: skewjoin test fails

2017-02-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889323#comment-15889323
 ] 

Sergey Shelukhin edited comment on HIVE-16051 at 3/1/17 2:01 AM:
-

Pushed to branch. Config-driven skew join is disabled when writing to MM tables 
due to FSOP commit limitations. It could be addressed but is not very simple 
and the feature seems very obscure.


was (Author: sershe):
Pushed to branch. Config-driven skew join is disabled for MM tables due to FSOP 
commit limitations. It could be addressed but is not very simple and the 
feature seems very obscure.

> MM tables: skewjoin test fails
> --
>
> Key: HIVE-16051
> URL: https://issues.apache.org/jira/browse/HIVE-16051
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
>
> {noformat}
> set hive.optimize.skewjoin = true;
> set hive.skewjoin.key = 2;
> set hive.optimize.metadataonly=false;
> CREATE TABLE dest_j1(key INT, value STRING) STORED AS TEXTFILE tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> FROM src src1 JOIN src src2 ON (src1.key = src2.key)
> INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
> select count(distinct key) from dest_j1;
> {noformat}
> Different results for MM and non-MM table.
> Probably has something to do with how skewjoin handles files; however, 
> looking at MM/debugging logs, there are no suspicious deletes, and everything 
> looks the same for both cases; all the logging for skewjoin row containers 
> and stuff is identical between the two runs (except for the numbers/guids; 
> the number of files, paths, etc. are all the same). So not sure what's going 
> on. Probably dfs dump can answer this question, but it doesn't work for me 
> currently on q files.





[jira] [Resolved] (HIVE-16051) MM tables: skewjoin test fails

2017-02-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-16051.
-
   Resolution: Fixed
Fix Version/s: hive-14535

Pushed to branch. Config-driven skew join is disabled for MM tables due to FSOP 
commit limitations. It could be addressed but is not very simple and the 
feature seems very obscure.

> MM tables: skewjoin test fails
> --
>
> Key: HIVE-16051
> URL: https://issues.apache.org/jira/browse/HIVE-16051
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
>
> {noformat}
> set hive.optimize.skewjoin = true;
> set hive.skewjoin.key = 2;
> set hive.optimize.metadataonly=false;
> CREATE TABLE dest_j1(key INT, value STRING) STORED AS TEXTFILE tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> FROM src src1 JOIN src src2 ON (src1.key = src2.key)
> INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
> select count(distinct key) from dest_j1;
> {noformat}
> Different results for MM and non-MM table.
> Probably has something to do with how skewjoin handles files; however, 
> looking at MM/debugging logs, there are no suspicious deletes, and everything 
> looks the same for both cases; all the logging for skewjoin row containers 
> and stuff is identical between the two runs (except for the numbers/guids; 
> the number of files, paths, etc. are all the same). So not sure what's going 
> on. Probably dfs dump can answer this question, but it doesn't work for me 
> currently on q files.





[jira] [Comment Edited] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889318#comment-15889318
 ] 

Vihang Karajgaonkar edited comment on HIVE-15820 at 3/1/17 1:58 AM:


Hi [~muxin] Thanks for the patch. Can you add some test cases as well? After 
attaching the patch, you may want to click the Submit Patch button so that the 
pre-commit tests run and we can see whether there are any regressions. Thanks!


was (Author: vihangk1):
Hi [~muxin] Thanks for the patch. Can you add some test cases as well?

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table (the same 
> as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as one string, and the method 
> dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split the command by '\n' into a command 
> list before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> The second way can be a general solution for removing all comments that start 
> with '--' in a SQL statement.
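For reference, the proposed helper restated as a self-contained class (a sketch 
of option 2 above, not committed code); the quote tracking keeps '--' inside 
string literals intact:

```java
// Sketch of the removeComments proposal: skip from an unquoted "--" to the next
// newline, while '--' inside single or double quotes is preserved.
public class BeelineComments {
  static String removeComments(String line) {
    if (line == null || line.isEmpty()) {
      return line;
    }
    StringBuilder builder = new StringBuilder();
    int escape = -1; // current quote char, or -1 when outside quotes
    for (int index = 0; index < line.length(); index++) {
      if (index < line.length() - 1
          && line.charAt(index) == line.charAt(index + 1)
          && escape == -1
          && line.charAt(index) == '-') {
        // "--" outside quotes: jump to the newline that ends the comment
        index = line.indexOf('\n', index + 1);
        if (index == -1) {
          break; // no SQL after this comment, so just stop
        }
      }
      char letter = line.charAt(index);
      if (letter == escape) {
        escape = -1; // closing quote
      } else if (escape == -1 && (letter == '\'' || letter == '"')) {
        escape = letter; // opening quote
      }
      builder.append(letter);
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    System.out.println(removeComments("--header comment\nselect 1;").trim());
    // prints: select 1;
    System.out.println(removeComments("select '--not a comment';"));
    // prints: select '--not a comment';
  }
}
```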





[jira] [Commented] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889318#comment-15889318
 ] 

Vihang Karajgaonkar commented on HIVE-15820:


Hi [~muxin] Thanks for the patch. Can you add some test cases as well?

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table (the same 
> as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as one string, and the method 
> dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split the command by '\n' into a command 
> list before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> The second way can be a general solution for removing all comments that start 
> with '--' in a SQL statement.





[jira] [Comment Edited] (HIVE-15766) DBNotificationlistener leaks JDOPersistenceManager

2017-02-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888562#comment-15888562
 ] 

Thejas M Nair edited comment on HIVE-15766 at 3/1/17 1:58 AM:
--

[~akolb]
This is related to the leak described in HIVE-7353. ThreadWithGarbageCleanup was 
added there to explicitly call ObjectStore.shutdown to clean up the leak. 
However, that cleanup doesn't work if you create a non-thread-local 
RawStore/ObjectStore. This change to use the thread-local one addresses the leak.


was (Author: thejas):
[~availlancourt]
This is related to leak described  in HIVE-7353 . ThreadWithGarbageCleanup was 
added in that to explicitly call ObjectStore.shutdown to cleanup the leak. 
However, the cleanup added in ThreadWithGarbageCleanup doesn't work if you 
create a non-thread local RawStore/ObjecStore. This change to use the thread 
local one address the leak.

> DBNotificationlistener leaks JDOPersistenceManager
> --
>
> Key: HIVE-15766
> URL: https://issues.apache.org/jira/browse/HIVE-15766
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15766.1.patch, HIVE-15766.2.patch
>
>






[jira] [Assigned] (HIVE-15822) beeline ignore all sql under comment after semicolon

2017-02-28 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-15822:
--

Assignee: muxin  (was: Vihang Karajgaonkar)

> beeline ignore all sql under comment after semicolon
> 
>
> Key: HIVE-15822
> URL: https://issues.apache.org/jira/browse/HIVE-15822
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
> Attachments: HIVE-15822.patch
>
>
> way to reproduce this error:
> beeline -u jdbc:hive2://localhost:1 -n test -e "
> show databases;--some comment here
> show tables;"
> it will only execute 'show databases' and treat
> '--some comment here
> show tables;'
> as a comment (all SQL after the first comment appearing after a semicolon).
> When the comment also ends with a semicolon, the result is correct.
> The root cause is that beeline only considers an entire command entered when a 
> line ends with a semicolon; otherwise, if the line does not start with '--' or 
> '#', beeline combines it with the next line until it meets a semicolon at the 
> end. So the comment above is not actually removed (which causes the error). 
> Beeline then splits the entire line by ';', so 'show databases' is recognized 
> and executed, while
> '--some comment here\n show tables' is treated as a comment and discarded.
> My solution is to remove comments before splitting by ';'; the code can follow 
> solution 2 for the similar issue: 
> https://issues.apache.org/jira/browse/HIVE-15820





[jira] [Assigned] (HIVE-15820) comment at the head of beeline -e

2017-02-28 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-15820:
--

Assignee: muxin  (was: Vihang Karajgaonkar)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table (the same 
> as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as one string, and the method 
> dispatch(String line) first calls isComment(String line), which uses
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split the command by '\n' into a command 
> list before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> The second way can be a general solution for removing all comments that start 
> with '--' in a SQL statement.





[jira] [Updated] (HIVE-16067) LLAP: send out container complete messages after a fragment completes

2017-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-16067:
--
Status: Patch Available  (was: Open)

> LLAP: send out container complete messages after a fragment completes
> -
>
> Key: HIVE-16067
> URL: https://issues.apache.org/jira/browse/HIVE-16067
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16067.01.patch
>
>






[jira] [Commented] (HIVE-11685) Restarting Metastore kills Compactions - store Hadoop job id in COMPACTION_QUEUE

2017-02-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889306#comment-15889306
 ] 

Eugene Koifman commented on HIVE-11685:
---

With HIVE-15851, launchCompactionJob() uses submitJob(), so this may be solved; 
it needs to be tested.

> Restarting Metastore kills Compactions - store Hadoop job id in 
> COMPACTION_QUEUE
> 
>
> Key: HIVE-11685
> URL: https://issues.apache.org/jira/browse/HIVE-11685
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Transactions
>Affects Versions: 1.0.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> CompactorMR submits an MR job to do the compaction and waits for completion.
> If the metastore needs to be restarted, it will kill in-flight compactions.
> Ideally we'd want to add the job ID to the COMPACTION_QUEUE table (and include 
> that in SHOW COMPACTIONS) and poll for it or register a callback so that the 
> job survives a Metastore restart.
> Also, 
> when running revokeTimedoutWorker(), make sure to use this JobId to kill the 
> job if it's still running.
> Alternatively, if it's still running, maybe just assign a new worker_id and 
> let it continue to run.





[jira] [Updated] (HIVE-16067) LLAP: send out container complete messages after a fragment completes

2017-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-16067:
--
Attachment: HIVE-16067.01.patch

[~rajesh.balamohan] - please take a look.

This sends a container-complete message to Tez when working with LLAP; that 
message would normally come from the RM.

> LLAP: send out container complete messages after a fragment completes
> -
>
> Key: HIVE-16067
> URL: https://issues.apache.org/jira/browse/HIVE-16067
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16067.01.patch
>
>






[jira] [Assigned] (HIVE-16067) LLAP: send out container complete messages after a fragment completes

2017-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-16067:
-


> LLAP: send out container complete messages after a fragment completes
> -
>
> Key: HIVE-16067
> URL: https://issues.apache.org/jira/browse/HIVE-16067
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>






[jira] [Commented] (HIVE-16066) NPE in ExplainTask

2017-02-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889276#comment-15889276
 ] 

Pengcheng Xiong commented on HIVE-16066:


How did you run the explain? Thanks.

> NPE in ExplainTask
> --
>
> Key: HIVE-16066
> URL: https://issues.apache.org/jira/browse/HIVE-16066
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> {noformat}
> 2017-02-28T20:05:13,412  WARN [ATS Logger 0] hooks.ATSHook: Failed to submit 
> plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:251) 
> 

[jira] [Updated] (HIVE-15898) add Type2 SCD merge tests

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15898:
--
Attachment: HIVE-15898.04.patch

> add Type2 SCD merge tests
> -
>
> Key: HIVE-15898
> URL: https://issues.apache.org/jira/browse/HIVE-15898
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, 
> HIVE-15898.03.patch, HIVE-15898.04.patch
>
>






[jira] [Commented] (HIVE-16060) GenericUDTFJSONTuple's json cache could overgrow beyond its limit

2017-02-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889244#comment-15889244
 ] 

Xuefu Zhang commented on HIVE-16060:


+1

> GenericUDTFJSONTuple's json cache could overgrow beyond its limit
> -
>
> Key: HIVE-16060
> URL: https://issues.apache.org/jira/browse/HIVE-16060
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-16060.1.patch, image.png
>
>
> At the moment the [cache 
> object|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java#L96]
>  used in {{GenericUDTFJSONTuple}} is a static linked hashmap that is not 
> thread-safe. In the case of HoS it may be accessed concurrently and has race 
> conditions. In particular, its size may overgrow even though the limit is 32. 
> This can be observed from the attached image.
> An easy way to fix it is to make it non-static.
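The eviction mechanism behind that limit can be sketched in isolation. A LinkedHashMap whose removeEldestEntry is overridden holds its bound only under single-threaded access; concurrent put() calls can race past the size check, which is the overgrowth described in this ticket. A minimal single-threaded sketch (the class name, cache size, and types are illustrative, not Hive's exact code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded LRU cache like the one in GenericUDTFJSONTuple.
// CACHE_SIZE and the helper are illustrative, not Hive's actual code.
public class BoundedJsonCache {
  static final int CACHE_SIZE = 16;

  // accessOrder=true gives LRU ordering; removeEldestEntry evicts the
  // least-recently-used entry once size() exceeds CACHE_SIZE. This bound
  // holds only for single-threaded access -- concurrent put() calls can
  // race past the size check.
  static <K, V> Map<K, V> newBoundedCache() {
    return new LinkedHashMap<K, V>(CACHE_SIZE, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > CACHE_SIZE;
      }
    };
  }

  public static void main(String[] args) {
    Map<String, String> cache = newBoundedCache();
    for (int i = 0; i < 100; i++) {
      cache.put("key" + i, "value" + i);
    }
    // Single-threaded, the bound holds:
    System.out.println(cache.size()); // prints 16
  }
}
```

Making the map an instance field (non-static), as the ticket suggests, gives each UDTF instance its own cache so threads no longer share unsynchronized state.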



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-7517) RecordIdentifier overrides equals() but not hashCode()

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7517:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.
Thanks Wei for the review.

> RecordIdentifier overrides equals() but not hashCode()
> --
>
> Key: HIVE-7517
> URL: https://issues.apache.org/jira/browse/HIVE-7517
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 2.2.0
>
> Attachments: HIVE-7517.01.patch, HIVE-7517.02.patch
>
>
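The contract behind this fix is easy to demonstrate: a key type that overrides equals() without hashCode() breaks hash-based collections, because two equal keys can land in different buckets. A minimal sketch (RowId is a simplified stand-in for RecordIdentifier, not Hive's actual class):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrates the bug class HIVE-7517 fixes: a key type that overrides
// equals() but not hashCode() silently breaks hash-based collections.
// "RowId" is a simplified stand-in for Hive's RecordIdentifier.
class RowId {
  final long txnId, rowId;
  RowId(long txnId, long rowId) { this.txnId = txnId; this.rowId = rowId; }

  @Override public boolean equals(Object o) {
    if (!(o instanceof RowId)) return false;
    RowId other = (RowId) o;
    return txnId == other.txnId && rowId == other.rowId;
  }

  // Without this override, two equal RowIds usually hash to different
  // buckets, so HashMap.get() with an equal key misses.
  @Override public int hashCode() { return Objects.hash(txnId, rowId); }
}

public class EqualsHashCodeDemo {
  public static void main(String[] args) {
    Map<RowId, String> m = new HashMap<>();
    m.put(new RowId(1, 42), "row");
    // Works only because hashCode() is consistent with equals():
    System.out.println(m.get(new RowId(1, 42))); // prints row
  }
}
```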




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-02-28 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889235#comment-15889235
 ] 

Vineet Garg edited comment on HIVE-16064 at 3/1/17 12:54 AM:
-

Currently the Hive grammar transforms all kinds of functions (UDAFs, UDFs, 
UDTFs) into three types of function tokens based on the presence of star or 
distinct (TOK_FUNCTION, TOK_FUNCTIONSTAR, TOK_FUNCTIONDI).
This patch adds an optional ALL keyword for every TOK_FUNCTION. This has a side 
effect of allowing ALL for all kinds of aggregates (the standard doesn't permit 
ALL with STDDEV_POP, STDDEV_SAMP, VAR_POP, or VAR_SAMP), as well as UDFs and 
UDTFs. But since this side effect is benign and doesn't modify the semantics in 
any way, it should be a safe change.


was (Author: vgarg):
Currently HIVE grammar transforms all kind of functions (UDAs, UDFs, UDTFs) 
into three type of functions based on presence of star, distinct -  
(TOK_FUNCTION,  TOK_FUNCTIONSTAR, TOK_FUNCTIONDI).
This patch adds an optional keyword ALL for all  TOK_FUNCTION. This has a side 
effect of allowing {{ALL}}for all kind of aggregates (standard doesn’t permit 
ALL with STDDEV_POP, STDDEV_SAMP, VAR_POP, or VAR_SAMP), UDFs and UDTFs. But 
since this side effect is benign and doesn't modify the semantics in anyway it 
should be a safe change.

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16064.1.patch
>
>
> SQL:2011  allows  ALL with aggregate functions which is 
> equivalent to aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-02-28 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889235#comment-15889235
 ] 

Vineet Garg commented on HIVE-16064:


Currently the Hive grammar transforms all kinds of functions (UDAFs, UDFs, 
UDTFs) into three types of function tokens based on the presence of star or 
distinct (TOK_FUNCTION, TOK_FUNCTIONSTAR, TOK_FUNCTIONDI).
This patch adds an optional ALL keyword for every TOK_FUNCTION. This has a side 
effect of allowing {{ALL}} for all kinds of aggregates (the standard doesn't 
permit ALL with STDDEV_POP, STDDEV_SAMP, VAR_POP, or VAR_SAMP), as well as UDFs 
and UDTFs. But since this side effect is benign and doesn't modify the semantics 
in any way, it should be a safe change.
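For illustration, ALL with an aggregate is a no-op set quantifier, unlike DISTINCT (the table and column names below are hypothetical):

```sql
-- The following two aggregates are equivalent: ALL is the default.
SELECT COUNT(key), COUNT(ALL key) FROM src;
-- DISTINCT, by contrast, changes the semantics:
SELECT COUNT(DISTINCT key) FROM src;
```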

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16064.1.patch
>
>
> SQL:2011  allows  ALL with aggregate functions which is 
> equivalent to aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16060) GenericUDTFJSONTuple's json cache could overgrow beyond its limit

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889233#comment-15889233
 ] 

Hive QA commented on HIVE-16060:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855249/HIVE-16060.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10297 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3852/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3852/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3852/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855249 - PreCommit-HIVE-Build

> GenericUDTFJSONTuple's json cache could overgrow beyond its limit
> -
>
> Key: HIVE-16060
> URL: https://issues.apache.org/jira/browse/HIVE-16060
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-16060.1.patch, image.png
>
>
> At the moment the [cache 
> object|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java#L96]
>  used in {{GenericUDTFJSONTuple}} is a static linked hashmap that is not 
> thread-safe. In the case of HoS it may be accessed concurrently and has race 
> conditions. In particular, its size may overgrow even though the limit is 32. 
> This can be observed from the attached image.
> An easy way to fix it is to make it non-static.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16065) Vectorization: Wrong Key/Value information used by Vectorizer

2017-02-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-16065:
---

Assignee: Matt McCline

> Vectorization: Wrong Key/Value information used by Vectorizer
> -
>
> Key: HIVE-16065
> URL: https://issues.apache.org/jira/browse/HIVE-16065
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Make Vectorizer class get reducer key/value information the same way 
> ExecReducer/ReduceRecordProcessor do.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-02-28 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889216#comment-15889216
 ] 

Vineet Garg edited comment on HIVE-16064 at 3/1/17 12:39 AM:
-

First patch only contains source code changes. Tests will be added in the next patch.


was (Author: vgarg):
First patch only contains source code changes. Tests will be added in next patch

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16064.1.patch
>
>
> SQL:2011  allows  ALL with aggregate functions which is 
> equivalent to aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-02-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16064:
---
Attachment: HIVE-16064.1.patch

First patch only contains source code changes. Tests will be added in next patch

> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16064.1.patch
>
>
> SQL:2011  allows  ALL with aggregate functions which is 
> equivalent to aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16061) Some of console output is not printed to the beeline console

2017-02-28 Thread muxin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889215#comment-15889215
 ] 

muxin commented on HIVE-16061:
--

We encountered and resolved this in issue https://issues.apache.org/jira/browse/HIVE-15821

> Some of console output is not printed to the beeline console
> 
>
> Key: HIVE-16061
> URL: https://issues.apache.org/jira/browse/HIVE-16061
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Run a HiveServer2 instance: "hive --service hiveserver2".
> Then, from another console, connect to HiveServer2: "beeline -u 
> "jdbc:hive2://localhost:1"
> When you run an MR job like "select t1.key from src t1 join src t2 on 
> t1.key=t2.key", some of the console logs, like MR job info, are not printed 
> to the Beeline console; they are only printed to the HiveServer2 console.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16064) Allow ALL set quantifier with aggregate functions

2017-02-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-16064:
--


> Allow ALL set quantifier with aggregate functions
> -
>
> Key: HIVE-16064
> URL: https://issues.apache.org/jira/browse/HIVE-16064
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> SQL:2011  allows  ALL with aggregate functions which is 
> equivalent to aggregate function without ALL.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16063) instead of explicitly specifying mmWriteId during compilation phase, it should only be generated whenever needed during runtime

2017-02-28 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889200#comment-15889200
 ] 

Wei Zheng commented on HIVE-16063:
--

We will open a transaction for all MM table write operations. Once we open a 
transaction in the Driver, the txnId should be available across the session.

> instead of explicitly specifying mmWriteId during compilation phase, it 
> should only be generated whenever needed during runtime
> ---
>
> Key: HIVE-16063
> URL: https://issues.apache.org/jira/browse/HIVE-16063
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> For ACID transaction logic to work with MM tables, the first thing is to make 
> the ID usage logic consistent. ACID stores the valid txn list in VALID_TXNS_KEY.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16063) instead of explicitly specifying mmWriteId during compilation phase, it should only be generated whenever needed during runtime

2017-02-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889178#comment-15889178
 ] 

Sergey Shelukhin commented on HIVE-16063:
-

By "generated" do you mean retrieved from config, like the txn ID? Execution 
cannot talk to the metastore.

> instead of explicitly specifying mmWriteId during compilation phase, it 
> should only be generated whenever needed during runtime
> ---
>
> Key: HIVE-16063
> URL: https://issues.apache.org/jira/browse/HIVE-16063
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> For ACID transaction logic to work with MM tables, the first thing is to make 
> the ID usage logic consistent. ACID stores the valid txn list in VALID_TXNS_KEY.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15920) Implement a blocking version of a command to compact

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15920:
--
Description: 
Currently
{noformat}
alter table AcidTable compact 'major'
{noformat} 
is supported, which enqueues a message to compact.

Would be nice for testing and script building to support 
{noformat} 
alter table AcidTable compact 'major' blocking
{noformat} 
Perhaps another variation is to block until either compaction is done or until 
cleaning is finished.

DDLTask.compact() gets a request id back, so it can then just block and wait for 
it using some new API.

It may also be useful to let users compact all partitions, but only if a separate 
queue has been set up for compaction jobs.
The latter is because with a 1M-partition table this may create very many jobs 
and saturate the cluster.
This probably requires HIVE-12376 to make sure the compaction queue does the 
throttling, not the number of worker threads.


  was:
currently 
{noformat}
alter table AcidTable compact 'major'
{noformat} 
is supported which enqueues a msg to compact.

Would be nice for testing and script building to support 
{noformat} 
alter table AcidTable compact 'major' blocking
{noformat} 
perhaps another variation is to block until either compaction is done or until 
cleaning is finished.


DDLTask.compact() gets a request id back so it can then just block and wait for 
it using some new API



> Implement a blocking version of a command to compact
> 
>
> Key: HIVE-15920
> URL: https://issues.apache.org/jira/browse/HIVE-15920
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> currently 
> {noformat}
> alter table AcidTable compact 'major'
> {noformat} 
> is supported, which enqueues a message to compact.
> Would be nice for testing and script building to support 
> {noformat} 
> alter table AcidTable compact 'major' blocking
> {noformat} 
> Perhaps another variation is to block until either compaction is done or 
> until cleaning is finished.
> DDLTask.compact() gets a request id back, so it can then just block and wait 
> for it using some new API.
> It may also be useful to let users compact all partitions, but only if a 
> separate queue has been set up for compaction jobs.
> The latter is because with a 1M-partition table this may create very many 
> jobs and saturate the cluster.
> This probably requires HIVE-12376 to make sure the compaction queue does the 
> throttling, not the number of worker threads.
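The blocking behavior proposed above could be a simple poll loop over the compaction request's state. A sketch under assumed names (CompactionClient and State are illustrative stand-ins, not Hive APIs):

```java
import java.util.Iterator;
import java.util.List;

public class BlockingCompaction {
  enum State { INITIATED, WORKING, READY_FOR_CLEANING, SUCCEEDED, FAILED }

  // Stand-in for whatever "new API" DDLTask.compact() would call with the
  // request id it gets back; the name and shape are illustrative only.
  interface CompactionClient {
    State showCompactionState(long requestId);
  }

  // Block until the compaction identified by requestId reaches a terminal
  // state (or the timeout elapses), polling the state periodically.
  static State waitForCompaction(CompactionClient client, long requestId,
                                 long timeoutMs, long pollMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      State s = client.showCompactionState(requestId);
      if (s == State.SUCCEEDED || s == State.FAILED) return s;
      if (System.currentTimeMillis() >= deadline) {
        throw new IllegalStateException("timed out waiting for compaction " + requestId);
      }
      Thread.sleep(pollMs);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Fake client that walks through states to exercise the loop.
    Iterator<State> states =
        List.of(State.INITIATED, State.WORKING, State.SUCCEEDED).iterator();
    State result = waitForCompaction(id -> states.next(), 1L, 1000L, 1L);
    System.out.println(result); // prints SUCCEEDED
  }
}
```

The "block until cleaning is finished" variant would simply treat READY_FOR_CLEANING as non-terminal, as the loop above already does.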



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15934) Downgrade Maven surefire plugin from 2.19.1 to 2.18.1

2017-02-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889162#comment-15889162
 ] 

ASF GitHub Bot commented on HIVE-15934:
---

Github user weiatwork closed the pull request at:

https://github.com/apache/hive/pull/152


> Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
> -
>
> Key: HIVE-15934
> URL: https://issues.apache.org/jira/browse/HIVE-15934
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0
>
> Attachments: HIVE-15934.1.patch
>
>
> Surefire 2.19.1 has an issue 
> (https://issues.apache.org/jira/browse/SUREFIRE-1255) which causes debugging 
> sessions to abort after a short period of time. Many IntelliJ users have seen 
> this, although it looks fine for Eclipse users. Version 2.18.1 works fine.
> We'd better make the change so as not to impact development for IntelliJ 
> users. We can upgrade again once the root cause is figured out.
> cc [~kgyrtkirk] [~ashutoshc]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14879) integrate MM tables into ACID: replace MM metastore calls and structures with ACID ones

2017-02-28 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14879:
-
Attachment: HIVE-14879.patch

The attached patch is preliminary, and it depends on HIVE-16063.

> integrate MM tables into ACID: replace MM metastore calls and structures with 
> ACID ones
> ---
>
> Key: HIVE-14879
> URL: https://issues.apache.org/jira/browse/HIVE-14879
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Attachments: HIVE-14879.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16063) instead of explicitly specifying mmWriteId during compilation phase, it should only be generated whenever needed during runtime

2017-02-28 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-16063:



> instead of explicitly specifying mmWriteId during compilation phase, it 
> should only be generated whenever needed during runtime
> ---
>
> Key: HIVE-16063
> URL: https://issues.apache.org/jira/browse/HIVE-16063
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> For ACID transaction logic to work with MM tables, the first thing is to make 
> the ID usage logic consistent. ACID stores the valid txn list in VALID_TXNS_KEY.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2017-02-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889138#comment-15889138
 ] 

Eugene Koifman commented on HIVE-13335:
---

On second thought, it's better to keep this as a safeguard against an extremely 
large number of Hive transactions to abort (and to keep the DB undo log 
reasonable), but to increase the batch size so that we run fewer queries.
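The trade-off can be made concrete: aborting N timed-out transactions in batches of B issues roughly N/B statements, so a larger batch means fewer round trips while each statement still touches a bounded number of rows. A generic sketch of the partitioning (not Hive's actual TxnHandler code):

```java
import java.util.ArrayList;
import java.util.List;

// Generic batching helper illustrating the HIVE-13335 discussion: splitting
// a list of txn ids into fixed-size batches yields ceil(N/B) queries instead
// of N single-row aborts, while still bounding the per-statement undo log.
public class TxnAbortBatching {
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < items.size(); i += batchSize) {
      batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Long> txnIds = new ArrayList<>();
    for (long i = 1; i <= 2500; i++) txnIds.add(i);
    // A batch size of 1000 turns 2500 aborts into 3 queries:
    System.out.println(partition(txnIds, 1000).size()); // prints 3
  }
}
```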

> get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
> 
>
> Key: HIVE-13335
> URL: https://issues.apache.org/jira/browse/HIVE-13335
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13335.01.patch
>
>
> look for usages - it's no longer useful; in fact it may be a perf hit.
> Made obsolete by HIVE-12439.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15844:
--
Attachment: HIVE-15844.07.patch

Patch 7 integrates comments from Matt and Prasanth.

> Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
> 
>
> Key: HIVE-15844
> URL: https://issues.apache.org/jira/browse/HIVE-15844
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.0.0
>
> Attachments: HIVE-15844.01.patch, HIVE-15844.02.patch, 
> HIVE-15844.03.patch, HIVE-15844.04.patch, HIVE-15844.05.patch, 
> HIVE-15844.06.patch, HIVE-15844.07.patch
>
>
> # Both FileSinkDesc and ReduceSinkDesc have a special code path for 
> Update/Delete operations. It is not always set correctly for ReduceSink; 
> ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't 
> set correctly, elsewhere we set ROW_ID to be the partition column of the 
> ReduceSinkOperator, and UDFToInteger special-cases it to extract the bucketId 
> from ROW_ID. We need to modify the Explain Plan to record the Write Type (i.e. 
> insert/update/delete) to make sure we have tests that can catch errors here.
> # Add some validation at the end of the plan to make sure that an RSO/FSO 
> which represents the end of the pipeline and writes to an acid table has its 
> WriteType set (to something other than the default).
> # We don't seem to have any tests where the number of buckets is greater than 
> the number of reducers. Add those.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16059) Addendum to HIVE-15879

2017-02-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889118#comment-15889118
 ] 

Sergio Peña commented on HIVE-16059:


Is this mock going to be available in another patch? I don't see any 
constructor that accepts a different Queue implementation besides 
ConcurrentLinkedQueue.

> Addendum to HIVE-15879
> --
>
> Key: HIVE-16059
> URL: https://issues.apache.org/jira/browse/HIVE-16059
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Trivial
> Attachments: HIVE-16059.01.patch
>
>
> Added a minor change to address Sahil's comment on the review board for 
> HIVE-15879. It changes the type of the variable pendingPaths in the 
> PathDepthInfoCallable class from ConcurrentLinkedQueue to the 
> more generic Queue, which makes the class more testable should 
> we want to test it using a mock/custom queue implementation.
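The testability argument can be sketched: if the class depends on the Queue interface and accepts it via a constructor, a test can inject any implementation, including a mock. PathWalker below is an illustrative stand-in for PathDepthInfoCallable, not Hive's code:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Depending on the Queue interface (rather than ConcurrentLinkedQueue
// concretely) lets callers inject any implementation -- the point made
// in HIVE-16059. PathWalker is an illustrative stand-in only.
class PathWalker {
  private final Queue<String> pendingPaths;
  PathWalker(Queue<String> pendingPaths) { this.pendingPaths = pendingPaths; }
  String next() { return pendingPaths.poll(); }
}

public class QueueInjectionDemo {
  public static void main(String[] args) {
    // Production: a lock-free concurrent queue.
    PathWalker prod = new PathWalker(new ConcurrentLinkedQueue<>());
    // Test: a plain ArrayDeque (or a mock) is enough.
    Queue<String> fake = new ArrayDeque<>();
    fake.add("/warehouse/t1");
    PathWalker test = new PathWalker(fake);
    System.out.println(test.next()); // prints /warehouse/t1
  }
}
```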



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15935) ACL is not set in ATS data

2017-02-28 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-15935:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Patch pushed to master.

> ACL is not set in ATS data
> --
>
> Key: HIVE-15935
> URL: https://issues.apache.org/jira/browse/HIVE-15935
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 2.2.0
>
> Attachments: HIVE-15935.1.patch, HIVE-15935.2.patch, 
> HIVE-15935.3.patch, HIVE-15935.4.patch, HIVE-15935.5.patch, HIVE-15935.6.patch
>
>
> When publishing ATS info, Hive does not set an ACL, which makes Hive ATS 
> entries visible to all users. On the other hand, Tez ATS entries use the Tez 
> DAG ACL, which limits both the view and modify ACLs to the end user only. We 
> should make them consistent. In this Jira, I am going to limit the ACL to the 
> end user for both Tez ATS and Hive ATS, and also provide the configs 
> "hive.view.acls" and "hive.modify.acls" in case users need to override them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16060) GenericUDTFJSONTuple's json cache could overgrow beyond its limit

2017-02-28 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-16060:

Status: Patch Available  (was: Open)

> GenericUDTFJSONTuple's json cache could overgrow beyond its limit
> -
>
> Key: HIVE-16060
> URL: https://issues.apache.org/jira/browse/HIVE-16060
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-16060.1.patch, image.png
>
>
> At the moment the [cache 
> object|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java#L96]
>  used in {{GenericUDTFJSONTuple}} is a static linked hashmap that is not 
> thread-safe. In the case of HoS it may be accessed concurrently and has race 
> conditions. In particular, its size may overgrow even though the limit is 32. 
> This can be observed from the attached image.
> An easy way to fix it is to make it non-static.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13335:
--
Status: Patch Available  (was: Open)

> get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
> 
>
> Key: HIVE-13335
> URL: https://issues.apache.org/jira/browse/HIVE-13335
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13335.01.patch
>
>
> look for usages - it's no longer useful; in fact it may be a perf hit.
> Made obsolete by HIVE-12439.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16060) GenericUDTFJSONTuple's json cache could overgrow beyond its limit

2017-02-28 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-16060:

Attachment: HIVE-16060.1.patch

> GenericUDTFJSONTuple's json cache could overgrow beyond its limit
> -
>
> Key: HIVE-16060
> URL: https://issues.apache.org/jira/browse/HIVE-16060
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-16060.1.patch, image.png
>
>
> At the moment the [cache 
> object|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java#L96]
>  used in {{GenericUDTFJSONTuple}} is a static linked hashmap that is not 
> thread-safe. In the case of HoS it may be accessed concurrently and has race 
> conditions. In particular, its size may overgrow even though the limit is 32. 
> This can be observed from the attached image.
> An easy way to fix it is to make it non-static.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2017-02-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13335:
--
Attachment: HIVE-13335.01.patch

> get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
> 
>
> Key: HIVE-13335
> URL: https://issues.apache.org/jira/browse/HIVE-13335
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13335.01.patch
>
>
> look for usages - it's no longer useful; in fact it may be a perf hit.
> Made obsolete by HIVE-12439.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15210) support replication v2 for MM tables

2017-02-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15210:

Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-14535)

> support replication v2 for MM tables
> 
>
> Key: HIVE-15210
> URL: https://issues.apache.org/jira/browse/HIVE-15210
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Merging master right now; the merge is so broken (and will be broken again 
> because repl v2 seems to be in flux) that I gave up on the paths in ISA/ESA 
> that create the copy task (the MM branch allows CopyTask to support multiple 
> paths and is different from ReplCopyTask). At some point, when repl v2 is no 
> longer in flux, this needs to be merged.
> cc [~sushanth]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15265) support snapshot isolation for MM tables

2017-02-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15265:

Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-14535)

> support snapshot isolation for MM tables
> 
>
> Key: HIVE-15265
> URL: https://issues.apache.org/jira/browse/HIVE-15265
> Project: Hive
>  Issue Type: Bug
>Reporter: Wei Zheng
>
> Since MM tables use the incremental "delta" insertion mechanism via ACID, it 
> makes sense to make MM tables support snapshot isolation as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16059) Addendum to HIVE-15879

2017-02-28 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16059:
---
Description: Added a minor change to address Sahil's comment on the 
review board for HIVE-15879. It changes the type of the variable pendingPaths in 
the PathDepthInfoCallable class from ConcurrentLinkedQueue to the more 
generic Queue, which makes the class more testable should we want 
to test it using a mock/custom queue implementation.  (was: Added a minor change 
to addressed Sahil's comment on the review board for HIVE-15879)

> Addendum to HIVE-15879
> --
>
> Key: HIVE-16059
> URL: https://issues.apache.org/jira/browse/HIVE-16059
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Trivial
> Attachments: HIVE-16059.01.patch
>
>
> Added a minor change to address Sahil's comment on the review board for 
> HIVE-15879. It changes the type of the variable pendingPaths in the 
> PathDepthInfoCallable class from ConcurrentLinkedQueue to the 
> more generic Queue, which makes the class more testable should 
> we want to test it using a mock/custom queue implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15935) ACL is not set in ATS data

2017-02-28 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-15935:
--
Attachment: HIVE-15935.6.patch

Added comments. Thanks for creating the follow-up JIRAs.

> ACL is not set in ATS data
> --
>
> Key: HIVE-15935
> URL: https://issues.apache.org/jira/browse/HIVE-15935
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15935.1.patch, HIVE-15935.2.patch, 
> HIVE-15935.3.patch, HIVE-15935.4.patch, HIVE-15935.5.patch, HIVE-15935.6.patch
>
>
> When publishing ATS info, Hive does not set an ACL, which makes Hive ATS 
> entries visible to all users. On the other hand, Tez ATS entries use the Tez 
> DAG ACL, which limits both the view and modify ACLs to the end user only. We 
> should make them consistent. In this Jira, I am going to limit the ACL to the 
> end user for both Tez ATS and Hive ATS, and also provide the configs 
> "hive.view.acls" and "hive.modify.acls" in case users need to override them.
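
To illustrate the proposal, the override configs would presumably be merged 
with the end user into the comma-separated principal lists that YARN timeline 
domains expect. This is a hypothetical helper, not actual Hive code; the 
method and config handling are illustrative only:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: combine the end user with an optional
// comma-separated override (e.g. the proposed "hive.view.acls" /
// "hive.modify.acls" values) into a single deduplicated ACL string.
public class AtsAclSketch {
    static String buildAcl(String endUser, String configuredAcls) {
        Set<String> users = new LinkedHashSet<>();
        users.add(endUser); // the end user is always allowed
        if (configuredAcls != null && !configuredAcls.isEmpty()) {
            for (String u : configuredAcls.split(",")) {
                String trimmed = u.trim();
                if (!trimmed.isEmpty()) {
                    users.add(trimmed); // Set drops duplicates
                }
            }
        }
        return String.join(",", users);
    }

    public static void main(String[] args) {
        // Default: only the end user can view/modify the entries.
        System.out.println(buildAcl("alice", null));         // alice
        // With the override set, extra principals are added once.
        System.out.println(buildAcl("alice", "bob, alice")); // alice,bob
    }
}
```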



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888977#comment-15888977
 ] 

Prasanth Jayachandran commented on HIVE-15844:
--

Mostly looks good to me. +1 for the non-vectorization changes.

> Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
> 
>
> Key: HIVE-15844
> URL: https://issues.apache.org/jira/browse/HIVE-15844
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.0.0
>
> Attachments: HIVE-15844.01.patch, HIVE-15844.02.patch, 
> HIVE-15844.03.patch, HIVE-15844.04.patch, HIVE-15844.05.patch, 
> HIVE-15844.06.patch
>
>
> # Both FileSinkDesc and ReduceSinkDesc have a special code path for 
> Update/Delete operations. It is not always set correctly for ReduceSink; 
> ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't 
> set correctly, elsewhere we set ROW_ID to be the partition column of the 
> ReduceSinkOperator, and UDFToInteger special-cases it to extract the bucketId 
> from ROW_ID. We need to modify the Explain Plan to record the Write Type (i.e. 
> insert/update/delete) to make sure we have tests that can catch errors here.
> # Add some validation at the end of the plan to make sure that RSO/FSO, which 
> represent the end of the pipeline and write to an ACID table, have WriteType 
> set (to something other than the default).
> # We don't seem to have any tests where the number of buckets is > the number 
> of reducers. Add those.
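
The proposed end-of-plan validation could be sketched as follows. The names 
here (SinkDesc, validate) are illustrative only, not Hive's actual classes; 
the point is simply that any sink writing to an ACID table must carry an 
explicit write type rather than the default:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed validation: every terminal sink
// operator that writes to an ACID table must have an explicit WriteType,
// never the NOT_ACID default.
public class WriteTypeCheckSketch {
    enum WriteType { NOT_ACID, INSERT, UPDATE, DELETE }

    static class SinkDesc {
        final String name;
        final boolean writesToAcidTable;
        final WriteType writeType;

        SinkDesc(String name, boolean writesToAcidTable, WriteType writeType) {
            this.name = name;
            this.writesToAcidTable = writesToAcidTable;
            this.writeType = writeType;
        }
    }

    // Walk the terminal sinks of a plan and fail fast on a missing WriteType.
    static void validate(List<SinkDesc> sinks) {
        for (SinkDesc sink : sinks) {
            if (sink.writesToAcidTable && sink.writeType == WriteType.NOT_ACID) {
                throw new IllegalStateException(
                    "Sink " + sink.name + " writes to an ACID table but has no WriteType");
            }
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList(
            new SinkDesc("FS_5", true, WriteType.UPDATE),    // ok: explicit type
            new SinkDesc("RS_3", false, WriteType.NOT_ACID)  // ok: not an ACID write
        ));
        System.out.println("plan valid");
    }
}
```

A q-file test would then catch a plan where dedup dropped the write type, 
because the validation throws instead of silently producing a wrong plan.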



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

