[jira] [Commented] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401652#comment-15401652 ] zhihai xu commented on HIVE-14303: -- Thanks for finding this issue. It was my fault; I missed these test failures. I just found out that checkAndGenObject may be called from a derived class's closeOp, such as CommonMergeJoinOperator and SMBMapJoinOperator. Because the state is changed before closeOp is called, HIVE-14303.0.patch would cause checkAndGenObject to return wrongly from {{CommonMergeJoinOperator.joinFinalLeftData}} and {{SMBMapJoinOperator.joinFinalLeftData}}. Since the contradiction is between {{CommonJoinOperator.checkAndGenObject}} and {{CommonJoinOperator.closeOp}}, we shouldn't depend on {{state}}, which is changed outside CommonJoinOperator. We can do the same thing as {{closeCalled}} in class {{SMBMapJoinOperator}}: use a variable to prevent {{CommonJoinOperator.checkAndGenObject}} from being called after {{CommonJoinOperator.closeOp}} has been called. I attached a new patch, HIVE-14303.1.patch, which uses a private variable {{CommonJoinOperator.closeOpCalled}} to protect checkAndGenObject from the NPE. > CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to > avoid NPE if ExecReducer.close is called twice. > - > > Key: HIVE-14303 > URL: https://issues.apache.org/jira/browse/HIVE-14303 > Project: Hive > Issue Type: Bug >Reporter: zhihai xu >Assignee: zhihai xu > Fix For: 2.2.0 > > Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch > > > CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to > avoid NPE if ExecReducer.close is called twice. ExecReducer.close implements > the Closeable interface and can be called multiple times. We saw > the following NPE, which hid the real exception, due to this bug. 
> {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: null > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) > at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) > at > org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) > ... 8 more > {code} > The code from ReduceTask.runOldReducer: > {code} > reducer.close(); //line 453 > reducer = null; > > out.close(reporter); > out = null; > } finally { > IOUtils.cleanup(LOG, reducer);// line 459 > closeQuietly(out, reporter); > } > {code} > Based on the above stack trace and code, reducer.close() is called twice > because the exception happened when reducer.close() is called for the first > time at line 453, the code exit before reducer was set to null. > NullPointerException is triggered when reducer.close() is called for the > second time in IOUtils.cleanup at line 459. NullPointerException hide the > real exception which happened when reducer.close() is called for the first > time at line 453. 
> The reason for NPE is: > The first reducer.close called CommonJoinOperator.closeOp which clear > {{storage}} > {code} > Arrays.fill(storage, null); > {code} > the second reduce.close generated NPE due to null {{storage[alias]}} which is > set to null by first reducer.close. > The following reducer log can give more proof: > {code} > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] >
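The guard proposed in the comment above can be sketched as a small idempotent-close pattern. This is an illustrative model only, not the actual HIVE-14303.1.patch: the class is hypothetical, and only the field name closeOpCalled and the Arrays.fill cleanup mirror the discussion.

```java
// Hypothetical sketch of the fix discussed above: a closeOpCalled flag,
// modeled on SMBMapJoinOperator's closeCalled, makes checkAndGenObject a
// no-op once closeOp has run, instead of dereferencing the nulled storage.
public class JoinCloseGuard {
    private boolean closeOpCalled = false;
    private Object[] storage = new Object[] { new Object() };
    private int generated = 0;

    // Stands in for CommonJoinOperator.checkAndGenObject: it dereferences
    // storage, so calling it after closeOp without the guard would NPE.
    public void checkAndGenObject() {
        if (closeOpCalled) {
            return; // second ExecReducer.close(): skip instead of NPE
        }
        storage[0].toString(); // would throw NPE after closeOp without the guard
        generated++;
    }

    // Stands in for closeOp: clears storage, so later calls must observe
    // the flag rather than the nulled array.
    public void closeOp() {
        closeOpCalled = true;
        java.util.Arrays.fill(storage, null);
    }

    public int getGenerated() { return generated; }
}
```

With the flag in place, a redundant second close path reaches checkAndGenObject and returns harmlessly, which is the behavior the patch is after.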
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Attachment: HIVE-14303.1.patch
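The exception masking described in this thread's quoted ReduceTask.runOldReducer snippet can be reproduced in miniature. All names below are hypothetical stand-ins, not Hive or Hadoop code; the point is that an exception thrown while closing again in a finally block replaces the original failure, which is exactly how the NPE hid the real error.

```java
// Minimal reproduction of the double-close pattern: close() throws before
// `reducer` is nulled, so the finally-style cleanup closes it again, and
// the second failure (an NPE) is what the caller observes.
public class DoubleCloseDemo {
    static class Reducer {
        Object[] storage = new Object[] { new Object() };

        void close() {
            Object o = storage[0];
            storage[0] = null;                // closeOp-style cleanup
            if (o == null) {
                throw new NullPointerException(); // what the 2nd close hits
            }
            throw new RuntimeException("root cause"); // the real error
        }
    }

    // Returns the simple name of the exception the caller actually sees.
    public static String observedFailure() {
        Reducer reducer = new Reducer();
        try {
            try {
                reducer.close();  // first close: throws the root cause
                reducer = null;   // skipped, so cleanup sees a non-null reducer
            } finally {
                if (reducer != null) {
                    reducer.close(); // second close: NPE replaces root cause
                }
            }
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
        return "none";
    }
}
```

Because an exception raised in a finally block supersedes the one in flight, the task log reports only the NullPointerException, matching the stack traces above.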
[jira] [Commented] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401653#comment-15401653 ] Rajat Khandelwal commented on HIVE-14357: - +1. Changes look good. One question: do these tests fail in the 2.1 release too? The commit seems to be a part of 2.1.0-rc3. Secondly, can we merge this to branch-2.1 soon? I'm somewhat blocked on this. Thanks. > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R, but upon closer inspection it seems to be R, S, T. > I'm not very familiar with what these locks are or why the order is > important. Raising this jira while I try to understand it all. Meanwhile, > if somebody can explain here, it would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401656#comment-15401656 ] zhihai xu commented on HIVE-14303: -- Attached a stack trace which proves that checkAndGenObject is called from a derived class's (SMBMapJoinOperator's) closeOp: {code} at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:686) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinObject(SMBMapJoinOperator.java:414) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinOneGroup(SMBMapJoinOperator.java:383) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinFinalLeftData(SMBMapJoinOperator.java:357) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.closeOp(SMBMapJoinOperator.java:625) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:683) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) [hadoop-mapreduce-client-common-2.6.1.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [?:1.7.0_79] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_79] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_79] at java.lang.Thread.run(Thread.java:745) [?:1.7.0_79] {code}
[jira] [Comment Edited] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401653#comment-15401653 ] Rajat Khandelwal edited comment on HIVE-14357 at 8/1/16 7:38 AM: - +1. Changes look good. One question: Do these tests fail in 2.1 release too? The commit seems to be a part of 2.1.0-rc3. If yes, it makes sense to get 2.1.1 out soon. cc [~jcamachorodriguez] Secondly, can we merge this to branch-2.1 soon? I'm kind of blocked on this for something, Thanks. was (Author: prongs): +1. Changes look good. One question: Do these tests fail in 2.1 release too? The commit seems to be a part of 2.1.0-rc3. Secondly, can we merge this to branch-2.1 soon? I'm kind of blocked on this for something, Thanks.
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Summary: CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice. (was: CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.)
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Description: CommonJoinOperator.checkAndGenObject should return directly (after {{CommonJoinOperator.closeOp}} was called) to avoid NPE if ExecReducer.close is called twice. ExecReducer.close implements the Closeable interface and can be called multiple times. We saw the following NPE, which hid the real exception, due to this bug. {code} Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) ... 8 more {code} The code from ReduceTask.runOldReducer: {code} reducer.close(); //line 453 reducer = null; out.close(reporter); out = null; } finally { IOUtils.cleanup(LOG, reducer);// line 459 closeQuietly(out, reporter); } {code} Based on the above stack trace and code, reducer.close() is called twice: the exception happened the first time reducer.close() was called, at line 453, so the code exited before reducer was set to null.
The NullPointerException is triggered when reducer.close() is called a second time, from IOUtils.cleanup at line 459, and this NullPointerException hides the real exception thrown by the first call at line 453. The reason for the NPE: the first reducer.close called CommonJoinOperator.closeOp, which clears {{storage}}: {code} Arrays.fill(storage, null); {code} The second reducer.close then hit the NPE because {{storage[alias]}} had been set to null by the first reducer.close. The following reducer log gives more proof: {code} 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing... 
2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[4]: records written - 53466 2016-07-14 22:25:11,555 ERROR [main] ExecReducer: Hit error while closing operators - failing tree 2016-07-14 22:25:11,649 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) ... 8 more {code} was: CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if E
[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0
[ https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14367: Component/s: Statistics > Estimated size for constant nulls is 0 > -- > > Key: HIVE-14367 > URL: https://issues.apache.org/jira/browse/HIVE-14367 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Affects Versions: 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, > HIVE-14367.2.patch, HIVE-14367.3.patch, HIVE-14367.4.patch > > > since type is incorrectly assumed as void. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0
[ https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14367: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master.
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Status: Patch Available (was: Open) > Data size may be estimated as 0 if no columns are being projected after an > operator > --- > > Key: HIVE-14378 > URL: https://issues.apache.org/jira/browse/HIVE-14378 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, HIVE-14378.patch > > > In those cases we still emit rows, but they may not have any columns within > them. We shouldn't estimate 0 data size in such cases.
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Attachment: HIVE-14378.3.patch
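As a hedged sketch of the fix direction HIVE-14378 describes: when no columns are projected, a purely per-column estimate collapses to 0 even though rows are still emitted, so the estimate can be floored by a per-row overhead. The class, method, and overhead constant below are illustrative assumptions, not Hive's actual statistics code.

```java
// Illustrative data-size estimator: never report 0 bytes for a non-empty
// row stream, even when the projected columns contribute 0 bytes.
public class DataSizeEstimate {
    // Assumed minimal per-row cost; the real constant would be tuned.
    static final long ROW_OVERHEAD_BYTES = 8;

    public static long estimate(long numRows, long totalColumnBytes) {
        if (numRows <= 0) {
            return 0; // genuinely empty output
        }
        // At least one overhead unit per emitted row, even with 0 columns.
        return Math.max(totalColumnBytes, numRows * ROW_OVERHEAD_BYTES);
    }
}
```

The floor keeps downstream planning decisions (e.g. join-side selection) from treating a populated row stream as free.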
[jira] [Updated] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Status: Patch Available (was: Open) > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues, q1 and > q2, with a default expiry interval of 5 mins. > After 5 mins of non-usage, the sessions corresponding to queues q1 and q2 will > be expired. When a new set of queries is issued after this expiry, the default > sessions backed by q1 and q2 are reopened. Now when we run more queries, > the reopened sessions are not used; instead a new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions).
[jira] [Updated] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Attachment: HIVE-14397.1.patch [~sseth]/[~sershe] Can someone please review this patch? > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2, with a default expiry interval of 5 mins. > After 5 mins of non-usage, the sessions corresponding to queues q1 and q2 will > be expired. When a new set of queries is issued after this expiry, the default > sessions backed by q1 and q2 are reopened. Now when we run more queries, > the reopened sessions are not used; instead a new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14323) Reduce number of FS permissions and redundant FS operations
[ https://issues.apache.org/jira/browse/HIVE-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14323: Attachment: HIVE-14323.3.patch Fixed Hive.replaceFiles which caused the tests to fail. > Reduce number of FS permissions and redundant FS operations > --- > > Key: HIVE-14323 > URL: https://issues.apache.org/jira/browse/HIVE-14323 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14323.1.patch, HIVE-14323.2.patch, > HIVE-14323.3.patch > > > Some examples are given below. > 1. When creating stage directory, FileUtils sets the directory permissions by > running a set of chgrp and chmod commands. In systems like S3, this would not > be relevant. > 2. In some cases, fs.delete() is followed by fs.exists(). In this case, it > might be redundant to check for exists() (lookup ops are expensive in systems > like S3). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14323) Reduce number of FS permissions and redundant FS operations
[ https://issues.apache.org/jira/browse/HIVE-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14323: Status: Patch Available (was: Open) > Reduce number of FS permissions and redundant FS operations > --- > > Key: HIVE-14323 > URL: https://issues.apache.org/jira/browse/HIVE-14323 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14323.1.patch, HIVE-14323.2.patch, > HIVE-14323.3.patch > > > Some examples are given below. > 1. When creating stage directory, FileUtils sets the directory permissions by > running a set of chgrp and chmod commands. In systems like S3, this would not > be relevant. > 2. In some cases, fs.delete() is followed by fs.exists(). In this case, it > might be redundant to check for exists() (lookup ops are expensive in systems > like S3). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
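The delete-then-exists pattern called out in point 2 above can be avoided by trusting the return value of delete(). A minimal sketch, using java.io.File as a stand-in for Hadoop's FileSystem API (the real fix would target FileSystem, where exists() is a remote lookup on stores like S3):

```java
import java.io.File;
import java.io.IOException;

public class DeleteWithoutExists {
    // Sketch: delete() already reports success, so an unconditional follow-up
    // exists() round trip is redundant. We only fall back to exists() when
    // delete() returns false, to distinguish "already absent" from "failed".
    static boolean deleteIfPresent(File f) {
        if (f.delete()) {
            return true;        // deleted; no exists() lookup needed
        }
        return !f.exists();     // already absent counts as success
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("hive14323", ".tmp");
        boolean first = deleteIfPresent(tmp);   // file existed: deleted
        boolean second = deleteIfPresent(tmp);  // already gone: still success
        System.out.println(first + " " + second);
    }
}
```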
[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401734#comment-15401734 ] Hive QA commented on HIVE-14346: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821178/HIVE-14346.2.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10418 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/718/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/718/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-718/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821178 - PreCommit-HIVE-MASTER-Build > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevents the new configurations from being used > (see comments in that JIRA for more details). > This issue proposes changing the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If users want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
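The proposed fallback semantics can be sketched as follows. This is a hypothetical illustration of the resolution order, not Hive's actual configuration code; hive.strict.checks.cartesian.product is used as an example of one of the new fine-grained settings:

```java
import java.util.HashMap;
import java.util.Map;

public class StrictModeResolution {
    // Sketch of the idea: the legacy hive.mapred.mode, when explicitly set,
    // wins; only when it is null do the new fine-grained checks take effect.
    // With the old default of 'nonstrict', the null branch was unreachable.
    static boolean strictCheckEnabled(Map<String, String> conf, String check) {
        String legacy = conf.get("hive.mapred.mode");
        if (legacy != null) {
            return "strict".equals(legacy);  // legacy setting overrides
        }
        return Boolean.parseBoolean(conf.getOrDefault(check, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.strict.checks.cartesian.product", "true");
        // hive.mapred.mode unset (null): fine-grained check applies
        System.out.println(strictCheckEnabled(conf, "hive.strict.checks.cartesian.product"));
        // legacy value set: it overrides the fine-grained check
        conf.put("hive.mapred.mode", "nonstrict");
        System.out.println(strictCheckEnabled(conf, "hive.strict.checks.cartesian.product"));
    }
}
```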
[jira] [Commented] (HIVE-12954) NPE with str_to_map on null strings
[ https://issues.apache.org/jira/browse/HIVE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401759#comment-15401759 ] Marta Kuczora commented on HIVE-12954: -- The failing tests are not related to this patch. > NPE with str_to_map on null strings > --- > > Key: HIVE-12954 > URL: https://issues.apache.org/jira/browse/HIVE-12954 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0 >Reporter: Charles Pritchard >Assignee: Marta Kuczora > Attachments: HIVE-12954.2.patch, HIVE-12954.patch > > > Running str_to_map on a null string will return a NullPointerException. > Workaround is to use coalesce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
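The null guard at the heart of the fix can be sketched like this. This is not Hive's GenericUDFStringToMap code; it is a simplified stand-in showing where the NPE arises and how returning null (mirroring the coalesce workaround) avoids it:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeStrToMap {
    // Sketch of str_to_map semantics with an explicit null guard; without it,
    // calling split() on a null input throws NullPointerException.
    static Map<String, String> strToMap(String text, String delim1, String delim2) {
        if (text == null) {
            return null;  // the guard: propagate null instead of crashing
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : text.split(delim1)) {
            String[] kv = pair.split(delim2, 2);
            result.put(kv[0], kv.length > 1 ? kv[1] : null);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(strToMap("a:1,b:2", ",", ":"));
        System.out.println(strToMap(null, ",", ":"));  // no NPE
    }
}
```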
[jira] [Updated] (HIVE-14123) Add beeline configuration option to show database in the prompt
[ https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14123: -- Release Note: New BeeLine Command option --showDbInPrompt to display the current database name in prompt > Add beeline configuration option to show database in the prompt > --- > > Key: HIVE-14123 > URL: https://issues.apache.org/jira/browse/HIVE-14123 > Project: Hive > Issue Type: Improvement > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14123.10.patch, HIVE-14123.2.patch, > HIVE-14123.3.patch, HIVE-14123.4.patch, HIVE-14123.5.patch, > HIVE-14123.6.patch, HIVE-14123.7.patch, HIVE-14123.8.patch, > HIVE-14123.9.patch, HIVE-14123.patch > > > There are several jira issues complaining that, the Beeline does not respect > hive.cli.print.current.db. > This is partially true, since in embedded mode, it uses the > hive.cli.print.current.db to change the prompt, since HIVE-10511. > In beeline mode, I think this function should use a beeline command line > option instead, like for the showHeader option emphasizing, that this is a > client side option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401761#comment-15401761 ] Chinna Rao Lalam commented on HIVE-12077: - Committed to master. > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, > HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
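The batching idea behind the fix can be sketched as follows; the method names are illustrative, not the metastore's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedRepair {
    // Sketch: add untracked partitions in fixed-size chunks instead of one
    // bulk call, so HMS never materializes the whole collection at once.
    static int addPartitionsInBatches(List<String> partitions, int batchSize) {
        int calls = 0;
        for (int i = 0; i < partitions.size(); i += batchSize) {
            List<String> batch =
                partitions.subList(i, Math.min(i + batchSize, partitions.size()));
            // a call like metastore.add_partitions(batch) would go here
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < 1050; i++) {
            parts.add("ds=" + i);
        }
        // 1050 partitions in batches of 100 -> 11 metastore calls
        System.out.println(addPartitionsInBatches(parts, 100));
    }
}
```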
[jira] [Commented] (HIVE-14123) Add beeline configuration option to show database in the prompt
[ https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401763#comment-15401763 ] Peter Vary commented on HIVE-14123: --- [~leftylev] Could you please check my modifications? - Wiki - [Beeline Command Options|https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions] - Jira - Release notes of this Jira are set Do I need to do anything more, or is everything OK now so that I can remove the TODOC## label as well? Thanks, Peter > Add beeline configuration option to show database in the prompt > --- > > Key: HIVE-14123 > URL: https://issues.apache.org/jira/browse/HIVE-14123 > Project: Hive > Issue Type: Improvement > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14123.10.patch, HIVE-14123.2.patch, > HIVE-14123.3.patch, HIVE-14123.4.patch, HIVE-14123.5.patch, > HIVE-14123.6.patch, HIVE-14123.7.patch, HIVE-14123.8.patch, > HIVE-14123.9.patch, HIVE-14123.patch > > > There are several JIRA issues complaining that Beeline does not respect > hive.cli.print.current.db. > This is partially true: in embedded mode it has used > hive.cli.print.current.db to change the prompt since HIVE-10511. > In beeline mode, I think this function should use a Beeline command line > option instead, like the showHeader option, emphasizing that this is a > client-side option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-12077: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Fix For: 2.2.0 > > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, > HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Fix Version/s: 1.1.0 Status: Patch Available (was: Open) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
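The qualification bug can be sketched in a few lines. This is an illustrative stand-in, not Hive's actual table-name handling code:

```java
public class QualifiedTableName {
    // Sketch of the bug: blindly prefixing the current database to a name
    // that is already db-qualified yields an invalid three-part identifier
    // like default.test.a. The guard below checks for an existing qualifier.
    static String qualify(String currentDb, String tableName) {
        if (tableName.contains(".")) {
            return tableName;               // already db-qualified: keep as-is
        }
        return currentDb + "." + tableName; // unqualified: prefix current db
    }

    public static void main(String[] args) {
        System.out.println(qualify("default", "a"));
        System.out.println(qualify("default", "test.a"));  // not default.test.a
    }
}
```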
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Status: Open (was: Patch Available) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: HIVE-14398.1.patch > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Status: Patch Available (was: Open) Please take a look, thanks. > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: (was: HIVE-14398.1.patch) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: HIVE-14398.1.patch Please take a look, thanks. > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Comment: was deleted (was: please take a look,thanks) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401797#comment-15401797 ] Peter Vary commented on HIVE-14374: --- Hi, I have collected the configuration variables: {panel:title=Wiki} -u, -r, -n, -p, -d, -e, -f, -w --password-file, --hiveconf, --hivevar, --color, --showHeader, --headerInterval, --fastConnect, --autoCommit, --verbose, --showWarnings, --showDbInPrompt, --showNestedErrs, --numberFormat, --force, --maxWidth, --maxColumnWidth, --silent, --autosave, --outputformat, --truncateTable, --delimiterForDSV, --isolation, --nullemptystring, --incremental, --help {panel} {panel:title=Help text - beeline mode} -u, -r, -n, -p, -d, -i, -e, -f, -w --password-file, --hiveconf, --hivevar, --property-file, --color, --showHeader, --headerInterval, --fastConnect, --autoCommit, --verbose, --showWarnings, --showDbInPrompt, --showNestedErrs, --numberFormat, --force, --maxWidth, --maxColumnWidth, --silent, --autosave, --outputformat, --incremental, --truncateTable, --delimiterForDSV, --isolation, --nullemptystring, --addlocaldriverjar, --addlocaldrivername, --showConnectedUrl, --help {panel} {panel:title=Help text - compatibility mode} Generated from code, so the same as below {panel} {panel:title=Code - beeline compatibility mode} -database, -e, -f, -i, --hiveconf, --hivevar, -d --define, -S| --silent, -v| --verbose, -H| --help {panel} {panel:title=Code - beeline beeline mode} -d, -u, -r, -n, -p, -w --password-file, -a, -i, -e, -f, -help, --hivevar, --hiveconf, --property-file + all of the configuration file options {panel} {panel:title=Configuration file - beeline beeline mode} headerinterval, fastconnect, incremental, outputformat, autosave, entirelineascommand, authtype, delimiterfordsv, force, initfiles, showconnectedurl, maxheight, maxcolumnwidth, numberformat, timeout, showelapsedtime, verbose, showwarnings, hivevariables, lastconnectedurl, truncatetable, isolation, 
nullemptystring, trimscripts, showdbinprompt, scriptfile, color, shownestederrs, showheader, autocommit, hiveconfvariables, historyfile {panel} > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. > I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. 
> One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public void getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOpt
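The annotation-driven exposure proposed above can be sketched with plain reflection. The annotation and option names mirror the proposal, but the lookup logic is an assumption of mine, not BeeLine's implementation:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

public class AnnotatedOpts {
    // Sketch of the proposal: only setters carrying CommandLineOption are
    // reachable from the command line; everything else stays internal.
    @Retention(RetentionPolicy.RUNTIME)
    @interface CommandLineOption { String value(); }

    static class Opts {
        boolean autosave;
        String lastConnectedUrl;

        @CommandLineOption("automatically save preferences")
        public void setAutosave(boolean autosave) { this.autosave = autosave; }

        // no annotation: not settable from the command line
        public void setLastConnectedUrl(String url) { this.lastConnectedUrl = url; }
    }

    // Resolve "--option" to its setter and check for the annotation.
    static boolean settableFromCommandLine(String option) {
        String setter = "set" + Character.toUpperCase(option.charAt(0))
                + option.substring(1);
        for (Method m : Opts.class.getMethods()) {
            if (m.getName().equals(setter)) {
                return m.isAnnotationPresent(CommandLineOption.class);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(settableFromCommandLine("autosave"));
        System.out.println(settableFromCommandLine("lastConnectedUrl"));
    }
}
```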
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401860#comment-15401860 ] Hive QA commented on HIVE-14392: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821183/HIVE-14392.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10418 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby2_map_skew_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby2_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby3_map_skew_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby3_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby_grouping_sets7 org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/719/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/719/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-719/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase 
Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12821183 - PreCommit-HIVE-MASTER-Build > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14128) Parallelize jobClose phases
[ https://issues.apache.org/jira/browse/HIVE-14128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401900#comment-15401900 ] Rajesh Balamohan commented on HIVE-14128: - [~ashutoshc] - In the non-partitioned case, there can be multiple part files within the temp directory. Moving this in HDFS would be simple, but in some file systems like S3 it would still turn out to be expensive. E.g., lineitem is a non-partitioned dataset in TPC-H. A simple insert overwrite would have the following move at the end of the job. Please note that this directory internally has 300+ part files, so the rename would turn out to be expensive here. {noformat} 2016-08-01T04:40:00,154 INFO [JobClose-Thread-0] exec.FileSinkOperator: Moving tmp dir: s3a://bucket/lineitem/.hive-staging_hive_2016-08-01_04-31-26_432_5317262787271448273-1/_tmp.-ext-1 to: s3a://bucket/lineitem/.hive-staging_hive_2016-08-01_04-31-26_432_5317262787271448273-1/-ext-1 {noformat} Should we consider a file-by-file move in such cases? > Parallelize jobClose phases > --- > > Key: HIVE-14128 > URL: https://issues.apache.org/jira/browse/HIVE-14128 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14128.1.patch, HIVE-14128.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
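The file-by-file move suggested above can be sketched with a small thread pool. java.nio.file stands in for the Hadoop FileSystem API here; the real change would presumably issue a FileSystem.rename per part file, which on S3 avoids one huge serial directory rename:

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

public class ParallelMove {
    // Sketch: rename each part file concurrently instead of one directory
    // rename. On S3 a rename is a copy+delete round trip, so spreading the
    // 300+ part-file moves over a pool hides much of that latency.
    static void moveAll(Path srcDir, Path dstDir, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(srcDir)) {
            for (Path part : parts) {
                Path target = dstDir.resolve(part.getFileName());
                pool.submit(() -> Files.move(part, target,
                        StandardCopyOption.REPLACE_EXISTING));
            }
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempDirectory("tmp-ext");
        Path dst = Files.createTempDirectory("ext");
        for (int i = 0; i < 4; i++) {
            Files.createFile(src.resolve("part-" + i));
        }
        moveAll(src, dst, 2);
        try (Stream<Path> moved = Files.list(dst)) {
            System.out.println(moved.count());
        }
    }
}
```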
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401903#comment-15401903 ] Peter Vary commented on HIVE-14374: --- {panel:title=Missing wiki documentation, but existing help documentation} -a (authType) - jdbc connection authentication type -i script file for initialization --property-file= the file to read connection properties (url, driver, user, password) from --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side --addlocaldrivername=DRIVERNAME Add drivier name needs to be supported in the beeline client side --incremental=[true/false] When set to false, the entire result set is fetched and buffered before being displayed... --showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this beeline connected. {panel} Handling the ones above is easy: just add the documentation to the wiki; I will do this. {panel:title=Configuration file specific, and could be set by command line as well} entirelineascommand - should beeline try to split the commands on ; maxheight - set by the terminal on start, but could be overwritten by command line maxwidth - set by the terminal on start, but could be overwritten by command line timeout - not set/got anywhere - I do not know what it is about showelapsedtime - on commit, rollback, execute should beeline print the elapsed time lastconnectedurl - used by -r (reconnect) to connect to the database, but could be overwritten by command line trimscripts - should beeline trim the script lines before executing historyfile - where the history file should be saved (absolute path) {panel} Handling these is more complicated. [~leftylev] I have seen that you know about the documentation of the parameters. Do you know anything about these features? Are these planned and just the documentation is lacking, or is the possibility of setting these parameters an unintended "feature"? 
Or do I just have to dig into the code and try to guess from there? The question still stands, only we have more data now: - Do we create 2 groups of parameters, one for command line parameters and one for configuration file parameters, or do we use only one group? I think we might have only one group, used for both types of parameters, just as [~stakiar] proposed. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. > I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. 
> One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public boolean getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOption; > } > > public String getSpecificCommandLineOption() { > return specificCommandLineOption; > } > {noformat} > - Attribute could not be set > {noformat} > public static Env getEnv() { > return env; > } > public static void setEnv(Env envToUse) { > env = envToUse; > } > {noformat} > According to our previous conversations, I think you might be > interested in: > [~spena], [~vihangk1], [~aihuaxu], [~ngangam], [~ychena], [~xuefuz] > but anyone is welcome to discuss this. > What do you think about the proposed solution? > Any better ideas, or extensions? > Thanks, > Peter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
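To make the proposal above concrete, the annotation check boils down to a runtime-retention lookup on the setter methods before the reflection-based parser accepts an option. A minimal self-contained sketch (annotation, class, and method names are illustrative, not the actual BeeLine code):

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class OptsScan {

    // Hypothetical annotations mirroring the proposal (names illustrative).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface CommandLineOption { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ConfigurationFileOption { }

    // Stand-in for BeeLineOpts with a few annotated setters.
    static class Opts {
        @CommandLineOption("automatically save preferences")
        @ConfigurationFileOption
        public void setAutosave(boolean b) { }

        @ConfigurationFileOption
        public void setLastConnectedUrl(String url) { }

        public void setEnv(String env) { }  // unannotated: hidden from users
    }

    // The parser would consult these instead of accepting every setXXX it finds.
    static boolean cmdLineSettable(String setter, Class<?>... params) {
        return hasAnnotation(setter, CommandLineOption.class, params);
    }

    static boolean configFileSettable(String setter, Class<?>... params) {
        return hasAnnotation(setter, ConfigurationFileOption.class, params);
    }

    private static boolean hasAnnotation(String setter,
            Class<? extends Annotation> ann, Class<?>... params) {
        try {
            Method m = Opts.class.getMethod(setter, params);
            return m.isAnnotationPresent(ann);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(cmdLineSettable("setAutosave", boolean.class));           // true
        System.out.println(configFileSettable("setLastConnectedUrl", String.class)); // true
        System.out.println(cmdLineSettable("setLastConnectedUrl", String.class));    // false
        System.out.println(cmdLineSettable("setEnv", String.class));                 // false
    }
}
```

With this shape, an unannotated setter such as setEnv is invisible to both sources, which is exactly the "inaccessible by the user" behaviour the description asks for.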
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401952#comment-15401952 ] Peter Vary commented on HIVE-14374: --- I was trying to check the following configuration options, but I think they were never intended to work as command line parameters.
--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side
--addlocaldrivername=DRIVERNAME Add drivier name needs to be supported in the beeline client side
The following does not work:
{code}
$ ./beeline --addlocaldriverjar=pgsql.jar
{code}
I think these are commands intended to be used in an already running client, like this:
{code}
$ ./beeline
0: jdbc:hive2://localhost:1> !addlocaldriverjar pgsql.jar
{code}
If so, then I think they should be removed from the BeeLine.properties file. Am I right, [~Ferd]? Thanks, Peter > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. 
> I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. > One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public boolean getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOption; > } > > public String getSpecificCommandLineOption() { > return specificCommandLineOption; > } > {noformat} > - Attribute could not be set > {noformat} > public static Env getEnv() { > return env; > } > public static void setEnv(Env envToUse) { > env = envToUse; > } > {noformat} > According to our previous conversations, I think you might be 
interested in: > [~spena], [~vihangk1], [~aihuaxu], [~ngangam], [~ychena], [~xuefuz] > but anyone is welcome to discuss this. > What do you think about the proposed solution? > Any better ideas, or extensions? > Thanks, > Peter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401997#comment-15401997 ] Peter Vary commented on HIVE-14374: --- The showConnectedUrl parameter (HIVE-11244) is no longer used; it was lost in the merge of HIVE-11769. [~nemon] Shall we reintroduce it, or, if nobody has missed it, should we just remove the possibility altogether? > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14361) Empty method in TestClientCommandHookFactory
[ https://issues.apache.org/jira/browse/HIVE-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402001#comment-15402001 ] Peter Vary commented on HIVE-14361: --- Tests are not related. [~spena], [~aihuaxu] please commit it when you have time Thanks, Peter > Empty method in TestClientCommandHookFactory > > > Key: HIVE-14361 > URL: https://issues.apache.org/jira/browse/HIVE-14361 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Trivial > Attachments: HIVE-14361.patch > > > Remove the empty method left in TestClientCommandHookFactory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14387) Add an option to skip the table names for the column headers
[ https://issues.apache.org/jira/browse/HIVE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora reassigned HIVE-14387: Assignee: Marta Kuczora > Add an option to skip the table names for the column headers > > > Key: HIVE-14387 > URL: https://issues.apache.org/jira/browse/HIVE-14387 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Marta Kuczora >Priority: Minor > > It would be good to have an option where the beeline output could skip > reporting the . in the headers. > Eg: > {noformat} > 0: jdbc:hive2://:> select * from sample_07 limit 1; > -- > sample_07.codesample_07.description sample_07.total_emp > sample_07.salary > -- > 00- Operations 123 12345 > -- > {noformat} > b) After the option is set: > {noformat} > 0: jdbc:hive2://:> select * from sample_07 limit 1; > --- > code descriptiontotal_empsalary > --- > 00- Operations 123 12345 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
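The display-side change this asks for is essentially trimming everything up to the first dot in each header label. A minimal sketch of that step (hypothetical helper, not BeeLine's actual formatting code):

```java
public class HeaderTrim {
    // Strips a leading "tablename." qualifier from a column header, if present.
    static String stripTablePrefix(String header) {
        int dot = header.indexOf('.');
        return dot < 0 ? header : header.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(stripTablePrefix("sample_07.code")); // code
        System.out.println(stripTablePrefix("salary"));         // salary
    }
}
```

Note that a blind trim like this would mangle the rare column name that itself contains a dot, so a real implementation would probably gate it behind the proposed option.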
[jira] [Commented] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list
[ https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402019#comment-15402019 ] Hive QA commented on HIVE-14393: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821224/HIVE-14393.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10420 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/720/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/720/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-720/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821224 - PreCommit-HIVE-MASTER-Build > Tuple in list feature fails if there's only 1 tuple in the list > --- > > Key: HIVE-14393 > URL: https://issues.apache.org/jira/browse/HIVE-14393 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-14393.01.patch > > > So this works: > {code} > hive> select * from test where (x,y) in ((1,1),(2,2)); > OK > 1 1 > 2 2 > Time taken: 0.063 seconds, Fetched: 2 row(s) > {code} > And this doesn't: > {code} > hive> select * from test where (x,y) in ((1,1)); > org.antlr.runtime.EarlyExitException > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510) > {code} > If I'm generating SQL I'd like to not have to special case 1 tuple. > As a point of comparison this works in Postgres: > {code} > vagrant=# select * from test where (x, y) in ((1, 1)); > x | y > ---+--- > 1 | 1 > (1 row) > {code} > Any thoughts on this [~pxiong] ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
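Until the grammar accepts a single tuple, generated SQL has to special-case it anyway; one way a generator can work around the parser is to emit ANDed equalities for the one-tuple case (illustrative helper, not part of Hive):

```java
import java.util.List;

public class InClauseGen {
    // Builds a predicate for (cols) IN (tuples). Falls back to ANDed equality
    // when only one tuple is given, since the Hive parser in the report
    // rejects "(x,y) in ((1,1))".
    static String inClause(List<String> cols, List<List<String>> tuples) {
        if (tuples.size() == 1) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < cols.size(); i++) {
                if (i > 0) sb.append(" and ");
                sb.append(cols.get(i)).append(" = ").append(tuples.get(0).get(i));
            }
            return sb.toString();
        }
        StringBuilder sb = new StringBuilder("(")
                .append(String.join(",", cols)).append(") in (");
        for (int t = 0; t < tuples.size(); t++) {
            if (t > 0) sb.append(",");
            sb.append("(").append(String.join(",", tuples.get(t))).append(")");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(inClause(List.of("x", "y"),
                List.of(List.of("1", "1"))));
        // x = 1 and y = 1
        System.out.println(inClause(List.of("x", "y"),
                List.of(List.of("1", "1"), List.of("2", "2"))));
        // (x,y) in ((1,1),(2,2))
    }
}
```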
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402020#comment-15402020 ] Nemon Lou commented on HIVE-14374: -- [~pvary] Thanks for the reminder. If it was removed by accident, then it would be good to reintroduce it. We already use this in our production. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402024#comment-15402024 ] Peter Vary commented on HIVE-14374: --- The connected URL is always shown since HIVE-11769, which I think made it into 2.1.0. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12754) AuthTypes.NONE cause exception after HS2 start
[ https://issues.apache.org/jira/browse/HIVE-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402028#comment-15402028 ] Murshid Chalaev commented on HIVE-12754: Hi, Heng On which version of hive did you face this issue? Did you manage to solve it or find a workaround? > AuthTypes.NONE cause exception after HS2 start > -- > > Key: HIVE-12754 > URL: https://issues.apache.org/jira/browse/HIVE-12754 > Project: Hive > Issue Type: Bug >Reporter: Heng Chen > > I set {{hive.server2.authentication}} to be {{NONE}} > After HS2 start, i see exception in log below: > {code} > 2015-12-29 16:58:42,339 ERROR [HiveServer2-Handler-Pool: Thread-31]: > server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred > during processing of message. > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:328) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
4 more > {code} > IMO the problem is we use Sasl transport when authType is NONE, > {code:title=HiveAuthFactory.java} > public TTransportFactory getAuthTransFactory() throws LoginException { > TTransportFactory transportFactory; > if (authTypeStr.equalsIgnoreCase(AuthTypes.KERBEROS.getAuthName())) { > try { > transportFactory = > saslServer.createTransportFactory(getSaslProperties()); > } catch (TTransportException e) { > throw new LoginException(e.getMessage()); > } > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.NONE.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.LDAP.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.PAM.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.NOSASL.getAuthName())) { > transportFactory = new TTransportFactory(); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.CUSTOM.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else { > throw new LoginException("Unsupported authentication type " + > authTypeStr); > } > return transportFactory; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
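For reference, the chain above sends every auth type except KERBEROS and NOSASL through the plain SASL factory, so a client that skips SASL negotiation fails with the "No data or no sasl data in the stream" error even when authentication is NONE. The dispatch can be summarized as (simplified sketch, not Hive's actual classes):

```java
import java.util.Locale;

public class AuthDispatch {
    enum Transport { KERBEROS_SASL, PLAIN_SASL, RAW }

    // Mirrors the if-else chain in HiveAuthFactory.getAuthTransFactory:
    // NONE, LDAP, PAM and CUSTOM all negotiate PLAIN SASL; only NOSASL is raw.
    static Transport forAuthType(String authType) {
        switch (authType.toUpperCase(Locale.ROOT)) {
            case "KERBEROS": return Transport.KERBEROS_SASL;
            case "NOSASL":   return Transport.RAW;
            case "NONE":
            case "LDAP":
            case "PAM":
            case "CUSTOM":   return Transport.PLAIN_SASL;
            default:
                throw new IllegalArgumentException(
                        "Unsupported authentication type " + authType);
        }
    }

    public static void main(String[] args) {
        System.out.println(forAuthType("NONE"));   // PLAIN_SASL, not RAW
        System.out.println(forAuthType("NOSASL")); // RAW
    }
}
```

So a server configured with NONE still expects a SASL handshake; only NOSASL on both ends avoids it.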
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402036#comment-15402036 ] Nemon Lou commented on HIVE-14374: -- For my part, it would be fine to remove it. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402041#comment-15402041 ] Peter Vary commented on HIVE-14374: --- Thanks, I will see if anyone else needs it. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14146: -- Attachment: HIVE-14146.7.patch Regenerated the diff, since I do not know why it was not recognized. The decision on the CLI formatted comment handling is still required. > Column comments with "\n" character "corrupts" table metadata > - > > Key: HIVE-14146 > URL: https://issues.apache.org/jira/browse/HIVE-14146 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, > HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, > HIVE-14146.7.patch, HIVE-14146.patch > > > Create a table with the following (noting the \n in the COMMENT): > {noformat} > CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an > individual'); > {noformat} > Describe shows that now the metadata is messed up: > {noformat} > beeline> describe commtest; > +---++---+--+ > | col_name | data_type |comment| > +---++---+--+ > | first_nm | string | Indicates First name | > | of an individual | NULL | NULL | > +---++---+--+ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
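One way to keep a multi-line comment from splitting the describe output into extra rows is to escape control characters before display; a sketch of that idea (illustrative only, not necessarily what the attached patch does):

```java
public class CommentEscape {
    // Replaces raw newlines, carriage returns and tabs in a column comment
    // with their escaped forms so one comment stays on one output row.
    static String escapeForDisplay(String comment) {
        return comment.replace("\\", "\\\\")
                      .replace("\n", "\\n")
                      .replace("\r", "\\r")
                      .replace("\t", "\\t");
    }

    public static void main(String[] args) {
        System.out.println(escapeForDisplay("Indicates First name\nof an individual"));
        // Indicates First name\nof an individual
    }
}
```

The open question in the comment above, whether the CLI's formatted output should do the same, is exactly where such an escape step would have to be wired in.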
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402097#comment-15402097 ] Josh Elser commented on HIVE-14394: --- Closing the loop on the upstream fix (since most probably won't be watching [~sushanth]'s PR): I just merged in his change (thanks so much for catching and fixing) and released an 0.1.1 of the reporter. It should be available via Maven central now (but not available via http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22dropwizard-metrics-hadoop-metrics2-reporter%22 for a few hours). Sorry for the temporary pain! > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12954) NPE with str_to_map on null strings
[ https://issues.apache.org/jira/browse/HIVE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402163#comment-15402163 ] Aihua Xu commented on HIVE-12954: - +1. The change looks good. > NPE with str_to_map on null strings > --- > > Key: HIVE-12954 > URL: https://issues.apache.org/jira/browse/HIVE-12954 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0 >Reporter: Charles Pritchard >Assignee: Marta Kuczora > Attachments: HIVE-12954.2.patch, HIVE-12954.patch > > > Running str_to_map on a null string will throw a NullPointerException. > Workaround is to use coalesce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
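The fix amounts to a null check before splitting the input. The real change lives in Hive's str_to_map UDF implementation; this is only a standalone sketch of the null-safe semantics, with a hypothetical class and method name:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StrToMapSketch {
    // Null-safe variant of str_to_map: a null input yields null (SQL-style
    // null propagation) rather than an NPE from splitting a null string.
    public static Map<String, String> strToMap(String text, String delim1, String delim2) {
        if (text == null) {
            return null;
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : text.split(delim1)) {
            String[] kv = pair.split(delim2, 2);
            // A key without a value maps to null, matching lenient parsing.
            result.put(kv[0], kv.length > 1 ? kv[1] : null);
        }
        return result;
    }
}
```

With the guard in place, `str_to_map(NULL, ',', ':')` returns NULL and the coalesce workaround mentioned in the description is no longer needed.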
[jira] [Commented] (HIVE-14387) Add an option to skip the table names for the column headers
[ https://issues.apache.org/jira/browse/HIVE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402213#comment-15402213 ] Marta Kuczora commented on HIVE-14387: -- Found the following config parameter: hive.resultset.use.unique.column.names - "Make column names unique in the result set by qualifying column names with table alias if needed. Table alias will be added to column names for queries of type "select *" or if query explicitly uses table alias "select r1.x..". Default value: true. If this parameter is set to false, the result will contain only the column names without the table name prefix. Example:
hive.resultset.use.unique.column.names=true
{noformat}
0: jdbc:hive2://> select * from car limit 1;
OK
+------------+-----------+------------+-----------+-----------+--+
| car.carid  | car.type  | car.color  | car.lnum  | car.year  |
+------------+-----------+------------+-----------+-----------+--+
| 1000       | Audi      | red        | AAA111    | 2009      |
+------------+-----------+------------+-----------+-----------+--+
1 row selected (0.084 seconds)
{noformat}
hive.resultset.use.unique.column.names=false
{noformat}
0: jdbc:hive2://> select * from car limit 1;
OK
+--------+-------+--------+---------+-------+--+
| carid  | type  | color  | lnum    | year  |
+--------+-------+--------+---------+-------+--+
| 1000   | Audi  | red    | AAA111  | 2009  |
+--------+-------+--------+---------+-------+--+
1 row selected (0.084 seconds)
{noformat}
[~vihangk1], would this parameter be a suitable option?
> Add an option to skip the table names for the column headers
> ------------------------------------------------------------
>
> Key: HIVE-14387
> URL: https://issues.apache.org/jira/browse/HIVE-14387
> Project: Hive
> Issue Type: Improvement
> Components: Beeline
> Reporter: Vihang Karajgaonkar
> Assignee: Marta Kuczora
> Priority: Minor
>
> It would be good to have an option where the beeline output could skip reporting the . in the headers.
> Eg:
> {noformat}
> 0: jdbc:hive2://:> select * from sample_07 limit 1;
> ----------------------------------------------------------------------------
> sample_07.code  sample_07.description  sample_07.total_emp  sample_07.salary
> ----------------------------------------------------------------------------
> 00-             Operations             123                  12345
> ----------------------------------------------------------------------------
> {noformat}
> b) After the option is set:
> {noformat}
> 0: jdbc:hive2://:> select * from sample_07 limit 1;
> --------------------------------------
> code  description  total_emp  salary
> --------------------------------------
> 00-   Operations   123        12345
> --------------------------------------
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
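If a BeeLine-side option were preferred over the server-side config above, the header transformation itself is trivial; a sketch with a hypothetical helper name (the first dot is treated as the table-qualifier separator):

```java
public class HeaderUtil {
    // Drop a leading "<table>." qualifier from a result-set column label,
    // turning "sample_07.salary" into "salary". Labels without a dot are
    // returned unchanged.
    public static String stripTablePrefix(String label) {
        int dot = label.indexOf('.');
        return dot < 0 ? label : label.substring(dot + 1);
    }
}
```

BeeLine would apply this to each column label when rendering headers, gated on the proposed option.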
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402216#comment-15402216 ] Aihua Xu commented on HIVE-14146: - [~pvary] Yeah. You may need to rebase to the latest change. Do you need to generate the new baseline for your new test case?
> Column comments with "\n" character "corrupts" table metadata
> -------------------------------------------------------------
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.2.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, HIVE-14146.7.patch, HIVE-14146.patch
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402245#comment-15402245 ] Peter Vary commented on HIVE-14146: --- I rebased and ran the qtests. It seems ok now - we will see the result not too soon :) What I am not sure about is how to handle the CLI/BeeLine differences here. CLI uses formatted output, which specifically displays comments with \n in new lines, but only in column comments, and not in table comments. Probably the nicest solution would be to keep the \n-s as newlines in CLI mode in table comments as well, so at least in CLI every newline would remain a newline, and in BeeLine every newline would be a \n.
> Column comments with "\n" character "corrupts" table metadata
> -------------------------------------------------------------
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.2.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, HIVE-14146.7.patch, HIVE-14146.patch
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
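The BeeLine half of the behavior discussed above - render an embedded newline as the two characters `\` and `n` so a multi-line comment stays in one table cell - is a one-line transformation. A sketch only, with a hypothetical class name; the attached patch may do this differently:

```java
public class CommentEscaper {
    // Escape embedded newlines for tabular (BeeLine) output so a comment
    // containing '\n' occupies a single row instead of corrupting the table.
    public static String escapeForTable(String comment) {
        return comment == null ? null : comment.replace("\n", "\\n");
    }
}
```

The CLI path would skip this escaping, keeping real newlines in its formatted output as proposed.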
[jira] [Commented] (HIVE-14395) Add the missing data files to Avro union tests (HIVE-14205 addendum)
[ https://issues.apache.org/jira/browse/HIVE-14395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402264#comment-15402264 ] Hive QA commented on HIVE-14395: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821239/HIVE-14395.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10404 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-script_pipe.q-orc_ppd_schema_evol_2a.q-join1.q-and-12-more - did not produce a TEST-*.xml file TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/721/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/721/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-721/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821239 - PreCommit-HIVE-MASTER-Build > Add the missing data files to Avro union tests (HIVE-14205 addendum) > > > Key: HIVE-14395 > URL: https://issues.apache.org/jira/browse/HIVE-14395 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Trivial > Attachments: HIVE-14395.patch > > > The union_non_nullable.txt & union_nullable.txt were not checked in for > HIVE-14205. It was my mistake. > It is the reason that testCliDriver_avro_nullable_union & > testNegativeCliDriver_avro_non_nullable_union are failing in current > pre-commit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402297#comment-15402297 ] Sushanth Sowmyan commented on HIVE-14394: - Awesome! :) Thanks, I'll update the pom dep instead of this. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-14394: Status: Patch Available (was: Open) > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-14394: Attachment: HIVE-14394.2.patch Updated patch attached. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402337#comment-15402337 ] Aihua Xu commented on HIVE-14146: - I see. Sure. Seems reasonable to make such change. > Column comments with "\n" character "corrupts" table metadata > - > > Key: HIVE-14146 > URL: https://issues.apache.org/jira/browse/HIVE-14146 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, > HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, > HIVE-14146.7.patch, HIVE-14146.patch > > > Create a table with the following(noting the \n in the COMMENT): > {noformat} > CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an > individual’); > {noformat} > Describe shows that now the metadata is messed up: > {noformat} > beeline> describe commtest; > +---++---+--+ > | col_name | data_type |comment| > +---++---+--+ > | first_nm | string | Indicates First name | > | of an individual | NULL | NULL | > +---++---+--+ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402366#comment-15402366 ] Chao Sun commented on HIVE-14346: - Test failures unrelated. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevents the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If users want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
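The precedence the description implies can be sketched in a few lines. This is an illustration of the intended resolution order, not the actual HiveConf code, and the method name is hypothetical:

```java
public class StrictModeResolver {
    // If the deprecated hive.mapred.mode is explicitly set, it wins for
    // backward compatibility; otherwise the fine-grained flag decides.
    // With the old default of "nonstrict", the first branch always fired
    // and masked the new configurations - the bug described above.
    // A null default restores the fall-through.
    public static boolean strictCheckEnabled(String mapredMode, boolean fineGrainedFlag) {
        if (mapredMode != null) {
            return "strict".equals(mapredMode);
        }
        return fineGrainedFlag;
    }
}
```

Each of the three fine-grained checks would resolve independently through this pattern against its own flag.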
[jira] [Commented] (HIVE-14128) Parallelize jobClose phases
[ https://issues.apache.org/jira/browse/HIVE-14128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402391#comment-15402391 ] Ashutosh Chauhan commented on HIVE-14128: - I think approach on HIVE-14270 is better than this. So, I have abandoned this in favor of that. Once we have HIVE-14270 I think this won't be necessary. > Parallelize jobClose phases > --- > > Key: HIVE-14128 > URL: https://issues.apache.org/jira/browse/HIVE-14128 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14128.1.patch, HIVE-14128.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list
[ https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402393#comment-15402393 ] Ashutosh Chauhan commented on HIVE-14393: - +1 > Tuple in list feature fails if there's only 1 tuple in the list > --- > > Key: HIVE-14393 > URL: https://issues.apache.org/jira/browse/HIVE-14393 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-14393.01.patch > > > So this works: > {code} > hive> select * from test where (x,y) in ((1,1),(2,2)); > OK > 1 1 > 2 2 > Time taken: 0.063 seconds, Fetched: 2 row(s) > {code} > And this doesn't: > {code} > hive> select * from test where (x,y) in ((1,1)); > org.antlr.runtime.EarlyExitException > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510) > {code} > If I'm generating SQL I'd like to not have to special case 1 tuple. > As a point of comparison this works in Postgres: > {code} > vagrant=# select * from test where (x, y) in ((1, 1)); > x | y > ---+--- > 1 | 1 > (1 row) > {code} > Any thoughts on this [~pxiong] ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
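Until the grammar accepts a single tuple, SQL generators hitting this EarlyExitException can expand the one-tuple case into plain equality predicates. A workaround sketch for the generator side (hypothetical helper, not the parser fix in the attached patch):

```java
import java.util.List;

public class TupleInListGenerator {
    // Emit "(x,y) in ((1,1),(2,2))" for multiple tuples, but expand a
    // single tuple to "(x = 1 AND y = 1)" so the ANTLR limitation is
    // never hit. Values are assumed to be pre-rendered SQL literals.
    public static String inClause(List<String> cols, List<List<String>> tuples) {
        if (tuples.size() == 1) {
            StringBuilder sb = new StringBuilder("(");
            List<String> t = tuples.get(0);
            for (int i = 0; i < cols.size(); i++) {
                if (i > 0) sb.append(" AND ");
                sb.append(cols.get(i)).append(" = ").append(t.get(i));
            }
            return sb.append(")").toString();
        }
        StringBuilder sb = new StringBuilder("(")
                .append(String.join(",", cols)).append(") in (");
        for (int i = 0; i < tuples.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append("(").append(String.join(",", tuples.get(i))).append(")");
        }
        return sb.append(")").toString();
    }
}
```

Both forms are semantically equivalent, which is why the Postgres behavior quoted above is the reasonable baseline.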
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402434#comment-15402434 ] Josh Elser commented on HIVE-14394: --- I was able to build master with your v2 patch locally. LGTM. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3776) support PIVOT in hive
[ https://issues.apache.org/jira/browse/HIVE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402436#comment-15402436 ] Aditya commented on HIVE-3776: -- Is there any alternate option for this? > support PIVOT in hive > - > > Key: HIVE-3776 > URL: https://issues.apache.org/jira/browse/HIVE-3776 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > > It is a fairly well understood feature in databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7239) Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files
[ https://issues.apache.org/jira/browse/HIVE-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402460#comment-15402460 ] Illya Yalovyy commented on HIVE-7239: - [~gopalv], [~owen.omalley], [~ashutoshc], Could you please review this patch and suggest the next step to get it accepted? > Fix bug in HiveIndexedInputFormat implementation that causes incorrect query > result when input backed by Sequence/RC files > -- > > Key: HIVE-7239 > URL: https://issues.apache.org/jira/browse/HIVE-7239 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.1.0 >Reporter: Sumit Kumar >Assignee: Illya Yalovyy > Attachments: HIVE-7239.2.patch, HIVE-7239.3.patch, HIVE-7239.4.patch, > HIVE-7239.patch > > > In case of sequence files, it's crucial that splits are calculated around the > boundaries enforced by the input sequence file. However, by default Hadoop > creates input splits depending on the configuration parameters, which may not > match the boundaries for the input sequence file. Hive provides > HiveIndexedInputFormat, which adds extra logic and recalculates the split > boundaries for each split depending on the sequence file's boundaries. > However, we noticed this behavior of "over" reporting from data backed by a > sequence file. We have sample data on which we experimented and fixed this > bug; we verified the fix by comparing the query output for input in sequence > file, RC file, and regular formats. However, we have not been able to > find the right place to include this as a unit test that would execute as > part of the Hive tests. We tried writing a "clientpositive" test as part of the ql > module, but the output seems quite verbose and I couldn't interpret it that > well. Can someone please review this change and guide on how to write a test > that will execute as part of Hive testing? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402499#comment-15402499 ] Hive QA commented on HIVE-14390: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821158/HIVE-14390.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 370 failed/errored test(s), 10419 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_SortUnionTransposeRule org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join_merge org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_deleteAnalyze org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_join_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppd org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_convert_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_innerjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2 org.apache.hadoop.hive.cli.
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402503#comment-15402503 ] Sergey Shelukhin commented on HIVE-14397: - Which part of the patch fixes the problem where "when we run more queries the reopened sessions are not used instead new session is opened"? Queue name logic makes sense to me > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402506#comment-15402506 ] Sergey Shelukhin commented on HIVE-14357: - I will commit to both places. It's a test-only issue though, so I dunno if it should really impact the release plans. > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R. But upon closer inspection, it seems to be R,S,T. > I'm not much familiar with what these locks are and why the order is > important. Raising this jira so while I try to understand it all. Meanwhile, > if somebody can explain here, would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
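One way to make such a test robust is to compare the acquired lock set ignoring order, instead of encoding the lock manager's internal ordering in the assertion indices. A sketch of that idea only - not necessarily what the attached patch does:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LockAssertSketch {
    // Compare expected vs. acquired lock-table names ignoring order, so the
    // test no longer breaks when getLocks() returns R,S,T instead of T,S,R.
    public static boolean sameLocks(List<String> expected, List<String> acquired) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(acquired);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }
}
```

The trade-off is that an order-sensitive bug would no longer be caught; that only matters if lock acquisition order is itself part of the contract under test.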
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402519#comment-15402519 ] Sushanth Sowmyan commented on HIVE-13966: - Sorry for the late response, I've been tied up with a bunch of other issues of late. I will look at this and review in the next couple of days. This is an issue that is important to fix, and I'm glad you have a patch up for it. :) > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation is failed (in step 2), we still add entry to notification > log. Found this issue in testing. > It is still ok as this is the case of false positive. > If the operation is successful and adding to notification log failed, the > user will get an MetaException. It will not rollback the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
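The fix the description points toward is moving step 4 inside the transaction, so the notification commits or rolls back atomically with the operation. A self-contained sketch of that pattern with a toy in-memory store (the real code is in HiveMetaStore.java and differs in detail):

```java
import java.util.ArrayList;
import java.util.List;

public class NotificationPatternSketch {
    // Minimal in-memory "store": events written during a transaction are
    // only kept if the transaction commits.
    static class Store {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        void addNotification(String e) { pending.add(e); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    // Fixed ordering: log the event inside the same transaction as the DDL.
    // A failed operation then logs nothing (no false positives), and a
    // failed log write rolls the DDL back (no false negatives).
    public static List<String> runOp(boolean opSucceeds) {
        Store store = new Store();
        try {
            if (!opSucceeds) {
                throw new RuntimeException("operation failed");
            }
            store.addNotification("DROP_TABLE"); // same txn as the operation
            store.commit();
        } catch (RuntimeException e) {
            store.rollback();
        }
        return store.committed;
    }
}
```

The key property is that the notification write can no longer diverge from the operation's outcome in either direction.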
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402521#comment-15402521 ] Pengcheng Xiong commented on HIVE-14390: [~nemon], most of the output files look good to me. Could u double check union15.q and union.9.q in SparkCliDriver? It seems that they generate a different plan? Thanks. > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. > query95 : > {noformat} > SELECT count(distinct ws1.ws_order_number) as order_count, >sum(ws1.ws_ext_ship_cost) as total_shipping_cost, >sum(ws1.ws_net_profit) as total_net_profit > FROM web_sales ws1 > JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk) > JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk) > JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk) > LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number >FROM web_sales ws2 JOIN web_sales ws3 >ON (ws2.ws_order_number = ws3.ws_order_number) >WHERE ws2.ws_warehouse_sk <> > ws3.ws_warehouse_sk > ) ws_wh1 > ON (ws1.ws_order_number = ws_wh1.ws_order_number) > LEFT SEMI JOIN (SELECT wr_order_number >FROM web_returns wr >JOIN (SELECT ws4.ws_order_number as > ws_order_number > FROM web_sales ws4 JOIN web_sales > ws5 > ON (ws4.ws_order_number = > ws5.ws_order_number) > WHERE ws4.ws_warehouse_sk <> > ws5.ws_warehouse_sk > ) ws_wh2 >ON (wr.wr_order_number = > ws_wh2.ws_order_number)) tmp1 > ON (ws1.ws_order_number = tmp1.wr_order_number) > WHERE d.d_date between '2002-05-01' and '2002-06-30' and >ca.ca_state = 'GA' and >s.web_company_name = 'pri'; > {noformat} -- This message was sent by Atlassian 
JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402561#comment-15402561 ] Daniel Dai commented on HIVE-14399: --- The test waits for EVENTS_TTL * 2 = 60s and assumes events are cleaned up. In the meantime, DbNotificationListener.CleanerThread is invoked every 60s. It is possible the cleanup thread doesn't get a chance to run during the test wait time. I'd like to increase the frequency of the CleanerThread in the test to make sure it will get a chance to run during the waiting. > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
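More generally, "sleep a fixed time and assume the background thread ran" flakiness can be removed by polling with a deadline. The proposed fix above shortens the CleanerThread interval; this is only a sketch of the complementary test-side pattern, with a hypothetical helper:

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Poll a condition until it holds or the deadline passes, instead of
    // sleeping once and hoping the cleaner thread was scheduled in time.
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }
}
```

The test would then assert `waitFor(() -> countEvents() == 0, timeout, poll)` rather than sleeping EVENTS_TTL * 2 unconditionally (`countEvents` being whatever the test uses to observe cleanup).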
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402567#comment-15402567 ] Sergey Shelukhin commented on HIVE-14392: - Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi reassigned HIVE-14373: --- Assignee: Abdullah Yousufi > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402576#comment-15402576 ] Pengcheng Xiong commented on HIVE-14390: ccing [~jcamachorodriguez] as well > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402567#comment-15402567 ] Sergey Shelukhin edited comment on HIVE-14392 at 8/1/16 6:23 PM: - Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. We were always running on YARN only, too was (Author: sershe): Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402582#comment-15402582 ] Prasanth Jayachandran commented on HIVE-14397: -- [~sershe] The root cause of this bug is this code https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L275-L287 If the queue name is set in conf then it always creates a new session when getSession is invoked. The part that fixes this is conf.unset("tez.queue.name"). This makes sure we use the sessions from the pool when available. The other part of the patch {code} conf.set(TezConfiguration.TEZ_QUEUE_NAME, sessionState.getQueueName()); {code} is required to reopen the initial sessions when user specified queue name is not available. If we don't set this then all the reopened sessions will use "default" queue. > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
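The decision Prasanth describes can be sketched as follows. This is a hypothetical model of the pool behavior, not the real TezSessionPoolManager code: a user-specified queue in the conf always bypasses the pool, so the fix clears tez.queue.name for default sessions (and separately re-applies the pool's queue name when reopening, so reopened sessions don't fall back to "default").

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical sketch of the pool behavior described above; names do not
// match the real TezSessionPoolManager code.
public class SessionPoolSketch {
    static final String USER_QUEUE_KEY = "tez.queue.name";

    final Deque<String> pool = new ArrayDeque<>(); // pooled sessions, by queue name
    int extraSessionsOpened = 0;

    SessionPoolSketch(String... pooledQueues) {
        for (String q : pooledQueues) pool.add(q);
    }

    /** Mirrors the buggy decision: a user-specified queue in the conf always
     *  bypasses the pool and opens a brand-new session. */
    String getSession(Map<String, String> conf) {
        if (conf.containsKey(USER_QUEUE_KEY)) {
            extraSessionsOpened++;
            return "new:" + conf.get(USER_QUEUE_KEY);
        }
        String q = pool.poll();
        return q != null ? "pooled:" + q : "new:default";
    }

    /** The fix described above: default (pooled) sessions drop the queue key
     *  so later getSession calls hit the pool again. */
    static void unsetQueueForPooledSession(Map<String, String> conf) {
        conf.remove(USER_QUEUE_KEY);
    }
}
```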
[jira] [Updated] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-14399: -- Attachment: HIVE-14399.1.patch > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-14399.1.patch > > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-14399: -- Status: Patch Available (was: Open) > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-14399.1.patch > > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402586#comment-15402586 ] Jesus Camacho Rodriguez commented on HIVE-14390: As [~pxiong] indicated, indeed quick pass over the changes indicates that there are no regressions and logic in the patch makes sense. I wonder if there might be some performance impact on the time spent on join reordering algorithm; I remember having a conversation with [~jpullokkaran] about a reason to not use different aliases, but honestly I cannot remember the details anymore. On the other hand, we do something similar for return path as we use a different ID for every (sub)query block (line 136 in HiveTableScan). Thus, IMO it is OK to check it in, and we can keep an eye on the future compilation for multi-join queries. > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14357: Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 Target Version/s: 2.1.1 (was: 1.3.0, 2.1.1) Status: Resolved (was: Patch Available) Committed to master and branch-2.1 > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R. But upon closer inspection, it seems to be R,S,T. > I'm not much familiar with what these locks are and why the order is > important. Raising this jira so while I try to understand it all. Meanwhile, > if somebody can explain here, would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14322) Postgres db issues after Datanucleus 4.x upgrade
[ https://issues.apache.org/jira/browse/HIVE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14322: Resolution: Fixed Fix Version/s: 2.0.2 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Committed. Thanks for the review! > Postgres db issues after Datanucleus 4.x upgrade > > > Key: HIVE-14322 > URL: https://issues.apache.org/jira/browse/HIVE-14322 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0, 2.0.1 >Reporter: Thejas M Nair >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1, 2.0.2 > > Attachments: HIVE-14322.02.patch, HIVE-14322.03.patch, > HIVE-14322.04.patch, HIVE-14322.1.patch > > > With the upgrade to datanucleus 4.x versions in HIVE-6113, hive does not > work properly with postgres. > The nullable fields in the database have string "NULL::character varying" > instead of real NULL values. This causes various issues. > One example is - > {code} > hive> create table t(i int); > OK > Time taken: 1.9 seconds > hive> create view v as select * from t; > OK > Time taken: 0.542 seconds > hive> select * from v; > FAILED: SemanticException Unable to fetch table v. > java.net.URISyntaxException: Relative path in absolute URI: > NULL::character%20varying > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14386) UGI clone shim also needs to clone credentials
[ https://issues.apache.org/jira/browse/HIVE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14386: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed > UGI clone shim also needs to clone credentials > -- > > Key: HIVE-14386 > URL: https://issues.apache.org/jira/browse/HIVE-14386 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-14386.patch > > > Discovered while testing HADOOP-13081 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402615#comment-15402615 ] Sergey Shelukhin commented on HIVE-14397: - +1 > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14340: Attachment: HIVE-14340.1.patch Attaching new patch. This changes {{hive.query.hooks}} to {{hive.query.lifetime.hooks}}. Also, now the hook activates at 4 places: # before query compilation # after query compilation # before query execution # after query execution > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
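The four activation points above suggest an interface shaped roughly like the following. This is a hypothetical sketch; the interface and method names in the actual HIVE-14340 patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of the query lifetime hook described above.
public class HookDriverSketch {
    interface QueryLifeTimeHook {
        void beforeCompile(String query);
        void afterCompile(String query);
        void beforeExecution(String query);
        void afterExecution(String query);
    }

    /** A no-op implementation; a real hook could, e.g., register a temporary
     *  UDF in beforeCompile and drop it in afterExecution. */
    static class NoOpHook implements QueryLifeTimeHook {
        public void beforeCompile(String q) {}
        public void afterCompile(String q) {}
        public void beforeExecution(String q) {}
        public void afterExecution(String q) {}
    }

    /** Drives the four call sites in order and returns that order. */
    static List<String> run(String query, QueryLifeTimeHook hook) {
        List<String> order = new ArrayList<>();
        hook.beforeCompile(query);   order.add("beforeCompile");
        /* ... compile the query ... */
        hook.afterCompile(query);    order.add("afterCompile");
        hook.beforeExecution(query); order.add("beforeExecution");
        /* ... execute the query ... */
        hook.afterExecution(query);  order.add("afterExecution");
        return order;
    }
}
```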
[jira] [Updated] (HIVE-14397) Queries ran after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Summary: Queries ran after reopening of tez session launches additional sessions (was: Queries launched after reopening of tez session launches additional sessions) > Queries ran after reopening of tez session launches additional sessions > --- > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14377) LLAP IO: issue with how estimate cache removes unneeded buffers
[ https://issues.apache.org/jira/browse/HIVE-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402631#comment-15402631 ] Sergey Shelukhin commented on HIVE-14377: - All the tests for Tez failed on one instance due to {noformat}Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable to read from or write to hbase Failed 1 action: RetriesExhaustedException: 1 time, at org.apache.hadoop.hive.metastore.hbase.HBaseStore.createDatabase(HBaseStore.java:158) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:612) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:628) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:418) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.(HiveMetaStore.java:376) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:237) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3356) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3397) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3377) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3631) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]{noformat} I cannot repro the rest (or they are some existing failures). 
> LLAP IO: issue with how estimate cache removes unneeded buffers > --- > > Key: HIVE-14377 > URL: https://issues.apache.org/jira/browse/HIVE-14377 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-14377.01.patch, HIVE-14377.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402656#comment-15402656 ] Chao Sun commented on HIVE-14343: - [~xuefuz], [~jxiang] can you give a review on this? Thanks. > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List<HiveDriverRunHook> driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
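One plausible direction for the fix, sketched below under the assumption (the patch itself is not shown) that the compiled query string is still available when runInternal is entered with alreadyCompiled == true. The method name is hypothetical:

```java
// Hypothetical sketch: prefer the explicit command argument, but fall back
// to the query string recorded at compile time, which is what HS2 mode
// (alreadyCompiled == true) would need since its command argument is null.
public class HookContextSketch {
    static String commandForHookContext(String commandArg, boolean alreadyCompiled,
                                        String compiledQueryString) {
        return commandArg != null ? commandArg : compiledQueryString;
    }
}
```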
[jira] [Comment Edited] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402366#comment-15402366 ] Chao Sun edited comment on HIVE-14346 at 8/1/16 7:14 PM: - Test failures unrelated, although I changed {{subquery_multiinsert.q.out}} since it's not consistent with the qfile. was (Author: csun): Test failures unrelated. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevent the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If user want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14346: Attachment: HIVE-14346.3.patch Address rebase issue. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch, HIVE-14346.3.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevent the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If user want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
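The resolution order this JIRA proposes can be sketched as a single predicate. This is a hypothetical simplification: the real patch resolves three separate fine-grained settings, and the legacy key is consulted per check.

```java
// Hypothetical sketch of the resolution order described above: with
// hive.mapred.mode defaulting to null, each fine-grained strict-check flag
// takes effect; setting the legacy key overrides them all.
public class StrictModeSketch {
    static boolean strictCheckEnabled(String hiveMapredMode, boolean fineGrainedFlag) {
        if (hiveMapredMode == null) {
            return fineGrainedFlag;             // new per-check setting wins
        }
        return "strict".equals(hiveMapredMode); // legacy setting overrides all checks
    }
}
```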
[jira] [Updated] (HIVE-14350) Aborted txns cause false positive "Not enough history available..." msgs
[ https://issues.apache.org/jira/browse/HIVE-14350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14350: -- Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 1.3.0 Status: Resolved (was: Patch Available) > Aborted txns cause false positive "Not enough history available..." msgs > > > Key: HIVE-14350 > URL: https://issues.apache.org/jira/browse/HIVE-14350 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Fix For: 1.3.0, 2.2.0, 2.1.1 > > Attachments: HIVE-14350.2.patch, HIVE-14350.3.patch, > HIVE-14350.5.patch, HIVE-14350.6.patch, HIVE-14350.7.patch, > HIVE-14350.8.patch, HIVE-14350.9.patch > > > this is a followup to HIVE-13369. Only open txns should prevent use of a > base file. But ValidTxnList does not make a distinction between open and > aborted txns. The presence of aborted txns causes false positives which can > happen too often since the flow is > 1. Worker generates a new base file, > 2. then asynchronously Cleaner removes now-compacted aborted txns. (strictly > speaking it's Initiator that does the actual clean up) > So we may have base_5 and base_10 and txnid 7 aborted. Then current impl > will disallow use of base_10 though there is no need for that. Worse, if > txnid_4 is aborted and hasn't been purged yet, base_5 will be rejected as > well and then an error will be raised since there is no suitable base file > left. > ErrorMsg.ACID_NOT_ENOUGH_HISTORY is msg produced -- This message was sent by Atlassian JIRA (v6.3.4#6332)
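The distinction the description draws can be sketched as a predicate over transaction state. This is a hypothetical illustration of the invariant, not the ValidTxnList API: only *open* transactions below a base file's transaction id should block its use, because aborted transactions are already excluded from the compacted base.

```java
import java.util.Set;

// Hypothetical sketch of the base-file validity check described above.
public class BaseFileCheck {
    static boolean canUseBase(long baseTxnId, Set<Long> openTxns, Set<Long> abortedTxns) {
        for (long t : openTxns) {
            if (t <= baseTxnId) {
                return false; // open history below the base is genuinely missing
            }
        }
        // Aborted txns (e.g. txnid 7 relative to base_10 in the example)
        // must not reject the base: their data was never committed.
        return true;
    }
}
```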
[jira] [Updated] (HIVE-14366) Conversion of a Non-ACID table to an ACID table produces non-unique primary keys
[ https://issues.apache.org/jira/browse/HIVE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14366: -- Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 1.3.0 Status: Resolved (was: Patch Available) > Conversion of a Non-ACID table to an ACID table produces non-unique primary > keys > > > Key: HIVE-14366 > URL: https://issues.apache.org/jira/browse/HIVE-14366 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Saket Saurabh >Assignee: Saket Saurabh >Priority: Blocker > Fix For: 1.3.0, 2.2.0, 2.1.1 > > Attachments: HIVE-14366.01.patch, HIVE-14366.02.patch > > > When a Non-ACID table is converted to an ACID table, the primary key > consisting of (original transaction id, bucket_id, row_id) is not generated > uniquely. Currently, the row_id is always set to 0 for most rows. This leads > to correctness issue for such tables. > Quickest way to reproduce is to add the following unit test to > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java > {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid} > @Test > public void testOriginalReader() throws Exception { > FileSystem fs = FileSystem.get(hiveConf); > FileStatus[] status; > // 1. Insert five rows to Non-ACID table. > runStatementOnDriver("insert into " + Table.NONACIDORCTBL + "(a,b) > values(1,2),(3,4),(5,6),(7,8),(9,10)"); > // 2. Convert NONACIDORCTBL to ACID table. > runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " SET > TBLPROPERTIES ('transactional'='true')"); > // 3. Perform a major compaction. > runStatementOnDriver("alter table "+ Table.NONACIDORCTBL + " compact > 'MAJOR'"); > runWorker(hiveConf); > // 4. Perform a delete. > runStatementOnDriver("delete from " + Table.NONACIDORCTBL + " where a = > 1"); > // 5. Now do a projection should have (3,4) (5,6),(7,8),(9,10) only since > (1,2) has been deleted. 
> List<String> rs = runStatementOnDriver("select a,b from " + > Table.NONACIDORCTBL + " order by a,b"); > int[][] resultData = new int[][] {{3,4}, {5,6}, {7,8}, {9,10}}; > Assert.assertEquals(stringifyValues(resultData), rs); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
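The invariant the bug breaks can be sketched directly: every row read from a pre-ACID file must receive a distinct (original transaction id, bucket, row id) key, so the row id has to increment per row within each (txn, bucket) pair rather than staying 0. A minimal illustration with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of unique ACID key assignment for converted rows;
// not the actual Hive reader code.
public class RowIdAssigner {
    private final Map<String, Long> next = new HashMap<>();

    /** Returns 0, 1, 2, ... per (origTxnId, bucket), so no two rows share
     *  the same (origTxnId, bucket, rowId) primary key. */
    long nextRowId(long origTxnId, int bucket) {
        String key = origTxnId + "/" + bucket;
        long id = next.getOrDefault(key, 0L);
        next.put(key, id + 1);
        return id;
    }
}
```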
[jira] [Updated] (HIVE-14273) branch1 test
[ https://issues.apache.org/jira/browse/HIVE-14273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14273: -- Resolution: Won't Fix Status: Resolved (was: Patch Available) > branch1 test > > > Key: HIVE-14273 > URL: https://issues.apache.org/jira/browse/HIVE-14273 > Project: Hive > Issue Type: Bug > Components: Encryption >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14273-branch-2.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14377) LLAP IO: issue with how estimate cache removes unneeded buffers
[ https://issues.apache.org/jira/browse/HIVE-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14377: Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Committed. Thanks for the review! > LLAP IO: issue with how estimate cache removes unneeded buffers > --- > > Key: HIVE-14377 > URL: https://issues.apache.org/jira/browse/HIVE-14377 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14377.01.patch, HIVE-14377.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402728#comment-15402728 ] Xuefu Zhang commented on HIVE-14343: +1 > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11116) Can not select data from table which points to remote hdfs location
[ https://issues.apache.org/jira/browse/HIVE-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402731#comment-15402731 ] Hive QA commented on HIVE-11116: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12820392/HIVE-11116.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10419 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/724/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/724/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-724/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12820392 - PreCommit-HIVE-MASTER-Build > Can not select data from table which points to remote hdfs location > --- > > Key: HIVE-11116 > URL: https://issues.apache.org/jira/browse/HIVE-11116 > Project: Hive > Issue Type: Bug > Components: Encryption >Affects Versions: 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Alexander Pivovarov >Assignee: David Karoly > Attachments: HIVE-11116.1.patch > > > I tried to create new table which points to remote hdfs location and select > data from it. > It works for hive-0.14 and hive-1.0 but it does not work starting from > hive-1.1 > to reproduce the issue > 1. create folder on remote hdfs > {code} > hadoop fs -mkdir -p hdfs://remote-nn/tmp/et1 > {code} > 2. create table > {code} > CREATE TABLE et1 ( > a string > ) stored as textfile > LOCATION 'hdfs://remote-nn/tmp/et1'; > {code} > 3. run select > {code} > select * from et1 limit 10; > {code} > 4. Should get the following error > {code} > select * from et1; > 15/06/25 13:43:44 [main]: ERROR parse.CalcitePlanner: > org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if > hdfs://remote-nn/tmp/et1 is encrypted: java.lang.IllegalArgumentException: > Wrong FS: hdfs://remote-nn/tmp/et1, expected: hdfs://localhost:8020 > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1763) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:190) > at > 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) > at
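The "Wrong FS" failure above happens because a path on a remote namenode is handed to a FileSystem object bound to the local default FS. A minimal sketch (not Hive's actual code) of the scheme/authority check that distinguishes the two filesystems:

```java
import java.net.URI;

// Hypothetical sketch of the HIVE-11116 failure mode: a Hadoop FileSystem
// rejects a path whose scheme+authority do not match its own URI. The helper
// below is illustrative, not the real SemanticAnalyzer/FileSystem logic.
public class WrongFsSketch {
    // True when 'path' belongs to the filesystem identified by 'fsUri'.
    static boolean sameFileSystem(URI fsUri, URI path) {
        return fsUri.getScheme().equalsIgnoreCase(path.getScheme())
            && fsUri.getAuthority() != null
            && path.getAuthority() != null
            && fsUri.getAuthority().equalsIgnoreCase(path.getAuthority());
    }

    public static void main(String[] args) {
        URI localFs = URI.create("hdfs://localhost:8020");
        // Path from the bug report: lives on a different namenode entirely.
        System.out.println(sameFileSystem(localFs, URI.create("hdfs://remote-nn/tmp/et1")));      // false
        System.out.println(sameFileSystem(localFs, URI.create("hdfs://localhost:8020/tmp/et1"))); // true
    }
}
```

A fix along these lines would resolve the encryption check against the path's own filesystem rather than the default one.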
[jira] [Updated] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14343: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master branch. Thanks [~xuefuz] for the review! > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Fix For: 2.2.0 > > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
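The HIVE-14343 bug pattern, reduced to a sketch: a hook context is built from a method parameter that is null on the already-compiled (HS2) path, instead of from the state that always holds the current command. Names below are stand-ins, not Hive's actual fields.

```java
// Hypothetical sketch of the HIVE-14343 bug and fix. 'storedCommand' stands
// in for the command Hive already holds after compilation.
public class HookContextSketch {
    static String storedCommand;

    // Simplified stand-in for Driver#runInternal(String command, boolean alreadyCompiled).
    static String buildHookCommand(String command, boolean alreadyCompiled) {
        // Buggy behaviour: 'return command;' — null whenever HS2 calls in
        // with alreadyCompiled == true.
        // Fixed behaviour: fall back to the stored command.
        return command != null ? command : storedCommand;
    }

    public static void main(String[] args) {
        storedCommand = "select * from t";
        System.out.println(buildHookCommand("select * from t", false)); // CLI path
        System.out.println(buildHookCommand(null, true));               // HS2 path, no longer null
    }
}
```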
[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14340: Attachment: HIVE-14340.2.patch Fix style. > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, > HIVE-14340.2.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402804#comment-15402804 ] Xuefu Zhang commented on HIVE-14340: Patch looks good to me. +1 > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, > HIVE-14340.2.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
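The hook HIVE-14340 describes brackets the whole query lifetime: one callback before compilation, one after execution. A sketch of the shape such a hook could take, using the UDF example from the description — interface and method names here are illustrative, not the committed Hive API:

```java
// Hypothetical sketch of a before-compilation / after-execution hook pair,
// as motivated by HIVE-14340 (e.g. a query-scoped UDF whose resources must
// be cleaned up when the query finishes).
public class LifecycleHookSketch {
    interface QueryLifeTimeHook {
        void beforeCompile(String query);
        void afterExecution(String query, boolean hasError);
    }

    static class UdfScopedHook implements QueryLifeTimeHook {
        boolean resourceAllocated;
        public void beforeCompile(String query) {
            resourceAllocated = true;   // e.g. register a query-specific UDF
        }
        public void afterExecution(String query, boolean hasError) {
            resourceAllocated = false;  // clean up whether or not the query failed
        }
    }

    public static void main(String[] args) {
        UdfScopedHook hook = new UdfScopedHook();
        hook.beforeCompile("select 1");
        System.out.println(hook.resourceAllocated);   // held for the query's lifetime
        hook.afterExecution("select 1", false);
        System.out.println(hook.resourceAllocated);   // released afterwards
    }
}
```

The existing pre/post semantic-analysis and pre/post execution hooks each cover only one phase, which is why a spanning pair like this is needed.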
[jira] [Updated] (HIVE-14202) Change tez version used to 0.8.4
[ https://issues.apache.org/jira/browse/HIVE-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14202: - Fix Version/s: 2.1.1 > Change tez version used to 0.8.4 > > > Key: HIVE-14202 > URL: https://issues.apache.org/jira/browse/HIVE-14202 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14202.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14202) Change tez version used to 0.8.4
[ https://issues.apache.org/jira/browse/HIVE-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402898#comment-15402898 ] Prasanth Jayachandran commented on HIVE-14202: -- The HIVE-13934 backport needs this patch; without it, the HIVE-13934 backport will trigger many test failures. Backporting this to branch-2.1 > Change tez version used to 0.8.4 > > > Key: HIVE-14202 > URL: https://issues.apache.org/jira/browse/HIVE-14202 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14202.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402904#comment-15402904 ] Siddharth Seth commented on HIVE-14392: --- [~leftylev] - thanks for the detailed look at the description. Will incorporate most of that in the next patch. bq. Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. We were not always running on YARN. LLAP started out running without Slider - i.e. daemons were set up manually on individual nodes (outside of YARN). At that point the work.dir configuration was required. I think we just never really needed to change the way it was used, so this did not get attention. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402908#comment-15402908 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13822: -- Looked at why some of these qfile changes are missing: 1. {code} 2016-08-01T14:50:03,831 DEBUG [630ce616-f45b-4c59-b8c2-6e27e80e2cca main] parse.TypeCheckCtx: Setting error: [Line 2:121 Invalid table alias or column reference 'ws_ext_sales_price': (possible column names are: .(tok_table_or_col i_item_id), .(tok_table_or_col i_item_desc\ ), .(tok_table_or_col i_category), .(tok_table_or_col i_class), .(tok_table_or_col i_current_price), .(tok_function sum (tok_table_or_col ws_ext_sales_price)))] from (tok_table_or_col ws_ext_sales_price) java.lang.Exception at org.apache.hadoop.hive.ql.parse.TypeCheckCtx.setError(TypeCheckCtx.java:162) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$ColumnExprProcessor.process(TypeCheckProcFactory.java:653) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:217) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:163) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11252) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11208) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4195) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3977) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9428) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9383) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10128) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10801) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10812) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10507) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:75) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:435) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.had
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Attachment: HIVE-13822.3.patch > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Status: Open (was: Patch Available) > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Status: Patch Available (was: Open) > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
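The HIVE-13822 warning comes from the legacy table parameter value "true" not being a JSON object, so a parser expecting the {"COLUMN_STATS":...} shape rejects it. A trivial shape check (not Hive's actual StatsSetupConst parser) illustrates why one value passes and the other does not:

```java
// Hypothetical sketch of the HIVE-13822 mismatch: the COLUMN_STATS_ACCURATE
// value must be a JSON object carrying COLUMN_STATS; the legacy flat "true"
// value fails that check and trips the "JsonParser can not parse" warning.
public class ColumnStatsSketch {
    static boolean looksLikeStatsJson(String value) {
        String v = value.trim();
        return v.startsWith("{") && v.endsWith("}") && v.contains("\"COLUMN_STATS\"");
    }

    public static void main(String[] args) {
        String legacy = "true";  // old format stored in TABLE_PARAMS.txt
        String fixed  = "{\"COLUMN_STATS\":{\"key\":\"true\",\"value\":\"true\"},\"BASIC_STATS\":\"true\"}";
        System.out.println(looksLikeStatsJson(legacy)); // false -> warning path, stats treated as stale
        System.out.println(looksLikeStatsJson(fixed));  // true  -> stats considered up-to-date
    }
}
```

This matches the proposed fix in the description: replace the flat value in the csv metastore export with the JSON form.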