[jira] [Commented] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401652#comment-15401652 ] zhihai xu commented on HIVE-14303: -- Thanks for finding this issue. It was my fault; I missed these test failures. I just found out that checkAndGenObject may be called from a derived class's closeOp, such as CommonMergeJoinOperator and SMBMapJoinOperator. Because the state is changed before closeOp is called, HIVE-14303.0.patch would cause checkAndGenObject to return wrongly from {{CommonMergeJoinOperator.joinFinalLeftData}} and {{SMBMapJoinOperator.joinFinalLeftData}}. Since the contradiction is between {{CommonJoinOperator.checkAndGenObject}} and {{CommonJoinOperator.closeOp}}, we shouldn't depend on {{state}}, which is changed outside CommonJoinOperator. We can do the same thing as {{closeCalled}} in class {{SMBMapJoinOperator}}: use a variable to prevent {{CommonJoinOperator.checkAndGenObject}} from being called after {{CommonJoinOperator.closeOp}} has been called. I attached a new patch, HIVE-14303.1.patch, which uses a private variable {{CommonJoinOperator.closeOpCalled}} to protect checkAndGenObject from the NPE. > CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to > avoid NPE if ExecReducer.close is called twice. > - > > Key: HIVE-14303 > URL: https://issues.apache.org/jira/browse/HIVE-14303 > Project: Hive > Issue Type: Bug >Reporter: zhihai xu >Assignee: zhihai xu > Fix For: 2.2.0 > > Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch > > > CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to > avoid NPE if ExecReducer.close is called twice. ExecReducer.close implements > the Closeable interface and can be called multiple times. We saw > the following NPE, which hid the real exception, due to this bug. 
> {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: null > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) > at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) > at > org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) > ... 8 more > {code} > The code from ReduceTask.runOldReducer: > {code} > reducer.close(); //line 453 > reducer = null; > > out.close(reporter); > out = null; > } finally { > IOUtils.cleanup(LOG, reducer);// line 459 > closeQuietly(out, reporter); > } > {code} > Based on the above stack trace and code, reducer.close() is called twice > because the exception happened when reducer.close() is called for the first > time at line 453, the code exit before reducer was set to null. > NullPointerException is triggered when reducer.close() is called for the > second time in IOUtils.cleanup at line 459. NullPointerException hide the > real exception which happened when reducer.close() is called for the first > time at line 453. 
> The reason for NPE is: > The first reducer.close called CommonJoinOperator.closeOp which clear > {{storage}} > {code} > Arrays.fill(storage, null); > {code} > the second reduce.close generated NPE due to null {{storage[alias]}} which is > set to null by first reducer.close. > The following reducer log can give more proof: > {code} > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] > org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... > 2016-07-14 22:24:51,016 INFO [main] >
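The guard proposed in the comment above can be sketched as a small idempotent-close pattern. This is an illustrative model only, not the actual HIVE-14303.1.patch: the class is hypothetical, and only the field name closeOpCalled and the Arrays.fill cleanup mirror the discussion.

```java
// Hypothetical sketch of the fix discussed above: a closeOpCalled flag,
// modeled on SMBMapJoinOperator's closeCalled, makes checkAndGenObject a
// no-op once closeOp has run, instead of dereferencing the nulled storage.
public class JoinCloseGuard {
    private boolean closeOpCalled = false;
    private Object[] storage = new Object[] { new Object() };
    private int generated = 0;

    // Stands in for CommonJoinOperator.checkAndGenObject: it dereferences
    // storage, so calling it after closeOp without the guard would NPE.
    public void checkAndGenObject() {
        if (closeOpCalled) {
            return; // second ExecReducer.close(): skip instead of NPE
        }
        storage[0].toString(); // would throw NPE after closeOp without the guard
        generated++;
    }

    // Stands in for closeOp: clears storage, so later calls must observe
    // the flag rather than the nulled array.
    public void closeOp() {
        closeOpCalled = true;
        java.util.Arrays.fill(storage, null);
    }

    public int getGenerated() { return generated; }
}
```

With the flag in place, a redundant second close path reaches checkAndGenObject and returns harmlessly, which is the behavior the patch is after.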
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Attachment: HIVE-14303.1.patch
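The exception masking described in this thread's quoted ReduceTask.runOldReducer snippet can be reproduced in miniature. All names below are hypothetical stand-ins, not Hive or Hadoop code; the point is that an exception thrown while closing again in a finally block replaces the original failure, which is exactly how the NPE hid the real error.

```java
// Minimal reproduction of the double-close pattern: close() throws before
// `reducer` is nulled, so the finally-style cleanup closes it again, and
// the second failure (an NPE) is what the caller observes.
public class DoubleCloseDemo {
    static class Reducer {
        Object[] storage = new Object[] { new Object() };

        void close() {
            Object o = storage[0];
            storage[0] = null;                // closeOp-style cleanup
            if (o == null) {
                throw new NullPointerException(); // what the 2nd close hits
            }
            throw new RuntimeException("root cause"); // the real error
        }
    }

    // Returns the simple name of the exception the caller actually sees.
    public static String observedFailure() {
        Reducer reducer = new Reducer();
        try {
            try {
                reducer.close();  // first close: throws the root cause
                reducer = null;   // skipped, so cleanup sees a non-null reducer
            } finally {
                if (reducer != null) {
                    reducer.close(); // second close: NPE replaces root cause
                }
            }
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
        return "none";
    }
}
```

Because an exception raised in a finally block supersedes the one in flight, the task log reports only the NullPointerException, matching the stack traces above.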
[jira] [Commented] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401653#comment-15401653 ] Rajat Khandelwal commented on HIVE-14357: - +1. Changes look good. One question: do these tests fail in the 2.1 release too? The commit seems to be a part of 2.1.0-rc3. Secondly, can we merge this to branch-2.1 soon? I'm somewhat blocked on this. Thanks. > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R, but upon closer inspection it seems to be R, S, T. > I'm not very familiar with what these locks are or why the order is > important. Raising this jira while I try to understand it all. Meanwhile, > if somebody can explain here, it would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401656#comment-15401656 ] zhihai xu commented on HIVE-14303: -- Attached a stack trace which proves that checkAndGenObject is called from a derived class's (SMBMapJoinOperator's) closeOp: {code} at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:686) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinObject(SMBMapJoinOperator.java:414) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinOneGroup(SMBMapJoinOperator.java:383) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinFinalLeftData(SMBMapJoinOperator.java:357) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.closeOp(SMBMapJoinOperator.java:625) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:683) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) [hadoop-mapreduce-client-core-2.6.1.jar:?] at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) [hadoop-mapreduce-client-common-2.6.1.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [?:1.7.0_79] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_79] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_79] at java.lang.Thread.run(Thread.java:745) [?:1.7.0_79] {code}
[jira] [Comment Edited] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401653#comment-15401653 ] Rajat Khandelwal edited comment on HIVE-14357 at 8/1/16 7:38 AM: - +1. Changes look good. One question: Do these tests fail in 2.1 release too? The commit seems to be a part of 2.1.0-rc3. If yes, it makes sense to get 2.1.1 out soon. cc [~jcamachorodriguez] Secondly, can we merge this to branch-2.1 soon? I'm kind of blocked on this for something, Thanks. was (Author: prongs): +1. Changes look good. One question: Do these tests fail in 2.1 release too? The commit seems to be a part of 2.1.0-rc3. Secondly, can we merge this to branch-2.1 soon? I'm kind of blocked on this for something, Thanks.
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Summary: CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice. (was: CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if ExecReducer.close is called twice.)
[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.
[ https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-14303: - Description: CommonJoinOperator.checkAndGenObject should return directly (after {{CommonJoinOperator.closeOp}} was called) to avoid NPE if ExecReducer.close is called twice. ExecReducer.close implements the Closeable interface and can be called multiple times. We saw the following NPE, which hid the real exception, due to this bug. {code} Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) ... 8 more {code} The code from ReduceTask.runOldReducer: {code} reducer.close(); //line 453 reducer = null; out.close(reporter); out = null; } finally { IOUtils.cleanup(LOG, reducer);// line 459 closeQuietly(out, reporter); } {code} Based on the above stack trace and code, reducer.close() is called twice: the exception happened the first time reducer.close() was called, at line 453, so the code exited before reducer was set to null.
The NullPointerException is triggered when reducer.close() is called a second time, from IOUtils.cleanup at line 459, and this NullPointerException hides the real exception thrown by the first call at line 453. The reason for the NPE: the first reducer.close called CommonJoinOperator.closeOp, which clears {{storage}}: {code} Arrays.fill(storage, null); {code} The second reducer.close then hit the NPE because {{storage[alias]}} had been set to null by the first reducer.close. The following reducer log gives more proof: {code} 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing... 2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing... 
2016-07-14 22:24:51,016 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[4]: records written - 53466 2016-07-14 22:25:11,555 ERROR [main] ExecReducer: Hit error while closing operators - failing tree 2016-07-14 22:25:11,649 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296) at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718) at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284) ... 8 more {code} was: CommonJoinOperator.checkAndGenObject should return directly at CLOSE state to avoid NPE if E
[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0
[ https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14367: Component/s: Statistics > Estimated size for constant nulls is 0 > -- > > Key: HIVE-14367 > URL: https://issues.apache.org/jira/browse/HIVE-14367 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Affects Versions: 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, > HIVE-14367.2.patch, HIVE-14367.3.patch, HIVE-14367.4.patch > > > since type is incorrectly assumed as void. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0
[ https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14367: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master.
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Status: Patch Available (was: Open) > Data size may be estimated as 0 if no columns are being projected after an > operator > --- > > Key: HIVE-14378 > URL: https://issues.apache.org/jira/browse/HIVE-14378 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, HIVE-14378.patch > > > In those cases we still emit rows, but they may not have any columns within > them. We shouldn't estimate 0 data size in such cases.
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator
[ https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14378: Attachment: HIVE-14378.3.patch
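As a hedged sketch of the fix direction HIVE-14378 describes: when no columns are projected, a purely per-column estimate collapses to 0 even though rows are still emitted, so the estimate can be floored by a per-row overhead. The class, method, and overhead constant below are illustrative assumptions, not Hive's actual statistics code.

```java
// Illustrative data-size estimator: never report 0 bytes for a non-empty
// row stream, even when the projected columns contribute 0 bytes.
public class DataSizeEstimate {
    // Assumed minimal per-row cost; the real constant would be tuned.
    static final long ROW_OVERHEAD_BYTES = 8;

    public static long estimate(long numRows, long totalColumnBytes) {
        if (numRows <= 0) {
            return 0; // genuinely empty output
        }
        // At least one overhead unit per emitted row, even with 0 columns.
        return Math.max(totalColumnBytes, numRows * ROW_OVERHEAD_BYTES);
    }
}
```

The floor keeps downstream planning decisions (e.g. join-side selection) from treating a populated row stream as free.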
[jira] [Updated] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Status: Patch Available (was: Open) > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues, q1 and > q2, with a default expiry interval of 5 mins. > After 5 mins of non-usage, the sessions corresponding to queues q1 and q2 will > be expired. When a new set of queries is issued after this expiry, the default > sessions backed by q1 and q2 are reopened. Now when we run more queries, > the reopened sessions are not used; instead a new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions).
[jira] [Updated] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Attachment: HIVE-14397.1.patch [~sseth]/[~sershe] Can someone please review this patch? > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2, with a default expiry interval of 5 mins. > After 5 mins of non-usage, the sessions corresponding to queues q1 and q2 will > be expired. When a new set of queries is issued after this expiry, the default > sessions backed by q1 and q2 are reopened. Now when we run more queries, > the reopened sessions are not used; instead a new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14323) Reduce number of FS permissions and redundant FS operations
[ https://issues.apache.org/jira/browse/HIVE-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14323: Attachment: HIVE-14323.3.patch Fixed Hive.replaceFiles which caused the tests to fail. > Reduce number of FS permissions and redundant FS operations > --- > > Key: HIVE-14323 > URL: https://issues.apache.org/jira/browse/HIVE-14323 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14323.1.patch, HIVE-14323.2.patch, > HIVE-14323.3.patch > > > Some examples are given below. > 1. When creating stage directory, FileUtils sets the directory permissions by > running a set of chgrp and chmod commands. In systems like S3, this would not > be relevant. > 2. In some cases, fs.delete() is followed by fs.exists(). In this case, it > might be redundant to check for exists() (lookup ops are expensive in systems > like S3). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14323) Reduce number of FS permissions and redundant FS operations
[ https://issues.apache.org/jira/browse/HIVE-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14323: Status: Patch Available (was: Open) > Reduce number of FS permissions and redundant FS operations > --- > > Key: HIVE-14323 > URL: https://issues.apache.org/jira/browse/HIVE-14323 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14323.1.patch, HIVE-14323.2.patch, > HIVE-14323.3.patch > > > Some examples are given below. > 1. When creating stage directory, FileUtils sets the directory permissions by > running a set of chgrp and chmod commands. In systems like S3, this would not > be relevant. > 2. In some cases, fs.delete() is followed by fs.exists(). In this case, it > might be redundant to check for exists() (lookup ops are expensive in systems > like S3). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
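The delete-then-exists pattern called out in point 2 above can be avoided by trusting the return value of delete(). A minimal sketch, using java.io.File as a stand-in for Hadoop's FileSystem API (the real fix would target FileSystem, where exists() is a remote lookup on stores like S3):

```java
import java.io.File;
import java.io.IOException;

public class DeleteWithoutExists {
    // Sketch: delete() already reports success, so an unconditional follow-up
    // exists() round trip is redundant. We only fall back to exists() when
    // delete() returns false, to distinguish "already absent" from "failed".
    static boolean deleteIfPresent(File f) {
        if (f.delete()) {
            return true;        // deleted; no exists() lookup needed
        }
        return !f.exists();     // already absent counts as success
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("hive14323", ".tmp");
        boolean first = deleteIfPresent(tmp);   // file existed: deleted
        boolean second = deleteIfPresent(tmp);  // already gone: still success
        System.out.println(first + " " + second);
    }
}
```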
[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401734#comment-15401734 ] Hive QA commented on HIVE-14346: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821178/HIVE-14346.2.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10418 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/718/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/718/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-718/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821178 - PreCommit-HIVE-MASTER-Build > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevents the new configurations from being used > (see comments in that JIRA for more details). > This issue proposes changing the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If users want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
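The proposed fallback semantics can be sketched as follows. This is a hypothetical illustration of the resolution order, not Hive's actual configuration code; hive.strict.checks.cartesian.product is used as an example of one of the new fine-grained settings:

```java
import java.util.HashMap;
import java.util.Map;

public class StrictModeResolution {
    // Sketch of the idea: the legacy hive.mapred.mode, when explicitly set,
    // wins; only when it is null do the new fine-grained checks take effect.
    // With the old default of 'nonstrict', the null branch was unreachable.
    static boolean strictCheckEnabled(Map<String, String> conf, String check) {
        String legacy = conf.get("hive.mapred.mode");
        if (legacy != null) {
            return "strict".equals(legacy);  // legacy setting overrides
        }
        return Boolean.parseBoolean(conf.getOrDefault(check, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.strict.checks.cartesian.product", "true");
        // hive.mapred.mode unset (null): fine-grained check applies
        System.out.println(strictCheckEnabled(conf, "hive.strict.checks.cartesian.product"));
        // legacy value set: it overrides the fine-grained check
        conf.put("hive.mapred.mode", "nonstrict");
        System.out.println(strictCheckEnabled(conf, "hive.strict.checks.cartesian.product"));
    }
}
```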
[jira] [Commented] (HIVE-12954) NPE with str_to_map on null strings
[ https://issues.apache.org/jira/browse/HIVE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401759#comment-15401759 ] Marta Kuczora commented on HIVE-12954: -- The failing tests are not related to this patch. > NPE with str_to_map on null strings > --- > > Key: HIVE-12954 > URL: https://issues.apache.org/jira/browse/HIVE-12954 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0 >Reporter: Charles Pritchard >Assignee: Marta Kuczora > Attachments: HIVE-12954.2.patch, HIVE-12954.patch > > > Running str_to_map on a null string will return a NullPointerException. > Workaround is to use coalesce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
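The null guard at the heart of the fix can be sketched like this. This is not Hive's GenericUDFStringToMap code; it is a simplified stand-in showing where the NPE arises and how returning null (mirroring the coalesce workaround) avoids it:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeStrToMap {
    // Sketch of str_to_map semantics with an explicit null guard; without it,
    // calling split() on a null input throws NullPointerException.
    static Map<String, String> strToMap(String text, String delim1, String delim2) {
        if (text == null) {
            return null;  // the guard: propagate null instead of crashing
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : text.split(delim1)) {
            String[] kv = pair.split(delim2, 2);
            result.put(kv[0], kv.length > 1 ? kv[1] : null);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(strToMap("a:1,b:2", ",", ":"));
        System.out.println(strToMap(null, ",", ":"));  // no NPE
    }
}
```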
[jira] [Updated] (HIVE-14123) Add beeline configuration option to show database in the prompt
[ https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14123: -- Release Note: New BeeLine Command option --showDbInPrompt to display the current database name in prompt > Add beeline configuration option to show database in the prompt > --- > > Key: HIVE-14123 > URL: https://issues.apache.org/jira/browse/HIVE-14123 > Project: Hive > Issue Type: Improvement > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14123.10.patch, HIVE-14123.2.patch, > HIVE-14123.3.patch, HIVE-14123.4.patch, HIVE-14123.5.patch, > HIVE-14123.6.patch, HIVE-14123.7.patch, HIVE-14123.8.patch, > HIVE-14123.9.patch, HIVE-14123.patch > > > There are several jira issues complaining that, the Beeline does not respect > hive.cli.print.current.db. > This is partially true, since in embedded mode, it uses the > hive.cli.print.current.db to change the prompt, since HIVE-10511. > In beeline mode, I think this function should use a beeline command line > option instead, like for the showHeader option emphasizing, that this is a > client side option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401761#comment-15401761 ] Chinna Rao Lalam commented on HIVE-12077: - Committed to master. > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, > HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
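The batching idea behind the fix can be sketched as follows; the method names are illustrative, not the metastore's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedRepair {
    // Sketch: add untracked partitions in fixed-size chunks instead of one
    // bulk call, so HMS never materializes the whole collection at once.
    static int addPartitionsInBatches(List<String> partitions, int batchSize) {
        int calls = 0;
        for (int i = 0; i < partitions.size(); i += batchSize) {
            List<String> batch =
                partitions.subList(i, Math.min(i + batchSize, partitions.size()));
            // a call like metastore.add_partitions(batch) would go here
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < 1050; i++) {
            parts.add("ds=" + i);
        }
        // 1050 partitions in batches of 100 -> 11 metastore calls
        System.out.println(addPartitionsInBatches(parts, 100));
    }
}
```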
[jira] [Commented] (HIVE-14123) Add beeline configuration option to show database in the prompt
[ https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401763#comment-15401763 ] Peter Vary commented on HIVE-14123: --- [~leftylev] Could you please check my modifications? - Wiki - [Beeline Command Options|https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions] - Jira - Release notes of this Jira are set Do I need to do anything more, or is everything OK now so that I can remove the TODOC## label as well? Thanks, Peter > Add beeline configuration option to show database in the prompt > --- > > Key: HIVE-14123 > URL: https://issues.apache.org/jira/browse/HIVE-14123 > Project: Hive > Issue Type: Improvement > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14123.10.patch, HIVE-14123.2.patch, > HIVE-14123.3.patch, HIVE-14123.4.patch, HIVE-14123.5.patch, > HIVE-14123.6.patch, HIVE-14123.7.patch, HIVE-14123.8.patch, > HIVE-14123.9.patch, HIVE-14123.patch > > > There are several JIRA issues complaining that Beeline does not respect > hive.cli.print.current.db. > This is partially true: in embedded mode it has used > hive.cli.print.current.db to change the prompt since HIVE-10511. > In beeline mode, I think this function should use a Beeline command line > option instead, like the showHeader option, emphasizing that this is a > client-side option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-12077: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Fix For: 2.2.0 > > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, > HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Fix Version/s: 1.1.0 Status: Patch Available (was: Open) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
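The qualification bug can be sketched in a few lines. This is an illustrative stand-in, not Hive's actual table-name handling code:

```java
public class QualifiedTableName {
    // Sketch of the bug: blindly prefixing the current database to a name
    // that is already db-qualified yields an invalid three-part identifier
    // like default.test.a. The guard below checks for an existing qualifier.
    static String qualify(String currentDb, String tableName) {
        if (tableName.contains(".")) {
            return tableName;               // already db-qualified: keep as-is
        }
        return currentDb + "." + tableName; // unqualified: prefix current db
    }

    public static void main(String[] args) {
        System.out.println(qualify("default", "a"));
        System.out.println(qualify("default", "test.a"));  // not default.test.a
    }
}
```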
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Status: Open (was: Patch Available) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: HIVE-14398.1.patch > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Status: Patch Available (was: Open) Please take a look, thanks. > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: (was: HIVE-14398.1.patch) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Attachment: HIVE-14398.1.patch Please take a look, thanks. > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > The table name should be test.a, not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yechao Chen updated HIVE-14398: --- Comment: was deleted (was: please take a look,thanks) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401797#comment-15401797 ] Peter Vary commented on HIVE-14374: --- Hi, I have collected the configuration variables: {panel:title=Wiki} -u, -r, -n, -p, -d, -e, -f, -w --password-file, --hiveconf, --hivevar, --color, --showHeader, --headerInterval, --fastConnect, --autoCommit, --verbose, --showWarnings, --showDbInPrompt, --showNestedErrs, --numberFormat, --force, --maxWidth, --maxColumnWidth, --silent, --autosave, --outputformat, --truncateTable, --delimiterForDSV, --isolation, --nullemptystring, --incremental, --help {panel} {panel:title=Help text - beeline mode} -u, -r, -n, -p, -d, -i, -e, -f, -w --password-file, --hiveconf, --hivevar, --property-file, --color, --showHeader, --headerInterval, --fastConnect, --autoCommit, --verbose, --showWarnings, --showDbInPrompt, --showNestedErrs, --numberFormat, --force, --maxWidth, --maxColumnWidth, --silent, --autosave, --outputformat, --incremental, --truncateTable, --delimiterForDSV, --isolation, --nullemptystring, --addlocaldriverjar, --addlocaldrivername, --showConnectedUrl, --help {panel} {panel:title=Help text - compatibility mode} Generated from code, so the same as below {panel} {panel:title=Code - beeline compatibility mode} -database, -e, -f, -i, --hiveconf, --hivevar, -d --define, -S| --silent, -v| --verbose, -H| --help {panel} {panel:title=Code - beeline beeline mode} -d, -u, -r, -n, -p, -w --password-file, -a, -i, -e, -f, -help, --hivevar, --hiveconf, --property-file + all of the configuration file options {panel} {panel:title=Configuration file - beeline beeline mode} headerinterval, fastconnect, incremental, outputformat, autosave, entirelineascommand, authtype, delimiterfordsv, force, initfiles, showconnectedurl, maxheight, maxcolumnwidth, numberformat, timeout, showelapsedtime, verbose, showwarnings, hivevariables, lastconnectedurl, truncatetable, isolation, 
nullemptystring, trimscripts, showdbinprompt, scriptfile, color, shownestederrs, showheader, autocommit, hiveconfvariables, historyfile {panel} > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. > I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. 
> One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public void getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOpt
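The annotation-driven exposure proposed above can be sketched with plain reflection. The annotation and option names mirror the proposal, but the lookup logic is an assumption of mine, not BeeLine's implementation:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

public class AnnotatedOpts {
    // Sketch of the proposal: only setters carrying CommandLineOption are
    // reachable from the command line; everything else stays internal.
    @Retention(RetentionPolicy.RUNTIME)
    @interface CommandLineOption { String value(); }

    static class Opts {
        boolean autosave;
        String lastConnectedUrl;

        @CommandLineOption("automatically save preferences")
        public void setAutosave(boolean autosave) { this.autosave = autosave; }

        // no annotation: not settable from the command line
        public void setLastConnectedUrl(String url) { this.lastConnectedUrl = url; }
    }

    // Resolve "--option" to its setter and check for the annotation.
    static boolean settableFromCommandLine(String option) {
        String setter = "set" + Character.toUpperCase(option.charAt(0))
                + option.substring(1);
        for (Method m : Opts.class.getMethods()) {
            if (m.getName().equals(setter)) {
                return m.isAnnotationPresent(CommandLineOption.class);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(settableFromCommandLine("autosave"));
        System.out.println(settableFromCommandLine("lastConnectedUrl"));
    }
}
```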
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401860#comment-15401860 ] Hive QA commented on HIVE-14392: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821183/HIVE-14392.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10418 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby2_map_skew_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby2_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby3_map_skew_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby3_multi_distinct org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testNegativeCliDriver_groupby_grouping_sets7 org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/719/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/719/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-719/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase 
Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12821183 - PreCommit-HIVE-MASTER-Build > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14128) Parallelize jobClose phases
[ https://issues.apache.org/jira/browse/HIVE-14128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401900#comment-15401900 ] Rajesh Balamohan commented on HIVE-14128: - [~ashutoshc] - In the non-partitioned case, there can be multiple part files within the temp directory. Moving this in HDFS would be simple, but in some file systems like S3 it would still turn out to be expensive. E.g., lineitem is a non-partitioned dataset in TPC-H. A simple insert overwrite would have the following move at the end of the job. Please note that this directory internally has 300+ part files, so the rename would turn out to be expensive here. {noformat} 2016-08-01T04:40:00,154 INFO [JobClose-Thread-0] exec.FileSinkOperator: Moving tmp dir: s3a://bucket/lineitem/.hive-staging_hive_2016-08-01_04-31-26_432_5317262787271448273-1/_tmp.-ext-1 to: s3a://bucket/lineitem/.hive-staging_hive_2016-08-01_04-31-26_432_5317262787271448273-1/-ext-1 {noformat} Should we consider a file-by-file move in such cases? > Parallelize jobClose phases > --- > > Key: HIVE-14128 > URL: https://issues.apache.org/jira/browse/HIVE-14128 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14128.1.patch, HIVE-14128.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
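The file-by-file move suggested above can be sketched with a small thread pool. java.nio.file stands in for the Hadoop FileSystem API here; the real change would presumably issue a FileSystem.rename per part file, which on S3 avoids one huge serial directory rename:

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

public class ParallelMove {
    // Sketch: rename each part file concurrently instead of one directory
    // rename. On S3 a rename is a copy+delete round trip, so spreading the
    // 300+ part-file moves over a pool hides much of that latency.
    static void moveAll(Path srcDir, Path dstDir, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(srcDir)) {
            for (Path part : parts) {
                Path target = dstDir.resolve(part.getFileName());
                pool.submit(() -> Files.move(part, target,
                        StandardCopyOption.REPLACE_EXISTING));
            }
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempDirectory("tmp-ext");
        Path dst = Files.createTempDirectory("ext");
        for (int i = 0; i < 4; i++) {
            Files.createFile(src.resolve("part-" + i));
        }
        moveAll(src, dst, 2);
        try (Stream<Path> moved = Files.list(dst)) {
            System.out.println(moved.count());
        }
    }
}
```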
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401903#comment-15401903 ] Peter Vary commented on HIVE-14374: --- {panel:title=Missing wiki documentation, but existing help documentation} -a (authType) - jdbc connection authentication type -i script file for initialization --property-file= the file to read connection properties (url, driver, user, password) from --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side --addlocaldrivername=DRIVERNAME Add drivier name needs to be supported in the beeline client side --incremental=[true/false] When set to false, the entire result set is fetched and buffered before being displayed... --showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this beeline connected. {panel} Handling the ones above is easy: just add the documentation to the wiki; I will do this. {panel:title=Configuration file specific, and could be set by command line as well} entirelineascommand - should beeline try to split the commands on ; maxheight - set by the terminal on start, but could be overwritten by command line maxwidth - set by the terminal on start, but could be overwritten by command line timeout - not set/got anywhere - I do not know what it is about showelapsedtime - on commit, rollback, execute should beeline print the elapsed time lastconnectedurl - used by -r (reconnect) to connect to the database, but could be overwritten by command line trimscripts - should beeline trim the script lines before executing historyfile - where the history file should be saved (absolute path) {panel} Handling these is more complicated. [~leftylev] I have seen that you know about the documentation of the parameters. Do you know anything about these features? Are these planned and just the documentation is lacking, or is the possibility of setting these parameters an unintended "feature"? 
Or do I just have to dig into the code and try to guess from there? The question still stands, only we have more data now: - Do we create 2 groups of parameters, one for command line parameters and one for configuration file parameters, or do we use only one group? I think we might have only one group, used for both types of parameters, just as [~stakiar] proposed. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. > I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. 
> One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public boolean getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOption; > } > > public String getSpecificCommandLineOption() { > return specificCommandLineOption; > } > {noformat} > - Attribute could not be set > {noformat} > public static Env getEnv() { > return env; > } > public static void setEnv(Env envToUse) { > env = envToUse; > } > {noformat} > According to our previous conversations, I think you might be > interested in: > [~spena], [~vihangk1], [~aihuaxu], [~ngangam], [~ychena], [~xuefuz] > but anyone is welcome to discuss this. > What do you think about the proposed solution? > Any better ideas, or extensions? > Thanks, > Peter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
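To make the proposal above concrete, the annotation check boils down to a runtime-retention lookup on the setter methods before the reflection-based parser accepts an option. A minimal self-contained sketch (annotation, class, and method names are illustrative, not the actual BeeLine code):

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class OptsScan {

    // Hypothetical annotations mirroring the proposal (names illustrative).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface CommandLineOption { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ConfigurationFileOption { }

    // Stand-in for BeeLineOpts with a few annotated setters.
    static class Opts {
        @CommandLineOption("automatically save preferences")
        @ConfigurationFileOption
        public void setAutosave(boolean b) { }

        @ConfigurationFileOption
        public void setLastConnectedUrl(String url) { }

        public void setEnv(String env) { }  // unannotated: hidden from users
    }

    // The parser would consult these instead of accepting every setXXX it finds.
    static boolean cmdLineSettable(String setter, Class<?>... params) {
        return hasAnnotation(setter, CommandLineOption.class, params);
    }

    static boolean configFileSettable(String setter, Class<?>... params) {
        return hasAnnotation(setter, ConfigurationFileOption.class, params);
    }

    private static boolean hasAnnotation(String setter,
            Class<? extends Annotation> ann, Class<?>... params) {
        try {
            Method m = Opts.class.getMethod(setter, params);
            return m.isAnnotationPresent(ann);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(cmdLineSettable("setAutosave", boolean.class));           // true
        System.out.println(configFileSettable("setLastConnectedUrl", String.class)); // true
        System.out.println(cmdLineSettable("setLastConnectedUrl", String.class));    // false
        System.out.println(cmdLineSettable("setEnv", String.class));                 // false
    }
}
```

With this shape, an unannotated setter such as setEnv is invisible to both sources, which is exactly the "inaccessible by the user" behaviour the description asks for.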
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401952#comment-15401952 ] Peter Vary commented on HIVE-14374: --- I was trying to check the following configuration options, but I think they were never intended to work as command line parameters.
--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side
--addlocaldrivername=DRIVERNAME Add drivier name needs to be supported in the beeline client side
The following does not work:
{code}
$ ./beeline --addlocaldriverjar=pgsql.jar
{code}
I think these are commands intended to be used in an already running client, like this:
{code}
$ ./beeline
0: jdbc:hive2://localhost:1> !addlocaldriverjar pgsql.jar
{code}
If so, then I think they should be removed from the BeeLine.properties file. Am I right, [~Ferd]? Thanks, Peter > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > > BeeLine uses reflection, to set the BeeLineOpts attributes when parsing > command line arguments, and when loading the configuration file. > This means, that creating a setXXX, getXXX method in BeeLineOpts is a > potential risk of exposing an attribute for the user unintentionally. There > is a possibility to exclude an attribute from saving the value in the > configuration file with the Ignore annotation. This does not restrict the > loading or command line setting of these parameters which means there are > many undocumented "features" as-is, like setting the lastConnectedUrl, > allowMultilineCommand, maxHeight, trimScripts, etc. from command line. > This part of the code needs a little cleanup. 
> I think we should make this exposure more explicit, and be able to > differentiate the configurable options depending on the source (command line, > and configuration file), so I propose to create a mechanism to tell > explicitly which BeeLineOpts attributes are settable by command line, and > configuration file, and every other attribute should be inaccessible by the > user of the beeline cli. > One possible solution could be two annotations like these: > - CommandLineOption - there could be a mandatory text parameter here, so the > developer had to provide the help text for it which could be displayed to the > user > - ConfigurationFileOption - no text is required here > Something like this: > - This attribute could be provided by command line, and from a configuration > file too: > {noformat} > @CommandLineOption("automatically save preferences") > @ConfigurationFileOption > public void setAutosave(boolean autosave) { > this.autosave = autosave; > } > public boolean getAutosave() { > return this.autosave; > } > {noformat} > - This attribute could be set through the configuration only > {noformat} > @ConfigurationFileOption > public void setLastConnectedUrl(String lastConnectedUrl) { > this.lastConnectedUrl = lastConnectedUrl; > } > > public String getLastConnectedUrl() > { > return lastConnectedUrl; > } > {noformat} > - Attribute could be set through command line only - I think this is not too > relevant, but possible > {noformat} > @CommandLineOption("specific command line option") > public void setSpecificCommandLineOption(String specificCommandLineOption) { > this.specificCommandLineOption = specificCommandLineOption; > } > > public String getSpecificCommandLineOption() { > return specificCommandLineOption; > } > {noformat} > - Attribute could not be set > {noformat} > public static Env getEnv() { > return env; > } > public static void setEnv(Env envToUse) { > env = envToUse; > } > {noformat} > According to our previous conversations, I think you might be 
interested in: > [~spena], [~vihangk1], [~aihuaxu], [~ngangam], [~ychena], [~xuefuz] > but anyone is welcome to discuss this. > What do you think about the proposed solution? > Any better ideas, or extensions? > Thanks, > Peter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401997#comment-15401997 ] Peter Vary commented on HIVE-14374: --- The showConnectedUrl parameter (HIVE-11244) is no longer used; it was lost in the merge of HIVE-11769. [~nemon] Shall we reintroduce it, or, if nobody has missed it, should we just remove the possibility altogether? > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14361) Empty method in TestClientCommandHookFactory
[ https://issues.apache.org/jira/browse/HIVE-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402001#comment-15402001 ] Peter Vary commented on HIVE-14361: --- Tests are not related. [~spena], [~aihuaxu] please commit it when you have time Thanks, Peter > Empty method in TestClientCommandHookFactory > > > Key: HIVE-14361 > URL: https://issues.apache.org/jira/browse/HIVE-14361 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Trivial > Attachments: HIVE-14361.patch > > > Remove the empty method left in TestClientCommandHookFactory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14387) Add an option to skip the table names for the column headers
[ https://issues.apache.org/jira/browse/HIVE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora reassigned HIVE-14387: Assignee: Marta Kuczora > Add an option to skip the table names for the column headers > > > Key: HIVE-14387 > URL: https://issues.apache.org/jira/browse/HIVE-14387 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Marta Kuczora >Priority: Minor > > It would be good to have an option where the beeline output could skip > reporting the . in the headers. > Eg: > {noformat} > 0: jdbc:hive2://:> select * from sample_07 limit 1; > -- > sample_07.codesample_07.description sample_07.total_emp > sample_07.salary > -- > 00- Operations 123 12345 > -- > {noformat} > b) After the option is set: > {noformat} > 0: jdbc:hive2://:> select * from sample_07 limit 1; > --- > code descriptiontotal_empsalary > --- > 00- Operations 123 12345 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
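The display-side change this asks for is essentially trimming everything up to the first dot in each header label. A minimal sketch of that step (hypothetical helper, not BeeLine's actual formatting code):

```java
public class HeaderTrim {
    // Strips a leading "tablename." qualifier from a column header, if present.
    static String stripTablePrefix(String header) {
        int dot = header.indexOf('.');
        return dot < 0 ? header : header.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(stripTablePrefix("sample_07.code")); // code
        System.out.println(stripTablePrefix("salary"));         // salary
    }
}
```

Note that a blind trim like this would mangle the rare column name that itself contains a dot, so a real implementation would probably gate it behind the proposed option.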
[jira] [Commented] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list
[ https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402019#comment-15402019 ] Hive QA commented on HIVE-14393: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821224/HIVE-14393.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10420 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/720/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/720/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-720/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821224 - PreCommit-HIVE-MASTER-Build > Tuple in list feature fails if there's only 1 tuple in the list > --- > > Key: HIVE-14393 > URL: https://issues.apache.org/jira/browse/HIVE-14393 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-14393.01.patch > > > So this works: > {code} > hive> select * from test where (x,y) in ((1,1),(2,2)); > OK > 1 1 > 2 2 > Time taken: 0.063 seconds, Fetched: 2 row(s) > {code} > And this doesn't: > {code} > hive> select * from test where (x,y) in ((1,1)); > org.antlr.runtime.EarlyExitException > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510) > {code} > If I'm generating SQL I'd like to not have to special case 1 tuple. > As a point of comparison this works in Postgres: > {code} > vagrant=# select * from test where (x, y) in ((1, 1)); > x | y > ---+--- > 1 | 1 > (1 row) > {code} > Any thoughts on this [~pxiong] ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
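Until the grammar accepts a single tuple, generated SQL has to special-case it anyway; one way a generator can work around the parser is to emit ANDed equalities for the one-tuple case (illustrative helper, not part of Hive):

```java
import java.util.List;

public class InClauseGen {
    // Builds a predicate for (cols) IN (tuples). Falls back to ANDed equality
    // when only one tuple is given, since the Hive parser in the report
    // rejects "(x,y) in ((1,1))".
    static String inClause(List<String> cols, List<List<String>> tuples) {
        if (tuples.size() == 1) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < cols.size(); i++) {
                if (i > 0) sb.append(" and ");
                sb.append(cols.get(i)).append(" = ").append(tuples.get(0).get(i));
            }
            return sb.toString();
        }
        StringBuilder sb = new StringBuilder("(")
                .append(String.join(",", cols)).append(") in (");
        for (int t = 0; t < tuples.size(); t++) {
            if (t > 0) sb.append(",");
            sb.append("(").append(String.join(",", tuples.get(t))).append(")");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(inClause(List.of("x", "y"),
                List.of(List.of("1", "1"))));
        // x = 1 and y = 1
        System.out.println(inClause(List.of("x", "y"),
                List.of(List.of("1", "1"), List.of("2", "2"))));
        // (x,y) in ((1,1),(2,2))
    }
}
```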
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402020#comment-15402020 ] Nemon Lou commented on HIVE-14374: -- [~pvary] Thanks for the reminder. If it was removed by accident, then it would be good to reintroduce it. We already use this in our production. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402024#comment-15402024 ] Peter Vary commented on HIVE-14374: --- The connected URL is always shown since HIVE-11769, which I think made it into 2.1.0. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12754) AuthTypes.NONE cause exception after HS2 start
[ https://issues.apache.org/jira/browse/HIVE-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402028#comment-15402028 ] Murshid Chalaev commented on HIVE-12754: Hi, Heng On which version of hive did you face this issue? Did you manage to solve it or find a workaround? > AuthTypes.NONE cause exception after HS2 start > -- > > Key: HIVE-12754 > URL: https://issues.apache.org/jira/browse/HIVE-12754 > Project: Hive > Issue Type: Bug >Reporter: Heng Chen > > I set {{hive.server2.authentication}} to be {{NONE}} > After HS2 start, i see exception in log below: > {code} > 2015-12-29 16:58:42,339 ERROR [HiveServer2-Handler-Pool: Thread-31]: > server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred > during processing of message. > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:328) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
4 more > {code} > IMO the problem is we use Sasl transport when authType is NONE, > {code:title=HiveAuthFactory.java} > public TTransportFactory getAuthTransFactory() throws LoginException { > TTransportFactory transportFactory; > if (authTypeStr.equalsIgnoreCase(AuthTypes.KERBEROS.getAuthName())) { > try { > transportFactory = > saslServer.createTransportFactory(getSaslProperties()); > } catch (TTransportException e) { > throw new LoginException(e.getMessage()); > } > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.NONE.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.LDAP.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.PAM.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.NOSASL.getAuthName())) { > transportFactory = new TTransportFactory(); > } else if (authTypeStr.equalsIgnoreCase(AuthTypes.CUSTOM.getAuthName())) { > transportFactory = > PlainSaslHelper.getPlainTransportFactory(authTypeStr); > } else { > throw new LoginException("Unsupported authentication type " + > authTypeStr); > } > return transportFactory; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
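For reference, the chain above sends every auth type except KERBEROS and NOSASL through the plain SASL factory, so a client that skips SASL negotiation fails with the "No data or no sasl data in the stream" error even when authentication is NONE. The dispatch can be summarized as (simplified sketch, not Hive's actual classes):

```java
import java.util.Locale;

public class AuthDispatch {
    enum Transport { KERBEROS_SASL, PLAIN_SASL, RAW }

    // Mirrors the if-else chain in HiveAuthFactory.getAuthTransFactory:
    // NONE, LDAP, PAM and CUSTOM all negotiate PLAIN SASL; only NOSASL is raw.
    static Transport forAuthType(String authType) {
        switch (authType.toUpperCase(Locale.ROOT)) {
            case "KERBEROS": return Transport.KERBEROS_SASL;
            case "NOSASL":   return Transport.RAW;
            case "NONE":
            case "LDAP":
            case "PAM":
            case "CUSTOM":   return Transport.PLAIN_SASL;
            default:
                throw new IllegalArgumentException(
                        "Unsupported authentication type " + authType);
        }
    }

    public static void main(String[] args) {
        System.out.println(forAuthType("NONE"));   // PLAIN_SASL, not RAW
        System.out.println(forAuthType("NOSASL")); // RAW
    }
}
```

So a server configured with NONE still expects a SASL handshake; only NOSASL on both ends avoids it.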
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402036#comment-15402036 ] Nemon Lou commented on HIVE-14374: -- For my part, it would be fine to remove it. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14374) BeeLine argument, and configuration handling cleanup
[ https://issues.apache.org/jira/browse/HIVE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402041#comment-15402041 ] Peter Vary commented on HIVE-14374: --- Thanks, I will see if anyone else needs it. > BeeLine argument, and configuration handling cleanup > > > Key: HIVE-14374 > URL: https://issues.apache.org/jira/browse/HIVE-14374 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14146: -- Attachment: HIVE-14146.7.patch Regenerated the diff, since I do not know why it was not recognized. The decision on the CLI formatted comment handling is still required. > Column comments with "\n" character "corrupts" table metadata > - > > Key: HIVE-14146 > URL: https://issues.apache.org/jira/browse/HIVE-14146 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, > HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, > HIVE-14146.7.patch, HIVE-14146.patch > > > Create a table with the following (noting the \n in the COMMENT): > {noformat} > CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an > individual'); > {noformat} > Describe shows that now the metadata is messed up: > {noformat} > beeline> describe commtest; > +---++---+--+ > | col_name | data_type |comment| > +---++---+--+ > | first_nm | string | Indicates First name | > | of an individual | NULL | NULL | > +---++---+--+ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
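One way to keep a multi-line comment from splitting the describe output into extra rows is to escape control characters before display; a sketch of that idea (illustrative only, not necessarily what the attached patch does):

```java
public class CommentEscape {
    // Replaces raw newlines, carriage returns and tabs in a column comment
    // with their escaped forms so one comment stays on one output row.
    static String escapeForDisplay(String comment) {
        return comment.replace("\\", "\\\\")
                      .replace("\n", "\\n")
                      .replace("\r", "\\r")
                      .replace("\t", "\\t");
    }

    public static void main(String[] args) {
        System.out.println(escapeForDisplay("Indicates First name\nof an individual"));
        // Indicates First name\nof an individual
    }
}
```

The open question in the comment above, whether the CLI's formatted output should do the same, is exactly where such an escape step would have to be wired in.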
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402097#comment-15402097 ] Josh Elser commented on HIVE-14394: --- Closing the loop on the upstream fix (since most probably won't be watching [~sushanth]'s PR): I just merged in his change (thanks so much for catching and fixing) and released an 0.1.1 of the reporter. It should be available via Maven central now (but not available via http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22dropwizard-metrics-hadoop-metrics2-reporter%22 for a few hours). Sorry for the temporary pain! > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12954) NPE with str_to_map on null strings
[ https://issues.apache.org/jira/browse/HIVE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402163#comment-15402163 ] Aihua Xu commented on HIVE-12954: - +1. The change looks good. > NPE with str_to_map on null strings > --- > > Key: HIVE-12954 > URL: https://issues.apache.org/jira/browse/HIVE-12954 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0 >Reporter: Charles Pritchard >Assignee: Marta Kuczora > Attachments: HIVE-12954.2.patch, HIVE-12954.patch > > > Running str_to_map on a null string will throw a NullPointerException. > Workaround is to use coalesce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
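The fix amounts to a null check before splitting the input. The real change lives in Hive's str_to_map UDF implementation; this is only a standalone sketch of the null-safe semantics, with a hypothetical class and method name:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StrToMapSketch {
    // Null-safe variant of str_to_map: a null input yields null (SQL-style
    // null propagation) rather than an NPE from splitting a null string.
    public static Map<String, String> strToMap(String text, String delim1, String delim2) {
        if (text == null) {
            return null;
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : text.split(delim1)) {
            String[] kv = pair.split(delim2, 2);
            // A key without a value maps to null, matching lenient parsing.
            result.put(kv[0], kv.length > 1 ? kv[1] : null);
        }
        return result;
    }
}
```

With the guard in place, `str_to_map(NULL, ',', ':')` returns NULL and the coalesce workaround mentioned in the description is no longer needed.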
[jira] [Commented] (HIVE-14387) Add an option to skip the table names for the column headers
[ https://issues.apache.org/jira/browse/HIVE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402213#comment-15402213 ] Marta Kuczora commented on HIVE-14387: -- Found the following config parameter: hive.resultset.use.unique.column.names - "Make column names unique in the result set by qualifying column names with table alias if needed. Table alias will be added to column names for queries of type "select *" or if query explicitly uses table alias "select r1.x..". Default value: true. If this parameter is set to false, the result will contain only the column names without the table name prefix. Example:
hive.resultset.use.unique.column.names=true
{noformat}
0: jdbc:hive2://> select * from car limit 1;
OK
+------------+-----------+------------+-----------+-----------+--+
| car.carid  | car.type  | car.color  | car.lnum  | car.year  |
+------------+-----------+------------+-----------+-----------+--+
| 1000       | Audi      | red        | AAA111    | 2009      |
+------------+-----------+------------+-----------+-----------+--+
1 row selected (0.084 seconds)
{noformat}
hive.resultset.use.unique.column.names=false
{noformat}
0: jdbc:hive2://> select * from car limit 1;
OK
+--------+-------+--------+---------+-------+--+
| carid  | type  | color  | lnum    | year  |
+--------+-------+--------+---------+-------+--+
| 1000   | Audi  | red    | AAA111  | 2009  |
+--------+-------+--------+---------+-------+--+
1 row selected (0.084 seconds)
{noformat}
[~vihangk1], would this parameter be a suitable option?
> Add an option to skip the table names for the column headers
> ------------------------------------------------------------
>
> Key: HIVE-14387
> URL: https://issues.apache.org/jira/browse/HIVE-14387
> Project: Hive
> Issue Type: Improvement
> Components: Beeline
> Reporter: Vihang Karajgaonkar
> Assignee: Marta Kuczora
> Priority: Minor
>
> It would be good to have an option where the beeline output could skip reporting the . in the headers.
> Eg:
> {noformat}
> 0: jdbc:hive2://:> select * from sample_07 limit 1;
> ----------------------------------------------------------------------------
> sample_07.code  sample_07.description  sample_07.total_emp  sample_07.salary
> ----------------------------------------------------------------------------
> 00-             Operations             123                  12345
> ----------------------------------------------------------------------------
> {noformat}
> b) After the option is set:
> {noformat}
> 0: jdbc:hive2://:> select * from sample_07 limit 1;
> --------------------------------------
> code  description  total_emp  salary
> --------------------------------------
> 00-   Operations   123        12345
> --------------------------------------
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
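If a BeeLine-side option were preferred over the server-side config above, the header transformation itself is trivial; a sketch with a hypothetical helper name (the first dot is treated as the table-qualifier separator):

```java
public class HeaderUtil {
    // Drop a leading "<table>." qualifier from a result-set column label,
    // turning "sample_07.salary" into "salary". Labels without a dot are
    // returned unchanged.
    public static String stripTablePrefix(String label) {
        int dot = label.indexOf('.');
        return dot < 0 ? label : label.substring(dot + 1);
    }
}
```

BeeLine would apply this to each column label when rendering headers, gated on the proposed option.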
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402216#comment-15402216 ] Aihua Xu commented on HIVE-14146: - [~pvary] Yeah. You may need to rebase to the latest change. Do you need to generate the new baseline for your new test case?
> Column comments with "\n" character "corrupts" table metadata
> -------------------------------------------------------------
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.2.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, HIVE-14146.7.patch, HIVE-14146.patch
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402245#comment-15402245 ] Peter Vary commented on HIVE-14146: --- I rebased and ran the qtests. It seems ok now - we will see the result not too soon :) What I am not sure about is how to handle the CLI/BeeLine differences here. CLI uses formatted output, which specifically displays comments with \n in new lines, but only in column comments, and not in table comments. Probably the nicest solution would be to keep the \n-s as newlines in CLI mode in table comments as well, so at least in CLI every newline would remain a newline, and in BeeLine every newline would be a \n.
> Column comments with "\n" character "corrupts" table metadata
> -------------------------------------------------------------
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.2.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, HIVE-14146.7.patch, HIVE-14146.patch
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
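The BeeLine half of the behavior discussed above - render an embedded newline as the two characters `\` and `n` so a multi-line comment stays in one table cell - is a one-line transformation. A sketch only, with a hypothetical class name; the attached patch may do this differently:

```java
public class CommentEscaper {
    // Escape embedded newlines for tabular (BeeLine) output so a comment
    // containing '\n' occupies a single row instead of corrupting the table.
    public static String escapeForTable(String comment) {
        return comment == null ? null : comment.replace("\n", "\\n");
    }
}
```

The CLI path would skip this escaping, keeping real newlines in its formatted output as proposed.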
[jira] [Commented] (HIVE-14395) Add the missing data files to Avro union tests (HIVE-14205 addendum)
[ https://issues.apache.org/jira/browse/HIVE-14395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402264#comment-15402264 ] Hive QA commented on HIVE-14395: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821239/HIVE-14395.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10404 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-script_pipe.q-orc_ppd_schema_evol_2a.q-join1.q-and-12-more - did not produce a TEST-*.xml file TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/721/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/721/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-721/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12821239 - PreCommit-HIVE-MASTER-Build > Add the missing data files to Avro union tests (HIVE-14205 addendum) > > > Key: HIVE-14395 > URL: https://issues.apache.org/jira/browse/HIVE-14395 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Trivial > Attachments: HIVE-14395.patch > > > The union_non_nullable.txt & union_nullable.txt were not checked in for > HIVE-14205. It was my mistake. > It is the reason that testCliDriver_avro_nullable_union & > testNegativeCliDriver_avro_non_nullable_union are failing in current > pre-commit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402297#comment-15402297 ] Sushanth Sowmyan commented on HIVE-14394: - Awesome! :) Thanks, I'll update the pom dep instead of this. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-14394: Status: Patch Available (was: Open) > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-14394: Attachment: HIVE-14394.2.patch Updated patch attached. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata
[ https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402337#comment-15402337 ] Aihua Xu commented on HIVE-14146: - I see. Sure. Seems reasonable to make such change. > Column comments with "\n" character "corrupts" table metadata > - > > Key: HIVE-14146 > URL: https://issues.apache.org/jira/browse/HIVE-14146 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, > HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, > HIVE-14146.7.patch, HIVE-14146.patch > > > Create a table with the following(noting the \n in the COMMENT): > {noformat} > CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an > individual’); > {noformat} > Describe shows that now the metadata is messed up: > {noformat} > beeline> describe commtest; > +---++---+--+ > | col_name | data_type |comment| > +---++---+--+ > | first_nm | string | Indicates First name | > | of an individual | NULL | NULL | > +---++---+--+ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402366#comment-15402366 ] Chao Sun commented on HIVE-14346: - Test failures unrelated. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevents the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If users want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
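The precedence the description implies can be sketched in a few lines. This is an illustration of the intended resolution order, not the actual HiveConf code, and the method name is hypothetical:

```java
public class StrictModeResolver {
    // If the deprecated hive.mapred.mode is explicitly set, it wins for
    // backward compatibility; otherwise the fine-grained flag decides.
    // With the old default of "nonstrict", the first branch always fired
    // and masked the new configurations - the bug described above.
    // A null default restores the fall-through.
    public static boolean strictCheckEnabled(String mapredMode, boolean fineGrainedFlag) {
        if (mapredMode != null) {
            return "strict".equals(mapredMode);
        }
        return fineGrainedFlag;
    }
}
```

Each of the three fine-grained checks would resolve independently through this pattern against its own flag.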
[jira] [Commented] (HIVE-14128) Parallelize jobClose phases
[ https://issues.apache.org/jira/browse/HIVE-14128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402391#comment-15402391 ] Ashutosh Chauhan commented on HIVE-14128: - I think approach on HIVE-14270 is better than this. So, I have abandoned this in favor of that. Once we have HIVE-14270 I think this won't be necessary. > Parallelize jobClose phases > --- > > Key: HIVE-14128 > URL: https://issues.apache.org/jira/browse/HIVE-14128 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0, 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14128.1.patch, HIVE-14128.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list
[ https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402393#comment-15402393 ] Ashutosh Chauhan commented on HIVE-14393: - +1 > Tuple in list feature fails if there's only 1 tuple in the list > --- > > Key: HIVE-14393 > URL: https://issues.apache.org/jira/browse/HIVE-14393 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-14393.01.patch > > > So this works: > {code} > hive> select * from test where (x,y) in ((1,1),(2,2)); > OK > 1 1 > 2 2 > Time taken: 0.063 seconds, Fetched: 2 row(s) > {code} > And this doesn't: > {code} > hive> select * from test where (x,y) in ((1,1)); > org.antlr.runtime.EarlyExitException > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510) > {code} > If I'm generating SQL I'd like to not have to special case 1 tuple. > As a point of comparison this works in Postgres: > {code} > vagrant=# select * from test where (x, y) in ((1, 1)); > x | y > ---+--- > 1 | 1 > (1 row) > {code} > Any thoughts on this [~pxiong] ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
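Until the grammar accepts a single tuple, SQL generators hitting this EarlyExitException can expand the one-tuple case into plain equality predicates. A workaround sketch for the generator side (hypothetical helper, not the parser fix in the attached patch):

```java
import java.util.List;

public class TupleInListGenerator {
    // Emit "(x,y) in ((1,1),(2,2))" for multiple tuples, but expand a
    // single tuple to "(x = 1 AND y = 1)" so the ANTLR limitation is
    // never hit. Values are assumed to be pre-rendered SQL literals.
    public static String inClause(List<String> cols, List<List<String>> tuples) {
        if (tuples.size() == 1) {
            StringBuilder sb = new StringBuilder("(");
            List<String> t = tuples.get(0);
            for (int i = 0; i < cols.size(); i++) {
                if (i > 0) sb.append(" AND ");
                sb.append(cols.get(i)).append(" = ").append(t.get(i));
            }
            return sb.append(")").toString();
        }
        StringBuilder sb = new StringBuilder("(")
                .append(String.join(",", cols)).append(") in (");
        for (int i = 0; i < tuples.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append("(").append(String.join(",", tuples.get(i))).append(")");
        }
        return sb.append(")").toString();
    }
}
```

Both forms are semantically equivalent, which is why the Postgres behavior quoted above is the reasonable baseline.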
[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging
[ https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402434#comment-15402434 ] Josh Elser commented on HIVE-14394: --- I was able to build master with your v2 patch locally. LGTM. > Reduce excessive INFO level logging > --- > > Key: HIVE-14394 > URL: https://issues.apache.org/jira/browse/HIVE-14394 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-14394.2.patch, HIVE-14394.patch > > > We need to cull down on the number of logs we generate in HMS and HS2 that > are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3776) support PIVOT in hive
[ https://issues.apache.org/jira/browse/HIVE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402436#comment-15402436 ] Aditya commented on HIVE-3776: -- Is there any alternate option for this? > support PIVOT in hive > - > > Key: HIVE-3776 > URL: https://issues.apache.org/jira/browse/HIVE-3776 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > > It is a fairly well understood feature in databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7239) Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files
[ https://issues.apache.org/jira/browse/HIVE-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402460#comment-15402460 ] Illya Yalovyy commented on HIVE-7239: - [~gopalv], [~owen.omalley], [~ashutoshc], Could you please review this patch and suggest the next step to get it accepted? > Fix bug in HiveIndexedInputFormat implementation that causes incorrect query > result when input backed by Sequence/RC files > -- > > Key: HIVE-7239 > URL: https://issues.apache.org/jira/browse/HIVE-7239 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.1.0 >Reporter: Sumit Kumar >Assignee: Illya Yalovyy > Attachments: HIVE-7239.2.patch, HIVE-7239.3.patch, HIVE-7239.4.patch, > HIVE-7239.patch > > > In case of sequence files, it's crucial that splits are calculated around the > boundaries enforced by the input sequence file. However, by default Hadoop > creates input splits depending on the configuration parameters, which may not > match the boundaries for the input sequence file. Hive provides > HiveIndexedInputFormat, which adds extra logic and recalculates the split > boundaries for each split depending on the sequence file's boundaries. > However, we noticed this behavior of "over" reporting from data backed by a > sequence file. We have sample data on which we experimented and fixed this > bug; we verified the fix by comparing the query output for input in sequence > file, RC file, and regular formats. However, we have not been able to > find the right place to include this as a unit test that would execute as > part of the Hive tests. We tried writing a "clientpositive" test as part of the ql > module, but the output seems quite verbose and I couldn't interpret it that > well. Can someone please review this change and guide on how to write a test > that will execute as part of Hive testing? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402499#comment-15402499 ] Hive QA commented on HIVE-14390: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12821158/HIVE-14390.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 370 failed/errored test(s), 10419 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_SortUnionTransposeRule org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join_merge org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_deleteAnalyze org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_join_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppd org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_convert_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_innerjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2 org.apache.hadoop.hive.cli.
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402503#comment-15402503 ] Sergey Shelukhin commented on HIVE-14397: - Which part of the patch fixes the problem where "when we run more queries the reopened sessions are not used instead new session is opened"? Queue name logic makes sense to me > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402506#comment-15402506 ] Sergey Shelukhin commented on HIVE-14357: - I will commit to both places. It's a test-only issue though, so I dunno if it should really impact the release plans. > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R. But upon closer inspection, it seems to be R,S,T. > I'm not much familiar with what these locks are and why the order is > important. Raising this jira so while I try to understand it all. Meanwhile, > if somebody can explain here, would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
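One way to make such a test robust is to compare the acquired lock set ignoring order, instead of encoding the lock manager's internal ordering in the assertion indices. A sketch of that idea only - not necessarily what the attached patch does:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LockAssertSketch {
    // Compare expected vs. acquired lock-table names ignoring order, so the
    // test no longer breaks when getLocks() returns R,S,T instead of T,S,R.
    public static boolean sameLocks(List<String> expected, List<String> acquired) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(acquired);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }
}
```

The trade-off is that an order-sensitive bug would no longer be caught; that only matters if lock acquisition order is itself part of the contract under test.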
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402519#comment-15402519 ] Sushanth Sowmyan commented on HIVE-13966: - Sorry for the late response, I've been tied up with a bunch of other issues of late. I will look at this and review in the next couple of days. This is an issue that is important to fix, and I'm glad you have a patch up for it. :) > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation is failed (in step 2), we still add entry to notification > log. Found this issue in testing. > It is still ok as this is the case of false positive. > If the operation is successful and adding to notification log failed, the > user will get an MetaException. It will not rollback the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
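The fix the description points toward is moving step 4 inside the transaction, so the notification commits or rolls back atomically with the operation. A self-contained sketch of that pattern with a toy in-memory store (the real code is in HiveMetaStore.java and differs in detail):

```java
import java.util.ArrayList;
import java.util.List;

public class NotificationPatternSketch {
    // Minimal in-memory "store": events written during a transaction are
    // only kept if the transaction commits.
    static class Store {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        void addNotification(String e) { pending.add(e); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    // Fixed ordering: log the event inside the same transaction as the DDL.
    // A failed operation then logs nothing (no false positives), and a
    // failed log write rolls the DDL back (no false negatives).
    public static List<String> runOp(boolean opSucceeds) {
        Store store = new Store();
        try {
            if (!opSucceeds) {
                throw new RuntimeException("operation failed");
            }
            store.addNotification("DROP_TABLE"); // same txn as the operation
            store.commit();
        } catch (RuntimeException e) {
            store.rollback();
        }
        return store.committed;
    }
}
```

The key property is that the notification write can no longer diverge from the operation's outcome in either direction.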
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402521#comment-15402521 ] Pengcheng Xiong commented on HIVE-14390: [~nemon], most of the output files look good to me. Could u double check union15.q and union.9.q in SparkCliDriver? It seems that they generate a different plan? Thanks. > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. > query95 : > {noformat} > SELECT count(distinct ws1.ws_order_number) as order_count, >sum(ws1.ws_ext_ship_cost) as total_shipping_cost, >sum(ws1.ws_net_profit) as total_net_profit > FROM web_sales ws1 > JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk) > JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk) > JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk) > LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number >FROM web_sales ws2 JOIN web_sales ws3 >ON (ws2.ws_order_number = ws3.ws_order_number) >WHERE ws2.ws_warehouse_sk <> > ws3.ws_warehouse_sk > ) ws_wh1 > ON (ws1.ws_order_number = ws_wh1.ws_order_number) > LEFT SEMI JOIN (SELECT wr_order_number >FROM web_returns wr >JOIN (SELECT ws4.ws_order_number as > ws_order_number > FROM web_sales ws4 JOIN web_sales > ws5 > ON (ws4.ws_order_number = > ws5.ws_order_number) > WHERE ws4.ws_warehouse_sk <> > ws5.ws_warehouse_sk > ) ws_wh2 >ON (wr.wr_order_number = > ws_wh2.ws_order_number)) tmp1 > ON (ws1.ws_order_number = tmp1.wr_order_number) > WHERE d.d_date between '2002-05-01' and '2002-06-30' and >ca.ca_state = 'GA' and >s.web_company_name = 'pri'; > {noformat} -- This message was sent by Atlassian 
JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402561#comment-15402561 ] Daniel Dai commented on HIVE-14399: --- The test waits for EVENTS_TTL * 2 = 60s and assumes events are cleaned up. In the meantime, DbNotificationListener.CleanerThread is invoked every 60s. It is possible the cleanup thread doesn't get a chance to run during the test wait time. I'd like to increase the frequency of the CleanerThread in the test to make sure it will get a chance to run during the waiting. > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
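More generally, "sleep a fixed time and assume the background thread ran" flakiness can be removed by polling with a deadline. The proposed fix above shortens the CleanerThread interval; this is only a sketch of the complementary test-side pattern, with a hypothetical helper:

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Poll a condition until it holds or the deadline passes, instead of
    // sleeping once and hoping the cleaner thread was scheduled in time.
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }
}
```

The test would then assert `waitFor(() -> countEvents() == 0, timeout, poll)` rather than sleeping EVENTS_TTL * 2 unconditionally (`countEvents` being whatever the test uses to observe cleanup).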
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402567#comment-15402567 ] Sergey Shelukhin commented on HIVE-14392: - Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi reassigned HIVE-14373: --- Assignee: Abdullah Yousufi > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402576#comment-15402576 ] Pengcheng Xiong commented on HIVE-14390: ccing [~jcamachorodriguez] as well > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402567#comment-15402567 ] Sergey Shelukhin edited comment on HIVE-14392 at 8/1/16 6:23 PM: - Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. We were always running on YARN only, too was (Author: sershe): Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402582#comment-15402582 ] Prasanth Jayachandran commented on HIVE-14397: -- [~sershe] The root cause of this bug is this code https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L275-L287 If the queue name is set in conf then it always creates a new session when getSession is invoked. The part that fixes this is conf.unset("tez.queue.name"). This makes sure we use the sessions from the pool when available. The other part of the patch {code} conf.set(TezConfiguration.TEZ_QUEUE_NAME, sessionState.getQueueName()); {code} is required to reopen the initial sessions when user specified queue name is not available. If we don't set this then all the reopened sessions will use "default" queue. > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
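The decision Prasanth describes can be sketched as follows. This is a hypothetical model of the pool behavior, not the real TezSessionPoolManager code: a user-specified queue in the conf always bypasses the pool, so the fix clears tez.queue.name for default sessions (and separately re-applies the pool's queue name when reopening, so reopened sessions don't fall back to "default").

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical sketch of the pool behavior described above; names do not
// match the real TezSessionPoolManager code.
public class SessionPoolSketch {
    static final String USER_QUEUE_KEY = "tez.queue.name";

    final Deque<String> pool = new ArrayDeque<>(); // pooled sessions, by queue name
    int extraSessionsOpened = 0;

    SessionPoolSketch(String... pooledQueues) {
        for (String q : pooledQueues) pool.add(q);
    }

    /** Mirrors the buggy decision: a user-specified queue in the conf always
     *  bypasses the pool and opens a brand-new session. */
    String getSession(Map<String, String> conf) {
        if (conf.containsKey(USER_QUEUE_KEY)) {
            extraSessionsOpened++;
            return "new:" + conf.get(USER_QUEUE_KEY);
        }
        String q = pool.poll();
        return q != null ? "pooled:" + q : "new:default";
    }

    /** The fix described above: default (pooled) sessions drop the queue key
     *  so later getSession calls hit the pool again. */
    static void unsetQueueForPooledSession(Map<String, String> conf) {
        conf.remove(USER_QUEUE_KEY);
    }
}
```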
[jira] [Updated] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-14399: -- Attachment: HIVE-14399.1.patch > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-14399.1.patch > > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14399) Fix test flakiness of org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs
[ https://issues.apache.org/jira/browse/HIVE-14399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-14399: -- Status: Patch Available (was: Open) > Fix test flakiness of > org.apache.hive.hcatalog.listener.TestDbNotificationListener.cleanupNotifs > > > Key: HIVE-14399 > URL: https://issues.apache.org/jira/browse/HIVE-14399 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-14399.1.patch > > > We get intermittent test failure of TestDbNotificationListener.cleanupNotifs. > We shall make it stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402586#comment-15402586 ] Jesus Camacho Rodriguez commented on HIVE-14390: As [~pxiong] indicated, indeed quick pass over the changes indicates that there are no regressions and logic in the patch makes sense. I wonder if there might be some performance impact on the time spent on join reordering algorithm; I remember having a conversation with [~jpullokkaran] about a reason to not use different aliases, but honestly I cannot remember the details anymore. On the other hand, we do something similar for return path as we use a different ID for every (sub)query block (line 136 in HiveTableScan). Thus, IMO it is OK to check it in, and we can keep an eye on the future compilation for multi-join queries. > Wrong Table alias when CBO is on > > > Key: HIVE-14390 > URL: https://issues.apache.org/jira/browse/HIVE-14390 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-14390.patch, explain.rar > > > There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5. > But the query plan only has ws1 when CBO is on. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14357) TestDbTxnManager2#testLocksInSubquery failing in branch-2.1
[ https://issues.apache.org/jira/browse/HIVE-14357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14357: Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 Target Version/s: 2.1.1 (was: 1.3.0, 2.1.1) Status: Resolved (was: Patch Available) Committed to master and branch-2.1 > TestDbTxnManager2#testLocksInSubquery failing in branch-2.1 > --- > > Key: HIVE-14357 > URL: https://issues.apache.org/jira/browse/HIVE-14357 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Rajat Khandelwal >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14357.patch > > > {noformat} > checkCmdOnDriver(driver.compileAndRespond("insert into R select * from S > where a in (select a from T where b = 1)")); > txnMgr.openTxn("three"); > txnMgr.acquireLocks(driver.getPlan(), ctx, "three"); > locks = getLocks(); > Assert.assertEquals("Unexpected lock count", 3, locks.size()); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "T", null, > locks.get(0)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "S", null, > locks.get(1)); > checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "R", null, > locks.get(2)); > {noformat} > This test case is failing. The expected order of locks is supposed to be T, > S, R. But upon closer inspection, it seems to be R,S,T. > I'm not much familiar with what these locks are and why the order is > important. Raising this jira so while I try to understand it all. Meanwhile, > if somebody can explain here, would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14322) Postgres db issues after Datanucleus 4.x upgrade
[ https://issues.apache.org/jira/browse/HIVE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14322: Resolution: Fixed Fix Version/s: 2.0.2 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Committed. Thanks for the review! > Postgres db issues after Datanucleus 4.x upgrade > > > Key: HIVE-14322 > URL: https://issues.apache.org/jira/browse/HIVE-14322 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0, 2.0.1 >Reporter: Thejas M Nair >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1, 2.0.2 > > Attachments: HIVE-14322.02.patch, HIVE-14322.03.patch, > HIVE-14322.04.patch, HIVE-14322.1.patch > > > With the upgrade to datanucleus 4.x versions in HIVE-6113, hive does not > work properly with postgres. > The nullable fields in the database have string "NULL::character varying" > instead of real NULL values. This causes various issues. > One example is - > {code} > hive> create table t(i int); > OK > Time taken: 1.9 seconds > hive> create view v as select * from t; > OK > Time taken: 0.542 seconds > hive> select * from v; > FAILED: SemanticException Unable to fetch table v. > java.net.URISyntaxException: Relative path in absolute URI: > NULL::character%20varying > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14386) UGI clone shim also needs to clone credentials
[ https://issues.apache.org/jira/browse/HIVE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14386: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed > UGI clone shim also needs to clone credentials > -- > > Key: HIVE-14386 > URL: https://issues.apache.org/jira/browse/HIVE-14386 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-14386.patch > > > Discovered while testing HADOOP-13081 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14397) Queries launched after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402615#comment-15402615 ] Sergey Shelukhin commented on HIVE-14397: - +1 > Queries launched after reopening of tez session launches additional sessions > > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14340: Attachment: HIVE-14340.1.patch Attaching new patch. This changes {{hive.query.hooks}} to {{hive.query.lifetime.hooks}}. Also, now the hook activates at 4 places: # before query compilation # after query compilation # before query execution # after query execution > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
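The four activation points above suggest an interface shaped roughly like the following. This is a hypothetical sketch; the interface and method names in the actual HIVE-14340 patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of the query lifetime hook described above.
public class HookDriverSketch {
    interface QueryLifeTimeHook {
        void beforeCompile(String query);
        void afterCompile(String query);
        void beforeExecution(String query);
        void afterExecution(String query);
    }

    /** A no-op implementation; a real hook could, e.g., register a temporary
     *  UDF in beforeCompile and drop it in afterExecution. */
    static class NoOpHook implements QueryLifeTimeHook {
        public void beforeCompile(String q) {}
        public void afterCompile(String q) {}
        public void beforeExecution(String q) {}
        public void afterExecution(String q) {}
    }

    /** Drives the four call sites in order and returns that order. */
    static List<String> run(String query, QueryLifeTimeHook hook) {
        List<String> order = new ArrayList<>();
        hook.beforeCompile(query);   order.add("beforeCompile");
        /* ... compile the query ... */
        hook.afterCompile(query);    order.add("afterCompile");
        hook.beforeExecution(query); order.add("beforeExecution");
        /* ... execute the query ... */
        hook.afterExecution(query);  order.add("afterExecution");
        return order;
    }
}
```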
[jira] [Updated] (HIVE-14397) Queries ran after reopening of tez session launches additional sessions
[ https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14397: - Summary: Queries ran after reopening of tez session launches additional sessions (was: Queries launched after reopening of tez session launches additional sessions) > Queries ran after reopening of tez session launches additional sessions > --- > > Key: HIVE-14397 > URL: https://issues.apache.org/jira/browse/HIVE-14397 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.1.0, 2.2.0 >Reporter: Takahiko Saito >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-14397.1.patch > > > Say we have configured hive.server2.tez.default.queues with 2 queues q1 and > q2 with default expiry interval of 5 mins. > After 5 mins of non-usage the sessions corresponding to queues q1 and q2 will > be expired. When new set of queries are issue after this expiry, the default > sessions backed by q1 and q2 and reopened again. Now when we run more queries > the reopened sessions are not used instead new session is opened. > At this point there will be 4 sessions running (2 abandoned sessions and 2 > current sessions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14377) LLAP IO: issue with how estimate cache removes unneeded buffers
[ https://issues.apache.org/jira/browse/HIVE-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402631#comment-15402631 ] Sergey Shelukhin commented on HIVE-14377: - All the tests for Tez failed on one instance due to {noformat}Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable to read from or write to hbase Failed 1 action: RetriesExhaustedException: 1 time, at org.apache.hadoop.hive.metastore.hbase.HBaseStore.createDatabase(HBaseStore.java:158) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:612) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:628) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:418) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.(HiveMetaStore.java:376) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:237) ~[hive-metastore-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3356) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3397) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3377) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3631) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]{noformat} I cannot repro the rest (or they are some existing failures). 
> LLAP IO: issue with how estimate cache removes unneeded buffers > --- > > Key: HIVE-14377 > URL: https://issues.apache.org/jira/browse/HIVE-14377 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-14377.01.patch, HIVE-14377.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402656#comment-15402656 ] Chao Sun commented on HIVE-14343: - [~xuefuz], [~jxiang] can you give a review on this? Thanks. > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List<HiveDriverRunHook> driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
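One plausible direction for the fix, sketched below under the assumption (the patch itself is not shown) that the compiled query string is still available when runInternal is entered with alreadyCompiled == true. The method name is hypothetical:

```java
// Hypothetical sketch: prefer the explicit command argument, but fall back
// to the query string recorded at compile time, which is what HS2 mode
// (alreadyCompiled == true) would need since its command argument is null.
public class HookContextSketch {
    static String commandForHookContext(String commandArg, boolean alreadyCompiled,
                                        String compiledQueryString) {
        return commandArg != null ? commandArg : compiledQueryString;
    }
}
```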
[jira] [Comment Edited] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402366#comment-15402366 ] Chao Sun edited comment on HIVE-14346 at 8/1/16 7:14 PM: - Test failures unrelated, although I changed {{subquery_multiinsert.q.out}} since it's not consistent with the qfile. was (Author: csun): Test failures unrelated. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevent the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If user want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14346) Change the default value for hive.mapred.mode to null
[ https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14346: Attachment: HIVE-14346.3.patch Address rebase issue. > Change the default value for hive.mapred.mode to null > - > > Key: HIVE-14346 > URL: https://issues.apache.org/jira/browse/HIVE-14346 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, > HIVE-14346.2.patch, HIVE-14346.3.patch > > > HIVE-12727 introduces three new configurations to replace the existing > {{hive.mapred.mode}}, which is deprecated. However, the default value for the > latter is 'nonstrict', which prevent the new configurations from being used > (see comments in that JIRA for more details). > This proposes to change the default value for {{hive.mapred.mode}} to null. > Users can then set the three new configurations to get more fine-grained > control over the strict checking. If user want to use the old configuration, > they can set {{hive.mapred.mode}} to strict/nonstrict. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
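The resolution order this JIRA proposes can be sketched as a single predicate. This is a hypothetical simplification: the real patch resolves three separate fine-grained settings, and the legacy key is consulted per check.

```java
// Hypothetical sketch of the resolution order described above: with
// hive.mapred.mode defaulting to null, each fine-grained strict-check flag
// takes effect; setting the legacy key overrides them all.
public class StrictModeSketch {
    static boolean strictCheckEnabled(String hiveMapredMode, boolean fineGrainedFlag) {
        if (hiveMapredMode == null) {
            return fineGrainedFlag;             // new per-check setting wins
        }
        return "strict".equals(hiveMapredMode); // legacy setting overrides all checks
    }
}
```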
[jira] [Updated] (HIVE-14350) Aborted txns cause false positive "Not enough history available..." msgs
[ https://issues.apache.org/jira/browse/HIVE-14350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14350: -- Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 1.3.0 Status: Resolved (was: Patch Available) > Aborted txns cause false positive "Not enough history available..." msgs > > > Key: HIVE-14350 > URL: https://issues.apache.org/jira/browse/HIVE-14350 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Fix For: 1.3.0, 2.2.0, 2.1.1 > > Attachments: HIVE-14350.2.patch, HIVE-14350.3.patch, > HIVE-14350.5.patch, HIVE-14350.6.patch, HIVE-14350.7.patch, > HIVE-14350.8.patch, HIVE-14350.9.patch > > > this is a followup to HIVE-13369. Only open txns should prevent use of a > base file. But ValidTxnList does not make a distinction between open and > aborted txns. The presence of aborted txns causes false positives which can > happen too often since the flow is > 1. Worker generates a new base file, > 2. then asynchronously Cleaner removes now-compacted aborted txns. (strictly > speaking it's Initiator that does the actual clean up) > So we may have base_5 and base_10 and txnid 7 aborted. Then current impl > will disallow use of base_10 though there is no need for that. Worse, if > txnid_4 is aborted and hasn't been purged yet, base_5 will be rejected as > well and then an error will be raised since there is no suitable base file > left. > ErrorMsg.ACID_NOT_ENOUGH_HISTORY is msg produced -- This message was sent by Atlassian JIRA (v6.3.4#6332)
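The distinction the description draws can be sketched as a predicate over transaction state. This is a hypothetical illustration of the invariant, not the ValidTxnList API: only *open* transactions below a base file's transaction id should block its use, because aborted transactions are already excluded from the compacted base.

```java
import java.util.Set;

// Hypothetical sketch of the base-file validity check described above.
public class BaseFileCheck {
    static boolean canUseBase(long baseTxnId, Set<Long> openTxns, Set<Long> abortedTxns) {
        for (long t : openTxns) {
            if (t <= baseTxnId) {
                return false; // open history below the base is genuinely missing
            }
        }
        // Aborted txns (e.g. txnid 7 relative to base_10 in the example)
        // must not reject the base: their data was never committed.
        return true;
    }
}
```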
[jira] [Updated] (HIVE-14366) Conversion of a Non-ACID table to an ACID table produces non-unique primary keys
[ https://issues.apache.org/jira/browse/HIVE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14366: -- Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 1.3.0 Status: Resolved (was: Patch Available) > Conversion of a Non-ACID table to an ACID table produces non-unique primary > keys > > > Key: HIVE-14366 > URL: https://issues.apache.org/jira/browse/HIVE-14366 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Saket Saurabh >Assignee: Saket Saurabh >Priority: Blocker > Fix For: 1.3.0, 2.2.0, 2.1.1 > > Attachments: HIVE-14366.01.patch, HIVE-14366.02.patch > > > When a Non-ACID table is converted to an ACID table, the primary key > consisting of (original transaction id, bucket_id, row_id) is not generated > uniquely. Currently, the row_id is always set to 0 for most rows. This leads > to correctness issue for such tables. > Quickest way to reproduce is to add the following unit test to > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java > {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid} > @Test > public void testOriginalReader() throws Exception { > FileSystem fs = FileSystem.get(hiveConf); > FileStatus[] status; > // 1. Insert five rows to Non-ACID table. > runStatementOnDriver("insert into " + Table.NONACIDORCTBL + "(a,b) > values(1,2),(3,4),(5,6),(7,8),(9,10)"); > // 2. Convert NONACIDORCTBL to ACID table. > runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " SET > TBLPROPERTIES ('transactional'='true')"); > // 3. Perform a major compaction. > runStatementOnDriver("alter table "+ Table.NONACIDORCTBL + " compact > 'MAJOR'"); > runWorker(hiveConf); > // 4. Perform a delete. > runStatementOnDriver("delete from " + Table.NONACIDORCTBL + " where a = > 1"); > // 5. Now do a projection should have (3,4) (5,6),(7,8),(9,10) only since > (1,2) has been deleted. 
> List<String> rs = runStatementOnDriver("select a,b from " + > Table.NONACIDORCTBL + " order by a,b"); > int[][] resultData = new int[][] {{3,4}, {5,6}, {7,8}, {9,10}}; > Assert.assertEquals(stringifyValues(resultData), rs); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
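The invariant the bug breaks can be sketched directly: every row read from a pre-ACID file must receive a distinct (original transaction id, bucket, row id) key, so the row id has to increment per row within each (txn, bucket) pair rather than staying 0. A minimal illustration with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of unique ACID key assignment for converted rows;
// not the actual Hive reader code.
public class RowIdAssigner {
    private final Map<String, Long> next = new HashMap<>();

    /** Returns 0, 1, 2, ... per (origTxnId, bucket), so no two rows share
     *  the same (origTxnId, bucket, rowId) primary key. */
    long nextRowId(long origTxnId, int bucket) {
        String key = origTxnId + "/" + bucket;
        long id = next.getOrDefault(key, 0L);
        next.put(key, id + 1);
        return id;
    }
}
```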
[jira] [Updated] (HIVE-14273) branch1 test
[ https://issues.apache.org/jira/browse/HIVE-14273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14273: -- Resolution: Won't Fix Status: Resolved (was: Patch Available) > branch1 test > > > Key: HIVE-14273 > URL: https://issues.apache.org/jira/browse/HIVE-14273 > Project: Hive > Issue Type: Bug > Components: Encryption >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14273-branch-2.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14377) LLAP IO: issue with how estimate cache removes unneeded buffers
[ https://issues.apache.org/jira/browse/HIVE-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14377: Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Committed. Thanks for the review! > LLAP IO: issue with how estimate cache removes unneeded buffers > --- > > Key: HIVE-14377 > URL: https://issues.apache.org/jira/browse/HIVE-14377 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14377.01.patch, HIVE-14377.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402728#comment-15402728 ] Xuefu Zhang commented on HIVE-14343: +1 > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11116) Can not select data from table which points to remote hdfs location
[ https://issues.apache.org/jira/browse/HIVE-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402731#comment-15402731 ] Hive QA commented on HIVE-11116: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12820392/HIVE-11116.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10419 tests executed *Failed tests:* {noformat} TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/724/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/724/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-724/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12820392 - PreCommit-HIVE-MASTER-Build > Can not select data from table which points to remote hdfs location > --- > > Key: HIVE-11116 > URL: https://issues.apache.org/jira/browse/HIVE-11116 > Project: Hive > Issue Type: Bug > Components: Encryption >Affects Versions: 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Alexander Pivovarov >Assignee: David Karoly > Attachments: HIVE-11116.1.patch > > > I tried to create new table which points to remote hdfs location and select > data from it. > It works for hive-0.14 and hive-1.0 but it does not work starting from > hive-1.1 > to reproduce the issue > 1. create folder on remote hdfs > {code} > hadoop fs -mkdir -p hdfs://remote-nn/tmp/et1 > {code} > 2. create table > {code} > CREATE TABLE et1 ( > a string > ) stored as textfile > LOCATION 'hdfs://remote-nn/tmp/et1'; > {code} > 3. run select > {code} > select * from et1 limit 10; > {code} > 4. Should get the following error > {code} > select * from et1; > 15/06/25 13:43:44 [main]: ERROR parse.CalcitePlanner: > org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if > hdfs://remote-nn/tmp/et1 is encrypted: java.lang.IllegalArgumentException: > Wrong FS: hdfs://remote-nn/tmp/et1, expected: hdfs://localhost:8020 > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1763) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:190) > at > 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) > at
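The "Wrong FS" failure above happens because a path on a remote namenode is handed to a FileSystem object bound to the local default FS. A minimal sketch (not Hive's actual code) of the scheme/authority check that distinguishes the two filesystems:

```java
import java.net.URI;

// Hypothetical sketch of the HIVE-11116 failure mode: a Hadoop FileSystem
// rejects a path whose scheme+authority do not match its own URI. The helper
// below is illustrative, not the real SemanticAnalyzer/FileSystem logic.
public class WrongFsSketch {
    // True when 'path' belongs to the filesystem identified by 'fsUri'.
    static boolean sameFileSystem(URI fsUri, URI path) {
        return fsUri.getScheme().equalsIgnoreCase(path.getScheme())
            && fsUri.getAuthority() != null
            && path.getAuthority() != null
            && fsUri.getAuthority().equalsIgnoreCase(path.getAuthority());
    }

    public static void main(String[] args) {
        URI localFs = URI.create("hdfs://localhost:8020");
        // Path from the bug report: lives on a different namenode entirely.
        System.out.println(sameFileSystem(localFs, URI.create("hdfs://remote-nn/tmp/et1")));      // false
        System.out.println(sameFileSystem(localFs, URI.create("hdfs://localhost:8020/tmp/et1"))); // true
    }
}
```

A fix along these lines would resolve the encryption check against the path's own filesystem rather than the default one.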
[jira] [Updated] (HIVE-14343) HiveDriverRunHookContext's command is null in HS2 mode
[ https://issues.apache.org/jira/browse/HIVE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14343: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master branch. Thanks [~xuefuz] for the review! > HiveDriverRunHookContext's command is null in HS2 mode > -- > > Key: HIVE-14343 > URL: https://issues.apache.org/jira/browse/HIVE-14343 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Fix For: 2.2.0 > > Attachments: HIVE-14343.0.patch, HIVE-14343.1.patch > > > Looking at the {{Driver#runInternal(String command, boolean > alreadyCompiled)}}: > {code} > HiveDriverRunHookContext hookContext = new > HiveDriverRunHookContextImpl(conf, command); > // Get all the driver run hooks and pre-execute them. > List driverRunHooks; > {code} > The context is initialized with the {{command}} passed in to the method. > However, this command is always null if {{alreadyCompiled}} is true, which is > the case for HS2 mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
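The HIVE-14343 bug pattern, reduced to a sketch: a hook context is built from a method parameter that is null on the already-compiled (HS2) path, instead of from the state that always holds the current command. Names below are stand-ins, not Hive's actual fields.

```java
// Hypothetical sketch of the HIVE-14343 bug and fix. 'storedCommand' stands
// in for the command Hive already holds after compilation.
public class HookContextSketch {
    static String storedCommand;

    // Simplified stand-in for Driver#runInternal(String command, boolean alreadyCompiled).
    static String buildHookCommand(String command, boolean alreadyCompiled) {
        // Buggy behaviour: 'return command;' — null whenever HS2 calls in
        // with alreadyCompiled == true.
        // Fixed behaviour: fall back to the stored command.
        return command != null ? command : storedCommand;
    }

    public static void main(String[] args) {
        storedCommand = "select * from t";
        System.out.println(buildHookCommand("select * from t", false)); // CLI path
        System.out.println(buildHookCommand(null, true));               // HS2 path, no longer null
    }
}
```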
[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-14340: Attachment: HIVE-14340.2.patch Fix style. > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, > HIVE-14340.2.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14340) Add a new hook triggers before query compilation and after query execution
[ https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402804#comment-15402804 ] Xuefu Zhang commented on HIVE-14340: Patch looks good to me. +1 > Add a new hook triggers before query compilation and after query execution > -- > > Key: HIVE-14340 > URL: https://issues.apache.org/jira/browse/HIVE-14340 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, > HIVE-14340.2.patch > > > In some cases we may need to have a hook that activates before a query > compilation and after its execution. For instance, dynamically generate a UDF > specifically for the running query and clean up the resource after the query > is done. The current hooks only covers pre & post semantic analysis, pre & > post query execution, which doesn't fit the requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
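The hook HIVE-14340 describes brackets the whole query lifetime: one callback before compilation, one after execution. A sketch of the shape such a hook could take, using the UDF example from the description — interface and method names here are illustrative, not the committed Hive API:

```java
// Hypothetical sketch of a before-compilation / after-execution hook pair,
// as motivated by HIVE-14340 (e.g. a query-scoped UDF whose resources must
// be cleaned up when the query finishes).
public class LifecycleHookSketch {
    interface QueryLifeTimeHook {
        void beforeCompile(String query);
        void afterExecution(String query, boolean hasError);
    }

    static class UdfScopedHook implements QueryLifeTimeHook {
        boolean resourceAllocated;
        public void beforeCompile(String query) {
            resourceAllocated = true;   // e.g. register a query-specific UDF
        }
        public void afterExecution(String query, boolean hasError) {
            resourceAllocated = false;  // clean up whether or not the query failed
        }
    }

    public static void main(String[] args) {
        UdfScopedHook hook = new UdfScopedHook();
        hook.beforeCompile("select 1");
        System.out.println(hook.resourceAllocated);   // held for the query's lifetime
        hook.afterExecution("select 1", false);
        System.out.println(hook.resourceAllocated);   // released afterwards
    }
}
```

The existing pre/post semantic-analysis and pre/post execution hooks each cover only one phase, which is why a spanning pair like this is needed.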
[jira] [Updated] (HIVE-14202) Change tez version used to 0.8.4
[ https://issues.apache.org/jira/browse/HIVE-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14202: - Fix Version/s: 2.1.1 > Change tez version used to 0.8.4 > > > Key: HIVE-14202 > URL: https://issues.apache.org/jira/browse/HIVE-14202 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14202.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14202) Change tez version used to 0.8.4
[ https://issues.apache.org/jira/browse/HIVE-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402898#comment-15402898 ] Prasanth Jayachandran commented on HIVE-14202: -- The HIVE-13934 backport needs this patch; without it, the HIVE-13934 backport will trigger many test failures. Backporting this to branch-2.1 > Change tez version used to 0.8.4 > > > Key: HIVE-14202 > URL: https://issues.apache.org/jira/browse/HIVE-14202 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14202.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available
[ https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402904#comment-15402904 ] Siddharth Seth commented on HIVE-14392: --- [~leftylev] - thanks for the detailed look at the description. Will incorporate most of that in the next patch. bq. Why did we make it required in the first place? I remember there was an explicit reason, I just don't remember what it was. We were not always running on YARN. LLAP started out running without Slider - i.e. daemons were set up manually on individual nodes (outside of YARN). At that point the work.dir configuration was required. I think we just never really needed to change the way it was used, so this did not get attention. > llap daemons should try using YARN local dirs, if available > --- > > Key: HIVE-14392 > URL: https://issues.apache.org/jira/browse/HIVE-14392 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14392.01.patch > > > LLAP required hive.llap.daemon.work.dirs to be specified. When running as a > YARN app - this can use the local dirs for the container - removing the > requirement to setup this parameter (for secure and non-secure clusters). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402908#comment-15402908 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13822: -- Looked at why some of these qfile changes are missing: 1. {code} 2016-08-01T14:50:03,831 DEBUG [630ce616-f45b-4c59-b8c2-6e27e80e2cca main] parse.TypeCheckCtx: Setting error: [Line 2:121 Invalid table alias or column reference 'ws_ext_sales_price': (possible column names are: .(tok_table_or_col i_item_id), .(tok_table_or_col i_item_desc\ ), .(tok_table_or_col i_category), .(tok_table_or_col i_class), .(tok_table_or_col i_current_price), .(tok_function sum (tok_table_or_col ws_ext_sales_price)))] from (tok_table_or_col ws_ext_sales_price) java.lang.Exception at org.apache.hadoop.hive.ql.parse.TypeCheckCtx.setError(TypeCheckCtx.java:162) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$ColumnExprProcessor.process(TypeCheckProcFactory.java:653) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:217) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:163) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11252) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11208) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4195) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3977) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9428) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9383) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10128) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10801) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10812) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10507) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:75) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) [hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:435) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) [hive-exec-2.2.0-SNAPSHOT.jar:?] at org.apache.had
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Attachment: HIVE-13822.3.patch > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Status: Open (was: Patch Available) > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS
[ https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13822: - Status: Patch Available (was: Open) > TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot > parse COLUMN_STATS > -- > > Key: HIVE-13822 > URL: https://issues.apache.org/jira/browse/HIVE-13822 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, > HIVE-13822.3.patch > > > Thanks to [~jcamachorodriguez] for uncovering this issue as part of > HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether > stats are up-to-date. In case of PerfCliDriver, ‘false’ (thus, not > up-to-date) is returned and the following debug message in the logs: > {code} > In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in > StatsSetupConst) > {code} > Looks like the issue started happening after HIVE-12261 went in. > The fix would be to replace > {color:red}COLUMN_STATS_ACCURATE,true{color} > with > {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color} > where key, value are the column names. > in data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
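The HIVE-13822 warning comes from the legacy table parameter value "true" not being a JSON object, so a parser expecting the {"COLUMN_STATS":...} shape rejects it. A trivial shape check (not Hive's actual StatsSetupConst parser) illustrates why one value passes and the other does not:

```java
// Hypothetical sketch of the HIVE-13822 mismatch: the COLUMN_STATS_ACCURATE
// value must be a JSON object carrying COLUMN_STATS; the legacy flat "true"
// value fails that check and trips the "JsonParser can not parse" warning.
public class ColumnStatsSketch {
    static boolean looksLikeStatsJson(String value) {
        String v = value.trim();
        return v.startsWith("{") && v.endsWith("}") && v.contains("\"COLUMN_STATS\"");
    }

    public static void main(String[] args) {
        String legacy = "true";  // old format stored in TABLE_PARAMS.txt
        String fixed  = "{\"COLUMN_STATS\":{\"key\":\"true\",\"value\":\"true\"},\"BASIC_STATS\":\"true\"}";
        System.out.println(looksLikeStatsJson(legacy)); // false -> warning path, stats treated as stale
        System.out.println(looksLikeStatsJson(fixed));  // true  -> stats considered up-to-date
    }
}
```

This matches the proposed fix in the description: replace the flat value in the csv metastore export with the JSON form.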