[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168575#comment-15168575 ]

Prasanth Jayachandran commented on HIVE-12935:
----------------------------------------------

Sure. I will do the nightly run tonight.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---------------------------------------------------
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch,
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch, HIVE-12935.7.patch
>
> The existing YARN registry service for cluster membership has to depend on
> refresh intervals to get the list of instances/daemons that are running in
> the cluster. A better approach would be to replace it with a zookeeper based
> registry service, so that custom listeners can be added to track the health
> of daemons in the cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
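The difference the HIVE-12935 description draws — polling on a refresh interval versus listener callbacks — can be sketched in plain Java. This is a hypothetical in-memory stand-in, not Hive's actual LlapZookeeperRegistryImpl; in the real patch the register/remove events come from ZooKeeper ephemeral nodes rather than direct method calls:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a cluster-membership registry.
// A polling client would have to re-read the daemon list every refresh
// interval; a listener is told about changes as they happen.
public class ListenerRegistry {

    public interface ServiceListener {
        void onRegistered(String daemon);
        void onRemoved(String daemon);
    }

    private final Map<String, String> daemons = new HashMap<>();
    private final List<ServiceListener> listeners = new ArrayList<>();

    public void addListener(ServiceListener l) {
        listeners.add(l);
    }

    // Analogous to a daemon creating an ephemeral znode on startup.
    public void register(String name, String endpoint) {
        daemons.put(name, endpoint);
        for (ServiceListener l : listeners) {
            l.onRegistered(name);
        }
    }

    // Analogous to the ephemeral znode vanishing when the session dies.
    public void remove(String name) {
        if (daemons.remove(name) != null) {
            for (ServiceListener l : listeners) {
                l.onRemoved(name);
            }
        }
    }

    public static void main(String[] args) {
        ListenerRegistry reg = new ListenerRegistry();
        final List<String> events = new ArrayList<>();
        reg.addListener(new ServiceListener() {
            public void onRegistered(String d) { events.add("up:" + d); }
            public void onRemoved(String d) { events.add("down:" + d); }
        });
        reg.register("llap-0", "host0:15001");
        reg.remove("llap-0");
        System.out.println(events); // [up:llap-0, down:llap-0]
    }
}
```

With ZooKeeper the "remove" path needs no cooperation from the daemon at all: a crashed process loses its session and its ephemeral node disappears, which is what makes push-based health tracking possible.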
[jira] [Updated] (HIVE-13169) HiveServer2: Support delegation token based connection when using http transport
[ https://issues.apache.org/jira/browse/HIVE-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Gumashta updated HIVE-13169:
------------------------------------
    Affects Version/s: 1.2.1
                       2.0.0

> HiveServer2: Support delegation token based connection when using http
> transport
> -----------------------------------------------------------------------
>
> Key: HIVE-13169
> URL: https://issues.apache.org/jira/browse/HIVE-13169
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, JDBC
> Affects Versions: 1.2.1, 2.0.0
> Reporter: Vaibhav Gumashta
> Assignee: Vaibhav Gumashta
>
> HIVE-5155 introduced support for delegation token based connection. However,
> it was intended for tcp transport mode. We need to have similar mechanisms
> for http transport.
[jira] [Updated] (HIVE-13169) HiveServer2: Support delegation token based connection when using http transport
[ https://issues.apache.org/jira/browse/HIVE-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Gumashta updated HIVE-13169:
------------------------------------
    Description: HIVE-5155 introduced support for delegation token based
connection. However, it was intended for tcp transport mode. We need to have
similar mechanisms for http transport.
    (was: [HIVE-5155|https://issues.apache.org/jira/browse/HIVE-5155]
introduced support for delegation token based connection. However, it was
intended for tcp transport mode. We need to have similar mechanisms for http
transport.)
[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168554#comment-15168554 ]

Siddharth Seth commented on HIVE-12935:
---------------------------------------

+1.
[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168555#comment-15168555 ]

Hive QA commented on HIVE-13013:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789801/HIVE-13013.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9828 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestTxnHandlerNegative.testBadConnection
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7095/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7095/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7095/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789801 - PreCommit-HIVE-TRUNK-Build

> Further Improve concurrency in TxnHandler
> -----------------------------------------
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
> Issue Type: Bug
> Components: Metastore, Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
> Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch
>
> There are still a few operations in TxnHandler that run at Serializable
> isolation. Most or all of them can be dropped to READ_COMMITTED now that we
> have SELECT ... FOR UPDATE support. This will reduce the number of deadlocks
> in the DBs.
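The TxnHandler change rests on row-level locking making table-wide Serializable isolation unnecessary: once a transaction can take an exclusive lock on just the rows it will update (what SELECT ... FOR UPDATE gives you in the backing RDBMS), unrelated transactions no longer contend. A minimal sketch of that idea in plain Java, using an in-memory per-row lock map as a stand-in for the database mechanism (all names here are illustrative, not TxnHandler's code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the idea behind SELECT ... FOR UPDATE: writers
// contend on one row's lock rather than on whole-table (Serializable)
// isolation, so transactions touching different rows proceed in parallel.
public class RowLockSketch {

    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    // "SELECT ... FOR UPDATE" analogue: lock exactly one row and hand the
    // caller a token that releases it at commit time.
    public Runnable selectForUpdate(String rowKey) {
        ReentrantLock lock = rowLocks.computeIfAbsent(rowKey, k -> new ReentrantLock());
        lock.lock();
        return lock::unlock;
    }

    public boolean isLocked(String rowKey) {
        ReentrantLock lock = rowLocks.get(rowKey);
        return lock != null && lock.isLocked();
    }

    public static void main(String[] args) {
        RowLockSketch handler = new RowLockSketch();
        Runnable commit = handler.selectForUpdate("txn-42");
        // Another "transaction" touching a different row is unaffected.
        System.out.println(handler.isLocked("txn-42"));  // true
        System.out.println(handler.isLocked("txn-43"));  // false
        commit.run();
        System.out.println(handler.isLocked("txn-42"));  // false
    }
}
```

In the real patch the lock lives in the metastore database, not in JVM memory, which is what makes it safe across multiple metastore instances.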
[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168545#comment-15168545 ]

Gopal V commented on HIVE-12935:
--------------------------------

Not sure I can do a nightly run tonight - can you kick off a run with a
Chaosmonkey interval of 120s, with at least 4 nodes, and run q55-random.sql at
1Tb scale to validate this?
[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-12935:
-----------------------------------------
    Attachment: HIVE-12935.7.patch

Added proper synchronization per Sid's comments.
[jira] [Commented] (HIVE-13108) Operators: SORT BY randomness is not safe with network partitions
[ https://issues.apache.org/jira/browse/HIVE-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168534#comment-15168534 ]

Gopal V commented on HIVE-13108:
--------------------------------

[~sershe]: +1? :)

> Operators: SORT BY randomness is not safe with network partitions
> ------------------------------------------------------------------
>
> Key: HIVE-13108
> URL: https://issues.apache.org/jira/browse/HIVE-13108
> Project: Hive
> Issue Type: Bug
> Components: Spark, Tez
> Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.0.1
> Reporter: Gopal V
> Assignee: Gopal V
> Attachments: HIVE-13108.1.patch
>
> SORT BY relies on a transient Random object, which is initialized once per
> deserialize operation. This results in complications during a network
> partition and when Tez/Spark reuses a cached plan.
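The transient-Random failure mode described in HIVE-13108 is easy to reproduce with plain Java serialization: a `transient` field's initializer does not re-run when the object is deserialized, so a reshipped copy of an operator comes back with a null (or separately re-seeded) Random. The class below is a hypothetical illustration, not Hive's actual operator code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Random;

// Hypothetical operator class: the transient Random is created by the field
// initializer, which Java serialization does NOT re-run on deserialization.
// A cached/reshipped plan therefore comes back with rand == null.
public class TransientRandomDemo implements Serializable {

    transient Random rand = new Random();

    static TransientRandomDemo roundTrip(TransientRandomDemo op) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(op);
            oos.flush();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
            return (TransientRandomDemo) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        TransientRandomDemo fresh = new TransientRandomDemo();
        TransientRandomDemo reshipped = roundTrip(fresh);
        System.out.println(fresh.rand != null);     // true
        System.out.println(reshipped.rand == null); // true: initializer skipped
    }
}
```

This is why "initialized once per deserialize operation" is fragile: whether the field is re-created at all depends on who deserialized the plan and how many tasks then share that one instance.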
[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-12935:
-----------------------------------------
    Attachment: HIVE-12935.6.patch

Addressed [~sseth]'s review comments.
[jira] [Commented] (HIVE-13122) LLAP: simple Model/View separation for UI
[ https://issues.apache.org/jira/browse/HIVE-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168516#comment-15168516 ]

Siddharth Seth commented on HIVE-13122:
---------------------------------------

+1.

> LLAP: simple Model/View separation for UI
> ------------------------------------------
>
> Key: HIVE-13122
> URL: https://issues.apache.org/jira/browse/HIVE-13122
> Project: Hive
> Issue Type: Improvement
> Components: llap
> Affects Versions: 2.1.0
> Reporter: Gopal V
> Assignee: Gopal V
> Attachments: HIVE-13122.1.patch, HIVE-13122.2.patch
>
> The current LLAP UI in master uses a single fixed loop both to extract data
> and to display it. Split this up into a model and a view, for modularity.
> NO PRECOMMIT TESTS
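The model/view split proposed for the LLAP UI can be sketched as follows. All names are illustrative (this is not the patch's code): the point is that the model refreshes data on its own schedule while the view only formats the most recent snapshot, so the two loops are no longer coupled:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a model/view split: the model owns data extraction
// on its own schedule; the view only formats the last snapshot.
public class ModelViewSketch {

    public static class MetricsModel {
        private List<Integer> snapshot = new ArrayList<>();

        // The extraction loop calls this on its own timer.
        public void refresh(Supplier<List<Integer>> source) {
            snapshot = source.get();
        }

        public List<Integer> snapshot() {
            return snapshot;
        }
    }

    // The view: rendering only, no data extraction.
    public static String render(MetricsModel model) {
        return "metrics=" + model.snapshot();
    }

    public static void main(String[] args) {
        MetricsModel model = new MetricsModel();
        model.refresh(() -> List.of(1, 2, 3));
        System.out.println(render(model)); // metrics=[1, 2, 3]
    }
}
```

The gain is modularity: a second view (say, a JSON endpoint next to the HTML page) can render the same model without duplicating the extraction loop.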
[jira] [Commented] (HIVE-13166) Log the selection from llap decider
[ https://issues.apache.org/jira/browse/HIVE-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168506#comment-15168506 ]

Sergey Shelukhin commented on HIVE-13166:
-----------------------------------------

That's part of explain...

> Log the selection from llap decider
> ------------------------------------
>
> Key: HIVE-13166
> URL: https://issues.apache.org/jira/browse/HIVE-13166
> Project: Hive
> Issue Type: Improvement
> Components: llap
> Reporter: Siddharth Seth
>
> The llap decider logs when it considers a vertex; however, the actual
> placement (llap, container, etc.) is not logged.
[jira] [Updated] (HIVE-13167) LLAP: Remove yarn-site resource from zookeeper based registry
[ https://issues.apache.org/jira/browse/HIVE-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-13167:
-----------------------------------------
    Priority: Minor  (was: Major)

> LLAP: Remove yarn-site resource from zookeeper based registry
> --------------------------------------------------------------
>
> Key: HIVE-13167
> URL: https://issues.apache.org/jira/browse/HIVE-13167
> Project: Hive
> Issue Type: Bug
> Components: llap
> Affects Versions: 2.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Minor
>
> With the zookeeper registry, adding the yarn-site.xml resource is no longer
> required. The following line should be removed from LlapZookeeperRegistryImpl:
> {code}
> this.conf.addResource(YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
> {code}
[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf
[ https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168450#comment-15168450 ]

Hive QA commented on HIVE-12679:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789639/HIVE-12679.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9816 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7094/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7094/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7094/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789639 - PreCommit-HIVE-TRUNK-Build

> Allow users to be able to specify an implementation of IMetaStoreClient via
> HiveConf
> ----------------------------------------------------------------------------
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
> Issue Type: Improvement
> Components: Configuration, Metastore, Query Planning
> Affects Versions: 2.1.0
> Reporter: Austin Lee
> Assignee: Austin Lee
> Priority: Minor
> Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.patch
>
> Hi,
> I would like to propose a change that would make it possible for users to
> choose an implementation of IMetaStoreClient via HiveConf, i.e. hive-site.xml.
> Currently, the choice is hard coded to be SessionHiveMetaStoreClient in
> org.apache.hadoop.hive.ql.metadata.Hive. There is no other direct reference to
> SessionHiveMetaStoreClient besides the hard coded class name in Hive.java, and
> the QL component operates only on the IMetaStoreClient interface, so the
> change would be minimal and quite similar to how an implementation of RawStore
> is specified and loaded in hive-metastore. One use case this change would
> serve is one where a user wishes to use an implementation of this interface
> without the dependency on the Thrift server.
>
> Thank you,
> Austin
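The proposal mirrors how RawStore implementations are loaded: resolve a class name from configuration and instantiate it reflectively against the interface. A minimal sketch of that pattern — the config key `metastore.client.impl` and the `MetaClient` interface are made-up illustrations, not Hive's actual names:

```java
import java.util.Properties;

// Sketch of config-driven implementation selection, mirroring the RawStore
// loading pattern. The key "metastore.client.impl" and the MetaClient
// interface are hypothetical, not Hive's final names.
public class PluggableClientLoader {

    public interface MetaClient {
        String name();
    }

    public static class DefaultClient implements MetaClient {
        public String name() { return "default"; }
    }

    public static MetaClient load(Properties conf) {
        String cls = conf.getProperty("metastore.client.impl",
                                      DefaultClient.class.getName());
        try {
            // Class.forName + reflective construction against the interface,
            // so callers depend only on MetaClient, never on a concrete class.
            return Class.forName(cls)
                        .asSubclass(MetaClient.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (Exception e) {
            throw new RuntimeException("cannot load " + cls, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load(new Properties()).name()); // default
    }
}
```

Because the caller only ever sees the interface, swapping in a Thrift-free implementation becomes a one-line configuration change, which is exactly the use case the ticket describes.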
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
------------------------------------
    Attachment: HIVE-11675.06.patch

> make use of file footer PPD API in ETL strategy or separate strategy
> ---------------------------------------------------------------------
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, HIVE-11675.03.patch,
> HIVE-11675.04.patch, HIVE-11675.05.patch, HIVE-11675.06.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do a
> filtering metastore call for each partition, so perhaps we'd need the custom
> sync point/batching after all. Or we can make it opportunistic and not fetch
> any footers unless they can be pushed down to the metastore or fetched from
> the local cache; that way the only slow threaded op is directory listings.
[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168422#comment-15168422 ]

chillon_m commented on HIVE-13165:
----------------------------------

set hive.query.result.fileformat=SequenceFile; works. Thanks.

> resultset which query sql add 'order by' or 'sort by' return is different
> from not add it
> --------------------------------------------------------------------------
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
> Issue Type: Bug
> Components: Clients
> Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1
> Reporter: chillon_m
> Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
> The resultset returned when the query adds 'order by' or 'sort by' is
> different from the one returned without it: both the size of the resultset
> and the row values differ.
>
> with order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as
> type,id,msgData from messages where Num='41433141' and erNum='99841977')
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437
> order by id;
> INFO : Number of reduce tasks determined at compile time: 1
> INFO : In order to change the average load for a reducer (in bytes):
> INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO : In order to limit the maximum number of reducers:
> INFO :   set hive.exec.reducers.max=<number>
> INFO : In order to set a constant number of reducers:
> INFO :   set mapreduce.job.reduces=<number>
> WARN : Hadoop command-line option parsing not performed. Implement the Tool
> interface and execute your application with ToolRunner to remedy this.
> INFO : number of splits:1
> INFO : Submitting tokens for job: job_1456383638304_0008
> INFO : The url to track the job:
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO : Starting Job = job_1456383638304_0008, Tracking URL =
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop
> job -kill job_1456383638304_0008
> INFO : Hadoop job information for Stage-1: number of mappers: 0; number of
> reducers: 0
> INFO : 2016-02-26 11:06:55,493 Stage-1 map = 0%, reduce = 0%
> INFO : 2016-02-26 11:07:01,710 Stage-1 map = 100%, reduce = 0%
> INFO : 2016-02-26 11:07:04,815 Stage-1 map = 100%, reduce = 100%
> INFO : Ended Job = job_1456383638304_0008
> +------------+----------+---------------+--+
> | temp.type  | temp.id  | temp.msgdata  |
> +------------+----------+---------------+--+
> | -1000      | 163437   | we come:      |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> +------------+----------+---------------+--+
> 11 rows selected (16.191 seconds)
>
> without order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as
> type,id,msgData from messages where Num='41433141' and erNum='99841977')
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
> +------------+----------+----------------------------------------------------------------+--+
> | temp.type  | temp.id  | temp.msgdata                                                   |
> +------------+----------+----------------------------------------------------------------+--+
> | -1000      | 163437   | we come: sadferqgb gtrhyj hytjyjuk nhmuykiluil hthnynmkukmhrj, |
> +------------+----------+----------------------------------------------------------------+--+
> 1 row selected (18.245 seconds)
[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168418#comment-15168418 ]

Gopal V commented on HIVE-13165:
--------------------------------

FYI, what I meant was that without that optimization, even the withOrderBy will fail.
[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168417#comment-15168417 ]

chillon_m commented on HIVE-13165:
----------------------------------

[bigdata@namenode hive-1.2.1]$ bin/beeline -u jdbc:hive2://namenode:1/default bigdata -n bigdata
Connecting to jdbc:hive2://namenode:1/default
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1 by Apache Hive
0: jdbc:hive2://namenode:1/default> set hive.fetch.task.conversion=none;
No rows affected (0.051 seconds)
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as type,id,msgData from messages where Num='41433141' and erNum='99841977')
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 order by id;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
WARN : Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO : number of splits:1
INFO : Submitting tokens for job: job_1456383638304_0009
INFO : The url to track the job: http://namenode:8088/proxy/application_1456383638304_0009/
INFO : Starting Job = job_1456383638304_0009, Tracking URL = http://namenode:8088/proxy/application_1456383638304_0009/
INFO : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop job -kill job_1456383638304_0009
INFO : Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
INFO : 2016-02-26 12:22:21,928 Stage-1 map = 0%, reduce = 0%
INFO : 2016-02-26 12:22:29,178 Stage-1 map = 100%, reduce = 0%
INFO : 2016-02-26 12:22:32,269 Stage-1 map = 100%, reduce = 100%
INFO : Ended Job = job_1456383638304_0009
+------------+----------+---------------+--+
| temp.type  | temp.id  | temp.msgdata  |
+------------+----------+---------------+--+
| -1000      | 163437   | we come:      |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
+------------+----------+---------------+--+
11 rows selected (15.594 seconds)
[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chillon_m updated HIVE-13165:
-----------------------------
    Description:
The resultset returned when the query adds 'order by' or 'sort by' is
different from the one returned without it: both the size of the resultset and
the row values differ.

with order:
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as type,id,msgData from messages where Num='41433141' and erNum='99841977')
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 order by id;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
WARN : Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO : number of splits:1
INFO : Submitting tokens for job: job_1456383638304_0008
INFO : The url to track the job: http://namenode:8088/proxy/application_1456383638304_0008/
INFO : Starting Job = job_1456383638304_0008, Tracking URL = http://namenode:8088/proxy/application_1456383638304_0008/
INFO : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop job -kill job_1456383638304_0008
INFO : Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
INFO : 2016-02-26 11:06:55,493 Stage-1 map = 0%, reduce = 0%
INFO : 2016-02-26 11:07:01,710 Stage-1 map = 100%, reduce = 0%
INFO : 2016-02-26 11:07:04,815 Stage-1 map = 100%, reduce = 100%
INFO : Ended Job = job_1456383638304_0008
+------------+----------+---------------+--+
| temp.type  | temp.id  | temp.msgdata  |
+------------+----------+---------------+--+
| -1000      | 163437   | we come:      |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
+------------+----------+---------------+--+
11 rows selected (16.191 seconds)

without order:
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as type,id,msgData from messages where Num='41433141' and erNum='99841977')
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
+------------+----------+----------------------------------------------------------------+--+
| temp.type  | temp.id  | temp.msgdata                                                   |
+------------+----------+----------------------------------------------------------------+--+
| -1000      | 163437   | we come: sadferqgb gtrhyj hytjyjuk nhmuykiluil hthnynmkukmhrj, |
+------------+----------+----------------------------------------------------------------+--+
1 row selected (18.245 seconds)

    (was:
The resultset returned when the query adds 'order by' or 'sort by' is
different from the one returned without it: both the size of the resultset and
the row values differ.

with order:
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as type,id,msgData from messages where Num='41433141' and erNum='99841977')
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 order by id;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
WARN : Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO : number of splits:1
INFO : Submitting tokens for job: job_1456383638304_0008
INFO : The url to track the job: http://namenode:8088/proxy/application_1456383638304_0008/
INFO : Starting Job = job_1456383638304_0008, Tracking URL =
[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168413#comment-15168413 ] Gopal V commented on HIVE-13165: [~chillon_m]: that looks like a simple newline error (the conversion=none should trigger it for both cases). If turning off FetchTask reproduces the issue for both cases, then {{set hive.query.result.fileformat=SequenceFile;}} (default in hive-2.1.0). > resultset which query sql add 'order by' or 'sort by' return is different > from not add it > -- > > Key: HIVE-13165 > URL: https://issues.apache.org/jira/browse/HIVE-13165 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 > Environment: hadoop 2.5.2 hive 1.2.1 >Reporter: chillon_m >Assignee: Vaibhav Gumashta > Attachments: Hql not order.png, Hql with order.png > > > resultset which I execute query sql added 'order by' or 'sort by' is > different from not add it .size of resultset and value of row is returned is > different. > with order: > 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as > type,id,msgData from messages where Num='41433141' and erNum='99841977') > 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 > order by id; > INFO : Number of reduce tasks determined at compile time: 1 > INFO : In order to change the average load for a reducer (in bytes): > INFO : set hive.exec.reducers.bytes.per.reducer= > INFO : In order to limit the maximum number of reducers: > INFO : set hive.exec.reducers.max= > INFO : In order to set a constant number of reducers: > INFO : set mapreduce.job.reduces= > WARN : Hadoop command-line option parsing not performed. Implement the Tool > interface and execute your application with ToolRunner to remedy this. 
> INFO : number of splits:1 > INFO : Submitting tokens for job: job_1456383638304_0008 > INFO : The url to track the job: > http://namenode:8088/proxy/application_1456383638304_0008/ > INFO : Starting Job = job_1456383638304_0008, Tracking URL = > http://namenode:8088/proxy/application_1456383638304_0008/ > INFO : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop > job -kill job_1456383638304_0008 > INFO : Hadoop job information for Stage-1: number of mappers: 0; number of > reducers: 0 > INFO : 2016-02-26 11:06:55,493 Stage-1 map = 0%, reduce = 0% > INFO : 2016-02-26 11:07:01,710 Stage-1 map = 100%, reduce = 0% > INFO : 2016-02-26 11:07:04,815 Stage-1 map = 100%, reduce = 100% > INFO : Ended Job = job_1456383638304_0008 > ++--+---+--+ > | temp.type | temp.id | temp.msgdata | > ++--+---+--+ > | -1000 | 163437 | we come: | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > | NULL | NULL | NULL | > ++--+---+--+ > 11 rows selected (16.191 seconds) > without order: > 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as > type,id,msgData from qqtroopmessages where Num='41433141' and > erNum='99841977') > 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437; > ++--+---+--+ > | temp.type | temp.id | > temp.msgdata > | > ++--+---+--+ > | -1000 | 163437 | we come: > sadferqgb gtrhyj hytjyjuk nhmuykiluil > hthnynmkukmhrj, | > ++--+---+--+ > 1 row selected (18.245 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chillon_m updated HIVE-13165: - Attachment: Hql with order.png > resultset which query sql add 'order by' or 'sort by' return is different > from not add it > -- > > Key: HIVE-13165 > URL: https://issues.apache.org/jira/browse/HIVE-13165 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 > Environment: hadoop 2.5.2 hive 1.2.1 >Reporter: chillon_m >Assignee: Vaibhav Gumashta > Attachments: Hql not order.png, Hql with order.png > > > resultset which query sql add 'order by' or 'sort by' return is different > from not add it .size of resultset and value of row is returned is different. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168410#comment-15168410 ] Gopal V commented on HIVE-13165: Try running this by disabling the FetchTask optimizer - {{set hive.fetch.task.conversion=none;}} > resultset which query sql add 'order by' or 'sort by' return is different > from not add it > -- > > Key: HIVE-13165 > URL: https://issues.apache.org/jira/browse/HIVE-13165 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 > Environment: hadoop 2.5.2 hive 1.2.1 >Reporter: chillon_m >Assignee: Vaibhav Gumashta > Attachments: Hql not order.png, Hql with order.png > > > resultset which query sql add 'order by' or 'sort by' return is different > from not add it .size of resultset and value of row is returned is different. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it
[ https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chillon_m updated HIVE-13165: - Attachment: (was: Hql order.png) > resultset which query sql add 'order by' or 'sort by' return is different > from not add it > -- > > Key: HIVE-13165 > URL: https://issues.apache.org/jira/browse/HIVE-13165 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 > Environment: hadoop 2.5.2 hive 1.2.1 >Reporter: chillon_m >Assignee: Vaibhav Gumashta > Attachments: Hql not order.png > > > resultset which query sql add 'order by' or 'sort by' return is different > from not add it .size of resultset and value of row is returned is different. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join
[ https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang resolved HIVE-13164. Resolution: Invalid > Predicate pushdown may cause cross-product in left semi join > > > Key: HIVE-13164 > URL: https://issues.apache.org/jira/browse/HIVE-13164 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > For some left semi join queries like followings: > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t2.value = 'val_0'; > or > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t1.value = 'val_0'; > Their plans show that they have been converted to keyless cross-product due > to the predicate pushdown and the dropping of the on condition. > {code} > LOGICAL PLAN: > t1:t1 > TableScan (TS_0) > alias: t1 > Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_18) > predicate: (key = 0) (type: boolean) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_2) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_9) > sort order: > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: > Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > Group By Operator (GBY_13) > aggregations: count(1) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_14) > sort order: > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > value expressions: _col0 (type: bigint) > Group By Operator 
(GBY_15) > aggregations: count(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > File Output Operator (FS_17) > compressed: false > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > t2:t2 > TableScan (TS_3) > alias: t2 > Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_19) > predicate: ((key = 0) and (value = 'val_0')) (type: boolean) > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_5) > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Group By Operator (GBY_8) > keys: 'val_0' (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_10) > sort order: > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: >Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > {code} > [~gopalv], do you think these plans are valid or not? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join
[ https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168395#comment-15168395 ] Chaoyu Tang commented on HIVE-13164: Yeah, with the t1.key = t2.key, the query plan looks right and there is no cross-product. Thanks for pointing out. > Predicate pushdown may cause cross-product in left semi join > > > Key: HIVE-13164 > URL: https://issues.apache.org/jira/browse/HIVE-13164 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > For some left semi join queries like followings: > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t2.value = 'val_0'; > or > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t1.value = 'val_0'; > Their plans show that they have been converted to keyless cross-product due > to the predicate pushdown and the dropping of the on condition. 
> {code} > LOGICAL PLAN: > t1:t1 > TableScan (TS_0) > alias: t1 > Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_18) > predicate: (key = 0) (type: boolean) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_2) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_9) > sort order: > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: > Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > Group By Operator (GBY_13) > aggregations: count(1) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_14) > sort order: > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > value expressions: _col0 (type: bigint) > Group By Operator (GBY_15) > aggregations: count(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > File Output Operator (FS_17) > compressed: false > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > t2:t2 > TableScan (TS_3) > alias: t2 > Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_19) > predicate: ((key = 0) and (value = 'val_0')) (type: boolean) > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_5) > Statistics: Num rows: 161 Data size: 1450 Basic stats: 
COMPLETE > Column stats: NONE > Group By Operator (GBY_8) > keys: 'val_0' (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_10) > sort order: > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: >Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > {code} > [~gopalv], do you think these plans are valid or not? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join
[ https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168381#comment-15168381 ] Gopal V commented on HIVE-13164: [~ctang.ma]: that actually looks like a cross-product even pre-optimization. The optimizer is not the one generating a cross-product unless there's a missing t1.key = t2.key there? > Predicate pushdown may cause cross-product in left semi join > > > Key: HIVE-13164 > URL: https://issues.apache.org/jira/browse/HIVE-13164 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > For some left semi join queries like followings: > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t2.value = 'val_0'; > or > select count(1) from (select value from t1 where key = 0) t1 left semi join > (select value from t2 where key = 0) t2 on t1.value = 'val_0'; > Their plans show that they have been converted to keyless cross-product due > to the predicate pushdown and the dropping of the on condition. 
> {code} > LOGICAL PLAN: > t1:t1 > TableScan (TS_0) > alias: t1 > Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_18) > predicate: (key = 0) (type: boolean) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_2) > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_9) > sort order: > Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: > Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > Group By Operator (GBY_13) > aggregations: count(1) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_14) > sort order: > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > value expressions: _col0 (type: bigint) > Group By Operator (GBY_15) > aggregations: count(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > File Output Operator (FS_17) > compressed: false > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > t2:t2 > TableScan (TS_3) > alias: t2 > Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column > stats: NONE > Filter Operator (FIL_19) > predicate: ((key = 0) and (value = 'val_0')) (type: boolean) > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column > stats: NONE > Select Operator (SEL_5) > Statistics: Num rows: 161 Data size: 1450 Basic stats: 
COMPLETE > Column stats: NONE > Group By Operator (GBY_8) > keys: 'val_0' (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator (RS_10) > sort order: > Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE > Column stats: NONE > Join Operator (JOIN_11) > condition map: >Left Semi Join 0 to 1 > keys: > 0 > 1 > Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE > Column stats: NONE > {code} > [~gopalv], do you think these plans are valid or not? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13153) SessionID is appended to thread name twice
[ https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168377#comment-15168377 ] Prasanth Jayachandran commented on HIVE-13153: -- I just copied over the same log level from reset thread name logic. No. I think both are setting the thread name in the original code. Resetting in CliDriver is done elsewhere. > SessionID is appended to thread name twice > -- > > Key: HIVE-13153 > URL: https://issues.apache.org/jira/browse/HIVE-13153 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch > > > HIVE-12249 added sessionId to thread name. In some cases the sessionId could > be appended twice. Example log line > {code} > DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 > 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
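[Editor's sketch] The HIVE-13153 double-append bug above can be avoided with an idempotent guard before prefixing the session id. This is a minimal plain-Java illustration, not the attached patch; the `withSession` helper and the "sessionId + space" prefix convention are assumptions for illustration only.

```java
// Idempotent thread-name prefixing: applying it twice yields the same name,
// so the "sessionId sessionId main" double-append cannot occur.
public class ThreadNameSketch {
    static String withSession(String threadName, String sessionId) {
        if (threadName.startsWith(sessionId + " ")) {
            return threadName; // already prefixed once; do nothing
        }
        return sessionId + " " + threadName;
    }

    public static void main(String[] args) {
        String id = "6432ec22-9f66-4fa5-8770-488a9d3f0b61";
        String once = ThreadNameSketch.withSession("main", id);
        // Re-applying the prefix is a no-op:
        System.out.println(ThreadNameSketch.withSession(once, id).equals(once));
    }
}
```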
[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join
[ https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168375#comment-15168375 ] Chaoyu Tang commented on HIVE-13082: [~gopalv] Could you take a look at HIVE-13164 to see if it makes sense or not based on our discussion here? Thanks > Enable constant propagation optimization in query with left semi join > - > > Key: HIVE-13082 > URL: https://issues.apache.org/jira/browse/HIVE-13082 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, > HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch > > > Currently constant folding is only allowed for inner or unique join, I think > it is also applicable and allowed for left semi join. Otherwise the query > like following having multiple joins with left semi joins will fail: > {code} > select table1.id, table1.val, table2.val2 from table1 inner join table2 on > table1.val = 't1val01' and table1.id = table2.id left semi join table3 on > table1.dimid = table3.id; > {code} > with errors: > {code} > java.lang.Exception: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) > [hadoop-mapreduce-client-common-2.6.0.jar:?] > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > ~[hadoop-common-2.6.0.jar:?] > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > ~[hadoop-common-2.6.0.jar:?] > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > ~[hadoop-common-2.6.0.jar:?] 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[?:1.7.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > ~[?:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45] > ... > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45] > at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > {code} -- This message was sent
[jira] [Commented] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN
[ https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168366#comment-15168366 ] Prasanth Jayachandran commented on HIVE-13163: -- LGTM, +1 > ORC MemoryManager thread checks are fatal, should WARN > --- > > Key: HIVE-13163 > URL: https://issues.apache.org/jira/browse/HIVE-13163 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: PIG > Attachments: HIVE-13163.1.patch > > > The MemoryManager is tied to a WriterOptions on create, which can occur in a > different thread from the writer calls. > This is unexpected, but safe and needs a warning not a fatal. > {code} > /** >* Light weight thread-safety check for multi-threaded access patterns >*/ > private void checkOwner() { > Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(), > "Owner thread expected %s, got %s", > ownerLock.getOwner(), > Thread.currentThread()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
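[Editor's sketch] The WARN-instead-of-fatal direction discussed in HIVE-13163 above can be sketched as follows. This is a self-contained illustration, not the ORC patch: the class name, the `System.err` logging, and the boolean return are assumptions (ORC's MemoryManager uses its own logger and fields), and `ReentrantLock.getOwner()` is omitted because it is protected in the JDK.

```java
import java.util.concurrent.locks.ReentrantLock;

// Light-weight owner-thread check that warns on a mismatch instead of
// throwing, so a writer call from an unexpected thread is logged, not fatal.
public class OwnerCheckSketch {
    final ReentrantLock ownerLock = new ReentrantLock();

    // Returns true when the current thread holds the lock; warns otherwise.
    boolean checkOwner() {
        if (!ownerLock.isHeldByCurrentThread()) {
            System.err.println("Owner thread mismatch, got " + Thread.currentThread());
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        OwnerCheckSketch m = new OwnerCheckSketch();
        m.ownerLock.lock();
        try {
            System.out.println(m.checkOwner()); // true while we hold the lock
        } finally {
            m.ownerLock.unlock();
        }
        System.out.println(m.checkOwner()); // false, but no exception thrown
    }
}
```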
[jira] [Updated] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN
[ https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13163: --- Status: Patch Available (was: Open) > ORC MemoryManager thread checks are fatal, should WARN > --- > > Key: HIVE-13163 > URL: https://issues.apache.org/jira/browse/HIVE-13163 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: PIG > Attachments: HIVE-13163.1.patch > > > The MemoryManager is tied to a WriterOptions on create, which can occur in a > different thread from the writer calls. > This is unexpected, but safe and needs a warning not a fatal. > {code} > /** >* Light weight thread-safety check for multi-threaded access patterns >*/ > private void checkOwner() { > Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(), > "Owner thread expected %s, got %s", > ownerLock.getOwner(), > Thread.currentThread()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168293#comment-15168293 ] Jason Dere commented on HIVE-13063: --- Took a quick look at the updated patch: - No need to update itests/src/test/resources/testconfiguration.properties - You will need to generate the golden files for the qfile tests you added - they will be .q.out files. See https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoIupdatetheoutputofaCliDrivertestcase - In UDFChar.java: can you change this to "chr" {code} +@Description(name = "char" {code} > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Description: Create UDFS for these functions. CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If n is less than 0 or greater than 255, return the empty string. If n is 0, return null. REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); Equals 'BLack and BLue'" was: Create UDFS for these functions. CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If n is less than 0 or greater than 255, return the empty string. If n is 0, return null. REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); Equals 'BLack and BLue'" > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
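[Editor's sketch] The CHR and REPLACE semantics in the updated description can be restated in plain Java. The `chr` and `replace` helpers below are illustrative stand-ins mirroring the described behavior, not the Hive GenericUDF implementations; the null-input handling for `chr` is an assumption, since the description only specifies n < 0, n > 255, and n == 0.

```java
// Reference semantics for the proposed UDFs:
//   CHR(n): ascii char for n in [1, 255]; "" outside [0, 256); NULL for n == 0.
//   REPLACE(str, search, rep): replace every occurrence of search with rep.
public class UdfSketch {
    static String chr(Integer n) {
        if (n == null) return null;        // assumption: NULL in -> NULL out
        if (n < 0 || n > 255) return "";   // outside [0, 256) -> empty string
        if (n == 0) return null;           // 0 -> NULL, per the description
        return String.valueOf((char) n.intValue());
    }

    static String replace(String str, String search, String rep) {
        return str.replace(search, rep);   // literal (non-regex) replacement
    }

    public static void main(String[] args) {
        System.out.println(replace("Hack and Hue", "H", "BL")); // BLack and BLue
    }
}
```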
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Status: Patch Available (was: Open) > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Attachment: (was: HIVE-13063.master.patch) > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Status: Open (was: Patch Available) > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Attachment: HIVE-13063.patch > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 > PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
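The CHR and REPLACE semantics described in this ticket can be sketched in plain Java. This is only an illustration of the spec quoted above, not the actual Hive GenericUDF implementations; the class and method names below are made up for the sketch.

```java
// Sketch of the semantics in the HIVE-13063 description; the real UDFs wrap
// equivalent logic in Hive's GenericUDF API. UdfSketch is a hypothetical name.
public class UdfSketch {
    // CHR(n): ASCII character for n in [1, 255]; null for 0; "" outside [0, 255].
    public static String chr(long n) {
        if (n < 0 || n > 255) return "";
        if (n == 0) return null;
        return String.valueOf((char) n);
    }

    // REPLACE(str, search, rep): replace every literal (non-regex) occurrence
    // of search in str with rep.
    public static String replace(String str, String search, String rep) {
        return str.replace(search, rep);
    }

    public static void main(String[] args) {
        System.out.println(replace("Hack and Hue", "H", "BL")); // BLack and BLue
        System.out.println(chr(72)); // H
    }
}
```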
[jira] [Resolved] (HIVE-13162) Fixes for LlapDump and FileSinkoperator
[ https://issues.apache.org/jira/browse/HIVE-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-13162. --- Resolution: Fixed Committed to llap branch. > Fixes for LlapDump and FileSinkoperator > --- > > Key: HIVE-13162 > URL: https://issues.apache.org/jira/browse/HIVE-13162 > Project: Hive > Issue Type: Sub-task >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Fix For: llap > > Attachments: HIVE-13162.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13162) Fixes for LlapDump and FileSinkoperator
[ https://issues.apache.org/jira/browse/HIVE-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-13162: -- Attachment: HIVE-13162.1.patch > Fixes for LlapDump and FileSinkoperator > --- > > Key: HIVE-13162 > URL: https://issues.apache.org/jira/browse/HIVE-13162 > Project: Hive > Issue Type: Sub-task >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Fix For: llap > > Attachments: HIVE-13162.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN
[ https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13163: --- Labels: PIG (was: ) > ORC MemoryManager thread checks are fatal, should WARN > --- > > Key: HIVE-13163 > URL: https://issues.apache.org/jira/browse/HIVE-13163 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: PIG > > The MemoryManager is tied to a WriterOptions on create, which can occur in a > different thread from the writer calls. > This is unexpected, but safe and needs a warning not a fatal. > {code} > /** >* Light weight thread-safety check for multi-threaded access patterns >*/ > private void checkOwner() { > Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(), > "Owner thread expected %s, got %s", > ownerLock.getOwner(), > Thread.currentThread()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
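A minimal sketch of the downgrade HIVE-13163 asks for: replacing the fatal `Preconditions.checkArgument` in `checkOwner()` with a WARN log, since cross-thread `WriterOptions` creation is unexpected but safe. The names mirror the snippet in the description; `java.util.logging` stands in here for ORC's actual logging setup.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.logging.Logger;

// Sketch only: downgrade the owner-thread check from a thrown
// IllegalArgumentException to a warning. OwnerCheck is a stand-in class.
public class OwnerCheck {
    private static final Logger LOG = Logger.getLogger(OwnerCheck.class.getName());
    private final ReentrantLock ownerLock = new ReentrantLock();

    // Before: Preconditions.checkArgument(...) throws when the caller does not
    // hold ownerLock. After: log and continue.
    void checkOwner() {
        if (!ownerLock.isHeldByCurrentThread()) {
            LOG.warning("Owner thread expected " + ownerLock
                + ", got " + Thread.currentThread());
        }
    }

    public static void main(String[] args) {
        OwnerCheck c = new OwnerCheck();
        c.checkOwner(); // warns instead of throwing
    }
}
```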
[jira] [Updated] (HIVE-13130) API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13130: - Attachment: HIVE-13130.3.patch > API calls for retrieving primary keys and foreign keys information > --- > > Key: HIVE-13130 > URL: https://issues.apache.org/jira/browse/HIVE-13130 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, > HIVE-13130.3.patch > > > ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes > getPrimaryKeys and getCrossReference API calls. We need to provide these > interfaces as part of PK/FK implementation in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions
[ https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13151: - Attachment: (was: HIVE-13151.1.patch) > Clean up UGI objects in FileSystem cache for transactions > - > > Key: HIVE-13151 > URL: https://issues.apache.org/jira/browse/HIVE-13151 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13151.1.patch > > > One issue with FileSystem.CACHE is that it does not clean itself. The key in > that cache includes UGI object. When new UGI objects are created and used > with the FileSystem api, new entries get added to the cache. > We need to manually clean up those UGI objects once they are no longer in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions
[ https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13151: - Attachment: HIVE-13151.1.patch > Clean up UGI objects in FileSystem cache for transactions > - > > Key: HIVE-13151 > URL: https://issues.apache.org/jira/browse/HIVE-13151 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13151.1.patch > > > One issue with FileSystem.CACHE is that it does not clean itself. The key in > that cache includes UGI object. When new UGI objects are created and used > with the FileSystem api, new entries get added to the cache. > We need to manually clean up those UGI objects once they are no longer in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
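The cache growth described in HIVE-13151 comes from the cache key embedding the UGI, whose equality is effectively identity-based, so each freshly created UGI maps to a new entry even for the same user. The toy below demonstrates the mechanism with a stand-in class (`FakeUgi` is not a Hive/Hadoop type); the actual cleanup in Hadoop is `FileSystem.closeAllForUGI(ugi)` once a UGI is done with.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of identity-keyed cache growth, analogous to FileSystem.CACHE
// keyed by UGI. FakeUgi deliberately does NOT override equals()/hashCode().
public class UgiCacheDemo {
    static class FakeUgi {
        final String user;
        FakeUgi(String user) { this.user = user; }
    }

    public static void main(String[] args) {
        Map<FakeUgi, String> cache = new HashMap<>();
        for (int i = 0; i < 3; i++) {
            // "Same" user each time, but every new FakeUgi is a new key.
            cache.put(new FakeUgi("hive"), "fs-instance-" + i);
        }
        System.out.println(cache.size()); // 3, not 1
    }
}
```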
[jira] [Commented] (HIVE-9422) LLAP: row-level vectorized SARGs
[ https://issues.apache.org/jira/browse/HIVE-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168212#comment-15168212 ] Yohei Abe commented on HIVE-9422: - HIVE-9422.WIP1.patch > LLAP: row-level vectorized SARGs > > > Key: HIVE-9422 > URL: https://issues.apache.org/jira/browse/HIVE-9422 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Sergey Shelukhin > Attachments: HIVE-9422.WIP1.patch > > > When VRBs are built from encoded data, sargs can be applied on low level to > reduce the number of rows to process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9422) LLAP: row-level vectorized SARGs
[ https://issues.apache.org/jira/browse/HIVE-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yohei Abe updated HIVE-9422: Attachment: HIVE-9422.WIP1.patch Still WIP; the main points are: * Add row-level SARG at OrcEncodedDataConsumer.decodeBatch() * SARG is applied to CVB > LLAP: row-level vectorized SARGs > > > Key: HIVE-9422 > URL: https://issues.apache.org/jira/browse/HIVE-9422 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Sergey Shelukhin > Attachments: HIVE-9422.WIP1.patch > > > When VRBs are built from encoded data, sargs can be applied on low level to > reduce the number of rows to process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges
[ https://issues.apache.org/jira/browse/HIVE-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13161: --- Priority: Minor (was: Major) > ORC: Always do sloppy overlaps for DiskRanges > - > > Key: HIVE-13161 > URL: https://issues.apache.org/jira/browse/HIVE-13161 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.3.0, 2.1.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran >Priority: Minor > > The selected columns are sometimes only a few bytes apart (particularly for > nulls, which compress tightly) and the reads aren't merged. > The WORST_UNCOMPRESSED_SLOP is only applied in the PPD case, and is applied > more for safety than for reducing the total number of round-trip calls to the filesystem. > {code} > /** >* Update the disk ranges to collapse adjacent or overlapping ranges. It >* assumes that the ranges are sorted. >* @param ranges the list of disk ranges to merge >*/ > static void mergeDiskRanges(List<DiskRange> ranges) { > DiskRange prev = null; > for(int i=0; i < ranges.size(); ++i) { > DiskRange current = ranges.get(i); > if (prev != null && overlap(prev.offset, prev.end, > current.offset, current.end)) { > prev.offset = Math.min(prev.offset, current.offset); > prev.end = Math.max(prev.end, current.end); > ranges.remove(i); > i -= 1; > } else { > prev = current; > } > } > } > ... > private static boolean overlap(long leftA, long rightA, long leftB, long > rightB) { > if (leftA <= leftB) { > return rightA >= leftB; > } > return rightB >= leftA; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges
[ https://issues.apache.org/jira/browse/HIVE-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13161: --- Component/s: ORC > ORC: Always do sloppy overlaps for DiskRanges > - > > Key: HIVE-13161 > URL: https://issues.apache.org/jira/browse/HIVE-13161 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.3.0, 2.1.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > > The selected columns are sometimes only a few bytes apart (particularly for > nulls, which compress tightly) and the reads aren't merged. > The WORST_UNCOMPRESSED_SLOP is only applied in the PPD case, and is applied > more for safety than for reducing the total number of round-trip calls to the filesystem. > {code} > /** >* Update the disk ranges to collapse adjacent or overlapping ranges. It >* assumes that the ranges are sorted. >* @param ranges the list of disk ranges to merge >*/ > static void mergeDiskRanges(List<DiskRange> ranges) { > DiskRange prev = null; > for(int i=0; i < ranges.size(); ++i) { > DiskRange current = ranges.get(i); > if (prev != null && overlap(prev.offset, prev.end, > current.offset, current.end)) { > prev.offset = Math.min(prev.offset, current.offset); > prev.end = Math.max(prev.end, current.end); > ranges.remove(i); > i -= 1; > } else { > prev = current; > } > } > } > ... > private static boolean overlap(long leftA, long rightA, long leftB, long > rightB) { > if (leftA <= leftB) { > return rightA >= leftB; > } > return rightB >= leftA; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
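For reference, the merge logic quoted in this ticket can be run stand-alone as below, with the generic type (`List<DiskRange>`) restored; `DiskRange` here is a minimal stand-in for the ORC class. It shows why a small gap between selected columns defeats the merge unless a slop is applied before merging, which is what the ticket proposes.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained version of the mergeDiskRanges/overlap pair quoted above.
public class MergeDemo {
    static class DiskRange {
        long offset, end;
        DiskRange(long offset, long end) { this.offset = offset; this.end = end; }
    }

    // Collapse adjacent or overlapping ranges; assumes the list is sorted.
    static void mergeDiskRanges(List<DiskRange> ranges) {
        DiskRange prev = null;
        for (int i = 0; i < ranges.size(); ++i) {
            DiskRange current = ranges.get(i);
            if (prev != null && overlap(prev.offset, prev.end,
                    current.offset, current.end)) {
                prev.offset = Math.min(prev.offset, current.offset);
                prev.end = Math.max(prev.end, current.end);
                ranges.remove(i);
                i -= 1;
            } else {
                prev = current;
            }
        }
    }

    private static boolean overlap(long leftA, long rightA, long leftB, long rightB) {
        if (leftA <= leftB) {
            return rightA >= leftB;
        }
        return rightB >= leftA;
    }

    public static void main(String[] args) {
        List<DiskRange> ranges = new ArrayList<>();
        ranges.add(new DiskRange(0, 100));
        ranges.add(new DiskRange(90, 200));   // overlaps previous -> merged
        ranges.add(new DiskRange(205, 300));  // 5-byte gap -> NOT merged without slop
        mergeDiskRanges(ranges);
        System.out.println(ranges.size()); // 2
    }
}
```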
[jira] [Updated] (HIVE-13160) HS2 unable to load UDFs on startup when HMS is not ready
[ https://issues.apache.org/jira/browse/HIVE-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-13160: Description: The error looks like this: {code} 2016-02-18 14:43:54,251 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:48:54,692 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:48:54,692 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:48:55,692 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:53:55,800 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:53:55,800 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:53:56,801 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:58:56,967 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:58:56,967 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:58:57,994 WARN hive.ql.metadata.Hive: [main]: Failed to register all functions. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1492) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:64) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2915) ... 
2016-02-18 14:58:57,997 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:03:58,094 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:03:58,095 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:03:59,095 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:08:59,203 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:08:59,203 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:09:00,203 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:14:00,304 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:14:00,304 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:14:01,306 INFO org.apache.hive.service.server.HiveServer2: [main]: Shutting down HiveServer2 2016-02-18 15:14:01,308 INFO org.apache.hive.service.server.HiveServer2: [main]: Exception caught when calling stop of HiveServer2 before retrying start java.lang.NullPointerException at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:283) at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:351) at org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:69) at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:545) {code} And then none of the functions will be available for use, as HS2 does not re-register them after HMS is up and ready. This is not desired behaviour; we shouldn't allow HS2 to be in a servicing state if the function list is not ready. 
Alternatively, instead of initializing the function list when HS2 starts, load it when each Hive session is created. We could keep a cache of the function list somewhere for better performance, but it would be better to decouple it from the Hive class. was: The error looks like this: {code} 2016-02-24 21:16:09,901 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-24 21:16:09,971 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-24 21:16:09,971 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-24 21:16:10,971 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-24 21:16:10,975 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-24 21:16:10,976 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-24 21:16:11,976 INFO hive.metastore: [main]: Trying to connect to metastore with URI
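The lazy-loading alternative suggested in HIVE-13160 can be sketched as a memoized loader: the function list is fetched on first use rather than at startup, so an unreachable HMS delays registration instead of permanently losing it. `FunctionRegistryCache` and its loader are hypothetical names for illustration, not existing Hive classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Sketch: fetch the function list lazily and cache the first successful load.
// A failed load (loader throwing) leaves the cache empty, so later sessions
// retry instead of running forever without registered functions.
public class FunctionRegistryCache {
    private final Supplier<List<String>> loader;
    private volatile List<String> cached; // null until first successful load

    FunctionRegistryCache(Supplier<List<String>> loader) { this.loader = loader; }

    List<String> getFunctions() {
        List<String> result = cached;
        if (result == null) {
            synchronized (this) {
                if (cached == null) {
                    cached = loader.get(); // would call HMS here
                }
                result = cached;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FunctionRegistryCache cache =
            new FunctionRegistryCache(() -> Arrays.asList("chr", "replace"));
        System.out.println(cache.getFunctions()); // loaded on first access
    }
}
```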
[jira] [Updated] (HIVE-13120) propagate doAs when generating ORC splits
[ https://issues.apache.org/jira/browse/HIVE-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13120: Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the review! > propagate doAs when generating ORC splits > - > > Key: HIVE-13120 > URL: https://issues.apache.org/jira/browse/HIVE-13120 > Project: Hive > Issue Type: Improvement >Reporter: Yi Zhang >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13120.patch > > > ORC+HS2+doAs+FetchTask conversion = weird permission errors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST
[ https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168089#comment-15168089 ] Hive QA commented on HIVE-12994: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12790020/HIVE-12994.11.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 4 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/124/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/124/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-124/ Messages: {noformat} LXC derby found. LXC derby is not started. Starting container... Container started. Preparing derby container... Container prepared. Calling /hive/testutils/metastore/dbs/derby/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/derby/execute.sh ... Tests executed. LXC mysql found. LXC mysql is not started. Starting container... Container started. Preparing mysql container... Container prepared. Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/mysql/execute.sh ... Tests executed. LXC oracle found. LXC oracle is not started. Starting container... Container started. Preparing oracle container... Container prepared. Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/oracle/execute.sh ... Tests executed. LXC postgres found. LXC postgres is not started. Starting container... Container started. Preparing postgres container... Container prepared. Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/postgres/execute.sh ... Tests executed. 
{noformat} This message is automatically generated. ATTACHMENT ID: 12790020 - PreCommit-HIVE-METASTORE-Test > Implement support for NULLS FIRST/NULLS LAST > > > Key: HIVE-12994 > URL: https://issues.apache.org/jira/browse/HIVE-12994 > Project: Hive > Issue Type: New Feature > Components: CBO, Parser, Serializers/Deserializers >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, > HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, > HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, > HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, > HIVE-12994.11.patch, HIVE-12994.patch > > > From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to > determine whether nulls appear before or after non-null data values when the > ORDER BY clause is used. > SQL standard does not specify the behavior by default. Currently in Hive, > null values sort as if lower than any non-null value; that is, NULLS FIRST is > the default for ASC order, and NULLS LAST for DESC order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST
[ https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12994: --- Attachment: HIVE-12994.11.patch I needed to regenerate additional q files (showing info about null ordering in extended explain). > Implement support for NULLS FIRST/NULLS LAST > > > Key: HIVE-12994 > URL: https://issues.apache.org/jira/browse/HIVE-12994 > Project: Hive > Issue Type: New Feature > Components: CBO, Parser, Serializers/Deserializers >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, > HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, > HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, > HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, > HIVE-12994.11.patch, HIVE-12994.patch > > > From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to > determine whether nulls appear before or after non-null data values when the > ORDER BY clause is used. > SQL standard does not specify the behavior by default. Currently in Hive, > null values sort as if lower than any non-null value; that is, NULLS FIRST is > the default for ASC order, and NULLS LAST for DESC order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
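The ordering semantics this ticket discusses can be illustrated with plain Java comparators; the example below mirrors Hive's stated default (NULLS FIRST for ASC, NULLS LAST for DESC) and is independent of the actual patch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Demonstration of NULLS FIRST / NULLS LAST placement during sorting.
public class NullOrdering {
    public static void main(String[] args) {
        List<Integer> asc = Arrays.asList(3, null, 1);
        asc.sort(Comparator.nullsFirst(Comparator.naturalOrder()));
        System.out.println(asc); // [null, 1, 3]  -> NULLS FIRST with ASC

        List<Integer> desc = Arrays.asList(3, null, 1);
        desc.sort(Comparator.nullsLast(Comparator.<Integer>reverseOrder()));
        System.out.println(desc); // [3, 1, null] -> NULLS LAST with DESC
    }
}
```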
[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168060#comment-15168060 ] Hive QA commented on HIVE-13149: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789631/HIVE-13149.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9814 tests executed *Failed tests:* {noformat} TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file TestMsgBusConnection - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape_clusterby1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown_negative org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_set_metaconf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_conv org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_rpad org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestMetastoreVersion.testVersionRestriction org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7092/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7092/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7092/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12789631 - PreCommit-HIVE-TRUNK-Build > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some of them like StatsTask, don't need to access > HMS, currently a new HMS connection will be established for each thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST
[ https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12994: --- Status: Patch Available (was: In Progress) > Implement support for NULLS FIRST/NULLS LAST > > > Key: HIVE-12994 > URL: https://issues.apache.org/jira/browse/HIVE-12994 > Project: Hive > Issue Type: New Feature > Components: CBO, Parser, Serializers/Deserializers >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, > HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, > HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, > HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, > HIVE-12994.patch > > > From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to > determine whether nulls appear before or after non-null data values when the > ORDER BY clause is used. > SQL standard does not specify the behavior by default. Currently in Hive, > null values sort as if lower than any non-null value; that is, NULLS FIRST is > the default for ASC order, and NULLS LAST for DESC order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST
[ https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12994: --- Status: Open (was: Patch Available) > Implement support for NULLS FIRST/NULLS LAST > > > Key: HIVE-12994 > URL: https://issues.apache.org/jira/browse/HIVE-12994 > Project: Hive > Issue Type: New Feature > Components: CBO, Parser, Serializers/Deserializers >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, > HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, > HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, > HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, > HIVE-12994.patch > > > From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to > determine whether nulls appear before or after non-null data values when the > ORDER BY clause is used. > SQL standard does not specify the behavior by default. Currently in Hive, > null values sort as if lower than any non-null value; that is, NULLS FIRST is > the default for ASC order, and NULLS LAST for DESC order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST
[ https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-12994 started by Jesus Camacho Rodriguez. -- > Implement support for NULLS FIRST/NULLS LAST > > > Key: HIVE-12994 > URL: https://issues.apache.org/jira/browse/HIVE-12994 > Project: Hive > Issue Type: New Feature > Components: CBO, Parser, Serializers/Deserializers >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, > HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, > HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, > HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, > HIVE-12994.patch > > > From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to > determine whether nulls appear before or after non-null data values when the > ORDER BY clause is used. > SQL standard does not specify the behavior by default. Currently in Hive, > null values sort as if lower than any non-null value; that is, NULLS FIRST is > the default for ASC order, and NULLS LAST for DESC order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13159) TxnHandler should support datanucleus.connectionPoolingType = None
[ https://issues.apache.org/jira/browse/HIVE-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167986#comment-15167986 ] Sergey Shelukhin commented on HIVE-13159: - [~alangates] fyi > TxnHandler should support datanucleus.connectionPoolingType = None > -- > > Key: HIVE-13159 > URL: https://issues.apache.org/jira/browse/HIVE-13159 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Right now, one has to choose bonecp or dbcp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE
[ https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Fernandez updated HIVE-13063: --- Summary: Create UDFs for CHR and REPLACE (was: Create UDFs for CHAR and REPLACE ) > Create UDFs for CHR and REPLACE > > > Key: HIVE-13063 > URL: https://issues.apache.org/jira/browse/HIVE-13063 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez > Fix For: 2.1.0 > > Attachments: HIVE-13063.master.patch, Screen Shot 2016-02-17 at > 7.20.57 PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png > > > Create UDFS for these functions. > CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If > n is less than 0 or greater than 255, return the empty string. If n is 0, > return null. > REPLACE: replace all substrings of 'str' that match 'search' with 'rep'. > Example. SELECT REPLACE('Hack and Hue', 'H', 'BL'); > Equals 'BLack and BLue'" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13069) Enable cartesian product merging
[ https://issues.apache.org/jira/browse/HIVE-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13069 started by Jesus Camacho Rodriguez. -- > Enable cartesian product merging > > > Key: HIVE-13069 > URL: https://issues.apache.org/jira/browse/HIVE-13069 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Currently we can merge 2-way joins into n-way joins when the joins are > executed over the same column. > In turn, CBO might produce plans containing cartesian products if the join > columns are constant values; after HIVE-12543 went in, this is rather common, > as those constant columns are correctly pruned. However, currently we do not > merge a cartesian product with two inputs into a cartesian product with > multiple inputs, which could result in performance loss. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13069) Enable cartesian product merging
[ https://issues.apache.org/jira/browse/HIVE-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13069: --- Status: Patch Available (was: In Progress) > Enable cartesian product merging > > > Key: HIVE-13069 > URL: https://issues.apache.org/jira/browse/HIVE-13069 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Currently we can merge 2-way joins into n-way joins when the joins are > executed over the same column. > In turn, CBO might produce plans containing cartesian products if the join > columns are constant values; after HIVE-12543 went in, this is rather common, > as those constant columns are correctly pruned. However, currently we do not > merge a cartesian product with two inputs into a cartesian product with > multiple inputs, which could result in performance loss. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13120) propagate doAs when generating ORC splits
[ https://issues.apache.org/jira/browse/HIVE-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167845#comment-15167845 ] Prasanth Jayachandran commented on HIVE-13120: -- lgtm, +1 > propagate doAs when generating ORC splits > - > > Key: HIVE-13120 > URL: https://issues.apache.org/jira/browse/HIVE-13120 > Project: Hive > Issue Type: Improvement >Reporter: Yi Zhang >Assignee: Sergey Shelukhin > Attachments: HIVE-13120.patch > > > ORC+HS2+doAs+FetchTask conversion = weird permission errors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13129) CliService leaks HMS connection
[ https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167726#comment-15167726 ] Hive QA commented on HIVE-13129: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789598/HIVE-13129.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9826 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7091/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7091/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7091/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12789598 - PreCommit-HIVE-TRUNK-Build > CliService leaks HMS connection > --- > > Key: HIVE-13129 > URL: https://issues.apache.org/jira/browse/HIVE-13129 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13129.patch > > > HIVE-12790 fixes the HMS connection leaking. But seems there is one more > connection from CLIService. > The init() function in CLIService will get info from DB but we never close > the HMS connection for this service main thread. 
> {noformat} > // creates connection to HMS and thus *must* occur after kerberos login > above > try { > applyAuthorizationConfigPolicy(hiveConf); > } catch (Exception e) { > throw new RuntimeException("Error applying authorization policy on hive > configuration: " > + e.getMessage(), e); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
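The quoted description shows `init()` opening a metastore connection on the service main thread for one-time setup and never closing it. A minimal, self-contained sketch of the "close what init opened" pattern the report implies — the `MetaStoreClient` class and the counter here are stand-ins for illustration, not Hive's real client:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class InitCloseDemo {
    static final AtomicInteger openConnections = new AtomicInteger();

    // Stand-in for a per-thread metastore client; the real one lives inside
    // org.apache.hadoop.hive.ql.metadata.Hive and is kept in a ThreadLocal.
    static class MetaStoreClient implements AutoCloseable {
        MetaStoreClient() { openConnections.incrementAndGet(); }
        @Override public void close() { openConnections.decrementAndGet(); }
    }

    static void applyAuthorizationConfigPolicy(MetaStoreClient client) {
        // one-time setup work that needs the metastore
    }

    static void init() {
        // The leak: acquiring a client for one-time setup and never releasing it.
        // try-with-resources guarantees the release once setup is done.
        try (MetaStoreClient client = new MetaStoreClient()) {
            applyAuthorizationConfigPolicy(client);
        }
    }

    public static void main(String[] args) {
        init();
        System.out.println(openConnections.get()); // 0: nothing leaked
    }
}
```

The same effect in Hive itself would come from releasing the thread's metastore handle after the setup call, rather than leaving it open for the lifetime of the service.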
[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167725#comment-15167725 ] Alan Gates commented on HIVE-13013: --- +1 > Further Improve concurrency in TxnHandler > - > > Key: HIVE-13013 > URL: https://issues.apache.org/jira/browse/HIVE-13013 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch > > > There are still a few operations in TxnHandler that run at Serializable > isolation. > Most or all of them can be dropped to READ_COMMITTED now that we have SELECT > ... FOR UPDATE support. This will reduce number of deadlocks in the DBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167695#comment-15167695 ] Aihua Xu commented on HIVE-13149: - [~jxiang], [~ctang.ma], [~ngangam] You have worked on the leaking issues before. Can you guys help review the code change? > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some of them like StatsTask, don't need to access > HMS, currently a new HMS connection will be established for each thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
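The description above argues that tasks which never touch the metastore should not pay for a connection at `SessionState.start`. A hedged sketch of the lazy-initialization alternative — all class names here are illustrative stand-ins, not Hive's actual `SessionState` code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyConnectionDemo {
    static final AtomicInteger connectionsOpened = new AtomicInteger();

    // Stand-in for an HMS connection.
    static class MetaStoreClient {
        MetaStoreClient() { connectionsOpened.incrementAndGet(); }
    }

    // Instead of connecting eagerly in start(), keep an empty per-thread slot
    // and fill it only on first real use.
    static final ThreadLocal<MetaStoreClient> client = new ThreadLocal<>();

    static MetaStoreClient getClient() {
        MetaStoreClient c = client.get();
        if (c == null) {               // connect lazily, on first use only
            c = new MetaStoreClient();
            client.set(c);
        }
        return c;
    }

    static void runTaskThatSkipsMetastore() { /* most tasks land here */ }

    static void runStatsTask() {
        getClient();                   // actually needs the metastore
    }

    public static void main(String[] args) {
        runTaskThatSkipsMetastore();
        System.out.println(connectionsOpened.get()); // 0: no unused connection
        runStatsTask();
        System.out.println(connectionsOpened.get()); // 1
    }
}
```

With this shape, a parallel query with many tasks opens connections only on the threads that call `getClient()`, which is the behavior the issue asks for.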
[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167647#comment-15167647 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13130: -- [~ashutoshc] Yes, will add the tests as well. I will try to add the PK/FK to be sent as configured in HS2 as well (by introducing a new HS2 parameter). Right now it always sends a DUMMY col for the FK/PK col values. Thanks Hari > API calls for retrieving primary keys and foreign keys information > --- > > Key: HIVE-13130 > URL: https://issues.apache.org/jira/browse/HIVE-13130 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch > > > ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes > getPrimaryKeys and getCrossReference API calls. We need to provide these > interfaces as part of PK/FK implementation in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13153) SessionID is appended to thread name twice
[ https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167642#comment-15167642 ] Sergey Shelukhin commented on HIVE-13153: - Change logging to debug? Also, should the 2nd call in CliDriver be reset, rather than update? I am not sure about compatibility with the original logic, but this logic looks good to me. > SessionID is appended to thread name twice > -- > > Key: HIVE-13153 > URL: https://issues.apache.org/jira/browse/HIVE-13153 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch > > > HIVE-12249 added sessionId to thread name. In some cases the sessionId could > be appended twice. Example log line > {code} > DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 > 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
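The double-append in the example log line can be avoided by making the rename idempotent. A minimal sketch of that guard — this is an illustration of the idea, not the actual HIVE-13153 patch:

```java
public class ThreadNameDemo {
    // Only prepend the session id if the thread name does not already start
    // with it, so repeated calls cannot stack the id twice.
    static String withSessionId(String sessionId, String threadName) {
        String prefix = sessionId + " ";
        return threadName.startsWith(prefix) ? threadName : prefix + threadName;
    }

    public static void main(String[] args) {
        String id = "6432ec22-9f66-4fa5-8770-488a9d3f0b61";
        String once = withSessionId(id, "main");
        String twice = withSessionId(id, once);   // second call is a no-op
        System.out.println(once);
        System.out.println(once.equals(twice));   // true: no double append
    }
}
```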
[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format
[ https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167636#comment-15167636 ] Sergey Shelukhin commented on HIVE-12749: - constprog2, constprog_partitioner do not test constant propagation anymore. They should either be removed or changed to test it with compatible types. cc [~ashutoshc] Otherwise looks good. > Constant propagate returns string values in incorrect format > > > Key: HIVE-12749 > URL: https://issues.apache.org/jira/browse/HIVE-12749 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 1.2.0 >Reporter: Oleksiy Sayankin >Assignee: Aleksey Vovchenko > Fix For: 2.0.1 > > Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch, > HIVE-12749.3.patch, HIVE-12749.4.patch, HIVE-12749.5.patch, > HIVE-12749.6.patch, HIVE-12749.7.patch > > > h2. STEP 1. Create and upload test data > Execute in command line: > {noformat} > nano stest.data > {noformat} > Add to file: > {noformat} > 000126,000777 > 000126,000778 > 000126,000779 > 000474,000888 > 000468,000889 > 000272,000880 > {noformat} > {noformat} > hadoop fs -put stest.data / > {noformat} > {noformat} > hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS > TERMINATED BY ','; > hive> LOAD DATA INPATH '/stest.data' OVERWRITE INTO TABLE stest; > {noformat} > h2. STEP 2. Execute test query (with cast for x) > {noformat} > select x from stest where cast(x as int) = 126; > {noformat} > EXPECTED RESULT: > {noformat} > 000126 > 000126 > 000126 > {noformat} > ACTUAL RESULT: > {noformat} > 126 > 126 > 126 > {noformat} > h2. STEP 3. Execute test query (no cast for x) > {noformat} > hive> select x from stest where x = 126; > {noformat} > EXPECTED RESULT: > {noformat} > 000126 > 000126 > 000126 > {noformat} > ACTUAL RESULT: > {noformat} > 126 > 126 > 126 > {noformat} > In steps #2, #3 I expected '000126' because the origin type of x is STRING in > stest table. 
> Note, setting hive.optimize.constant.propagation=false fixes the issue. > {noformat} > hive> set hive.optimize.constant.propagation=false; > hive> select x from stest where x = 126; > OK > 000126 > 000126 > 000126 > {noformat} > Related to HIVE-11104, HIVE-8555 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
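The symptom above (getting `126` back instead of the stored `000126`) follows from constant propagation substituting the folded *integer* constant for the projected STRING column. A toy model of the two behaviors — this is an illustration of the semantics, not Hive's optimizer code:

```java
public class ConstPropDemo {
    // Correct semantics: the filter cast(x as int) = 126 decides whether the
    // row qualifies, but the projection must still return the stored string.
    static String selectCorrect(String x, int constant) {
        return Integer.parseInt(x) == constant ? x : null;
    }

    // Buggy propagation: the projected column is replaced by the folded int
    // constant, so the leading zeros of the stored value are lost.
    static String selectWithBadPropagation(String x, int constant) {
        return Integer.parseInt(x) == constant ? Integer.toString(constant) : null;
    }

    public static void main(String[] args) {
        String stored = "000126";
        System.out.println(selectCorrect(stored, 126));            // 000126
        System.out.println(selectWithBadPropagation(stored, 126)); // 126
    }
}
```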
[jira] [Commented] (HIVE-13132) Hive should lazily load and cache metastore (permanent) functions
[ https://issues.apache.org/jira/browse/HIVE-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167639#comment-15167639 ] Anthony Hsu commented on HIVE-13132: Thanks for the review, [~alangates]. # I tested and unfortunately, HIVE-2573 does NOT solve this issue. A stacktrace in jdb shows all functions are loaded during CliDriver start-up: {noformat} main[1] where [1] org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions (Hive.java:173) [2] org.apache.hadoop.hive.ql.metadata.Hive. (Hive.java:166) [3] org.apache.hadoop.hive.ql.session.SessionState.start (SessionState.java:503) [4] org.apache.hadoop.hive.cli.CliDriver.run (CliDriver.java:677) [5] org.apache.hadoop.hive.cli.CliDriver.main (CliDriver.java:621) [6] sun.reflect.NativeMethodAccessorImpl.invoke0 (native method) [7] sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62) [8] sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43) [9] java.lang.reflect.Method.invoke (Method.java:483) [10] org.apache.hadoop.util.RunJar.run (RunJar.java:221) [11] org.apache.hadoop.util.RunJar.main (RunJar.java:136) {noformat} # My fix is a bit hacky and only works on the Hive 0.13.1 branch. I basically changed the CliDriver initialization code to use {{FunctionRegistry.getFunctionNames(String funcPatternStr)}} instead of {{FunctionRegistry.getFunctionNames()}}. In the Hive 0.13.1 branch, the former does NOT search the metastore while the latter does. This is no longer the case in trunk. # I don't have much experience with HS2 but I will take a look. I'll work on a cleaner solution that removes the pre-loading of metastore functions from the CliDriver initialization code path. Suggestions welcome. 
> Hive should lazily load and cache metastore (permanent) functions > - > > Key: HIVE-13132 > URL: https://issues.apache.org/jira/browse/HIVE-13132 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.13.1 >Reporter: Anthony Hsu >Assignee: Anthony Hsu > Attachments: HIVE-13132.1.patch > > > In Hive 0.13.1, we have noticed that as the number of databases increases, > the start-up time of the Hive interactive shell increases. This is because > during start-up, all databases are iterated over to fetch the permanent > functions to display in the {{SHOW FUNCTIONS}} output. > {noformat:title=FunctionRegistry.java} > private static Set getFunctionNames(boolean searchMetastore) { > Set functionNames = mFunctions.keySet(); > if (searchMetastore) { > functionNames = new HashSet(functionNames); > try { > Hive db = getHive(); > List dbNames = db.getAllDatabases(); > for (String dbName : dbNames) { > List funcNames = db.getFunctions(dbName, "*"); > for (String funcName : funcNames) { > functionNames.add(FunctionUtils.qualifyFunctionName(funcName, > dbName)); > } > } > } catch (Exception e) { > LOG.error(e); > // Continue on, we can still return the functions we've gotten to > this point. > } > } > return functionNames; > } > {noformat} > Instead of eagerly loading all metastore functions, we should only load them > the first time {{SHOW FUNCTIONS}} is invoked. We should also cache the > results. > Note that this issue may have been fixed by HIVE-2573, though I haven't > verified this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
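The lazy-load-and-cache behavior the issue title asks for can be sketched with a memoized lookup: nothing is fetched at startup, the metastore is consulted at most once, and later calls reuse the cached set. The fetch method and names below are stand-ins, not Hive's `FunctionRegistry`:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class FunctionCacheDemo {
    static final AtomicInteger metastoreCalls = new AtomicInteger();

    // Stand-in for iterating every database in the metastore -- the
    // expensive call the issue wants deferred and cached.
    static Set<String> fetchPermanentFunctions() {
        metastoreCalls.incrementAndGet();
        return new LinkedHashSet<>(Arrays.asList("db1.my_udf", "db2.other_udf"));
    }

    private static volatile Set<String> cache;

    // Double-checked locking: the fetch runs once, on the first
    // SHOW FUNCTIONS, and never during shell start-up.
    static Set<String> getFunctionNames() {
        Set<String> local = cache;
        if (local == null) {
            synchronized (FunctionCacheDemo.class) {
                if (cache == null) {
                    cache = fetchPermanentFunctions();
                }
                local = cache;
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(metastoreCalls.get()); // 0: nothing loaded at startup
        getFunctionNames();
        getFunctionNames();
        System.out.println(metastoreCalls.get()); // 1: cached after first use
    }
}
```

A real version would also need an invalidation hook for CREATE/DROP FUNCTION, which this sketch omits.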
[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167632#comment-15167632 ] Ashutosh Chauhan commented on HIVE-13130: - Also can you add tests for new jdbc api in {{TestJdbcWithMiniHS2}} > API calls for retrieving primary keys and foreign keys information > --- > > Key: HIVE-13130 > URL: https://issues.apache.org/jira/browse/HIVE-13130 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch > > > ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes > getPrimaryKeys and getCrossReference API calls. We need to provide these > interfaces as part of PK/FK implementation in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format
[ https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167630#comment-15167630 ] Jesus Camacho Rodriguez commented on HIVE-12749: Patch LGTM, +1. [~AleKsey Vovchenko], could you rebase it and take care of those q file failures? Thanks > Constant propagate returns string values in incorrect format > > > Key: HIVE-12749 > URL: https://issues.apache.org/jira/browse/HIVE-12749 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 1.2.0 >Reporter: Oleksiy Sayankin >Assignee: Aleksey Vovchenko > Fix For: 2.0.1 > > Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch, > HIVE-12749.3.patch, HIVE-12749.4.patch, HIVE-12749.5.patch, > HIVE-12749.6.patch, HIVE-12749.7.patch > > > h2. STEP 1. Create and upload test data > Execute in command line: > {noformat} > nano stest.data > {noformat} > Add to file: > {noformat} > 000126,000777 > 000126,000778 > 000126,000779 > 000474,000888 > 000468,000889 > 000272,000880 > {noformat} > {noformat} > hadoop fs -put stest.data / > {noformat} > {noformat} > hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS > TERMINATED BY ','; > hive> LOAD DATA INPATH '/stest.data' OVERWRITE INTO TABLE stest; > {noformat} > h2. STEP 2. Execute test query (with cast for x) > {noformat} > select x from stest where cast(x as int) = 126; > {noformat} > EXPECTED RESULT: > {noformat} > 000126 > 000126 > 000126 > {noformat} > ACTUAL RESULT: > {noformat} > 126 > 126 > 126 > {noformat} > h2. STEP 3. Execute test query (no cast for x) > {noformat} > hive> select x from stest where x = 126; > {noformat} > EXPECTED RESULT: > {noformat} > 000126 > 000126 > 000126 > {noformat} > ACTUAL RESULT: > {noformat} > 126 > 126 > 126 > {noformat} > In steps #2, #3 I expected '000126' because the origin type of x is STRING in > stest table. > Note, setting hive.optimize.constant.propagation=false fixes the issue. 
> {noformat} > hive> set hive.optimize.constant.propagation=false; > hive> select x from stest where x = 126; > OK > 000126 > 000126 > 000126 > {noformat} > Related to HIVE-11104, HIVE-8555 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167618#comment-15167618 ] Eugene Koifman commented on HIVE-13013: --- [~alangates], could you review please? The delta between patch 1 and 3 is minimal > Further Improve concurrency in TxnHandler > - > > Key: HIVE-13013 > URL: https://issues.apache.org/jira/browse/HIVE-13013 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch > > > There are still a few operations in TxnHandler that run at Serializable > isolation. > Most or all of them can be dropped to READ_COMMITTED now that we have SELECT > ... FOR UPDATE support. This will reduce number of deadlocks in the DBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167606#comment-15167606 ] Ashutosh Chauhan commented on HIVE-13130: - Can you create a RB entry for non generated code? > API calls for retrieving primary keys and foreign keys information > --- > > Key: HIVE-13130 > URL: https://issues.apache.org/jira/browse/HIVE-13130 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch > > > ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes > getPrimaryKeys and getCrossReference API calls. We need to provide these > interfaces as part of PK/FK implementation in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13129) CliService leaks HMS connection
[ https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167565#comment-15167565 ] Aihua Xu commented on HIVE-13129: - Only one HMS connection will be created, but it is never released and never used again. It will accumulate leaked HMS and database resources if HiveServer2 keeps restarting. > CliService leaks HMS connection > --- > > Key: HIVE-13129 > URL: https://issues.apache.org/jira/browse/HIVE-13129 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13129.patch > > > HIVE-12790 fixes the HMS connection leaking. But seems there is one more > connection from CLIService. > The init() function in CLIService will get info from DB but we never close > the HMS connection for this service main thread. > {noformat} > // creates connection to HMS and thus *must* occur after kerberos login > above > try { > applyAuthorizationConfigPolicy(hiveConf); > } catch (Exception e) { > throw new RuntimeException("Error applying authorization policy on hive > configuration: " > + e.getMessage(), e); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13146) OrcFile table property values are case sensitive
[ https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-13146: Attachment: HIVE-13146.1.patch The attached patch fixes it by changing the orc.compress value to uppercase when the table is created. > OrcFile table property values are case sensitive > > > Key: HIVE-13146 > URL: https://issues.apache.org/jira/browse/HIVE-13146 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.1 >Reporter: Andrew Sears >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-13146.1.patch > > > In Hive v1.2.1.2.3, with Tez , create an external table with compression > SNAPPY value marked as lower case. Table is created successfully. Insert > data into table fails with no enum constant error. > CREATE EXTERNAL TABLE mydb.mytable > (id int) > PARTITIONED BY (business_date date) > STORED AS ORC > LOCATION > '/data/mydb/mytable' > TBLPROPERTIES ( > 'orc.compress'='snappy'); > set hive.exec.dynamic.partition=true; > set hive.exec.dynamic.partition.mode=nonstrict; > INSERT OVERWRITE mydb.mytable PARTITION (business_date) > SELECT * from mydb.sourcetable; > Caused by: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25) > Constant SNAPPY needs to be uppercase in definition to fix. Case should be > agnostic or throw error on creation of table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
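The stack trace above fails inside `Enum.valueOf`, which is case-sensitive. A self-contained sketch of the normalization the patch describes — the enum here is a stand-in for Hive's `org.apache.hadoop.hive.ql.io.orc.CompressionKind`, not the real class:

```java
import java.util.Locale;

public class OrcCompressDemo {
    // Stand-in for Hive's CompressionKind enum.
    enum CompressionKind { NONE, ZLIB, SNAPPY, LZO }

    // Enum.valueOf("snappy") throws IllegalArgumentException because the
    // constant is SNAPPY; uppercasing first makes the property case-agnostic.
    static CompressionKind parseCompression(String value) {
        return CompressionKind.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parseCompression("snappy")); // SNAPPY
        System.out.println(parseCompression("ZLIB"));   // ZLIB
    }
}
```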
[jira] [Updated] (HIVE-13146) OrcFile table property values are case sensitive
[ https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-13146: Status: Patch Available (was: Open) > OrcFile table property values are case sensitive > > > Key: HIVE-13146 > URL: https://issues.apache.org/jira/browse/HIVE-13146 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.1 >Reporter: Andrew Sears >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-13146.1.patch > > > In Hive v1.2.1.2.3, with Tez , create an external table with compression > SNAPPY value marked as lower case. Table is created successfully. Insert > data into table fails with no enum constant error. > CREATE EXTERNAL TABLE mydb.mytable > (id int) > PARTITIONED BY (business_date date) > STORED AS ORC > LOCATION > '/data/mydb/mytable' > TBLPROPERTIES ( > 'orc.compress'='snappy'); > set hive.exec.dynamic.partition=true; > set hive.exec.dynamic.partition.mode=nonstrict; > INSERT OVERWRITE mydb.mytable PARTITION (business_date) > SELECT * from mydb.sourcetable; > Caused by: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25) > Constant SNAPPY needs to be uppercase in definition to fix. Case should be > agnostic or throw error on creation of table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13139) Unfold TOK_ALLCOLREF of source table/view at QB stage
[ https://issues.apache.org/jira/browse/HIVE-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167534#comment-15167534 ] Hive QA commented on HIVE-13139: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789597/HIVE-13139.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 405 failed/errored test(s), 9720 tests executed *Failed tests:* {noformat} TestSparkCliDriver-auto_join18.q-union_remove_23.q-input1_limit.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join39.q-stats12.q-union27.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_partition_drop org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_const org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_simple_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_windowing org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_windowing org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_column_access_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantfolding org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cteViews org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_3
[jira] [Updated] (HIVE-13131) TezWork queryName can be null after HIVE-12523
[ https://issues.apache.org/jira/browse/HIVE-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-13131: -- Attachment: HIVE-13131.2.patch Uploading patch v2 - this is just a golden file update for the qfiles > TezWork queryName can be null after HIVE-12523 > -- > > Key: HIVE-13131 > URL: https://issues.apache.org/jira/browse/HIVE-13131 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-13131.1.patch, HIVE-13131.2.patch > > > Looks like after HIVE-12523, the queryName field can be null, either if the > conf passed in is null, or if the conf does not contain the necessary > settings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13129) CliService leaks HMS connection
[ https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167489#comment-15167489 ] Naveen Gangam commented on HIVE-13129: -- [~aihuaxu] This should only be creating one HMS connection, during HS2 startup, over the lifetime of the HS2. Do you see this causing an incremental leak? or are you proactively closing that HMS connection because you think its never used again? > CliService leaks HMS connection > --- > > Key: HIVE-13129 > URL: https://issues.apache.org/jira/browse/HIVE-13129 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13129.patch > > > HIVE-12790 fixes the HMS connection leaking. But seems there is one more > connection from CLIService. > The init() function in CLIService will get info from DB but we never close > the HMS connection for this service main thread. > {noformat} > // creates connection to HMS and thus *must* occur after kerberos login > above > try { > applyAuthorizationConfigPolicy(hiveConf); > } catch (Exception e) { > throw new RuntimeException("Error applying authorization policy on hive > configuration: " > + e.getMessage(), e); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13102) CBO: Reduce operations in Calcite do not fold as tight as rule-based folding
[ https://issues.apache.org/jira/browse/HIVE-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13102: --- Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) Pushed to master, thanks for the review [~ashutoshc]! > CBO: Reduce operations in Calcite do not fold as tight as rule-based folding > > > Key: HIVE-13102 > URL: https://issues.apache.org/jira/browse/HIVE-13102 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Fix For: 2.1.0 > > Attachments: HIVE-13102.01.patch, HIVE-13102.patch > > > With CBO > {code} > create temporary table table1(id int, val int, val1 int, dimid int); > create temporary table table3(id int, val int, val1 int); > hive> explain select table1.id, table1.val, table1.val1 from table1 inner > join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid > <>1 ; > Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. 
> Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 llap > File Output Operator [FS_11] > Map Join Operator [MAPJOIN_14] (rows=1 width=0) > Conds:(Inner),Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] llap > BROADCAST [RS_8] > Select Operator [SEL_5] (rows=1 width=0) > Filter Operator [FIL_13] (rows=1 width=0) > predicate:(id = 1) > TableScan [TS_3] (rows=1 width=0) > default@table3,table3,Tbl:PARTIAL,Col:NONE,Output:["id"] > <-Select Operator [SEL_2] (rows=1 width=0) > Output:["_col0","_col1","_col2"] > Filter Operator [FIL_12] (rows=1 width=0) > predicate:((dimid = 1) and (dimid <> 1)) > TableScan [TS_0] (rows=1 width=0) > > default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1","dimid"] > {code} > without CBO > {code} > hive> explain select table1.id, table1.val, table1.val1 from table1 inner > join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid > <>1 ; > OK > Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 llap > File Output Operator [FS_9] > Map Join Operator [MAPJOIN_14] (rows=1 width=0) > Conds:FIL_12.1=RS_17.1(Inner),Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_17] > PartitionCols:1 > Filter Operator [FIL_16] (rows=1 width=0) > predicate:false > TableScan [TS_1] (rows=1 width=0) > default@table3,table3,Tbl:PARTIAL,Col:COMPLETE > <-Filter Operator [FIL_12] (rows=1 width=0) > predicate:false > TableScan [TS_0] (rows=1 width=0) > > default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1"] > Time taken: 0.044 seconds, Fetched: 23 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
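The non-CBO plan above folds the residual filter all the way to `predicate: false`, while the CBO plan keeps `(dimid = 1) and (dimid <> 1)`. The conjunction is unsatisfiable because the join condition `table1.dimid = table3.id and table3.id = 1` implies `dimid = 1`, contradicting the WHERE clause. A trivial sketch of that contradiction:

```java
public class PredicateFoldDemo {
    // The residual predicate the CBO plan keeps: dimid = 1 AND dimid <> 1.
    // No value can satisfy both conjuncts, so it folds to the constant false.
    static boolean residualPredicate(int dimid) {
        return dimid == 1 && dimid != 1;
    }

    public static void main(String[] args) {
        for (int v : new int[]{0, 1, 2}) {
            System.out.println(residualPredicate(v)); // always false
        }
    }
}
```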
[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join
[ https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167382#comment-15167382 ] Chaoyu Tang commented on HIVE-13082: Thanks [~gopalv] for the explanation. Basically, any optimization that drops the ON clause of a left semi join should be considered invalid because it turns off the implicit "distinct", right? > Enable constant propagation optimization in query with left semi join > - > > Key: HIVE-13082 > URL: https://issues.apache.org/jira/browse/HIVE-13082 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, > HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch > > > Currently, constant folding is only allowed for inner or unique joins; I think > it is also applicable to left semi joins. Otherwise, a query like the following, > with multiple joins including a left semi join, will fail: > {code} > select table1.id, table1.val, table2.val2 from table1 inner join table2 on > table1.val = 't1val01' and table1.id = table2.id left semi join table3 on > table1.dimid = table3.id; > {code} > with errors: > {code} > java.lang.Exception: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) > [hadoop-mapreduce-client-common-2.6.0.jar:?] > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > ~[hadoop-common-2.6.0.jar:?] > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > ~[hadoop-common-2.6.0.jar:?]

> at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > ~[hadoop-common-2.6.0.jar:?] > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[?:1.7.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > ~[?:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45] > ... > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45] > at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181) > 
~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at >
[jira] [Commented] (HIVE-13102) CBO: Reduce operations in Calcite do not fold as tight as rule-based folding
[ https://issues.apache.org/jira/browse/HIVE-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167378#comment-15167378 ] Ashutosh Chauhan commented on HIVE-13102: - +1 > CBO: Reduce operations in Calcite do not fold as tight as rule-based folding > > > Key: HIVE-13102 > URL: https://issues.apache.org/jira/browse/HIVE-13102 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: HIVE-13102.01.patch, HIVE-13102.patch > > > With CBO > {code} > create temporary table table1(id int, val int, val1 int, dimid int); > create temporary table table3(id int, val int, val1 int); > hive> explain select table1.id, table1.val, table1.val1 from table1 inner > join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid > <>1 ; > Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 llap > File Output Operator [FS_11] > Map Join Operator [MAPJOIN_14] (rows=1 width=0) > Conds:(Inner),Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] llap > BROADCAST [RS_8] > Select Operator [SEL_5] (rows=1 width=0) > Filter Operator [FIL_13] (rows=1 width=0) > predicate:(id = 1) > TableScan [TS_3] (rows=1 width=0) > default@table3,table3,Tbl:PARTIAL,Col:NONE,Output:["id"] > <-Select Operator [SEL_2] (rows=1 width=0) > Output:["_col0","_col1","_col2"] > Filter Operator [FIL_12] (rows=1 width=0) > predicate:((dimid = 1) and (dimid <> 1)) > TableScan [TS_0] (rows=1 width=0) > > default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1","dimid"] > {code} > without CBO > {code} > hive> explain select table1.id, table1.val, table1.val1 from table1 inner > join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid > <>1 ; > OK > Vertex 
dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 llap > File Output Operator [FS_9] > Map Join Operator [MAPJOIN_14] (rows=1 width=0) > Conds:FIL_12.1=RS_17.1(Inner),Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_17] > PartitionCols:1 > Filter Operator [FIL_16] (rows=1 width=0) > predicate:false > TableScan [TS_1] (rows=1 width=0) > default@table3,table3,Tbl:PARTIAL,Col:COMPLETE > <-Filter Operator [FIL_12] (rows=1 width=0) > predicate:false > TableScan [TS_0] (rows=1 width=0) > > default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1"] > Time taken: 0.044 seconds, Fetched: 23 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167372#comment-15167372 ] Jesus Camacho Rodriguez commented on HIVE-13096: I just did. Thanks > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.03.patch, HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
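[Editorial note] The heuristic change described above can be reduced to a toy sketch — the names below are invented for illustration and are not Hive's actual classes: instead of counting heavyweight operators per input, compare the cumulative cardinality of each input subtree and broadcast the smaller one:

```java
public class SideChooser {
    // Pick the map-join "small" (broadcast) side as the input whose subtree
    // produces the fewest rows cumulatively, per the described heuristic.
    static int chooseSmallSide(long[] cumulativeCardinality) {
        int best = 0;
        for (int i = 1; i < cumulativeCardinality.length; i++) {
            if (cumulativeCardinality[i] < cumulativeCardinality[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Input 1 produces far fewer rows overall, so it gets broadcast
        System.out.println(chooseSmallSide(new long[]{1_000_000L, 5_000L})); // 1
    }
}
```

The issue notes that a latency-based cost would be the longer-term goal; cardinality is a stand-in for it here.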
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167363#comment-15167363 ] Ashutosh Chauhan commented on HIVE-13096: - Can you create a RB entry? > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.03.patch, HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13146) OrcFile table property values are case sensitive
[ https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen reassigned HIVE-13146: --- Assignee: Yongzhi Chen > OrcFile table property values are case sensitive > > > Key: HIVE-13146 > URL: https://issues.apache.org/jira/browse/HIVE-13146 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.1 >Reporter: Andrew Sears >Assignee: Yongzhi Chen >Priority: Minor > > In Hive v1.2.1.2.3, with Tez, create an external table with the compression > value SNAPPY written in lower case. The table is created successfully, but > inserting data into it fails with a "No enum constant" error. > CREATE EXTERNAL TABLE mydb.mytable > (id int) > PARTITIONED BY (business_date date) > STORED AS ORC > LOCATION > '/data/mydb/mytable' > TBLPROPERTIES ( > 'orc.compress'='snappy'); > set hive.exec.dynamic.partition=true; > set hive.exec.dynamic.partition.mode=nonstrict; > INSERT OVERWRITE mydb.mytable PARTITION (business_date) > SELECT * from mydb.sourcetable; > Caused by: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25) > The constant SNAPPY needs to be uppercase in the definition to work around this. > The property value should be case-agnostic, or table creation should throw an > error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
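[Editorial note] The stack trace shows Enum.valueOf being handed the raw property value, and Enum.valueOf is strictly case-sensitive. A minimal sketch of the case-agnostic behavior the reporter asks for — using a toy enum, not Hive's actual CompressionKind class — is to normalize the string before the lookup:

```java
import java.util.Locale;

public class CompressionLookup {
    // Toy stand-in for the ORC CompressionKind enum in the stack trace
    enum CompressionKind { NONE, ZLIB, SNAPPY, LZO }

    // Uppercase before Enum.valueOf so 'snappy' resolves to SNAPPY
    static CompressionKind fromString(String value) {
        return CompressionKind.valueOf(value.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(fromString("snappy")); // prints SNAPPY
    }
}
```

Without the toUpperCase call, `CompressionKind.valueOf("snappy")` throws the same IllegalArgumentException ("No enum constant") reported above.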
[jira] [Updated] (HIVE-13150) When multiple queries are running in the same session, they are sharing the same HMS Client.
[ https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13150: Description: It seems we should create a different HMSClient for each query when multiple queries are executing asynchronously in the same session at the same time, to get better performance. Right now we unnecessarily share one HMSClient, and HMS calls have to be serialized across the different queries. was: HMS connection leak has been addressed for the session thread and task threads when the execution is run in parallel. However, if we execute the queries asynchronously, we run them in separate threads and the HMS connections there are not released. > When multiple queries are running in the same session, they are sharing the > same HMS Client. > > > Key: HIVE-13150 > URL: https://issues.apache.org/jira/browse/HIVE-13150 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > It seems we should create a different HMSClient for each query when multiple > queries are executing asynchronously in the same session at the same time, to > get better performance. > Right now we unnecessarily share one HMSClient, and HMS calls have to be > serialized across the different queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13150) When multiple queries are running in the same session, they are sharing the same HMS Client.
[ https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13150: Summary: When multiple queries are running in the same session, they are sharing the same HMS Client. (was: HMS connection leak when the query is run in async) > When multiple queries are running in the same session, they are sharing the > same HMS Client. > > > Key: HIVE-13150 > URL: https://issues.apache.org/jira/browse/HIVE-13150 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > HMS connection leak has been addressed for the session thread and task threads > when the execution is run in parallel. > However, if we execute the queries asynchronously, we run them in separate > threads and the HMS connections there are not released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-13150) HMS connection leak when the query is run in async
[ https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reopened HIVE-13150: - Rather than a connection leak, it seems we should create a different HMSClient for each query when multiple queries are executing in the same session at the same time, to get better performance. Right now we unnecessarily share one HMSClient, and HMS calls have to be serialized across the different queries. > HMS connection leak when the query is run in async > -- > > Key: HIVE-13150 > URL: https://issues.apache.org/jira/browse/HIVE-13150 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > HMS connection leak has been addressed for the session thread and task threads > when the execution is run in parallel. > However, if we execute the queries asynchronously, we run them in separate > threads and the HMS connections there are not released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
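[Editorial note] The "different HMSClient per query" idea can be sketched with plain Java stand-ins — IMetaStoreClient and the HiveServer2 session machinery are not modeled here — as a thread-local client, so each async query thread gets its own instance instead of serializing on a shared one:

```java
public class PerQueryClient {
    // Stand-in for a metastore client; the real interface lives in Hive.
    static class MetaStoreClient {
        final long ownerThreadId = Thread.currentThread().getId();
    }

    // One client per thread: async query threads no longer contend on a
    // single shared client, so HMS calls are not serialized across queries.
    static final ThreadLocal<MetaStoreClient> CLIENTS =
            ThreadLocal.withInitial(MetaStoreClient::new);

    public static void main(String[] args) throws Exception {
        Thread asyncQuery = new Thread(() ->
                // This thread lazily constructs its own distinct client
                System.out.println(
                        CLIENTS.get().ownerThreadId == Thread.currentThread().getId()));
        asyncQuery.start();
        asyncQuery.join();
        System.out.println(CLIENTS.get().ownerThreadId == Thread.currentThread().getId());
    }
}
```

A real fix would also need to close each client when its query (or the session) ends, which is exactly the leak concern the original summary raised.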
[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13096: --- Status: Open (was: Patch Available) > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13096: --- Attachment: HIVE-13096.03.patch [~ashutoshc], I updated the patch to recalculate mapJoinConversionPos only if necessary. I checked the plan changes and they seem fine based on the heuristic (the plan is akin to its pre-HIVE-11954 state). > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.03.patch, HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13096: --- Status: Patch Available (was: In Progress) > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13096 started by Jesus Camacho Rodriguez. -- > Cost to choose side table in MapJoin conversion based on cumulative > cardinality > --- > > Key: HIVE-13096 > URL: https://issues.apache.org/jira/browse/HIVE-13096 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, > HIVE-13096.patch > > > HIVE-11954 changed the logic to choose the side table in the MapJoin > conversion algorithm. The initial heuristic for the cost was based on the > number of heavyweight operators. > This extends that work so that the heuristic is based on accumulated cardinality. > In the future, we should choose the side based on total latency for the input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13034) Add jdeb plugin to build debian
[ https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167103#comment-15167103 ] Amareshwari Sriramadasu commented on HIVE-13034: +1 for permission fix through https://issues.apache.org/jira/secure/attachment/12789906/HIVE-13034.1.patch > Add jdeb plugin to build debian > --- > > Key: HIVE-13034 > URL: https://issues.apache.org/jira/browse/HIVE-13034 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 2.1.0 >Reporter: Arshad Matin >Assignee: Arshad Matin > Fix For: 2.1.0 > > Attachments: HIVE-13034.1.patch, HIVE-13034.patch > > > It would be nice to also generate a debian as a part of build. This can be > done by adding jdeb plugin to dist profile. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13034) Add jdeb plugin to build debian
[ https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167102#comment-15167102 ] Arshad Matin commented on HIVE-13034: - *Testing* {noformat} arshad:/usr/local/hive$ ls -ltr total 508 -rw-r--r-- 1 root root 445081 Feb 25 08:46 RELEASE_NOTES.txt -rw-r--r-- 1 root root 4353 Feb 25 08:46 README.txt -rw-r--r-- 1 root root513 Feb 25 08:46 NOTICE -rw-r--r-- 1 root root 27909 Feb 25 08:46 LICENSE drwxr-xr-x 4 root root 4096 Feb 25 11:29 scripts drwxr-xr-x 7 root root 4096 Feb 25 11:29 hcatalog drwxr-xr-x 4 root root 4096 Feb 25 11:29 examples drwxr-xr-x 2 root root 4096 Feb 25 11:29 conf drwxr-xr-x 3 root root 4096 Feb 25 11:29 bin drwxr-xr-x 4 root root 12288 Feb 25 11:29 lib arshad:/usr/local/hive$ cd bin/ arshad:/usr/local/hive/bin$ ls -ltr total 64 -rwxr-xr-x 1 root root 884 Feb 25 08:46 schematool -rwxr-xr-x 1 root root 832 Feb 25 08:46 metatool -rwxr-xr-x 1 root root 2278 Feb 25 08:46 hplsql.cmd -rwxr-xr-x 1 root root 1030 Feb 25 08:46 hplsql -rwxr-xr-x 1 root root 885 Feb 25 08:46 hiveserver2 -rwxr-xr-x 1 root root 1900 Feb 25 08:46 hive-config.sh -rwxr-xr-x 1 root root 1584 Feb 25 08:46 hive-config.cmd -rwxr-xr-x 1 root root 8713 Feb 25 08:46 hive.cmd -rwxr-xr-x 1 root root 8262 Feb 25 08:46 hive -rwxr-xr-x 1 root root 2553 Feb 25 08:46 beeline.cmd -rwxr-xr-x 1 root root 1436 Feb 25 08:46 beeline drwxr-xr-x 3 root root 4096 Feb 25 11:29 ext {noformat} > Add jdeb plugin to build debian > --- > > Key: HIVE-13034 > URL: https://issues.apache.org/jira/browse/HIVE-13034 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 2.1.0 >Reporter: Arshad Matin >Assignee: Arshad Matin > Fix For: 2.1.0 > > Attachments: HIVE-13034.1.patch, HIVE-13034.patch > > > It would be nice to also generate a debian as a part of build. This can be > done by adding jdeb plugin to dist profile. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13034) Add jdeb plugin to build debian
[ https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arshad Matin updated HIVE-13034: Status: Patch Available (was: Reopened) > Add jdeb plugin to build debian > --- > > Key: HIVE-13034 > URL: https://issues.apache.org/jira/browse/HIVE-13034 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 2.1.0 >Reporter: Arshad Matin >Assignee: Arshad Matin > Fix For: 2.1.0 > > Attachments: HIVE-13034.1.patch, HIVE-13034.patch > > > It would be nice to also generate a debian as a part of build. This can be > done by adding jdeb plugin to dist profile. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13034) Add jdeb plugin to build debian
[ https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arshad Matin updated HIVE-13034: Attachment: HIVE-13034.1.patch > Add jdeb plugin to build debian > --- > > Key: HIVE-13034 > URL: https://issues.apache.org/jira/browse/HIVE-13034 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 2.1.0 >Reporter: Arshad Matin >Assignee: Arshad Matin > Fix For: 2.1.0 > > Attachments: HIVE-13034.1.patch, HIVE-13034.patch > > > It would be nice to also generate a debian as a part of build. This can be > done by adding jdeb plugin to dist profile. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-13034) Add jdeb plugin to build debian
[ https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arshad Matin reopened HIVE-13034: - Reopening as there is an issue with the permissions. It is a small fix; uploading the patch directly here. > Add jdeb plugin to build debian > --- > > Key: HIVE-13034 > URL: https://issues.apache.org/jira/browse/HIVE-13034 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 2.1.0 >Reporter: Arshad Matin >Assignee: Arshad Matin > Fix For: 2.1.0 > > Attachments: HIVE-13034.patch > > > It would be nice to also generate a Debian package as part of the build. This > can be done by adding the jdeb plugin to the dist profile. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13130) API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13130: - Attachment: HIVE-13130.2.patch added jdbc calls draft#2 > API calls for retrieving primary keys and foreign keys information > --- > > Key: HIVE-13130 > URL: https://issues.apache.org/jira/browse/HIVE-13130 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch > > > ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes > getPrimaryKeys and getCrossReference API calls. We need to provide these > interfaces as part of PK/FK implementation in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13144) HS2 can leak ZK ACL objects when curator retries to create the persistent ephemeral node
[ https://issues.apache.org/jira/browse/HIVE-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166928#comment-15166928 ] Hive QA commented on HIVE-13144: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789565/HIVE-13144.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7088/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7088/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7088/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7088/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at e9b7348 HIVE-13101: NullPointerException in HiveLexer.g (Sandeep via Xuefu) + git clean -f -d Removing data/files/timestamps.txt Removing ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestTimestampWritableAndColumnVector.java Removing ql/src/test/queries/clientpositive/vector_interval_arithmetic.q Removing ql/src/test/results/clientpositive/tez/vector_interval_arithmetic.q.out Removing ql/src/test/results/clientpositive/tez/vectorized_timestamp.q.out Removing ql/src/test/results/clientpositive/vector_interval_arithmetic.q.out Removing storage-api/src/java/org/apache/hadoop/hive/common/type/HiveIntervalDayTime.java Removing storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/IntervalDayTimeColumnVector.java Removing storage-api/src/java/org/apache/hive/common/util/IntervalDayTimeUtils.java Removing storage-api/src/test/org/apache/hadoop/hive/ql/exec/vector/TestTimestampColumnVector.java + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at e9b7348 HIVE-13101: NullPointerException in HiveLexer.g (Sandeep via Xuefu) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12789565 - PreCommit-HIVE-TRUNK-Build > HS2 can leak ZK ACL objects when curator retries to create the persistent > ephemeral node > > > Key: HIVE-13144 > URL: https://issues.apache.org/jira/browse/HIVE-13144 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13144.1.patch > > > When the node gets deleted from ZK due to connection loss and curator tries > to recreate the node, it might leak ZK ACL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862
[ https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166921#comment-15166921 ] Hive QA commented on HIVE-13111: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789569/HIVE-13111.02.patch {color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9828 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFUnixTimeStampTimestamp org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testRepeating org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7087/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7087/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7087/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12789569 - PreCommit-HIVE-TRUNK-Build > Fix timestamp / interval_day_time wrong results with HIVE-9862 > --- > > Key: HIVE-13111 > URL: https://issues.apache.org/jira/browse/HIVE-13111 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch > > > Fix timestamp / interval_day_time issues discovered when testing the > Vectorized Text patch. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry
[ https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12935: - Attachment: HIVE-12935.5.patch Named notification handler threadpool, also has fix for ACL leak in zk connection. > LLAP: Replace Yarn registry with Zookeeper registry > --- > > Key: HIVE-12935 > URL: https://issues.apache.org/jira/browse/HIVE-12935 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, > HIVE-12935.4.patch, HIVE-12935.5.patch > > > Existing YARN registry service for cluster membership has to depend on > refresh intervals to get the list of instances/daemons that are running in > the cluster. Better approach would be replace it with zookeeper based > registry service so that custom listeners can be added to update healthiness > of daemons in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
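The motivation above — replacing interval-based refresh with listener callbacks — can be sketched as a toy in-process registry. This is an illustrative model only, not the Hive/ZooKeeper implementation; the names `Registry`, `add_listener`, `register`, and `deregister` are hypothetical, standing in for ZooKeeper ephemeral znodes plus child watches.

```python
# Toy membership registry with watch-style callbacks: listeners are told of
# changes immediately, instead of discovering them on the next refresh cycle.
class Registry:
    def __init__(self):
        self._members = set()
        self._listeners = []

    def add_listener(self, callback):
        # Each callback receives a membership snapshot, analogous to a
        # ZooKeeper child watch firing on the registry path.
        self._listeners.append(callback)

    def register(self, daemon):
        self._members.add(daemon)
        self._notify()

    def deregister(self, daemon):
        self._members.discard(daemon)
        self._notify()

    def _notify(self):
        snapshot = frozenset(self._members)
        for cb in self._listeners:
            cb(snapshot)


events = []
reg = Registry()
reg.add_listener(lambda members: events.append(sorted(members)))
reg.register("llap-daemon-1")
reg.register("llap-daemon-2")
reg.deregister("llap-daemon-1")
print(events[-1])  # prints ['llap-daemon-2']
```

In the real system, a daemon crash would delete its ephemeral znode and fire the watch, which is what lets health updates propagate without waiting out a refresh interval.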
[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join
[ https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166909#comment-15166909 ] Gopal V commented on HIVE-13082: The predicate is actually folded to 1=1 because the actual keys don't count. select * from a where id IN (select b.id from b) and a.id = 1; folds into select * from a where id IN (select 1 from b where b.id = 1) and a.id = 1; After CBO, it gets rewritten as select * from a left semi join b on 1 = 1 where a.id = 1 and b.id = 1; And the 2nd constant folding pass does select * from a left semi join b where a.id = 1 and b.id = 1; accidentally dropping the ON clause & turning it into a keyless cross-product, which turns off the implicit "distinct " injected by the left semi join since there's no key anymore. > Enable constant propagation optimization in query with left semi join > - > > Key: HIVE-13082 > URL: https://issues.apache.org/jira/browse/HIVE-13082 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, > HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch > > > Currently constant folding is only allowed for inner or unique join, I think > it is also applicable and allowed for left semi join. Otherwise the query > like following having multiple joins with left semi joins will fail: > {code} > select table1.id, table1.val, table2.val2 from table1 inner join table2 on > table1.val = 't1val01' and table1.id = table2.id left semi join table3 on > table1.dimid = table3.id; > {code} > with errors: > {code} > java.lang.Exception: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] 
> at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) > [hadoop-mapreduce-client-common-2.6.0.jar:?] > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > ~[hadoop-common-2.6.0.jar:?] > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > ~[hadoop-common-2.6.0.jar:?] > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > ~[hadoop-common-2.6.0.jar:?] > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > ~[hadoop-mapreduce-client-core-2.6.0.jar:?] > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) > ~[hadoop-mapreduce-client-common-2.6.0.jar:?] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[?:1.7.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[?:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > ~[?:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45] > ... 
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45] > at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78) >
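The folding walkthrough in Gopal V's comment above can be modeled with a small sketch. This is not Hive's execution code; `left_semi_join` and `keyless_semi_join` are hypothetical helpers that illustrate why the key-based probe gives the semi join its implicit "distinct", and why dropping the ON clause loses it.

```python
# Standard LEFT SEMI JOIN semantics: a left row is emitted at most once if it
# has at least one key match on the right -- the key probe is what dedups.
def left_semi_join(left, right, key):
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] in right_keys]

# What the folded plan degenerates into: with no join key (ON clause dropped),
# there is nothing to probe or dedup on, so every left row pairs with every
# right row -- a keyless cross product.
def keyless_semi_join(left, right):
    return [l for l in left for _ in right]

a = [{"id": 1, "val": "x"}, {"id": 2, "val": "y"}]
b = [{"id": 1}, {"id": 1}]

semi = left_semi_join(a, b, "id")      # id=1 emitted once despite 2 matches
cross = keyless_semi_join(a, b)        # 2 x 2 = 4 rows, duplicates included
print(len(semi), len(cross))           # prints 1 4
```

The duplicate rows in the keyless version are the observable symptom of folding `ON 1 = 1` away: the "exists at least once" check silently becomes "once per right row".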
[jira] [Updated] (HIVE-13153) SessionID is appended to thread name twice
[ https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13153: - Attachment: HIVE-13153.2.patch Switched log lines before renaming thread per [~gopalv]'s comments. > SessionID is appended to thread name twice > -- > > Key: HIVE-13153 > URL: https://issues.apache.org/jira/browse/HIVE-13153 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch > > > HIVE-12249 added sessionId to thread name. In some cases the sessionId could > be appended twice. Example log line > {code} > DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 > 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
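The double-append bug described above can be avoided with an idempotent rename: only prepend the sessionId if it is not already the thread name's prefix. This is a minimal sketch of that guard, not the actual HIVE-13153 fix; `with_session_id` is a hypothetical helper.

```python
def with_session_id(session_id, thread_name):
    # Idempotent: re-applying the same sessionId leaves the name unchanged,
    # so code paths that rename twice cannot produce "<id> <id> main".
    prefix = session_id + " "
    if thread_name.startswith(prefix):
        return thread_name
    return prefix + thread_name

sid = "6432ec22-9f66-4fa5-8770-488a9d3f0b61"
name = with_session_id(sid, "main")
name = with_session_id(sid, name)  # applied a second time, still one prefix
print(name)
```

Applying the rename through a guard like this makes the operation safe regardless of how many call sites touch the thread name.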
[jira] [Updated] (HIVE-13153) SessionID is appended to thread name twice
[ https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13153: - Status: Patch Available (was: Open) > SessionID is appended to thread name twice > -- > > Key: HIVE-13153 > URL: https://issues.apache.org/jira/browse/HIVE-13153 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13153.1.patch > > > HIVE-12249 added sessionId to thread name. In some cases the sessionId could > be appended twice. Example log line > {code} > DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 > 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)