[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168575#comment-15168575
 ] 

Prasanth Jayachandran commented on HIVE-12935:
--

Sure. I will do the nightly run tonight. 

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch, HIVE-12935.7.patch
>
>
> The existing YARN registry service for cluster membership has to rely on 
> refresh intervals to get the list of instances/daemons running in the 
> cluster. A better approach is to replace it with a ZooKeeper-based 
> registry service, so that custom listeners can be added to track the 
> health of daemons in the cluster.
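The listener-based design argued for above can be sketched as follows. This is an in-process Python stand-in, not Hive's actual registry code; a real implementation would use ZooKeeper ephemeral znodes and watches (e.g. via Apache Curator), and all class and instance names here are illustrative:

```python
# Illustrative sketch only: an in-process stand-in for the watch-based
# membership updates a ZooKeeper-backed registry provides. All names
# (ServiceRegistry, the daemon addresses) are hypothetical.

class ServiceRegistry:
    def __init__(self):
        self._instances = set()
        self._listeners = []

    def add_listener(self, callback):
        # Fires on every membership change; no refresh interval needed.
        self._listeners.append(callback)

    def register(self, instance):
        self._instances.add(instance)
        self._notify("ADDED", instance)

    def unregister(self, instance):
        # With ZooKeeper, an ephemeral node vanishing on session loss
        # would trigger the same notification automatically.
        self._instances.discard(instance)
        self._notify("REMOVED", instance)

    def _notify(self, event, instance):
        for callback in self._listeners:
            callback(event, instance)

events = []
registry = ServiceRegistry()
registry.add_listener(lambda event, inst: events.append((event, inst)))

registry.register("llap-daemon-1:15001")
registry.register("llap-daemon-2:15001")
registry.unregister("llap-daemon-2:15001")   # e.g. the daemon died

# Listeners observed each change as it happened; a poll-based registry
# would only notice the net effect at its next refresh interval.
print(events)
```

The contrast with the YARN registry is the push model: health changes reach interested parties immediately instead of at the next poll.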



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13169) HiveServer2: Support delegation token based connection when using http transport

2016-02-25 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13169:

Affects Version/s: 1.2.1
   2.0.0

> HiveServer2: Support delegation token based connection when using http 
> transport
> 
>
> Key: HIVE-13169
> URL: https://issues.apache.org/jira/browse/HIVE-13169
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> HIVE-5155 introduced support for delegation token based connection. However, 
> it was intended for tcp transport mode. We need to have similar mechanisms 
> for http transport.





[jira] [Updated] (HIVE-13169) HiveServer2: Support delegation token based connection when using http transport

2016-02-25 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13169:

Description: HIVE-5155 introduced support for delegation token based 
connection. However, it was intended for tcp transport mode. We need to have 
similar mechanisms for http transport.  (was: 
[HIVE-5155|https://issues.apache.org/jira/browse/HIVE-5155] introduced support 
for delegation token based connection. However, it was intended for tcp 
transport mode. We need to have similar mechanisms for http transport.)

> HiveServer2: Support delegation token based connection when using http 
> transport
> 
>
> Key: HIVE-13169
> URL: https://issues.apache.org/jira/browse/HIVE-13169
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> HIVE-5155 introduced support for delegation token based connection. However, 
> it was intended for tcp transport mode. We need to have similar mechanisms 
> for http transport.





[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168554#comment-15168554
 ] 

Siddharth Seth commented on HIVE-12935:
---

+1.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch, HIVE-12935.7.patch
>
>
> The existing YARN registry service for cluster membership has to rely on 
> refresh intervals to get the list of instances/daemons running in the 
> cluster. A better approach is to replace it with a ZooKeeper-based 
> registry service, so that custom listeners can be added to track the 
> health of daemons in the cluster.





[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168555#comment-15168555
 ] 

Hive QA commented on HIVE-13013:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789801/HIVE-13013.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9828 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestTxnHandlerNegative.testBadConnection
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7095/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7095/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7095/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789801 - PreCommit-HIVE-TRUNK-Build

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support.  This will reduce the number of deadlocks in the DBs.
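Why row-level locking can replace Serializable isolation for these read-modify-write sequences can be illustrated with a small in-process simulation. The real change is SQL issued by TxnHandler against the metastore RDBMS; this Python sketch only mimics the locking discipline of SELECT ... FOR UPDATE under READ_COMMITTED, and the row keys are hypothetical:

```python
import threading

# Two "rows" of a metastore table, each protected by its own lock, the way
# SELECT ... FOR UPDATE locks only the rows it reads. Transactions touching
# different rows never block each other, unlike a global Serializable regime.
table = {"txn-1": 0, "txn-2": 0}
row_locks = {key: threading.Lock() for key in table}

def update_row(key):
    # "SELECT ... FOR UPDATE": lock just this row for the read-modify-write.
    with row_locks[key]:
        current = table[key]       # read
        table[key] = current + 1   # modify + write

threads = [threading.Thread(target=update_row, args=(key,))
           for key in table for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No lost updates: each row was incremented exactly 100 times.
print(table)
```

With coarse serialization every one of the 200 updates would queue behind a single lock; with per-row locking only updates to the same row serialize, which is the concurrency win the issue describes.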





[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168545#comment-15168545
 ] 

Gopal V commented on HIVE-12935:


Not sure I can do a nightly run tonight - can you kick off a run with a 
Chaosmonkey interval of 120s and at least 4 nodes, and run q55-random.sql at 
1TB scale to validate this?

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch, HIVE-12935.7.patch
>
>
> The existing YARN registry service for cluster membership has to rely on 
> refresh intervals to get the list of instances/daemons running in the 
> cluster. A better approach is to replace it with a ZooKeeper-based 
> registry service, so that custom listeners can be added to track the 
> health of daemons in the cluster.





[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12935:
-
Attachment: HIVE-12935.7.patch

added proper synchronization per Sid's comments.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch, HIVE-12935.7.patch
>
>
> The existing YARN registry service for cluster membership has to rely on 
> refresh intervals to get the list of instances/daemons running in the 
> cluster. A better approach is to replace it with a ZooKeeper-based 
> registry service, so that custom listeners can be added to track the 
> health of daemons in the cluster.





[jira] [Commented] (HIVE-13108) Operators: SORT BY randomness is not safe with network partitions

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168534#comment-15168534
 ] 

Gopal V commented on HIVE-13108:


[~sershe]: +1? :)

> Operators: SORT BY randomness is not safe with network partitions
> -
>
> Key: HIVE-13108
> URL: https://issues.apache.org/jira/browse/HIVE-13108
> Project: Hive
>  Issue Type: Bug
>  Components: Spark, Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.0.1
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13108.1.patch
>
>
> SORT BY relies on a transient Random object, which is initialized once per 
> deserialize operation.
> This results in complications during a network partition and when Tez/Spark 
> reuses a cached plan.
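The hazard described above can be sketched in Python, with pickling standing in for Java serialization and `__getstate__`/`__setstate__` mimicking a `transient` field. This is not Hive's actual operator code; the class, the seed derivation, and the vertex/task identifiers are all illustrative:

```python
import pickle
import random
import zlib

class SortByOperator:
    """Toy stand-in for an operator whose RNG is a transient field."""
    def __init__(self, num_reducers, seed=None):
        self.num_reducers = num_reducers
        self.seed = seed                      # ordinary field: survives serialization
        self._rng = random.Random(seed)       # 'transient': dropped below

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_rng"]                     # mimic Java's transient keyword
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._rng = random.Random(self.seed)  # re-created on every deserialize

    def route(self, _row):
        return self._rng.randrange(self.num_reducers)

# Unseeded: each deserialization (original attempt, retry after a network
# partition, cached-plan reuse) builds a freshly seeded RNG, so the same
# rows can land on different reducers across attempts.
plan = pickle.dumps(SortByOperator(num_reducers=4))
op_try1 = pickle.loads(plan)
op_try2 = pickle.loads(plan)                  # e.g. a task retry
routes1 = [op_try1.route(row) for row in range(20)]
routes2 = [op_try2.route(row) for row in range(20)]
print(routes1 == routes2)                     # almost certainly False

# Seeding from stable task metadata makes retries repeatable:
seed = zlib.crc32(b"vertex_1/task_0")         # hypothetical stable identity
stable_plan = pickle.dumps(SortByOperator(num_reducers=4, seed=seed))
op_a, op_b = pickle.loads(stable_plan), pickle.loads(stable_plan)
print([op_a.route(r) for r in range(20)] == [op_b.route(r) for r in range(20)])
```

The second comparison prints True because both deserializations rebuild the RNG from the same persisted seed, which is the kind of fix the issue is after.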





[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12935:
-
Attachment: HIVE-12935.6.patch

Addressed [~sseth]'s review comments.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch, HIVE-12935.6.patch
>
>
> The existing YARN registry service for cluster membership has to rely on 
> refresh intervals to get the list of instances/daemons running in the 
> cluster. A better approach is to replace it with a ZooKeeper-based 
> registry service, so that custom listeners can be added to track the 
> health of daemons in the cluster.





[jira] [Commented] (HIVE-13122) LLAP: simple Model/View separation for UI

2016-02-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168516#comment-15168516
 ] 

Siddharth Seth commented on HIVE-13122:
---

+1.

> LLAP: simple Model/View separation for UI
> -
>
> Key: HIVE-13122
> URL: https://issues.apache.org/jira/browse/HIVE-13122
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13122.1.patch, HIVE-13122.2.patch
>
>
> The current LLAP UI in master uses a single fixed loop to both extract the 
> data and display it.
> Split this up into a model and a view, for modularity.
> NO PRECOMMIT TESTS
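The shape of the split described above can be sketched as follows. This is a minimal Python illustration of the model/view separation, not the actual LLAP UI code; the metric names and the `fake_source` stand-in are hypothetical:

```python
class MetricsModel:
    """Model: owns data extraction and caching; knows nothing about display."""
    def __init__(self, source):
        self._source = source
        self._snapshot = {}

    def refresh(self):
        self._snapshot = dict(self._source())

    def snapshot(self):
        return dict(self._snapshot)

class TextView:
    """View: renders a snapshot; knows nothing about where the data came from."""
    def render(self, snapshot):
        return "\n".join(f"{k}: {v}" for k, v in sorted(snapshot.items()))

def fake_source():
    # Stand-in for scraping a daemon's metrics endpoint.
    return {"executors_used": 3, "cache_hit_ratio": 0.92}

model = MetricsModel(fake_source)
model.refresh()                        # extraction step
view = TextView()
print(view.render(model.snapshot()))   # display step, fully decoupled
```

Because the view only consumes a plain snapshot, swapping the text renderer for an HTML one (or mocking the source in tests) touches neither side of the other, which is the modularity the patch is after.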





[jira] [Commented] (HIVE-13166) Log the selection from llap decider

2016-02-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168506#comment-15168506
 ] 

Sergey Shelukhin commented on HIVE-13166:
-

That's part of explain...

> Log the selection from llap decider
> ---
>
> Key: HIVE-13166
> URL: https://issues.apache.org/jira/browse/HIVE-13166
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Siddharth Seth
>
> The llap decider logs when it considers a vertex; however, the actual 
> placement (llap, container, etc.) is not logged.





[jira] [Updated] (HIVE-13167) LLAP: Remove yarn-site resource from zookeeper based registry

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13167:
-
Priority: Minor  (was: Major)

> LLAP: Remove yarn-site resource from zookeeper based registry
> -
>
> Key: HIVE-13167
> URL: https://issues.apache.org/jira/browse/HIVE-13167
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
>
> With the ZooKeeper registry, adding the yarn-site.xml resource is no longer 
> required.
> The following line should be removed from LlapZookeeperRegistryImpl:
> {code}
> this.conf.addResource(YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
> {code}





[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168450#comment-15168450
 ] 

Hive QA commented on HIVE-12679:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789639/HIVE-12679.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9816 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7094/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7094/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7094/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789639 - PreCommit-HIVE-TRUNK-Build

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Affects Versions: 2.1.0
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would let users choose an 
> implementation of IMetaStoreClient via HiveConf, i.e. hive-site.xml. 
> Currently, the choice is hard-coded to SessionHiveMetaStoreClient in 
> org.apache.hadoop.hive.ql.metadata.Hive. There is no direct reference to 
> SessionHiveMetaStoreClient other than the hard-coded class name in 
> Hive.java, and the QL component operates only on the IMetaStoreClient 
> interface, so the change would be minimal and quite similar to how an 
> implementation of RawStore is specified and loaded in hive-metastore. 
> One use case this change would serve is a user who wishes to use an 
> implementation of this interface without the dependency on the Thrift 
> server.
>   
> Thank you,
> Austin
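The RawStore-style reflective loading the proposal refers to can be sketched like this. It is a Python stand-in (Hive would do the equivalent in Java via `Class.forName`), and the config key and substitute classes shown are hypothetical:

```python
import importlib

# Sketch: resolve an implementation class from configuration instead of
# hard-coding it. The key "hive.metastore.client.impl" is made up for
# illustration, as are the stand-in classes being loaded.
conf = {"hive.metastore.client.impl": "collections.OrderedDict"}

def load_impl(conf, key, default="builtins.dict"):
    """Resolve a 'module.ClassName' string from config and return the class."""
    module_name, _, class_name = conf.get(key, default).rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)

client_cls = load_impl(conf, "hive.metastore.client.impl")
print(client_cls.__name__)   # the configured class, swapped in with no code change
```

Callers that depend only on the interface never notice which concrete class was loaded, which is why the QL-side change would be small.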





[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11675:

Attachment: HIVE-11675.06.patch

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we make 
> a filtering metastore call for each partition, so perhaps we'd need the 
> custom sync point/batching after all.
> Alternatively, we can make it opportunistic and not fetch any footers unless 
> they can be pushed down to the metastore or fetched from the local cache; 
> that way the only slow threaded op is directory listings.





[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread chillon_m (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168422#comment-15168422
 ] 

chillon_m commented on HIVE-13165:
--

Setting hive.query.result.fileformat=SequenceFile works well.
Thanks.


> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query adds 'order by' or 'sort by' is 
> different from the result set without it: both the number of rows and the 
> row values differ.
> with order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
> order by id;
> INFO  : Number of reduce tasks determined at compile time: 1
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=<number>
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapreduce.job.reduces=<number>
> WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
> interface and execute your application with ToolRunner to remedy this.
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_1456383638304_0008
> INFO  : The url to track the job: 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Starting Job = job_1456383638304_0008, Tracking URL = 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop 
> job  -kill job_1456383638304_0008
> INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
> reducers: 0
> INFO  : 2016-02-26 11:06:55,493 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2016-02-26 11:07:01,710 Stage-1 map = 100%,  reduce = 0%
> INFO  : 2016-02-26 11:07:04,815 Stage-1 map = 100%,  reduce = 100%
> INFO  : Ended Job = job_1456383638304_0008
> ++--+---+--+
> | temp.type  | temp.id  | temp.msgdata  |
> ++--+---+--+
> | -1000  | 163437   |   we come:  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> ++--+---+--+
> 11 rows selected (16.191 seconds)
> without order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
> ++--+---+--+
> | temp.type  | temp.id  | 
>   temp.msgdata
> |
> ++--+---+--+
> | -1000  | 163437   |   we come:
> sadferqgb gtrhyj hytjyjuk  nhmuykiluil
> hthnynmkukmhrj,  |
> ++--+---+--+
> 1 row selected (18.245 seconds)





[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168418#comment-15168418
 ] 

Gopal V commented on HIVE-13165:


FYI, what I meant was that without that optimization, even the withOrderBy will 
fail.

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query adds 'order by' or 'sort by' is 
> different from the result set without it: both the number of rows and the 
> row values differ.
> with order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
> order by id;
> INFO  : Number of reduce tasks determined at compile time: 1
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=<number>
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapreduce.job.reduces=<number>
> WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
> interface and execute your application with ToolRunner to remedy this.
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_1456383638304_0008
> INFO  : The url to track the job: 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Starting Job = job_1456383638304_0008, Tracking URL = 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop 
> job  -kill job_1456383638304_0008
> INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
> reducers: 0
> INFO  : 2016-02-26 11:06:55,493 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2016-02-26 11:07:01,710 Stage-1 map = 100%,  reduce = 0%
> INFO  : 2016-02-26 11:07:04,815 Stage-1 map = 100%,  reduce = 100%
> INFO  : Ended Job = job_1456383638304_0008
> ++--+---+--+
> | temp.type  | temp.id  | temp.msgdata  |
> ++--+---+--+
> | -1000  | 163437   |   we come:  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> | NULL   | NULL | NULL  |
> ++--+---+--+
> 11 rows selected (16.191 seconds)
> without order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
> ++--+---+--+
> | temp.type  | temp.id  | 
>   temp.msgdata
> |
> ++--+---+--+
> | -1000  | 163437   |   we come:
> sadferqgb gtrhyj hytjyjuk  nhmuykiluil
> hthnynmkukmhrj,  |
> ++--+---+--+
> 1 row selected (18.245 seconds)





[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread chillon_m (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168417#comment-15168417
 ] 

chillon_m commented on HIVE-13165:
--

[bigdata@namenode hive-1.2.1]$ bin/beeline  -u 
jdbc:hive2://namenode:1/default bigdata -n bigdata
Connecting to jdbc:hive2://namenode:1/default
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1 by Apache Hive
0: jdbc:hive2://namenode:1/default> set hive.fetch.task.conversion=none;
No rows affected (0.051 seconds)
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
type,id,msgData from messages where Num='41433141' and erNum='99841977') 
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
order by id;
INFO  : Number of reduce tasks determined at compile time: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
interface and execute your application with ToolRunner to remedy this.
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1456383638304_0009
INFO  : The url to track the job: 
http://namenode:8088/proxy/application_1456383638304_0009/
INFO  : Starting Job = job_1456383638304_0009, Tracking URL = 
http://namenode:8088/proxy/application_1456383638304_0009/
INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop job 
 -kill job_1456383638304_0009
INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
reducers: 0
INFO  : 2016-02-26 12:22:21,928 Stage-1 map = 0%,  reduce = 0%
INFO  : 2016-02-26 12:22:29,178 Stage-1 map = 100%,  reduce = 0%
INFO  : 2016-02-26 12:22:32,269 Stage-1 map = 100%,  reduce = 100%
INFO  : Ended Job = job_1456383638304_0009
++--+---+--+
| temp.type  | temp.id  | temp.msgdata  |
++--+---+--+
| -1000  | 163437   |   we come:  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
| NULL   | NULL | NULL  |
++--+---+--+
11 rows selected (15.594 seconds)

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query adds 'order by' or 'sort by' is 
> different from the result set without it: both the number of rows and the 
> row values differ.
> with order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
> order by id;
> INFO  : Number of reduce tasks determined at compile time: 1
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=<number>
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapreduce.job.reduces=<number>
> WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
> interface and execute your application with ToolRunner to remedy this.
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_1456383638304_0008
> INFO  : The url to track the job: 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Starting Job = job_1456383638304_0008, Tracking URL = 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop 
> job  -kill job_1456383638304_0008
> INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
> reducers: 0
> INFO  : 2016-02-26 11:06:55,493 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2016-02-26 

[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread chillon_m (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chillon_m updated HIVE-13165:
-
Description: 
The result set returned when the query adds 'order by' or 'sort by' is 
different from the result set without it: both the number of rows and the row 
values differ.
with order:
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
type,id,msgData from messages where Num='41433141' and erNum='99841977') 
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
order by id;
INFO  : Number of reduce tasks determined at compile time: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
interface and execute your application with ToolRunner to remedy this.
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1456383638304_0008
INFO  : The url to track the job: 
http://namenode:8088/proxy/application_1456383638304_0008/
INFO  : Starting Job = job_1456383638304_0008, Tracking URL = 
http://namenode:8088/proxy/application_1456383638304_0008/
INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop job 
 -kill job_1456383638304_0008
INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
reducers: 0
INFO  : 2016-02-26 11:06:55,493 Stage-1 map = 0%,  reduce = 0%
INFO  : 2016-02-26 11:07:01,710 Stage-1 map = 100%,  reduce = 0%
INFO  : 2016-02-26 11:07:04,815 Stage-1 map = 100%,  reduce = 100%
INFO  : Ended Job = job_1456383638304_0008
+------------+----------+---------------+
| temp.type  | temp.id  | temp.msgdata  |
+------------+----------+---------------+
| -1000      | 163437   |   we come:    |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
| NULL       | NULL     | NULL          |
+------------+----------+---------------+
11 rows selected (16.191 seconds)
without order:
0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
type,id,msgData from messages where Num='41433141' and erNum='99841977') 
0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
+------------+----------+---------------------------------------------+
| temp.type  | temp.id  | temp.msgdata                                |
+------------+----------+---------------------------------------------+
| -1000      | 163437   |   we come:
sadferqgb gtrhyj hytjyjuk  nhmuykiluil
hthnynmkukmhrj,  |
+------------+----------+---------------------------------------------+
1 row selected (18.245 seconds)


[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168413#comment-15168413
 ] 

Gopal V commented on HIVE-13165:


[~chillon_m]: that looks like a simple newline error (the conversion=none 
setting should trigger it for both cases).

If turning off FetchTask reproduces the issue for both cases, then try {{set 
hive.query.result.fileformat=SequenceFile;}} (the default in hive-2.1.0). 
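
Assuming the diagnosis above (a raw newline inside {{msgData}} splitting rows 
in the text-formatted result file), the two suggested settings can be combined 
in one beeline session. The table and column names below are taken from the 
report; the exact output depends on the environment:

{code}
-- disable the FetchTask short-circuit so both query variants run the same path
set hive.fetch.task.conversion=none;
-- write query results as SequenceFile, which does not treat embedded
-- '\n' bytes inside column values as row delimiters
set hive.query.result.fileformat=SequenceFile;

with temp as (select msgType as type, id, msgData
              from messages
              where Num='41433141' and erNum='99841977')
select * from temp where id=163437 order by id;
{code}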

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query includes 'order by' or 'sort by' 
> differs from the one returned without it: both the row count and the row 
> values differ.
> with order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from messages where Num='41433141' and erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437 
> order by id;
> INFO  : Number of reduce tasks determined at compile time: 1
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=<number>
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapreduce.job.reduces=<number>
> WARN  : Hadoop command-line option parsing not performed. Implement the Tool 
> interface and execute your application with ToolRunner to remedy this.
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_1456383638304_0008
> INFO  : The url to track the job: 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Starting Job = job_1456383638304_0008, Tracking URL = 
> http://namenode:8088/proxy/application_1456383638304_0008/
> INFO  : Kill Command = /home/bigdata/hadoop-runtime/hadoop-2.5.2/bin/hadoop 
> job  -kill job_1456383638304_0008
> INFO  : Hadoop job information for Stage-1: number of mappers: 0; number of 
> reducers: 0
> INFO  : 2016-02-26 11:06:55,493 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2016-02-26 11:07:01,710 Stage-1 map = 100%,  reduce = 0%
> INFO  : 2016-02-26 11:07:04,815 Stage-1 map = 100%,  reduce = 100%
> INFO  : Ended Job = job_1456383638304_0008
> +------------+----------+---------------+
> | temp.type  | temp.id  | temp.msgdata  |
> +------------+----------+---------------+
> | -1000      | 163437   |   we come:    |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> | NULL       | NULL     | NULL          |
> +------------+----------+---------------+
> 11 rows selected (16.191 seconds)
> without order:
> 0: jdbc:hive2://namenode:1/default> with temp as (select msgType as  
> type,id,msgData from qqtroopmessages where Num='41433141' and 
> erNum='99841977') 
> 0: jdbc:hive2://namenode:1/default> select * from temp where id=163437;
> +------------+----------+---------------------------------------------+
> | temp.type  | temp.id  | temp.msgdata                                |
> +------------+----------+---------------------------------------------+
> | -1000      | 163437   |   we come:
> sadferqgb gtrhyj hytjyjuk  nhmuykiluil
> hthnynmkukmhrj,  |
> +------------+----------+---------------------------------------------+
> 1 row selected (18.245 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread chillon_m (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chillon_m updated HIVE-13165:
-
Attachment: Hql with order.png

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query includes 'order by' or 'sort by' 
> differs from the one returned without it: both the row count and the row 
> values differ.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168410#comment-15168410
 ] 

Gopal V commented on HIVE-13165:


Try running this by disabling the FetchTask optimizer - {{set 
hive.fetch.task.conversion=none;}}

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png, Hql with order.png
>
>
> The result set returned when the query includes 'order by' or 'sort by' 
> differs from the one returned without it: both the row count and the row 
> values differ.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13165) resultset which query sql add 'order by' or 'sort by' return is different from not add it

2016-02-25 Thread chillon_m (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chillon_m updated HIVE-13165:
-
Attachment: (was: Hql order.png)

> resultset which query sql add 'order by' or 'sort by' return is different 
> from not add it 
> --
>
> Key: HIVE-13165
> URL: https://issues.apache.org/jira/browse/HIVE-13165
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1
> Environment: hadoop 2.5.2 hive 1.2.1 
>Reporter: chillon_m
>Assignee: Vaibhav Gumashta
> Attachments: Hql not order.png
>
>
> The result set returned when the query includes 'order by' or 'sort by' 
> differs from the one returned without it: both the row count and the row 
> values differ.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join

2016-02-25 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang resolved HIVE-13164.

Resolution: Invalid

> Predicate pushdown may cause cross-product in left semi join
> 
>
> Key: HIVE-13164
> URL: https://issues.apache.org/jira/browse/HIVE-13164
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> For some left semi join queries like followings:
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t2.value = 'val_0';
> or 
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t1.value = 'val_0';
> Their plans show that they have been converted to keyless cross-product due 
> to the predicate pushdown and the dropping of the on condition.
> {code}
> LOGICAL PLAN:
> t1:t1 
>   TableScan (TS_0)
> alias: t1
> Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_18)
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_2)
> Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator (RS_9)
>   sort order: 
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
>   Join Operator (JOIN_11)
> condition map:
>  Left Semi Join 0 to 1
> keys:
>   0 
>   1 
> Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_13)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_14)
> sort order: 
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
> value expressions: _col0 (type: bigint)
> Group By Operator (GBY_15)
>   aggregations: count(VALUE._col0)
>   mode: mergepartial
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator (FS_17)
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> t2:t2 
>   TableScan (TS_3)
> alias: t2
> Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_19)
>   predicate: ((key = 0) and (value = 'val_0')) (type: boolean)
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_5)
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_8)
>   keys: 'val_0' (type: string)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_10)
> sort order: 
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Join Operator (JOIN_11)
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 
> 1 
>   Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> {code}
> [~gopalv], do you think these plans are valid or not? Thanks 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join

2016-02-25 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168395#comment-15168395
 ] 

Chaoyu Tang commented on HIVE-13164:


Yeah, with the t1.key = t2.key, the query plan looks right and there is no 
cross-product. Thanks for pointing that out.
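
For reference, a hypothetical corrected form of the reported query: once an 
equi-join key ties the two sides together (here {{t1.key = t2.key}}, with 
{{key}} added to each subquery projection), the plan becomes a regular left 
semi join instead of a keyless cross-product:

{code}
select count(1)
from (select key, value from t1 where key = 0) t1
left semi join
     (select key, value from t2 where key = 0) t2
  on t1.key = t2.key and t2.value = 'val_0';
{code}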

> Predicate pushdown may cause cross-product in left semi join
> 
>
> Key: HIVE-13164
> URL: https://issues.apache.org/jira/browse/HIVE-13164
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> For some left semi join queries like followings:
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t2.value = 'val_0';
> or 
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t1.value = 'val_0';
> Their plans show that they have been converted to keyless cross-product due 
> to the predicate pushdown and the dropping of the on condition.
> {code}
> LOGICAL PLAN:
> t1:t1 
>   TableScan (TS_0)
> alias: t1
> Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_18)
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_2)
> Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator (RS_9)
>   sort order: 
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
>   Join Operator (JOIN_11)
> condition map:
>  Left Semi Join 0 to 1
> keys:
>   0 
>   1 
> Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_13)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_14)
> sort order: 
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
> value expressions: _col0 (type: bigint)
> Group By Operator (GBY_15)
>   aggregations: count(VALUE._col0)
>   mode: mergepartial
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator (FS_17)
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> t2:t2 
>   TableScan (TS_3)
> alias: t2
> Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_19)
>   predicate: ((key = 0) and (value = 'val_0')) (type: boolean)
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_5)
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_8)
>   keys: 'val_0' (type: string)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_10)
> sort order: 
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Join Operator (JOIN_11)
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 
> 1 
>   Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> {code}
> [~gopalv], do you think these plans are valid or not? Thanks 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13164) Predicate pushdown may cause cross-product in left semi join

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168381#comment-15168381
 ] 

Gopal V commented on HIVE-13164:


[~ctang.ma]: that actually looks like a cross-product even pre-optimization. 
The optimizer is not the one generating a cross-product unless there's a 
missing t1.key = t2.key there?

> Predicate pushdown may cause cross-product in left semi join
> 
>
> Key: HIVE-13164
> URL: https://issues.apache.org/jira/browse/HIVE-13164
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> For some left semi join queries like followings:
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t2.value = 'val_0';
> or 
> select count(1) from (select value from t1 where key = 0) t1 left semi join 
> (select value from t2 where key = 0) t2 on t1.value = 'val_0';
> Their plans show that they have been converted to keyless cross-product due 
> to the predicate pushdown and the dropping of the on condition.
> {code}
> LOGICAL PLAN:
> t1:t1 
>   TableScan (TS_0)
> alias: t1
> Statistics: Num rows: 1453 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_18)
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_2)
> Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator (RS_9)
>   sort order: 
>   Statistics: Num rows: 726 Data size: 2904 Basic stats: COMPLETE 
> Column stats: NONE
>   Join Operator (JOIN_11)
> condition map:
>  Left Semi Join 0 to 1
> keys:
>   0 
>   1 
> Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_13)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_14)
> sort order: 
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
> value expressions: _col0 (type: bigint)
> Group By Operator (GBY_15)
>   aggregations: count(VALUE._col0)
>   mode: mergepartial
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator (FS_17)
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> t2:t2 
>   TableScan (TS_3)
> alias: t2
> Statistics: Num rows: 645 Data size: 5812 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator (FIL_19)
>   predicate: ((key = 0) and (value = 'val_0')) (type: boolean)
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator (SEL_5)
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator (GBY_8)
>   keys: 'val_0' (type: string)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator (RS_10)
> sort order: 
> Statistics: Num rows: 161 Data size: 1450 Basic stats: COMPLETE 
> Column stats: NONE
> Join Operator (JOIN_11)
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 
> 1 
>   Statistics: Num rows: 798 Data size: 3194 Basic stats: COMPLETE 
> Column stats: NONE
> {code}
> [~gopalv], do you think these plans are valid or not? Thanks 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13153) SessionID is appended to thread name twice

2016-02-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168377#comment-15168377
 ] 

Prasanth Jayachandran commented on HIVE-13153:
--

I just copied over the same log level from the reset-thread-name logic. No, I 
think both are setting the thread name in the original code; resetting in 
CliDriver is done elsewhere. 

> SessionID is appended to thread name twice
> --
>
> Key: HIVE-13153
> URL: https://issues.apache.org/jira/browse/HIVE-13153
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch
>
>
> HIVE-12249 added sessionId to thread name. In some cases the sessionId could 
> be appended twice. Example log line
> {code}
> DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 
> 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main]
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join

2016-02-25 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168375#comment-15168375
 ] 

Chaoyu Tang commented on HIVE-13082:


[~gopalv] Could you take a look at HIVE-13164 to see if it makes sense or not 
based on our discussion here? Thanks

> Enable constant propagation optimization in query with left semi join
> -
>
> Key: HIVE-13082
> URL: https://issues.apache.org/jira/browse/HIVE-13082
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, 
> HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch
>
>
> Currently constant folding is only allowed for inner or unique join, I think 
> it is also applicable and allowed for left semi join. Otherwise the query 
> like following having multiple joins with left semi joins will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>   at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> {code}



--
This message was sent 

[jira] [Commented] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN

2016-02-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168366#comment-15168366
 ] 

Prasanth Jayachandran commented on HIVE-13163:
--

LGTM, +1

> ORC MemoryManager thread checks are fatal, should WARN 
> ---
>
> Key: HIVE-13163
> URL: https://issues.apache.org/jira/browse/HIVE-13163
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: PIG
> Attachments: HIVE-13163.1.patch
>
>
> The MemoryManager is tied to a WriterOptions on create, which can occur in a 
> different thread from the writer calls.
> This is unexpected, but safe and needs a warning not a fatal.
> {code}
>   /**
>* Light weight thread-safety check for multi-threaded access patterns
>*/
>   private void checkOwner() {
> Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(),
> "Owner thread expected %s, got %s",
> ownerLock.getOwner(),
> Thread.currentThread());
>   }
> {code}
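
A non-fatal variant of the quoted check might look like the sketch below. This 
is illustrative only: class and member names such as {{OwnerCheck}} and 
{{ownerName}} are invented for the example and are not taken from the attached 
patch.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: the fatal Preconditions.checkArgument above replaced by a
// WARN-and-continue path, so a call from a foreign thread only logs.
public class OwnerCheck {
    private final ReentrantLock ownerLock = new ReentrantLock();
    private final String ownerName;
    private volatile String lastWarning;   // captured so main() can inspect it

    public OwnerCheck() {
        ownerLock.lock();                  // the creating thread becomes owner
        ownerName = Thread.currentThread().getName();
    }

    /** Light-weight thread-safety check: warn instead of throwing. */
    public boolean checkOwner() {
        if (ownerLock.isHeldByCurrentThread()) {
            return true;
        }
        lastWarning = String.format("Owner thread expected %s, got %s",
                ownerName, Thread.currentThread().getName());
        System.out.println("WARN: " + lastWarning); // stand-in for LOG.warn()
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        OwnerCheck mgr = new OwnerCheck();
        if (!mgr.checkOwner()) {
            throw new AssertionError("owner thread must pass the check");
        }
        Thread other = new Thread(() -> {
            if (mgr.checkOwner()) {        // different thread: should only warn
                throw new AssertionError("foreign thread should not pass");
            }
        });
        other.start();
        other.join();
        System.out.println(mgr.lastWarning == null ? "silent" : "warned");
    }
}
```

A call from a non-owner thread then logs the mismatch and returns false rather 
than aborting the write.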



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN

2016-02-25 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13163:
---
Status: Patch Available  (was: Open)

> ORC MemoryManager thread checks are fatal, should WARN 
> ---
>
> Key: HIVE-13163
> URL: https://issues.apache.org/jira/browse/HIVE-13163
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: PIG
> Attachments: HIVE-13163.1.patch
>
>
> The MemoryManager is tied to a WriterOptions on create, which can occur in a 
> different thread from the writer calls.
> This is unexpected, but safe and needs a warning not a fatal.
> {code}
>   /**
>* Light weight thread-safety check for multi-threaded access patterns
>*/
>   private void checkOwner() {
> Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(),
> "Owner thread expected %s, got %s",
> ownerLock.getOwner(),
> Thread.currentThread());
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168293#comment-15168293
 ] 

Jason Dere commented on HIVE-13063:
---

Took a quick look at the updated patch:
- No need to update itests/src/test/resources/testconfiguration.properties
- You will need to generate the golden files for the qfile tests you added - 
they will be .q.out files.  See 
https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoIupdatetheoutputofaCliDrivertestcase
- In UDFChar.java: can you change this to "chr"
{code}
+@Description(name = "char"
{code}


> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'
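
Based on the spec above, expected behavior of the new UDFs might look like the 
following (hypothetical calls; the actual signatures are whatever the attached 
patch defines):

{code}
SELECT CHR(72);                              -- 'H'
SELECT CHR(-1), CHR(300);                    -- outside [0, 256): empty string
SELECT CHR(0);                               -- NULL
SELECT REPLACE('Hack and Hue', 'H', 'BL');   -- 'BLack and BLue'
{code}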



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Description: 
Create UDFS for these functions.

CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If n 
is less than 0 or greater than 255, return the empty string. If n is 0, return 
null.

REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
Equals 'BLack and BLue'

  was:
Create UDFS for these functions.

CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If n 
is less than 0 or greater than 255, return the empty string. If n is 0, return 
null.

REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
Equals 'BLack and BLue'


> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'"
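For illustration, the CHR/REPLACE semantics described above can be sketched in plain Java. This is a behavioral sketch only; the class and method names here are illustrative and are not Hive's actual GenericUDF implementations:

```java
public class UdfSketch {
    // CHR: map n in [1, 255] to its ASCII character; per the description,
    // 0 yields null and anything outside [0, 255] yields the empty string.
    public static String chr(long n) {
        if (n < 0 || n > 255) return "";
        if (n == 0) return null;
        return String.valueOf((char) n);
    }

    // REPLACE: replace every occurrence of 'search' in 'str' with 'rep'.
    public static String replace(String str, String search, String rep) {
        return str.replace(search, rep);
    }

    public static void main(String[] args) {
        System.out.println(replace("Hack and Hue", "H", "BL")); // BLack and BLue
        System.out.println(chr(72)); // H
    }
}
```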





[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Status: Patch Available  (was: Open)

> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'"





[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Attachment: (was: HIVE-13063.master.patch)

> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'"





[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Status: Open  (was: Patch Available)

> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'"





[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Attachment: HIVE-13063.patch

> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFS for these functions.
> CHAR: convert n where n : [0, 256) into the ascii equivalent as a varchar. If 
> n is less than 0 or greater than 255, return the empty string. If n is 0, 
> return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example. SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'"





[jira] [Resolved] (HIVE-13162) Fixes for LlapDump and FileSinkoperator

2016-02-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-13162.
---
Resolution: Fixed

Committed to llap branch.

> Fixes for LlapDump and FileSinkoperator
> ---
>
> Key: HIVE-13162
> URL: https://issues.apache.org/jira/browse/HIVE-13162
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-13162.1.patch
>
>






[jira] [Updated] (HIVE-13162) Fixes for LlapDump and FileSinkoperator

2016-02-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-13162:
--
Attachment: HIVE-13162.1.patch

> Fixes for LlapDump and FileSinkoperator
> ---
>
> Key: HIVE-13162
> URL: https://issues.apache.org/jira/browse/HIVE-13162
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-13162.1.patch
>
>






[jira] [Updated] (HIVE-13163) ORC MemoryManager thread checks are fatal, should WARN

2016-02-25 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13163:
---
Labels: PIG  (was: )

> ORC MemoryManager thread checks are fatal, should WARN 
> ---
>
> Key: HIVE-13163
> URL: https://issues.apache.org/jira/browse/HIVE-13163
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: PIG
>
> The MemoryManager is tied to a WriterOptions on creation, which can occur in a 
> different thread from the writer calls.
> This is unexpected, but safe, and warrants a warning rather than a fatal error.
> {code}
>   /**
>* Light weight thread-safety check for multi-threaded access patterns
>*/
>   private void checkOwner() {
> Preconditions.checkArgument(ownerLock.isHeldByCurrentThread(),
> "Owner thread expected %s, got %s",
> ownerLock.getOwner(),
> Thread.currentThread());
>   }
> {code}
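A non-fatal variant of the check above might report the cross-thread access instead of throwing. This is a minimal, self-contained sketch (the class name, return-a-message design, and lock wiring are illustrative, not the actual ORC patch):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LenientOwnerCheck {
    final ReentrantLock ownerLock = new ReentrantLock();

    // Lenient thread-safety check: instead of failing fatally, return a
    // warning message when the caller does not hold the lock; null means
    // the check passed. A real implementation would log the message.
    String checkOwner() {
        if (!ownerLock.isHeldByCurrentThread()) {
            return "Owner thread expected lock holder, got "
                + Thread.currentThread().getName();
        }
        return null;
    }
}
```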





[jira] [Updated] (HIVE-13130) API calls for retrieving primary keys and foreign keys information

2016-02-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13130:
-
Attachment: HIVE-13130.3.patch

>  API calls for retrieving primary keys and foreign keys information
> ---
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, 
> HIVE-13130.3.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of PK/FK implementation in Hive.





[jira] [Updated] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions

2016-02-25 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13151:
-
Attachment: (was: HIVE-13151.1.patch)

> Clean up UGI objects in FileSystem cache for transactions
> -
>
> Key: HIVE-13151
> URL: https://issues.apache.org/jira/browse/HIVE-13151
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13151.1.patch
>
>
> One issue with FileSystem.CACHE is that it does not clean itself. The key in 
> that cache includes the UGI object. When new UGI objects are created and used 
> with the FileSystem API, new entries get added to the cache.
> We need to manually clean up those UGI objects once they are no longer in use.





[jira] [Updated] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions

2016-02-25 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13151:
-
Attachment: HIVE-13151.1.patch

> Clean up UGI objects in FileSystem cache for transactions
> -
>
> Key: HIVE-13151
> URL: https://issues.apache.org/jira/browse/HIVE-13151
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13151.1.patch
>
>
> One issue with FileSystem.CACHE is that it does not clean itself. The key in 
> that cache includes the UGI object. When new UGI objects are created and used 
> with the FileSystem API, new entries get added to the cache.
> We need to manually clean up those UGI objects once they are no longer in use.
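The growth pattern described above can be illustrated with a toy cache keyed on a user identity. This is a simplified stand-in for FileSystem.CACHE (the real Hadoop key combines URI scheme, authority, and the UGI object, and cleanup goes through FileSystem.closeAllForUGI); the class and method names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class UgiCacheDemo {
    // Simplified stand-in for FileSystem.CACHE's key.
    static final class Key {
        final String scheme, authority, user;
        Key(String scheme, String authority, String user) {
            this.scheme = scheme; this.authority = authority; this.user = user;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return scheme.equals(k.scheme) && authority.equals(k.authority)
                && user.equals(k.user);
        }
        @Override public int hashCode() { return Objects.hash(scheme, authority, user); }
    }

    static final Map<Key, Object> CACHE = new HashMap<>();

    // Each distinct user identity creates a new cache entry; the cache
    // itself never evicts, mirroring the leak described in the issue.
    static Object get(String scheme, String authority, String user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), k -> new Object());
    }

    // Manual cleanup, analogous in spirit to FileSystem.closeAllForUGI(ugi).
    static void closeAllForUser(String user) {
        CACHE.keySet().removeIf(k -> k.user.equals(user));
    }
}
```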





[jira] [Commented] (HIVE-9422) LLAP: row-level vectorized SARGs

2016-02-25 Thread Yohei Abe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168212#comment-15168212
 ] 

Yohei Abe commented on HIVE-9422:
-


HIVE-9422.WIP1.patch

> LLAP: row-level vectorized SARGs
> 
>
> Key: HIVE-9422
> URL: https://issues.apache.org/jira/browse/HIVE-9422
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
> Attachments: HIVE-9422.WIP1.patch
>
>
> When VRBs are built from encoded data, sargs can be applied on low level to 
> reduce the number of rows to process.





[jira] [Updated] (HIVE-9422) LLAP: row-level vectorized SARGs

2016-02-25 Thread Yohei Abe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Abe updated HIVE-9422:

Attachment: HIVE-9422.WIP1.patch

Just a WIP. The main points are:
* Add row-level SARG evaluation in OrcEncodedDataConsumer.decodeBatch()
* The SARG is applied to the CVB

> LLAP: row-level vectorized SARGs
> 
>
> Key: HIVE-9422
> URL: https://issues.apache.org/jira/browse/HIVE-9422
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
> Attachments: HIVE-9422.WIP1.patch
>
>
> When VRBs are built from encoded data, sargs can be applied on low level to 
> reduce the number of rows to process.





[jira] [Updated] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges

2016-02-25 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13161:
---
Priority: Minor  (was: Major)

> ORC: Always do sloppy overlaps for DiskRanges
> -
>
> Key: HIVE-13161
> URL: https://issues.apache.org/jira/browse/HIVE-13161
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Minor
>
> The selected columns are sometimes only a few bytes apart (particularly for 
> nulls, which compress tightly), yet the reads aren't merged.
> The WORST_UNCOMPRESSED_SLOP is only applied in the PPD case, and more for 
> safety than for reducing the total number of round-trip calls to the filesystem.
> {code}
>  /**
>* Update the disk ranges to collapse adjacent or overlapping ranges. It
>* assumes that the ranges are sorted.
>* @param ranges the list of disk ranges to merge
>*/
>   static void mergeDiskRanges(List<DiskRange> ranges) {
> DiskRange prev = null;
> for(int i=0; i < ranges.size(); ++i) {
>   DiskRange current = ranges.get(i);
>   if (prev != null && overlap(prev.offset, prev.end,
>   current.offset, current.end)) {
> prev.offset = Math.min(prev.offset, current.offset);
> prev.end = Math.max(prev.end, current.end);
> ranges.remove(i);
> i -= 1;
>   } else {
> prev = current;
>   }
> }
>   }
> ...
>   private static boolean overlap(long leftA, long rightA, long leftB, long 
> rightB) {
> if (leftA <= leftB) {
>   return rightA >= leftB;
> }
> return rightB >= leftA;
>   }
> {code}
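The strict-overlap behavior described above can be reproduced with a self-contained version of the snippet (a "sloppy" variant would simply add a slop constant to the overlap test so near-adjacent ranges also merge); the class here is illustrative, not the actual ORC code:

```java
import java.util.List;

public class RangeMerge {
    static final class Range {
        long offset, end;
        Range(long offset, long end) { this.offset = offset; this.end = end; }
    }

    static boolean overlap(long leftA, long rightA, long leftB, long rightB) {
        if (leftA <= leftB) {
            return rightA >= leftB;
        }
        return rightB >= leftA;
    }

    // Strict merge as in the snippet above: only touching or overlapping
    // ranges collapse; ranges a few bytes apart stay separate reads.
    static void merge(List<Range> ranges) {
        Range prev = null;
        for (int i = 0; i < ranges.size(); ++i) {
            Range current = ranges.get(i);
            if (prev != null && overlap(prev.offset, prev.end,
                current.offset, current.end)) {
                prev.offset = Math.min(prev.offset, current.offset);
                prev.end = Math.max(prev.end, current.end);
                ranges.remove(i);
                i -= 1;
            } else {
                prev = current;
            }
        }
    }
}
```

With ranges [0,100], [100,200], [205,300], the first two touch and merge, while the 5-byte gap before [205,300] keeps it separate — exactly the non-merged read the issue complains about.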





[jira] [Updated] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges

2016-02-25 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13161:
---
Component/s: ORC

> ORC: Always do sloppy overlaps for DiskRanges
> -
>
> Key: HIVE-13161
> URL: https://issues.apache.org/jira/browse/HIVE-13161
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>
> The selected columns are sometimes only a few bytes apart (particularly for 
> nulls, which compress tightly), yet the reads aren't merged.
> The WORST_UNCOMPRESSED_SLOP is only applied in the PPD case, and more for 
> safety than for reducing the total number of round-trip calls to the filesystem.
> {code}
>  /**
>* Update the disk ranges to collapse adjacent or overlapping ranges. It
>* assumes that the ranges are sorted.
>* @param ranges the list of disk ranges to merge
>*/
>   static void mergeDiskRanges(List<DiskRange> ranges) {
> DiskRange prev = null;
> for(int i=0; i < ranges.size(); ++i) {
>   DiskRange current = ranges.get(i);
>   if (prev != null && overlap(prev.offset, prev.end,
>   current.offset, current.end)) {
> prev.offset = Math.min(prev.offset, current.offset);
> prev.end = Math.max(prev.end, current.end);
> ranges.remove(i);
> i -= 1;
>   } else {
> prev = current;
>   }
> }
>   }
> ...
>   private static boolean overlap(long leftA, long rightA, long leftB, long 
> rightB) {
> if (leftA <= leftB) {
>   return rightA >= leftB;
> }
> return rightB >= leftA;
>   }
> {code}





[jira] [Updated] (HIVE-13160) HS2 unable to load UDFs on startup when HMS is not ready

2016-02-25 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-13160:

Description: 
The error looks like this:

{code}
2016-02-18 14:43:54,251 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 14:48:54,692 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 14:48:54,692 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 14:48:55,692 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 14:53:55,800 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 14:53:55,800 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 14:53:56,801 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 14:58:56,967 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 14:58:56,967 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 14:58:57,994 WARN  hive.ql.metadata.Hive: [main]: Failed to register 
all functions.
java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1492)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2915)
...
016-02-18 14:58:57,997 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 15:03:58,094 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 15:03:58,095 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 15:03:59,095 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 15:08:59,203 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 15:08:59,203 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 15:09:00,203 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-18 15:14:00,304 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-18 15:14:00,304 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-18 15:14:01,306 INFO  org.apache.hive.service.server.HiveServer2: 
[main]: Shutting down HiveServer2
2016-02-18 15:14:01,308 INFO  org.apache.hive.service.server.HiveServer2: 
[main]: Exception caught when calling stop of HiveServer2 before retrying start
java.lang.NullPointerException
at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:283)
at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:351)
at 
org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:69)
at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:545)
{code}

After that, none of the functions are available for use, as HS2 does not 
re-register them once HMS is up and ready.

This is not desired behaviour: we shouldn't allow HS2 to reach a servicing 
state if the function list is not ready. Alternatively, instead of initializing 
the function list when HS2 starts, we could load it when each Hive session is 
created. We could keep a cache of the function list somewhere for better 
performance, but it would be better to decouple it from the Hive class.

  was:
The error looks like this:

{code}
2016-02-24 21:16:09,901 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-24 21:16:09,971 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-24 21:16:09,971 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-24 21:16:10,971 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083
2016-02-24 21:16:10,975 WARN  hive.metastore: [main]: Failed to connect to the 
MetaStore Server...
2016-02-24 21:16:10,976 INFO  hive.metastore: [main]: Waiting 1 seconds before 
next connection attempt.
2016-02-24 21:16:11,976 INFO  hive.metastore: [main]: Trying to connect to 
metastore with URI 

[jira] [Updated] (HIVE-13120) propagate doAs when generating ORC splits

2016-02-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13120:

   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> propagate doAs when generating ORC splits
> -
>
> Key: HIVE-13120
> URL: https://issues.apache.org/jira/browse/HIVE-13120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yi Zhang
>Assignee: Sergey Shelukhin
> Fix For: 2.1.0
>
> Attachments: HIVE-13120.patch
>
>
> ORC+HS2+doAs+FetchTask conversion = weird permission errors





[jira] [Commented] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168089#comment-15168089
 ] 

Hive QA commented on HIVE-12994:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790020/HIVE-12994.11.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/124/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/124/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-124/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12790020 - PreCommit-HIVE-METASTORE-Test

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, 
> HIVE-12994.11.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> SQL standard does not specify the behavior by default. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.
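Hive's current default described above (nulls sorting as lower than any non-null value for ASC) corresponds to what Comparator.nullsFirst gives in Java, and NULLS LAST flips it. A small illustrative sketch (this only demonstrates the ordering semantics; the actual Hive patch lives in the parser/CBO, not in a comparator):

```java
import java.util.Arrays;
import java.util.Comparator;

public class NullOrderingDemo {
    // NULLS FIRST: nulls sort before all non-null values.
    public static Integer[] sortNullsFirst(Integer[] a) {
        Integer[] out = a.clone();
        Arrays.sort(out, Comparator.nullsFirst(Comparator.<Integer>naturalOrder()));
        return out;
    }

    // NULLS LAST: nulls sort after all non-null values.
    public static Integer[] sortNullsLast(Integer[] a) {
        Integer[] out = a.clone();
        Arrays.sort(out, Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        return out;
    }
}
```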





[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12994:
---
Attachment: HIVE-12994.11.patch

I needed to regenerate additional q files (showing info about null ordering in 
extended explain).

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, 
> HIVE-12994.11.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> SQL standard does not specify the behavior by default. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.





[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168060#comment-15168060
 ] 

Hive QA commented on HIVE-13149:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789631/HIVE-13149.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9814 tests 
executed
*Failed tests:*
{noformat}
TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape_clusterby1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown_negative
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_set_metaconf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_conv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_rpad
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestMetastoreVersion.testVersionRestriction
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7092/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7092/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7092/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789631 - PreCommit-HIVE-TRUNK-Build

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later. 
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is currently established for each thread, even though most tasks 
> (aside from a few, such as StatsTask) don't need to access HMS. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but never used.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}





[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12994:
---
Status: Patch Available  (was: In Progress)

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, 
> HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> SQL standard does not specify the behavior by default. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.





[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12994:
---
Status: Open  (was: Patch Available)

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, 
> HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> SQL standard does not specify the behavior by default. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.





[jira] [Work started] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-12994 started by Jesus Camacho Rodriguez.
--
> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.09.patch, HIVE-12994.10.patch, 
> HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> The SQL standard does not specify the default behavior. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.





[jira] [Commented] (HIVE-13159) TxnHandler should support datanucleus.connectionPoolingType = None

2016-02-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167986#comment-15167986
 ] 

Sergey Shelukhin commented on HIVE-13159:
-

[~alangates] fyi

> TxnHandler should support datanucleus.connectionPoolingType = None
> --
>
> Key: HIVE-13159
> URL: https://issues.apache.org/jira/browse/HIVE-13159
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Right now, one has to choose bonecp or dbcp.





[jira] [Updated] (HIVE-13063) Create UDFs for CHR and REPLACE

2016-02-25 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Summary: Create UDFs for CHR and REPLACE   (was: Create UDFs for CHAR and 
REPLACE )

> Create UDFs for CHR and REPLACE 
> 
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.master.patch, Screen Shot 2016-02-17 at 
> 7.20.57 PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFs for these functions.
> CHR: convert n, where n is in [0, 256), into the ASCII equivalent as a 
> varchar. If n is less than 0 or greater than 255, return the empty string. If 
> n is 0, return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example: SELECT REPLACE('Hack and Hue', 'H', 'BL');
> returns 'BLack and BLue'.
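A minimal Java sketch of the semantics spelled out above (a hypothetical illustration — the class and method names are assumptions, not Hive's actual UDF classes):

```java
public class UdfSketch {
    // CHR per the description: n in [0, 256) maps to its ASCII character;
    // out-of-range n returns the empty string; 0 returns null.
    static String chr(long n) {
        if (n < 0 || n > 255) return "";
        if (n == 0) return null;
        return String.valueOf((char) n);
    }

    // REPLACE: substitute every literal occurrence of 'search' with 'rep'.
    static String replace(String str, String search, String rep) {
        return str.replace(search, rep);
    }

    public static void main(String[] args) {
        System.out.println(chr(72));                            // prints H
        System.out.println(replace("Hack and Hue", "H", "BL")); // prints BLack and BLue
    }
}
```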





[jira] [Work started] (HIVE-13069) Enable cartesian product merging

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13069 started by Jesus Camacho Rodriguez.
--
> Enable cartesian product merging
> 
>
> Key: HIVE-13069
> URL: https://issues.apache.org/jira/browse/HIVE-13069
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Currently we can merge 2-way joins into n-way joins when the joins are 
> executed over the same column.
> In turn, CBO might produce plans containing cartesian products if the join 
> columns are constant values; after HIVE-12543 went in, this is rather common, 
> as those constant columns are correctly pruned. However, currently we do not 
> merge a cartesian product with two inputs into a cartesian product with 
> multiple inputs, which could result in performance loss.





[jira] [Updated] (HIVE-13069) Enable cartesian product merging

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13069:
---
Status: Patch Available  (was: In Progress)

> Enable cartesian product merging
> 
>
> Key: HIVE-13069
> URL: https://issues.apache.org/jira/browse/HIVE-13069
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Currently we can merge 2-way joins into n-way joins when the joins are 
> executed over the same column.
> In turn, CBO might produce plans containing cartesian products if the join 
> columns are constant values; after HIVE-12543 went in, this is rather common, 
> as those constant columns are correctly pruned. However, currently we do not 
> merge a cartesian product with two inputs into a cartesian product with 
> multiple inputs, which could result in performance loss.





[jira] [Commented] (HIVE-13120) propagate doAs when generating ORC splits

2016-02-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167845#comment-15167845
 ] 

Prasanth Jayachandran commented on HIVE-13120:
--

lgtm, +1

> propagate doAs when generating ORC splits
> -
>
> Key: HIVE-13120
> URL: https://issues.apache.org/jira/browse/HIVE-13120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yi Zhang
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13120.patch
>
>
> ORC+HS2+doAs+FetchTask conversion = weird permission errors





[jira] [Commented] (HIVE-13129) CliService leaks HMS connection

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167726#comment-15167726
 ] 

Hive QA commented on HIVE-13129:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789598/HIVE-13129.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9826 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7091/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7091/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7091/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789598 - PreCommit-HIVE-TRUNK-Build

> CliService leaks HMS connection
> ---
>
> Key: HIVE-13129
> URL: https://issues.apache.org/jira/browse/HIVE-13129
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13129.patch
>
>
> HIVE-12790 fixes the HMS connection leaking, but it seems there is one more 
> connection from CLIService.
> The init() function in CLIService gets info from the DB, but we never close 
> the HMS connection for this service's main thread.
> {noformat}
> // creates connection to HMS and thus *must* occur after kerberos login 
> above
> try {
>   applyAuthorizationConfigPolicy(hiveConf);
> } catch (Exception e) {
>   throw new RuntimeException("Error applying authorization policy on hive 
> configuration: "
>   + e.getMessage(), e);
> {noformat}
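The implied fix is to release the init-time connection once the policy has been applied; a generic try-with-resources sketch (the Client class here is a hypothetical stand-in for the HMS connection, not Hive code):

```java
public class BootstrapClose {
    // Hypothetical closeable client standing in for the HMS connection.
    static class Client implements AutoCloseable {
        boolean closed = false;
        void fetchInitInfo() { /* read config/policy from the store */ }
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        Client c = new Client();
        // try-with-resources guarantees the init-time connection is released,
        // even if applying the policy throws.
        try (Client client = c) {
            client.fetchInitInfo();
        }
        System.out.println(c.closed); // prints true
    }
}
```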





[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167725#comment-15167725
 ] 

Alan Gates commented on HIVE-13013:
---

+1

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support. This will reduce the number of deadlocks in the DBs.





[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-02-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167695#comment-15167695
 ] 

Aihua Xu commented on HIVE-13149:
-

[~jxiang], [~ctang.ma], [~ngangam] You have worked on the leaking issues 
before.  Can you guys help review the code change? 

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later.
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is established for each thread, although most tasks (aside from a 
> few like StatsTask) don't need to access HMS. If HiveServer2 is configured to 
> run in parallel and the query involves many tasks, the connections are 
> created but never used.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
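One way to avoid opening a connection a task may never use is to create it lazily, per thread, on first access. A hypothetical Java sketch of that pattern (the Client class is a stand-in, not Hive's HMS client):

```java
public class DeferredResource {
    static int opened = 0;

    // Hypothetical stand-in for an HMS client; counts how many get opened.
    static class Client {
        Client() { opened++; }
    }

    // Per-thread, lazily created client: threads that never touch the
    // metastore never open a connection.
    static final ThreadLocal<Client> CLIENT = new ThreadLocal<>();

    static Client client() {
        Client c = CLIENT.get();
        if (c == null) {
            c = new Client();
            CLIENT.set(c);
        }
        return c;
    }

    public static void main(String[] args) {
        Client a = client();
        Client b = client(); // the same thread reuses the cached client
        System.out.println(opened + " " + (a == b)); // prints "1 true"
    }
}
```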





[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information

2016-02-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167647#comment-15167647
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-13130:
--

[~ashutoshc]
Yes, I will add the tests as well. I will also try to make the PK/FK values 
configurable in HS2 (by introducing a new HS2 parameter). Right now it always 
sends a DUMMY column for the FK/PK column values.

Thanks
Hari

>  API calls for retrieving primary keys and foreign keys information
> ---
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of PK/FK implementation in Hive.





[jira] [Commented] (HIVE-13153) SessionID is appended to thread name twice

2016-02-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167642#comment-15167642
 ] 

Sergey Shelukhin commented on HIVE-13153:
-

Change logging to debug? Also, should the 2nd call in CliDriver be reset 
rather than update? I am not sure about compatibility with the original logic, 
but this logic looks good to me.

> SessionID is appended to thread name twice
> --
>
> Key: HIVE-13153
> URL: https://issues.apache.org/jira/browse/HIVE-13153
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch
>
>
> HIVE-12249 added sessionId to thread name. In some cases the sessionId could 
> be appended twice. Example log line
> {code}
> DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 
> 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main]
> {code} 





[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167636#comment-15167636
 ] 

Sergey Shelukhin commented on HIVE-12749:
-

constprog2, constprog_partitioner do not test constant propagation anymore. 
They should either be removed or changed to test it with compatible types. cc 
[~ashutoshc] 
Otherwise looks good.

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Oleksiy Sayankin
>Assignee: Aleksey Vovchenko
> Fix For: 2.0.1
>
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch, 
> HIVE-12749.3.patch, HIVE-12749.4.patch, HIVE-12749.5.patch, 
> HIVE-12749.6.patch, HIVE-12749.7.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute in command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2, #3 I expected '000126' because the origin type of x is STRING in 
> stest table.
> Note, setting hive.optimize.constant.propagation=false fixes the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555
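The behavior above stems from Hive's numeric comparison semantics: comparing a STRING column to an int literal coerces both sides to double, so the match succeeds, but a constant propagated back in numeric form has lost the original string representation. A small Java illustration of the coercion (not Hive code):

```java
public class CoercionDemo {
    public static void main(String[] args) {
        // A STRING compared to an int literal is coerced to double, so
        // '000126' matches 126 -- but the propagated constant 126 no longer
        // carries the leading zeros of the original string.
        String x = "000126";
        boolean matches = Double.parseDouble(x) == 126;
        System.out.println(matches); // prints true
    }
}
```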





[jira] [Commented] (HIVE-13132) Hive should lazily load and cache metastore (permanent) functions

2016-02-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167639#comment-15167639
 ] 

Anthony Hsu commented on HIVE-13132:


Thanks for the review, [~alangates].

# I tested and unfortunately, HIVE-2573 does NOT solve this issue. A stacktrace 
in jdb shows all functions are loaded during CliDriver start-up:
{noformat}
main[1] where
  [1] org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions (Hive.java:173)
  [2] org.apache.hadoop.hive.ql.metadata.Hive. (Hive.java:166)
  [3] org.apache.hadoop.hive.ql.session.SessionState.start 
(SessionState.java:503)
  [4] org.apache.hadoop.hive.cli.CliDriver.run (CliDriver.java:677)
  [5] org.apache.hadoop.hive.cli.CliDriver.main (CliDriver.java:621)
  [6] sun.reflect.NativeMethodAccessorImpl.invoke0 (native method)
  [7] sun.reflect.NativeMethodAccessorImpl.invoke 
(NativeMethodAccessorImpl.java:62)
  [8] sun.reflect.DelegatingMethodAccessorImpl.invoke 
(DelegatingMethodAccessorImpl.java:43)
  [9] java.lang.reflect.Method.invoke (Method.java:483)
  [10] org.apache.hadoop.util.RunJar.run (RunJar.java:221)
  [11] org.apache.hadoop.util.RunJar.main (RunJar.java:136)
{noformat}
# My fix is a bit hacky and only works on the Hive 0.13.1 branch. I basically 
changed the CliDriver initialization code to use 
{{FunctionRegistry.getFunctionNames(String funcPatternStr)}} instead of 
{{FunctionRegistry.getFunctionNames()}}. In the Hive 0.13.1 branch, the former 
does NOT search the metastore while the latter does. This is no longer the case 
in trunk.
# I don't have much experience with HS2 but I will take a look.

I'll work on a cleaner solution that removes the pre-loading of metastore 
functions from the CliDriver initialization code path. Suggestions welcome.

> Hive should lazily load and cache metastore (permanent) functions
> -
>
> Key: HIVE-13132
> URL: https://issues.apache.org/jira/browse/HIVE-13132
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.1
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-13132.1.patch
>
>
> In Hive 0.13.1, we have noticed that as the number of databases increases, 
> the start-up time of the Hive interactive shell increases. This is because 
> during start-up, all databases are iterated over to fetch the permanent 
> functions to display in the {{SHOW FUNCTIONS}} output.
> {noformat:title=FunctionRegistry.java}
>   private static Set getFunctionNames(boolean searchMetastore) {
> Set functionNames = mFunctions.keySet();
> if (searchMetastore) {
>   functionNames = new HashSet(functionNames);
>   try {
> Hive db = getHive();
> List dbNames = db.getAllDatabases();
> for (String dbName : dbNames) {
>   List funcNames = db.getFunctions(dbName, "*");
>   for (String funcName : funcNames) {
> functionNames.add(FunctionUtils.qualifyFunctionName(funcName, 
> dbName));
>   }
> }
>   } catch (Exception e) {
> LOG.error(e);
> // Continue on, we can still return the functions we've gotten to 
> this point.
>   }
> }
> return functionNames;
>   }
> {noformat}
> Instead of eagerly loading all metastore functions, we should only load them 
> the first time {{SHOW FUNCTIONS}} is invoked. We should also cache the 
> results.
> Note that this issue may have been fixed by HIVE-2573, though I haven't 
> verified this.
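The lazy-load-and-cache behavior proposed above can be sketched with a memoizing loader; this is a generic illustration under assumed names, not the actual patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

public class LazyCache {
    private Set<String> cached;                 // null until first use
    private final Supplier<Set<String>> loader; // the expensive metastore scan

    LazyCache(Supplier<Set<String>> loader) { this.loader = loader; }

    // The first call pays for the load; later SHOW FUNCTIONS-style calls
    // are served from the cache.
    synchronized Set<String> get() {
        if (cached == null) {
            cached = loader.get();
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        LazyCache c = new LazyCache(() -> {
            calls[0]++; // count how often the slow path runs
            return new HashSet<>(Arrays.asList("upper", "lower"));
        });
        c.get();
        c.get();
        System.out.println(calls[0]); // prints 1: loaded once, cached after
    }
}
```

A production version would also need an invalidation path when functions are created or dropped.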





[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information

2016-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167632#comment-15167632
 ] 

Ashutosh Chauhan commented on HIVE-13130:
-

Also can you add tests for new jdbc api in {{TestJdbcWithMiniHS2}}

>  API calls for retrieving primary keys and foreign keys information
> ---
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of PK/FK implementation in Hive.





[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167630#comment-15167630
 ] 

Jesus Camacho Rodriguez commented on HIVE-12749:


Patch LGTM, +1. [~AleKsey Vovchenko], could you rebase it and take care of 
those q file failures? Thanks

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Oleksiy Sayankin
>Assignee: Aleksey Vovchenko
> Fix For: 2.0.1
>
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch, 
> HIVE-12749.3.patch, HIVE-12749.4.patch, HIVE-12749.5.patch, 
> HIVE-12749.6.patch, HIVE-12749.7.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute in command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2, #3 I expected '000126' because the origin type of x is STRING in 
> stest table.
> Note, setting hive.optimize.constant.propagation=false fixes the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555





[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167618#comment-15167618
 ] 

Eugene Koifman commented on HIVE-13013:
---

[~alangates], could you review please?  The delta between patch 1 and 3 is 
minimal

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.2.patch, HIVE-13013.3.patch, HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support. This will reduce the number of deadlocks in the DBs.





[jira] [Commented] (HIVE-13130) API calls for retrieving primary keys and foreign keys information

2016-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167606#comment-15167606
 ] 

Ashutosh Chauhan commented on HIVE-13130:
-

Can you create a RB entry for non generated code?

>  API calls for retrieving primary keys and foreign keys information
> ---
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of PK/FK implementation in Hive.





[jira] [Commented] (HIVE-13129) CliService leaks HMS connection

2016-02-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167565#comment-15167565
 ] 

Aihua Xu commented on HIVE-13129:
-

Only one HMS connection is created, but it is never released and never used 
later. Leaked HMS and database resources will accumulate if HiveServer2 keeps 
restarting.

> CliService leaks HMS connection
> ---
>
> Key: HIVE-13129
> URL: https://issues.apache.org/jira/browse/HIVE-13129
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13129.patch
>
>
> HIVE-12790 fixes the HMS connection leaking, but it seems there is one more 
> connection from CLIService.
> The init() function in CLIService gets info from the DB, but we never close 
> the HMS connection for this service's main thread.
> {noformat}
> // creates connection to HMS and thus *must* occur after kerberos login 
> above
> try {
>   applyAuthorizationConfigPolicy(hiveConf);
> } catch (Exception e) {
>   throw new RuntimeException("Error applying authorization policy on hive 
> configuration: "
>   + e.getMessage(), e);
> {noformat}





[jira] [Updated] (HIVE-13146) OrcFile table property values are case sensitive

2016-02-25 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13146:

Attachment: HIVE-13146.1.patch

The attached patch fixes it by changing the orc.compress value to uppercase 
when the table is created.

> OrcFile table property values are case sensitive
> 
>
> Key: HIVE-13146
> URL: https://issues.apache.org/jira/browse/HIVE-13146
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Andrew Sears
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-13146.1.patch
>
>
> In Hive v1.2.1.2.3, with Tez, create an external table with the compression 
> value SNAPPY written in lower case. The table is created successfully, but 
> inserting data into the table fails with a 'No enum constant' error.
> CREATE EXTERNAL TABLE mydb.mytable 
> (id int)
>   PARTITIONED BY (business_date date)
> STORED AS ORC
> LOCATION
>   '/data/mydb/mytable'
> TBLPROPERTIES (
>   'orc.compress'='snappy');
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> INSERT OVERWRITE mydb.mytable PARTITION (business_date)
> SELECT * from mydb.sourcetable;
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy
>   at java.lang.Enum.valueOf(Enum.java:238)
>   at 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25)
> The constant SNAPPY needs to be uppercase in the definition to fix this. The 
> value should be case-insensitive, or an error should be thrown at table 
> creation.
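Either normalizing at table creation (as the attached patch does) or making the lookup case-insensitive avoids the failure; a minimal Java sketch with a hypothetical enum mirroring the one in the stack trace:

```java
import java.util.Locale;

public class EnumLookup {
    // Hypothetical mirror of the ORC CompressionKind values in the stack trace.
    enum CompressionKind { NONE, ZLIB, SNAPPY, LZO }

    // Normalize user input before Enum.valueOf so 'snappy' resolves too;
    // Enum.valueOf alone is case-sensitive and throws on 'snappy'.
    static CompressionKind fromName(String name) {
        return CompressionKind.valueOf(name.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(fromName("snappy")); // prints SNAPPY
    }
}
```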





[jira] [Updated] (HIVE-13146) OrcFile table property values are case sensitive

2016-02-25 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13146:

Status: Patch Available  (was: Open)

> OrcFile table property values are case sensitive
> 
>
> Key: HIVE-13146
> URL: https://issues.apache.org/jira/browse/HIVE-13146
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Andrew Sears
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-13146.1.patch
>
>
> In Hive v1.2.1.2.3, with Tez, create an external table with the compression 
> value SNAPPY written in lower case. The table is created successfully, but 
> inserting data into the table fails with a 'No enum constant' error.
> CREATE EXTERNAL TABLE mydb.mytable 
> (id int)
>   PARTITIONED BY (business_date date)
> STORED AS ORC
> LOCATION
>   '/data/mydb/mytable'
> TBLPROPERTIES (
>   'orc.compress'='snappy');
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> INSERT OVERWRITE mydb.mytable PARTITION (business_date)
> SELECT * from mydb.sourcetable;
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy
>   at java.lang.Enum.valueOf(Enum.java:238)
>   at 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25)
> The constant SNAPPY needs to be uppercase in the definition to fix this. The 
> value should be case-insensitive, or an error should be thrown at table 
> creation.





[jira] [Commented] (HIVE-13139) Unfold TOK_ALLCOLREF of source table/view at QB stage

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167534#comment-15167534
 ] 

Hive QA commented on HIVE-13139:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789597/HIVE-13139.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 405 failed/errored test(s), 9720 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-auto_join18.q-union_remove_23.q-input1_limit.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join39.q-stats12.q-union27.q-and-12-more - did not produce a 
TEST-*.xml file
TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did 
not produce a TEST-*.xml file
TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_partition_drop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_const
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_column_access_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantfolding
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cteViews
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_3

[jira] [Updated] (HIVE-13131) TezWork queryName can be null after HIVE-12523

2016-02-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-13131:
--
Attachment: HIVE-13131.2.patch

Uploading patch v2 - this is just a golden file update for the qfiles

> TezWork queryName can be null after HIVE-12523
> --
>
> Key: HIVE-13131
> URL: https://issues.apache.org/jira/browse/HIVE-13131
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13131.1.patch, HIVE-13131.2.patch
>
>
> Looks like after HIVE-12523, the queryName field can be null, either if the 
> conf passed in is null, or if the conf does not contain the necessary 
> settings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13129) CliService leaks HMS connection

2016-02-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167489#comment-15167489
 ] 

Naveen Gangam commented on HIVE-13129:
--

[~aihuaxu] This should only create one HMS connection, during HS2 startup, 
over the lifetime of the HS2. Do you see this causing an incremental leak, or 
are you proactively closing that HMS connection because you think it's never 
used again? 

> CliService leaks HMS connection
> ---
>
> Key: HIVE-13129
> URL: https://issues.apache.org/jira/browse/HIVE-13129
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13129.patch
>
>
> HIVE-12790 fixes the HMS connection leak, but it seems there is one more 
> connection from CLIService.
> The init() function in CLIService gets info from the DB, but we never close 
> the HMS connection for the service's main thread.
> {noformat}
> // creates connection to HMS and thus *must* occur after kerberos login 
> above
> try {
>   applyAuthorizationConfigPolicy(hiveConf);
> } catch (Exception e) {
>   throw new RuntimeException("Error applying authorization policy on hive 
> configuration: "
>   + e.getMessage(), e);
> {noformat}
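The fix the description points at is simply to scope the init-time connection so it is released once the info has been read. Below is a minimal, self-contained sketch of that pattern with a stand-in client class; all names are hypothetical, not Hive's actual API:

```java
public class InitCloseDemo {
    // Stand-in for a metastore client; the class and its methods are hypothetical.
    static final class MetastoreClient implements AutoCloseable {
        boolean closed = false;
        String fetchInfo() { return "db-info"; }
        @Override public void close() { closed = true; }
    }

    static MetastoreClient lastClient; // exposed only so the demo can inspect it

    // The suggested pattern: the init-time connection is scoped with
    // try-with-resources, so the service's main thread does not keep an
    // unused connection open for the lifetime of the server.
    static String init() {
        try (MetastoreClient client = new MetastoreClient()) {
            lastClient = client;
            return client.fetchInfo();
        } // client.close() runs here, before init() returns
    }

    public static void main(String[] args) {
        System.out.println(init());            // db-info
        System.out.println(lastClient.closed); // true
    }
}
```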





[jira] [Updated] (HIVE-13102) CBO: Reduce operations in Calcite do not fold as tight as rule-based folding

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13102:
---
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for the review [~ashutoshc]!

> CBO: Reduce operations in Calcite do not fold as tight as rule-based folding
> 
>
> Key: HIVE-13102
> URL: https://issues.apache.org/jira/browse/HIVE-13102
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: HIVE-13102.01.patch, HIVE-13102.patch
>
>
> With CBO
> {code}
> create temporary table table1(id int, val int, val1 int, dimid int);
> create temporary table table3(id int, val int, val1 int);
> hive> explain select table1.id, table1.val, table1.val1 from table1 inner 
> join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid 
> <>1 ;
> Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 llap
>   File Output Operator [FS_11]
> Map Join Operator [MAPJOIN_14] (rows=1 width=0)
>   Conds:(Inner),Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE] llap
>   BROADCAST [RS_8]
> Select Operator [SEL_5] (rows=1 width=0)
>   Filter Operator [FIL_13] (rows=1 width=0)
> predicate:(id = 1)
> TableScan [TS_3] (rows=1 width=0)
>   default@table3,table3,Tbl:PARTIAL,Col:NONE,Output:["id"]
> <-Select Operator [SEL_2] (rows=1 width=0)
> Output:["_col0","_col1","_col2"]
> Filter Operator [FIL_12] (rows=1 width=0)
>   predicate:((dimid = 1) and (dimid <> 1))
>   TableScan [TS_0] (rows=1 width=0)
> 
> default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1","dimid"]
> {code}
> without CBO
> {code}
> hive> explain select table1.id, table1.val, table1.val1 from table1 inner 
> join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid 
> <>1 ;
> OK
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 llap
>   File Output Operator [FS_9]
> Map Join Operator [MAPJOIN_14] (rows=1 width=0)
>   Conds:FIL_12.1=RS_17.1(Inner),Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_17]
> PartitionCols:1
> Filter Operator [FIL_16] (rows=1 width=0)
>   predicate:false
>   TableScan [TS_1] (rows=1 width=0)
> default@table3,table3,Tbl:PARTIAL,Col:COMPLETE
> <-Filter Operator [FIL_12] (rows=1 width=0)
> predicate:false
> TableScan [TS_0] (rows=1 width=0)
>   
> default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1"]
> Time taken: 0.044 seconds, Fetched: 23 row(s)
> {code}
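The residual predicate in the CBO plan above, ((dimid = 1) and (dimid <> 1)), can be folded to FALSE without any table access, which is what the non-CBO plan achieves. A minimal, self-contained sketch of that rule (all class and method names are illustrative, not Hive's):

```java
public class FoldDemo {
    // A conjunct of the form (col OP constant), where OP is "=" or "<>".
    static final class Conjunct {
        final String col; final String op; final int constant;
        Conjunct(String col, String op, int constant) {
            this.col = col; this.op = op; this.constant = constant;
        }
    }

    // If one conjunct pins a column to a constant and another excludes that
    // same constant, the whole conjunction folds to FALSE.
    static boolean foldsToFalse(java.util.List<Conjunct> and) {
        for (Conjunct a : and) {
            if (!a.op.equals("=")) continue;
            for (Conjunct b : and) {
                if (b.op.equals("<>") && b.col.equals(a.col)
                        && b.constant == a.constant) {
                    return true; // (col = c) AND (col <> c) => FALSE
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // (dimid = 1) AND (dimid <> 1), the residual predicate from the plan
        java.util.List<Conjunct> p = java.util.Arrays.asList(
                new Conjunct("dimid", "=", 1), new Conjunct("dimid", "<>", 1));
        System.out.println(foldsToFalse(p) ? "folds to FALSE" : "not foldable");
    }
}
```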





[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join

2016-02-25 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167382#comment-15167382
 ] 

Chaoyu Tang commented on HIVE-13082:


Thanks [~gopalv] for the explanation. Basically, any optimization that drops 
the ON clause of a left semi join should be considered invalid, because it 
turns off the implicit "distinct", right?
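The "implicit distinct" can be seen in a minimal sketch: a left semi join emits each left row at most once, no matter how many right rows match, whereas an inner join multiplies rows by the number of matches. This is illustrative Java over integer keys, not Hive code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiJoinDemo {
    // Left semi join: emit a left row once if ANY right row matches its key.
    static List<Integer> leftSemiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> keys = new HashSet<>(right); // duplicates collapse here
        List<Integer> out = new ArrayList<>();
        for (int l : left) {
            if (keys.contains(l)) out.add(l);
        }
        return out;
    }

    // Inner join: one output row per matching (left, right) pair.
    static List<Integer> innerJoin(List<Integer> left, List<Integer> right) {
        List<Integer> out = new ArrayList<>();
        for (int l : left)
            for (int r : right)
                if (l == r) out.add(l);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 2);
        List<Integer> right = Arrays.asList(1, 1, 1); // duplicate keys on the right
        System.out.println(leftSemiJoin(left, right)); // [1]
        System.out.println(innerJoin(left, right));    // [1, 1, 1]
    }
}
```

A rewrite that drops the ON clause effectively turns the semi join into the pair-producing form, which is why it changes the result.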

> Enable constant propagation optimization in query with left semi join
> -
>
> Key: HIVE-13082
> URL: https://issues.apache.org/jira/browse/HIVE-13082
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, 
> HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch
>
>
> Currently, constant folding is only allowed for inner or unique joins; I think 
> it is also applicable to left semi joins. Otherwise, a query like the 
> following, which has multiple joins including a left semi join, will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>   at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> 

[jira] [Commented] (HIVE-13102) CBO: Reduce operations in Calcite do not fold as tight as rule-based folding

2016-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167378#comment-15167378
 ] 

Ashutosh Chauhan commented on HIVE-13102:
-

+1

> CBO: Reduce operations in Calcite do not fold as tight as rule-based folding
> 
>
> Key: HIVE-13102
> URL: https://issues.apache.org/jira/browse/HIVE-13102
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-13102.01.patch, HIVE-13102.patch
>
>
> With CBO
> {code}
> create temporary table table1(id int, val int, val1 int, dimid int);
> create temporary table table3(id int, val int, val1 int);
> hive> explain select table1.id, table1.val, table1.val1 from table1 inner 
> join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid 
> <>1 ;
> Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 llap
>   File Output Operator [FS_11]
> Map Join Operator [MAPJOIN_14] (rows=1 width=0)
>   Conds:(Inner),Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE] llap
>   BROADCAST [RS_8]
> Select Operator [SEL_5] (rows=1 width=0)
>   Filter Operator [FIL_13] (rows=1 width=0)
> predicate:(id = 1)
> TableScan [TS_3] (rows=1 width=0)
>   default@table3,table3,Tbl:PARTIAL,Col:NONE,Output:["id"]
> <-Select Operator [SEL_2] (rows=1 width=0)
> Output:["_col0","_col1","_col2"]
> Filter Operator [FIL_12] (rows=1 width=0)
>   predicate:((dimid = 1) and (dimid <> 1))
>   TableScan [TS_0] (rows=1 width=0)
> 
> default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1","dimid"]
> {code}
> without CBO
> {code}
> hive> explain select table1.id, table1.val, table1.val1 from table1 inner 
> join table3 on table1.dimid = table3.id and table3.id = 1 where table1.dimid 
> <>1 ;
> OK
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 llap
>   File Output Operator [FS_9]
> Map Join Operator [MAPJOIN_14] (rows=1 width=0)
>   Conds:FIL_12.1=RS_17.1(Inner),Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_17]
> PartitionCols:1
> Filter Operator [FIL_16] (rows=1 width=0)
>   predicate:false
>   TableScan [TS_1] (rows=1 width=0)
> default@table3,table3,Tbl:PARTIAL,Col:COMPLETE
> <-Filter Operator [FIL_12] (rows=1 width=0)
> predicate:false
> TableScan [TS_0] (rows=1 width=0)
>   
> default@table1,table1,Tbl:PARTIAL,Col:NONE,Output:["id","val","val1"]
> Time taken: 0.044 seconds, Fetched: 23 row(s)
> {code}





[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167372#comment-15167372
 ] 

Jesus Camacho Rodriguez commented on HIVE-13096:


I just did. Thanks

> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.03.patch, HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.
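A minimal sketch of the heuristic described above, under the assumption that "cumulative cardinality" means the total estimated row count over an input's whole operator subtree (all names here are illustrative, not Hive's):

```java
public class MapJoinSideDemo {
    // Hypothetical operator node: its own estimated output rows plus children.
    static final class Op {
        final long rows; final Op[] children;
        Op(long rows, Op... children) { this.rows = rows; this.children = children; }
    }

    // Cumulative cardinality: total rows produced anywhere in the subtree.
    static long cumulative(Op op) {
        long total = op.rows;
        for (Op c : op.children) total += cumulative(c);
        return total;
    }

    // Keep the input with the larger cumulative cardinality as the streamed
    // (big-table) side; the smaller side gets hashed and broadcast.
    static int bigTablePos(Op left, Op right) {
        return cumulative(left) >= cumulative(right) ? 0 : 1;
    }

    public static void main(String[] args) {
        Op small = new Op(100, new Op(1_000));       // filter over a 1k-row scan
        Op big   = new Op(5_000, new Op(1_000_000)); // projection over a 1M-row scan
        System.out.println(bigTablePos(big, small)); // 0: stream the big side
    }
}
```

Counting heavyweight operators, by contrast, could pick the same side for a cheap 1k-row pipeline and an expensive 1M-row one if both contain one such operator.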





[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167363#comment-15167363
 ] 

Ashutosh Chauhan commented on HIVE-13096:
-

Can you create a RB entry?

> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.03.patch, HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.





[jira] [Assigned] (HIVE-13146) OrcFile table property values are case sensitive

2016-02-25 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-13146:
---

Assignee: Yongzhi Chen

> OrcFile table property values are case sensitive
> 
>
> Key: HIVE-13146
> URL: https://issues.apache.org/jira/browse/HIVE-13146
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Andrew Sears
>Assignee: Yongzhi Chen
>Priority: Minor
>
> In Hive v1.2.1.2.3, with Tez, create an external table with the compression 
> value SNAPPY written in lower case.  The table is created successfully, but 
> inserting data into it fails with a "no enum constant" error.
> CREATE EXTERNAL TABLE mydb.mytable 
> (id int)
>   PARTITIONED BY (business_date date)
> STORED AS ORC
> LOCATION
>   '/data/mydb/mytable'
> TBLPROPERTIES (
>   'orc.compress'='snappy');
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> INSERT OVERWRITE mydb.mytable PARTITION (business_date)
> SELECT * from mydb.sourcetable;
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.snappy
>   at java.lang.Enum.valueOf(Enum.java:238)
>   at 
> org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25)
> The constant SNAPPY must be upper case in the definition to fix this.  The 
> value should be case-agnostic, or an error should be thrown at table creation.
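The failure and the suggested fix can be reproduced in isolation. The enum below is a local stand-in for Hive's CompressionKind, and the upper-case normalization is one possible way to make the property value case-agnostic (a sketch, not the actual patch):

```java
import java.util.Locale;

public class CompressionKindDemo {
    // Local stand-in for org.apache.hadoop.hive.ql.io.orc.CompressionKind.
    enum CompressionKind { NONE, ZLIB, SNAPPY, LZO }

    // Enum.valueOf is case-sensitive, so 'orc.compress'='snappy' fails at
    // insert time. Normalizing before the lookup accepts any casing.
    static CompressionKind parse(String value) {
        return CompressionKind.valueOf(value.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parse("snappy")); // SNAPPY
        try {
            CompressionKind.valueOf("snappy"); // the raw, case-sensitive lookup
            System.out.println("unexpected");
        } catch (IllegalArgumentException e) {
            System.out.println("No enum constant for the lower-case value");
        }
    }
}
```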





[jira] [Updated] (HIVE-13150) When multiple queries are running in the same session, they are sharing the same HMS Client.

2016-02-25 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13150:

Description: 
It seems we should create a separate HMSClient for each query when multiple 
queries are executing asynchronously in the same session at the same time, to 
get better performance.
Right now, we unnecessarily share one HMSClient, so HMS calls from different 
queries have to be made in sync.


  was:
The HMS connection leak has been addressed for the session thread, and for task 
threads when execution runs in parallel. 

However, when queries execute asynchronously, they run in separate threads, and 
the HMS connections opened there are not released. 


> When multiple queries are running in the same session, they are sharing the 
> same HMS Client.
> 
>
> Key: HIVE-13150
> URL: https://issues.apache.org/jira/browse/HIVE-13150
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> It seems we should create a separate HMSClient for each query when multiple 
> queries are executing asynchronously in the same session at the same time, to 
> get better performance.
> Right now, we unnecessarily share one HMSClient, so HMS calls from different 
> queries have to be made in sync.
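One way to realize "a client per query" when each async query runs on its own thread is a thread-local client; a minimal sketch under that assumption (all names are hypothetical, not Hive's actual classes):

```java
public class PerQueryClientDemo {
    // Stand-in for an HMS client; the class is hypothetical.
    static final class HmsClient {}

    // One client per query thread, instead of a single shared instance whose
    // calls must be serialized across concurrently executing queries.
    static final ThreadLocal<HmsClient> CLIENT =
            ThreadLocal.withInitial(HmsClient::new);

    public static void main(String[] args) throws Exception {
        HmsClient[] seen = new HmsClient[2];
        Thread q1 = new Thread(() -> seen[0] = CLIENT.get());
        Thread q2 = new Thread(() -> seen[1] = CLIENT.get());
        q1.start(); q2.start();
        q1.join(); q2.join();
        // Each "query" thread got its own instance, so no cross-query locking.
        System.out.println(seen[0] != seen[1]); // true
    }
}
```

The cost of this design is that each thread's client must still be closed when the query finishes, or the leak described in the old summary reappears.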





[jira] [Updated] (HIVE-13150) When multiple queries are running in the same session, they are sharing the same HMS Client.

2016-02-25 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13150:

Summary: When multiple queries are running in the same session, they are 
sharing the same HMS Client.  (was: HMS connection leak when the query is run 
in async)

> When multiple queries are running in the same session, they are sharing the 
> same HMS Client.
> 
>
> Key: HIVE-13150
> URL: https://issues.apache.org/jira/browse/HIVE-13150
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> The HMS connection leak has been addressed for the session thread, and for 
> task threads when execution runs in parallel. 
> However, when queries execute asynchronously, they run in separate threads, 
> and the HMS connections opened there are not released. 





[jira] [Reopened] (HIVE-13150) HMS connection leak when the query is run in async

2016-02-25 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reopened HIVE-13150:
-

Rather than a connection leak, it seems we should create a separate HMSClient 
for each query when multiple queries are executing in the same session at the 
same time, to get better performance. 

Right now, we unnecessarily share one HMSClient, so HMS calls from different 
queries have to be made in sync. 

> HMS connection leak when the query is run in async
> --
>
> Key: HIVE-13150
> URL: https://issues.apache.org/jira/browse/HIVE-13150
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> The HMS connection leak has been addressed for the session thread, and for 
> task threads when execution runs in parallel. 
> However, when queries execute asynchronously, they run in separate threads, 
> and the HMS connections opened there are not released. 





[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13096:
---
Status: Open  (was: Patch Available)

> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.





[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13096:
---
Attachment: HIVE-13096.03.patch

[~ashutoshc], I updated the patch to recalculate mapJoinConversionPos only if 
necessary.

I checked the plan changes and they seem fine based on the heuristic (the plan 
is akin to its pre-HIVE-11954 state).

> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.03.patch, HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.





[jira] [Updated] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13096:
---
Status: Patch Available  (was: In Progress)

> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.





[jira] [Work started] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13096 started by Jesus Camacho Rodriguez.
--
> Cost to choose side table in MapJoin conversion based on cumulative 
> cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, 
> HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This extends that work so the heuristic is based on cumulative cardinality. 
> In the future, we should choose the side based on the total latency of the input.





[jira] [Commented] (HIVE-13034) Add jdeb plugin to build debian

2016-02-25 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167103#comment-15167103
 ] 

Amareshwari Sriramadasu commented on HIVE-13034:


+1 for permission fix through 
https://issues.apache.org/jira/secure/attachment/12789906/HIVE-13034.1.patch

> Add jdeb plugin to build debian
> ---
>
> Key: HIVE-13034
> URL: https://issues.apache.org/jira/browse/HIVE-13034
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.1.0
>Reporter: Arshad Matin
>Assignee: Arshad Matin
> Fix For: 2.1.0
>
> Attachments: HIVE-13034.1.patch, HIVE-13034.patch
>
>
> It would be nice to also generate a Debian package as part of the build. This 
> can be done by adding the jdeb plugin to the dist profile.
> NO PRECOMMIT TESTS
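For orientation, a jdeb execution wired into a dist profile generally takes the shape below. The plugin coordinates are jdeb's published ones, but the version and packaging details here are placeholders, not the patch's actual values:

```xml
<!-- Illustrative only: version and dataSet entries are placeholders. -->
<profile>
  <id>dist</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.vafer</groupId>
        <artifactId>jdeb</artifactId>
        <version>1.4</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals><goal>jdeb</goal></goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

The permission issue mentioned later in this thread would live in jdeb's dataSet mapper configuration, which controls the file modes inside the generated .deb.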





[jira] [Commented] (HIVE-13034) Add jdeb plugin to build debian

2016-02-25 Thread Arshad Matin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167102#comment-15167102
 ] 

Arshad Matin commented on HIVE-13034:
-

*Testing*

{noformat}
arshad:/usr/local/hive$ ls -ltr
total 508
-rw-r--r-- 1 root root 445081 Feb 25 08:46 RELEASE_NOTES.txt
-rw-r--r-- 1 root root   4353 Feb 25 08:46 README.txt
-rw-r--r-- 1 root root513 Feb 25 08:46 NOTICE
-rw-r--r-- 1 root root  27909 Feb 25 08:46 LICENSE
drwxr-xr-x 4 root root   4096 Feb 25 11:29 scripts
drwxr-xr-x 7 root root   4096 Feb 25 11:29 hcatalog
drwxr-xr-x 4 root root   4096 Feb 25 11:29 examples
drwxr-xr-x 2 root root   4096 Feb 25 11:29 conf
drwxr-xr-x 3 root root   4096 Feb 25 11:29 bin
drwxr-xr-x 4 root root  12288 Feb 25 11:29 lib
arshad:/usr/local/hive$ cd bin/
arshad:/usr/local/hive/bin$ ls -ltr
total 64
-rwxr-xr-x 1 root root  884 Feb 25 08:46 schematool
-rwxr-xr-x 1 root root  832 Feb 25 08:46 metatool
-rwxr-xr-x 1 root root 2278 Feb 25 08:46 hplsql.cmd
-rwxr-xr-x 1 root root 1030 Feb 25 08:46 hplsql
-rwxr-xr-x 1 root root  885 Feb 25 08:46 hiveserver2
-rwxr-xr-x 1 root root 1900 Feb 25 08:46 hive-config.sh
-rwxr-xr-x 1 root root 1584 Feb 25 08:46 hive-config.cmd
-rwxr-xr-x 1 root root 8713 Feb 25 08:46 hive.cmd
-rwxr-xr-x 1 root root 8262 Feb 25 08:46 hive
-rwxr-xr-x 1 root root 2553 Feb 25 08:46 beeline.cmd
-rwxr-xr-x 1 root root 1436 Feb 25 08:46 beeline
drwxr-xr-x 3 root root 4096 Feb 25 11:29 ext
{noformat}


> Add jdeb plugin to build debian
> ---
>
> Key: HIVE-13034
> URL: https://issues.apache.org/jira/browse/HIVE-13034
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.1.0
>Reporter: Arshad Matin
>Assignee: Arshad Matin
> Fix For: 2.1.0
>
> Attachments: HIVE-13034.1.patch, HIVE-13034.patch
>
>
> It would be nice to also generate a Debian package as part of the build. This 
> can be done by adding the jdeb plugin to the dist profile.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-13034) Add jdeb plugin to build debian

2016-02-25 Thread Arshad Matin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Matin updated HIVE-13034:

Status: Patch Available  (was: Reopened)

> Add jdeb plugin to build debian
> ---
>
> Key: HIVE-13034
> URL: https://issues.apache.org/jira/browse/HIVE-13034
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.1.0
>Reporter: Arshad Matin
>Assignee: Arshad Matin
> Fix For: 2.1.0
>
> Attachments: HIVE-13034.1.patch, HIVE-13034.patch
>
>
> It would be nice to also generate a debian as a part of build. This can be 
> done by adding jdeb plugin to dist profile.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-13034) Add jdeb plugin to build debian

2016-02-25 Thread Arshad Matin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Matin updated HIVE-13034:

Attachment: HIVE-13034.1.patch

> Add jdeb plugin to build debian
> ---
>
> Key: HIVE-13034
> URL: https://issues.apache.org/jira/browse/HIVE-13034
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.1.0
>Reporter: Arshad Matin
>Assignee: Arshad Matin
> Fix For: 2.1.0
>
> Attachments: HIVE-13034.1.patch, HIVE-13034.patch
>
>
> It would be nice to also generate a debian as a part of build. This can be 
> done by adding jdeb plugin to dist profile.
> NO PRECOMMIT TESTS





[jira] [Reopened] (HIVE-13034) Add jdeb plugin to build debian

2016-02-25 Thread Arshad Matin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Matin reopened HIVE-13034:
-

Reopening as there is an issue with the permissions. It is a small fix, so I am uploading the 
patch directly here.

> Add jdeb plugin to build debian
> ---
>
> Key: HIVE-13034
> URL: https://issues.apache.org/jira/browse/HIVE-13034
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.1.0
>Reporter: Arshad Matin
>Assignee: Arshad Matin
> Fix For: 2.1.0
>
> Attachments: HIVE-13034.patch
>
>
> It would be nice to also generate a debian as a part of build. This can be 
> done by adding jdeb plugin to dist profile.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-13130) API calls for retrieving primary keys and foreign keys information

2016-02-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13130:
-
Attachment: HIVE-13130.2.patch

Added JDBC calls (draft #2).

>  API calls for retrieving primary keys and foreign keys information
> ---
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of PK/FK implementation in Hive.





[jira] [Commented] (HIVE-13144) HS2 can leak ZK ACL objects when curator retries to create the persistent ephemeral node

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166928#comment-15166928
 ] 

Hive QA commented on HIVE-13144:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789565/HIVE-13144.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7088/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7088/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7088/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7088/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at e9b7348 HIVE-13101: NullPointerException in HiveLexer.g (Sandeep 
via Xuefu)
+ git clean -f -d
Removing data/files/timestamps.txt
Removing 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestTimestampWritableAndColumnVector.java
Removing ql/src/test/queries/clientpositive/vector_interval_arithmetic.q
Removing ql/src/test/results/clientpositive/tez/vector_interval_arithmetic.q.out
Removing ql/src/test/results/clientpositive/tez/vectorized_timestamp.q.out
Removing ql/src/test/results/clientpositive/vector_interval_arithmetic.q.out
Removing 
storage-api/src/java/org/apache/hadoop/hive/common/type/HiveIntervalDayTime.java
Removing 
storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/IntervalDayTimeColumnVector.java
Removing 
storage-api/src/java/org/apache/hive/common/util/IntervalDayTimeUtils.java
Removing 
storage-api/src/test/org/apache/hadoop/hive/ql/exec/vector/TestTimestampColumnVector.java
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at e9b7348 HIVE-13101: NullPointerException in HiveLexer.g (Sandeep 
via Xuefu)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789565 - PreCommit-HIVE-TRUNK-Build

> HS2 can leak ZK ACL objects when curator retries to create the persistent 
> ephemeral node
> 
>
> Key: HIVE-13144
> URL: https://issues.apache.org/jira/browse/HIVE-13144
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-13144.1.patch
>
>
> When the node gets deleted from ZK due to connection loss and curator tries 
> to recreate the node, it might leak ZK ACL.





[jira] [Commented] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862

2016-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166921#comment-15166921
 ] 

Hive QA commented on HIVE-13111:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789569/HIVE-13111.02.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9828 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFUnixTimeStampTimestamp
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testRepeating
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7087/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7087/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789569 - PreCommit-HIVE-TRUNK-Build

> Fix timestamp / interval_day_time wrong results with HIVE-9862 
> ---
>
> Key: HIVE-13111
> URL: https://issues.apache.org/jira/browse/HIVE-13111
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch
>
>
> Fix timestamp / interval_day_time issues discovered when testing the 
> Vectorized Text patch.





[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12935:
-
Attachment: HIVE-12935.5.patch

Named the notification handler threadpool; also includes a fix for the ACL leak in the ZK 
connection.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch, 
> HIVE-12935.4.patch, HIVE-12935.5.patch
>
>
> Existing YARN registry service for cluster membership has to depend on 
> refresh intervals to get the list of instances/daemons that are running in 
> the cluster. Better approach would be replace it with zookeeper based 
> registry service so that custom listeners can be added to update healthiness 
> of daemons in the cluster.  
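
The push-based approach described above can be sketched with a minimal in-memory registry (plain Python with hypothetical class and method names, not the actual Hive/Curator code): listeners are invoked on every membership change instead of polling on a refresh interval.

```python
class DaemonRegistry:
    """Toy push-based registry: listeners are notified on membership
    changes rather than discovering them via periodic refresh."""

    def __init__(self):
        self._daemons = set()
        self._listeners = []

    def add_listener(self, callback):
        # callback(event, daemon) is invoked on every change
        self._listeners.append(callback)

    def register(self, daemon):
        self._daemons.add(daemon)
        self._notify("JOINED", daemon)

    def deregister(self, daemon):
        self._daemons.discard(daemon)
        self._notify("LEFT", daemon)

    def _notify(self, event, daemon):
        for cb in self._listeners:
            cb(event, daemon)


events = []
registry = DaemonRegistry()
registry.add_listener(lambda ev, d: events.append((ev, d)))
registry.register("llap-daemon-1")
registry.deregister("llap-daemon-1")
# events now holds a JOINED followed by a LEFT notification
```

In the real service the JOINED/LEFT events would be driven by ZooKeeper ephemeral-node watches rather than direct method calls.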





[jira] [Commented] (HIVE-13082) Enable constant propagation optimization in query with left semi join

2016-02-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166909#comment-15166909
 ] 

Gopal V commented on HIVE-13082:


The predicate is actually folded to 1=1 because the actual keys don't count.

{code}
select * from a where id IN (select b.id from b) and a.id = 1;
{code}

folds into

{code}
select * from a where id IN (select 1 from b where b.id = 1) and a.id = 1;
{code}

After CBO, it gets rewritten as

{code}
select * from a left semi join b on 1 = 1 where a.id = 1 and b.id = 1;
{code}

and the second constant-folding pass does

{code}
select * from a left semi join b where a.id = 1 and b.id = 1;
{code}

accidentally dropping the ON clause and turning it into a keyless cross-product, 
which turns off the implicit "distinct" injected by the left semi join since 
there's no key anymore.
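
The wrong-result mechanics can be illustrated with a toy relational sketch (plain Python with hypothetical helper names, not Hive code): a left semi join emits each left row at most once, while the keyless cross-product that results from dropping the ON clause repeats it once per right row.

```python
def left_semi_join(left, right, key):
    # Left semi join semantics: emit each left row at most once
    # if any right row matches on the key.
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] in right_keys]

def keyless_cross_product(left, right):
    # What the over-eager constant folding produced: no join key and
    # no implicit distinct, so every left row repeats per right row.
    return [l for l in left for _ in right]

a = [{"id": 1}]
b = [{"id": 1}, {"id": 1}]           # duplicate matches on the right side

print(left_semi_join(a, b, "id"))    # one row, as semi-join semantics require
print(keyless_cross_product(a, b))   # two rows: the wrong-result symptom
```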


> Enable constant propagation optimization in query with left semi join
> -
>
> Key: HIVE-13082
> URL: https://issues.apache.org/jira/browse/HIVE-13082
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13082.1.patch, HIVE-13082.2.patch, 
> HIVE-13082.3.patch, HIVE-13082.branch-1.patch, HIVE-13082.patch
>
>
> Currently constant folding is only allowed for inner or unique join, I think 
> it is also applicable and allowed for left semi join. Otherwise the query 
> like following having multiple joins with left semi joins will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>   at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.(StandardStructObjectInspector.java:109)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78)
>  

[jira] [Updated] (HIVE-13153) SessionID is appended to thread name twice

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13153:
-
Attachment: HIVE-13153.2.patch

Switched the log lines to before the thread rename, per [~gopalv]'s comments.

> SessionID is appended to thread name twice
> --
>
> Key: HIVE-13153
> URL: https://issues.apache.org/jira/browse/HIVE-13153
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch
>
>
> HIVE-12249 added sessionId to thread name. In some cases the sessionId could 
> be appended twice. Example log line
> {code}
> DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 
> 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main]
> {code} 
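
A minimal sketch of one possible guard (illustrative only, not the actual HIVE-13153 patch) is to make the prefixing idempotent by checking whether the session id is already present before renaming the thread:

```python
def with_session_id(thread_name, session_id):
    """Prepend the session id to a thread name idempotently:
    calling this twice must not duplicate the prefix."""
    if thread_name.startswith(session_id + " "):
        return thread_name
    return session_id + " " + thread_name

sid = "6432ec22-9f66-4fa5-8770-488a9d3f0b61"
name = with_session_id("main", sid)
name = with_session_id(name, sid)   # second call is a no-op
assert name == sid + " main"        # no doubled session id
```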





[jira] [Updated] (HIVE-13153) SessionID is appended to thread name twice

2016-02-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13153:
-
Status: Patch Available  (was: Open)

> SessionID is appended to thread name twice
> --
>
> Key: HIVE-13153
> URL: https://issues.apache.org/jira/browse/HIVE-13153
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13153.1.patch
>
>
> HIVE-12249 added sessionId to thread name. In some cases the sessionId could 
> be appended twice. Example log line
> {code}
> DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 
> 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main]
> {code} 




