date:20151211

[jira] [Commented] (HIVE-12609) Remove javaXML serialization

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052626#comment-15052626
 ] 

Hive QA commented on HIVE-12609:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776719/HIVE-12609.2.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_selectDistinctStar
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6317/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6317/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6317/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776719 - PreCommit-HIVE-TRUNK-Build

> Remove javaXML serialization
> 
>
> Key: HIVE-12609
> URL: https://issues.apache.org/jira/browse/HIVE-12609
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12609.1.patch, HIVE-12609.2.patch
>
>
> Kryo is the default serializer, and javaXML-based serialization is rarely 
> used and not well tested. We should remove javaXML serialization and make 
> Kryo the only serialization option.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12551) Fix several kryo exceptions in branch-1

2015-12-11 Thread Feng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Yuan updated HIVE-12551:
-
Attachment: test case.zip

> Fix several kryo exceptions in branch-1
> ---
>
> Key: HIVE-12551
> URL: https://issues.apache.org/jira/browse/HIVE-12551
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: serialization
> Fix For: 1.3.0
>
> Attachments: HIVE-12551.1.patch, test case.zip
>
>
> HIVE-11519, HIVE-12174 and the following exception are all caused by 
> unregistered classes or serializers. HIVE-12175 should have fixed these 
> issues for master branch.
> {code}
> Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.NullPointerException
> Serialization trace:
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> expr (org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor)
> childExpressions 
> (org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterStringColumnBetween)
> conditionEvaluator 
> (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:367)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:276)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1087)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:976)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:990)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:426)
>   ... 27 more
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:3818)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
>

[jira] [Commented] (HIVE-12551) Fix several kryo exceptions in branch-1

2015-12-11 Thread Feng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052521#comment-15052521
 ] 

Feng Yuan commented on HIVE-12551:
--

Hi [~prasanth_j], I have uploaded the test case. Please take a look when you 
have time. Thanks!


[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Attachment: HIVE-12653.patch

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> the Chinese text in the table is garbled, which means 'serialization.encoding' 
> has no effect. The garbled data looks like this:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>
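
The symptom reported above is what one would expect if GBK bytes were decoded with a different character set. The following is a minimal Python sketch of that effect, not Hive or MultiDelimitSerDe code. The sample row and the `decode_row` helper are made up for illustration; only the "|!" delimiter and the GBK encoding come from the report.

```python
# Minimal sketch (not Hive code): decoding a delimited row with the declared
# character set versus a wrong default. The sample row is hypothetical.

FIELD_DELIM = "|!"  # same multi-character delimiter as in the report


def decode_row(raw: bytes, encoding: str = "utf-8") -> list[str]:
    """Decode raw bytes with the given encoding, then split on FIELD_DELIM."""
    text = raw.decode(encoding, errors="replace")
    return text.split(FIELD_DELIM)


raw_row = "9999|!中文".encode("gbk")  # hypothetical GBK-encoded input line

# Honouring the declared encoding recovers the Chinese text.
print(decode_row(raw_row, "gbk"))

# Ignoring it (assuming UTF-8) turns the Chinese bytes into U+FFFD
# replacement characters, while the ASCII field survives.
print(decode_row(raw_row))
```

The ASCII field survives either way, which matches the pattern in the report: digits are readable and only the Chinese text is damaged.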




[jira] [Commented] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052712#comment-15052712
 ] 

Hive QA commented on HIVE-12538:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776744/HIVE-12538.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6318/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6318/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6318/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776744 - PreCommit-HIVE-TRUNK-Build

> After set spark related config, SparkSession never get reused
> -
>
> Key: HIVE-12538
> URL: https://issues.apache.org/jira/browse/HIVE-12538
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
> Attachments: HIVE-12538.1.patch, HIVE-12538.2.patch, 
> HIVE-12538.3.patch, HIVE-12538.4.patch, HIVE-12538.5.patch, HIVE-12538.patch
>
>
> Hive on Spark in yarn-cluster mode.
> After running "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3 
> different YARN applications.
> Two of the YARN applications are in FINISHED & SUCCEEDED state, and one is in 
> RUNNING & UNDEFINED state, waiting for the next work.
> If you submit one more "select count(*) from test", the third one will move to 
> FINISHED & SUCCEEDED state and a new YARN application will start up.
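
The expected behaviour can be sketched as follows. This is an illustration in Python, not Hive's implementation: `Session`, `SessionManager`, the `conf_updated` flag, and `run_queries` are hypothetical names, and the assumption that reuse hinges on a "config changed" flag is a guess at the mechanism, not something the report states.

```python
# Illustrative sketch only: a session manager that opens a new session when
# the configuration changes and otherwise reuses the current one.

class Session:
    _next_id = 0  # stands in for a new YARN application id

    def __init__(self, conf):
        Session._next_id += 1
        self.app_id = Session._next_id
        self.conf = dict(conf)


class SessionManager:
    def __init__(self):
        self.conf = {}
        self.session = None
        self.conf_updated = False  # hypothetical "config changed" flag

    def set(self, key, value):
        self.conf[key] = value
        self.conf_updated = True

    def get_session(self):
        if self.session is None or self.conf_updated:
            self.session = Session(self.conf)
            # If this flag were never cleared, every query would open a new
            # session, which is the symptom described in the report.
            self.conf_updated = False
        return self.session


def run_queries(manager, n):
    """Return the application id used by each of n consecutive queries."""
    return [manager.get_session().app_id for _ in range(n)]


manager = SessionManager()
manager.set("spark.yarn.queue", "QueueA")
print(run_queries(manager, 3))  # all three queries share one application id
```

With the flag reset in place, three consecutive queries share one application, and only a later config change triggers a new one.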




[jira] [Commented] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-12-11 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052727#comment-15052727
 ] 

Xuefu Zhang commented on HIVE-12538:


+1





[jira] [Commented] (HIVE-12615) Do not start spark session when only explain

2015-12-11 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052743#comment-15052743
 ] 

Xuefu Zhang commented on HIVE-12615:


[~nemon], thanks for working on this. I noticed: 

1. A lot of test output needs to be updated. 
2. I'm not sure showing -1 as the number of reducers makes sense. 
SetSparkReducerParallelism has logic for getting the parallelism without 
knowing the executor/memory.

Given these additional changes required, do you think the problem is worth a 
fix? In other words, does it really bother your use case?

> Do not start spark session when only explain 
> -
>
> Key: HIVE-12615
> URL: https://issues.apache.org/jira/browse/HIVE-12615
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-12615.patch
>
>
> When using beeline -e "set hive.execution.engine=spark;explain select 
> count(*) from sometable", it is very slow because a Spark session is started 
> on YARN.
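
The idea behind the request can be sketched as lazy session start-up. The Python below is an illustration under stated assumptions, not Hive code: `LazySparkSession`, `explain`, and `execute` are hypothetical names, and the `-1` reducer count only mirrors the placeholder discussed in the review comment.

```python
# Illustrative sketch only: start the (slow) session on first execution,
# never for a plain EXPLAIN.

class LazySparkSession:
    def __init__(self):
        self.started = False

    def start(self):
        # Stands in for the slow launch of a Spark application on YARN.
        self.started = True


def explain(query, session):
    """Build a plan description without touching the session."""
    # Without a running session the cluster resources are unknown, so the
    # reducer count is shown as a placeholder.
    return f"PLAN for: {query} (reducers: -1)"


def execute(query, session):
    """Run the query, starting the session only when it is needed."""
    if not session.started:
        session.start()
    return f"RESULT of: {query}"


session = LazySparkSession()
print(explain("select count(*) from sometable", session))
print(session.started)  # the session was not started by EXPLAIN
```

The trade-off raised in the review is visible here: a plan built without a session cannot report resource-dependent values such as the reducer count.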




[jira] [Commented] (HIVE-12615) Do not start spark session when only explain

2015-12-11 Thread Nemon Lou (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052779#comment-15052779
 ] 

Nemon Lou commented on HIVE-12615:
--

I didn't expect so many test cases to be affected. Sorry for not having run 
these tests on my own.
I think it's better to mark this improvement as Won't Fix, since explain is a 
rare use case for most users. 





[jira] [Commented] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it will get a wrong result

2015-12-11 Thread Xiaowei Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052906#comment-15052906
 ] 

Xiaowei Wang commented on HIVE-12541:
-

Maybe https://issues.apache.org/jira/browse/HIVE-12652 is clearer. Thanks for 
your attention!

> Using CombineHiveInputFormat with the origin inputformat  
> SymbolicTextInputFormat  ,it will get a wrong result
> --
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table definition:
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains one path, and the path contains a regex.
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns the wrong result: 0.
> I also added a test case in the patch.
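
The failure mode can be sketched as follows: a pattern from the link file has to be expanded into concrete paths before splits are computed. The Python below is not Hive code. `expand_link_targets` and the listed paths are hypothetical, and the sketch assumes the pattern is a glob-style wildcard, as the `symlink*` example suggests.

```python
# Illustrative sketch only: expand wildcard entries from a link file into
# the concrete paths they match.
from fnmatch import fnmatchcase


def expand_link_targets(link_lines, existing_paths):
    """Return every existing path matched by any pattern in the link file."""
    targets = []
    for line in link_lines:
        pattern = line.strip()
        if not pattern:
            continue  # skip blank lines in the link file
        targets.extend(sorted(p for p in existing_paths
                              if fnmatchcase(p, pattern)))
    return targets


# Hypothetical directory listing.
paths = [
    "viewfs://nsx/tmp/symlink_a",
    "viewfs://nsx/tmp/symlink_b",
    "viewfs://nsx/tmp/other",
]

print(expand_link_targets([" viewfs://nsx/tmp/symlink* "], paths))
```

If the pattern were treated as a literal path, no file would match and the query would see no input, which is consistent with the reported count of 0.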




[jira] [Commented] (HIVE-12652) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052918#comment-15052918
 ] 

Hive QA commented on HIVE-12652:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776990/HIVE-12652.0.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9879 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6319/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6319/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6319/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776990 - PreCommit-HIVE-TRUNK-Build

> SymbolicTextInputFormat should supports the  path with regex 
> -
>
> Key: HIVE-12652
> URL: https://issues.apache.org/jira/browse/HIVE-12652
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Fix For: 1.2.1
>
> Attachments: HIVE-12652.0.patch
>
>
> 1. In fact, SymbolicTextInputFormat supports paths with a regex. I added some 
> test SQL. 
> 2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
> resolve a path with a regex, so it returns a wrong result. I fixed the problem.




[jira] [Commented] (HIVE-12615) Do not start spark session when only explain

2015-12-11 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052999#comment-15052999
 ] 

Xuefu Zhang commented on HIVE-12615:


Okay. It can be done, but just takes some extra work. Let's close it for now 
and revisit if there are more complaints about this.





[jira] [Commented] (HIVE-12603) Add config to block queries that scan > N number of partitions

2015-12-11 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052980#comment-15052980
 ] 

Thejas M Nair commented on HIVE-12603:
--

Note that HIVE-9499 prevents this from being used as a general config.


> Add config to block queries that scan > N number of partitions 
> ---
>
> Key: HIVE-12603
> URL: https://issues.apache.org/jira/browse/HIVE-12603
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Query Planning
>Affects Versions: 2.0.0
>Reporter: Lenni Kuff
>
> Strict mode is useful for blocking queries that load all partitions, but it's 
> still possible to put significant load on the HMS for queries that scan a 
> large number of partitions. It would be useful to add a config that provides a 
> hard limit on the number of partitions scanned by a query.
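
A hard limit of this kind could be checked as sketched below. This is a Python illustration, not Hive code: `check_partition_limit`, `PartitionLimitExceeded`, and the convention that a negative limit disables the check are assumptions, not part of the issue.

```python
# Illustrative sketch only: reject a query whose partition count exceeds a
# configured hard limit.

class PartitionLimitExceeded(Exception):
    """Raised when a query would scan more partitions than allowed."""


def check_partition_limit(partitions, max_partitions):
    """Return the partition count, or raise if it exceeds max_partitions.

    A negative max_partitions disables the check (an assumed convention).
    """
    count = len(partitions)
    if max_partitions >= 0 and count > max_partitions:
        raise PartitionLimitExceeded(
            f"query scans {count} partitions, limit is {max_partitions}"
        )
    return count


print(check_partition_limit(["ds=2015-12-10", "ds=2015-12-11"], 2))
```

Running the check before partition metadata is fetched in bulk is what would spare the metastore. Where such a hook belongs in the planner is left open here.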




[jira] [Updated] (HIVE-12635) Hive should return the latest hbase cell timestamp as the row timestamp value

2015-12-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12635:

Hadoop Flags: Incompatible change
Release Note: After this change, the timestamp of the row will be the 
latest timestamp of all the cells in hbase for that row, instead of the 
timestamp of the first cell.  

> Hive should return the latest hbase cell timestamp as the row timestamp value
> -
>
> Key: HIVE-12635
> URL: https://issues.apache.org/jira/browse/HIVE-12635
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: backward-incompatible
> Attachments: HIVE-12635.patch
>
>
> When Hive talks to HBase and maps the HBase timestamp field to a Hive column, 
> Hive seems to return the first cell's timestamp instead of the latest one as 
> the timestamp value. 
> It makes sense to return the latest timestamp, since adding the latest cell 
> can be considered an update to the row. 
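
The difference between the two behaviours can be sketched in a few lines of Python. This is not the HBase handler's code. The `(column, timestamp)` tuples and both helper functions are hypothetical stand-ins for the cells of one row.

```python
# Illustrative sketch only: first-cell timestamp versus latest-cell timestamp
# for a single row.

def row_timestamp_first(cells):
    """Old behaviour as described: timestamp of the first cell."""
    return cells[0][1]


def row_timestamp_latest(cells):
    """Proposed behaviour: the most recent timestamp among all cells."""
    return max(ts for _, ts in cells)


# Hypothetical row: column "cf:b" was updated after the row was first written.
cells = [("cf:a", 100), ("cf:b", 250), ("cf:c", 180)]

print(row_timestamp_first(cells))
print(row_timestamp_latest(cells))
```

Because the two functions disagree whenever a later cell is newer than the first, existing queries that rely on the old value would see different results.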




[jira] [Updated] (HIVE-12652) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12652:

Summary: SymbolicTextInputFormat should supports the  path with regex   
(was: SymbolicTextInputFormat should supports the  path with regex  ,especially 
used in  CombineHiveInputFormat .)

> SymbolicTextInputFormat should supports the  path with regex 
> -
>
> Key: HIVE-12652
> URL: https://issues.apache.org/jira/browse/HIVE-12652
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Fix For: 1.2.1
>
> Attachments: HIVE-12652.0.patch
>
>
> 1. In fact, SymbolicTextInputFormat supports paths with a regex. I added some 
> test SQL. 
> 2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
> resolve a path with a regex, so it returns a wrong result.




[jira] [Updated] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it cannot resolve the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Summary: Using CombineHiveInputFormat with the origin inputformat  
SymbolicTextInputFormat  ,it cannot resolve the path with regex  (was: Using 
CombineHiveInputFormat with the origin inputformat  SymbolicTextInputFormat  
,it will get a wrong result)

> Using CombineHiveInputFormat with the origin inputformat  
> SymbolicTextInputFormat  ,it cannot resolve the path with regex
> -
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table desc :
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex (the * wildcard).
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns the wrong result: 0.
> I also added a test case in the patch.
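
To illustrate the behaviour the description above expects, here is a minimal Python sketch of expanding the path patterns in a link file into concrete files before any splits are combined. It is not Hive's implementation: the function name `expand_symlink_targets` and the in-memory file listing are hypothetical, and `fnmatch` wildcard matching is only a stand-in for whatever pattern resolution the patch performs.

```python
from fnmatch import fnmatchcase


def expand_symlink_targets(link_lines, existing_files):
    """Expand each path pattern from a link file into concrete file paths.

    link_lines: lines read from the link file (hypothetical input).
    existing_files: paths assumed to exist on the file system (hypothetical).
    """
    resolved = []
    for line in link_lines:
        pattern = line.strip()
        if not pattern:
            continue  # skip blank lines
        # Keep every known file that matches the pattern, in sorted order.
        resolved.extend(sorted(f for f in existing_files if fnmatchcase(f, pattern)))
    return resolved


files = [
    "viewfs://nsx/tmp/symlink_a.txt",
    "viewfs://nsx/tmp/symlink_b.txt",
    "viewfs://nsx/tmp/other.txt",
]
targets = expand_symlink_targets([" viewfs://nsx/tmp/symlink* "], files)
print(targets)
```

If the pattern were passed through unexpanded, nothing would match a literal file named `symlink*`, which is consistent with the count of 0 reported above.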



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12635) Hive should return the latest hbase cell timestamp as the row timestamp value

2015-12-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12635:

Labels: backward-incompatible  (was: )

> Hive should return the latest hbase cell timestamp as the row timestamp value
> -
>
> Key: HIVE-12635
> URL: https://issues.apache.org/jira/browse/HIVE-12635
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: backward-incompatible
> Attachments: HIVE-12635.patch
>
>
> When Hive reads from HBase and maps the HBase timestamp field to a Hive column, 
> Hive seems to return the first cell's timestamp instead of the latest one as the 
> timestamp value. 
> It makes sense to return the latest timestamp, since adding a new cell can 
> be considered an update to the row.
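
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration under assumptions: the `(column, timestamp, value)` tuple layout and the name `row_timestamp` are hypothetical and do not reflect the HBase handler's actual data structures.

```python
def row_timestamp(cells):
    """Return the latest timestamp among a row's cells, or None for an empty row.

    cells: list of (column, timestamp_ms, value) tuples (hypothetical layout).
    """
    if not cells:
        return None
    return max(ts for _, ts, _ in cells)


row = [
    ("cf:a", 1000, "x"),
    ("cf:b", 3000, "y"),  # most recently written cell
    ("cf:c", 2000, "z"),
]
first_cell_ts = row[0][1]       # what the description says is returned today
latest_ts = row_timestamp(row)  # what the description proposes
print(first_cell_ts, latest_ts)
```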



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12652) SymbolicTextInputFormat should supports the path with regex ,especially used in CombineHiveInputFormat .

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12652:

Summary: SymbolicTextInputFormat should supports the  path with regex  
,especially used in  CombineHiveInputFormat .  (was: SymbolicTextInputFormat 
should supports the  path with regex  ,especially using CombineHiveInputFormat 
.Add test sql .)

> SymbolicTextInputFormat should supports the  path with regex  ,especially 
> used in  CombineHiveInputFormat .
> ---
>
> Key: HIVE-12652
> URL: https://issues.apache.org/jira/browse/HIVE-12652
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Fix For: 1.2.1
>
> Attachments: HIVE-12652.0.patch
>
>
> 1. In fact, SymlinkTextInputFormat supports paths containing a regex. I added some 
> test SQL. 
> 2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
> resolve a path containing a regex, so it returns a wrong result.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it will get a wrong result

2015-12-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052879#comment-15052879
 ] 

Aihua Xu commented on HIVE-12541:
-

Thanks [~wisgood] for the clarification. I changed the type to improvement and 
removed "fixed version".

The patch looks good to me. +1.

> Using CombineHiveInputFormat with the origin inputformat  
> SymbolicTextInputFormat  ,it will get a wrong result
> --
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table description:
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex (the * wildcard).
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns the wrong result: 0.
> I also added a test case in the patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12620) Misc improvement to Acid module

2015-12-11 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052899#comment-15052899
 ] 

Eugene Koifman commented on HIVE-12620:
---

All failed tests have age > 5.

> Misc improvement to Acid module
> ---
>
> Key: HIVE-12620
> URL: https://issues.apache.org/jira/browse/HIVE-12620
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12620.patch
>
>
> # DbLockManager.unlock() - if this fails (due to no such lock, in turn due to 
> timeout) the lock is not removed from DbLockManager internal tracking
> # Add logic to DbLockManager to detect if there is an attempt to interleave 
> transactions or locks from different statements for read-only auto commit mode
> # TxnHandler.checkLock() can use 1 connection instead of 2
> # TxnHandler.timeOutLocks() - refactor so that it can log which locks were 
> expired (simplifies debugging)
> # TxnHandler#getTxnIdFromLockId() - include lock id if it's not found
> # TxnHandler#checkRetryable() - log exception it saw
> # TxnHandler.lock() - throw new MetaException("Couldn't find a lock we just 
> created!"); - include lockid
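
The first item in the list above (a failed unlock leaving a stale entry in the lock manager's internal tracking) can be sketched as follows. This is a hedged illustration, not the patch: the class `LockTracker`, its `held` set, and the `backend_unlock` callable are all hypothetical names.

```python
class LockTracker:
    """Tracks held lock ids locally and forgets them even if unlock fails."""

    def __init__(self, backend_unlock):
        # backend_unlock: callable that releases the lock remotely and may raise
        # (hypothetical stand-in for the metastore call).
        self._unlock = backend_unlock
        self.held = set()

    def add(self, lock_id):
        self.held.add(lock_id)

    def unlock(self, lock_id):
        try:
            self._unlock(lock_id)
        finally:
            # Drop the id from local tracking whether or not the backend call
            # succeeded, e.g. when the lock already timed out on the server.
            self.held.discard(lock_id)


def timed_out_unlock(lock_id):
    raise RuntimeError(f"no such lock: {lock_id}")


tracker = LockTracker(timed_out_unlock)
tracker.add(42)
try:
    tracker.unlock(42)
except RuntimeError as err:
    print("unlock failed:", err)
print(tracker.held)
```

The `finally` clause is the design choice being illustrated: the error still propagates to the caller, but the local bookkeeping no longer claims a lock that the server has already expired.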



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12652) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12652:

Description: 
1. In fact, SymlinkTextInputFormat supports paths containing a regex. I added some 
test SQL. 
2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
resolve a path containing a regex, so it returns a wrong result. I fixed the problem.



  was:
1. In fact, SymlinkTextInputFormat supports paths containing a regex. I added some 
test SQL. 
2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
resolve a path containing a regex, so it returns a wrong result.



> SymbolicTextInputFormat should supports the  path with regex 
> -
>
> Key: HIVE-12652
> URL: https://issues.apache.org/jira/browse/HIVE-12652
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Fix For: 1.2.1
>
> Attachments: HIVE-12652.0.patch
>
>
> 1. In fact, SymlinkTextInputFormat supports paths containing a regex. I added some 
> test SQL. 
> 2. However, when CombineHiveInputFormat is used to merge small files, it cannot 
> resolve a path containing a regex, so it returns a wrong result. I fixed the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it cannot resolve the path with regex

2015-12-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052955#comment-15052955
 ] 

Aihua Xu commented on HIVE-12541:
-

[~wisgood] You can edit this jira, like changing the title to 
"SymbolicTextInputFormat should supports the path with regex". If you can add 
more tests in this jira as Chaoyu mentioned, that will be great. We can dup 
HIVE-12652 to this one.

> Using CombineHiveInputFormat with the origin inputformat  
> SymbolicTextInputFormat  ,it cannot resolve the path with regex
> -
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table description:
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex (the * wildcard).
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns the wrong result: 0.
> I also added a test case in the patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053620#comment-15053620
 ] 

Prasanth Jayachandran commented on HIVE-12657:
--

[~pxiong] and [~ashutoshc] fyi..

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when I ran with java version 
> "1.7.0_55" and java version "1.8.0_60"
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66val_66  66  val_66  66  val_66  2008-04-08  11
> < 98val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66val_66  2008-04-08  11  66  val_66  66  val_66
> > 98val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}
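
In the diff above, each pair of differing rows appears to contain the same field values in a different column order. A quick way to check that for any pair of rows is sketched below in Python. The helper name `same_fields_ignoring_order` is hypothetical, and whitespace splitting is an assumption about how the fields are delimited.

```python
def same_fields_ignoring_order(left_row, right_row):
    """Return True if both rows hold the same field values, in any column order."""
    return sorted(left_row.split()) == sorted(right_row.split())


jdk7_row = "146 val_146 146 val_146 146 val_146 2008-04-08 11"
jdk8_row = "146 val_146 2008-04-08 11 146 val_146 146 val_146"
print(same_fields_ignoring_order(jdk7_row, jdk8_row))
```

A check like this only shows that the values match. It says nothing about why the column order changes between JDK versions.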



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12632) LLAP: don't use IO elevator for ACID tables

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12632:

Attachment: HIVE-12632.01.patch

The patch that actually works (the test works with the LLAP IO init fixed the 
way it's done in the blocking JIRA). 

> LLAP: don't use IO elevator for ACID tables 
> 
>
> Key: HIVE-12632
> URL: https://issues.apache.org/jira/browse/HIVE-12632
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12632.01.patch, HIVE-12632.patch
>
>
> Until HIVE-12631 is fixed, we need to avoid ACID tables in IO elevator. Right 
> now, a FileNotFound error is thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12659) LLAP should detect all nodes down state and stop issuing queries

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053772#comment-15053772
 ] 

Prasanth Jayachandran commented on HIVE-12659:
--

[~sseth] and [~gopalv] fyi..

> LLAP should detect all nodes down state and stop issuing queries
> 
>
> Key: HIVE-12659
> URL: https://issues.apache.org/jira/browse/HIVE-12659
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>
> I ran a simple query with 1 task in LLAP and, for some reason, the LLAP daemon was 
> down (all-nodes-down scenario). The query was still submitted repeatedly to the 
> daemon and killed by the Tez AM indefinitely. A single task was killed over 20 
> times and I had to Ctrl+C. We need to detect all-nodes-down scenarios (using 
> ZooKeeper?), notify the client of the scenario, and fail early. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12632) LLAP: don't use IO elevator for ACID tables

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053594#comment-15053594
 ] 

Sergey Shelukhin commented on HIVE-12632:
-

Hmm, it doesn't look like ORC split generation is getting called in the test.

> LLAP: don't use IO elevator for ACID tables 
> 
>
> Key: HIVE-12632
> URL: https://issues.apache.org/jira/browse/HIVE-12632
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12632.patch
>
>
> Until HIVE-12631 is fixed, we need to avoid ACID tables in IO elevator. Right 
> now, a FileNotFound error is thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12609) Remove javaXML serialization

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053637#comment-15053637
 ] 

Prasanth Jayachandran commented on HIVE-12609:
--

Committed to master and cherrypicked to branch-2.0

> Remove javaXML serialization
> 
>
> Key: HIVE-12609
> URL: https://issues.apache.org/jira/browse/HIVE-12609
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12609.1.patch, HIVE-12609.2.patch
>
>
> We use kryo as the default serializer; javaXML-based serialization is not used 
> in many places and is also not well tested. We should remove javaXML 
> serialization and make kryo the only serialization option.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12659) LLAP should detect all nodes down state and stop issuing queries

2015-12-11 Thread Siddharth Seth (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053836#comment-15053836
 ] 

Siddharth Seth commented on HIVE-12659:
---

This is related to the jira which attempts to detect instances of an LLAP 
cluster going down.
Ideally, we should be able to get enough information from the registry in 
Zookeeper to make a decision about whether to continuously attempt to run the 
query, or exit.
Alternately, we can start tracking the status of individual nodes to decide 
that an LLAP cluster is in an 'unhealthy' state.
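
A rough Python sketch of the fail-early idea discussed above: before each attempt, consult a registry of live daemons and stop instead of resubmitting when it is empty. Everything here is an assumption for illustration. `run_with_fail_fast`, `AllNodesDownError`, and the `live_nodes` callable are hypothetical names, and the registry lookup is a plain function rather than a real ZooKeeper client.

```python
class AllNodesDownError(RuntimeError):
    """Raised when no daemon is registered to run the task."""


def run_with_fail_fast(submit, live_nodes, max_attempts=3):
    """Submit a task, failing early if no nodes are alive.

    submit: callable returning True on success (hypothetical).
    live_nodes: callable returning the currently registered nodes
                (hypothetical stand-in for a registry lookup).
    Returns the attempt number on which the task succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if not live_nodes():
            # No point retrying: nothing can pick up the task.
            raise AllNodesDownError("no live daemons registered; not retrying")
        if submit():
            return attempt
    raise RuntimeError(f"task failed after {max_attempts} attempts")


try:
    run_with_fail_fast(submit=lambda: False, live_nodes=lambda: [])
except AllNodesDownError as err:
    print("failed early:", err)
```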

> LLAP should detect all nodes down state and stop issuing queries
> 
>
> Key: HIVE-12659
> URL: https://issues.apache.org/jira/browse/HIVE-12659
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>
> I ran a simple query with 1 task in LLAP and, for some reason, the LLAP daemon was 
> down (all-nodes-down scenario). The query was still submitted repeatedly to the 
> daemon and killed by the Tez AM indefinitely. A single task was killed over 20 
> times and I had to Ctrl+C. We need to detect all-nodes-down scenarios (using 
> ZooKeeper?), notify the client of the scenario, and fail early. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-11 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Attachment: HIVE-12643.1.patch

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12632) LLAP: don't use IO elevator for ACID tables

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053661#comment-15053661
 ] 

Sergey Shelukhin commented on HIVE-12632:
-

[~prasanth_j] can you review? Unfortunately I cannot create an RB, there's some 
bug and it won't accept any patch

> LLAP: don't use IO elevator for ACID tables 
> 
>
> Key: HIVE-12632
> URL: https://issues.apache.org/jira/browse/HIVE-12632
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12632.01.patch, HIVE-12632.patch
>
>
> Until HIVE-12631 is fixed, we need to avoid ACID tables in IO elevator. Right 
> now, a FileNotFound error is thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11890) Create ORC module

2015-12-11 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053795#comment-15053795
 ] 

Owen O'Malley commented on HIVE-11890:
--

The Spark tests and qfile tests were failing on other patches.

I ran the following tests (as well as all of the ORC unit and q file tests):

org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping

on my local box and they passed.

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11890) Create ORC module

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053797#comment-15053797
 ] 

Sergey Shelukhin commented on HIVE-11890:
-

Hmm... how come RecordReaderImpl, ReaderImpl etc didn't move? Is there a 
follow-up JIRA to watch?

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053813#comment-15053813
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777180/HIVE-12643.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 9894 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_tblproperty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partition_diff_num_cols
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6322/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6322/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6322/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777180 - PreCommit-HIVE-TRUNK-Build

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11890) Create ORC module

2015-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053832#comment-15053832
 ] 

ASF GitHub Bot commented on HIVE-11890:
---

Github user omalley closed the pull request at:

https://github.com/apache/hive/pull/54


> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12397) LLAP: add security to daemon-hosted shuffle

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12397:

Assignee: (was: Sergey Shelukhin)

> LLAP: add security to daemon-hosted shuffle
> ---
>
> Key: HIVE-12397
> URL: https://issues.apache.org/jira/browse/HIVE-12397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12609) Remove javaXML serialization

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053623#comment-15053623
 ] 

Prasanth Jayachandran commented on HIVE-12609:
--

The selectDistinctStar.q failure is unrelated to this patch. I tested without 
this patch and it still produces a diff. Filed HIVE-12657 for it. The other test 
failures are unrelated. 

> Remove javaXML serialization
> 
>
> Key: HIVE-12609
> URL: https://issues.apache.org/jira/browse/HIVE-12609
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12609.1.patch, HIVE-12609.2.patch
>
>
> We use kryo as the default serializer; javaXML-based serialization is not used 
> in many places and is also not well tested. We should remove javaXML 
> serialization and make kryo the only serialization option.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12397) LLAP: add security to daemon-hosted shuffle

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053580#comment-15053580
 ] 

Sergey Shelukhin commented on HIVE-12397:
-

[~sseth] Removed myself for now.. looking at it, but it may be faster for you 
to take a look

> LLAP: add security to daemon-hosted shuffle
> ---
>
> Key: HIVE-12397
> URL: https://issues.apache.org/jira/browse/HIVE-12397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12597) LLAP - allow using elevator without cache

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12597:

Attachment: HIVE-12597.01.patch

Resubmitting for HiveQA

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12597.01.patch, HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12633) LLAP: package included serde jars

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12633:

Attachment: HIVE-12633.02.patch

Same patch for HiveQA

> LLAP: package included serde jars
> -
>
> Key: HIVE-12633
> URL: https://issues.apache.org/jira/browse/HIVE-12633
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12633.01.patch, HIVE-12633.02.patch, 
> HIVE-12633.patch
>
>
> Some SerDes like JSONSerde are not packaged with LLAP. One cannot localize 
> jars on the daemon (due to security consideration if nothing else), so we 
> should package them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-12-11 Thread Laljo John Pullokkaran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110.35.patch

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, 
> HIVE-11110.13.patch, HIVE-11110.14.patch, HIVE-11110.15.patch, 
> HIVE-11110.16.patch, HIVE-11110.17.patch, HIVE-11110.18.patch, 
> HIVE-11110.19.patch, HIVE-11110.2.patch, HIVE-11110.20.patch, 
> HIVE-11110.21.patch, HIVE-11110.22.patch, HIVE-11110.23.patch, 
> HIVE-11110.24.patch, HIVE-11110.25.patch, HIVE-11110.26.patch, HIVE-11110.27, 
> HIVE-11110.27.patch, HIVE-11110.28.patch, HIVE-11110.29.patch, 
> HIVE-11110.30.patch, HIVE-11110.31.patch, HIVE-11110.32.patch, 
> HIVE-11110.33.patch, HIVE-11110.34.patch, HIVE-11110.35.patch, 
> HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, 
> HIVE-11110.7.patch, HIVE-11110.8.patch, HIVE-11110.9.patch, 
> HIVE-11110.91.patch, HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter “filterExpr: 
> ss_sold_date_sk is not null”, which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
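
To illustrate why the extra partition's statistics matter, here is a minimal sketch using the textbook equi-join estimate |L| * |R| / max(NDV_L, NDV_R). The formula, the function name, and every number below are illustrative assumptions; this is not Hive's actual estimator and these are not real table statistics.

```python
def estimate_join_rows(rows_left, rows_right, ndv_left, ndv_right):
    """Textbook equi-join estimate: |L| * |R| / max(NDV_L, NDV_R)."""
    return rows_left * rows_right // max(ndv_left, ndv_right)

# Hypothetical statistics for the join ss_item_sk = sr_item_sk.
SR_ROWS, SR_NDV = 50_000, 1_000

# store_sales stats when the "is not null" filter prunes the default partition.
ss_filtered = {"rows": 1_000_000, "ndv": 1_000}
# store_sales stats when __HIVE_DEFAULT_PARTITION__ is (wrongly) included:
# more rows and, by assumption, a larger merged NDV for the join column.
ss_unfiltered = {"rows": 1_050_000, "ndv": 5_000}

filtered_estimate = estimate_join_rows(
    ss_filtered["rows"], SR_ROWS, ss_filtered["ndv"], SR_NDV)
unfiltered_estimate = estimate_join_rows(
    ss_unfiltered["rows"], SR_ROWS, ss_unfiltered["ndv"], SR_NDV)

print(filtered_estimate)    # 50000000
print(unfiltered_estimate)  # 10500000
```

With these made-up numbers the inflated NDV cuts the estimated join size to roughly a fifth, which is the kind of shift that could change a join order.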



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2015-12-11 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053882#comment-15053882
 ] 

Pengcheng Xiong commented on HIVE-12657:


[~prasanth_j], select distinct star is just doing a group by *. If it has a 
problem, then union distinct should also have a problem, as it is based on 
select distinct *. So only selectDistinctStar.q has the problem, and no other 
tests? Thanks.

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when I ran with java version 
> "1.7.0_55" and java version "1.8.0_60"
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66    val_66  66  val_66  66  val_66  2008-04-08  11
> < 98    val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66    val_66  2008-04-08  11  66  val_66  66  val_66
> > 98    val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}
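
The diff suggests that each pair of rows holds the same values in a different column order. A minimal sketch of how one might confirm that for a pair of rows follows. The helper name is hypothetical, the sample rows are copied from the diff, and the fields are assumed to be whitespace-separated.

```python
from collections import Counter

def is_column_permutation(row_a, row_b):
    """True if both rows hold the same field values in a different order."""
    fields_a, fields_b = row_a.split(), row_b.split()
    return fields_a != fields_b and Counter(fields_a) == Counter(fields_b)

# One "<" row and its matching ">" row from the diff.
left_row = "146 val_146 146 val_146 146 val_146 2008-04-08 11"
right_row = "146 val_146 2008-04-08 11 146 val_146 146 val_146"

print(is_column_permutation(left_row, right_row))  # True
print(is_column_permutation(left_row, left_row))   # False
```

If every differing pair passes this check, the two JDKs agree on the data and disagree only on column order.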




[jira] [Commented] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053918#comment-15053918
 ] 

Sergey Shelukhin commented on HIVE-12577:
-

Can you add some comments? I don't quite understand what the patch is actually 
doing.

> NPE in LlapTaskCommunicator when unregistering containers
> -
>
> Key: HIVE-12577
> URL: https://issues.apache.org/jira/browse/HIVE-12577
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, 
> HIVE-12577.1.wip.txt
>
>
> {code}
> 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586)
>     at org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188)
>     at org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
>     at org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
>     at org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:745)
> 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
>     at org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
>     at org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
>     at org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
>

[jira] [Updated] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-11 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12570:
-
Attachment: HIVE-12570.4.patch

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace:
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not in GROUP BY key 'key'
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Instead of the above error message, it would be more appropriate to throw the 
> error below:
> ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid function 'average'
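
One way to get the more helpful message is to validate the function name before checking GROUP BY membership. The sketch below shows only that ordering. The function name, the set of known aggregates, and the message strings are hypothetical simplifications; this is not Hive's SemanticAnalyzer logic or the content of the attached patch.

```python
# Illustrative subset of aggregate function names (an assumption).
KNOWN_AGGREGATES = {"avg", "count", "max", "min", "sum"}

def check_select_expression(func_name, column, group_by_keys):
    """Return an error message for one select expression, or None if valid.

    func_name is None for a bare column reference.
    """
    if func_name is not None:
        # Report an unknown function first, before any GROUP BY check.
        if func_name.lower() not in KNOWN_AGGREGATES:
            return f"Invalid function '{func_name}'"
        return None  # a known aggregate may wrap any column
    if column not in group_by_keys:
        return f"Expression not in GROUP BY key '{column}'"
    return None

print(check_select_expression("average", "key", {"value"}))
print(check_select_expression("avg", "key", {"value"}))
print(check_select_expression(None, "key", {"value"}))
```

Under this ordering, `average(key)` is rejected as an invalid function, while a bare `key` still yields the GROUP BY error.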




[jira] [Commented] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053913#comment-15053913
 ] 

Xiaowei Wang commented on HIVE-12541:
-

OK, I have modified the name of the JIRA and put up a new patch, which contains 
more tests.

> SymbolicTextInputFormat should supports the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch
>
>
> 1. SymlinkTextInputFormat does in fact support paths that contain a regex. I 
> added some test SQL for this.
> 2. However, when CombineHiveInputFormat is used to combine input files, it 
> cannot resolve a path that contains a regex, so the query returns a wrong 
> result. I give an example below and fix the problem.
> Table description:
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex.
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0
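
The pattern in the link file reads like a glob-style wildcard. As a rough illustration of what resolving it means, the sketch below expands each line of a link file against a list of candidate paths. The helper name and the candidate paths are hypothetical, and the matching uses Python's `fnmatch` rules, which are only an approximation of whatever matching Hive or Hadoop applies.

```python
from fnmatch import fnmatchcase

def expand_symlink_targets(link_lines, existing_paths):
    """Expand each non-empty link-file line, treated as a glob, into paths."""
    targets = []
    for line in link_lines:
        pattern = line.strip()
        if not pattern:
            continue
        # Note: fnmatch's '*' also matches '/', unlike most filesystem globs.
        targets.extend(sorted(p for p in existing_paths
                              if fnmatchcase(p, pattern)))
    return targets

link_file = [" viewfs://nsx/tmp/symlink* "]
candidates = [                     # hypothetical directory listing
    "viewfs://nsx/tmp/symlink2",
    "viewfs://nsx/tmp/other",
    "viewfs://nsx/tmp/symlink1",
]

print(expand_symlink_targets(link_file, candidates))
# ['viewfs://nsx/tmp/symlink1', 'viewfs://nsx/tmp/symlink2']
```

A reader that instead treats the line as a literal path would find no such file, which would be consistent with a count of 0.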




[jira] [Commented] (HIVE-12597) LLAP - allow using elevator without cache

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054067#comment-15054067
 ] 

Hive QA commented on HIVE-12597:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777213/HIVE-12597.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6326/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6326/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6326/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6326/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   433e506..c692e2e  branch-2.0 -> origin/branch-2.0
   3e3d966..b187d42  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 3e3d966 HIVE-12609: Remove javaXML serialization (Prasanth 
Jayachandran reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterAggregateTransposeRule.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTSTransposeRule.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdPredicates.java
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 4 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at b187d42 HIVE-12648 : LLAP IO was disabled in CliDriver by 
accident (and tests are broken) (Sergey Shelukhin, reviewed by Prasanth 
Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}
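
For context on the "p0, p1, or p2" message: the number is the count of leading path components stripped from file names in the patch before they are looked up in the source tree. The sketch below shows one simplified way to probe for a workable strip level by checking path existence only. The function names and sample paths are hypothetical, and this is not the logic of smart-apply-patch.sh; a patch can also fail at every level because its hunks conflict with newer commits.

```python
def strip_components(path, level):
    """Drop `level` leading components from a '/'-separated path."""
    parts = path.split("/")
    return "/".join(parts[level:]) if level < len(parts) else None

def find_strip_level(patch_paths, repo_files, max_level=2):
    """Return the first strip level at which every patched file exists."""
    for level in range(max_level + 1):
        stripped = [strip_components(p, level) for p in patch_paths]
        if all(s in repo_files for s in stripped):
            return level
    return None  # no level from p0 to p<max_level> matches

repo_files = {"ql/src/Foo.java", "pom.xml"}        # hypothetical tree
git_style_patch = ["a/ql/src/Foo.java", "a/pom.xml"]
unrelated_patch = ["x/y/z/pom.xml"]

print(find_strip_level(git_style_patch, repo_files))  # 1
print(find_strip_level(unrelated_patch, repo_files))  # None
```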

This message is automatically generated.

ATTACHMENT ID: 12777213 - PreCommit-HIVE-TRUNK-Build

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12597.01.patch, HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.




[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053890#comment-15053890
 ] 

Prasanth Jayachandran commented on HIVE-12657:
--

I haven't tried running union distinct on jdk 1.8. I encountered this 
specifically when it showed up as a test failure. It could happen for other 
cases as well. From a high level, I see that the order of the columns has 
changed in the group by, but I am not sure. HashMap ordering is known to differ 
between jdk 1.7 and 1.8, but I don't know why that would affect column order in 
group by.





[jira] [Updated] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Attachment: HIVE-12541.2.patch





[jira] [Updated] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Description: 
1. SymlinkTextInputFormat does in fact support paths that contain a regex. I 
added some test SQL for this.
2. However, when CombineHiveInputFormat is used to combine input files, it 
cannot resolve a path that contains a regex, so the query returns a wrong 
result. I give an example below and fix the problem.

Table description:
{noformat}
CREATE External TABLE `symlink_text_input_format`(
  `key` string,
  `value` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
{noformat}
There is a link file in the directory 
'/user/hive/warehouse/symlink_text_input_format'. The content of the link 
file is:
{noformat}
 viewfs://nsx/tmp/symlink* 
{noformat}
It contains a single path, and that path contains a regex.


Execute the SQL:
{noformat}
set hive.rework.mapredwork = true ;
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.min.split.size.per.rack= 0 ;
set mapred.min.split.size.per.node= 0 ;
set mapred.max.split.size= 0 ;
select count(*) from  symlink_text_input_format ;

{noformat}
It returns a wrong result: 0




  was:
Table description:
{noformat}
CREATE External TABLE `symlink_text_input_format`(
  `key` string,
  `value` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
{noformat}
There is a link file in the directory 
'/user/hive/warehouse/symlink_text_input_format'. The content of the link 
file is:
{noformat}
 viewfs://nsx/tmp/symlink* 
{noformat}
It contains a single path, and that path contains a regex.


Execute the SQL:
{noformat}
set hive.rework.mapredwork = true ;
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.min.split.size.per.rack= 0 ;
set mapred.min.split.size.per.node= 0 ;
set mapred.max.split.size= 0 ;
select count(*) from  symlink_text_input_format ;

{noformat}
It returns a wrong result: 0

I have also added a test case in the patch.



> SymbolicTextInputFormat should supports the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>




[jira] [Reopened] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-11 Thread Gunther Hagleitner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reopened HIVE-12473:
---

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> This ends up being evaluated as {{year(cast(dt as int))}} because the pruner 
> only checks for the final type, not the column type.
> {code}
> ObjectInspector oi =
>     PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(
>         TypeInfoFactory.getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
>     ObjectInspectorConverters.getConverter(
>         PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}
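
The difference between converting a partition value to the expression's final type and converting it to the column type first can be shown with a toy model. Everything below is an assumption made for illustration: the column `dt` is taken to be a date, casts that fail yield NULL (modelled as `None`), and the function names are hypothetical. It does not reproduce the pruner's actual code path.

```python
from datetime import date

def year(value):
    """Toy SQL year(): returns NULL (None) for anything that is not a date."""
    return value.year if isinstance(value, date) else None

def cast_to_int(text):
    """Toy SQL cast(... as int): NULL (None) when the text is not a number."""
    try:
        return int(text)
    except ValueError:
        return None

def evaluate_with_final_type(partition_value):
    # Mirrors year(cast(dt as int)): the string is converted to the type of
    # the whole expression (int) before year() is applied.
    return year(cast_to_int(partition_value))

def evaluate_with_column_type(partition_value):
    # Converts the string to the column type (date) first, then applies year().
    return year(date.fromisoformat(partition_value))

print(evaluate_with_final_type("2015-12-11"))   # None
print(evaluate_with_column_type("2015-12-11"))  # 2015
```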




[jira] [Commented] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-12-11 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053849#comment-15053849
 ] 

Jason Dere commented on HIVE-12502:
---

Looks good. Do you mind adding a .q file test with to_date(null)? That might be 
the best way to verify it is fixed, if this is your use case.

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
> Attachments: HIVE-12502-branch-1.patch, HIVE-12502.patch
>
>
> The to_date method behaves differently based on the 'data type' of the null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56
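
The behaviour requested here amounts to treating an untyped NULL like any other NULL input. A minimal sketch of that rule follows, with Python's `None` standing in for a VOID-typed NULL. The function is a hypothetical stand-in written for this illustration, not the Hive UDF or the attached patch.

```python
from datetime import date, datetime

def to_date(value):
    """Toy to_date(): NULL in, NULL out, regardless of the NULL's type."""
    if value is None:
        return None  # untyped NULL is accepted instead of raising
    if isinstance(value, datetime):  # check datetime before date (subclass)
        return value.date()
    if isinstance(value, date):
        return value
    if isinstance(value, str):
        # Assumes a leading 'YYYY-MM-DD'; any time part is ignored.
        return datetime.strptime(value[:10], "%Y-%m-%d").date()
    raise TypeError(f"to_date() does not accept {type(value).__name__}")

print(to_date(None))                   # None
print(to_date("2015-12-11 10:00:00"))  # 2015-12-11
```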




[jira] [Updated] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result

2015-12-11 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12590:

Attachment: HIVE-12590.patch

> Repeated UDAFs with literals can produce incorrect result
> -
>
> Key: HIVE-12590
> URL: https://issues.apache.org/jira/browse/HIVE-12590
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0
>Reporter: Laljo John Pullokkaran
>Assignee: Ashutosh Chauhan
>Priority: Critical
> Attachments: HIVE-12590.patch
>
>
> Repeated UDAFs with literals can produce wrong results.
> This is not a common use case, but it is nevertheless a bug.
> hive> select max('pants'), max('pANTS') from t1 group by key;
>  Total MapReduce CPU Time Spent: 0 msec
> OK
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> Time taken: 296.252 seconds, Fetched: 5 row(s)




[jira] [Commented] (HIVE-12632) LLAP: don't use IO elevator for ACID tables

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053979#comment-15053979
 ] 

Hive QA commented on HIVE-12632:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777182/HIVE-12632.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9880 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6323/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6323/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6323/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777182 - PreCommit-HIVE-TRUNK-Build

> LLAP: don't use IO elevator for ACID tables 
> 
>
> Key: HIVE-12632
> URL: https://issues.apache.org/jira/browse/HIVE-12632
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12632.01.patch, HIVE-12632.patch
>
>
> Until HIVE-12631 is fixed, we need to avoid ACID tables in IO elevator. Right 
> now, a FileNotFound error is thrown.
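
The workaround described above amounts to a guard on the read path. A minimal sketch in Python, assuming a hypothetical `choose_reader` helper and two boolean flags that are not taken from the Hive code base:

```python
def choose_reader(table_is_acid, llap_io_enabled):
    """Pick the read path; ACID tables always bypass the LLAP IO layer."""
    if llap_io_enabled and not table_is_acid:
        return "llap_io"
    # ACID tables (and any table when LLAP IO is off) use the regular reader.
    return "regular"
```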




[jira] [Commented] (HIVE-12528) don't start HS2 Tez sessions in a single thread

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053984#comment-15053984
 ] 

Sergey Shelukhin commented on HIVE-12528:
-

Well, the patch fails with an NPE because SessionState is not attached on new 
threads. It's used in exactly one place, so I think I will just remove its usage 
from TezSessionState init.

> don't start HS2 Tez sessions in a single thread
> ---
>
> Key: HIVE-12528
> URL: https://issues.apache.org/jira/browse/HIVE-12528
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12528.patch
>
>
> Starting sessions in parallel would improve the startup time.




[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path

2015-12-11 Thread Owen O'Malley (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-12055:
-
Attachment: HIVE-12055.patch

I've rebased it to the current trunk and answered the review feedback from 
Prasanth.

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC, Shims
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.
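
The idea of a shim that keeps a row-by-row API alive on top of a batch-oriented writer can be sketched as below. This is a generic adapter written in Python under stated assumptions: `BatchWriter`, `RowWriterShim` and the `batch_size` parameter are invented for the illustration and do not describe the OrcSerde or OrcOutputFormat code.

```python
class BatchWriter:
    """Stand-in for a writer that only accepts whole batches of rows."""

    def __init__(self):
        self.batches = []

    def write_batch(self, rows):
        self.batches.append(list(rows))


class RowWriterShim:
    """Keeps a row-by-row write(row) API on top of a batch-only writer."""

    def __init__(self, batch_writer, batch_size=1024):
        self._writer = batch_writer
        self._batch_size = batch_size
        self._pending = []

    def write(self, row):
        # Buffer rows and hand them over once a full batch has accumulated.
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self._flush()

    def close(self):
        # Flush the final, possibly partial, batch.
        if self._pending:
            self._flush()

    def _flush(self):
        self._writer.write_batch(self._pending)
        self._pending = []
```

Existing callers keep calling `write(row)` and `close()`, while the underlying writer only ever sees batches.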




[jira] [Updated] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Summary: SymbolicTextInputFormat should supports the path with regex  (was: 
Using CombineHiveInputFormat with the origin inputformat  
SymbolicTextInputFormat  ,it cannot resolve the path with regex)

> SymbolicTextInputFormat should supports the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table desc :
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex.
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0.
> I have also added a test case in the patch.
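
What resolving a wildcard entry in the link file means can be sketched in Python. The helper name `resolve_symlink_targets` and the in-memory list of paths are assumptions for the example. Note also that `fnmatch`'s `*` matches `/` as well, unlike a filesystem glob, so this is a simplification rather than Hadoop's path matching.

```python
from fnmatch import fnmatchcase


def resolve_symlink_targets(link_lines, existing_paths):
    """Expand each line of a symlink file against the known paths.

    A line without wildcards matches only itself; a line such as
    '/tmp/symlink*' matches every existing path with that prefix.
    """
    resolved = []
    for line in link_lines:
        pattern = line.strip()
        if not pattern:
            continue
        resolved.extend(p for p in sorted(existing_paths) if fnmatchcase(p, pattern))
    return resolved
```

A reader that skips this expansion and treats `/tmp/symlink*` as a literal path finds no input files, which is consistent with the count of 0 reported above.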




[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053980#comment-15053980
 ] 

Hive QA commented on HIVE-12055:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777212/HIVE-12055.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6324/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6324/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6324/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6324/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3e3d966 HIVE-12609: Remove javaXML serialization (Prasanth 
Jayachandran reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/io/LlapAwareSplit.java
Removing ql/src/test/queries/clientpositive/llap_acid.q
Removing ql/src/test/results/clientpositive/llap_acid.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 3e3d966 HIVE-12609: Remove javaXML serialization (Prasanth 
Jayachandran reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777212 - PreCommit-HIVE-TRUNK-Build
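
The final log line, "does not appear to apply with p0, p1, or p2", refers to the number of leading path components stripped from the file names in a patch. A simplified Python sketch of that search is shown below. It only checks whether the stripped paths exist in the tree, which is an assumption made for illustration; the real smart-apply-patch.sh may perform a different check, and a patch that adds new files would need different handling.

```python
def strip_components(path, level):
    """Drop `level` leading components from a patch path, like patch -pN."""
    parts = path.split("/")
    return "/".join(parts[level:]) if level < len(parts) else None


def find_strip_level(patch_paths, tree_files, levels=(0, 1, 2)):
    """Return the first strip level at which every patched path exists in the tree."""
    for level in levels:
        stripped = [strip_components(p, level) for p in patch_paths]
        if all(s is not None and s in tree_files for s in stripped):
            return level
    return None  # no level fits: the patch cannot be applied as-is
```

A patch produced by `git diff` typically carries `a/` and `b/` prefixes and resolves at level 1. A patch generated against a different base, or one that needs a rebase, resolves at no level.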


[jira] [Commented] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054066#comment-15054066
 ] 

Hive QA commented on HIVE-11110:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777211/HIVE-11110.35.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9879 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-acid_globallimit.q-tez_union_dynamic_partition.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_offset_limit_ppd_optimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_offset_limit
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6325/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6325/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6325/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777211 - PreCommit-HIVE-TRUNK-Build

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, 
> HIVE-11110.13.patch, HIVE-11110.14.patch, HIVE-11110.15.patch, 
> HIVE-11110.16.patch, HIVE-11110.17.patch, HIVE-11110.18.patch, 
> HIVE-11110.19.patch, HIVE-11110.2.patch, HIVE-11110.20.patch, 
> HIVE-11110.21.patch, HIVE-11110.22.patch, HIVE-11110.23.patch, 
> HIVE-11110.24.patch, HIVE-11110.25.patch, HIVE-11110.26.patch, HIVE-11110.27, 
> HIVE-11110.27.patch, HIVE-11110.28.patch, HIVE-11110.29.patch, 
> HIVE-11110.30.patch, HIVE-11110.31.patch, HIVE-11110.32.patch, 
> HIVE-11110.33.patch, HIVE-11110.34.patch, HIVE-11110.35.patch, 
> HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, 
> HIVE-11110.7.patch, HIVE-11110.8.patch, HIVE-11110.9.patch, 
> HIVE-11110.91.patch, HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter "filterExpr: 
> ss_sold_date_sk is not null", which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
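
The effect of the not-null filter on the statistics can be sketched in Python. The function `join_key_ndv` and the idea of summing per-partition NDV are deliberate simplifications assumed for the example, not the estimator Hive uses. The sketch only shows why skipping the default partition changes the number the optimizer sees.

```python
DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"


def join_key_ndv(partition_stats, apply_not_null=True):
    """Crude NDV estimate for a join column: sum of per-partition NDV.

    partition_stats maps partition name -> number of distinct values.
    With the not-null filter, the default (NULL-key) partition is skipped,
    since rows with a NULL join key can never match in an inner join.
    """
    return sum(
        ndv
        for name, ndv in partition_stats.items()
        if not (apply_not_null and name == DEFAULT_PARTITION)
    )
```

With stats such as `{"2451545": 100, "2451546": 120, DEFAULT_PARTITION: 5000}` the estimate is 220 with the filter and 5220 without it, which is the kind of inflation the description warns about.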




[jira] [Updated] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-11 Thread Gunther Hagleitner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12473:
--
Priority: Blocker  (was: Major)

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}
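
The difference between converting a partition value to the expression's final type and converting it to the column type first can be sketched in Python. All function names below are hypothetical, and `year` is a stand-in for the UDF rather than Hive's implementation.

```python
from datetime import date


def year(value):
    """Mimic a year() UDF: defined for dates, NULL (None) otherwise."""
    return value.year if isinstance(value, date) else None


def to_int_or_none(text):
    """Convert a string to int, yielding NULL (None) when it is not a number."""
    try:
        return int(text)
    except ValueError:
        return None


def prune_value_final_type(partition_string):
    """Problematic path: convert the string straight to the final type (int)."""
    return year(to_int_or_none(partition_string))


def prune_value_column_type(partition_string):
    """Intended path: convert to the column type (date) first, then apply the UDF."""
    return year(date.fromisoformat(partition_string))
```

For the partition value `"2015-12-11"` the first path yields NULL and the second yields 2015, so only the second can be compared meaningfully against the values from the other side of the join.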




[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-11 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054105#comment-15054105
 ] 

Gunther Hagleitner commented on HIVE-12473:
---

[~sershe]/[~gopalv] here's a comment in the code:

{noformat}
  // TODO: this is not necessarily going to work for all cases. At least, 
table name is needed.
  //   Also it's not clear if this is going to work with subquery 
columns and such.
{noformat}

Maybe commit when it does?


[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-11 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054106#comment-15054106
 ] 

Gunther Hagleitner commented on HIVE-12473:
---

Also, there's no test in the code, so it's hard to try to repro and see what 
it does...


[jira] [Updated] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers

2015-12-11 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12577:
--
Attachment: HIVE-12577.1.txt

Patch for jenkins.

> NPE in LlapTaskCommunicator when unregistering containers
> -
>
> Key: HIVE-12577
> URL: https://issues.apache.org/jira/browse/HIVE-12577
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, 
> HIVE-12577.1.wip.txt
>
>
> {code}
> 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188)
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
> at java.lang.Thread.run(Thread.java:745)
> 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>

[jira] [Updated] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers

2015-12-11 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12577:
--
Attachment: HIVE-12577.1.review.txt

In addition to the fix, the patch renames one class: TaskCommunicator to 
LlapDaemonClientProxy. TaskCommunicator was too similar to 
LlapTaskCommunicator, which gets confusing.
Attaching two patches. One is generated with git diff -M to show the actual 
changes, so that the rename does not get in the way.

[~sershe], [~prasanth_j] - please review.
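
The thread does not state the root cause of the NPE, so the following Python sketch rests on an assumption: that unregistering a container which was never registered (or was already removed) is what fails. The class below is a hypothetical stand-in for an entity tracker and is not the LlapTaskCommunicator code.

```python
class EntityTracker:
    """Tracks which node each container runs on (hypothetical stand-in)."""

    def __init__(self):
        self._container_to_node = {}

    def register_container(self, container_id, node):
        self._container_to_node[container_id] = node

    def unregister_container(self, container_id):
        """Remove a container; tolerate ids that were never registered."""
        # pop with a default avoids failing on an unknown or already removed id.
        return self._container_to_node.pop(container_id, None)
```

Under that assumption, a stop request that arrives while a container is still launching becomes a no-op instead of an error in the dispatcher thread.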


[jira] [Updated] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Attachment: HIVE-12541.2.patch

> SymbolicTextInputFormat should supports the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch
>
>
> 1. In fact, SymbolicTextInputFormat supports paths with a regex. I added some 
> test SQL.
> 2. However, when CombineHiveInputFormat is used to combine input files, it 
> cannot resolve the path with a regex, so it returns a wrong result. I give an 
> example and fix the problem.
> Table desc :
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link 
> file is:
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains a single path, and that path contains a regex.
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0.




[jira] [Updated] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex

2015-12-11 Thread Xiaowei Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Wang updated HIVE-12541:

Attachment: (was: HIVE-12541.2.patch)

> SymbolicTextInputFormat should supports the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch
>
>
> 1. In fact, SymbolicTextInputFormat supports paths that contain a regex. I 
> added some test SQL. 
> 2. However, when CombineHiveInputFormat is used to combine input files, it 
> cannot resolve a path that contains a regex, so it returns a wrong result. I 
> give an example and fix the problem.
> Table description:
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'; the content of the link 
> file is 
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains one path, and that path contains a regex.
> Execute the SQL: 
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0 
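
The failure described in this issue comes down to treating the link-file entry as a literal path instead of expanding its wildcard. A minimal sketch of that expansion step follows, assuming a simplified glob syntax (only `*` and `?`) and an in-memory list of candidate paths. `SymlinkGlobSketch`, `globToRegex`, and `expand` are hypothetical names for illustration; they are not Hive or Hadoop APIs and this is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not the code from the attached patch.
final class SymlinkGlobSketch {

    // Translates a simple glob ('*' and '?' only) into a regex.
    // Wildcards never cross a '/' boundary, as with shell-style globs.
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append("[^/]*");
            } else if (c == '?') {
                regex.append("[^/]");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the candidate paths that the link-file entry resolves to.
    static List<String> expand(String linkEntry, List<String> candidates) {
        Pattern pattern = globToRegex(linkEntry);
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            // matches() requires the whole path to match the pattern.
            if (pattern.matcher(candidate).matches()) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            "/tmp/symlink1", "/tmp/symlink2.txt", "/tmp/other", "/tmp/symlink/nested");
        System.out.println(expand("/tmp/symlink*", files));
    }
}
```

If the entry were used as a literal path, no file named `symlink*` would be found, which would be consistent with the reported count of 0.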




[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054101#comment-15054101
 ] 

Hive QA commented on HIVE-12570:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777217/HIVE-12570.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 9896 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_invalid
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_unknown_function1
org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_unknown_function4
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6327/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6327/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6327/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777217 - PreCommit-HIVE-TRUNK-Build

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace :
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY 
> key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 
> Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not 
> in GROUP BY key 'key'
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at 
>

[jira] [Commented] (HIVE-12635) Hive should return the latest hbase cell timestamp as the row timestamp value

2015-12-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052869#comment-15052869
 ] 

Aihua Xu commented on HIVE-12635:
-

1. The timestamp variable can be negative. One option is to initialize it to 
Long.MIN_VALUE, but I thought the current approach was roughly equivalent and 
was trying to avoid initializing to MIN_VALUE. :) Let me know if you want me to 
make that change.

2. Yes, you are right. It will be a backward-incompatible change. I will mark 
it as such.
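
For reference, the sentinel approach mentioned in point 1 can be sketched as follows. This is a minimal illustration, assuming the cell timestamps are available as a plain long array. `LatestTimestampSketch` and `latest` are hypothetical names, and this is not the code in the attached patch.

```java
// Hypothetical illustration of the Long.MIN_VALUE sentinel; not the patch itself.
final class LatestTimestampSketch {

    // Returns the largest timestamp, or Long.MIN_VALUE when there are no cells.
    // Starting from Long.MIN_VALUE (rather than 0) keeps negative timestamps correct.
    static long latest(long[] cellTimestamps) {
        long latest = Long.MIN_VALUE;
        for (long ts : cellTimestamps) {
            latest = Math.max(latest, ts);
        }
        return latest;
    }

    public static void main(String[] args) {
        System.out.println(latest(new long[] {-20L, -5L, -9L}));
    }
}
```

A caller that needs to distinguish "no cells" from a real timestamp would have to treat Long.MIN_VALUE as reserved, which is the usual trade-off of a sentinel compared with a separate "seen" flag.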

> Hive should return the latest hbase cell timestamp as the row timestamp value
> -
>
> Key: HIVE-12635
> URL: https://issues.apache.org/jira/browse/HIVE-12635
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12635.patch
>
>
> When Hive talks to HBase and maps the HBase timestamp field to a Hive column, 
> Hive seems to return the first cell timestamp instead of the latest one as 
> the timestamp value. 
> It makes sense to return the latest timestamp, since adding the latest cell 
> can be considered an update to the row. 




[jira] [Updated] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it will get a wrong result

2015-12-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12541:

Fix Version/s: (was: 1.2.1)

> Using CombineHiveInputFormat with the origin inputformat  
> SymbolicTextInputFormat  ,it will get a wrong result
> --
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch
>
>
> Table desc :
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the directory 
> '/user/hive/warehouse/symlink_text_input_format'; the content of the link 
> file is 
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains one path, and that path contains a regex.
> Execute the SQL: 
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0 
> I have also added a test case in the patch.




[jira] [Commented] (HIVE-12620) Misc improvement to Acid module

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052472#comment-15052472
 ] 

Hive QA commented on HIVE-12620:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776712/HIVE-12620.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6316/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6316/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6316/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776712 - PreCommit-HIVE-TRUNK-Build

> Misc improvement to Acid module
> ---
>
> Key: HIVE-12620
> URL: https://issues.apache.org/jira/browse/HIVE-12620
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12620.patch
>
>
> # DbLockManager.unlock() - if this fails (because there is no such lock, in 
> turn due to a timeout), the lock is not removed from DbLockManager's internal 
> tracking
> # Add logic to DbLockManager to detect an attempt to interleave transactions 
> or locks from different statements for read-only auto-commit mode
> # TxnHandler.checkLock() can use 1 connection instead of 2
> # TxnHandler.timeOutLocks() - refactor so that it can log which locks were 
> expired (simplifies debugging)
> # TxnHandler#getTxnIdFromLockId() - include the lock id if it's not found
> # TxnHandler#checkRetryable() - log the exception it saw
> # TxnHandler.lock() - throw new MetaException("Couldn't find a lock we just 
> created!"); - include the lock id
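
The first item in the list can be illustrated with a small sketch: the local bookkeeping is cleaned up in a `finally` block so that a failed remote unlock does not leave a stale entry behind. Everything here is an assumption made for illustration: `LockTrackerSketch`, `RemoteUnlocker`, and the set-based tracking are hypothetical and do not reflect the real DbLockManager code or the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; the names below are not Hive's actual classes.
final class LockTrackerSketch {

    // Stand-in for the remote call that releases a lock in the metastore.
    interface RemoteUnlocker {
        void unlock(long lockId) throws Exception;
    }

    private final Set<Long> trackedLocks = new HashSet<>();
    private final RemoteUnlocker remote;

    LockTrackerSketch(RemoteUnlocker remote) {
        this.remote = remote;
    }

    void track(long lockId) {
        trackedLocks.add(lockId);
    }

    boolean isTracked(long lockId) {
        return trackedLocks.contains(lockId);
    }

    // Always forgets the lock locally, even when the remote unlock throws.
    void unlock(long lockId) throws Exception {
        try {
            remote.unlock(lockId);
        } finally {
            trackedLocks.remove(lockId);
        }
    }

    public static void main(String[] args) {
        LockTrackerSketch tracker = new LockTrackerSketch(lockId -> {
            throw new IllegalStateException("No such lock: " + lockId);
        });
        tracker.track(42L);
        try {
            tracker.unlock(42L);
        } catch (Exception e) {
            System.out.println("unlock failed: " + e.getMessage());
        }
        System.out.println("still tracked: " + tracker.isTracked(42L));
    }
}
```

Dropping the entry unconditionally is only reasonable under the assumption stated in the issue, namely that the failure means the lock no longer exists on the server side.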




[jira] [Updated] (HIVE-12610) Hybrid Grace Hash Join should fail task faster if processing first batch fails, instead of continuing processing the rest

2015-12-11 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12610:
-
Attachment: HIVE-12610.branch-1.patch

branch-1 patch is attached.

> Hybrid Grace Hash Join should fail task faster if processing first batch 
> fails, instead of continuing processing the rest
> -
>
> Key: HIVE-12610
> URL: https://issues.apache.org/jira/browse/HIVE-12610
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12610.1.patch, HIVE-12610.2.patch, 
> HIVE-12610.branch-1.patch
>
>
> While processing the spilled partitions, if a fatal error occurs, such as a 
> Kryo exception, we should exit early instead of moving on to process the 
> remaining spilled partitions.
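
The fail-fast behavior requested here can be sketched as a loop that rethrows on the first failure rather than logging and continuing. This is a minimal illustration under stated assumptions: `FailFastSketch`, `SpilledPartition`, and `processAll` are hypothetical names, and the real Hybrid Grace Hash Join code is not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the Hybrid Grace Hash Join implementation.
final class FailFastSketch {

    // Stand-in for the work done on one spilled partition.
    interface SpilledPartition {
        void process() throws Exception;
    }

    // Processes partitions in order and stops at the first failure.
    // Returns the number of partitions processed when all succeed.
    static int processAll(List<SpilledPartition> partitions) {
        int processed = 0;
        for (SpilledPartition partition : partitions) {
            try {
                partition.process();
            } catch (Exception e) {
                // Do not continue with the remaining partitions: rethrow with context.
                throw new IllegalStateException(
                    "Failed on spilled partition " + processed + ", aborting task", e);
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        SpilledPartition ok = () -> System.out.println("partition processed");
        SpilledPartition bad = () -> {
            throw new java.io.IOException("simulated serialization failure");
        };
        try {
            processAll(Arrays.asList(ok, bad, ok));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```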




[jira] [Commented] (HIVE-12547) VectorMapJoinFastBytesHashTable fails during expansion

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053469#comment-15053469
 ] 

Sergey Shelukhin commented on HIVE-12547:
-

[~mmccline] what do you think? I looked at the code, and it looks like it 
should be possible to compress the value reference and hashcode into one (or 
key+hash+value into two?).
Is there any point in storing the small value length in the value ref? We never 
compare values, just retrieve them, so there are no cases where we avoid going 
to the WriteBuffers, as there are with keys.
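
As a rough illustration of the kind of compression suggested above, two 32-bit quantities can share a single long slot. The 32/32 split is purely an assumption made for this sketch; the actual reference layout used by VectorMapJoinFastBytesHashTable is not shown in this thread, and `PackedRefSketch` with its methods is a hypothetical name.

```java
// Hypothetical illustration of packing two 32-bit fields into one long.
final class PackedRefSketch {

    // High 32 bits hold the hash code, low 32 bits hold the value offset.
    static long pack(int hashCode, int valueOffset) {
        // Mask the offset so its sign bit does not smear into the high half.
        return ((long) hashCode << 32) | (valueOffset & 0xFFFFFFFFL);
    }

    static int hashCodeOf(long packed) {
        return (int) (packed >>> 32);
    }

    static int valueOffsetOf(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long slot = pack("some key".hashCode(), 1024);
        System.out.println(hashCodeOf(slot) == "some key".hashCode());
        System.out.println(valueOffsetOf(slot));
    }
}
```

Halving the number of longs per entry would also slow the growth of the backing array, although whether that addresses the failure in this issue is not established here.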

> VectorMapJoinFastBytesHashTable fails during expansion
> --
>
> Key: HIVE-12547
> URL: https://issues.apache.org/jira/browse/HIVE-12547
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: HIVE-12547.WIP.patch
>
>
> {code}
> 2015-11-30 20:55:30,361 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1448429572030_1224_7][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 2, taskAttemptId=attempt_1448429572030_1224_7_03_05_0, 
> creationTime=1448934722881, allocationTime=1448934726552, 
> startTime=1448934726553, finishTime=1448934930360, timeTaken=203807, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task: 
> attempt_1448429572030_1224_7_03_05_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:424)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:394)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:519)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:472)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:274)
>   ... 15 more
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:414)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
>

[jira] [Assigned] (HIVE-12397) LLAP: add security to daemon-hosted shuffle

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-12397:
---

Assignee: Sergey Shelukhin  (was: Siddharth Seth)

> LLAP: add security to daemon-hosted shuffle
> ---
>
> Key: HIVE-12397
> URL: https://issues.apache.org/jira/browse/HIVE-12397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>





[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053482#comment-15053482
 ] 

Sergey Shelukhin commented on HIVE-11531:
-

It appears that union9 has been broken since the patch was committed. The stats 
have gone negative for some queries. Can you double-check?

[~prasanth_j] what do negative stats mean?
{noformat}
<   Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
<   Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
<   Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
< Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
< Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
< Statistics: Num rows: -1 Data size: 5812 Basic stats: PARTIAL Column stats: COMPLETE
---
>   Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>   Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>   Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
> Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
> Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
> Statistics: Num rows: 1500 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
{noformat}

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
> Fix For: 2.1.0
>
> Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, 
> HIVE-11531.04.patch, HIVE-11531.05.patch, HIVE-11531.06.patch, 
> HIVE-11531.07.patch, HIVE-11531.WIP.1.patch, HIVE-11531.WIP.2.patch, 
> HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first class support for 
> "skip" to existing limit, or improve ROW_NUMBER for better performance
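
To make the LIMIT X,Y semantics concrete (X rows skipped, then Y rows returned), here is a minimal sketch of how a bounded top-N structure can still serve a paginated query: only the first X+Y rows in sort order need to be retained. This is an illustration under stated assumptions, not Hive's ReduceSink implementation; `LimitOffsetSketch` and `page` are hypothetical names and the rows are plain integers sorted ascending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of LIMIT offset,count on top of a bounded heap.
final class LimitOffsetSketch {

    // Returns rows [offset, offset + count) of the ascending sort order
    // while never holding more than offset + count rows in memory.
    static List<Integer> page(List<Integer> rows, int offset, int count) {
        int keep = offset + count;
        // Max-heap: the largest retained row is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (Integer row : rows) {
            heap.add(row);
            if (heap.size() > keep) {
                heap.poll();
            }
        }
        List<Integer> smallest = new ArrayList<>(heap);
        Collections.sort(smallest);
        int from = Math.min(offset, smallest.size());
        return new ArrayList<>(smallest.subList(from, smallest.size()));
    }

    public static void main(String[] args) {
        System.out.println(page(Arrays.asList(5, 3, 9, 1, 7, 2), 2, 3));
    }
}
```

The memory bound grows with the offset, so very deep pages remain expensive; the sketch only shows that the top-N idea extends naturally to a skip.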




[jira] [Commented] (HIVE-12422) LLAP: add security to Web UI endpoint

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053491#comment-15053491
 ] 

Sergey Shelukhin commented on HIVE-12422:
-

The test failures are unrelated: some tests are unstable, and some failures have an age greater than 1. 

> LLAP: add security to Web UI endpoint
> -
>
> Key: HIVE-12422
> URL: https://issues.apache.org/jira/browse/HIVE-12422
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12422.01.patch, HIVE-12422.02.patch, 
> HIVE-12422.patch
>
>





[jira] [Commented] (HIVE-12635) Hive should return the latest hbase cell timestamp as the row timestamp value

2015-12-11 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053079#comment-15053079
 ] 

Szehon Ho commented on HIVE-12635:
--

Whoops you are right, ignore that suggestion then :)

> Hive should return the latest hbase cell timestamp as the row timestamp value
> -
>
> Key: HIVE-12635
> URL: https://issues.apache.org/jira/browse/HIVE-12635
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: backward-incompatible
> Attachments: HIVE-12635.patch
>
>
> When Hive talks to HBase and maps the HBase timestamp field to a Hive column, 
> Hive seems to return the first cell timestamp instead of the latest one as 
> the timestamp value. 
> It makes sense to return the latest timestamp, since adding the latest cell 
> can be considered an update to the row. 




[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Description: 
When I create a table with ROW FORMAT SERDE 
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
containing Chinese text encoded in GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
string, 
num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' 
overwrite into table PersonInfo;

I found garbled Chinese text in the table, which shows that 
'serialization.encoding' does not take effect. The output is listed below:
|   
  
9999�ϴ���
  0624624002��ʱ��   
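
The garbled output above is consistent with GBK bytes being decoded with the wrong charset. A minimal sketch of that effect in plain Java follows, assuming the JDK in use ships the GBK charset. `EncodingSketch` and `decode` are hypothetical names; this does not show how MultiDelimitSerDe reads its input or what the attached patch changes.

```java
import java.nio.charset.Charset;

// Hypothetical illustration of decoding the same bytes with two charsets.
final class EncodingSketch {

    // Decodes raw bytes with the named charset; malformed sequences
    // are replaced with U+FFFD by the String constructor.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes to avoid source-encoding issues.
        String original = "\u4e2d\u6587";
        byte[] gbkBytes = original.getBytes(Charset.forName("GBK"));

        // Decoding with the charset the file was written in round-trips cleanly.
        System.out.println(decode(gbkBytes, "GBK").equals(original));
        // Decoding the same bytes as UTF-8 does not.
        System.out.println(decode(gbkBytes, "UTF-8").equals(original));
    }
}
```

A SerDe that honors 'serialization.encoding' would need to apply the configured charset at the point where bytes become text, which is the step the first `decode` call stands in for.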

 

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese text in the table, which shows that 
> 'serialization.encoding' does not take effect. The output is listed below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>




[jira] [Assigned] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang reassigned HIVE-12653:
---

Assignee: yangfang

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese text in the table, which shows that 
> 'serialization.encoding' does not take effect. The output is listed below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>




[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Attachment: HIVE-12653.patch

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese text in the table, which shows that 
> 'serialization.encoding' does not take effect. The output is listed below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>




[jira] [Commented] (HIVE-12620) Misc improvement to Acid module

2015-12-11 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053389#comment-15053389
 ] 

Jason Dere commented on HIVE-12620:
---

+1

> Misc improvement to Acid module
> ---
>
> Key: HIVE-12620
> URL: https://issues.apache.org/jira/browse/HIVE-12620
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12620.patch
>
>
> # DbLockManager.unlock() - if this fails (because there is no such lock, in 
> turn due to a timeout), the lock is not removed from DbLockManager's internal 
> tracking
> # Add logic to DbLockManager to detect an attempt to interleave transactions 
> or locks from different statements for read-only auto-commit mode
> # TxnHandler.checkLock() can use 1 connection instead of 2
> # TxnHandler.timeOutLocks() - refactor so that it can log which locks were 
> expired (simplifies debugging)
> # TxnHandler#getTxnIdFromLockId() - include the lock id if it's not found
> # TxnHandler#checkRetryable() - log the exception it saw
> # TxnHandler.lock() - throw new MetaException("Couldn't find a lock we just 
> created!"); - include the lock id




[jira] [Commented] (HIVE-12648) LLAP IO was disabled in CliDriver by accident (and tests are broken)

2015-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053536#comment-15053536
 ] 

Hive QA commented on HIVE-12648:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777090/HIVE-12648.02.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9894 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6321/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6321/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6321/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777090 - PreCommit-HIVE-TRUNK-Build

> LLAP IO was disabled in CliDriver by accident (and tests are broken)
> 
>
> Key: HIVE-12648
> URL: https://issues.apache.org/jira/browse/HIVE-12648
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12648.01.patch, HIVE-12648.02.patch, 
> HIVE-12648.patch
>
>





[jira] [Updated] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-11 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12435:

Attachment: HIVE-12435.03.patch

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The query below returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> whereas the expected results are:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format, and on an ORC table when vectorization is disabled.
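
For reference, the expected results follow from standard SQL semantics: COUNT(expr) skips rows where expr is NULL, so a key whose only row falls through to ELSE NULL should count 0. A minimal sketch that checks this against SQLite (not Hive), assuming SQLite's CASE/COUNT handling matches the standard here; the function name expected_counts is made up for illustration:

```python
import sqlite3

# Rows from the report; None stands for SQL NULL.
ROWS = [
    ("key1", True),
    ("key2", False),
    ("key3", None),
    ("key4", False),
    ("key5", None),
]

QUERY = """
SELECT key,
       COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt
FROM count_case_groupby
GROUP BY key
ORDER BY key
"""


def expected_counts():
    """Run the reported query on an in-memory SQLite table (not on Hive)."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no boolean type; True/False are stored as 1/0.
    conn.execute("CREATE TABLE count_case_groupby (key TEXT, bool INTEGER)")
    conn.executemany("INSERT INTO count_case_groupby VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Here key3 and key5 come back with a count of 0, matching the expected results quoted above. This only illustrates the intended semantics; it says nothing about where the vectorized ORC path goes wrong.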




[jira] [Commented] (HIVE-12648) LLAP IO was disabled in CliDriver by accident (and tests are broken)

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053208#comment-15053208
 ] 

Sergey Shelukhin commented on HIVE-12648:
-

Added HIVE-12654 

> LLAP IO was disabled in CliDriver by accident (and tests are broken)
> 
>
> Key: HIVE-12648
> URL: https://issues.apache.org/jira/browse/HIVE-12648
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12648.01.patch, HIVE-12648.patch
>
>





[jira] [Commented] (HIVE-12620) Misc improvement to Acid module

2015-12-11 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053166#comment-15053166
 ] 

Wei Zheng commented on HIVE-12620:
--

Code looks good to me.

+1 (non-binding)

> Misc improvement to Acid module
> ---
>
> Key: HIVE-12620
> URL: https://issues.apache.org/jira/browse/HIVE-12620
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12620.patch
>
>
> # DbLockManager.unlock() - if this fails (due to no such lock, in turn due to 
> timeout) the lock is not removed from DbLockManager internal tracking
> # Add logic to DbLockManager to detect if there is an attempt to interleave 
> transactions or locks from different statements for read-only auto commit mode
> # TxnHandler.checkLock() can use 1 connection instead of 2
> # TxnHandler.timeOutLocks() - refactor so that it can log which locks were 
> expired (simplifies debugging)
> # TxnHandler#getTxnIdFromLockId() - include lock id if it's not found
> # TxnHandler#checkRetryable() - log exception it saw
> # TxnHandler.lock() - throw new MetaException("Couldn't find a lock we just 
> created!"); - include lockid
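
The first item in the list above can be illustrated with a small sketch. This is not the patch and not Hive code: LockTracker, NoSuchLockError and release_remote are hypothetical names, and the sketch only shows one way to keep local tracking consistent when the remote release reports that the lock no longer exists.

```python
class NoSuchLockError(Exception):
    """Hypothetical error: the remote side no longer knows the lock."""


class LockTracker:
    """Hypothetical stand-in for a lock manager's internal tracking."""

    def __init__(self, release_remote):
        # release_remote(lock_id) is an assumed callable that releases
        # the lock remotely and may raise NoSuchLockError.
        self._release_remote = release_remote
        self._held = set()

    def track(self, lock_id):
        self._held.add(lock_id)

    def unlock(self, lock_id):
        try:
            self._release_remote(lock_id)
        except NoSuchLockError:
            # The lock already timed out remotely, so stop tracking it
            # locally too, then let the caller see the failure.
            self._held.discard(lock_id)
            raise
        self._held.discard(lock_id)

    def held(self):
        return set(self._held)
```

Other failures (for example a connection error) leave the lock tracked, since in that case the lock may still exist remotely.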




[jira] [Commented] (HIVE-12648) LLAP IO was disabled in CliDriver by accident (and tests are broken)

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053203#comment-15053203
 ] 

Sergey Shelukhin commented on HIVE-12648:
-

0.20 is the default that was passed before for other cases. There are some unit 
tests, but apparently they don't cover this case. Regular and synthetic fileId are 
covered by MiniLlap and CliDriver respectively. I'll file a JIRA for file ID 
tests. Can you +1 in the meantime so we can fix the existing code and enable tests? 

> LLAP IO was disabled in CliDriver by accident (and tests are broken)
> 
>
> Key: HIVE-12648
> URL: https://issues.apache.org/jira/browse/HIVE-12648
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12648.01.patch, HIVE-12648.patch
>
>





[jira] [Commented] (HIVE-12648) LLAP IO was disabled in CliDriver by accident (and tests are broken)

2015-12-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053207#comment-15053207
 ] 

Prasanth Jayachandran commented on HIVE-12648:
--

+1

> LLAP IO was disabled in CliDriver by accident (and tests are broken)
> 
>
> Key: HIVE-12648
> URL: https://issues.apache.org/jira/browse/HIVE-12648
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12648.01.patch, HIVE-12648.patch
>
>





[jira] [Updated] (HIVE-12648) LLAP IO was disabled in CliDriver by accident (and tests are broken)

2015-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12648:

Attachment: HIVE-12648.02.patch

HiveQA queue looks empty. Retrying...

> LLAP IO was disabled in CliDriver by accident (and tests are broken)
> 
>
> Key: HIVE-12648
> URL: https://issues.apache.org/jira/browse/HIVE-12648
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12648.01.patch, HIVE-12648.02.patch, 
> HIVE-12648.patch
>
>





[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Description: 
When I create a table with ROW FORMAT SERDE 
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
containing Chinese text encoded in GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
string, 
num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' 
overwrite into table PersonInfo;

 The Chinese text in the table is garbled and 'serialization.encoding' does 
not work. The garbled data is listed below:
|   
  
9999�ϴ���
  0624624002��ʱ��   

 

  was:
when I create table with ROW FORMAT SERDE 
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
with chinese encoded by GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
string, 
num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' 
overwrite into table PersonInfo;

 I found chinese disorder code in the table and  'serialization.encoding' does 
not work, the error which list as below：
|   
  
9999�ϴ���
  0624624002��ʱ��   

 


> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> The Chinese text in the table is garbled and 'serialization.encoding' 
> does not work. The garbled data is listed below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>
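
The symptom in the description is what one would expect if GBK bytes were decoded with a different charset. A minimal sketch of that effect in plain Python, assuming the bytes end up being decoded as UTF-8 (the report does not say which charset is actually used); decode_field and the sample string are made up for illustration:

```python
ORIGINAL = "中文"              # sample Chinese text, not from the report
RAW = ORIGINAL.encode("gbk")   # bytes as they would sit in a GBK file


def decode_field(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a field, replacing undecodable bytes with U+FFFD."""
    return raw.decode(encoding, errors="replace")


# Decoding with the charset the file was written in recovers the text;
# decoding the same bytes as UTF-8 yields replacement characters.
round_trip = decode_field(RAW, "gbk")
garbled = decode_field(RAW)
```

With the right charset the text round-trips; with the wrong one the result contains U+FFFD replacement characters like those shown in the table data above.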




[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-11 Thread yangfang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Description: 
When I create a table with ROW FORMAT SERDE 
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
containing Chinese text encoded in GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
string, 
num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' 
overwrite into table PersonInfo;

 The Chinese text in the table is garbled and 'serialization.encoding' does 
not work. The error is listed below:
|   
  
9999�ϴ���
  0624624002��ʱ��   

 

  was:
when I create table with ROW FORMAT SERDE 
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
with chinese encoded by GBK:
create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
string, 
num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');

load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' 
overwrite into table PersonInfo;

 I found chinese disorder code in the table and  'serialization.encoding' does 
not work, which list as below：
|   
  
9999�ϴ���
  0624624002��ʱ��   

 


> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> The Chinese text in the table is garbled and 'serialization.encoding' 
> does not work. The error is listed below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>




[jira] [Commented] (HIVE-12644) Support for offset in HiveSortMergeRule

2015-12-11 Thread Laljo John Pullokkaran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053937#comment-15053937
 ] 

Laljo John Pullokkaran commented on HIVE-12644:
---

In HiveSortMergeRule @92, wouldn't it be better to check whether offset+limit is 
less than the one from below? i.e.
if ((topOffset + topLimit) < (bottomOffset + bottomLimit))

> Support for offset in HiveSortMergeRule
> ---
>
> Key: HIVE-12644
> URL: https://issues.apache.org/jira/browse/HIVE-12644
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12644.patch
>
>
> After HIVE-11531 goes in, HiveSortMergeRule needs to be extended to support 
> offset properly when it merges operators that contain Limit. Otherwise, limit 
> pushdown through outer join optimization (introduced in HIVE-11684) will not 
> work properly.
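
To make the offset arithmetic concrete, here is a rough sketch of how two stacked offset/limit windows over the same ordering compose into one. It is not the HiveSortMergeRule code; merge_limits and apply_limit are hypothetical helpers, and the sketch assumes both operators sort on the same keys and that all offsets and limits are non-negative.

```python
def apply_limit(rows, offset, limit):
    """Return the window [offset, offset + limit) of an ordered row list."""
    return rows[offset:offset + limit]


def merge_limits(top_offset, top_limit, bottom_offset, bottom_limit):
    """Fold a top offset/limit applied over a bottom offset/limit into one.

    The bottom operator keeps rows [bottom_offset, bottom_offset + bottom_limit).
    The top operator then skips top_offset of those rows and keeps at most
    top_limit, but it can never see more than the bottom operator produced.
    """
    merged_offset = bottom_offset + top_offset
    merged_limit = max(0, min(top_limit, bottom_limit - top_offset))
    return merged_offset, merged_limit
```

For example, a bottom window of offset 3, limit 10 followed by a top window of offset 2, limit 5 selects the same rows as a single window of offset 5, limit 5.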




[jira] [Commented] (HIVE-12528) don't start HS2 Tez sessions in a single thread

2015-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053962#comment-15053962
 ] 

Sergey Shelukhin commented on HIVE-12528:
-

What do you mean by correct visibility? The approach is the same: the sessions 
are taken from the queue and put back into the queue. The only logic change is 
that it's done in parallel; all the threads are joined at the end in the same 
place. As long as there are no threading issues between different 
SessionState/TezClient objects, it should be ok.

> don't start HS2 Tez sessions in a single thread
> ---
>
> Key: HIVE-12528
> URL: https://issues.apache.org/jira/browse/HIVE-12528
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12528.patch
>
>
> Starting sessions in parallel would improve the startup time.
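
The general shape of the change (open each session on its own thread, put it into the queue when ready, join all threads at the end) might look like the sketch below. It is plain Python rather than the HS2 code; start_sessions and open_session are hypothetical names, and error handling for a session that fails to open is left out.

```python
import queue
import threading


def start_sessions(sessions, open_session):
    """Open every session on its own thread and return a queue of ready ones.

    open_session(session) is an assumed callable that performs the slow
    startup work for a single session.
    """
    ready = queue.Queue()

    def worker(session):
        open_session(session)
        ready.put(session)  # queue.Queue is safe to use across threads

    threads = [threading.Thread(target=worker, args=(s,)) for s in sessions]
    for t in threads:
        t.start()
    # All threads are joined in one place, so the caller only sees the
    # queue after every session has finished starting.
    for t in threads:
        t.join()
    return ready
```

Total startup time is then bounded by the slowest session rather than the sum of all of them, provided the per-session objects do not share mutable state.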




[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-11 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Attachment: HIVE-12643.2.patch

As a useful side effect, a few queries got vectorized.

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.




[jira] [Commented] (HIVE-12547) VectorMapJoinFastBytesHashTable fails during expansion

2015-12-11 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053547#comment-15053547
 ] 

Matt McCline commented on HIVE-12547:
-

That makes sense.

> VectorMapJoinFastBytesHashTable fails during expansion
> --
>
> Key: HIVE-12547
> URL: https://issues.apache.org/jira/browse/HIVE-12547
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: HIVE-12547.WIP.patch
>
>
> {code}
> 2015-11-30 20:55:30,361 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1448429572030_1224_7][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 2, taskAttemptId=attempt_1448429572030_1224_7_03_05_0, 
> creationTime=1448934722881, allocationTime=1448934726552, 
> startTime=1448934726553, finishTime=1448934930360, timeTaken=203807, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task: 
> attempt_1448429572030_1224_7_03_05_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:424)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:394)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:519)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:472)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:274)
>   ... 15 more
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:414)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:174)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:170)
>   at 
>
