date:20151109

[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996439#comment-14996439
 ] 

Hive QA commented on HIVE-12301:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771298/HIVE-12301.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9765 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-tez_joins_explain.q-vector_decimal_aggregate.q-auto_join29.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5969/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5969/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5969/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771298 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch, HIVE-12301.02.patch
>
>
> The position in argList is mapped to a wrong column from RS operator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-12372) Improve to support the multibyte character at lpad and rpad

2015-11-09 Thread Shinichi Yamashita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HIVE-12372:
--
Attachment: HIVE-12372.1.patch

I have attached a patch file.

> Improve to support the multibyte character at lpad and rpad
> ---
>
> Key: HIVE-12372
> URL: https://issues.apache.org/jira/browse/HIVE-12372
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
>  Labels: udf
> Attachments: HIVE-12372.1.patch
>
>
> The current lpad and rpad do not handle multibyte characters in the "str" and 
> "pad" arguments.
> For example, we see the following result:
> {code}
> hive> select name from sample1;
> OK
> tokyo
> ＴＯＫＹＯ
> hive> select lpad(name, 20, '*') from sample1;
> OK
> ***tokyo
> *ＴＯＫＹＯ
> {code}
> With this improvement, the result is as follows:
> {code}
> hive> select lpad(name, 20, '*') from sample1;
> ***tokyo
> ***ＴＯＫＹＯ
> {code}
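
The difference between the two results above can be illustrated with a small sketch. It assumes (this is not confirmed by the issue text) that the shorter padding comes from measuring the string in UTF-8 bytes instead of characters. `LpadSketch`, `utf8Length`, and `lpad` are hypothetical names for illustration only; this is not the code in HIVE-12372.1.patch.

```java
import java.nio.charset.StandardCharsets;

final class LpadSketch {
    // Length in UTF-8 bytes; a full-width letter such as U+FF34 takes 3 bytes.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Left-pads str to len code points, truncating if str is already longer.
    // Assumes len >= 0; an empty pad leaves str unpadded.
    static String lpad(String str, int len, String pad) {
        int strLen = str.codePointCount(0, str.length());
        if (len <= strLen) {
            return str.substring(0, str.offsetByCodePoints(0, len));
        }
        if (pad.isEmpty()) {
            return str;
        }
        int[] padPoints = pad.codePoints().toArray();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < len - strLen; i++) {
            // Cycle through the pad string one code point at a time.
            out.appendCodePoint(padPoints[i % padPoints.length]);
        }
        return out.append(str).toString();
    }

    public static void main(String[] args) {
        // Full-width "TOKYO", written with escapes to avoid source-encoding issues.
        String wide = "\uFF34\uFF2F\uFF2B\uFF39\uFF2F";
        System.out.println(utf8Length(wide));                      // byte count
        System.out.println(wide.codePointCount(0, wide.length())); // character count
        System.out.println(lpad("tokyo", 20, "*"));
        System.out.println(lpad(wide, 20, "*"));
    }
}
```

Counting code points rather than bytes gives the ASCII and the full-width input the same amount of padding, which matches the improved output quoted above.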




[jira] [Commented] (HIVE-12370) Hive Query got failure with larger scale data set with enablng sampling order optimization

2015-11-09 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996629#comment-14996629
 ] 

Xuefu Zhang commented on HIVE-12370:


Have you tried another data format? The stack trace seems to suggest a problem 
with the format.

> Hive Query got failure with larger scale data set with enablng sampling order 
> optimization
> --
>
> Key: HIVE-12370
> URL: https://issues.apache.org/jira/browse/HIVE-12370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Yi Zhou
>
> Hive queries fail on Hive on MR with larger-scale data (e.g., 3 TB/10 TB) when 
> the sampling optimization is enabled (they pass with a 1 GB data set).
> hive.optimize.sampling.orderby=true
> hive.optimize.sampling.orderby.number=2
> hive.optimize.sampling.orderby.percent=0.1
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> ... 9 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask




[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-09 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996528#comment-14996528
 ] 

Naveen Gangam commented on HIVE-12184:
--

The 3 failures appear to be unrelated to the patch.

> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use
> ---
>
> Key: HIVE-12184
> URL: https://issues.apache.org/jira/browse/HIVE-12184
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, 
> HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.patch
>
>
> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use.
> Repro:
> {code}
> 0: jdbc:hive2://localhost:1/default> create database foo;
> No rows affected (0.116 seconds)
> 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int);
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | i         | int        |          |
> +-----------+------------+----------+--+
> 1 row selected (0.049 seconds)
> 0: jdbc:hive2://localhost:1/default> use foo;
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field foo (state=08S01,code=1)
> {code}
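
The "Invalid Field foo" error in the repro suggests that, once `foo` is the current database, `foo.foo` is read as table `foo`, column `foo` rather than database `foo`, table `foo`. The following is a minimal sketch of that ambiguity and of one possible resolution order. It is an illustration only: `DescribeNameSketch`, `resolve`, and the map-based catalog are hypothetical, and this is not how the attached patches resolve the name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class DescribeNameSketch {
    // catalog maps a database name to the table names it contains (hypothetical model).
    static String resolve(Map<String, Set<String>> catalog, String currentDb, String dotted) {
        String[] parts = dotted.split("\\.", 2);
        if (parts.length == 1) {
            return "table " + currentDb + "." + parts[0];
        }
        String first = parts[0];
        String second = parts[1];
        // Reading 1: db.table. Tried first so that a fully qualified name always wins.
        Set<String> tables = catalog.get(first);
        if (tables != null && tables.contains(second)) {
            return "table " + first + "." + second;
        }
        // Reading 2: table.column within the current database.
        Set<String> current = catalog.get(currentDb);
        if (current != null && current.contains(first)) {
            return "column " + second + " of table " + currentDb + "." + first;
        }
        return "unresolved";
    }

    public static void main(String[] args) {
        Map<String, Set<String>> catalog = new HashMap<>();
        catalog.put("default", Collections.<String>emptySet());
        catalog.put("foo", Collections.singleton("foo"));
        // Same answer regardless of which database is in use.
        System.out.println(resolve(catalog, "default", "foo.foo"));
        System.out.println(resolve(catalog, "foo", "foo.foo"));
    }
}
```

Checking the database-qualified reading first keeps the result independent of the current database, which is the behavior the repro expects.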




[jira] [Commented] (HIVE-12369) Faster Vector GroupBy

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996553#comment-14996553
 ] 

Hive QA commented on HIVE-12369:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771324/HIVE-12369.01.patch

{color:green}SUCCESS:{color} +1 due to 19 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 9788 tests 
executed
*Failed tests:*
{noformat}
TestKeyValueWriter - did not produce a TEST-*.xml file
TestLongKeyValueWriter - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testExpand
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testGetNonExistent
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testLarge
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testLargeAndExpand
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testPutGetMultiple
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testPutGetOne
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastLongHashMap.testPutWithFullMap
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testExpand
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testGetNonExistent
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testLarge
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testLargeAndExpand
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testPutGetMultiple
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testPutGetOne
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.TestVectorMapJoinFastMultiKeyHashMap.testPutWithFullMap
org.apache.hadoop.hive.ql.optimizer.physical.TestVectorizer.testAggregateOnUDF
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5970/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5970/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5970/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771324 - PreCommit-HIVE-TRUNK-Build

> Faster Vector GroupBy
> -
>
> Key: HIVE-12369
> URL:

[jira] [Commented] (HIVE-12045) ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)

2015-11-09 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996587#comment-14996587
 ] 

Xuefu Zhang commented on HIVE-12045:


Thanks, Rui. I didn't know that. Yes, I think we should if it doesn't cause too 
much trouble.

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> --
>
> Key: HIVE-12045
> URL: https://issues.apache.org/jira/browse/HIVE-12045
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
> Environment: Cloudera QuickStart VM - CDH5.4.2
> beeline
>Reporter: Zsolt Tóth
>Assignee: Rui Li
> Attachments: HIVE-12045.1-spark.patch, example.jar, genUDF.patch
>
>
> If I execute the following query in beeline, I get ClassNotFoundException for 
> the UDF class.
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 
> 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf just looks for the 1st argument's value in the 
> others and returns the index. I don't think this is related to the actual 
> GenericUDF function.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" succeeds
> If I use the non-generic implementation of the same UDF, the select distinct 
> call succeeds.
> StackTrace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: 
> hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
>

[jira] [Updated] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-09 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-12184:
---
Hadoop Flags: Incompatible change





[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-09 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996613#comment-14996613
 ] 

Xuefu Zhang commented on HIVE-12184:


Could you please provide an RB (Review Board) entry for this? Thanks.






[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-09 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996677#comment-14996677
 ] 

Naveen Gangam commented on HIVE-12184:
--

Re-posting the link to the RB review. I will also update the diff with the 
latest patch (which contains changes to the qtest.out files).
Review posted to RB at https://reviews.apache.org/r/39508/






[jira] [Updated] (HIVE-12370) Hive Query got failure with larger scale data set with enablng sampling order optimization

2015-11-09 Thread Yi Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhou updated HIVE-12370:
---
Description: 
Hive queries fail on Hive on MR with larger-scale data (e.g., 3 TB/10 TB) when 
the sampling optimization is enabled (they pass with a 1 GB data set).

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask


  was:
Hive queries fail on Hive on MR with larger-scale data (e.g., 3 TB/10 TB) when 
the sampling optimization is enabled (they pass with a 1 GB data set). So we 
disabled the sampling optimization to make it pass.

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask



> Hive Query got failure with larger scale data set with enablng sampling order 
> optimization
> --
>
> Key: HIVE-12370
> URL: https://issues.apache.org/jira/browse/HIVE-12370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Yi Zhou
>
> Hive queries fail on Hive on MR with larger-scale data (e.g., 3 TB/10 TB) when 
> the sampling optimization is enabled (they pass with a 1 GB data set).
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive

[jira] [Updated] (HIVE-12371) Adding a timeout connection parameter for JDBC

2015-11-09 Thread Nemon Lou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-12371:
-
Priority: Minor  (was: Major)

> Adding a timeout connection parameter for JDBC
> --
>
> Key: HIVE-12371
> URL: https://issues.apache.org/jira/browse/HIVE-12371
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Nemon Lou
>Assignee: Vaibhav Gumashta
>Priority: Minor
>
> There are some timeout settings on the server side:
> HIVE-4766
> HIVE-6679
> Adding a connection timeout parameter for JDBC is useful in some scenarios:
> 1. beeline (which cannot set a timeout manually)
> 2. customizing the timeout for different connections (among Hive or RDBs, 
> which cannot be done via DriverManager.setLoginTimeout())
> This would work just like PostgreSQL:
> {noformat}
> jdbc:postgresql://localhost/test?user=fred=secret=true=0
> {noformat}
> or MySQL:
> {noformat}
> jdbc:mysql://xxx.xx.xxx.xxx:3306/database?connectTimeout=6=6
> {noformat}
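
The contrast between a process-wide login timeout and a per-connection URL parameter can be sketched as follows. `DriverManager.setLoginTimeout` and `getLoginTimeout` are standard `java.sql` calls; everything else is an assumption for illustration: the `jdbc:example:` URL, the `connectTimeout` parameter name, and the `JdbcTimeoutSketch` helpers are hypothetical and do not describe an existing Hive JDBC option.

```java
import java.sql.DriverManager;
import java.util.LinkedHashMap;
import java.util.Map;

final class JdbcTimeoutSketch {
    // Parses "key=value" pairs after '?' in a URL, separated by '&'.
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0 || q == url.length() - 1) {
            return params;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // Returns the per-connection timeout from the URL, or the default if absent or invalid.
    // "connectTimeout" is a hypothetical parameter name.
    static int timeoutSeconds(String url, int defaultSeconds) {
        String raw = queryParams(url).get("connectTimeout");
        if (raw == null) {
            return defaultSeconds;
        }
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return defaultSeconds;
        }
    }

    public static void main(String[] args) {
        // Process-wide: applies to every driver and connection in this JVM.
        DriverManager.setLoginTimeout(5);
        System.out.println(DriverManager.getLoginTimeout());

        // Per-connection: each URL can carry its own value.
        System.out.println(timeoutSeconds("jdbc:example://hostA/db?connectTimeout=30", 0));
        System.out.println(timeoutSeconds("jdbc:example://hostB/db", 0));
    }
}
```

Because the `DriverManager` setting is static, two connections in the same JVM cannot use different values through it, which is the limitation described in point 2 above.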



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-09 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996531#comment-14996531
 ] 

Naveen Gangam commented on HIVE-12362:
--

The test failures are unrelated to the attached patch.

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12362.patch
>
>
> {code}
> create table src (a string);
> insert into table src values (NULL), (''), ('');
> 0: jdbc:hive2://localhost:1/default> select * from src;
> +--------+--+
> | src.a  |
> +--------+--+
> | NULL   |
> |        |
> |        |
> +--------+--+
> create table dest (a string) row format serde 
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as 
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> alter table dest set SERDEPROPERTIES ('serialization.null.format' = '');
> alter table dest set TBLPROPERTIES ('serialization.null.format' = '');
> insert overwrite table dest select * from src;
> 0: jdbc:hive2://localhost:1/default> select * from test11;
> +-----------+--+
> | test11.a  |
> +-----------+--+
> | NULL      |
> |           |
> |           |
> +-----------+--+
> {code}




[jira] [Updated] (HIVE-12370) Hive Query got failure with larger scale data set with enablng sampling order optimization

2015-11-09 Thread Yi Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhou updated HIVE-12370:
---
Description: 
Hive queries fail on Hive on MR with larger-scale data (e.g., 3 TB/10 TB) when 
the sampling optimization is enabled (they pass with a 1 GB data set).

hive.optimize.sampling.orderby=true
hive.optimize.sampling.orderby.number=2
hive.optimize.sampling.orderby.percent=0.1

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask


  was:
Found that hive would get failure on Hive on MR with larger scale 
data(e.g.,3TB/10TB) when enabling sampling optimization(it got passed with 1GB 
data set).

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask



> Hive Query got failure with larger scale data set with enabling sampling order 
> optimization
> --
>
> Key: HIVE-12370
> URL: https://issues.apache.org/jira/browse/HIVE-12370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Yi Zhou
>
> Found that the query fails on Hive on MR with larger-scale data 
> (e.g., 3TB/10TB) when sampling optimization is enabled (it passes with a 
> 1GB data set).
>

[jira] [Updated] (HIVE-10328) Enable new return path for cbo

2015-11-09 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10328:
---
Attachment: HIVE-10328.13.patch

> Enable new return path for cbo
> --
>
> Key: HIVE-10328
> URL: https://issues.apache.org/jira/browse/HIVE-10328
> Project: Hive
>  Issue Type: Task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10328.1.patch, HIVE-10328.10.patch, 
> HIVE-10328.11.patch, HIVE-10328.12.patch, HIVE-10328.13.patch, 
> HIVE-10328.2.patch, HIVE-10328.3.patch, HIVE-10328.4.patch, 
> HIVE-10328.4.patch, HIVE-10328.5.patch, HIVE-10328.6.patch, 
> HIVE-10328.7.patch, HIVE-10328.8.patch, HIVE-10328.9.patch, HIVE-10328.patch
>
>





[jira] [Commented] (HIVE-6990) Direct SQL fails when the explicit schema setting is different from the default one

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997365#comment-14997365
 ] 

Ashutosh Chauhan commented on HIVE-6990:


Failed tests are likely relevant. Also, as Sergey said, if this is not specified 
in the config, then the default (empty string for schema) will be used, which 
should be ok, correct?

> Direct SQL fails when the explicit schema setting is different from the 
> default one
> ---
>
> Key: HIVE-6990
> URL: https://issues.apache.org/jira/browse/HIVE-6990
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.14.0, 1.2.1
> Environment: hive + derby
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-6990.1.patch, HIVE-6990.2.patch, HIVE-6990.3.patch, 
> HIVE-6990.4.patch, HIVE-6990.5.patch
>
>
> I got the following ERROR in hive.log
> 2014-04-23 17:30:23,331 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(1756)) - Direct SQL failed, falling 
> back to ORM
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> PARTITIONS.PART_ID from PARTITIONS  inner join TBLS on PARTITIONS.TBL_ID = 
> TBLS.TBL_ID   inner join DBS on TBLS.DB_ID = DBS.DB_ID inner join 
> PARTITION_KEY_VALS as FILTER0 on FILTER0.PART_ID = PARTITIONS.PART_ID and 
> FILTER0.INTEGER_IDX = 0 where TBLS.TBL_NAME = ? and DBS.NAME = ? and 
> ((FILTER0.PART_KEY_VAL = ?))".
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:181)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:98)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:1833)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1806)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
> at com.sun.proxy.$Proxy11.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:3310)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
> at com.sun.proxy.$Proxy12.get_partitions_by_filter(Unknown Source)
> Reproduce steps:
> 1. set the following properties in hive-site.xml
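>  <property>
>   <name>javax.jdo.mapping.Schema</name>
>   <value>HIVE</value>
>  </property>
>  <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>user1</value>
>  </property>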
> 2. execute hive queries
> hive> create table mytbl ( key int, value string);
> hive> load data local inpath 'examples/files/kv1.txt' overwrite into table 
> mytbl;
> hive> select * from mytbl;
> hive> create view myview partitioned on (value) as select key, value from 
> mytbl where key=98;
> hive> alter view myview add partition (value='val_98') partition 
> (value='val_xyz');
> hive> alter view myview drop partition (value='val_xyz');
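
One plausible reading of the failure above is that the direct SQL query names its tables (PARTITIONS, TBLS, DBS) without a schema, so they resolve against the connection user's default schema rather than the one set in javax.jdo.mapping.Schema. The Python sketch below only illustrates that idea; the helper names `qualify` and `partition_ids_sql` are hypothetical and are not part of Hive or of the attached patches.

```python
def qualify(table: str, schema: str = "") -> str:
    """Return a quoted table name, prefixed with the schema when one is set."""
    quoted_table = f'"{table}"'
    if not schema:
        # Empty schema (the default discussed in the comment): leave the name
        # unqualified and let the connection's default schema resolve it.
        return quoted_table
    return f'"{schema}".{quoted_table}'


def partition_ids_sql(schema: str = "") -> str:
    """Build a shortened form of the failing query with qualified table names."""
    p = qualify("PARTITIONS", schema)
    t = qualify("TBLS", schema)
    return (
        f"select {p}.PART_ID from {p} "
        f"inner join {t} on {p}.TBL_ID = {t}.TBL_ID"
    )


if __name__ == "__main__":
    print(partition_ids_sql("HIVE"))
    print(partition_ids_sql())
```

With an empty schema the generated SQL keeps the unqualified table names, which matches the expectation in the comment that the default setting should keep working.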




[jira] [Updated] (HIVE-12364) Distcp job fails when run under Tez

2015-11-09 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12364:

Attachment: HIVE-12364.patch

> Distcp job fails when run under Tez
> ---
>
> Key: HIVE-12364
> URL: https://issues.apache.org/jira/browse/HIVE-12364
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-12364-branch-1.patch, HIVE-12364.patch
>
>
> PROBLEM:
> insert into/overwrite directory '/path' invokes distcp for the MoveTask, and 
> the query fails when the execution engine is Tez 
> set hive.exec.copyfile.maxsize=4;
> insert overwrite into '/tmp/testinser' select * from customer;
> failed at moveTask
> hive client log:
> {code}
> 2015-11-05 16:02:53,254 INFO  [main]: exec.FileSinkOperator 
> (Utilities.java:mvFileToFinalPath(1882)) - Moving tmp dir: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/_tmp.-ext-1
>  to: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
> 2015-11-05 16:02:53,611 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) -  method=task.DEPENDENCY_COLLECTION.Stage-2 
> from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task 
> [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task [Stage-0:MOVE] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: exec.Task 
> (SessionState.java:printInfo(951)) - Moving data to: /tmp/testindir from 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
> 2015-11-05 16:02:53,637 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(551)) - Source is 491763261 bytes. (MAX: 4)
> 2015-11-05 16:02:53,638 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
> 2015-11-05 16:03:03,924 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:04,081 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:20,210 INFO  [main]: hdfs.DFSClient 
> (DFSClient.java:getDelegationToken(1047)) - Created HDFS_DELEGATION_TOKEN 
> token 1069 for haha on ha-hdfs:hdpsecehdfs
> 2015-11-05 16:03:20,249 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: HDFS_DELEGATION_TOKEN, Service: 
> ha-hdfs:hdpsecehdfs, Ident: (HDFS_DELEGATION_TOKEN token 1069 for haha)
> 2015-11-05 16:03:20,250 WARN  [main]: token.Token 
> (Token.java:getClassForIdentifier(121)) - Cannot find class for token kind 
> kms-dt
> 2015-11-05 16:03:20,250 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: kms-dt, Service: 172.25.17.102:9292, Ident: 00 04 
> 68 61 68 61 02 72 6d 00 8a 01 50 da 1a ca 29 8a 01 50 fe 27 4e 29 03 02
> 2015-11-05 16:03:22,561 INFO  [main]: Configuration.deprecation 
> (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 2015-11-05 16:03:22,562 INFO  [main]: Configuration.deprecation 
> (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 2015-11-05 16:03:33,733 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Unable to move 
> source 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
>  to destination /tmp/testindir
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
>  to destination /tmp/testindir
> at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2665)
> at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
>
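
The two FileUtils log lines above ("Source is 491763261 bytes. (MAX: 4)" followed by "Launch distributed copy (distcp) job.") suggest a simple size threshold driven by hive.exec.copyfile.maxsize. The Python sketch below illustrates that decision only; the function name is hypothetical, and the strict "greater than" comparison is an assumption, not something the log confirms.

```python
def choose_copy_strategy(source_bytes: int, max_local_copy_bytes: int) -> str:
    """Pick a copy strategy from the source size and the configured maximum.

    Assumption: sources strictly larger than the maximum go to distcp;
    everything else is copied directly by the client.
    """
    if source_bytes > max_local_copy_bytes:
        return "distcp"
    return "local"


if __name__ == "__main__":
    # Values taken from the log excerpt: 491763261 bytes against MAX: 4.
    print(choose_copy_strategy(491763261, 4))
```

Under this reading, the setting in the reproduction steps forces almost any non-empty result onto the distcp path, which is where the move task fails.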

[jira] [Commented] (HIVE-12330) Fix precommit Spark test part2

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997794#comment-14997794
 ] 

Hive QA commented on HIVE-12330:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771455/HIVE-12330.4-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 9680 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_lvj_mapjoin
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_fsstat
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_insert_overwrite_local_directory_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_result_complex
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_tests
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_joins_explain
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_multi_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_decimal
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_dynamic_partition
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_group_by
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/995/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/995/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-995/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771455 - PreCommit-HIVE-SPARK-Build

> Fix precommit Spark test part2
> --
>
> Key: HIVE-12330
> URL: https://issues.apache.org/jira/browse/HIVE-12330
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Sergio Peña
>

[jira] [Comment Edited] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997797#comment-14997797
 ] 

Ashutosh Chauhan edited comment on HIVE-12017 at 11/10/15 1:34 AM:
---

I went through the golden file plan changes and found the following categories 
of plan diffs:
* 1) extra select operator : Many plans now have an extra select operator, 
e.g., auto_sortmerge_join_*.q
* 2) agg expr lost : In some tests, it seems we dropped the count(*) 
aggregation altogether, e.g., auto_smb_mapjoin_14.q, auto_sortmerge_join_10.q
* 3) Shuffle join warning : Some tests now generate a shuffle join warning, 
e.g., multiMapJoin2.q, orc_llap.q, parquet_join.q, pcr.q, pointlookup2.q
* 4) extra columns : seems like a column pruning issue, e.g., 
auto_join1.q, auto_join10.q, auto_join11.q
* 5) PTF op missing : It seems the PTF operator got dropped altogether, e.g., 
ptfgroupbyjoin.q
* 6) Non-skew-join plan : It seems skew join optimization is broken and we 
drop that optimization, e.g., skewjoin_mapjoin*.q

Among these, 1) & 4) are not a big concern. However, 2) & 5) could be 
correctness issues, and 3) & 6) could be substantial perf losses.


was (Author: ashutoshc):
I went through golden file plan changes and found following categories of plan 
diffs:
* 1) extra select operator : Many plans now have extra select operator in 
plans. e.g., auto_sortmerge_join_*.q
* 2) agg expr lost : In some tests, it seems like we dropped the aggregation 
altogether count (*) e.g, auto_smb_mapjoin_14.q,auto_sortmerge_join_10.q
* 3) Shuffle join warning : Some tests now are generating shuffle join warning, 
e.g, multiMapJoin2.q,orc_llap.q,parquet_join.q,pcr.q,pointlookup2.q
* 4) extra columns : seems like column pruning issue: 
auto_join1.q,auto_join10.q,auto_join11.q
* 5) PTF op missing : This one seems like ptf operator got dropped altogether 
ptfgroupbyjoin.q.
*  6) Non-skew-join plan : Seems like skew join optimization is broken and we 
drop that optimization. e.g., skewjoin_mapjoin*.q

Among these 1) & 4) are not a big concern. However, 2) & 5) could be 
correctness issue and 3) & 7) could be substantial perf losses.

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).




[jira] [Commented] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997797#comment-14997797
 ] 

Ashutosh Chauhan commented on HIVE-12017:
-

I went through the golden file plan changes and found the following categories 
of plan diffs:
* 1) extra select operator : Many plans now have an extra select operator, 
e.g., auto_sortmerge_join_*.q
* 2) agg expr lost : In some tests, it seems we dropped the count(*) 
aggregation altogether, e.g., auto_smb_mapjoin_14.q, auto_sortmerge_join_10.q
* 3) Shuffle join warning : Some tests now generate a shuffle join warning, 
e.g., multiMapJoin2.q, orc_llap.q, parquet_join.q, pcr.q, pointlookup2.q
* 4) extra columns : seems like a column pruning issue, e.g., 
auto_join1.q, auto_join10.q, auto_join11.q
* 5) PTF op missing : It seems the PTF operator got dropped altogether, e.g., 
ptfgroupbyjoin.q
* 6) Non-skew-join plan : It seems skew join optimization is broken and we 
drop that optimization, e.g., skewjoin_mapjoin*.q

Among these, 1) & 4) are not a big concern. However, 2) & 5) could be 
correctness issues, and 3) & 6) could be substantial perf losses.

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).




[jira] [Commented] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-11-09 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997839#comment-14997839
 ] 

Sergey Shelukhin commented on HIVE-12017:
-

Hmm.. that will expand the issues to many more queries though. Is CBO on by 
default on master? In that case we shouldn't commit before these are fixed.

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).




[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997922#comment-14997922
 ] 

Hive QA commented on HIVE-12366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771467/HIVE-12366.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9778 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5978/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5978/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5978/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771467 - PreCommit-HIVE-TRUNK-Build

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch
>
>
> Currently there is a gap between lock acquisition and the first heartbeat 
> being sent out. Normally the gap is negligible, but when it is large the 
> query fails, since the locks have timed out by the time the heartbeat is 
> sent.
> We need to remove this gap.
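
The timing problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the time units, and the numbers in the usage example are hypothetical and are not taken from Hive's Heartbeater or from the attached patches. It checks whether a lock acquired at t = 0 is always renewed before it times out, given the gap before the first heartbeat.

```python
def lock_survives(gap_before_first_heartbeat: int, interval: int,
                  lock_timeout: int, query_duration: int) -> bool:
    """Return True if the lock never goes unrenewed for longer than lock_timeout.

    The lock is acquired at t = 0. The first heartbeat is sent after
    gap_before_first_heartbeat, and further heartbeats follow every interval
    until the query finishes at query_duration.
    """
    last_renewal = 0
    next_heartbeat = gap_before_first_heartbeat
    while next_heartbeat <= query_duration:
        if next_heartbeat - last_renewal > lock_timeout:
            # The lock had already timed out when this heartbeat was sent.
            return False
        last_renewal = next_heartbeat
        next_heartbeat += interval
    # The lock must also still be valid when the query completes.
    return query_duration - last_renewal <= lock_timeout


if __name__ == "__main__":
    # No gap: the first heartbeat goes out right after lock acquisition.
    print(lock_survives(0, 150, 300, 1000))
    # Large gap: the first heartbeat arrives after the lock has timed out.
    print(lock_survives(400, 150, 300, 1000))
```

In this model, sending the first heartbeat immediately after acquisition (a gap of zero) removes the failure case regardless of how long the query takes to start.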




[jira] [Updated] (HIVE-11948) Investigate TxnHandler and CompactionTxnHandler to see where we improve concurrency

2015-11-09 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11948:
--
Attachment: HIVE-11948.4.patch

> Investigate TxnHandler and CompactionTxnHandler to see where we improve 
> concurrency
> ---
>
> Key: HIVE-11948
> URL: https://issues.apache.org/jira/browse/HIVE-11948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11948.3.patch, HIVE-11948.4.patch, HIVE-11948.patch
>
>
> at least some operations (or parts of operations) can run at READ_COMMITTED.
> CompactionTxnHandler.setRunAs()
> CompactionTxnHandler.findNextToCompact()
> if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause 
> and logic to look for "next" candidate
> CompactionTxnHandler.markCompacted()
> perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra 
> consistency check)
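
The suggestion above, adding the expected cq_state to the WHERE clause, amounts to a guarded update: the row changes only if it is still in the state the caller expects, and the affected-row count tells the caller whether that was the case. The Python sketch below shows the pattern against an in-memory SQLite table. The table layout, the single-letter state values, and the function names are assumptions made for illustration; this is not the CompactionTxnHandler code.

```python
import sqlite3

# Hypothetical single-character state markers, for illustration only.
WORKING_STATE = "w"
READY_FOR_CLEANING = "r"


def mark_compacted(conn: sqlite3.Connection, cq_id: int) -> bool:
    """Move an entry to READY_FOR_CLEANING only if it is still in WORKING_STATE.

    Returns True when exactly one row was updated, False when the entry was
    missing or already in another state (the extra consistency check).
    """
    cur = conn.execute(
        "UPDATE compaction_queue SET cq_state = ? "
        "WHERE cq_id = ? AND cq_state = ?",
        (READY_FOR_CLEANING, cq_id, WORKING_STATE),
    )
    conn.commit()
    return cur.rowcount == 1


def demo() -> tuple:
    """Run the guarded update twice on the same entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compaction_queue (cq_id INTEGER PRIMARY KEY, cq_state TEXT)"
    )
    conn.execute("INSERT INTO compaction_queue VALUES (1, ?)", (WORKING_STATE,))
    first = mark_compacted(conn, 1)   # entry is in WORKING_STATE
    second = mark_compacted(conn, 1)  # entry has already moved on
    conn.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

Because the state check is part of the UPDATE itself, a second caller sees zero affected rows instead of silently overwriting the state, which is the kind of check that may make a lower isolation level such as READ_COMMITTED acceptable for this step.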




[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997952#comment-14997952
 ] 

Hive QA commented on HIVE-12301:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771472/HIVE-12301.03.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9782 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5979/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5979/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5979/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771472 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch, HIVE-12301.02.patch, 
> HIVE-12301.03.patch
>
>
> The position in argList is mapped to a wrong column from RS operator




[jira] [Updated] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-09 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12362:
-
Attachment: HIVE-12362.2.patch

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12362.2.patch, HIVE-12362.patch
>
>
> {code}
> create table src (a string);
> insert into table src values (NULL), (''), ('');
> 0: jdbc:hive2://localhost:1/default> select * from src;
> +---+--+
> | src.a  |
> +---+--+
> | NULL  |
> ||
> ||
> +---+--+
> create table dest (a string) row format serde 
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as 
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> alter table dest set SERDEPROPERTIES ('serialization.null.format' = '');
> alter table dest set TBLPROPERTIES ('serialization.null.format' = '');
> insert overwrite table dest select * from src;
> 0: jdbc:hive2://localhost:1/default> select * from test11;
> +---+--+
> | test11.a  |
> +---+--+
> | NULL  |
> ||
> ||
> +---+--+
> {code}




[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997027#comment-14997027
 ] 

Hive QA commented on HIVE-12362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771364/HIVE-12362.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9777 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_null_format
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5972/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5972/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5972/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771364 - PreCommit-HIVE-TRUNK-Build

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12362.2.patch, HIVE-12362.patch
>
>
> {code}
> create table src (a string);
> insert into table src values (NULL), (''), ('');
> 0: jdbc:hive2://localhost:1/default> select * from src;
> +--------+
> | src.a  |
> +--------+
> | NULL   |
> |        |
> |        |
> +--------+
> create table dest (a string) row format serde 
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as 
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> alter table dest set SERDEPROPERTIES ('serialization.null.format' = '');
> alter table dest set TBLPROPERTIES ('serialization.null.format' = '');
> insert overwrite table dest select * from src;
> 0: jdbc:hive2://localhost:1/default> select * from test11;
> +-----------+
> | test11.a  |
> +-----------+
> | NULL      |
> |           |
> |           |
> +-----------+
> {code}




[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-09 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997102#comment-14997102
 ] 

Jason Dere commented on HIVE-11878:
---

Hi [~rdsr], I think that sounds good.

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_approach3_per_session_clasloader.patch, HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console, Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classloader is the 
> current ThreadContextClassLoader. Once the URLClassLoader is created, Hive 
> sets it as the current ThreadContextClassLoader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} Class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code} . The 
> current thread's context class-loader is *u2*, and it has the path to the class 
> *c1*, but note that class-loaders work by delegating to the parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1*, which references the class *c2*, *c2* will not 
> be found, since the class-loader used to search for *c2* will be *u1* (the 
> caller's class-loader is used to load a class).
> I've added a qtest to explain the problem. Please see the attached patch.
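
The loader chain described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `register` and `jarUrl` are hypothetical helpers (not Hive's `Utilities#addToClassPath`), and the jar paths are placeholders that do not need to exist. It shows that *u1* only knows about *j1* while *u2* knows both jars and has *u1* as its parent.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClassLoaderChainSketch {

    // Hypothetical stand-in for the registration step: the new loader gets the
    // parent's jar paths plus the new jar, and the previous loader as its parent.
    static URLClassLoader register(URLClassLoader current, URL jar) {
        List<URL> urls = new ArrayList<>(Arrays.asList(current.getURLs()));
        urls.add(jar);
        return new URLClassLoader(urls.toArray(new URL[0]), current);
    }

    // Builds a file URL for a placeholder jar path (the file is never opened here).
    static URL jarUrl(String path) {
        try {
            return new File(path).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns {paths visible to u1, paths visible to u2, 1 if u2's parent is u1 else 0}.
    static int[] demo() {
        URLClassLoader base = new URLClassLoader(new URL[0]);
        URLClassLoader u1 = register(base, jarUrl("/tmp/j1.jar"));
        URLClassLoader u2 = register(u1, jarUrl("/tmp/j2.jar"));
        return new int[] {
            u1.getURLs().length,
            u2.getURLs().length,
            u2.getParent() == u1 ? 1 : 0
        };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("u1 paths=" + r[0] + ", u2 paths=" + r[1]
            + ", u2.parent==u1: " + (r[2] == 1));
    }
}
```

Because a URLClassLoader asks its parent before searching its own paths, a class present in *j1* is defined by *u1*, and anything that class references is then resolved through *u1*, which has no path to *j2*.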




[jira] [Commented] (HIVE-12338) Add webui to HiveServer2

2015-11-09 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997133#comment-14997133
 ] 

Szehon Ho commented on HIVE-12338:
--

The first part should be covered in another JIRA, HIVE-10926; we should be 
able to hook in once this is in place?

> Add webui to HiveServer2
> 
>
> Key: HIVE-12338
> URL: https://issues.apache.org/jira/browse/HIVE-12338
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> A web ui for HiveServer2 can show some useful information such as:
>  
> 1. Sessions,
> 2. Queries that are executing on the HS2, their states, starting time, etc.




[jira] [Commented] (HIVE-11948) Investigate TxnHandler and CompactionTxnHandler to see where we improve concurrency

2015-11-09 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996860#comment-14996860
 ] 

Eugene Koifman commented on HIVE-11948:
---

This needs a change around TxnManager.lock().

> Investigate TxnHandler and CompactionTxnHandler to see where we improve 
> concurrency
> ---
>
> Key: HIVE-11948
> URL: https://issues.apache.org/jira/browse/HIVE-11948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11948.patch
>
>
> at least some operations (or parts of operations) can run at READ_COMMITTED.
> CompactionTxnHandler.setRunAs()
> CompactionTxnHandler.findNextToCompact()
> if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause 
> and logic to look for "next" candidate
> CompactionTxnHandler.markCompacted()
> perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra 
> consistency check)
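
The idea of putting the expected state into the WHERE clause can be illustrated with an in-memory analogy. This is a sketch under assumptions, not the TxnHandler code: the map stands in for the compaction queue table, the `State` enum and `claim` method are hypothetical names, and `ConcurrentMap.replace(key, old, new)` plays the role of an UPDATE whose WHERE clause checks the current state and reports whether a row was changed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConditionalClaimSketch {

    // Hypothetical states; the real column values are not assumed here.
    enum State { INITIATED, WORKING }

    // In-memory stand-in for the compaction queue: entry id -> state.
    static final ConcurrentMap<Long, State> queue = new ConcurrentHashMap<>();

    // Analogous to "UPDATE ... SET state = WORKING WHERE id = ? AND state = INITIATED":
    // succeeds only if the entry is still in the expected state.
    static boolean claim(long id) {
        return queue.replace(id, State.INITIATED, State.WORKING);
    }

    // Two workers try to claim the same entry; only the first one wins.
    static boolean[] demo() {
        queue.put(1L, State.INITIATED);
        boolean first = claim(1L);
        boolean second = claim(1L);
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first claim: " + r[0] + ", second claim: " + r[1]);
    }
}
```

With the state check folded into the update itself, a losing worker sees "zero rows changed" and can move on to the next candidate, which is why a weaker isolation level may be sufficient for this step.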




[jira] [Updated] (HIVE-11684) Implement limit pushdown through outer join in CBO

2015-11-09 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11684:
---
Attachment: HIVE-11684.16.patch

> Implement limit pushdown through outer join in CBO
> --
>
> Key: HIVE-11684
> URL: https://issues.apache.org/jira/browse/HIVE-11684
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, 
> HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, 
> HIVE-11684.07.patch, HIVE-11684.08.patch, HIVE-11684.09.patch, 
> HIVE-11684.10.patch, HIVE-11684.11.patch, HIVE-11684.12.patch, 
> HIVE-11684.12.patch, HIVE-11684.14.patch, HIVE-11684.15.patch, 
> HIVE-11684.16.patch, HIVE-11684.patch
>
>





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-09 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996795#comment-14996795
 ] 

Yongzhi Chen commented on HIVE-12182:
-

[~ngangam], you may need to rebase your change.

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.2.patch, HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+
> 7 rows selected (0.108 seconds)
> {code}




[jira] [Commented] (HIVE-12372) Improve to support the multibyte character at lpad and rpad

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996827#comment-14996827
 ] 

Hive QA commented on HIVE-12372:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771355/HIVE-12372.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9778 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5971/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5971/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5971/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771355 - PreCommit-HIVE-TRUNK-Build

> Improve to support the multibyte character at lpad and rpad
> ---
>
> Key: HIVE-12372
> URL: https://issues.apache.org/jira/browse/HIVE-12372
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
>  Labels: udf
> Attachments: HIVE-12372.1.patch
>
>
> The current lpad and rpad don't support multibyte characters in "str" and 
> "pad".
> For example, we see the following result.
> {code}
> hive> select name from sample1;
> OK
> tokyo
> ＴＯＫＹＯ
> hive> select lpad(name, 20, '*') from sample1;
> OK
> ***************tokyo
> *****ＴＯＫＹＯ
> {code}
> This is improved as follows.
> {code}
> hive> select lpad(name, 20, '*') from sample1;
> ***************tokyo
> ***************ＴＯＫＹＯ
> {code}
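
The difference between counting bytes and counting characters can be sketched as follows. This is an illustrative example, not the attached patch: `LpadSketch` and its methods are hypothetical names, and the padding rules (truncate when the target length is shorter, return the input unchanged for an empty pad) are assumptions made for the sketch.

```java
import java.nio.charset.StandardCharsets;

public class LpadSketch {

    // Length in UTF-8 bytes: full-width characters take 3 bytes each.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Length in Unicode code points, which is what a user perceives as characters.
    static int charLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Left-pads str to len characters, counting code points rather than bytes.
    static String lpad(String str, int len, String pad) {
        int strLen = charLength(str);
        if (len <= strLen) {
            // Truncate on a code point boundary.
            return str.substring(0, str.offsetByCodePoints(0, Math.max(len, 0)));
        }
        if (pad.isEmpty()) {
            return str; // assumption: nothing to pad with
        }
        int[] padCodePoints = pad.codePoints().toArray();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len - strLen; i++) {
            sb.appendCodePoint(padCodePoints[i % padCodePoints.length]);
        }
        return sb.append(str).toString();
    }

    public static void main(String[] args) {
        String wide = "\uFF34\uFF2F\uFF2B\uFF39\uFF2F"; // full-width "TOKYO"
        System.out.println(utf8Length(wide) + " bytes, " + charLength(wide) + " characters");
        System.out.println(lpad("tokyo", 20, "*"));
        System.out.println(lpad(wide, 20, "*"));
    }
}
```

A byte-based length treats the full-width string as already 15 units long and adds too little padding, whereas the code point count pads both inputs to the same visible width.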




[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.2.patch

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it is large 
> it causes the query to fail, since the locks have timed out by the time the 
> heartbeat is sent.
> We need to remove this gap.
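
One way to close such a gap, shown purely as a sketch and not as the approach taken in the attached patches, is to send the first heartbeat synchronously as part of starting the heartbeater and only then schedule the periodic ones. All names here (`HeartbeatSketch`, `sendHeartbeat`, `startHeartbeat`) are hypothetical, and the heartbeat is reduced to a counter.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {

    // Counts heartbeats; a real implementation would call the lock manager instead.
    static final AtomicInteger heartbeatsSent = new AtomicInteger();

    static void sendHeartbeat() {
        heartbeatsSent.incrementAndGet();
    }

    // Intended to be called right after locks are acquired: the first heartbeat
    // goes out immediately, the rest follow at the fixed interval.
    static ScheduledExecutorService startHeartbeat(long intervalMs) {
        sendHeartbeat();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
            HeartbeatSketch::sendHeartbeat, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // Returns how many heartbeats were sent before the first interval elapsed.
    static int demo() {
        heartbeatsSent.set(0);
        ScheduledExecutorService scheduler = startHeartbeat(60_000L);
        int sentBeforeFirstInterval = heartbeatsSent.get();
        scheduler.shutdownNow();
        return sentBeforeFirstInterval;
    }

    public static void main(String[] args) {
        System.out.println("heartbeats sent immediately: " + demo());
    }
}
```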




[jira] [Commented] (HIVE-12355) Keep Obj Inspectors in Sync with RowSchema

2015-11-09 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997319#comment-14997319
 ] 

Xuefu Zhang commented on HIVE-12355:


Removed the fix version; it can be filled in once this gets resolved.

> Keep Obj Inspectors in Sync with RowSchema
> --
>
> Key: HIVE-12355
> URL: https://issues.apache.org/jira/browse/HIVE-12355
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.0.0, 1.1.0, 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
>
> Currently, not all operators match their output ObjectInspectors to the row schema.
> Often there are more OutputObjectInspectors than needed.
> This causes problems, especially with union.




[jira] [Commented] (HIVE-11201) HCatalog is ignoring user specified avro schema in the table definition

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997238#comment-14997238
 ] 

Ashutosh Chauhan commented on HIVE-11201:
-

+1

> HCatalog  is ignoring user specified avro schema in the table definition
> 
>
> Key: HIVE-11201
> URL: https://issues.apache.org/jira/browse/HIVE-11201
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-11201.1.patch
>
>
> HCatalog is ignoring the user-specified Avro schema in the table definition, 
> instead generating its own Avro schema based on the Hive metastore. 
> Generating its own schema results in mismatched names. For example, Avro 
> field names are case sensitive. Generating its own schema results in an 
> incorrect schema being written to the Avro file, and causes selects to fail on 
> read. Also, even if the user-specified schema does not allow null, when 
> data is written using HCatalog, it writes a schema that allows null. 
> For example, in the table the user specified all CAPITAL letters in the 
> schema, and the record name as LINEITEM. The schema should be written as it is. 
> Instead HCatalog ignores it and generates its own Avro schema from the Hive 
> table case. 




[jira] [Commented] (HIVE-12196) NPE when converting bad timestamp value

2015-11-09 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997228#comment-14997228
 ] 

Chaoyu Tang commented on HIVE-12196:


Patch looks good to me. +1. 
Actually, I think in the second case TIMESTAMP '2015-04-11-12:24:34.535' is a 
timestamp literal, and the literal provided here for ts was not in the correct 
format. It is a kind of syntax error.

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is an NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}
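
The failure mode reported above is typical of a parse step that signals bad input with null while the caller dereferences the result. The following sketch illustrates that pattern with `java.sql.Timestamp`; it is not Hive's conversion code, and `parseOrNull` and `toEpochMillis` are hypothetical helpers.

```java
import java.sql.Timestamp;

public class TimestampParseSketch {

    // Returns null for input that is not in "yyyy-mm-dd hh:mm:ss[.f...]" form.
    static Timestamp parseOrNull(String s) {
        try {
            return Timestamp.valueOf(s);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    // Checks the parse result before using it, so bad input produces a
    // descriptive error instead of a NullPointerException.
    static long toEpochMillis(String s) {
        Timestamp ts = parseOrNull(s);
        if (ts == null) {
            throw new IllegalArgumentException("Unable to convert '" + s + "' to a timestamp");
        }
        return ts.getTime();
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("2015-04-11 12:24:34.535")); // well-formed value
        System.out.println(parseOrNull("2015-04-11-12:24:34.535")); // dash instead of space
    }
}
```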




[jira] [Commented] (HIVE-12330) Fix precommit Spark test part2

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997172#comment-14997172
 ] 

Hive QA commented on HIVE-12330:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771135/HIVE-12330.4-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 9159 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-alter_table_not_sorted.q-groupby4_map.q-exim_11_managed_external.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-auto_join30.q-orc_ppd_decimal.q-input17.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-auto_join_reordering_values.q-vector_decimal_trailing.q-udf_concat_ws.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-bucket_map_join_tez1.q-fetch_aggregation.q-drop_database_removes_partition_dirs.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-bucketsortoptimize_insert_7.q-dynpart_sort_optimization2.q-decimal_3.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_lineage2.q-authorization_update.q-udf_pmod.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-compute_stats_string.q-show_columns.q-vector_auto_smb_mapjoin_14.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-gby_star.q-decimal_join.q-lateral_view_cp.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-groupby3_map.q-current_date_timestamp.q-skewjoinopt8.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-groupby_grouping_sets5.q-auto_sortmerge_join_13.q-udf_concat.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-groupby_map_ppr_multi_distinct.q-vectorization_16.q-union_remove_15.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-index_compact.q-bucketcontext_3.q-merge_dynamic_partition2.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-insert_values_non_partitioned.q-union5.q-udf_lower.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-metadataonly1.q-union19.q-groupby_grouping_sets6.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-skewjoin_mapjoin7.q-optional_outer.q-ptf_rcfile.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-udf_bitwise_and.q-bucketcontext_4.q-orc_ends_with_nulls.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-udf_log2.q-cbo_rp_views.q-alter_merge_stats.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-unicode_notation.q-ppd_join4.q-union27.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-vectorization_10.q-list_bucket_dml_2.q-union26.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-vectorized_parquet.q-stats12.q-binarysortable_1.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-view.q-udf2.q-input16.q-and-12-more - did not produce a 
TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_7.q-tez_union_group_by.q-orc_merge9.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-delete_where_non_partitioned.q-auto_sortmerge_join_16.q-skewjoin.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-union2.q-unionDistinct_2.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-avro_joins.q-join36.q-join4.q-and-12-more - did not produce 
a TEST-*.xml file
TestSparkCliDriver-bucketmapjoin3.q-enforce_order.q-union11.q-and-12-more - did 
not produce a TEST-*.xml file
TestSparkCliDriver-groupby_complex_types.q-auto_join9.q-groupby_map_ppr.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-runtime_skewjoin_mapjoin_spark.q-union_remove_9.q-ppd_multi_insert.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-smb_mapjoin_15.q-mapreduce2.q-mapreduce1.q-and-12-more - did 
not produce a TEST-*.xml file
TestSparkCliDriver-union22.q-union_remove_23.q-transform_ppr2.q-and-5-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-union_top_level.q-auto_join1.q-join18.q-and-12-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/994/testReport
Console output:

[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-09 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997246#comment-14997246
 ] 

Pengcheng Xiong commented on HIVE-12301:


The test case failures are unrelated, and I cannot reproduce 
testCliDriver_cbo_udf_max on my Mac. Thanks.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch, HIVE-12301.02.patch
>
>
> The position in argList is mapped to a wrong column from RS operator




[jira] [Commented] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997208#comment-14997208
 ] 

Ashutosh Chauhan commented on HIVE-4577:


Hadoop's FsShell is passed a token[] that GenericOptionsParser produces by 
parsing the string. 
https://github.com/apache/hadoop/blob/release-2.6.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ToolRunner.java#L64
 
I think instead of adding custom splitCmd() to parse it, using 
GenericOptionsParser will make us consistent with Hadoop's parsing code.

> hive CLI can't handle hadoop dfs command  with space and quotes.
> 
>
> Key: HIVE-4577
> URL: https://issues.apache.org/jira/browse/HIVE-4577
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, 
> HIVE-4577.3.patch.txt, HIVE-4577.4.patch
>
>
> As designed, Hive supports hadoop dfs commands in the Hive shell, like 
> hive> dfs -mkdir /user/biadmin/mydir;
> but it behaves differently from hadoop if the path contains spaces and quotes
> hive> dfs -mkdir "hello"; 
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:40 
> /user/biadmin/"hello"
> hive> dfs -mkdir 'world';
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:43 
> /user/biadmin/'world'
> hive> dfs -mkdir "bei jing";
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/"bei
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/jing"
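
The listings above are what results when the command string is split on whitespace without interpreting quotes. A quote-aware split can be sketched as below. This is an illustration only: `DfsCommandSplitSketch.split` is a hypothetical helper, not the splitCmd() from the patch and not Hadoop's GenericOptionsParser, and it deliberately ignores escape sequences.

```java
import java.util.ArrayList;
import java.util.List;

public class DfsCommandSplitSketch {

    // Splits on whitespace, treating text inside matching single or double
    // quotes as part of one token and dropping the quote characters.
    static List<String> split(String cmd) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // the open quote character, or 0 if none
        boolean inToken = false; // true once the current token has started
        for (char c : cmd.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;
                } else {
                    current.append(c);
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("Unterminated quote in: " + cmd);
        }
        if (inToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("-mkdir \"bei jing\""));
        System.out.println(split("-mkdir 'world'"));
    }
}
```

With this kind of tokenization, the third command above would produce a single directory argument containing a space rather than two directories with stray quote characters.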




[jira] [Updated] (HIVE-12355) Keep Obj Inspectors in Sync with RowSchema

2015-11-09 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-12355:
---
Fix Version/s: (was: 2.0.0)

> Keep Obj Inspectors in Sync with RowSchema
> --
>
> Key: HIVE-12355
> URL: https://issues.apache.org/jira/browse/HIVE-12355
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.0.0, 1.1.0, 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
>
> Currently, not all operators match their output ObjectInspectors to the row schema.
> Often there are more OutputObjectInspectors than needed.
> This causes problems, especially with union.




[jira] [Updated] (HIVE-11948) Investigate TxnHandler and CompactionTxnHandler to see where we improve concurrency

2015-11-09 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11948:
--
Attachment: HIVE-11948.5.patch

> Investigate TxnHandler and CompactionTxnHandler to see where we improve 
> concurrency
> ---
>
> Key: HIVE-11948
> URL: https://issues.apache.org/jira/browse/HIVE-11948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11948.3.patch, HIVE-11948.4.patch, 
> HIVE-11948.5.patch, HIVE-11948.patch
>
>
> at least some operations (or parts of operations) can run at READ_COMMITTED.
> CompactionTxnHandler.setRunAs()
> CompactionTxnHandler.findNextToCompact()
> if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause 
> and logic to look for "next" candidate
> CompactionTxnHandler.markCompacted()
> perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra 
> consistency check)




[jira] [Commented] (HIVE-11948) Investigate TxnHandler and CompactionTxnHandler to see where we improve concurrency

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998118#comment-14998118
 ] 

Hive QA commented on HIVE-11948:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771497/HIVE-11948.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9778 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_generatehfiles_require_family_path
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.metastore.txn.TestTxnHandler.showLocks
org.apache.hive.hcatalog.streaming.TestStreaming.testHearbeat
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5981/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5981/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5981/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771497 - PreCommit-HIVE-TRUNK-Build

> Investigate TxnHandler and CompactionTxnHandler to see where we improve 
> concurrency
> ---
>
> Key: HIVE-11948
> URL: https://issues.apache.org/jira/browse/HIVE-11948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11948.3.patch, HIVE-11948.4.patch, 
> HIVE-11948.5.patch, HIVE-11948.patch
>
>
> at least some operations (or parts of operations) can run at READ_COMMITTED.
> CompactionTxnHandler.setRunAs()
> CompactionTxnHandler.findNextToCompact()
> if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause 
> and logic to look for "next" candidate
> CompactionTxnHandler.markCompacted()
> perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra 
> consistency check)




[jira] [Commented] (HIVE-12370) Hive Query got failure with larger scale data set with enablng sampling order optimization

2015-11-09 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998086#comment-14998086
 ] 

Lefty Leverenz commented on HIVE-12370:
---

I think you meant another Xuefu:  [~xuefuz].  ;)

> Hive Query got failure with larger scale data set with enablng sampling order 
> optimization
> --
>
> Key: HIVE-12370
> URL: https://issues.apache.org/jira/browse/HIVE-12370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Yi Zhou
>
> Found that Hive on MR fails with larger-scale data (e.g., 3TB/10TB) 
> when sampling optimization is enabled (it passes with a 
> 1GB data set).
> hive.optimize.sampling.orderby=true
> hive.optimize.sampling.orderby.number=2
> hive.optimize.sampling.orderby.percent=0.1
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> ... 9 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask




[jira] [Updated] (HIVE-12045) ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)

2015-11-09 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-12045:
--
Attachment: HIVE-12045.2-spark.patch

Add test for the generic UDF.

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> --
>
> Key: HIVE-12045
> URL: https://issues.apache.org/jira/browse/HIVE-12045
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
> Environment: Cloudera QuickStart VM - CDH5.4.2
> beeline
>Reporter: Zsolt Tóth
>Assignee: Rui Li
> Attachments: HIVE-12045.1-spark.patch, HIVE-12045.2-spark.patch, 
> example.jar, genUDF.patch
>
>
> If I execute the following query in beeline, I get ClassNotFoundException for 
> the UDF class.
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 
> 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf just looks for the 1st argument's value in the 
> others and returns the index. I don't think this is related to the actual 
> GenericUDF function.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" succeeds
> If I use the non-generic implementation of the same UDF, the select distinct 
> call succeeds.
> StackTrace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: 
> hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
>

[jira] [Assigned] (HIVE-12285) Add locking to HCatClient

2015-11-09 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-12285:
-

Assignee: Carl Steinbach  (was: Elliot West)

> Add locking to HCatClient
> -
>
> Key: HIVE-12285
> URL: https://issues.apache.org/jira/browse/HIVE-12285
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Carl Steinbach
>  Labels: concurrency, hcatalog, lock, locking, locks
>
> With the introduction of a concurrency model (HIVE-1293) Hive uses locks to 
> coordinate access and updates to both table data and metadata. Within the
> Hive CLI such lock management is seamless. However, Hive provides additional 
> APIs that permit interaction with data repositories, namely the HCatalog 
> APIs. Currently, operations implemented by this API do not participate in
> Hive's locking scheme. Furthermore, access to the locking mechanisms is not 
> exposed by the APIs (as is the case with the Metastore Thrift API) and so 
> users are not able to explicitly interact with locks either. This has created 
> a less than ideal situation where users of the APIs have no choice but to 
> manipulate these data repositories outside of the command of Hive's lock 
> management, potentially resulting in situations where data inconsistencies 
> can occur both for external processes using the API and for queries executing 
> within Hive.
> h3. Scope of work
> This ticket is concerned with sections of the HCatalog API that deal with DDL 
> type operations using the metastore, not with those whose purpose is to 
> read/write table data. A separate issue already exists for adding locking to 
> HCat readers and writers (HIVE-6207).
> h3. Proposed work
> The following work items would serve as a minimum deliverable that would
> allow API users to effectively work with locks:
> * Comprehensively document on the wiki the locks required for various Hive 
> operations. At a minimum this should cover all operations exposed by 
> {{HCatClient}}. The [Locking design 
> document|https://cwiki.apache.org/confluence/display/Hive/Locking] can be 
> used as a starting point or perhaps updated.
> * Implement methods and types in the {{HCatClient}} API that allow users to 
> manipulate Hive locks. For the most part I'd expect these to delegate to the 
> metastore API implementations:
> ** {{org.apache.hadoop.hive.metastore.IMetaStoreClient.lock(LockRequest)}}
> ** {{org.apache.hadoop.hive.metastore.IMetaStoreClient.checkLock(long)}}
> ** {{org.apache.hadoop.hive.metastore.IMetaStoreClient.unlock(long)}}
> ** -{{org.apache.hadoop.hive.metastore.IMetaStoreClient.showLocks()}}-
> ** {{org.apache.hadoop.hive.metastore.IMetaStoreClient.heartbeat(long, long)}}
> ** {{org.apache.hadoop.hive.metastore.api.LockComponent}}
> ** {{org.apache.hadoop.hive.metastore.api.LockRequest}}
> ** {{org.apache.hadoop.hive.metastore.api.LockResponse}}
> ** {{org.apache.hadoop.hive.metastore.api.LockLevel}}
> ** {{org.apache.hadoop.hive.metastore.api.LockType}}
> ** {{org.apache.hadoop.hive.metastore.api.LockState}}
> ** -{{org.apache.hadoop.hive.metastore.api.ShowLocksResponse}}-
> h3. Additional proposals
> Explicit lock management should be fairly simple to add to {{HCatClient}};
> however, it puts the onus on the API user to correctly understand and
> implement code that uses locks in an appropriate manner. Failure to do so may
> have undesirable consequences. With a simpler user model the operations 
> exposed on the API would automatically acquire and release the locks that 
> they need. This might work well for small numbers of operations, but not 
> perhaps for large sequences of invocations. (Do we need to worry about this,
> though, as the API methods usually accept batches?) Additionally, tasks such
> as heartbeat management could also be handled implicitly for long-running
> sets of operations. With these concerns in mind it may also be beneficial to 
> deliver some of the following:
> * A means to automatically acquire/release appropriate locks for 
> {{HCatClient}} operations.
> * A component that maintains a lock heartbeat from the client.
> * A strategy for switching between manual/automatic lock management, 
> analogous to SQL's {{autocommit}} for transactions.
> An API for lock and heartbeat management already exists in the HCatalog 
> Mutation API (see: 
> {{org.apache.hive.hcatalog.streaming.mutate.client.lock}}). It will likely 
> make sense to refactor either this code and/or code that uses it.
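
A minimal sketch of the "automatic acquire/release plus heartbeat" idea from the additional proposals, under stated assumptions. LockService is a hypothetical, simplified stand-in for the IMetaStoreClient lock(), heartbeat() and unlock() calls listed above (the real signatures take LockRequest objects and transaction ids and are not reproduced here), and ManagedLock is an invented name. The lock is released and the heartbeat stopped when the try-with-resources block exits.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for the metastore lock calls named in the ticket.
interface LockService {
    long lock(String resource);
    void heartbeat(long lockId);
    void unlock(long lockId);
}

// Holds a lock for the duration of a try-with-resources block and keeps it alive.
final class ManagedLock implements AutoCloseable {
    private final LockService service;
    private final long lockId;
    private final ScheduledExecutorService heartbeater;

    private ManagedLock(LockService service, long lockId, long intervalMillis) {
        this.service = service;
        this.lockId = lockId;
        // Daemon thread so a forgotten lock does not keep the JVM alive.
        this.heartbeater = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-heartbeat-" + lockId);
            t.setDaemon(true);
            return t;
        });
        heartbeater.scheduleAtFixedRate(
            () -> service.heartbeat(lockId), intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    // Acquires the lock and starts heartbeating every intervalMillis (must be > 0).
    static ManagedLock acquire(LockService service, String resource, long intervalMillis) {
        return new ManagedLock(service, service.lock(resource), intervalMillis);
    }

    long id() {
        return lockId;
    }

    @Override
    public void close() {
        heartbeater.shutdownNow();   // stop heartbeating first
        service.unlock(lockId);      // then release the lock
    }
}
```

Usage would look like `try (ManagedLock l = ManagedLock.acquire(service, "db.tbl", 30_000L)) { ... }`; a manual mode would simply expose the LockService calls directly, which is the switch the ticket compares to autocommit.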




[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998020#comment-14998020
 ] 

Wei Zheng commented on HIVE-12366:
--

[~ekoifman] Can you take a look at patch 4? The RB link is attached as well.

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch
>
>
> Currently there is a gap between the time the locks are acquired and the
> time the first heartbeat is sent out. Normally the gap is negligible, but
> when it is large it causes the query to fail, since the locks have timed out
> by the time the heartbeat is sent.
> We need to remove this gap.
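
One way to picture closing the gap, as a sketch only and not the attached patch: start the heartbeat task with an initial delay of zero as soon as the locks are acquired, rather than waiting a full interval before the first beat. The names HeartbeatSketch and startHeartbeat are hypothetical, and the Runnable stands in for whatever call actually sends the heartbeat.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class HeartbeatSketch {
    // Call this immediately after lock acquisition. The caller shuts the returned
    // executor down when the locks are released.
    static ScheduledExecutorService startHeartbeat(Runnable sendHeartbeat, long intervalMillis) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heartbeater");
            t.setDaemon(true);
            return t;
        });
        // Initial delay of 0: the first heartbeat is submitted right away,
        // subsequent ones follow every intervalMillis.
        pool.scheduleAtFixedRate(sendHeartbeat, 0L, intervalMillis, TimeUnit.MILLISECONDS);
        return pool;
    }
}
```

With a non-zero initial delay, a slow step between acquisition and the first scheduled run could exceed the lock timeout; with a zero delay the first beat no longer depends on the interval length.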




[jira] [Commented] (HIVE-12373) Interner should return identical map or list

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998029#comment-14998029
 ] 

Hive QA commented on HIVE-12373:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771471/HIVE-12373.1.patch.txt

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 9779 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.binaryPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.booleanPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.createTable
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doublePartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringPartitionStatistics
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteAllNonPartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteAllPartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteAllWherePartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteOnePartition
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteOnePartitionWhere
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testDeleteWhereNoPartition
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testInsertSelect
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testInsertValues
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testInsertValuesPartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateAllNonPartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateAllNonPartitionedWhere
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateAllPartitioned
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateAllPartitionedWhere
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateOnePartition
org.apache.hadoop.hive.ql.parse.TestUpdateDeleteSemanticAnalyzer.testUpdateOnePartitionWhere
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5980/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5980/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5980/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771471 - PreCommit-HIVE-TRUNK-Build

> Interner should return identical map or list
> 
>
> Key: HIVE-12373
> URL: https://issues.apache.org/jira/browse/HIVE-12373
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-12373.1.patch.txt
>
>
> Currently, HiveStringUtils.intern(map/list) returns a new instance of the map
> or list. This breaks code written in the style shown below (Spark code in
> HiveMetastoreCatalog):
> {code}
> val serdeParameters = new java.util.HashMap[String, String]()
> serdeInfo.setParameters(serdeParameters)
> // these properties will be gone
> table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> {code}
> Luckily for Spark, the interner was (by mistake) not applied in the released
> versions of Hive (1.2.0, 1.2.1). But it will cause problems someday.
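
The difference can be reproduced in a few lines of plain Java. This is a sketch under assumptions, not HiveStringUtils itself: InternSketch, internCopy and internInPlace are hypothetical names, and the in-place variant assumes the caller passes a mutable map. A setter that stores the result of the copying variant loses any entries the caller adds to its own map afterwards, while the in-place variant returns the identical instance.

```java
import java.util.HashMap;
import java.util.Map;

final class InternSketch {
    private static String internOrNull(String s) {
        return s == null ? null : s.intern();
    }

    // Copying variant: the result is a NEW map, detached from the argument,
    // so later writes to the argument are not visible through the result.
    static Map<String, String> internCopy(Map<String, String> map) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            out.put(internOrNull(e.getKey()), internOrNull(e.getValue()));
        }
        return out;
    }

    // In-place variant: entries are re-inserted interned and the SAME instance
    // is returned, so the caller's reference stays valid.
    static Map<String, String> internInPlace(Map<String, String> map) {
        Map<String, String> snapshot = new HashMap<>(map);
        map.clear();
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            map.put(internOrNull(e.getKey()), internOrNull(e.getValue()));
        }
        return map;
    }
}
```

In the Spark snippet above, serdeParameters is filled after setParameters() is called, so only an interner that keeps the original instance would preserve those properties.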




[jira] [Updated] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2015-11-09 Thread Hui Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Zheng updated HIVE-11531:
-
Attachment: HIVE-11531.patch

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
> Attachments: HIVE-11531.WIP.1.patch, HIVE-11531.WIP.2.patch, 
> HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first-class support for
> "skip" to the existing LIMIT, or improve ROW_NUMBER for better performance.
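
To illustrate why a "skip" could reuse a TopN-style optimization: for LIMIT X,Y over an ordered result, only the first X+Y rows in sort order are ever needed, so a bounded heap of that size is enough before the first X rows are dropped. The sketch below is plain Java, not Hive's ReduceSink code; OffsetLimitSketch and page are hypothetical names, and it assumes the MySQL reading of LIMIT X,Y (X is the offset, Y the row count) with ascending integer keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class OffsetLimitSketch {
    // Returns rows [offset, offset + count) of the ascending order of 'rows'
    // while holding at most offset + count rows in memory.
    static List<Integer> page(Iterable<Integer> rows, int offset, int count) {
        if (offset < 0 || count < 0) {
            throw new IllegalArgumentException("offset and count must be non-negative");
        }
        int keep = Math.addExact(offset, count);
        if (keep == 0) {
            return new ArrayList<>();
        }
        // Max-heap of the 'keep' smallest rows seen so far; the root is the largest kept row.
        PriorityQueue<Integer> heap = new PriorityQueue<>(keep, Collections.reverseOrder());
        for (Integer row : rows) {
            if (heap.size() < keep) {
                heap.add(row);
            } else if (row < heap.peek()) {
                heap.poll();
                heap.add(row);
            }
        }
        List<Integer> sorted = new ArrayList<>(heap);
        Collections.sort(sorted);
        // Drop the first 'offset' rows; what remains is the requested page.
        int from = Math.min(offset, sorted.size());
        return new ArrayList<>(sorted.subList(from, sorted.size()));
    }
}
```

The memory bound grows with X, so this helps most for shallow pages; deep pagination would still need a different strategy.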




[jira] [Commented] (HIVE-12370) Hive Query got failure with larger scale data set with enabling sampling order optimization

2015-11-09 Thread Yi Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998077#comment-14998077
 ] 

Yi Zhou commented on HIVE-12370:


Thank you, [~xuefu.w...@kodak.com]!
In our case, the query fails with the larger data set, but a small data set is OK.
Could you please help identify this Hive issue?

> Hive Query got failure with larger scale data set with enabling sampling order
> optimization
> --
>
> Key: HIVE-12370
> URL: https://issues.apache.org/jira/browse/HIVE-12370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Yi Zhou
>
> Hive on MR fails with larger-scale data (e.g., 3TB/10TB) when the sampling
> optimization is enabled (the same query passes with a 1GB data set).
> hive.optimize.sampling.orderby=true
> hive.optimize.sampling.orderby.number=2
> hive.optimize.sampling.orderby.percent=0.1
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> ... 9 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask




[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997821#comment-14997821
 ] 

Hive QA commented on HIVE-11927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771456/HIVE-11927.07.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9778 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup4
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5977/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5977/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5977/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771456 - PreCommit-HIVE-TRUNK-Build

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, 
> HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, 
> HIVE-11927.06.patch, HIVE-11927.07.patch
>
>





[jira] [Commented] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997802#comment-14997802
 ] 

Ashutosh Chauhan commented on HIVE-12017:
-

I must say that I believe all above 6) issues are not introduced by this patch,
but rather exposed by it. My belief is they always occurred on CBO, so maybe
we can commit this patch (since it's not the one introducing these issues) while
we investigate these issues.

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).
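
A rough sketch of what "define easily other query patterns" could look like, purely as an illustration: each planner stage carries a predicate over a summary of the query, and only the stages whose predicate matches are run. Nothing here is Hive or Calcite API; CboProfileSketch, QueryShape, Stage and the stage names are all hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class CboProfileSketch {
    // Hypothetical summary of a query; not a Hive or Calcite type.
    static final class QueryShape {
        final int joinCount;
        final boolean hasStats;

        QueryShape(int joinCount, boolean hasStats) {
            this.joinCount = joinCount;
            this.hasStats = hasStats;
        }
    }

    // A planner stage plus the query pattern for which it is relevant.
    static final class Stage {
        final String name;
        final Predicate<QueryShape> relevant;

        Stage(String name, Predicate<QueryShape> relevant) {
            this.name = name;
            this.relevant = relevant;
        }
    }

    // Placeholder stage names; a new pattern is added by appending an entry.
    static final List<Stage> STAGES = Arrays.asList(
        new Stage("always-on rewrites", q -> true),
        new Stage("join reordering", q -> q.joinCount > 1),
        new Stage("stats-based costing", q -> q.hasStats));

    // Returns the names of the stages that apply to the given query shape.
    static List<String> enabledStages(QueryShape q) {
        List<String> out = new ArrayList<>();
        for (Stage s : STAGES) {
            if (s.relevant.test(q)) {
                out.add(s.name);
            }
        }
        return out;
    }
}
```

With this shape, a query with 0 or 1 joins skips only the join-specific stage instead of bypassing CBO altogether.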




[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.3.patch

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch
>
>
> Currently there is a gap between the time the locks are acquired and the
> time the first heartbeat is sent out. Normally the gap is negligible, but
> when it is large it causes the query to fail, since the locks have timed out
> by the time the heartbeat is sent.
> We need to remove this gap.




[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-11-09 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11927:
---
Attachment: HIVE-11927.07.patch

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, 
> HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, 
> HIVE-11927.06.patch, HIVE-11927.07.patch
>
>





[jira] [Updated] (HIVE-12330) Fix precommit Spark test part2

2015-11-09 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12330:
---
Attachment: (was: HIVE-12330.4-spark.patch)

> Fix precommit Spark test part2
> --
>
> Key: HIVE-12330
> URL: https://issues.apache.org/jira/browse/HIVE-12330
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Sergio Peña
> Attachments: HIVE-12229.3-spark.patch, HIVE-12330.4-spark.patch
>
>
> Regression because of HIVE-11489




[jira] [Commented] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-11-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997589#comment-14997589
 ] 

Ashutosh Chauhan commented on HIVE-12017:
-

Code changes look good. What's the reason for writing and using
HiveRelOptUtil::createProject() instead of Calcite's RelOptUtil version? It
would be good to add the reason as a comment in the code.
Also, as a side note, we will want to add a profile that runs all rules
that don't need stats. E.g., even if there are 3 joins but no stats, we
will not apply transitive inference rules for ppd for joins, because currently
CBO throws an exception when stats are not found. We should add such a profile
in a follow-up.
I am going through plan changes (slowly, slowly : ))

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).




[jira] [Updated] (HIVE-12330) Fix precommit Spark test part2

2015-11-09 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12330:
---
Attachment: HIVE-12330.4-spark.patch

> Fix precommit Spark test part2
> --
>
> Key: HIVE-12330
> URL: https://issues.apache.org/jira/browse/HIVE-12330
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Sergio Peña
> Attachments: HIVE-12229.3-spark.patch, HIVE-12330.4-spark.patch
>
>
> Regression because of HIVE-11489




[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996260#comment-14996260
 ] 

Hive QA commented on HIVE-12367:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771293/HIVE-12367.001.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9748 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-dynamic_partition_pruning.q-update_all_partitioned.q-vectorized_rcfile_columnar.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_left_outer_join2.q-vector_outer_join5.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dbtxnmgr_nodblock
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dbtxnmgr_nodbunlock
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg_query_tbl_in_locked_db
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg_try_db_lock_conflict
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg_try_drop_locked_db
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg_try_lock_db_in_use
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5968/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5968/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5968/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771293 - PreCommit-HIVE-TRUNK-Build

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch
>
>





[jira] [Updated] (HIVE-12369) Faster Vector GroupBy

2015-11-09 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12369:

Attachment: HIVE-12369.01.patch

> Faster Vector GroupBy
> -
>
> Key: HIVE-12369
> URL: https://issues.apache.org/jira/browse/HIVE-12369
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12369.01.patch
>
>
> Implement fast Vector GroupBy using fast hash table technology developed for 
> Native Vector MapJoin and vector key handling developed for recent HIVE-12290 
> Native Vector ReduceSink JIRA.
> (Patch also includes making Native Vector MapJoin use Hybrid Grace -- but 
> that can be separated out)




[jira] [Updated] (HIVE-12373) Interner should return identical map or list

2015-11-09 Thread Navis (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12373:
-
Attachment: HIVE-12373.1.patch.txt

> Interner should return identical map or list
> 
>
> Key: HIVE-12373
> URL: https://issues.apache.org/jira/browse/HIVE-12373
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-12373.1.patch.txt
>
>
> Currently, HiveStringUtils.intern(map/list) returns a new instance of the map or 
> list. This breaks code that keeps using the original reference after passing 
> it in, as in the following Spark code from HiveMetastoreCatalog:
> {code}
> val serdeParameters = new java.util.HashMap[String, String]()
> serdeInfo.setParameters(serdeParameters)
> // these properties will be gone
> table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> {code}
> Luckily for Spark, the interner was, by mistake, not applied in the released 
> versions of Hive (1.2.0, 1.2.1), but it will cause problems eventually.
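
The difference between the two behaviors described above can be sketched in plain Java. This is only an illustration under stated assumptions: `InternSketch`, `Holder`, `internToNewMap`, and `internInPlace` are hypothetical names and are not taken from Hive or from the attached patch. It shows why a setter that interns into a new map loses entries added later through the caller's reference, while one that returns the identical instance does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the interning helper discussed above; not Hive's code.
final class InternSketch {
    private static String intern(String s) {
        return s == null ? null : s.intern();
    }

    // Variant A: builds and returns a NEW map, so the caller's map is detached.
    static Map<String, String> internToNewMap(Map<String, String> in) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : in.entrySet()) {
            out.put(intern(e.getKey()), intern(e.getValue()));
        }
        return out;
    }

    // Variant B: interns the entries and returns the SAME instance.
    static Map<String, String> internInPlace(Map<String, String> in) {
        Map<String, String> copy = new HashMap<>(in);
        in.clear();
        for (Map.Entry<String, String> e : copy.entrySet()) {
            in.put(intern(e.getKey()), intern(e.getValue()));
        }
        return in;
    }

    // Mimics a setter that interns its argument before storing it.
    static final class Holder {
        private final boolean inPlace;
        private Map<String, String> params;

        Holder(boolean inPlace) {
            this.inPlace = inPlace;
        }

        void setParameters(Map<String, String> p) {
            params = inPlace ? internInPlace(p) : internToNewMap(p);
        }

        Map<String, String> getParameters() {
            return params;
        }
    }

    public static void main(String[] args) {
        Map<String, String> a = new HashMap<>();
        Holder lossy = new Holder(false);
        lossy.setParameters(a);
        a.put("k", "v"); // added after the setter call, as in the Spark snippet
        System.out.println(lossy.getParameters().containsKey("k")); // false

        Map<String, String> b = new HashMap<>();
        Holder safe = new Holder(true);
        safe.setParameters(b);
        b.put("k", "v");
        System.out.println(safe.getParameters().containsKey("k")); // true
    }
}
```

Variant B trades a temporary copy for preserved identity, which is the property the issue title asks for.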




[jira] [Commented] (HIVE-12364) Distcp job fails when run under Tez

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997717#comment-14997717
 ] 

Hive QA commented on HIVE-12364:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771436/HIVE-12364.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9779 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5975/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5975/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5975/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771436 - PreCommit-HIVE-TRUNK-Build

> Distcp job fails when run under Tez
> ---
>
> Key: HIVE-12364
> URL: https://issues.apache.org/jira/browse/HIVE-12364
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-12364-branch-1.patch, HIVE-12364.patch
>
>
> PROBLEM:
> "insert into/overwrite directory '/path'" invokes distcp for the moveTask, and 
> the query fails when the execution engine is Tez.
> set hive.exec.copyfile.maxsize=4;
> insert overwrite into '/tmp/testinser' select * from customer;
> failed at moveTask
> hive client log:
> {code}
> 2015-11-05 16:02:53,254 INFO  [main]: exec.FileSinkOperator 
> (Utilities.java:mvFileToFinalPath(1882)) - Moving tmp dir: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/_tmp.-ext-1
>  to: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
> 2015-11-05 16:02:53,611 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) -  method=task.DEPENDENCY_COLLECTION.Stage-2 
> from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task 
> [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task [Stage-0:MOVE] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: exec.Task 
> (SessionState.java:printInfo(951)) - Moving data to: /tmp/testindir from 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-1
> 2015-11-05 16:02:53,637 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(551)) - Source is 491763261 bytes. (MAX: 4)
> 2015-11-05 16:02:53,638 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
> 2015-11-05 16:03:03,924 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:04,081 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:20,210 INFO  [main]: hdfs.DFSClient 
> (DFSClient.java:getDelegationToken(1047)) - Created HDFS_DELEGATION_TOKEN 
> token 1069 for haha on ha-hdfs:hdpsecehdfs
> 2015-11-05 16:03:20,249 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: HDFS_DELEGATION_TOKEN, Service: 
> ha-hdfs:hdpsecehdfs, Ident: (HDFS_DELEGATION_TOKEN token 1069 for haha)
> 2015-11-05 16:03:20,250 WARN  [main]: token.Token 
> (Token.java:getClassForIdentifier(121)) - Cannot find class for token kind 
> kms-dt
> 2015-11-05 16:03:20,250 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: kms-dt, Service: 172.25.17.102:9292, Ident: 00 04 
> 68 61 68 61 02 72 6d 00 8a 01 50 da 1a ca 29 8a 01 50 fe 27 4e 29 03 02
>

[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.4.patch

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch
>
>
> Currently there is a gap between the time the locks are acquired and the time 
> the first heartbeat is sent. Normally the gap is negligible, but when it is 
> large the query fails, because the locks have timed out by the time the 
> heartbeat is sent.
> This gap needs to be removed.
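
One way to close such a gap, sketched here under stated assumptions, is to send the first heartbeat synchronously right after lock acquisition and only then start the periodic schedule. `HeartbeatSketch` and its methods are hypothetical names for illustration; this is not the approach confirmed by the attached patches, and `sendHeartbeat` merely counts calls instead of contacting a metastore.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; names are not taken from Hive.
final class HeartbeatSketch {
    private final ScheduledExecutorService pool =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger sent = new AtomicInteger();

    // Placeholder for the real heartbeat call; here it only counts invocations.
    private void sendHeartbeat() {
        sent.incrementAndGet();
    }

    // Call immediately after the locks are acquired: the first heartbeat goes
    // out on the calling thread, then the periodic schedule takes over.
    void startAfterLockAcquired(long intervalMillis) {
        sendHeartbeat();
        pool.scheduleAtFixedRate(this::sendHeartbeat,
                intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    int heartbeatsSent() {
        return sent.get();
    }

    void stop() {
        pool.shutdownNow();
    }
}
```

A caller would invoke `startAfterLockAcquired(interval)` as the statement directly after the lock request returns, and `stop()` once the query finishes.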




[jira] [Updated] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-09 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12301:
---
Attachment: HIVE-12301.03.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch, HIVE-12301.02.patch, 
> HIVE-12301.03.patch
>
>
> The position in argList is mapped to a wrong column from RS operator




[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997718#comment-14997718
 ] 

Hive QA commented on HIVE-12366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771448/HIVE-12366.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5976/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5976/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5976/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5976/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 08e9d26 HIVE-12312 : Excessive logging in PPD code (Carter 
Shanklin via Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/insert_dir_distcp.q
Removing ql/src/test/results/clientpositive/insert_dir_distcp.q.out
Removing ql/src/test/results/clientpositive/tez/insert_dir_distcp.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 08e9d26 HIVE-12312 : Excessive logging in PPD code (Carter 
Shanklin via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
patch:  malformed patch at line 120: diff --git 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java

patch:  malformed patch at line 120: diff --git 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java

patch:  malformed patch at line 120: diff --git 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 
ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java

The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771448 - PreCommit-HIVE-TRUNK-Build

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch
>
>
> Currently there is a gap between the time the locks are acquired and the time 
> the first heartbeat is sent. Normally the gap is negligible, but when it is 
> large the query fails, because the locks have timed out by the time the 
> heartbeat is sent.
> This gap needs to be removed.




[jira] [Commented] (HIVE-10328) Enable new return path for cbo

2015-11-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997545#comment-14997545
 ] 

Hive QA commented on HIVE-10328:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771430/HIVE-10328.13.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 705 failed/errored test(s), 9780 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_count
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_ddl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_resolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_test_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_macro_duplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_random
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_basic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_selectDistinctStar
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_min
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_6_subq
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_custom_udf_configure
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_reduce1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_reduce2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_reduce3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_dboutput

[jira] [Updated] (HIVE-12080) Support auto type widening (int->bigint & float->double) for Parquet table

2015-11-09 Thread Mohammad Kamrul Islam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated HIVE-12080:
-
Attachment: HIVE-12080.6.patch


Uploading the patch for Jenkins build which was reviewed and +1-ed in RB.


> Support auto type widening (int->bigint & float->double) for Parquet table
> --
>
> Key: HIVE-12080
> URL: https://issues.apache.org/jira/browse/HIVE-12080
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: HIVE-12080.1.patch, HIVE-12080.2.patch, 
> HIVE-12080.3.patch, HIVE-12080.6.patch
>
>
> Currently Hive+Parquet doesn't support auto type widening. It should include at 
> least the basic type promotions (short->int->bigint, float->double, etc.) that 
> are already supported for other file formats.
> There was a similar effort (HIVE-6784), but it was not committed. This JIRA 
> addresses the same need in a different way, with little (or no) performance impact.
>  
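
The promotions named in the description can be expressed as a small compatibility check. The sketch below is a hypothetical illustration only: `WideningSketch` and `canWiden` are invented names, the type labels are copied from the description rather than from Parquet or Hive type systems, and it covers only the two chains listed (the "etc." in the description is deliberately not filled in).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the promotions listed in the description.
final class WideningSketch {
    // Each chain is ordered from narrowest to widest.
    private static final List<String> INTEGRAL = Arrays.asList("short", "int", "bigint");
    private static final List<String> FLOATING = Arrays.asList("float", "double");

    // True when data written as fileType can be read as tableType,
    // i.e. the types are equal or tableType is wider within the same chain.
    static boolean canWiden(String fileType, String tableType) {
        return widensWithin(INTEGRAL, fileType, tableType)
                || widensWithin(FLOATING, fileType, tableType);
    }

    private static boolean widensWithin(List<String> chain, String from, String to) {
        int i = chain.indexOf(from);
        int j = chain.indexOf(to);
        return i >= 0 && j >= i;
    }
}
```

For instance, `canWiden("int", "bigint")` holds while `canWiden("bigint", "int")` does not, since narrowing could lose data.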




[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-11-09 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997738#comment-14997738
 ] 

Owen O'Malley commented on HIVE-11981:
--

You should create a wiki page with the supported schema evolution.

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, 
> HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, 
> HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, 
> HIVE-11981.091.patch, HIVE-11981.092.patch, HIVE-11981.093.patch, 
> HIVE-11981.094.patch, HIVE-11981.095.patch, HIVE-11981.096.patch, 
> HIVE-11981.097.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of schema 
> evolution were not pursued due to lack of importance and lack of time.  Also, it 
> appears that much more sophisticated metadata would be needed to support them.
> The biggest issues for users have been adding new columns for ACID table 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).




[jira] [Commented] (HIVE-12045) ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)

2015-11-09 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996217#comment-14996217
 ] 

Rui Li commented on HIVE-12045:
---

Thanks for the patch, Xuefu. Our Spark on YARN test runs with yarn-client. Do 
you think we need to change it to yarn-cluster?

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> --
>
> Key: HIVE-12045
> URL: https://issues.apache.org/jira/browse/HIVE-12045
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
> Environment: Cloudera QuickStart VM - CDH5.4.2
> beeline
>Reporter: Zsolt Tóth
>Assignee: Rui Li
> Attachments: HIVE-12045.1-spark.patch, example.jar, genUDF.patch
>
>
> If I execute the following query in beeline, I get ClassNotFoundException for 
> the UDF class.
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 
> 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf just looks for the 1st argument's value in the 
> others and returns the index. I don't think this is related to the actual 
> GenericUDF function.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" succeeds
> If I use the non-generic implementation of the same UDF, the select distinct 
> call succeeds.
> StackTrace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: 
> hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
>

[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-11-09 Thread Bing Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996186#comment-14996186
 ] 

Bing Li commented on HIVE-10982:


Hi, [~alangates]
Thank you for your comment. 
Yes, I still want to be able to set this property via the connection URL.
I will rebase the patch soon.

Thank you.

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-10982.1.patch
>
>
> The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, 
> which can be a performance bottleneck.
> Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> It is also discussed in:
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E
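
Setting the fetch size through the connection URL, as requested in the comment above, could look roughly like the following. This is a sketch under assumptions: the parameter name `fetchSize` and the class `FetchSizeSketch` are hypothetical and not confirmed by the patch; it assumes the value travels as a `;key=value` session variable placed before any `?` or `#` section of a `jdbc:hive2://` URL, and it falls back to the 50 mentioned in the description.

```java
// Hypothetical illustration; "fetchSize" is an assumed parameter name.
final class FetchSizeSketch {
    // The value the description says is currently hard-coded.
    static final int DEFAULT_FETCH_SIZE = 50;

    static int fetchSizeFromUrl(String url) {
        // Session variables are assumed to precede the '?' and '#' sections.
        String head = url.split("[?#]", 2)[0];
        String[] parts = head.split(";");
        // parts[0] is the scheme/host/database prefix; the rest are key=value pairs.
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals("fetchSize")) {
                try {
                    int n = Integer.parseInt(kv[1].trim());
                    if (n > 0) {
                        return n;
                    }
                } catch (NumberFormatException ignored) {
                    // Malformed value: fall through to the default.
                }
            }
        }
        return DEFAULT_FETCH_SIZE;
    }
}
```

The parsed value would then be passed to the standard `java.sql.Statement.setFetchSize(int)` call when the driver creates a statement.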




[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-11-09 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997767#comment-14997767
 ] 

Owen O'Malley commented on HIVE-11981:
--

Ok, we need to do some work to avoid tying ORC back into Hive. In particular, 
we should add to OrcFile.ReaderOptions a method to set the desired schema. 
It should look like:

{code}
  /**
   * Define the schema that the reader should read as.
   */
  public ReaderOptions schema(TypeDescription schema);

  /**
   * The accessor for the schema to read as.
   */
  TypeDescription getSchema();
{code}

The OrcInputFormat should use the SchemaEvolution code to figure out whether to 
call ReaderOptions.schema on the underlying Reader. (You could also put this 
into Reader.Options, which is the options object used to create RecordReaders.) 
The critical piece is that OrcFile and the parts under it can't depend on 
anything from serde or io.
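
The setter/accessor pair proposed above follows the usual fluent-options pattern, which can be sketched in isolation. Everything here is a hypothetical stand-in: `ReaderOptionsSketch`, `Options`, and `SchemaStub` are invented for illustration and do not reproduce the real `OrcFile.ReaderOptions` or `TypeDescription` classes. Treating a `null` schema as "read with the file's own schema" is likewise an assumption of this sketch.

```java
// Hypothetical illustration of a fluent options object with a schema setter.
final class ReaderOptionsSketch {
    // Stand-in for the schema type; carries only a textual description.
    static final class SchemaStub {
        final String text;

        SchemaStub(String text) {
            this.text = text;
        }
    }

    static final class Options {
        // Assumption: null means "read with the file's own schema".
        private SchemaStub schema;

        // Define the schema that the reader should read as.
        Options schema(SchemaStub schema) {
            this.schema = schema;
            return this; // returning this allows chained configuration
        }

        // The accessor for the schema to read as.
        SchemaStub getSchema() {
            return schema;
        }
    }
}
```

Keeping the options object free of any serde or io types is what lets the caller, rather than the reader, decide whether schema evolution applies.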



> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, 
> HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, 
> HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, 
> HIVE-11981.091.patch, HIVE-11981.092.patch, HIVE-11981.093.patch, 
> HIVE-11981.094.patch, HIVE-11981.095.patch, HIVE-11981.096.patch, 
> HIVE-11981.097.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of schema 
> evolution were not pursued due to lack of importance and lack of time.  Also, it 
> appears that much more sophisticated metadata would be needed to support them.
> The biggest issues for users have been adding new columns for ACID table 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).



