[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Attachment: HIVE-14367.4.patch

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch, HIVE-14367.4.patch
>
>
> since type is incorrectly assumed as void.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Patch Available  (was: Open)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch, HIVE-14367.4.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Open  (was: Patch Available)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14363) bucketmap inner join query fails due to NullPointerException in some cases

2016-07-30 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14363:
-
   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

> bucketmap inner join query fails due to NullPointerException in some cases
> --
>
> Key: HIVE-14363
> URL: https://issues.apache.org/jira/browse/HIVE-14363
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14363.1.patch, HIVE-14363.final.patch
>
>
> A bucketmap inner join query between bucketed tables throws the following 
> exception when all buckets of one table are empty while all buckets of the 
> other table are non-empty.
> {noformat}
> Vertex failed, vertexName=Map 2, vertexId=vertex_1466710232033_0432_4_01, 
> diagnostics=[Task failed, taskId=task_1466710232033_0432_4_01_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1466710232033_0432_4_01_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:330)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getKeyValueReader(MapRecordProcessor.java:372)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.initializeMapRecordSources(MapRecordProcessor.java:344)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:292)
>   ... 15 more
> ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1466710232033_0432_4_01_00_1:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> 

[jira] [Updated] (HIVE-14363) bucketmap inner join query fails due to NullPointerException in some cases

2016-07-30 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14363:
-
Attachment: HIVE-14363.final.patch

Adding comments. The issue was reproduced only on a cluster, so I haven't added 
a q file test case. I am not rerunning the unit tests just for added comments.

> bucketmap inner join query fails due to NullPointerException in some cases
> --
>
> Key: HIVE-14363
> URL: https://issues.apache.org/jira/browse/HIVE-14363
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14363.1.patch, HIVE-14363.final.patch
>
>
> A bucketmap inner join query between bucketed tables throws the following 
> exception when all buckets of one table are empty while all buckets of the 
> other table are non-empty.
> {noformat}
> Vertex failed, vertexName=Map 2, vertexId=vertex_1466710232033_0432_4_01, 
> diagnostics=[Task failed, taskId=task_1466710232033_0432_4_01_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1466710232033_0432_4_01_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:330)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getKeyValueReader(MapRecordProcessor.java:372)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.initializeMapRecordSources(MapRecordProcessor.java:344)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:292)
>   ... 15 more
> ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1466710232033_0432_4_01_00_1:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> 

[jira] [Commented] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400946#comment-15400946
 ] 

Hive QA commented on HIVE-14378:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820841/HIVE-14378.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 61 failed/errored test(s), 10417 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constant_prop_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_noalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_onview
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_stack
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin_3way
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge_incompat2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_selectDistinctStar
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_varchar_simple
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_13
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union9
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns

[jira] [Updated] (HIVE-14355) Schema evolution for ORC in llap is broken for int to string conversion

2016-07-30 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14355:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Will commit to branch-2.1 after resolving some conflicts.

> Schema evolution for ORC in llap is broken for int to string conversion
> ---
>
> Key: HIVE-14355
> URL: https://issues.apache.org/jira/browse/HIVE-14355
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14355-java-only.patch, HIVE-14355.1.patch, 
> HIVE-14355.2.java-only.patch, HIVE-14355.2.patch, 
> HIVE-14355.3.java-only.patch, HIVE-14355.3.patch
>
>
> When the schema is evolved from any integer type to string, the following 
> exceptions are thrown in LLAP (it works fine in Tez). I guess this happens 
> for other conversions as well.
> {code}
> hive> create table orc_integer(b bigint) stored as orc;
> hive> insert into orc_integer values(100);
> hive> select count(*) from orc_integer where b=100;
> OK
> 1
> hive> alter table orc_integer change column b b string;
> hive> select count(*) from orc_integer where b=100;
> // FAIL with following exception
> {code}
> {code:title=When vectorization is enabled}
> 2016-07-27T01:48:05,611  INFO [TezTaskRunner ()] 
> vector.VectorReduceSinkOperator: RECORDS_OUT_INTERMEDIATE_Map_1:0,
> 2016-07-27T01:48:05,611 ERROR [TezTaskRunner ()] tez.TezProcessor: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:393)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:866)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> ... 18 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterStringGroupColEqualStringGroupScalarBase.evaluate(FilterStringGroupColEqualStringGroupScalarBase.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:110)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:774)
> ... 19 more
> {code}
> {code:title=When vectorization is disabled}
> 2016-07-27T01:52:43,328  INFO [TezTaskRunner 
> 

[jira] [Updated] (HIVE-14335) TaskDisplay's return value is not getting deserialized properly

2016-07-30 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-14335:
---
Fix Version/s: 2.1.1

Committed to branch-2.1 as well.

> TaskDisplay's return value is not getting deserialized properly
> ---
>
> Key: HIVE-14335
> URL: https://issues.apache.org/jira/browse/HIVE-14335
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14335.01.patch
>
>






[jira] [Commented] (HIVE-14386) UGI clone shim also needs to clone credentials

2016-07-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400914#comment-15400914
 ] 

Siddharth Seth commented on HIVE-14386:
---

+1.

> UGI clone shim also needs to clone credentials
> --
>
> Key: HIVE-14386
> URL: https://issues.apache.org/jira/browse/HIVE-14386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14386.patch
>
>
> Discovered while testing HADOOP-13081





[jira] [Commented] (HIVE-14380) Queries on tables with remote HDFS paths fail in "encryption" checks.

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400909#comment-15400909
 ] 

Hive QA commented on HIVE-14380:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820842/HIVE-14380.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10418 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/702/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/702/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-702/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820842 - PreCommit-HIVE-MASTER-Build

> Queries on tables with remote HDFS paths fail in "encryption" checks.
> -
>
> Key: HIVE-14380
> URL: https://issues.apache.org/jira/browse/HIVE-14380
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-14380.1.patch
>
>
> If a table has table/partition locations set to remote HDFS paths, querying 
> them will cause the following IllegalArgumentException:
> {noformat}
> 2016-07-26 01:16:27,471 ERROR parse.CalcitePlanner 
> (SemanticAnalyzer.java:getMetaData(1867)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if 
> hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table is encrypted: 
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table, expected: 
> hdfs://bar.ygrid.yahoo.com:8020
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2204)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStrongestEncryptedTablePath(SemanticAnalyzer.java:2274)
> ...
> {noformat}
> This is because of the following code in {{SessionState}}:
> {code:title=SessionState.java|borderStyle=solid}
>  public HadoopShims.HdfsEncryptionShim getHdfsEncryptionShim() throws 
> HiveException {
> if (hdfsEncryptionShim == null) {
>   try {
> FileSystem fs = FileSystem.get(sessionConf);
> if ("hdfs".equals(fs.getUri().getScheme())) {
>   hdfsEncryptionShim = 
> ShimLoader.getHadoopShims().createHdfsEncryptionShim(fs, sessionConf);
> } else {
>   LOG.debug("Could not get hdfsEncryptionShim, it is only applicable 
> to hdfs filesystem.");
> }
>   } catch (Exception e) {
> throw new HiveException(e);
>   }
> }
> return hdfsEncryptionShim;
>   }
> {code}
> When the {{FileSystem}} instance is created, using the {{sessionConf}} 
> implies that the current HDFS is going to be used. This call should instead 
> fetch the {{FileSystem}} instance corresponding to the path being checked.
> A fix is forthcoming...
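
The description above argues that the encryption shim should be tied to the filesystem that owns the path being checked, not to the session's default filesystem. A minimal, dependency-free sketch of that idea follows. It is not the HIVE-14380 patch: `Shim`, `ShimCache`, and `shimFor` are hypothetical names, and a plain `java.net.URI` stands in for Hadoop's `Path` and `FileSystem`. In real Hadoop code the per-path filesystem would typically be obtained with `Path.getFileSystem(Configuration)`.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an HDFS encryption shim bound to one filesystem.
final class Shim {
    final URI fsUri;

    Shim(URI fsUri) {
        this.fsUri = fsUri;
    }
}

// Hypothetical cache: one shim per filesystem (scheme + authority), looked up
// from the path being checked instead of from the session default.
final class ShimCache {
    private final Map<URI, Shim> shims = new HashMap<>();

    // Returns null for non-HDFS paths, mirroring the
    // "only applicable to hdfs filesystem" branch in the quoted code.
    Shim shimFor(URI path) {
        if (!"hdfs".equals(path.getScheme())) {
            return null;
        }
        if (path.getAuthority() == null) {
            // Assumption: a real implementation would fall back to the
            // default filesystem here; the sketch simply rejects such paths.
            throw new IllegalArgumentException("No authority in " + path);
        }
        URI fsUri = URI.create(path.getScheme() + "://" + path.getAuthority());
        return shims.computeIfAbsent(fsUri, Shim::new);
    }

    public static void main(String[] args) {
        ShimCache cache = new ShimCache();
        Shim remote = cache.shimFor(
                URI.create("hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table"));
        // The shim is keyed by the remote namenode, not the local one.
        System.out.println(remote.fsUri);
    }
}
```

Keying the cache by scheme and authority means paths on `foo` and `bar` never share a shim, which is the situation that produced the "Wrong FS" error in the log above.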





[jira] [Updated] (HIVE-14392) llap daemons should try using YARN local dirs, if available

2016-07-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14392:
--
Status: Patch Available  (was: Open)

> llap daemons should try using YARN local dirs, if available
> ---
>
> Key: HIVE-14392
> URL: https://issues.apache.org/jira/browse/HIVE-14392
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14392.01.patch
>
>
> LLAP required hive.llap.daemon.work.dirs to be specified. When running as a 
> YARN app, this can use the local dirs for the container, removing the 
> requirement to set up this parameter (for secure and non-secure clusters).





[jira] [Updated] (HIVE-14392) llap daemons should try using YARN local dirs, if available

2016-07-30 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14392:
--
Attachment: HIVE-14392.01.patch

The ordering is as follows.
1. work.dirs specified - they will be used (except for a specific string value).
2. work.dirs not specified, or specific string value set - try using work dirs 
from YARN container env.
3. Fail

Using the work dirs from the YARN container env gets rid of the problems with 
having to set up explicit directories for secure clusters. YARN will take care 
of setting up the base dirs for the llap app, which will operate within this 
dir where it has access.
Containers running for apps which actually run the query (mode=map instead of 
mode=ALL) access the data via LLAP shuffle, which knows how to deal with 
these dirs (no one outside of LLAP is accessing these dirs directly).
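
The resolution order listed above can be sketched as follows. This is an illustration, not the patch itself: `WorkDirResolver`, `resolve`, and the sentinel string `useYarnEnvDirs` are hypothetical (the comment only says "a specific string value"), and the YARN-provided directories are passed in as a plain comma-separated string rather than read from the container environment.

```java
import java.util.Arrays;
import java.util.List;

final class WorkDirResolver {
    // Hypothetical sentinel; the actual string value is not given in the comment.
    static final String USE_YARN_DIRS = "useYarnEnvDirs";

    static List<String> resolve(String configuredDirs, String yarnLocalDirs) {
        boolean explicit = configuredDirs != null
                && !configuredDirs.trim().isEmpty()
                && !USE_YARN_DIRS.equals(configuredDirs.trim());
        if (explicit) {
            // 1. work.dirs specified: use them.
            return split(configuredDirs);
        }
        if (yarnLocalDirs != null && !yarnLocalDirs.trim().isEmpty()) {
            // 2. Not specified, or sentinel set: use the container's dirs.
            return split(yarnLocalDirs);
        }
        // 3. Nothing usable: fail.
        throw new IllegalStateException(
                "No work dirs configured and no YARN local dirs available");
    }

    private static List<String> split(String dirs) {
        return Arrays.asList(dirs.trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        // Sample values only; the paths are made up for the demonstration.
        System.out.println(resolve(USE_YARN_DIRS, "/grid/0/yarn/local,/grid/1/yarn/local"));
    }
}
```

Inside a YARN container the second argument would presumably come from the container environment (commonly the `LOCAL_DIRS` variable); that lookup is left out so the sketch runs anywhere.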

[~sershe], [~gopalv] - please review.

> llap daemons should try using YARN local dirs, if available
> ---
>
> Key: HIVE-14392
> URL: https://issues.apache.org/jira/browse/HIVE-14392
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14392.01.patch
>
>
> LLAP required hive.llap.daemon.work.dirs to be specified. When running as a 
> YARN app, this can use the local dirs for the container, removing the 
> requirement to set up this parameter (for secure and non-secure clusters).





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Patch Available  (was: Open)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Open  (was: Patch Available)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Attachment: HIVE-14367.3.patch

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch, HIVE-14367.3.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Commented] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())

2016-07-30 Thread Carter Shanklin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400892#comment-15400892
 ] 

Carter Shanklin commented on HIVE-11600:


It would be good to get this documented without waiting for the followup work. 
I spent 10 minutes or so figuring this out because of HIVE-14393

> Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
> 
>
> Key: HIVE-11600
> URL: https://issues.apache.org/jira/browse/HIVE-11600
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-11600.01.patch, HIVE-11600.02.patch, 
> HIVE-11600.03.patch, HIVE-11600.04.patch, HIVE-11600.05.patch
>
>
> Currently, Hive supports only a single column in an IN clause, e.g., 
> {code}select * from src where col0 in (v1,v2,v3);{code}
> We want it to support multiple columns, e.g., 
> {code}select * from src where (col0,col1+3) in 
> ((col0+v1,v2),(v3,v4-col1));{code}
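
A minimal sketch of the row-value semantics requested above: the left-hand tuple matches if it equals any candidate tuple element by element. The class and method names are hypothetical, and SQL NULL handling (three-valued logic) is deliberately ignored here; this is not Hive's parser or evaluator code.

```java
import java.util.List;

// Hypothetical illustration of "(a, b) IN ((x1, y1), (x2, y2))".
// Assumption: plain Java equality stands in for SQL equality; NULLs are not modeled.
final class TupleInSketch {
    /** Returns true if lhs equals at least one candidate tuple, position by position. */
    static boolean in(List<?> lhs, List<? extends List<?>> candidates) {
        for (List<?> candidate : candidates) {
            // List.equals compares sizes and then each element pairwise.
            if (lhs.equals(candidate)) {
                return true;
            }
        }
        return false;
    }
}
```

In other words, the multi-column form behaves like an OR over per-tuple conjunctions of equality tests.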





[jira] [Commented] (HIVE-14377) LLAP IO: issue with how estimate cache removes unneeded buffers

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400876#comment-15400876
 ] 

Hive QA commented on HIVE-14377:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820848/HIVE-14377.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 10419 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_insert_overwrite_local_directory_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_cast
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_count
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_if_expr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/701/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/701/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-701/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820848 - PreCommit-HIVE-MASTER-Build

> LLAP IO: issue with how estimate cache removes unneeded buffers
> ---
>
> Key: HIVE-14377
> URL: https://issues.apache.org/jira/browse/HIVE-14377
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14377.01.patch, HIVE-14377.patch
>
>






[jira] [Updated] (HIVE-14346) Change the default value for hive.mapred.mode to null

2016-07-30 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-14346:

Attachment: HIVE-14346.2.patch

> Change the default value for hive.mapred.mode to null
> -
>
> Key: HIVE-14346
> URL: https://issues.apache.org/jira/browse/HIVE-14346
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, 
> HIVE-14346.2.patch
>
>
> HIVE-12727 introduces three new configurations to replace the existing 
> {{hive.mapred.mode}}, which is deprecated. However, the default value for the 
> latter is 'nonstrict', which prevents the new configurations from being used 
> (see comments in that JIRA for more details).
> This proposes to change the default value for {{hive.mapred.mode}} to null. 
> Users can then set the three new configurations to get more fine-grained 
> control over the strict checking. If users want to use the old configuration, 
> they can set {{hive.mapred.mode}} to strict/nonstrict.





[jira] [Updated] (HIVE-14370) printStackTrace() called in Operator.close()

2016-07-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-14370:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~DavidKaroly] for your contribution. 
I applied this to 2.2

> printStackTrace() called in Operator.close()
> 
>
> Key: HIVE-14370
> URL: https://issues.apache.org/jira/browse/HIVE-14370
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Karoly
>Assignee: David Karoly
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14370.1.patch
>
>
> Operator.close() calls printStackTrace() if something goes wrong, making the 
> stack trace go to stderr instead of logs and losing the timestamp. We should 
> use LOG.warn instead.
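
A minimal sketch of the change described above: route the exception through a logger at warning level instead of calling printStackTrace(). It uses java.util.logging only to stay self-contained; Hive's own logging facade, the class name, and the helper method here are assumptions for illustration, not the contents of the attached patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; java.util.logging is used only so the sketch is self-contained.
final class CloseLoggingSketch {
    private static final Logger LOG = Logger.getLogger(CloseLoggingSketch.class.getName());

    /** Closes the resource; on failure, logs the exception and returns false. */
    static boolean closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            // Passing the exception to the logger keeps the stack trace in the
            // log output with a timestamp, instead of writing it to raw stderr.
            LOG.log(Level.WARNING, "Error while closing resource", e);
            return false;
        }
    }
}
```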





[jira] [Commented] (HIVE-14375) hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} that uses 2.8.1

2016-07-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400861#comment-15400861
 ] 

Sergio Peña commented on HIVE-14375:


[~mohitsabharwal] could you help me review this?

> hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} 
> that uses 2.8.1
> --
>
> Key: HIVE-14375
> URL: https://issues.apache.org/jira/browse/HIVE-14375
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14375.1.patch
>
>
> HIVE-14149 changed the joda-time dependency to version 2.8.1. However, the 
> hcatalog-pig-adapter has 2.2 hardcoded.





[jira] [Commented] (HIVE-14375) hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} that uses 2.8.1

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400840#comment-15400840
 ] 

Hive QA commented on HIVE-14375:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820788/HIVE-14375.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10386 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/700/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/700/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-700/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820788 - PreCommit-HIVE-MASTER-Build

> hcatalog-pig-adaptor pom.xml uses joda-time 2.2 instead of ${joda.version} 
> that uses 2.8.1
> --
>
> Key: HIVE-14375
> URL: https://issues.apache.org/jira/browse/HIVE-14375
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14375.1.patch
>
>
> HIVE-14149 changed the joda-time dependency to version 2.8.1. However, the 
> hcatalog-pig-adapter has 2.2 hardcoded.





[jira] [Commented] (HIVE-5483) use metastore statistics to optimize max/min/etc. queries

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400813#comment-15400813
 ] 

Dudu Markovitz commented on HIVE-5483:
--

Hi guys, perhaps I'm missing something, but do I have any guarantee of the 
correctness of the metadata when someone can simply delete or replace files 
directly in the file system without going through the metastore? 

> use metastore statistics to optimize max/min/etc. queries
> -
>
> Key: HIVE-5483
> URL: https://issues.apache.org/jira/browse/HIVE-5483
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
> Fix For: 0.13.0
>
> Attachments: HIVE-5483.2.patch, HIVE-5483.3.patch, HIVE-5483.patch
>
>
> We have discussed this a little bit.
> Hive can answer queries such as select max(c1) from t purely from metastore 
> using partition statistics, provided that we know the statistics are up to 
> date.
> All data changes (e.g. adding new partitions) currently go through the metastore, so 
> we can track up-to-date-ness. If the statistics are not up to date, the queries will 
> have to read data (at least for outdated partitions) until someone runs 
> analyze table. We can also analyze new partitions after add, if that is 
> configured/specified in the command.
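
A minimal sketch of the idea described above, assuming a per-partition record that carries a column maximum and an up-to-date flag. The types and names are hypothetical and do not correspond to metastore classes. For simplicity the sketch gives up on the statistics-only answer as soon as any partition is stale, whereas the description allows reading data only for the outdated partitions.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical model of per-partition column statistics.
final class StatsMaxSketch {
    static final class PartitionStats {
        final long maxC1;        // recorded max(c1) for this partition
        final boolean upToDate;  // whether the statistics are known to be current

        PartitionStats(long maxC1, boolean upToDate) {
            this.maxC1 = maxC1;
            this.upToDate = upToDate;
        }
    }

    /**
     * Returns max(c1) from statistics alone, or empty if the answer cannot be
     * trusted (no partitions, or at least one partition has stale statistics),
     * in which case the caller would have to read the data.
     */
    static OptionalLong maxFromStats(List<PartitionStats> partitions) {
        if (partitions.isEmpty()) {
            return OptionalLong.empty();
        }
        long max = Long.MIN_VALUE;
        for (PartitionStats p : partitions) {
            if (!p.upToDate) {
                return OptionalLong.empty();
            }
            max = Math.max(max, p.maxC1);
        }
        return OptionalLong.of(max);
    }
}
```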





[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400809#comment-15400809
 ] 

Dudu Markovitz commented on HIVE-6492:
--

Hi guys,

Perhaps I'm missing something, but although I understand the business 
scenario, I can't say I understand the chosen solution.

1.
Does it make sense to limit access to all tables by the number of 
partitions when the volume of a partition can vary widely from table to table? 

2.
Does it make sense to limit all users with a single parameter when there are 
different groups of users with different business justifications?

3.
What prevents users from simply dividing their queries into multiple smaller 
queries?  

4.
Can't a user just change the parameter for his session, removing the limitation?


For various reasons, it is strongly recommended not to give users access to 
the tables themselves but only to views that mask the tables.
If that approach is taken, a simple filter within the view can solve the 
issue, e.g.:

create view mytable_v as select * from mytable where create_date >= date '2013-01-01';



> limit partition number involved in a table scan
> ---
>
> Key: HIVE-6492
> URL: https://issues.apache.org/jira/browse/HIVE-6492
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Fix For: 0.13.0
>
> Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, 
> HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, 
> HIVE-6492.5.patch.txt, HIVE-6492.6.patch.txt, HIVE-6492.7.parch.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> To protect the cluster, a new configuration variable 
> "hive.limit.query.max.table.partition" is added to the Hive configuration to
> limit the number of table partitions involved in a table scan. 
> The default value will be set to -1, which means there is no limit by default. 
> This variable will not affect "metadata only" queries.
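
A minimal sketch of the check described above. The class and method names are hypothetical, and treating every negative value (not only -1) as "no limit" is an assumption of this sketch rather than documented behavior.

```java
// Hypothetical illustration of a partition-count guard.
final class PartitionLimitCheck {
    /**
     * Returns true if a scan touching partitionsScanned partitions should be rejected.
     * Assumption: any negative maxPartitions value disables the limit; the source
     * only states that the default of -1 means "no limit".
     */
    static boolean exceedsLimit(int partitionsScanned, int maxPartitions) {
        return maxPartitions >= 0 && partitionsScanned > maxPartitions;
    }
}
```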





[jira] [Updated] (HIVE-14329) fix flapping qtests - because of output string ordering

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14329:

Status: Patch Available  (was: Open)

added qtest changes to patch

> fix flapping qtests - because of output string ordering
> ---
>
> Key: HIVE-14329
> URL: https://issues.apache.org/jira/browse/HIVE-14329
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14329.1.patch, HIVE-14329.2.patch, 
> HIVE-14329.3.patch
>
>
> It's a bit annoying to see some tests come and go in the test results; for example:
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/631/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_stats_list_bucket/history/
> These tests fail occasionally because the ordering in the map is different.
> The usual cause of these failures is a simple hashmap in 
> {{MetaDataFormatUtils}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java#L411
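
A minimal sketch of one common remedy for this kind of flakiness: iterate the entries in sorted key order (here via TreeMap) before formatting, so the output no longer depends on HashMap iteration order. The class name and the key=value output format are hypothetical, and this is not necessarily what the attached patches do.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical formatter; only the ordering technique is the point.
final class StableParamFormatter {
    /** Formats entries as key=value lines in ascending key order. */
    static String format(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        // Copying into a TreeMap sorts by the natural (String) key order,
        // which is deterministic regardless of the input map's implementation.
        for (Map.Entry<String, String> e : new TreeMap<>(params).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

A LinkedHashMap would be the alternative when insertion order, rather than sorted order, is the one the expected test output should follow.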





[jira] [Updated] (HIVE-14329) fix flapping qtests - because of output string ordering

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14329:

Attachment: HIVE-14329.3.patch

> fix flapping qtests - because of output string ordering
> ---
>
> Key: HIVE-14329
> URL: https://issues.apache.org/jira/browse/HIVE-14329
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14329.1.patch, HIVE-14329.2.patch, 
> HIVE-14329.3.patch
>
>
> It's a bit annoying to see some tests come and go in the test results; for example:
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/631/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_stats_list_bucket/history/
> These tests fail occasionally because the ordering in the map is different.
> The usual cause of these failures is a simple hashmap in 
> {{MetaDataFormatUtils}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java#L411





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Open  (was: Patch Available)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Attachment: HIVE-14367.2.patch

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14367:

Status: Patch Available  (was: Open)

> Estimated size for constant nulls is 0
> --
>
> Key: HIVE-14367
> URL: https://issues.apache.org/jira/browse/HIVE-14367
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14367.1.patch, HIVE-14367.1.patch, 
> HIVE-14367.2.patch
>
>
> since type is incorrectly assumed as void.





[jira] [Updated] (HIVE-14329) fix flapping qtests - because of output string ordering

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14329:

Status: Open  (was: Patch Available)

> fix flapping qtests - because of output string ordering
> ---
>
> Key: HIVE-14329
> URL: https://issues.apache.org/jira/browse/HIVE-14329
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14329.1.patch, HIVE-14329.2.patch
>
>
> It's a bit annoying to see some tests come and go in the test results; for example:
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/631/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_stats_list_bucket/history/
> These tests fail occasionally because the ordering in the map is different.
> The usual cause of these failures is a simple hashmap in 
> {{MetaDataFormatUtils}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java#L411





[jira] [Commented] (HIVE-14363) bucketmap inner join query fails due to NullPointerException in some cases

2016-07-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400802#comment-15400802
 ] 

Ashutosh Chauhan commented on HIVE-14363:
-

[~hsubramaniyan] Please commit if failures are not related.

> bucketmap inner join query fails due to NullPointerException in some cases
> --
>
> Key: HIVE-14363
> URL: https://issues.apache.org/jira/browse/HIVE-14363
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14363.1.patch
>
>
> A bucketmap inner join query between bucketed tables throws the following exception 
> when all buckets of one table are empty while all buckets of the other are 
> non-empty.
> {noformat}
> Vertex failed, vertexName=Map 2, vertexId=vertex_1466710232033_0432_4_01, 
> diagnostics=[Task failed, taskId=task_1466710232033_0432_4_01_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1466710232033_0432_4_01_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:330)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getKeyValueReader(MapRecordProcessor.java:372)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.initializeMapRecordSources(MapRecordProcessor.java:344)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:292)
>   ... 15 more
> ], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1466710232033_0432_4_01_00_1:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator 

[jira] [Commented] (HIVE-6628) Use UDFs in create table statement

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400794#comment-15400794
 ] 

Dudu Markovitz commented on HIVE-6628:
--

Hi Nicolas

I couldn't understand why a combination of external table + view is not a good 
fit for you, e.g. -

create external table mytable(
 userid int
,adate string
,listofthings string
)ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
;

create view mytable_v as
select userid
 , from_utc_timestamp(adate,"Europe/Paris")
 ,split( listofthings, ";")
from mytable
;

> Use UDFs in create table statement
> --
>
> Key: HIVE-6628
> URL: https://issues.apache.org/jira/browse/HIVE-6628
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Reporter: nicolas maillard
>Priority: Trivial
>
> It would be nice to be able to use UDFs in a create table statement
> Say my data is : userid, timestamp utc, list_of_things
> 123,1386716402,thing1;thing2:thing3
> Being able to say
> create external table mytable(
> userid int
> adate string as from_utc_timestamp(timestamp,"Europe/Paris")
> listofthings array as split( list_of_things, ";")
> )ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> this is like a much lighter serde or a simpler view I guess.
> It would allow correcting the view of certain fields on the fly without 
> needing to do reprocessing. This is a use case we see happening a lot in our 
> initial data collections.





[jira] [Comment Edited] (HIVE-10142) Calculating formula based on difference between each row's value and current row's in Windowing function

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400487#comment-15400487
 ] 

Dudu Markovitz edited comment on HIVE-10142 at 7/30/16 7:23 PM:


Although I can relate to the request, I've never seen it implemented before, 
probably because it is an O(N^2) operation.

E.g.-
For every event I would like to count the number of events with higher values, 
that occurred before this event.
Assuming we have a new keyword  "CURRENT_ROW", the analytic function would look 
something like this:

count (case when val > CURRENT_ROW.val then 1 end) over (order by ts rows 
between unbounded preceding and current row)

The thing is that in order to implement this we would probably sort the data 
set by ts (so far so good) and then compare each record against its preceding 
records, which is an O(N^2) operation.
That means that for a table of 1M (1,000,000) records we are at the scale of 1T 
(1,000,000,000,000) operations.

I'm not sure we want to go there.
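
A minimal sketch of the naive evaluation described above, assuming the values are already sorted by ts. The class and method names are hypothetical. Each row is compared against every preceding row, so N rows need N(N-1)/2 comparisons, which is roughly 5 x 10^11 for 1,000,000 rows and therefore on the order of the N^2 figure quoted.

```java
// Hypothetical illustration of the quadratic "compare against all preceding rows" approach.
final class PrecedingHigherCount {
    /**
     * For each position i, counts earlier positions j < i whose value is greater
     * than the value at i. Input is assumed to be ordered by timestamp already.
     */
    static long[] countPrecedingHigher(long[] valsOrderedByTs) {
        int n = valsOrderedByTs.length;
        long[] counts = new long[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                if (valsOrderedByTs[j] > valsOrderedByTs[i]) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    /** Number of pairwise comparisons the nested loops perform for n rows. */
    static long comparisons(long n) {
        return n * (n - 1) / 2;
    }
}
```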



was (Author: dmarkovitz):
Although I can relate to the request, I've never seen it implemented before, 
probably because it is an O(N^2) operation.

Take this for example -
For every event I would like to count the number of events with higher higher 
values that occurred before it.
Assuming we have a new keyword  "CURRENT_ROW", the analytic function would look 
something like this:

count (case when val > CURRENT_ROW.val then 1 end) over (order by ts rows 
between unbounded preceding and current row)

The thing is that in order to implement this we would probably sort the data 
set by ts (so far so good) and then compare each record against its preceding 
records which is a O(N^2) operation.
That mean that for a table of 1M (1,000,000) record we are at the scale of 1T 
(1,000,000,000,000) operations.

I'm not sure want to go there.


> Calculating formula based on difference between each row's value and current 
> row's in Windowing function
> 
>
> Key: HIVE-10142
> URL: https://issues.apache.org/jira/browse/HIVE-10142
> Project: Hive
>  Issue Type: New Feature
>  Components: PTF-Windowing
>Affects Versions: 1.0.0
>Reporter: Yi Zhang
>Assignee: Aihua Xu
>
> For analytics with windowing functions, the calculation formula sometimes 
> needs to be evaluated over each row's value against the current row's value. The decay 
> value is a good example, such as sums of values with a decay function based on 
> the difference of timestamps between each row and the current row.





[jira] [Commented] (HIVE-11022) Support collecting lists in user defined order

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400781#comment-15400781
 ] 

Dudu Markovitz commented on HIVE-11022:
---

In order to support complex ordering expressions, I would suggest adding the 
sorting option as an enhanced syntax for the current functions and not as 
additional functions, in a similar way to MySQL's GROUP_CONCAT or Oracle's 
LISTAGG.
It would also be nice to add the SEPARATOR/DELIMITER option.

http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat
http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions089.htm#SQLRF30030




> Support collecting lists in user defined order
> --
>
> Key: HIVE-11022
> URL: https://issues.apache.org/jira/browse/HIVE-11022
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Michael Haeusler
>
> Hive currently supports aggregation of lists "in order of input rows" with 
> the UDF collect_list. Unfortunately, the order is not well defined when 
> map-side aggregations are used.
> Hive could support collecting lists in user-defined order by providing a UDF
> COLLECT_LIST_SORTED(valueColumn, sortColumn[, limit]), that would return a 
> list of values sorted in a user defined order. An optional limit parameter 
> can restrict this to the n first values within that order.
> Especially in the limit case, this can be efficiently pre-aggregated and 
> reduces the amount of data transferred to reducers.
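
A minimal sketch of why the limit case can be pre-aggregated cheaply: a bounded max-heap keeps only the n rows with the smallest sort keys, so each partial aggregation holds at most n rows. The class, the Row type, and the ascending-order choice are assumptions for illustration; this is not an actual Hive UDAF, and tie-breaking between equal sort keys is left unspecified.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of "collect values sorted by a key, keeping the first n".
final class SortedCollectSketch {
    static final class Row {
        final String value;
        final long sortKey;

        Row(String value, long sortKey) {
            this.value = value;
            this.sortKey = sortKey;
        }
    }

    /** Returns the values of the limit rows with the smallest sort keys, in ascending key order. */
    static List<String> collectSorted(List<Row> rows, int limit) {
        // Max-heap on sortKey: the head is the largest retained key, evicted first.
        PriorityQueue<Row> heap =
            new PriorityQueue<>((a, b) -> Long.compare(b.sortKey, a.sortKey));
        for (Row r : rows) {
            heap.add(r);
            if (heap.size() > limit) {
                heap.poll(); // drop the row that can no longer be among the first n
            }
        }
        List<Row> kept = new ArrayList<>(heap);
        kept.sort(Comparator.comparingLong(r -> r.sortKey));
        List<String> out = new ArrayList<>();
        for (Row r : kept) {
            out.add(r.value);
        }
        return out;
    }
}
```

Because the retained set never exceeds n rows, two partial results can be merged by feeding both into the same routine, which is what keeps the data sent to reducers small.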



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400779#comment-15400779
 ] 

Hive QA commented on HIVE-14270:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820781/HIVE-14270.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1267 failed/errored test(s), 10390 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_exist
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_concatenate_indexed_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_index
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_index
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_insert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_show_grant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19_inclause
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9

[jira] [Updated] (HIVE-14382) Improve the Functionality of Reverse FOR Statement

2016-07-30 Thread Akash Sethi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Sethi updated HIVE-14382:
---
Flags: Patch

> Improve the Functionality of Reverse  FOR Statement
> ---
>
> Key: HIVE-14382
> URL: https://issues.apache.org/jira/browse/HIVE-14382
> Project: Hive
>  Issue Type: Improvement
>Reporter: Akash Sethi
>Assignee: Akash Sethi
>Priority: Minor
> Attachments: HIVE-14382.1-branch-2.1.patch, HIVE-14382.1.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> According to SQL standards, the reverse FOR statement should look like this:
> FOR index IN Optional[Reverse] Lower_Bound Upper_Bound
> but in Hive it looks like this:
> FOR index IN Optional[Reverse]  Upper_Bound Lower_Bound
> so I am trying to improve the functionality of the reverse FOR statement.
> REFERENCES:
> https://docs.oracle.com/cloud/latest/db112/LNPLS/for_loop_statement.htm#LNPLS1536
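
The bound ordering described above can be illustrated with a small sketch. It models the PL/SQL-style rule in which the bounds are always written lower first and REVERSE only changes the direction of iteration; `ReverseForSketch` and `iterate` are hypothetical names and are not taken from the Hive code base or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: models "FOR index IN [REVERSE] lower..upper".
final class ReverseForSketch {

    /**
     * Returns the index values visited by the loop. The bounds are always
     * given as (lower, upper); `reverse` only flips the iteration direction.
     * If lower > upper the loop body never runs, in either direction.
     */
    static List<Integer> iterate(int lower, int upper, boolean reverse) {
        List<Integer> visited = new ArrayList<Integer>();
        if (reverse) {
            for (int i = upper; i >= lower; i--) {
                visited.add(i);
            }
        } else {
            for (int i = lower; i <= upper; i++) {
                visited.add(i);
            }
        }
        return visited;
    }
}
```

Under this rule, swapping the bounds (writing the upper bound first) yields an empty loop rather than a descending one.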





[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400732#comment-15400732
 ] 

Hive QA commented on HIVE-14346:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820752/HIVE-14346.1.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10371 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_coalesce.q-vector_decimal_round.q-tez_bmj_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constant_prop_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/698/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/698/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-698/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820752 - PreCommit-HIVE-MASTER-Build

> Change the default value for hive.mapred.mode to null
> -
>
> Key: HIVE-14346
> URL: https://issues.apache.org/jira/browse/HIVE-14346
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch
>
>
> HIVE-12727 introduces three new configurations to replace the existing 
> {{hive.mapred.mode}}, which is deprecated. However, the default value for the 
> latter is 'nonstrict', which prevents the new configurations from being used 
> (see comments in that JIRA for more details).
> This proposes to change the default value for {{hive.mapred.mode}} to null. 
> Users can then set the three new configurations to get more fine-grained 
> control over the strict checking. If users want to use the old configuration, 
> they can set {{hive.mapred.mode}} to strict/nonstrict.





[jira] [Commented] (HIVE-13604) Do not log AlreadyExistsException when "IF NOT EXISTS" is used.

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400683#comment-15400683
 ] 

Hive QA commented on HIVE-13604:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820724/HIVE-13604.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10386 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/697/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/697/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-697/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820724 - PreCommit-HIVE-MASTER-Build

> Do not log AlreadyExistsException when "IF NOT EXISTS" is used.
> ---
>
> Key: HIVE-13604
> URL: https://issues.apache.org/jira/browse/HIVE-13604
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Yuriy Plysyuk
>Assignee: Chinna Rao Lalam
>Priority: Trivial
> Attachments: HIVE-13604.1.patch, HIVE-13604.2.patch, HIVE-13604.patch
>
>
> When trying to create a view that already exists with the statement:
> CREATE VIEW IF NOT EXISTS dummy_table ...
> the following error is logged:
> ERROR RetryingHMSHandler:190 - AlreadyExistsException(message:Table 
> dummy_view already exists)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1296)
> ...
> The same happens when creating a schema using:
> CREATE SCHEMA IF NOT EXISTS ...
> The error should not be logged, as it is confusing.
> For 
> CREATE TABLE IF NOT EXISTS ...
> it works fine. I checked that there is code to handle this in:
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable
> // check for existence of table
> if (ifNotExists) {
>   try {
> Table table = getTable(qualifiedTabName, false);
> if (table != null) { // table exists
>   return null;
> }
> Could you please add a similar check for creating views and schemas?





[jira] [Commented] (HIVE-14370) printStackTrace() called in Operator.close()

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400651#comment-15400651
 ] 

Hive QA commented on HIVE-14370:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820716/HIVE-14370.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10386 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/696/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/696/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-696/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820716 - PreCommit-HIVE-MASTER-Build

> printStackTrace() called in Operator.close()
> 
>
> Key: HIVE-14370
> URL: https://issues.apache.org/jira/browse/HIVE-14370
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Karoly
>Assignee: David Karoly
>Priority: Minor
> Attachments: HIVE-14370.1.patch
>
>
> Operator.close() calls printStackTrace() if something goes wrong, making the 
> stack trace go to stderr instead of logs and losing the timestamp. We should 
> use LOG.warn instead.
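
The change suggested above can be sketched as follows. This is a minimal, self-contained illustration that uses `java.util.logging` so it runs without extra dependencies; Hive's own logger and the real `Operator.close()` are not reproduced here, and `CloseLoggingSketch` and `closeQuietly` are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch only: uses java.util.logging as a stand-in for the
// project's logging framework.
final class CloseLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(CloseLoggingSketch.class.getName());

    /**
     * Closes the resource and reports whether it closed cleanly. On failure
     * the exception is passed to the logger, so the stack trace is written by
     * the log handlers (with a timestamp) instead of going straight to stderr
     * via e.printStackTrace().
     */
    static boolean closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            LOG.log(Level.WARNING, "Error while closing resource", e);
            return false;
        }
    }
}
```

Passing the exception object to the logger, rather than only its message, keeps the full stack trace in the log output.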





[jira] [Updated] (HIVE-14259) FileUtils.isSubDir may return incorrect result

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14259:

Attachment: HIVE-14259.3.patch

rebased patch

> FileUtils.isSubDir may return incorrect result
> --
>
> Key: HIVE-14259
> URL: https://issues.apache.org/jira/browse/HIVE-14259
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14259.1.patch, HIVE-14259.2.patch, 
> HIVE-14259.3.patch
>
>
>  While working on HIVE-12244 I looked around for utility methods and found 
> this one, but it considers the path `/dir12` to be inside `/dir1`, 
> which is not true.
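
The pitfall described above typically comes from a plain string prefix test. The sketch below shows one way to avoid it by comparing on a path-separator boundary. It is not the code from the attached patches: `SubDirSketch` and `isInside` are hypothetical names, and the paths are assumed to be normalized, absolute, and separated by `/`.

```java
// Illustrative sketch only: not the FileUtils.isSubDir implementation.
final class SubDirSketch {

    /**
     * Returns true if `child` equals `parent` or lies somewhere below it.
     * Both paths get a trailing '/' before the prefix test, so "/dir12" is
     * compared as "/dir12/" against "/dir1/" and no longer matches.
     */
    static boolean isInside(String child, String parent) {
        String parentDir = parent.endsWith("/") ? parent : parent + "/";
        String childDir = child.endsWith("/") ? child : child + "/";
        return childDir.startsWith(parentDir);
    }
}
```

Whether a directory should count as being inside itself is a design choice; this sketch answers yes.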





[jira] [Updated] (HIVE-14259) FileUtils.isSubDir may return incorrect result

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14259:

Status: Patch Available  (was: Open)

> FileUtils.isSubDir may return incorrect result
> --
>
> Key: HIVE-14259
> URL: https://issues.apache.org/jira/browse/HIVE-14259
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14259.1.patch, HIVE-14259.2.patch, 
> HIVE-14259.3.patch
>
>
>  While working on HIVE-12244 I looked around for utility methods and found 
> this one, but it considers the path `/dir12` to be inside `/dir1`, 
> which is not true.





[jira] [Updated] (HIVE-14259) FileUtils.isSubDir may return incorrect result

2016-07-30 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14259:

Status: Open  (was: Patch Available)

> FileUtils.isSubDir may return incorrect result
> --
>
> Key: HIVE-14259
> URL: https://issues.apache.org/jira/browse/HIVE-14259
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14259.1.patch, HIVE-14259.2.patch, 
> HIVE-14259.3.patch
>
>
>  While working on HIVE-12244 I looked around for utility methods and found 
> this one, but it considers the path `/dir12` to be inside `/dir1`, 
> which is not true.





[jira] [Commented] (HIVE-11116) Can not select data from table which points to remote hdfs location

2016-07-30 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400614#comment-15400614
 ] 

Yongzhi Chen commented on HIVE-11116:
-

[~DavidKaroly], the fix looks good, could you add some tests if possible?
And in order to run precommit build, you have to submit your patch. 

> Can not select data from table which points to remote hdfs location
> ---
>
> Key: HIVE-11116
> URL: https://issues.apache.org/jira/browse/HIVE-11116
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption
>Affects Versions: 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Alexander Pivovarov
>Assignee: David Karoly
> Attachments: HIVE-11116.1.patch
>
>
> I tried to create a new table that points to a remote HDFS location and 
> select data from it.
> This works in hive-0.14 and hive-1.0 but fails starting with hive-1.1.
> To reproduce the issue:
> 1. create folder on remote hdfs
> {code}
> hadoop fs -mkdir -p hdfs://remote-nn/tmp/et1
> {code}
> 2. create table 
> {code}
> CREATE TABLE et1 (
>   a string
> ) stored as textfile
> LOCATION 'hdfs://remote-nn/tmp/et1';
> {code}
> 3. run select
> {code}
> select * from et1 limit 10;
> {code}
> 4. Should get the following error
> {code}
> select * from et1;
> 15/06/25 13:43:44 [main]: ERROR parse.CalcitePlanner: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if 
> hdfs://remote-nn/tmp/et1is encrypted: java.lang.IllegalArgumentException: 
> Wrong FS: hdfs://remote-nn/tmp/et1, expected: hdfs://localhost:8020
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1763)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:190)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://remote-nn/tmp/et1, expected: hdfs://localhost:8020
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1906)
>   at 
> org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1097)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1759)
>   ... 25 more
> FAILED: SemanticException Unable to determine if hdfs://remote-nn/tmp/et1is 
> encrypted: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://remote-nn/tmp/et1, expected: hdfs://localhost:8020
> 15/06/25 13:43:44 [main]: ERROR ql.Driver: FAILED: SemanticException Unable 
> to determine if 

[jira] [Commented] (HIVE-14350) Aborted txns cause false positive "Not enough history available..." msgs

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400613#comment-15400613
 ] 

Hive QA commented on HIVE-14350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820789/HIVE-14350.9.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/695/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/695/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-695/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-695/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 6b0131b Revert "HIVE-14303: CommonJoinOperator.checkAndGenObject 
should return directly at CLOSE state to avoid NPE if ExecReducer.close is 
called twice. (Zhihai Xu, reviewed by Xuefu Zhang)"
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 6b0131b Revert "HIVE-14303: CommonJoinOperator.checkAndGenObject 
should return directly at CLOSE state to avoid NPE if ExecReducer.close is 
called twice. (Zhihai Xu, reviewed by Xuefu Zhang)"
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820789 - PreCommit-HIVE-MASTER-Build

> Aborted txns cause false positive "Not enough history available..." msgs
> 
>
> Key: HIVE-14350
> URL: https://issues.apache.org/jira/browse/HIVE-14350
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-14350.2.patch, HIVE-14350.3.patch, 
> HIVE-14350.5.patch, HIVE-14350.6.patch, HIVE-14350.7.patch, 
> HIVE-14350.8.patch, HIVE-14350.9.patch
>
>
> this is a followup to HIVE-13369.  Only open txns should prevent use of a 
> base file.  But ValidTxnList does not make a distinction between open and 
> aborted txns.  The presence of aborted txns causes false positives which can 
> happen too often since the flow is 
> 1. Worker generates a new base file, 
> 2. then asynchronously Cleaner removes now-compacted aborted txns.  (strictly 
> speaking it's Initiator that does the actual clean up)
> So we may have base_5 and base_10 and txnid 7 aborted.  Then current impl 
> will disallow use of base_10 though there is no need for that.  Worse, if 
> txnid_4 is aborted and hasn't been purged yet, base_5 will be rejected as 
> well and then an error will be raised since there is no suitable base file 
> left.
> ErrorMsg.ACID_NOT_ENOUGH_HISTORY is msg produced
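
The base_5 / base_10 example above can be traced with a simplified model. The sketch assumes that base_N is usable only if no "blocking" transaction has an id less than or equal to N; it does not reproduce ValidTxnList or Hive's actual base-selection code, and `BaseSelectionSketch` and `pickBase` are hypothetical names.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.Set;

// Illustrative sketch only: a simplified model of base file selection.
final class BaseSelectionSketch {

    /**
     * Returns the highest base id that is not blocked, or empty if every base
     * is blocked (the "not enough history" situation). A base with id N is
     * treated as blocked when some blocking txn id is <= N.
     */
    static OptionalLong pickBase(List<Long> baseIds, Set<Long> blockingTxns) {
        OptionalLong best = OptionalLong.empty();
        for (long base : baseIds) {
            boolean blocked = false;
            for (long txn : blockingTxns) {
                if (txn <= base) {
                    blocked = true;
                    break;
                }
            }
            if (!blocked && (!best.isPresent() || base > best.getAsLong())) {
                best = OptionalLong.of(base);
            }
        }
        return best;
    }
}
```

If aborted transactions are counted as blocking, the call with bases 5 and 10 and txn 7 falls back to base_5, and adding aborted txn 4 leaves no usable base. If only open transactions are passed in, as the issue proposes, base_10 is chosen in both cases.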





[jira] [Commented] (HIVE-14368) ThriftCLIService.GetOperationStatus should include exception's stack trace to the error message.

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400610#comment-15400610
 ] 

Hive QA commented on HIVE-14368:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12820650/HIVE-14368.000.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10386 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_avro_non_nullable_union
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/694/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/694/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-694/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12820650 - PreCommit-HIVE-MASTER-Build

> ThriftCLIService.GetOperationStatus should include exception's stack trace to 
> the error message.
> 
>
> Key: HIVE-14368
> URL: https://issues.apache.org/jira/browse/HIVE-14368
> Project: Hive
>  Issue Type: Improvement
>  Components: Thrift API
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Attachments: HIVE-14368.000.patch
>
>
> ThriftCLIService.GetOperationStatus should include exception's stack trace to 
> the error message. The stack trace will be really helpful for client to debug 
> failed queries.





[jira] [Commented] (HIVE-14350) Aborted txns cause false positive "Not enough history available..." msgs

2016-07-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400575#comment-15400575
 ] 

Lefty Leverenz commented on HIVE-14350:
---

[~ekoifman], please update the status and fix version/s for this issue.  You 
committed it to master, branch-2.1, and branch-1 yesterday.

> Aborted txns cause false positive "Not enough history available..." msgs
> 
>
> Key: HIVE-14350
> URL: https://issues.apache.org/jira/browse/HIVE-14350
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-14350.2.patch, HIVE-14350.3.patch, 
> HIVE-14350.5.patch, HIVE-14350.6.patch, HIVE-14350.7.patch, 
> HIVE-14350.8.patch, HIVE-14350.9.patch
>
>
> this is a followup to HIVE-13369.  Only open txns should prevent use of a 
> base file.  But ValidTxnList does not make a distinction between open and 
> aborted txns.  The presence of aborted txns causes false positives which can 
> happen too often since the flow is 
> 1. Worker generates a new base file, 
> 2. then asynchronously Cleaner removes now-compacted aborted txns.  (strictly 
> speaking it's Initiator that does the actual clean up)
> So we may have base_5 and base_10 and txnid 7 aborted.  Then current impl 
> will disallow use of base_10 though there is no need for that.  Worse, if 
> txnid_4 is aborted and hasn't been purged yet, base_5 will be rejected as 
> well and then an error will be raised since there is no suitable base file 
> left.
> ErrorMsg.ACID_NOT_ENOUGH_HISTORY is msg produced





[jira] [Commented] (HIVE-14366) Conversion of a Non-ACID table to an ACID table produces non-unique primary keys

2016-07-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400574#comment-15400574
 ] 

Lefty Leverenz commented on HIVE-14366:
---

[~ekoifman] or [~saketj], this was committed to master, branch-2.1, and 
branch-1 so please update the status and fix version/s.

> Conversion of a Non-ACID table to an ACID table produces non-unique primary 
> keys
> 
>
> Key: HIVE-14366
> URL: https://issues.apache.org/jira/browse/HIVE-14366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Blocker
> Attachments: HIVE-14366.01.patch, HIVE-14366.02.patch
>
>
> When a Non-ACID table is converted to an ACID table, the primary key 
> consisting of (original transaction id, bucket_id, row_id) is not generated 
> uniquely. Currently, the row_id is always set to 0 for most rows. This leads 
> to correctness issues for such tables.
> The quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>   @Test
>   public void testOriginalReader() throws Exception {
>     FileSystem fs = FileSystem.get(hiveConf);
>     FileStatus[] status;
>     // 1. Insert five rows into the Non-ACID table.
>     runStatementOnDriver("insert into " + Table.NONACIDORCTBL
>         + "(a,b) values(1,2),(3,4),(5,6),(7,8),(9,10)");
>     // 2. Convert NONACIDORCTBL to an ACID table.
>     runStatementOnDriver("alter table " + Table.NONACIDORCTBL
>         + " SET TBLPROPERTIES ('transactional'='true')");
>     // 3. Perform a major compaction.
>     runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " compact 'MAJOR'");
>     runWorker(hiveConf);
>     // 4. Perform a delete.
>     runStatementOnDriver("delete from " + Table.NONACIDORCTBL + " where a = 1");
>     // 5. A projection should now return only (3,4),(5,6),(7,8),(9,10),
>     //    since (1,2) has been deleted.
>     List<String> rs = runStatementOnDriver("select a,b from "
>         + Table.NONACIDORCTBL + " order by a,b");
>     int[][] resultData = new int[][] {{3,4}, {5,6}, {7,8}, {9,10}};
>     Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}
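
To make the uniqueness problem concrete, here is a small sketch of the (original transaction id, bucket_id, row_id) key. It is only an illustration: RowIdSketch and its methods are hypothetical names, the key is modelled as a plain list of three longs, and nothing here reflects how Hive actually assigns row ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

/** Illustrative only: a toy model of the (originalTxnId, bucketId, rowId) key. */
final class RowIdSketch {

  /** Builds one key as a three-element list so that equals/hashCode come for free. */
  static List<Long> key(long originalTxnId, long bucketId, long rowId) {
    return Arrays.asList(originalTxnId, bucketId, rowId);
  }

  /** Mimics the reported bug: every row in the bucket gets row_id 0. */
  static List<List<Long>> withConstantRowId(long txnId, long bucketId, int rowCount) {
    List<List<Long>> keys = new ArrayList<>();
    for (int i = 0; i < rowCount; i++) {
      keys.add(key(txnId, bucketId, 0L));
    }
    return keys;
  }

  /** One possible remedy, assumed here for illustration: number rows 0..n-1 per bucket. */
  static List<List<Long>> withSequentialRowIds(long txnId, long bucketId, int rowCount) {
    List<List<Long>> keys = new ArrayList<>();
    for (int i = 0; i < rowCount; i++) {
      keys.add(key(txnId, bucketId, i));
    }
    return keys;
  }

  /** True when no two rows share the same key. */
  static boolean allUnique(List<List<Long>> keys) {
    return new HashSet<>(keys).size() == keys.size();
  }
}
```

For the five rows inserted by the test, the constant-zero variant yields five identical keys, so an operation that addresses a row by its key cannot tell the rows apart; the sequential variant yields five distinct keys.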





[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-07-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400557#comment-15400557
 ] 

Lefty Leverenz commented on HIVE-14270:
---

I made some editorial comments on RB for the config description.

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch
>
>
> Currently, when doing INSERT statements on tables located on S3, Hive also 
> writes temporary (or intermediate) files to S3 and reads them back from there. 
> If HDFS is still the default filesystem on Hive, then we can keep such 
> temporary files on HDFS so that things run faster.
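
The decision described above can be sketched with nothing but java.net.URI. This is an assumption-laden illustration rather than the patch itself: ScratchDirSketch, chooseScratchDir, the list of S3 schemes, and the scratch path argument are all hypothetical, and no real Hive configuration property is implied.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: picks a filesystem for temporary files; not Hive's code. */
final class ScratchDirSketch {

  // Assumed set of URI schemes that denote S3-backed storage.
  private static final Set<String> S3_SCHEMES =
      new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

  /**
   * Returns the scratch directory: on the default filesystem when the table is
   * on S3 and the default filesystem is HDFS, otherwise next to the table.
   */
  static URI chooseScratchDir(URI tableLocation, URI defaultFs, String scratchPath) {
    boolean tableOnS3 = hasScheme(tableLocation, S3_SCHEMES);
    boolean defaultIsHdfs = "hdfs".equalsIgnoreCase(defaultFs.getScheme());
    URI root = (tableOnS3 && defaultIsHdfs) ? defaultFs : tableLocation;
    // Resolving an absolute path keeps the scheme and authority of the root.
    return root.resolve(scratchPath);
  }

  private static boolean hasScheme(URI uri, Set<String> schemes) {
    String scheme = uri.getScheme();
    return scheme != null && schemes.contains(scheme.toLowerCase(Locale.ROOT));
  }
}
```

Keeping the check on URI schemes only means the sketch needs no Hadoop classes; a real implementation would presumably consult the configured filesystems instead.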





[jira] [Updated] (HIVE-14390) Wrong Table alias when CBO is on

2016-07-30 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-14390:
-
Attachment: HIVE-14390.patch

HIVE-14390.patch can fix this, but I'm not sure it's the right way.
[~pxiong], would you mind taking a look?

> Wrong Table alias when CBO is on
> 
>
> Key: HIVE-14390
> URL: https://issues.apache.org/jira/browse/HIVE-14390
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Priority: Minor
> Attachments: HIVE-14390.patch, explain.rar
>
>
> There are 5 web_sales references in query95 of TPC-DS, with aliases ws1-ws5.
> But the query plan only has ws1 when CBO is on.
> query95:
> {noformat}
> SELECT count(distinct ws1.ws_order_number) as order_count,
>        sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
>        sum(ws1.ws_net_profit) as total_net_profit
> FROM web_sales ws1
> JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
> JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
> JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
> LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
>                 FROM web_sales ws2 JOIN web_sales ws3
>                 ON (ws2.ws_order_number = ws3.ws_order_number)
>                 WHERE ws2.ws_warehouse_sk <> ws3.ws_warehouse_sk
>                ) ws_wh1
> ON (ws1.ws_order_number = ws_wh1.ws_order_number)
> LEFT SEMI JOIN (SELECT wr_order_number
>                 FROM web_returns wr
>                 JOIN (SELECT ws4.ws_order_number as ws_order_number
>                       FROM web_sales ws4 JOIN web_sales ws5
>                       ON (ws4.ws_order_number = ws5.ws_order_number)
>                       WHERE ws4.ws_warehouse_sk <> ws5.ws_warehouse_sk
>                      ) ws_wh2
>                 ON (wr.wr_order_number = ws_wh2.ws_order_number)) tmp1
> ON (ws1.ws_order_number = tmp1.wr_order_number)
> WHERE d.d_date between '2002-05-01' and '2002-06-30' and
>       ca.ca_state = 'GA' and
>       s.web_company_name = 'pri';
> {noformat}





[jira] [Commented] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400543#comment-15400543
 ] 

Lefty Leverenz commented on HIVE-12646:
---

Okay, thanks Sergio.

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-12646.2.patch, HIVE-12646.3.patch, 
> HIVE-12646.4.patch, HIVE-12646.5.patch, HIVE-12646.patch
>
>
> Beeline and the Hive CLI require ; to be escaped inside quotes, while most 
> other shells do not. For example,
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> ++
> | ;foo   |
> | ;foo   |
> | ;foo   |
> ++
> 3 rows in set (0.00 sec)
> {noformat}
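
The behaviour the reporter expects, where a ; inside quotes does not end the statement, can be sketched as a small quote-aware splitter. This is not the code from the attached patches: StatementSplitter is a hypothetical class, and the handling of backslash escapes inside quotes is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: splits a command line on ';' outside of quotes. */
final class StatementSplitter {

  static List<String> split(String input) {
    List<String> statements = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    char openQuote = 0; // 0 means we are not inside a quoted string
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (openQuote != 0) {
        current.append(c);
        if (c == '\\' && i + 1 < input.length()) {
          // Assumed escape rule: a backslash protects the next character.
          current.append(input.charAt(++i));
        } else if (c == openQuote) {
          openQuote = 0;
        }
      } else if (c == '\'' || c == '"') {
        openQuote = c;
        current.append(c);
      } else if (c == ';') {
        addIfNotBlank(statements, current);
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    addIfNotBlank(statements, current);
    return statements;
  }

  private static void addIfNotBlank(List<String> out, StringBuilder sb) {
    String statement = sb.toString().trim();
    if (!statement.isEmpty()) {
      out.add(statement);
    }
  }
}
```

Given the Beeline input from the report, select ';' from tlb1; is returned as a single statement instead of being cut at the quoted semicolon.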





[jira] [Updated] (HIVE-14390) Wrong Table alias when CBO is on

2016-07-30 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-14390:
-
Attachment: explain.rar

> Wrong Table alias when CBO is on
> 
>
> Key: HIVE-14390
> URL: https://issues.apache.org/jira/browse/HIVE-14390
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Priority: Minor
> Attachments: explain.rar
>
>
> There are 5 web_sales references in query95 of TPC-DS, with aliases ws1-ws5.
> But the query plan only has ws1 when CBO is on.
> query95:
> {noformat}
> SELECT count(distinct ws1.ws_order_number) as order_count,
>        sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
>        sum(ws1.ws_net_profit) as total_net_profit
> FROM web_sales ws1
> JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
> JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
> JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
> LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
>                 FROM web_sales ws2 JOIN web_sales ws3
>                 ON (ws2.ws_order_number = ws3.ws_order_number)
>                 WHERE ws2.ws_warehouse_sk <> ws3.ws_warehouse_sk
>                ) ws_wh1
> ON (ws1.ws_order_number = ws_wh1.ws_order_number)
> LEFT SEMI JOIN (SELECT wr_order_number
>                 FROM web_returns wr
>                 JOIN (SELECT ws4.ws_order_number as ws_order_number
>                       FROM web_sales ws4 JOIN web_sales ws5
>                       ON (ws4.ws_order_number = ws5.ws_order_number)
>                       WHERE ws4.ws_warehouse_sk <> ws5.ws_warehouse_sk
>                      ) ws_wh2
>                 ON (wr.wr_order_number = ws_wh2.ws_order_number)) tmp1
> ON (ws1.ws_order_number = tmp1.wr_order_number)
> WHERE d.d_date between '2002-05-01' and '2002-06-30' and
>       ca.ca_state = 'GA' and
>       s.web_company_name = 'pri';
> {noformat}





[jira] [Commented] (HIVE-14123) Add beeline configuration option to show database in the prompt

2016-07-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400531#comment-15400531
 ] 

Lefty Leverenz commented on HIVE-14123:
---

[~pvary], yes, a TODOC## label is added to each JIRA issue that needs 
documentation, then we remove it when the documentation is done.  These labels 
make it possible to find out how many issues remain undocumented for a given 
release.

I like to add a doc note in the JIRA comments naming any configuration 
parameters (so they'll be easy to find in a search) and if I know where the doc 
belongs, I give the link.

Ideally the documentation should be done by the developer who created the 
patch, in this case you.  The Hive wiki covers all releases, so new information 
needs to specify the starting release number.  We often document things well 
before the release just because that's when they're fresh in our minds, even 
though this might lead to a bit of confusion among readers.

If you would like to document this now, you will need a Confluence account and 
wiki edit permissions as described here:  [About This Wiki -- How to get 
permission to edit| 
https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit].

A release note can also give the documentation information.  Use the Edit 
button (top left) to add a release note.

Thanks for asking.

> Add beeline configuration option to show database in the prompt
> ---
>
> Key: HIVE-14123
> URL: https://issues.apache.org/jira/browse/HIVE-14123
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline, CLI
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14123.10.patch, HIVE-14123.2.patch, 
> HIVE-14123.3.patch, HIVE-14123.4.patch, HIVE-14123.5.patch, 
> HIVE-14123.6.patch, HIVE-14123.7.patch, HIVE-14123.8.patch, 
> HIVE-14123.9.patch, HIVE-14123.patch
>
>
> There are several jira issues complaining that Beeline does not respect 
> hive.cli.print.current.db.
> This is partially true, since in embedded mode it has used 
> hive.cli.print.current.db to change the prompt since HIVE-10511.
> In beeline mode, I think this function should use a beeline command line 
> option instead, as for the showHeader option, emphasizing that this is a 
> client side option.





[jira] [Commented] (HIVE-14367) Estimated size for constant nulls is 0

2016-07-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400521#comment-15400521
 ] 

Hive QA commented on HIVE-14367:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821119/HIVE-14367.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 122 failed/errored test(s), 10386 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_groupby3_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantfolding
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fetch_aggregation
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_interval_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_literal_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_num_op_type_conv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduceSinkDeDuplication_pRS_key_empty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_remove_exprs_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_number_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_greatest
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_if
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_instr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_least
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_locate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_trunc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_when
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_stack
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_6_subq
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_without_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_nvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress

[jira] [Commented] (HIVE-7660) Hive to support qualify analytic filtering

2016-07-30 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400494#comment-15400494
 ] 

Dudu Markovitz commented on HIVE-7660:
--

This is syntactic sugar supported by Teradata.

e.g.
In order to get the last 3 records of each buyer:

select * from purchase qualify row_number () over (partition by buyer order by 
ts desc) <= 3;

without the "qualify" it would look something like:

select * from (select *,row_number () over (partition by buyer order by ts 
desc) as rn from purchase) as t where t.rn <= 3;

> Hive to support qualify analytic filtering
> --
>
> Key: HIVE-7660
> URL: https://issues.apache.org/jira/browse/HIVE-7660
> Project: Hive
>  Issue Type: New Feature
>Reporter: Viji
>Priority: Trivial
>
> Currently, Hive does not support qualify analytic filtering. It would be 
> useful if this feature were added in the future.
> As a workaround, since it is just a filter, we can replace it with a subquery 
> and filter.


