[jira] [Commented] (HIVE-18952) Tez session disconnect and reconnect on HS2 HA failover

2018-03-21 Thread Eric Wohlstadter (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409080#comment-16409080
 ] 

Eric Wohlstadter commented on HIVE-18952:
-

[~sershe]

lgtm 

+1 modulo Tez dependency

> Tez session disconnect and reconnect on HS2 HA failover
> ---
>
> Key: HIVE-18952
> URL: https://issues.apache.org/jira/browse/HIVE-18952
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18952.01.patch, HIVE-18952.patch
>
>
> Now that TEZ-3892 is committed, HIVE-18281 can make use of Tez session 
> disconnect and reconnect on HA failover.





[jira] [Commented] (HIVE-19015) Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException

2018-03-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409076#comment-16409076
 ] 

Matt McCline commented on HIVE-19015:
-

[~vihangk1] You may need the fixes in HIVE-19019 in order to make progress on 
this one.

> Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q 
> gets a ClassCastException
> -
>
> Key: HIVE-19015
> URL: https://issues.apache.org/jira/browse/HIVE-19015
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;"  to 
> parquet_map_of_arrays_of_ints.q triggers this call stack:
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to 
> org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]
> Adding parquet_map_of_maps.q, too.  The stack trace seems related.
> {noformat}
> Caused by: java.lang.ClassCastException: optional group value (MAP) {
>   repeated group key_value {
> optional binary key (UTF8);
> required int32 value;
>   }
> } is not primitive
>   at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) 
> ~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.(BaseVectorizedColumnReader.java:130)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.(VectorizedListColumnReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
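
Taken together, the two traces point at the same shape of bug: the vectorized 
list reader assumes a list's element type is primitive. A minimal sketch of that 
pattern and a guarded alternative follows; only the exception and the TypeInfo 
classes come from the traces above, everything else is illustrative.

{code}
// Sketch of the failing pattern implied by the traces: for a column like
// map<string,array<int>>, the list element TypeInfo is itself complex, so an
// unguarded cast to PrimitiveTypeInfo throws ClassCastException.
import org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

class ListElementTypeCheck {
  static PrimitiveTypeInfo elementAsPrimitive(ListTypeInfo listType) {
    TypeInfo element = listType.getListElementTypeInfo();
    // Buggy shape: return (PrimitiveTypeInfo) element;  // CCE for nested list/map
    if (!(element instanceof PrimitiveTypeInfo)) {
      throw new UnsupportedOperationException(
          "nested complex element type not supported yet: " + element.getTypeName());
    }
    return (PrimitiveTypeInfo) element;
  }
}
{code}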





[jira] [Commented] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409074#comment-16409074
 ] 

Matt McCline commented on HIVE-19019:
-

Patch #1 doesn't solve the parquet_schema_evolution.q execution error because 
another bug gets hit.  It needs more tests to make sure setValue/writeValue are 
exercised for all the complex types.

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and 
> orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from 
> VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19019.01.patch
>
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented 
> yet
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> Also, null_cast.q, nullMap.q, and nested_column_pruning.q
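
The failure mode in the trace is a stubbed writer rather than a crash in the 
data path. A hedged sketch of that shape; only the exception and its message 
come from the trace, the class and method signature here are illustrative:

{code}
// Illustrative sketch of the stub behavior behind "Not implemented yet": the
// complex-type writers in VectorExpressionWriterFactory throw as soon as a
// vectorized plan routes a MAP/LIST/STRUCT column through them (here via
// VectorUDFAdaptor). Completing these writers is what this issue tracks.
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.metadata.HiveException;

class ComplexTypeWriterStub {
  Object writeValue(ColumnVector column, int row) throws HiveException {
    throw new HiveException("Not implemented yet");
  }
}
{code}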





[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Status: Patch Available  (was: Open)

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and 
> orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from 
> VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19019.01.patch
>
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented 
> yet
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> Also, null_cast.q, nullMap.q, and nested_column_pruning.q





[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Attachment: HIVE-19019.01.patch

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and 
> orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from 
> VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19019.01.patch
>
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented 
> yet
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> Also, null_cast.q, nullMap.q, and nested_column_pruning.q





[jira] [Commented] (HIVE-18926) Improve operator-tree matching

2018-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409025#comment-16409025
 ] 

Ashutosh Chauhan commented on HIVE-18926:
-

+1 pending tests

> Improve operator-tree matching
> --
>
> Key: HIVE-18926
> URL: https://issues.apache.org/jira/browse/HIVE-18926
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-18926.01.patch, HIVE-18926.02.patch, 
> HIVE-18926.03.patch, HIVE-18926.04.patch, HIVE-18926.05.patch
>
>
> Currently, joins are not matched.





[jira] [Updated] (HIVE-19022) Hive Beeline cannot read user-defined environment variables

2018-03-21 Thread KaiXu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-19022:
-
Description: 
We found that users cannot read exported environment variables in Hive Beeline.

How to reproduce:
1. start the hiveserver2 service 
2. beeline embedded mode:
[root@bdw-master hive232]# export AAA=aaa
[root@bdw-master ~]# echo $HADOOP_HOME
/opt/hive_package/hadoop273
[root@bdw-master hive232]# bin/beeline -u 'jdbc:hive2://localhost:1' -n 
root -p 123456
0: jdbc:hive2://localhost:1> set env:AAA;
Error: Error while processing statement: null (state=,code=1)

but we found that we can read variables such as HADOOP_HOME and JAVA_HOME:

0: jdbc:hive2://localhost:1> set env:HADOOP_HOME;
+--+
|   set|
+--+
| env:HADOOP_HOME=/opt/hive_package/hadoop273  |
+--+
1 row selected (0.097 seconds)

0: jdbc:hive2://localhost:1> set env:JAVA_HOME;
+---+
|  set  |
+---+
| env:JAVA_HOME=/usr/java/jdk1.8.0_131  |
+---+
1 row selected (0.09 seconds)

Below is hive.log:

2018-03-22T11:12:01,708  WARN [HiveServer2-Handler-Pool: Thread-94] 
thrift.ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) 
~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
 ~[hive-service-2.3.2.jar:2.3.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_131]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-2.3.2.jar:2.3.2]

  was:
we found that user can not get exported environment variables in Hive beeline.

how to reproduce:
1. start hiveserver2 service 
2. beeline embedded mode:
[root@bdw-master hive232]# export AAA=aaa
[root@bdw-master ~]# echo $HADOOP_HOME
/opt/hive_package/hadoop273
[root@bdw-master hive232]# bin/beeline -u 'jdbc:hive2://localhost:1' -n 
root -p 123456
0: jdbc:hive2://localhost:1> set env:AAA;
Error: Error while processing statement: null (state=,code=1)

but we found that we can get HADOOP_HOME etc. variables:

0: jdbc:hive2://localhost:1> set env:HADOOP_HOME;
+--+
|   set|
+--+
| env:HADOOP_HOME=/opt/hive_package/hadoop273  |
+--+
1 row selected (0.097 seconds)



Below is hive.log:

2018-03-22T11:12:01,708  WARN [HiveServer2-Handler-Pool: Thread-94] 
thrift.ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) 
~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
 ~[hive-service-2.3.2.jar:2.3.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_131]
at 

[jira] [Updated] (HIVE-19022) Hive Beeline cannot read user-defined environment variables

2018-03-21 Thread KaiXu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-19022:
-
Environment: (was: we found that user can not get exported environment 
variables in Hive beeline.

how to reproduce:
1. start hiveserver2 service 
2. beeline embedded mode:
[root@bdw-master hive232]# export AAA=aaa
[root@bdw-master ~]# echo $HADOOP_HOME
/opt/hive_package/hadoop273
[root@bdw-master hive232]# bin/beeline -u 'jdbc:hive2://localhost:1' -n 
root -p 123456
0: jdbc:hive2://localhost:1> set env:AAA;
Error: Error while processing statement: null (state=,code=1)

but we found that we can get HADOOP_HOME etc. variables:

0: jdbc:hive2://localhost:1> set env:HADOOP_HOME;
+--+
|   set|
+--+
| env:HADOOP_HOME=/opt/hive_package/hadoop273  |
+--+
1 row selected (0.097 seconds)



Below is hive.log:

2018-03-22T11:12:01,708  WARN [HiveServer2-Handler-Pool: Thread-94] 
thrift.ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) 
~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
 ~[hive-service-2.3.2.jar:2.3.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_131]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-2.3.2.jar:2.3.2])

> Hive Beeline cannot read user-defined environment variables
> ---
>
> Key: HIVE-19022
> URL: https://issues.apache.org/jira/browse/HIVE-19022
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2, JDBC
>Affects Versions: 2.3.2
>Reporter: KaiXu
>Priority: Major
>






[jira] [Updated] (HIVE-19022) Hive Beeline cannot read user-defined environment variables

2018-03-21 Thread KaiXu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-19022:
-
Description: 
We found that users cannot read exported environment variables in Hive Beeline.

How to reproduce:
1. start the hiveserver2 service 
2. beeline embedded mode:
[root@bdw-master hive232]# export AAA=aaa
[root@bdw-master ~]# echo $HADOOP_HOME
/opt/hive_package/hadoop273
[root@bdw-master hive232]# bin/beeline -u 'jdbc:hive2://localhost:1' -n 
root -p 123456
0: jdbc:hive2://localhost:1> set env:AAA;
Error: Error while processing statement: null (state=,code=1)

but we found that we can read variables such as HADOOP_HOME:

0: jdbc:hive2://localhost:1> set env:HADOOP_HOME;
+--+
|   set|
+--+
| env:HADOOP_HOME=/opt/hive_package/hadoop273  |
+--+
1 row selected (0.097 seconds)



Below is hive.log:

2018-03-22T11:12:01,708  WARN [HiveServer2-Handler-Pool: Thread-94] 
thrift.ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) 
~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
 ~[hive-service-2.3.2.jar:2.3.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_131]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-2.3.2.jar:2.3.2]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-2.3.2.jar:2.3.2]
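
A plausible mechanism, offered as an assumption rather than a confirmed 
diagnosis: env: lookups run inside the HiveServer2 process, whose environment 
is fixed when the JVM starts, so a variable exported later in the client's 
shell never reaches it, and the missing-variable path surfaces as the opaque 
"null" error. A minimal Java sketch of that behavior:

{code}
// Sketch only: "set env:FOO" resolves against the server JVM's environment
// snapshot, not the Beeline user's shell. AAA was exported after HS2 started,
// so the lookup returns null; HADOOP_HOME/JAVA_HOME were present at startup.
public class EnvLookupSketch {
  static String lookupEnv(String name) {
    String value = System.getenv(name);  // server-side environment snapshot
    if (value == null) {
      // Hive 2.3.2 appears to surface this as "Error while processing
      // statement: null"; a message naming the variable would be clearer.
      throw new IllegalArgumentException("env variable not defined: " + name);
    }
    return "env:" + name + "=" + value;
  }
}
{code}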

> Hive Beeline cannot read user-defined environment variables
> ---
>
> Key: HIVE-19022
> URL: https://issues.apache.org/jira/browse/HIVE-19022
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2, JDBC
>Affects Versions: 2.3.2
>Reporter: KaiXu
>Priority: Major
>
> We found that users cannot read exported environment variables in Hive Beeline.
> How to reproduce:
> 1. start the hiveserver2 service 
> 2. beeline embedded mode:
> [root@bdw-master hive232]# export AAA=aaa
> [root@bdw-master ~]# echo $HADOOP_HOME
> /opt/hive_package/hadoop273
> [root@bdw-master hive232]# bin/beeline -u 'jdbc:hive2://localhost:1' -n 
> root -p 123456
> 0: jdbc:hive2://localhost:1> set env:AAA;
> Error: Error while processing statement: null (state=,code=1)
> but we found that we can read variables such as HADOOP_HOME:
> 0: jdbc:hive2://localhost:1> set env:HADOOP_HOME;
> +--+
> |   set|
> +--+
> | env:HADOOP_HOME=/opt/hive_package/hadoop273  |
> +--+
> 1 row selected (0.097 seconds)
> Below is hive.log:
> 2018-03-22T11:12:01,708  WARN [HiveServer2-Handler-Pool: Thread-94] 
> thrift.ThriftCLIService: Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
>  ~[hive-service-2.3.2.jar:2.3.2]
> at 
> org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
>  ~[hive-service-2.3.2.jar:2.3.2]
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) 
> ~[hive-service-2.3.2.jar:2.3.2]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
>  ~[hive-service-2.3.2.jar:2.3.2]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
>  ~[hive-service-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> 

[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-21 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.10.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch, HIVE-18910.10.patch, 
> HIVE-18910.2.patch, HIVE-18910.3.patch, HIVE-18910.4.patch, 
> HIVE-18910.5.patch, HIVE-18910.6.patch, HIVE-18910.7.patch, 
> HIVE-18910.8.patch, HIVE-18910.9.patch
>
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash while keeping backward compatibility for existing 
> users so that they don't have to reload their existing tables.
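
For context, the bucket formula itself can stay the same while the hash 
function changes. A hedged sketch, using Guava's Murmur3 purely as a stand-in 
for whichever Murmur implementation the patch actually adopts:

{code}
// Same bucket formula, different hash: rows hashed with Java's String.hashCode()
// land in different buckets than Murmur3 would place them, which is why backward
// compatibility (reading old tables with the old hash) matters.
import java.nio.charset.StandardCharsets;
import com.google.common.hash.Hashing;

public class BucketingSketch {
  static int bucket(int hash, int numBuckets) {
    return (hash & Integer.MAX_VALUE) % numBuckets;
  }

  public static void main(String[] args) {
    String key = "some-bucketing-key";
    int javaBucket = bucket(key.hashCode(), 32);
    int murmurBucket = bucket(
        Hashing.murmur3_32().hashString(key, StandardCharsets.UTF_8).asInt(), 32);
    // The two schemes generally disagree on bucket assignment.
    System.out.printf("java=%d murmur=%d%n", javaBucket, murmurBucket);
  }
}
{code}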





[jira] [Commented] (HIVE-18525) Add explain plan to Hive on Spark Web UI

2018-03-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408965#comment-16408965
 ] 

Xuefu Zhang commented on HIVE-18525:


The feature looks good and useful to me. I don't think generating the explain 
plan costs too much. However, you might want to add some perf logs so it can be 
measured. Given that, I don't think we need to make it configurable unless the 
perf logs later show otherwise.

As a related question, do we show the plan at the job level? That is, show the 
whole query plan for a Spark job. That could be useful too.

> Add explain plan to Hive on Spark Web UI
> 
>
> Key: HIVE-18525
> URL: https://issues.apache.org/jira/browse/HIVE-18525
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18525.1.patch, HIVE-18525.2.patch, 
> Job-Page-Collapsed.png, Job-Page-Expanded.png, Map-Explain-Plan.png, 
> Reduce-Explain-Plan.png
>
>
> More of an investigation JIRA. The Spark UI has a "long description" of each 
> stage in the Spark DAG. Typically one stage in the Spark DAG corresponds to 
> either a {{MapWork}} or {{ReduceWork}} object. It would be useful if the long 
> description contained the explain plan of the corresponding work object.
> I'm not sure how much additional overhead this would introduce. If not the 
> full explain plan, then maybe a condensed one that just lists the 
> operator tree along with each operator's name.
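
One conceivable hook for this, offered as a sketch rather than what the patch 
implements: Spark copies the submitting thread's "spark.job.description" local 
property into the UI, so Hive could set it per work object. getExplainSummary() 
below is a hypothetical helper standing in for Hive's explain output.

{code}
// Hypothetical sketch: attach a per-work plan summary to the Spark UI via the
// "spark.job.description" local property before submitting that work's job.
import org.apache.spark.api.java.JavaSparkContext;

public class SparkUiDescriptionSketch {
  static void submitWork(JavaSparkContext jsc, Object mapOrReduceWork) {
    jsc.setLocalProperty("spark.job.description", getExplainSummary(mapOrReduceWork));
    try {
      // ... submit the RDD actions for this MapWork/ReduceWork ...
    } finally {
      jsc.setLocalProperty("spark.job.description", null);  // don't leak to later jobs
    }
  }

  private static String getExplainSummary(Object work) {
    return "TS -> FIL -> SEL -> RS";  // placeholder operator chain, not real output
  }
}
{code}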





[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-21 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408964#comment-16408964
 ] 

Vihang Karajgaonkar commented on HIVE-17843:


+1 patch looks good to me.

> UINT32 Parquet columns are handled as signed INT32-s, silently reading 
> incorrect data
> -
>
> Key: HIVE-17843
> URL: https://issues.apache.org/jira/browse/HIVE-17843
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Ivanfi
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-17843.1.patch, HIVE-17843.1.patch, 
> HIVE-17843.2.patch, HIVE-17843.3.patch, HIVE-17843.4.patch
>
>
> An unsigned 32 bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.
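
The fix direction implied by the report is a widening read. A minimal sketch of 
the signed-vs-unsigned distinction (not the patch itself):

{code}
// A UINT_32 value must be zero-extended into a Java long; copying the raw
// 32-bit pattern into a signed int sign-extends it and silently corrupts
// every value above Integer.MAX_VALUE.
public class Uint32Sketch {
  public static void main(String[] args) {
    int raw = 0xFFFFFFFF;                 // stored bits for UINT_32 value 4294967295
    long asSigned = raw;                  // sign-extended: -1 (the reported behavior)
    long asUnsigned = raw & 0xFFFFFFFFL;  // zero-extended: 4294967295 (correct)
    System.out.println(asSigned + " vs " + asUnsigned);
  }
}
{code}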





[jira] [Commented] (HIVE-18991) Drop database cascade doesn't work with materialized views

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408955#comment-16408955
 ] 

Jesus Camacho Rodriguez commented on HIVE-18991:


[~alangates], could you take a look at the patch? While adding tests, I 
realized that the code I had added was not being hit because the tables 
were dropped individually from the client side. I have removed that code in 
{{HiveMetaStoreClient}} because AFAIK it does not make sense to make multiple 
roundtrips to the metastore; it is better to do all the work once the 
request reaches the metastore server. But maybe I am missing something? Thanks

> Drop database cascade doesn't work with materialized views
> --
>
> Key: HIVE-18991
> URL: https://issues.apache.org/jira/browse/HIVE-18991
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18991.patch
>
>
> Create a database, add a table and then a materialized view that depends on 
> the table.  Then drop the database with cascade set.  Sometimes this will 
> fail because when HiveMetaStore.drop_database_core goes to drop all of the 
> tables it may drop the base table before the materialized view, which will 
> cause an integrity constraint violation in the RDBMS.  To resolve this that 
> method should change to fetch and drop materialized views before tables.
> cc [~jcamachorodriguez]
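
A hedged sketch of the ordering described above; Table and TableType are 
metastore API classes, while the partitioning helper itself is illustrative, 
not the patch:

{code}
// Drop materialized views before base tables so the metastore RDBMS never
// sees a dangling dependency during a cascaded database drop.
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.metastore.TableType;
import org.apache.hadoop.hive.metastore.api.Table;

class DropDatabaseOrdering {
  static List<Table> dropOrder(List<Table> tables) {
    List<Table> ordered = new ArrayList<>();
    for (Table t : tables) {  // materialized views first
      if (TableType.MATERIALIZED_VIEW.name().equals(t.getTableType())) {
        ordered.add(t);
      }
    }
    for (Table t : tables) {  // then base tables and everything else
      if (!TableType.MATERIALIZED_VIEW.name().equals(t.getTableType())) {
        ordered.add(t);
      }
    }
    return ordered;
  }
}
{code}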





[jira] [Updated] (HIVE-18991) Drop database cascade doesn't work with materialized views

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18991:
---
Attachment: HIVE-18991.patch

> Drop database cascade doesn't work with materialized views
> --
>
> Key: HIVE-18991
> URL: https://issues.apache.org/jira/browse/HIVE-18991
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18991.patch
>
>
> Create a database, add a table and then a materialized view that depends on 
> the table.  Then drop the database with cascade set.  Sometimes this will 
> fail because when HiveMetaStore.drop_database_core goes to drop all of the 
> tables it may drop the base table before the materialized view, which will 
> cause an integrity constraint violation in the RDBMS.  To resolve this that 
> method should change to fetch and drop materialized views before tables.
> cc [~jcamachorodriguez]





[jira] [Updated] (HIVE-18991) Drop database cascade doesn't work with materialized views

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18991:
---
Status: Patch Available  (was: In Progress)

> Drop database cascade doesn't work with materialized views
> --
>
> Key: HIVE-18991
> URL: https://issues.apache.org/jira/browse/HIVE-18991
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Create a database, add a table and then a materialized view that depends on 
> the table.  Then drop the database with cascade set.  Sometimes this will 
> fail because when HiveMetaStore.drop_database_core goes to drop all of the 
> tables it may drop the base table before the materialized view, which will 
> cause an integrity constraint violation in the RDBMS.  To resolve this that 
> method should change to fetch and drop materialized views before tables.
> cc [~jcamachorodriguez]





[jira] [Comment Edited] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408944#comment-16408944
 ] 

Sergey Shelukhin edited comment on HIVE-19021 at 3/22/18 2:31 AM:
--

-I still need to test this a bit on real cluster- nm, just saw it works. It just 
probably doesn't get the counters for tasks that don't run.

Actually, it looks like runtimes are not accounted for correctly either, probably 
because TezCounters are collected before we update them. We need to update them 
earlier... Queued counters are ok.


was (Author: sershe):
-I still need to test this a bit on real cluster- nm, just saw it works. Just 
probably doesn't get the counters for tasks that don't run
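
The collect-before-update ordering described above generalizes; a generic 
sketch with illustrative names only (nothing here is from the Hive/Tez source):

{code}
// If the counter snapshot is collected before the runtime counter is updated,
// the AM receives a stale value; updating before collecting fixes it.
class CounterOrderingSketch {
  long runtimeNs;

  long snapshotBuggy(long startNs, long endNs) {
    long snapshot = runtimeNs;    // collected first: stale
    runtimeNs = endNs - startNs;  // updated too late to be shipped
    return snapshot;
  }

  long snapshotFixed(long startNs, long endNs) {
    runtimeNs = endNs - startNs;  // update before collection
    return runtimeNs;
  }
}
{code}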

> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19021.patch
>
>






[jira] [Comment Edited] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408944#comment-16408944
 ] 

Sergey Shelukhin edited comment on HIVE-19021 at 3/22/18 2:30 AM:
--

-I still need to test this a bit on real cluster- nm, just saw it works. Just 
probably doesn't get the counters for tasks that don't run


was (Author: sershe):
I still need to test this a bit on real cluster

> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19021.patch
>
>






[jira] [Commented] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408944#comment-16408944
 ] 

Sergey Shelukhin commented on HIVE-19021:
-

I still need to test this a bit on real cluster

> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19021.patch
>
>






[jira] [Updated] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19021:

Status: Patch Available  (was: Open)

[~sseth] can you look at this one? It's not a very large patch.
I wonder if it's actually possible to get TezCounters before RunningTask is 
created. This is rather obscure and I doubt anyone else would know this.

> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19021.patch
>
>






[jira] [Updated] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19021:

Attachment: HIVE-19021.patch

> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19021.patch
>
>






[jira] [Commented] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408937#comment-16408937
 ] 

Sahil Takiar commented on HIVE-18831:
-

[~lirui] can you take a look? Somewhat related to HIVE-15237

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.
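
One hedged way to draw that line, not necessarily the patch's approach: walk the 
propagated cause chain and look for the marker Spark's scheduler puts in 
task-failure messages.

{code}
// Sketch: classify a propagated failure as task-level if any cause carries
// Spark's "Job aborted due to stage failure" marker; otherwise treat it as an
// HS2 / Remote Spark Context error. A heuristic for illustration only.
class SparkFailureClassifier {
  static boolean isTaskFailure(Throwable error) {
    for (Throwable t = error; t != null; t = t.getCause()) {
      String msg = t.getMessage();
      if (msg != null && msg.contains("due to stage failure")) {
        return true;
      }
    }
    return false;
  }
}
{code}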





[jira] [Comment Edited] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1631#comment-1631
 ] 

Sahil Takiar edited comment on HIVE-18831 at 3/22/18 2:09 AM:
--

Before this patch the console output would look like:

{code}
Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: [Error 
20003]: An error occurred when trying to close the Operator running your custom 
script.
FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during 
runtime. Please check stacktrace for the root cause.
{code}

Now it looks like:

{code}
FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed due to Spark 
task failures: Job failed with 
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error 
occurred when trying to close the Operator running your custom script.
{code}

So this change just combined these two lines and cleaned up the error message a 
bit.

Other changes:
* Did the same thing for Spark job failures
* Found a way to differentiate between Spark task failures and Spark job 
failures
* Added some unit tests


was (Author: stakiar):
Before this patch the console output would look like:

{code}
Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: [Error 
20003]: An error occurred when trying to close the Operator running your custom 
script.
FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during 
runtime. Please check stacktrace for the root cause.
{code}

Now it looks like:

{code}
FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed due to Spark 
task failures: Job failed with 
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error 
occurred when trying to close the Operator running your custom script.
{code}

So pretty much just combined these two lines and cleaned up the error message a 
bit.

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.





[jira] [Assigned] (HIVE-19021) WM counters are not properly propagated from LLAP to AM

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19021:
---


> WM counters are not properly propagated from LLAP to AM
> ---
>
> Key: HIVE-19021
> URL: https://issues.apache.org/jira/browse/HIVE-19021
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-03-21 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408914#comment-16408914
 ] 

Vihang Karajgaonkar commented on HIVE-17684:


The precommit job 
https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/9735/ for 
this patch was stuck forever. I am not sure if it was just a job setup issue or 
something to do with the patch. I had to kill the job after it ran for more 
than 5 hrs since it was holding up the queue for other patches. Can you please 
confirm whether this is related to the patch, and if not, please resubmit the 
patch.

Thanks,
Vihang

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes both reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
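
A minimal sketch of the check described above, assuming the two configs map 
directly to a threshold fraction; the intermittent failures follow from 
getUsed() counting garbage that a GC would reclaim.

{code}
// The handler's logic boiled down: compare used/max across the whole heap and
// abort when the fraction crosses the configured threshold. getUsed() includes
// unreachable objects, so on a shared Spark executor the check can trip even
// though a GC would free plenty of space.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class MapJoinMemoryCheckSketch {
  private static final MemoryMXBean MEMORY_BEAN = ManagementFactory.getMemoryMXBean();

  static void checkMemoryStatus(double maxMemoryUsage, long numRows) {
    MemoryUsage heap = MEMORY_BEAN.getHeapMemoryUsage();
    double percentage = (double) heap.getUsed() / heap.getMax();
    if (percentage > maxMemoryUsage) {
      // Hive throws MapJoinMemoryExhaustionError here; OutOfMemoryError stands
      // in so this sketch compiles without Hive on the classpath.
      throw new OutOfMemoryError(String.format(
          "hash table has %d rows, heap usage %.2f over threshold %.2f",
          numRows, percentage, maxMemoryUsage));
    }
  }
}
{code}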





[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
{noformat}

The complex types in VectorExpressionWriterFactory are not fully implemented.

Also, null_cast.q, nullMap.q, and nested_column_pruning.q

  was:
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 

[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
{noformat}

The complex types in VectorExpressionWriterFactory are not fully implemented.

Also, null_cast.q and nullMap.q

  was:
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
{noformat}

[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
{noformat}

The complex types in VectorExpressionWriterFactory are not fully implemented.

FYI: [~vihangk1]


Also, null_cast.q and nullMap.q

  was:
Adding "SET hive.vectorized.execution.enabled=true;" to 
parquet_schema_evolution.q triggers this call stack:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
{noformat}

[jira] [Assigned] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-03-21 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-19016:
--

Assignee: Vihang Karajgaonkar

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_nested_complex.q triggers this call stack:
> {noformat}
> Caused by: java.lang.RuntimeException: Unsupported type used in list:array
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19015) Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException

2018-03-21 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-19015:
--

Assignee: Vihang Karajgaonkar

> Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q 
> gets a ClassCastException
> -
>
> Key: HIVE-19015
> URL: https://issues.apache.org/jira/browse/HIVE-19015
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;"  to 
> parquet_map_of_arrays_of_ints.q triggers this call stack:
> {noformat}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]
> Adding parquet_map_of_maps.q, too.  Stack trace seems related.
> {noformat}
> Caused by: java.lang.ClassCastException: optional group value (MAP) {
>   repeated group key_value {
> optional binary key (UTF8);
> required int32 value;
>   }
> } is not primitive
>   at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) ~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.<init>(BaseVectorizedColumnReader.java:130) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.<init>(VectorizedListColumnReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
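
The cast in VectorizedListColumnReader.readBatch assumes list elements are primitive. A sketch of the missing guard, purely for illustration (the eventual fix may instead add real nested-type support):

{code}
import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

// Hypothetical helper: fail with a clear message (or route to a nested-type
// reader) instead of letting the ClassCastException escape.
public final class ListElementGuard {
  private ListElementGuard() {}

  static PrimitiveTypeInfo elementAsPrimitive(TypeInfo elementType) {
    if (!(elementType instanceof PrimitiveTypeInfo)) {
      throw new UnsupportedOperationException(
          "Vectorized Parquet reader does not support nested element type: "
              + elementType.getTypeName());
    }
    return (PrimitiveTypeInfo) elementType;
  }
}
{code}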



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19020) Vectorization: When vectorized, orc_null_check.q throws NPE in VectorExpressionWriterFactory

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19020:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;" to orc_null_check.q 
triggers this call stack:

{noformat}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.setValue(VectorExpressionWriterFactory.java:1465) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.writeValue(VectorExpressionWriterFactory.java:1453) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:813) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:846) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}

  was:
Adding "SET hive.vectorized.execution.enabled=true;" to orc_null_check.q 
triggers this call stack:

{noformat}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.setValue(VectorExpressionWriterFactory.java:1465) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.writeValue(VectorExpressionWriterFactory.java:1453) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:813) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:846) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}

Also, null_cast.q


> Vectorization: When vectorized, orc_null_check.q throws NPE in 
> VectorExpressionWriterFactory
> 

[jira] [Updated] (HIVE-19011) Druid Storage Handler returns conflicting results for Qtest druidmini_dynamic_partition.q

2018-03-21 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-19011:
--
Priority: Blocker  (was: Major)

> Druid Storage Handler returns conflicting results for Qtest 
> druidmini_dynamic_partition.q
> -
>
> Key: HIVE-19011
> URL: https://issues.apache.org/jira/browse/HIVE-19011
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Blocker
>
> This git diff shows the conflicting results
> {code}
> diff --git a/ql/src/test/results/clientpositive/druid/druidmini_dynamic_partition.q.out b/ql/src/test/results/clientpositive/druid/druidmini_dynamic_partition.q.out
> index 714778ebfc..cea9b7535c 100644
> --- a/ql/src/test/results/clientpositive/druid/druidmini_dynamic_partition.q.out
> +++ b/ql/src/test/results/clientpositive/druid/druidmini_dynamic_partition.q.out
> @@ -243,7 +243,7 @@ POSTHOOK: query: SELECT  sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM
>  POSTHOOK: type: QUERY
>  POSTHOOK: Input: default@druid_partitioned_table
>  POSTHOOK: Output: hdfs://### HDFS PATH ###
> -1408069801800  4139540644  10992545287 165393120
> +1408069801800  3272553822  10992545287 -648527473
>  PREHOOK: query: SELECT  sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM druid_partitioned_table_0
>  PREHOOK: type: QUERY
>  PREHOOK: Input: default@druid_partitioned_table_0
> @@ -429,7 +429,7 @@ POSTHOOK: query: SELECT sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM d
>  POSTHOOK: type: QUERY
>  POSTHOOK: Input: default@druid_partitioned_table
>  POSTHOOK: Output: hdfs://### HDFS PATH ###
> -2857395071862  4139540644  -1661313883124  885815256
> +2857395071862  3728054572  -1661313883124  71894663
>  PREHOOK: query: EXPLAIN INSERT OVERWRITE TABLE druid_partitioned_table
>    SELECT cast (`ctimestamp1` as timestamp with local time zone) as `__time`,
>      cstring1,
> @@ -566,7 +566,7 @@ POSTHOOK: query: SELECT sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM d
>  POSTHOOK: type: QUERY
>  POSTHOOK: Input: default@druid_partitioned_table
>  POSTHOOK: Output: hdfs://### HDFS PATH ###
> -1408069801800  7115092987  10992545287 1232243564
> +1408069801800  4584782821  10992545287 -1808876374
>  PREHOOK: query: SELECT  sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM druid_partitioned_table_0
>  PREHOOK: type: QUERY
>  PREHOOK: Input: default@druid_partitioned_table_0
> @@ -659,7 +659,7 @@ POSTHOOK: query: SELECT sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM d
>  POSTHOOK: type: QUERY
>  POSTHOOK: Input: default@druid_partitioned_table
>  POSTHOOK: Output: hdfs://### HDFS PATH ###
> -1408069801800  7115092987  10992545287 1232243564
> +1408069801800  4584782821  10992545287 -1808876374
>  PREHOOK: query: EXPLAIN SELECT  sum(cint), max(cbigint),  sum(cbigint), max(cint)  FROM druid_max_size_partition
>  PREHOOK: type: QUERY
>  POSTHOOK: query: EXPLAIN SELECT  sum(cint), max(cbigint),  sum(cbigint), max(cint)  FROM druid_max_size_partition
> @@ -758,7 +758,7 @@ POSTHOOK: query: SELECT sum(cint), max(cbigint),  sum(cbigint), max(cint) FROM d
>  POSTHOOK: type: QUERY
>  POSTHOOK: Input: default@druid_partitioned_table
>  POSTHOOK: Output: hdfs://### HDFS PATH ###
> -1408069801800  7115092987  10992545287 1232243564
> +1408069801800  4584782821  10992545287 -1808876374
>  PREHOOK: query: DROP TABLE druid_partitioned_table_0
>  PREHOOK: type: DROPTABLE
>  PREHOOK: Input: default@druid_partitioned_table_0
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408893#comment-16408893
 ] 

Ashutosh Chauhan commented on HIVE-18953:
-

+1 pending tests

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch, HIVE-18953.6.patch
>
>
> Implement column level CHECK constraint
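
For context, a column-level CHECK constraint of the kind proposed here might be exercised like this (a hedged sketch over JDBC; the URL is a placeholder and the final Hive syntax may differ):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CheckConstraintExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // Constrain a single column at table-creation time.
      stmt.execute("CREATE TABLE emp (id INT, age INT CHECK (age BETWEEN 0 AND 150))");
      stmt.execute("INSERT INTO emp VALUES (1, 42)");  // satisfies the constraint
      // "INSERT INTO emp VALUES (2, 200)" should be rejected once enforcement lands.
    }
  }
}
{code}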



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19020) Vectorization: When vectorized, orc_null_check.q throws NPE in VectorExpressionWriterFactory

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19020:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;" to orc_null_check.q 
triggers this call stack:

{noformat}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.setValue(VectorExpressionWriterFactory.java:1465) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.writeValue(VectorExpressionWriterFactory.java:1453) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:813) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:846) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}

Also, null_cast.q

  was:
Adding "SET hive.vectorized.execution.enabled=true;" to orc_null_check.q 
triggers this call stack:

{noformat}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.setValue(VectorExpressionWriterFactory.java:1465) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.writeValue(VectorExpressionWriterFactory.java:1453) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:813) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:846) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}


> Vectorization: When vectorized, orc_null_check.q throws NPE in 
> VectorExpressionWriterFactory
> 

[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Summary: Vectorization and Parquet: When vectorized, 
parquet_schema_evolution.q throws HiveException "Not implemented yet" from 
VectorExpressionWriterMap  (was: Vectorization and Parquet: When vectorized, 
parquet_schema_evolution.q throws HiveException "Not implemented yet")

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q throws 
> HiveException "Not implemented yet" from VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> FYI: [~vihangk1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19019:

Summary: Vectorization and Parquet: When vectorized, 
parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException 
"Not implemented yet" from VectorExpressionWriterMap  (was: Vectorization and 
Parquet: When vectorized, parquet_schema_evolution.q throws HiveException "Not 
implemented yet" from VectorExpressionWriterMap)

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and 
> orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from 
> VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> FYI: [~vihangk1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19019) Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from VectorExpressionWriterMap

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-19019:
---

Assignee: Matt McCline

> Vectorization and Parquet: When vectorized, parquet_schema_evolution.q and 
> orc_merge_incompat_schema.q throws HiveException "Not implemented yet" from 
> VectorExpressionWriterMap
> 
>
> Key: HIVE-19019
> URL: https://issues.apache.org/jira/browse/HIVE-19019
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to 
> parquet_schema_evolution.q triggers this call stack:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Not implemented yet
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$19.writeValue(VectorExpressionWriterFactory.java:1496) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.flushDeserializerBatch(VectorMapOperator.java:630) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:698) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1210) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:829) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
> {noformat}
> The complex types in VectorExpressionWriterFactory are not fully implemented.
> FYI: [~vihangk1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19020) Vectorization: When vectorized, orc_null_check.q throws NPE in VectorExpressionWriterFactory

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-19020:
---


> Vectorization: When vectorized, orc_null_check.q throws NPE in 
> VectorExpressionWriterFactory
> 
>
> Key: HIVE-19020
> URL: https://issues.apache.org/jira/browse/HIVE-19020
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to orc_null_check.q 
> triggers this call stack:
> {noformat}
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.setValue(VectorExpressionWriterFactory.java:1465) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$18.writeValue(VectorExpressionWriterFactory.java:1453) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:199) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:151) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:955) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:813) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:846) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2018-03-21 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19018:

Status: Patch Available  (was: Open)

> beeline -e now requires semicolon even when used with query from command line
> -
>
> Key: HIVE-19018
> URL: https://issues.apache.org/jira/browse/HIVE-19018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19018.1.patch
>
>
> Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, 
> beeline console will wait for you to enter ';'. It's a regression from the 
> old behavior. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2018-03-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408840#comment-16408840
 ] 

Aihua Xu commented on HIVE-19018:
-

patch-1: adds setAllowMultiLineCommand(false) for the "-e" option, since 
multi-line commands are not allowed in that case. Beeline then processes the 
whole line as a complete command instead of waiting for additional input.
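
A self-contained model of the idea, with hypothetical names (the real patch touches BeeLine's command handling, not this toy class):

{code}
// Toy model of the -e fix, for illustration only.
public class MiniShell {
  private boolean allowMultiLineCommand = true;

  void setAllowMultiLineCommand(boolean allow) {
    this.allowMultiLineCommand = allow;
  }

  /** Returns true when the buffered input forms a complete command. */
  boolean isComplete(String buffered) {
    // With multi-line disabled (the -e case) any line is complete as-is;
    // interactively we keep reading until a terminating ';' shows up.
    return !allowMultiLineCommand || buffered.trim().endsWith(";");
  }

  public static void main(String[] args) {
    MiniShell shell = new MiniShell();
    shell.setAllowMultiLineCommand(false);             // the beeline -e path
    System.out.println(shell.isComplete("select 3"));  // true: no ';' needed
  }
}
{code}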

> beeline -e now requires semicolon even when used with query from command line
> -
>
> Key: HIVE-19018
> URL: https://issues.apache.org/jira/browse/HIVE-19018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19018.1.patch
>
>
> Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, 
> beeline console will wait for you to enter ';'. It's a regression from the 
> old behavior. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2018-03-21 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19018:

Attachment: HIVE-19018.1.patch

> beeline -e now requires semicolon even when used with query from command line
> -
>
> Key: HIVE-19018
> URL: https://issues.apache.org/jira/browse/HIVE-19018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19018.1.patch
>
>
> Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, 
> beeline console will wait for you to enter ';'. It's a regression from the 
> old behavior. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2018-03-21 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-19018:
---


> beeline -e now requires semicolon even when used with query from command line
> -
>
> Key: HIVE-19018
> URL: https://issues.apache.org/jira/browse/HIVE-19018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, 
> beeline console will wait for you to enter ';'. It's a regression from the 
> old behavior. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18739) Add support for Export from Acid table

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408830#comment-16408830
 ] 

Sergey Shelukhin commented on HIVE-18739:
-

Left some comments on RB. My main one is: wouldn't it be better to just run a 
compaction and export the resulting base directory? That would also help other 
users.
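
The suggested flow, as a hedged sketch (table name, URL, and path are placeholders; a real job would also wait for the asynchronous compaction to finish before exporting):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CompactThenExport {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // Major compaction squashes the delta directories into a single base...
      stmt.execute("ALTER TABLE acid_tbl COMPACT 'major'");
      // ...after which the export only has to copy base files rather than
      // reconcile deltas.
      stmt.execute("EXPORT TABLE acid_tbl TO '/tmp/acid_tbl_export'");
    }
  }
}
{code}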

> Add support for Export from Acid table
> --
>
> Key: HIVE-18739
> URL: https://issues.apache.org/jira/browse/HIVE-18739
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18739.01.patch, HIVE-18739.04.patch, 
> HIVE-18739.04.patch, HIVE-18739.06.patch, HIVE-18739.08.patch, 
> HIVE-18739.09.patch, HIVE-18739.10.patch, HIVE-18739.11.patch, 
> HIVE-18739.12.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19017) Add util function to determine if 2 ValidWriteIdLists are at the same committed ID

2018-03-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-19017:
--
Status: Patch Available  (was: Open)

> Add util function to determine if 2 ValidWriteIdLists are at the same 
> committed ID
> --
>
> Key: HIVE-19017
> URL: https://issues.apache.org/jira/browse/HIVE-19017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19017.1.patch
>
>
> May be useful for the materialized view/results cache work, since this could 
> be used to determine whether a table has changed between when the 
> materialization was generated and when a query tries to use it.
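
A sketch of what such a utility might check (this assumes accessors like getHighWatermark() and getInvalidWriteIds() on ValidWriteIdList; the committed helper may differ):

{code}
import java.util.Arrays;
import org.apache.hadoop.hive.common.ValidWriteIdList;

public final class WriteIdListUtils {
  private WriteIdListUtils() {}

  /**
   * Two snapshots see the same committed state for a table when they agree on
   * the high watermark and on the set of write ids below it that are not yet
   * committed (open or aborted).
   */
  public static boolean isSameCommittedState(ValidWriteIdList a, ValidWriteIdList b) {
    return a.getHighWatermark() == b.getHighWatermark()
        && Arrays.equals(a.getInvalidWriteIds(), b.getInvalidWriteIds());
  }
}
{code}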



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18525) Add explain plan to Hive on Spark Web UI

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408816#comment-16408816
 ] 

Sahil Takiar commented on HIVE-18525:
-

[~xuefuz], [~aihuaxu], [~belugabehr] any thoughts on this feature? I've 
attached multiple screenshots to show what the Web UI looks like.

Some things to note:
* Not sure whether this feature should be configurable, nor whether it should 
be on or off by default
* This change is mainly geared toward users who use the Spark Web UI to 
visualize / debug their queries
* Spark SQL has a similar feature, so ideally the Web UI should be able to 
handle the extra per-query info
* The patch is still a WIP; I still need to profile how long generating the 
explain plan takes (the sketch below shows the Spark hook involved)
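
For reference, the hook Spark exposes for this is the per-thread "spark.job.description" local property, which feeds the long description shown for each stage. A minimal sketch (the plan string here is a placeholder for a rendered MapWork/ReduceWork explain plan):

{code}
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ExplainPlanInUi {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext(
        new SparkConf().setAppName("hos-ui-demo").setMaster("local[2]"));
    // Jobs submitted after this call show the given text as their description.
    jsc.setLocalProperty("spark.job.description", "Map 1: TS -> FIL -> SEL -> RS");
    jsc.parallelize(Arrays.asList(1, 2, 3)).count();  // appears in the UI with that text
    jsc.stop();
  }
}
{code}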

> Add explain plan to Hive on Spark Web UI
> 
>
> Key: HIVE-18525
> URL: https://issues.apache.org/jira/browse/HIVE-18525
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18525.1.patch, HIVE-18525.2.patch, 
> Job-Page-Collapsed.png, Job-Page-Expanded.png, Map-Explain-Plan.png, 
> Reduce-Explain-Plan.png
>
>
> More of an investigation JIRA. The Spark UI has a "long description" of each 
> stage in the Spark DAG. Typically one stage in the Spark DAG corresponds to 
> either a {{MapWork}} or {{ReduceWork}} object. It would be useful if the long 
> description contained the explain plan of the corresponding work object.
> I'm not sure how much additional overhead this would introduce. If not the 
> full explain plan, then maybe a modified one that just lists out the 
> operator tree along with each operator name.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19017) Add util function to determine if 2 ValidWriteIdLists are at the same committed ID

2018-03-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-19017:
--
Attachment: HIVE-19017.1.patch

> Add util function to determine if 2 ValidWriteIdLists are at the same 
> committed ID
> --
>
> Key: HIVE-19017
> URL: https://issues.apache.org/jira/browse/HIVE-19017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19017.1.patch
>
>
> May be useful for the materialized view/results cache work, since this could 
> be used to determine whether a table has changed between when the 
> materialization was generated and when a query tries to use it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19017) Add util function to determine if 2 ValidWriteIdLists are at the same committed ID

2018-03-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-19017:
-


> Add util function to determine if 2 ValidWriteIdLists are at the same 
> committed ID
> --
>
> Key: HIVE-19017
> URL: https://issues.apache.org/jira/browse/HIVE-19017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
>
> May be useful for the materialized view/results cache work, since this could 
> be used to determine whether a table has changed between when the 
> materialization was generated and when a query tries to use it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Status: Patch Available  (was: Open)

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch, HIVE-18953.6.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Status: Open  (was: Patch Available)

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch, HIVE-18953.6.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Attachment: HIVE-18953.6.patch

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch, HIVE-18953.6.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18755) Modifications to the metastore for catalogs

2018-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408812#comment-16408812
 ] 

Alan Gates commented on HIVE-18755:
---

Peter, I think I understood your intent from the first question, which I 
believe was: "why not always handle the missing catalog on the server side?" 
My answer is that I am always handling the missing catalog on the server side, 
so that old clients stay compatible. But I am also handling it on the client 
side (like my change in DDLTask.java) so that if the metastore server is set 
up to talk to one catalog but users want their clients to default to another, 
they can do so without changing every client call to the new catalog-aware 
methods. For example, say someone has a Hive CLI client and wants to default 
to 'mycatalog' instead of whatever their metastore server is set to. They can 
set that value in their config file and it will still work, even though I 
haven't changed every call to getTable etc. inside Hive.
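
Concretely, the client-side override looks something like this (the config key name is assumed from this patch series and may differ in the final version):

{code}
import org.apache.hadoop.conf.Configuration;

public class CatalogDefaultExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The server still resolves requests that carry no catalog, so old clients
    // keep working; a new client can opt into a different default without
    // changing each call site.
    conf.set("metastore.catalog.default", "mycatalog");  // assumed key
    // Non-catalog-aware calls (getTable(db, tbl), etc.) then resolve against
    // "mycatalog" instead of the server's default.
  }
}
{code}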

> Modifications to the metastore for catalogs
> ---
>
> Key: HIVE-18755
> URL: https://issues.apache.org/jira/browse/HIVE-18755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18755.nothrift, HIVE-18755.patch
>
>
> Step 1 of adding catalogs is to add support in the metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18093) Improve logging when HoS application is killed

2018-03-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408787#comment-16408787
 ] 

Aihua Xu commented on HIVE-18093:
-

The change looks good to me. +1.

> Improve logging when HoS application is killed
> --
>
> Key: HIVE-18093
> URL: https://issues.apache.org/jira/browse/HIVE-18093
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18093.1.patch, HIVE-18093.2.patch
>
>
> When a HoS job is explicitly killed by a user (via a YARN command), the 
> logs just say "RPC channel closed"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18755) Modifications to the metastore for catalogs

2018-03-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-18755:
--
Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

I have attached two patches.  Again, I apologize for the size.  The nothrift 
one should be easier to review.  The .patch one is the full patch.

I made a change in this patch so that the embedded metastore tests in the 
client use JDO calls, while the remote ones continue to use direct SQL. This 
way both paths are tested, and it resulted in finding and fixing several 
unrelated errors.

In addition to testing existing calls and new catalog calls, I added tests for 
the case where an old client (that doesn't know about catalogs) is used, both 
with the catalog set to the default and with the server configured for a 
different catalog.  I also added a test for the case where the server is 
configured for the default catalog but the client overrides that in its 
configuration to use a different catalog as default.

> Modifications to the metastore for catalogs
> ---
>
> Key: HIVE-18755
> URL: https://issues.apache.org/jira/browse/HIVE-18755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18755.nothrift, HIVE-18755.patch
>
>
> Step 1 of adding catalogs is to add support in the metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18525) Add explain plan to Hive on Spark Web UI

2018-03-21 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18525:

Attachment: Job-Page-Expanded.png
Job-Page-Collapsed.png

> Add explain plan to Hive on Spark Web UI
> 
>
> Key: HIVE-18525
> URL: https://issues.apache.org/jira/browse/HIVE-18525
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18525.1.patch, HIVE-18525.2.patch, 
> Job-Page-Collapsed.png, Job-Page-Expanded.png, Map-Explain-Plan.png, 
> Reduce-Explain-Plan.png
>
>
> More of an investigation JIRA. The Spark UI has a "long description" of each 
> stage in the Spark DAG. Typically one stage in the Spark DAG corresponds to 
> either a {{MapWork}} or {{ReduceWork}} object. It would be useful if the long 
> description contained the explain plan of the corresponding work object.
> I'm not sure how much additional overhead this would introduce. If not the 
> full explain plan, then maybe a modified one that just lists out the 
> operator tree along with each operator name.
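
For reference, Spark derives a stage's "long description" from the job's call 
site, which can be set explicitly before work is submitted. A rough sketch of 
that mechanism (whether the patch wires it in this way is an open question; 
explainPlan is an assumed helper that renders the work's operator tree):

{code:java}
// Set the call site before submitting the Spark job for this MapWork/ReduceWork,
// so the Spark UI shows the plan summary as the stage's long description.
String planSummary = explainPlan(work);     // assumed helper
javaSparkContext.setCallSite(planSummary);  // JavaSparkContext#setCallSite(String)
{code}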



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18755) Modifications to the metastore for catalogs

2018-03-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-18755:
--
Attachment: HIVE-18755.patch

> Modifications to the metastore for catalogs
> ---
>
> Key: HIVE-18755
> URL: https://issues.apache.org/jira/browse/HIVE-18755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18755.nothrift, HIVE-18755.patch
>
>
> Step 1 of adding catalogs is to add support in the metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18755) Modifications to the metastore for catalogs

2018-03-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-18755:
--
Attachment: HIVE-18755.nothrift

> Modifications to the metastore for catalogs
> ---
>
> Key: HIVE-18755
> URL: https://issues.apache.org/jira/browse/HIVE-18755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18755.nothrift, HIVE-18755.patch
>
>
> Step 1 of adding catalogs is to add support in the metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19015) Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19015:

Description: 
Adding "SET hive.vectorized.execution.enabled=true;"  to 
parquet_map_of_arrays_of_ints.q triggers this call stack:

{noformat}
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to 
org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}

FYI: [~vihangk1]

Adding parquet_map_of_maps.q, too.  Stack trace seems related.

{noformat}
Caused by: java.lang.ClassCastException: optional group value (MAP) {
  repeated group key_value {
optional binary key (UTF8);
required int32 value;
  }
} is not primitive
at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) 
~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.(BaseVectorizedColumnReader.java:130)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.(VectorizedListColumnReader.java:52)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}
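
The first trace shows readBatch() casting the list's element TypeInfo straight 
to PrimitiveTypeInfo, which cannot hold when the map's values are arrays. One 
hedged direction for a fix is to branch on the element type instead of casting 
unconditionally (readPrimitiveBatch and readNestedBatch are hypothetical names 
for the two paths):

{code:java}
TypeInfo elementType = listTypeInfo.getListElementTypeInfo();
if (elementType instanceof PrimitiveTypeInfo) {
  // existing fast path: list of primitives
  readPrimitiveBatch((PrimitiveTypeInfo) elementType, batch);
} else {
  // nested element (list/map/struct) needs a child reader, not a primitive cast
  readNestedBatch(elementType, batch);
}
{code}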

  was:
Adding "SET hive.vectorized.execution.enabled=true;"  to 
parquet_map_of_arrays_of_ints.q triggers this call stack:

{noformat}
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to 
org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}

FYI: [~vihangk1]


> Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q 
> gets a ClassCastException
> 

[jira] [Updated] (HIVE-19003) metastoreconf logs too much on info level

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19003:

Attachment: HIVE-19003.patch

> metastoreconf logs too much on info level
> -
>
> Key: HIVE-19003
> URL: https://issues.apache.org/jira/browse/HIVE-19003
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19003.patch
>
>
> {noformat}
> 2018-03-20T17:49:33,107  INFO [main] conf.MetastoreConf: MetastoreConf object:
> Used hive-site file:...
> Used hivemetastore-site file: ...
> Key:  old hive key:   value: 
> <>
> Key:  old hive key: 
>   value: <0.8>
> Key:  old hive key: 
>   value: 
> Key:  old hive key: 
>   value: <0.01>
> Key:  old hive key: 
>   value: <0.9>
> ...  the entire config.
> {noformat}
> Is it possible to remove this logging or reduce it to trace, or at least 
> debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19003) metastoreconf logs too much on info level

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19003:

Status: Patch Available  (was: Open)

[~alangates] can you take a look?

> metastoreconf logs too much on info level
> -
>
> Key: HIVE-19003
> URL: https://issues.apache.org/jira/browse/HIVE-19003
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19003.patch
>
>
> {noformat}
> 2018-03-20T17:49:33,107  INFO [main] conf.MetastoreConf: MetastoreConf object:
> Used hive-site file:...
> Used hivemetastore-site file: ...
> Key:  old hive key:   value: 
> <>
> Key:  old hive key: 
>   value: <0.8>
> Key:  old hive key: 
>   value: 
> Key:  old hive key: 
>   value: <0.01>
> Key:  old hive key: 
>   value: <0.9>
> ...  the entire config.
> {noformat}
> Is it possible to remove this logging or reduce it to trace, or at least 
> debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19003) metastoreconf logs too much on info level

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408762#comment-16408762
 ] 

Sergey Shelukhin commented on HIVE-19003:
-

Heh, perfect timing...

> metastoreconf logs too much on info level
> -
>
> Key: HIVE-19003
> URL: https://issues.apache.org/jira/browse/HIVE-19003
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19003.patch
>
>
> {noformat}
> 2018-03-20T17:49:33,107  INFO [main] conf.MetastoreConf: MetastoreConf object:
> Used hive-site file:...
> Used hivemetastore-site file: ...
> Key:  old hive key:   value: 
> <>
> Key:  old hive key: 
>   value: <0.8>
> Key:  old hive key: 
>   value: 
> Key:  old hive key: 
>   value: <0.01>
> Key:  old hive key: 
>   value: <0.9>
> ...  the entire config.
> {noformat}
> Is it possible to remove this logging or reduce it to trace, or at least 
> debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19003) metastoreconf logs too much on info level

2018-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408756#comment-16408756
 ] 

Alan Gates commented on HIVE-19003:
---

+1 for debug
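
At debug level the dump would also normally be guarded so the string building 
is skipped when debug is off; a minimal sketch of the pattern (dumpConfig 
stands in for the existing dump logic):

{code:java}
// Only build and emit the full config dump when debug logging is enabled.
if (LOG.isDebugEnabled()) {
  LOG.debug("MetastoreConf object:\n{}", dumpConfig(conf));
}
{code}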

> metastoreconf logs too much on info level
> -
>
> Key: HIVE-19003
> URL: https://issues.apache.org/jira/browse/HIVE-19003
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> {noformat}
> 2018-03-20T17:49:33,107  INFO [main] conf.MetastoreConf: MetastoreConf object:
> Used hive-site file:...
> Used hivemetastore-site file: ...
> Key:  old hive key:   value: 
> <>
> Key:  old hive key: 
>   value: <0.8>
> Key:  old hive key: 
>   value: 
> Key:  old hive key: 
>   value: <0.01>
> Key:  old hive key: 
>   value: <0.9>
> ...  the entire config.
> {noformat}
> Is it possible to remove this logging or reduce it to trace, or at least 
> debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19003) metastoreconf logs too much on info level

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19003:
---

Assignee: Sergey Shelukhin  (was: Alan Gates)

> metastoreconf logs too much on info level
> -
>
> Key: HIVE-19003
> URL: https://issues.apache.org/jira/browse/HIVE-19003
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> {noformat}
> 2018-03-20T17:49:33,107  INFO [main] conf.MetastoreConf: MetastoreConf object:
> Used hive-site file:...
> Used hivemetastore-site file: ...
> Key:  old hive key:   value: 
> <>
> Key:  old hive key: 
>   value: <0.8>
> Key:  old hive key: 
>   value: 
> Key:  old hive key: 
>   value: <0.01>
> Key:  old hive key: 
>   value: <0.9>
> ...  the entire config.
> {noformat}
> Is it possible to remove this logging or reduce it to trace, or at least 
> debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19014:

Status: Patch Available  (was: Open)

> utilize YARN-8028 (queue ACL check) in Hive Tez session pool
> 
>
> Key: HIVE-19014
> URL: https://issues.apache.org/jira/browse/HIVE-19014
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19014.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19014:

Attachment: HIVE-19014.patch

> utilize YARN-8028 (queue ACL check) in Hive Tez session pool
> 
>
> Key: HIVE-19014
> URL: https://issues.apache.org/jira/browse/HIVE-19014
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19014.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18830) RemoteSparkJobMonitor failures are logged twice

2018-03-21 Thread Bharathkrishna Guruvayoor Murali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-18830:

Status: Patch Available  (was: Open)

Removed the extra logging of the exception message.
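
For clarity, the duplication comes from logging the exception directly and then 
printing it to the console, which also writes to the logs. A minimal sketch of 
the deduplicated form, keeping only the console call:

{code:java}
// Before: the stringified exception reached the logs twice.
// LOG.error(msg, e);
// console.printError(msg, "\n" + org.apache.hadoop.util.StringUtils.stringifyException(e));

// After: console.printError() already writes the stringified exception to the
// logs, so the explicit LOG.error() is redundant and is dropped.
console.printError(msg, "\n" + org.apache.hadoop.util.StringUtils.stringifyException(e));
{code}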

> RemoteSparkJobMonitor failures are logged twice
> ---
>
> Key: HIVE-18830
> URL: https://issues.apache.org/jira/browse/HIVE-18830
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-18830.1.patch
>
>
> If there is an exception in {{RemoteSparkJobMonitor}} while monitoring the 
> remote Spark job, the error is logged twice:
> {code}
> LOG.error(msg, e);
> console.printError(msg, "\n" + 
> org.apache.hadoop.util.StringUtils.stringifyException(e));
> {code}
> {{console#printError}} writes the stringified exception to the logs as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19014:
---


> utilize YARN-8028 (queue ACL check) in Hive Tez session pool
> 
>
> Key: HIVE-19014
> URL: https://issues.apache.org/jira/browse/HIVE-19014
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18830) RemoteSparkJobMonitor failures are logged twice

2018-03-21 Thread Bharathkrishna Guruvayoor Murali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-18830:

Attachment: HIVE-18830.1.patch

> RemoteSparkJobMonitor failures are logged twice
> ---
>
> Key: HIVE-18830
> URL: https://issues.apache.org/jira/browse/HIVE-18830
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-18830.1.patch
>
>
> If there is an exception in {{RemoteSparkJobMonitor}} while monitoring the 
> remote Spark job, the error is logged twice:
> {code}
> LOG.error(msg, e);
> console.printError(msg, "\n" + 
> org.apache.hadoop.util.StringUtils.stringifyException(e));
> {code}
> {{console#printError}} writes the stringified exception to the logs as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18825) Define ValidTxnList before starting query optimization

2018-03-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408730#comment-16408730
 ] 

Eugene Koifman commented on HIVE-18825:
---

+1 patch 6 pending tests

> Define ValidTxnList before starting query optimization
> --
>
> Key: HIVE-18825
> URL: https://issues.apache.org/jira/browse/HIVE-18825
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18825.01.patch, HIVE-18825.02.patch, 
> HIVE-18825.03.patch, HIVE-18825.04.patch, HIVE-18825.05.patch, 
> HIVE-18825.06.patch, HIVE-18825.patch
>
>
> Consider a set of tables used by a materialized view where inserts happened 
> after the materialization was created. To compute incremental view 
> maintenance, we need to be able to filter only new rows from those base 
> tables. That can be done by inserting a filter operator with condition e.g. 
> {{ROW\_\_ID.transactionId < highwatermark and ROW\_\_ID.transactionId NOT 
> IN()}} on top of the MVs query definition and triggering the 
> rewriting (which should in turn produce a partial rewriting). However, to do 
> that, we need to have a value for {{ValidTxnList}} during query compilation 
> so we know the snapshot that we are querying.
> This patch aims to generate {{ValidTxnList}} before query optimization. There 
> should not be any visible changes for end user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18751) ACID table scan through get_splits UDF doesn't receive ValidWriteIdList configuration.

2018-03-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408726#comment-16408726
 ] 

Eugene Koifman commented on HIVE-18751:
---

[~sankarh], should
{noformat}
  // Pass the ValidTxnList and ValidTxnWriteIdList snapshot configurations 
corresponding to the input query
  HiveConf driverConf = driver.getConf();
  String validTxnString = driverConf.get(ValidTxnList.VALID_TXNS_KEY);
  if (validTxnString != null) {
jc.set(ValidTxnList.VALID_TXNS_KEY, validTxnString);
  }
  String validWriteIdString = 
driverConf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY);
  if (validWriteIdString != null) {
jc.set(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY, 
validWriteIdString);
  }
{noformat}
do some sort of check to make sure that this value was set as expected? 

This patch crossed with HIVE-18825, and I think it could have caused a really 
bad bug.
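
For illustration, the kind of sanity check being suggested might look like this 
(a sketch only; whether to warn or fail fast here is the open question):

{code:java}
String validTxnString = driverConf.get(ValidTxnList.VALID_TXNS_KEY);
if (validTxnString == null) {
  // Fail fast rather than silently scanning the ACID table without a snapshot.
  throw new IllegalStateException(ValidTxnList.VALID_TXNS_KEY
      + " is not set; the txn snapshot was not propagated to the query config");
}
jc.set(ValidTxnList.VALID_TXNS_KEY, validTxnString);
{code}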

> ACID table scan through get_splits UDF doesn't receive ValidWriteIdList 
> configuration.
> --
>
> Key: HIVE-18751
> URL: https://issues.apache.org/jira/browse/HIVE-18751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, UDF, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18751.01.patch
>
>
> Per table write ID (HIVE-18192) have replaced global transaction ID with 
> write ID to version data files in ACID/MM tables,
> To ensure snapshot isolation, need to generate ValidWriteIdList for the given 
> txn/table and use it when scan the ACID/MM tables.
> In case of get_splits UDF which runs on ACID table scan query won't receive 
> it properly through configuration (hive.txn.tables.valid.writeids) and hence 
> throws exception. 
> TestAcidOnTez.testGetSplitsLocks is the test failing for the same. Need to 
> fix it.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18825) Define ValidTxnList before starting query optimization

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18825:
---
Attachment: HIVE-18825.06.patch

> Define ValidTxnList before starting query optimization
> --
>
> Key: HIVE-18825
> URL: https://issues.apache.org/jira/browse/HIVE-18825
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18825.01.patch, HIVE-18825.02.patch, 
> HIVE-18825.03.patch, HIVE-18825.04.patch, HIVE-18825.05.patch, 
> HIVE-18825.06.patch, HIVE-18825.patch
>
>
> Consider a set of tables used by a materialized view where inserts happened 
> after the materialization was created. To compute incremental view 
> maintenance, we need to be able to filter only new rows from those base 
> tables. That can be done by inserting a filter operator with condition e.g. 
> {{ROW\_\_ID.transactionId < highwatermark and ROW\_\_ID.transactionId NOT 
> IN()}} on top of the MVs query definition and triggering the 
> rewriting (which should in turn produce a partial rewriting). However, to do 
> that, we need to have a value for {{ValidTxnList}} during query compilation 
> so we know the snapshot that we are querying.
> This patch aims to generate {{ValidTxnList}} before query optimization. There 
> should not be any visible changes for end user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18825) Define ValidTxnList before starting query optimization

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408718#comment-16408718
 ] 

Jesus Camacho Rodriguez commented on HIVE-18825:


[~ekoifman], the code that accesses the valid txn list was not there when I 
created the patch; it was introduced in HIVE-18751, hence I did not realize 
there was any dependency at that level. I have made changes to avoid clearing 
the txn list; the rest should work as expected.

> Define ValidTxnList before starting query optimization
> --
>
> Key: HIVE-18825
> URL: https://issues.apache.org/jira/browse/HIVE-18825
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18825.01.patch, HIVE-18825.02.patch, 
> HIVE-18825.03.patch, HIVE-18825.04.patch, HIVE-18825.05.patch, 
> HIVE-18825.06.patch, HIVE-18825.patch
>
>
> Consider a set of tables used by a materialized view where inserts happened 
> after the materialization was created. To compute incremental view 
> maintenance, we need to be able to filter only new rows from those base 
> tables. That can be done by inserting a filter operator with condition e.g. 
> {{ROW\_\_ID.transactionId < highwatermark and ROW\_\_ID.transactionId NOT 
> IN()}} on top of the MVs query definition and triggering the 
> rewriting (which should in turn produce a partial rewriting). However, to do 
> that, we need to have a value for {{ValidTxnList}} during query compilation 
> so we know the snapshot that we are querying.
> This patch aims to generate {{ValidTxnList}} before query optimization. There 
> should not be any visible changes for end user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19013) Fix some minor build issues in storage-api

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408717#comment-16408717
 ] 

ASF GitHub Bot commented on HIVE-19013:
---

GitHub user omalley opened a pull request:

https://github.com/apache/hive/pull/323

HIVE-19013. Fix various storage-api build issues.

Fix some minor issues.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omalley/hive hive-19013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/323.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #323


commit 0967afd5d6d364f2a9a47b5164cc5ae4c20786aa
Author: Owen O'Malley 
Date:   2018-03-21T22:08:52Z

HIVE-19013. Fix various storage-api build issues.




> Fix some minor build issues in storage-api
> --
>
> Key: HIVE-19013
> URL: https://issues.apache.org/jira/browse/HIVE-19013
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the storage-api tests complain that there isn't a log4j2.xml and 
> the javadoc fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19013) Fix some minor build issues in storage-api

2018-03-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-19013:
--
Labels: pull-request-available  (was: )

> Fix some minor build issues in storage-api
> --
>
> Key: HIVE-19013
> URL: https://issues.apache.org/jira/browse/HIVE-19013
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the storage-api tests complain that there isn't a log4j2.xml and 
> the javadoc fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19013) Fix some minor build issues in storage-api

2018-03-21 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HIVE-19013:



> Fix some minor build issues in storage-api
> --
>
> Key: HIVE-19013
> URL: https://issues.apache.org/jira/browse/HIVE-19013
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> Currently, the storage-api tests complain that there isn't a log4j2.xml and 
> the javadoc fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18999) Filter operator does not work for List

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408696#comment-16408696
 ] 

Jesus Camacho Rodriguez commented on HIVE-18999:


Cc [~ashutoshc]

> Filter operator does not work for List
> --
>
> Key: HIVE-18999
> URL: https://issues.apache.org/jira/browse/HIVE-18999
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Priority: Major
>
> {code:sql}
> create table table1(col0 int, col1 bigint, col2 string, col3 bigint, col4 
> bigint);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2015, 11);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2013, 11);
> -- INCORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct(2014,11));
> -- CORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct('2014','11'));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18999) Filter operator does not work for List

2018-03-21 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408695#comment-16408695
 ] 

Gopal V commented on HIVE-18999:


This is broken because the constant structs are being constructed as 
Struct(Int, Int), so they don't compare equal to the Struct(Long, Long) built 
from the bigint columns on the table side.
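
If that diagnosis holds, aligning the constant struct's field types with the 
column types should serve as a workaround until the coercion is fixed; a 
hedged example:

{code:sql}
-- Workaround sketch: build the IN-list struct with bigint fields explicitly,
-- so it compares against struct(bigint, bigint) from the table side.
SELECT COUNT(t1.col0)
FROM table1 t1
WHERE struct(col3, col4) IN (struct(CAST(2014 AS bigint), CAST(11 AS bigint)));
{code}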

> Filter operator does not work for List
> --
>
> Key: HIVE-18999
> URL: https://issues.apache.org/jira/browse/HIVE-18999
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Priority: Major
>
> {code:sql}
> create table table1(col0 int, col1 bigint, col2 string, col3 bigint, col4 
> bigint);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2015, 11);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2013, 11);
> -- INCORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct(2014,11));
> -- CORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct('2014','11'));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Attachment: HIVE-18953.5.patch

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Status: Open  (was: Patch Available)

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18953) Implement CHECK constraint

2018-03-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18953:
---
Status: Patch Available  (was: Open)

> Implement CHECK constraint
> --
>
> Key: HIVE-18953
> URL: https://issues.apache.org/jira/browse/HIVE-18953
> Project: Hive
>  Issue Type: New Feature
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-18953.1.patch, HIVE-18953.2.patch, 
> HIVE-18953.3.patch, HIVE-18953.4.patch, HIVE-18953.5.patch
>
>
> Implement column level CHECK constraint



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18883) Add findbugs to yetus pre-commit checks

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408670#comment-16408670
 ] 

Sahil Takiar commented on HIVE-18883:
-

[~szita] any ideas on how I can test this beyond running it locally? Or is 
running it locally sufficient?

> Add findbugs to yetus pre-commit checks
> ---
>
> Key: HIVE-18883
> URL: https://issues.apache.org/jira/browse/HIVE-18883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18883.1.patch, HIVE-18883.2.patch
>
>
> We should enable FindBugs for our YETUS pre-commit checks, this will help 
> overall code quality and should decrease the overall number of bugs in Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18093) Improve logging when HoS application is killed

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408668#comment-16408668
 ] 

Sahil Takiar commented on HIVE-18093:
-

[~aihuaxu] could you take a look?

> Improve logging when HoS application is killed
> --
>
> Key: HIVE-18093
> URL: https://issues.apache.org/jira/browse/HIVE-18093
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18093.1.patch, HIVE-18093.2.patch
>
>
> When a HoS job is explicitly killed by a user (via a yarn command), the 
> logs just say "RPC channel closed".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-03-21 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408654#comment-16408654
 ] 

Sahil Takiar commented on HIVE-18533:
-

[~vanzin] I'm seeing a race condition in one of the tests that invoke the 
{{SparkLauncher}}

{code}
2018-03-21T14:11:26,064 ERROR [Driver-RPC-Handler-0] util.Utils: Uncaught 
exception in thread Driver-RPC-Handler-0
java.lang.IllegalStateException: Disconnected.
at 
org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248)
 ~[spark-launcher_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.launcher.LauncherConnection.send(LauncherConnection.java:81) 
~[spark-launcher_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.launcher.LauncherBackend.setState(LauncherBackend.scala:77) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.scheduler.local.LocalSchedulerBackend.org$apache$spark$scheduler$local$LocalSchedulerBackend$$stop(LocalSchedulerBackend.scala:161)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.scheduler.local.LocalSchedulerBackend.stop(LocalSchedulerBackend.scala:137)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:508) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1752) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1924)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357) 
[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.SparkContext.stop(SparkContext.scala:1923) 
[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.api.java.JavaSparkContext.stop(JavaSparkContext.scala:654) 
[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.hive.spark.client.JobContextImpl.stop(JobContextImpl.java:81) 
[classes/:?]
at 
org.apache.hive.spark.client.RemoteDriver.shutdown(RemoteDriver.java:223) 
[classes/:?]
at 
org.apache.hive.spark.client.RemoteDriver.access$200(RemoteDriver.java:71) 
[classes/:?]
at 
org.apache.hive.spark.client.RemoteDriver$DriverProtocol.handle(RemoteDriver.java:286)
 [classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_92]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_92]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
at 
org.apache.hive.spark.client.rpc.RpcDispatcher.handleCall(RpcDispatcher.java:121)
 [classes/:?]
at 
org.apache.hive.spark.client.rpc.RpcDispatcher.channelRead0(RpcDispatcher.java:80)
 [classes/:?]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
 [netty-all-4.1.17.Final.jar:4.1.17.Final]
at ...
{code}

[jira] [Updated] (HIVE-14032) INSERT OVERWRITE command failed with case sensitive partition key names

2018-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14032:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Chinna!

> INSERT OVERWRITE command failed with case sensitive partition key names
> ---
>
> Key: HIVE-14032
> URL: https://issues.apache.org/jira/browse/HIVE-14032
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.1
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-14032.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18840) CachedStore: Prioritize loading of recently accessed tables during prewarm

2018-03-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18840:

Status: Patch Available  (was: Open)

> CachedStore: Prioritize loading of recently accessed tables during prewarm
> --
>
> Key: HIVE-18840
> URL: https://issues.apache.org/jira/browse/HIVE-18840
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-18840.1.patch
>
>
> On clusters with large metadata, prewarming the cache can take several hours. 
> Now that CachedStore does not block on prewarm anymore (after HIVE-18264), we 
> should prioritize loading of recently accessed tables during prewarm.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18840) CachedStore: Prioritize loading of recently accessed tables during prewarm

2018-03-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18840:

Attachment: HIVE-18840.1.patch

> CachedStore: Prioritize loading of recently accessed tables during prewarm
> --
>
> Key: HIVE-18840
> URL: https://issues.apache.org/jira/browse/HIVE-18840
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-18840.1.patch
>
>
> On clusters with large metadata, prewarming the cache can take several hours. 
> Now that CachedStore does not block on prewarm anymore (after HIVE-18264), we 
> should prioritize loading of recently accessed tables during prewarm.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408561#comment-16408561
 ] 

Thejas M Nair edited comment on HIVE-19012 at 3/21/18 8:47 PM:
---

+1 pending tests
This change enables download from local maven repo if available there.



was (Author: thejas):
+1
This change enables download from local maven repo if available there.


> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-19012:


Assignee: Vi On  (was: Thejas M Nair)

> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-19012:
-
Summary: Support builds for ARM and PPC arch  (was: Support ARM and PPC 
arch)

> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-19012:


Assignee: Thejas M Nair  (was: Vi On)

> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408561#comment-16408561
 ] 

Thejas M Nair commented on HIVE-19012:
--

+1
This change enables download from local maven repo if available there.


> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19012) Support builds for ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-19012:
-
Status: Patch Available  (was: Open)

> Support builds for ARM and PPC arch
> ---
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19012) Support ARM and PPC arch

2018-03-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-19012:


Assignee: Vi On

> Support ARM and PPC arch
> 
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Assignee: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19012) Support ARM and PPC arch

2018-03-21 Thread Vi On (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vi On updated HIVE-19012:
-
Attachment: HIVE-19012.patch

> Support ARM and PPC arch
> 
>
> Key: HIVE-19012
> URL: https://issues.apache.org/jira/browse/HIVE-19012
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Vi On
>Priority: Major
> Attachments: HIVE-19012.patch
>
>
> Hive standalone metastore uses protoc-jar-maven-plugin 3.5.1.1, which supports 
> downloading from a maven repo. Artifact download should be supported for the ARM 
> and PPC architectures, since some protobuf versions do not exist for ARM/PPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18952) Tez session disconnect and reconnect on HS2 HA failover

2018-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408503#comment-16408503
 ] 

Sergey Shelukhin commented on HIVE-18952:
-

Looks like the kills are not related to recovery, and happen even on the first 
query.

> Tez session disconnect and reconnect on HS2 HA failover
> ---
>
> Key: HIVE-18952
> URL: https://issues.apache.org/jira/browse/HIVE-18952
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18952.01.patch, HIVE-18952.patch
>
>
> Now that TEZ-3892 is committed, HIVE-18281 can make use of tez session 
> disconnect and reconnect on HA failover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-18825) Define ValidTxnList before starting query optimization

2018-03-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408489#comment-16408489
 ] 

Eugene Koifman edited comment on HIVE-18825 at 3/21/18 8:00 PM:


GenericUDTFGetSplits.createPlanFragment() now calls compileAndRespond(String, 
true) which removes ValidTxnList from Conf.  But createPlanFragment() then 
tries to access it.  Is that intentional?

If it is, shouldn't validTxnListsGenerated be unset?


was (Author: ekoifman):
GenericUDTFGetSplits.createPlanFragment() now calls compileAndRespond(String, 
true) which removes ValidTxnList from Conf.  But createPlanFragment() then 
tries to access it.  Is that intentional?

> Define ValidTxnList before starting query optimization
> --
>
> Key: HIVE-18825
> URL: https://issues.apache.org/jira/browse/HIVE-18825
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18825.01.patch, HIVE-18825.02.patch, 
> HIVE-18825.03.patch, HIVE-18825.04.patch, HIVE-18825.05.patch, 
> HIVE-18825.patch
>
>
> Consider a set of tables used by a materialized view where inserts happened 
> after the materialization was created. To compute incremental view 
> maintenance, we need to be able to filter only new rows from those base 
> tables. That can be done by inserting a filter operator with condition e.g. 
> {{ROW\_\_ID.transactionId < highwatermark and ROW\_\_ID.transactionId NOT 
> IN()}} on top of the MVs query definition and triggering the 
> rewriting (which should in turn produce a partial rewriting). However, to do 
> that, we need to have a value for {{ValidTxnList}} during query compilation 
> so we know the snapshot that we are querying.
> This patch aims to generate {{ValidTxnList}} before query optimization. There 
> should not be any visible changes for end user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18825) Define ValidTxnList before starting query optimization

2018-03-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408489#comment-16408489
 ] 

Eugene Koifman commented on HIVE-18825:
---

GenericUDTFGetSplits.createPlanFragment() now calls compileAndRespond(String, 
true) which removes ValidTxnList from Conf.  But createPlanFragment() then 
tries to access it.  Is that intentional?

> Define ValidTxnList before starting query optimization
> --
>
> Key: HIVE-18825
> URL: https://issues.apache.org/jira/browse/HIVE-18825
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18825.01.patch, HIVE-18825.02.patch, 
> HIVE-18825.03.patch, HIVE-18825.04.patch, HIVE-18825.05.patch, 
> HIVE-18825.patch
>
>
> Consider a set of tables used by a materialized view where inserts happened 
> after the materialization was created. To compute incremental view 
> maintenance, we need to be able to filter only new rows from those base 
> tables. That can be done by inserting a filter operator with condition e.g. 
> {{ROW\_\_ID.transactionId < highwatermark and ROW\_\_ID.transactionId NOT 
> IN()}} on top of the MVs query definition and triggering the 
> rewriting (which should in turn produce a partial rewriting). However, to do 
> that, we need to have a value for {{ValidTxnList}} during query compilation 
> so we know the snapshot that we are querying.
> This patch aims to generate {{ValidTxnList}} before query optimization. There 
> should not be any visible changes for end user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-18991) Drop database cascade doesn't work with materialized views

2018-03-21 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18991 started by Jesus Camacho Rodriguez.
--
> Drop database cascade doesn't work with materialized views
> --
>
> Key: HIVE-18991
> URL: https://issues.apache.org/jira/browse/HIVE-18991
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Create a database, add a table and then a materialized view that depends on 
> the table.  Then drop the database with cascade set.  Sometimes this will 
> fail because when HiveMetaStore.drop_database_core goes to drop all of the 
> tables it may drop the base table before the materialized view, which will 
> cause an integrity constraint violation in the RDBMS. To resolve this, that 
> method should be changed to fetch and drop materialized views before tables.
> cc [~jcamachorodriguez]
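
A rough sketch of the proposed ordering inside drop_database_core (helper names 
here are illustrative, not the exact metastore API):

{code:java}
// Drop materialized views first, so their dependencies on base tables are
// gone before the tables themselves are removed.
for (String mv : getTableNamesByType(dbName, TableType.MATERIALIZED_VIEW)) { // assumed helper
  drop_table(dbName, mv, deleteData);
}
for (String tbl : getAllTableNames(dbName)) { // MVs are already gone at this point
  drop_table(dbName, tbl, deleteData);
}
{code}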



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18877) HiveSchemaTool.validateSchemaTables() should wrap a SQLException when rethrowing

2018-03-21 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-18877:
--
Attachment: HIVE-18877.5.patch

> HiveSchemaTool.validateSchemaTables() should wrap a SQLException when 
> rethrowing
> 
>
> Key: HIVE-18877
> URL: https://issues.apache.org/jira/browse/HIVE-18877
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-18877.1.patch, HIVE-18877.2.patch, 
> HIVE-18877.3.patch, HIVE-18877.4.patch, HIVE-18877.5.patch
>
>
> If schematool is run with the -verbose flag then it will print a stack trace 
> for an exception that occurs. If a SQLException is caught during 
> HiveSchemaTool.validateSchemaTables() then a HiveMetaException is rethrown 
> containing the text of the SQLException. If we instead throw a 
> HiveMetaException that wraps the SQLException, the stack trace will help 
> with diagnosing issues where the SQLException contains only generic error 
> text.
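
The change amounts to passing the SQLException along as the cause. A minimal 
sketch, assuming HiveMetaException exposes a (String, Throwable) constructor:

{code:java}
try {
  // ... run the table-validation queries ...
} catch (SQLException e) {
  // Wrap instead of flattening to text, so -verbose prints the full JDBC stack.
  throw new HiveMetaException("Failed to validate schema tables: " + e.getMessage(), e);
}
{code}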



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18995) Vectorization: Add option to suppress "Execution mode: vectorized" for testing purposes

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18995:

Attachment: HIVE-18995.03.patch

> Vectorization: Add option to suppress "Execution mode: vectorized" for 
> testing purposes
> ---
>
> Key: HIVE-18995
> URL: https://issues.apache.org/jira/browse/HIVE-18995
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18995.01.patch, HIVE-18995.02.patch, 
> HIVE-18995.03.patch
>
>
> In order to see Q file differences in large runs it is helpful to eliminate 
> change noise from "Execution mode: vectorized" in EXPLAIN output.
> Includes some Vectorizer logging cleanup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18995) Vectorization: Add option to suppress "Execution mode: vectorized" for testing purposes

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18995:

Status: Patch Available  (was: In Progress)

> Vectorization: Add option to suppress "Execution mode: vectorized" for 
> testing purposes
> ---
>
> Key: HIVE-18995
> URL: https://issues.apache.org/jira/browse/HIVE-18995
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18995.01.patch, HIVE-18995.02.patch, 
> HIVE-18995.03.patch
>
>
> In order to see Q file differences in large runs it is helpful to eliminate 
> change noise from "Execution mode: vectorized" in EXPLAIN output.
> Includes some Vectorizer logging cleanup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18995) Vectorization: Add option to suppress "Execution mode: vectorized" for testing purposes

2018-03-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18995:

Status: In Progress  (was: Patch Available)

> Vectorization: Add option to suppress "Execution mode: vectorized" for 
> testing purposes
> ---
>
> Key: HIVE-18995
> URL: https://issues.apache.org/jira/browse/HIVE-18995
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18995.01.patch, HIVE-18995.02.patch
>
>
> In order to see Q file differences in large runs it is helpful to eliminate 
> change noise from "Execution mode: vectorized" in EXPLAIN output.
> Includes some Vectorizer logging cleanup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

