[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-21 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205599#comment-15205599
 ] 

Xin Hao commented on HIVE-13277:


Hi, Kapil & Rui,
TPCx-BB query2 is only an example here. Many queries in TPCx-BB failed due to 
similar reason. 

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-21 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205580#comment-15205580
 ] 

Rui Li commented on HIVE-13277:
---

Yes I'm using ORC table. Pinging [~xhao1] regarding whether there're other 
queries that have this issue.

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-21 Thread Kapil Rastogi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205220#comment-15205220
 ] 

Kapil Rastogi commented on HIVE-13277:
--

[~lirui] Are you using ORC data-set to run this query? 

Given the TPCx-BB query set, is this is the only query failing? 

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-19 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200785#comment-15200785
 ] 

Rui Li commented on HIVE-13277:
---

Not sure about it. I'll do some investigation to see if we have a workaround 
for this problem.

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200771#comment-15200771
 ] 

Xuefu Zhang commented on HIVE-13277:


[~lirui], thanks for for your investigation. While new Kryo may fix the 
problem, do we know what's caused the issue? Will the problem go away if we 
register the class in question? Thanks.

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196806#comment-15196806
 ] 

Rui Li commented on HIVE-13277:
---

Pinging [~xuefuz]

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> 

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194645#comment-15194645
 ] 

Rui Li commented on HIVE-13277:
---

I built a local snapshot of kryo with latest code and verified the query can 
pass. So the root cause should be the kryo (3.0.3) we use. I'm afraid there's 
not much we can do at the moment because the fix hasn't been released yet.
[~xuefuz] what do you think?

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  

[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193220#comment-15193220
 ] 

Rui Li commented on HIVE-13277:
---

I managed to reproduce the issue and I found {{StackOverflowError}} in the full 
stack trace. I suppose that's the root cause. Possibly duplicates HIVE-12677. 
And may be related to this [kryo 
issue|https://github.com/EsotericSoftware/kryo/issues/341].

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO