[
https://issues.apache.org/jira/browse/HIVE-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101125#comment-15101125
]
Rui Li commented on HIVE-12828:
-------------------------------
Looked at the log and the error is
{noformat}
2016-01-14T14:38:11,889 - 16/01/14 14:38:11 WARN TaskSetManager: Lost task 0.0 in stage 136.0 (TID 238, ip-10-233-128-9.us-west-1.compute.internal): java.io.IOException: java.lang.reflect.InvocationTargetException
2016-01-14T14:38:11,889 - at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
2016-01-14T14:38:11,889 - at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
2016-01-14T14:38:11,890 - at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:269)
2016-01-14T14:38:11,890 - at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:216)
2016-01-14T14:38:11,890 - at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:343)
2016-01-14T14:38:11,890 - at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:680)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
2016-01-14T14:38:11,890 - at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
2016-01-14T14:38:11,890 - at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
2016-01-14T14:38:11,890 - at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
2016-01-14T14:38:11,890 - at org.apache.spark.scheduler.Task.run(Task.scala:89)
2016-01-14T14:38:11,890 - at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
2016-01-14T14:38:11,890 - at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
2016-01-14T14:38:11,890 - at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
2016-01-14T14:38:11,890 - at java.lang.Thread.run(Thread.java:744)
2016-01-14T14:38:11,890 - Caused by: java.lang.reflect.InvocationTargetException
2016-01-14T14:38:11,890 - at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
2016-01-14T14:38:11,890 - at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
2016-01-14T14:38:11,890 - at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
2016-01-14T14:38:11,890 - at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:255)
2016-01-14T14:38:11,890 - ... 21 more
2016-01-14T14:38:11,891 - Caused by: java.lang.NoSuchMethodError: org.apache.parquet.schema.Types$MessageTypeBuilder.addFields([Lorg/apache/parquet/schema/Type;)Lorg/apache/parquet/schema/Types$BaseGroupBuilder;
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getSchemaByName(DataWritableReadSupport.java:160)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:223)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:248)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:94)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:80)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
2016-01-14T14:38:11,891 - at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
2016-01-14T14:38:11,891 - ... 25 more
{noformat}
The missing method exists in parquet-1.8.1 (the version Hive depends on) but not in parquet-1.7.0 (the version bundled with Spark). So I think the test is still using the old Spark tarball.
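For the record, a minimal diagnostic sketch (not part of any patch here; the class and method names are taken from the stack trace above) that can be run against the executor classpath to confirm which jar supplies the parquet builder and whether the {{addFields(Type...)}} overload is present:
{noformat}
import java.lang.reflect.Array;
import java.lang.reflect.Method;

// Diagnostic: report which jar Types$MessageTypeBuilder is loaded from and
// whether the addFields(Type...) overload used by DataWritableReadSupport
// can be resolved on this classpath.
public class ParquetVersionCheck {
  public static void main(String[] args) throws Exception {
    Class<?> builder =
        Class.forName("org.apache.parquet.schema.Types$MessageTypeBuilder");
    // Prints the jar the class came from; if this points at the old Spark
    // assembly, it's the parquet-1.7.0 copy shadowing Hive's 1.8.1.
    System.out.println("Loaded from: "
        + builder.getProtectionDomain().getCodeSource().getLocation());
    // Build the Type[] parameter class and look up the varargs overload.
    Class<?> typeArrayClass = Array.newInstance(
        Class.forName("org.apache.parquet.schema.Type"), 0).getClass();
    Method m = builder.getMethod("addFields", typeArrayClass);
    System.out.println("Found: " + m);
  }
}
{noformat}
On a classpath that resolves to parquet-1.7.0 the lookup throws {{NoSuchMethodException}}, mirroring the {{NoSuchMethodError}} in the log above; with 1.8.1 it prints the method.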
> Update Spark version to 1.6
> ---------------------------
>
> Key: HIVE-12828
> URL: https://issues.apache.org/jira/browse/HIVE-12828
> Project: Hive
> Issue Type: Task
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: Rui Li
> Attachments: HIVE-12828.1-spark.patch, HIVE-12828.2-spark.patch,
> HIVE-12828.2-spark.patch, HIVE-12828.2-spark.patch, HIVE-12828.2-spark.patch,
> mem.patch
>
>