Re: java.lang.NoSuchMethodError while saving a random forest model Spark version 1.5
Spark 1.5 officially uses Parquet 1.7.0, but Spark 1.3 uses Parquet 1.6.0. It's better to check which version of Parquet is actually on the classpath in your environment (one quick way to check is sketched after the quoted thread below).

2015-12-17 10:26 GMT+08:00 Joseph Bradley:

> This method is tested in the Spark 1.5 unit tests, so I'd guess it's a
> problem with the Parquet dependency. What version of Parquet are you
> building Spark 1.5 against? (I'm not that familiar with Parquet issues
> myself, but hopefully a SQL person can chime in.)
>
> On Tue, Dec 15, 2015 at 3:23 PM, Rachana Srivastava <
> rachana.srivast...@markmonitor.com> wrote:
>
>> I recently upgraded my Spark version, but when I try to save a random
>> forest model using the model save command I get a NoSuchMethodError. My
>> code works fine with version 1.3.x.
>>
>> model.save(sc.sc(), "modelsavedir");
>>
>> ERROR:
>> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation - Aborting job.
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 22.0 failed 1 times, most recent failure: Lost task 0.0 in stage 22.0 (TID 230, localhost): java.lang.NoSuchMethodError: parquet.schema.Types$GroupBuilder.addField(Lparquet/schema/Type;)Lparquet/schema/Types$BaseGroupBuilder;
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convertField$1.apply(CatalystSchemaConverter.scala:517)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convertField$1.apply(CatalystSchemaConverter.scala:516)
>>     at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
>>     at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
>>     at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:108)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:516)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:312)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convert$1.apply(CatalystSchemaConverter.scala:305)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convert$1.apply(CatalystSchemaConverter.scala:305)
>>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>     at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>     at org.apache.spark.sql.types.StructType.foreach(StructType.scala:92)
>>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>     at org.apache.spark.sql.types.StructType.map(StructType.scala:92)
>>     at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convert(CatalystSchemaConverter.scala:305)
>>     at org.apache.spark.sql.execution.datasources.parquet.ParquetTypesConverter$.convertFromAttributes(ParquetTypesConverter.scala:58)
>>     at org.apache.spark.sql.execution.datasources.parquet.RowWriteSupport.init(ParquetTableSupport.scala:55)
>>     at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:287)
>>     at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:261)
>>     at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetRelation.scala:94)
>>     at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anon$3.newInstance(ParquetRelation.scala:272)
>>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:233)
>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
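For what it's worth, here is a minimal sketch of one way to check which Parquet jar wins on the classpath. It prints the location the JVM loaded parquet.schema.Types from (that class name is taken from the NoSuchMethodError above; the class and variable names below are just illustrative). If it prints a parquet 1.6.x jar, an older Parquet is shadowing the 1.7.0 one that Spark 1.5 expects.

    import java.security.CodeSource;

    public class ParquetVersionCheck {
        public static void main(String[] args) throws ClassNotFoundException {
            // parquet.schema.Types is the class whose builder method is
            // reported missing in the stack trace above.
            Class<?> cls = Class.forName("parquet.schema.Types");
            // getCodeSource() can be null for bootstrap classes; for an
            // application/library jar it points at the jar that was loaded.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            System.out.println(src != null ? src.getLocation()
                                           : "bootstrap classpath");
        }
    }

Run it with the same classpath (or through the same spark-submit setup) as the failing job, so the result reflects what the driver and executors actually load.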
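For completeness, a hedged sketch of the full save/load round trip with the MLlib Java API in Spark 1.5; the input path, tree parameters, and class name below are placeholders, not taken from the original post:

    import java.util.HashMap;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.regression.LabeledPoint;
    import org.apache.spark.mllib.tree.RandomForest;
    import org.apache.spark.mllib.tree.model.RandomForestModel;
    import org.apache.spark.mllib.util.MLUtils;

    public class SaveRandomForest {
        public static void main(String[] args) {
            JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("rf-save"));
            // Placeholder training data in LIBSVM format.
            JavaRDD<LabeledPoint> trainingData =
                MLUtils.loadLibSVMFile(sc.sc(), "data/sample.libsvm").toJavaRDD();
            // Illustrative parameters: numClasses, categorical features (none),
            // numTrees, featureSubsetStrategy, impurity, maxDepth, maxBins, seed.
            RandomForestModel model = RandomForest.trainClassifier(
                trainingData, 2, new HashMap<Integer, Integer>(),
                10, "auto", "gini", 5, 32, 12345);
            // save() writes JSON metadata plus the model data as Parquet under
            // the path, which is where the Parquet version mismatch surfaces.
            model.save(sc.sc(), "modelsavedir");
            RandomForestModel sameModel =
                RandomForestModel.load(sc.sc(), "modelsavedir");
            sc.stop();
        }
    }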