[jira] [Commented] (SPARK-23734) InvalidSchemaException While Saving ALSModel
[ https://issues.apache.org/jira/browse/SPARK-23734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713403#comment-16713403 ]

Stanley Poon commented on SPARK-23734:
--------------------------------------
Just confirmed the problem is fixed in Spark 2.3.1. The test environment uses Scala 2.11.11, and there are no other dependencies. I will close the case.

> InvalidSchemaException While Saving ALSModel
>
>              Key: SPARK-23734
>              URL: https://issues.apache.org/jira/browse/SPARK-23734
>          Project: Spark
>       Issue Type: Bug
>       Components: ML
> Affects Versions: 2.3.0
>      Environment: macOS 10.13.2, Scala 2.11.8, Spark 2.3.0 v2.3.0-rc5 (Feb 22 2018)
>         Reporter: Stanley Poon
>         Priority: Major
>           Labels: ALS, parquet, persistence
>
> After fitting an ALSModel, I get the following error while saving the model:
>
>   Caused by: org.apache.parquet.schema.InvalidSchemaException: A group type can not be empty.
>   Parquet does not support empty group without leaves. Empty group: spark_schema
>
> Exactly the same code ran fine on 2.2.1. The same issue also occurs on other ALSModels we have.
>
> h2. To reproduce
> Get ALSExample:
> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/ALSExample.scala
> and add the following line to save the model right before "spark.stop":
> {quote} model.write.overwrite().save("SparkExampleALSModel") {quote}
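For convenience, here is a self-contained sketch of the reproduction described above, adapted from the linked ALSExample. The object name, the toy ratings data, and the (userId, movieId, rating) column names are illustrative, not taken from the original example; the save call at the end is the line the report says triggers the exception on 2.3.0.

```scala
// Reproduction sketch for SPARK-23734 (assumes Spark 2.3.0 on the classpath).
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

object ALSSaveRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("ALSSaveRepro")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Tiny illustrative ratings set; any (user, item, rating) data will do.
    val ratings = Seq(
      (0, 0, 4.0f), (0, 1, 2.0f), (1, 1, 3.0f), (1, 2, 1.0f), (2, 0, 5.0f)
    ).toDF("userId", "movieId", "rating")

    val als = new ALS()
      .setMaxIter(5)
      .setRegParam(0.01)
      .setUserCol("userId")
      .setItemCol("movieId")
      .setRatingCol("rating")
    val model = als.fit(ratings)

    // Per the report: throws InvalidSchemaException on 2.3.0,
    // succeeds on 2.2.1 and (per the comment above) on 2.3.1.
    model.write.overwrite().save("SparkExampleALSModel")
    spark.stop()
  }
}
```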
> h2. Stack Trace
>
Exception in thread "main" java.lang.ExceptionInInitializerError
  at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$$anonfun$setSchema$2.apply(ParquetWriteSupport.scala:444)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$$anonfun$setSchema$2.apply(ParquetWriteSupport.scala:444)
  at scala.collection.immutable.List.foreach(List.scala:392)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$.setSchema(ParquetWriteSupport.scala:444)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.prepareWrite(ParquetFileFormat.scala:112)
  at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:140)
  at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:154)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:654)
  at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
  at org.apache.spark.ml.recommendation.ALSModel$ALSModelWriter.saveImpl(ALS.scala:510)
  at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:103)
  at com.vitalmove.model.ALSExample$.main(ALSExample.scala:83)
  at com.vitalmove.model.ALSExample.main(ALSExample.scala)
Caused by: org.apache.parquet.schema.InvalidSchemaException: A group type can not be empty. Parquet does not support empty group without leaves. Empty group: spark_schema
  at org.apache.parquet.schema.GroupType.<init>(GroupType.java:92)
  at org.apache.parquet.schema.GroupType.<init>(GroupType.java:48)
  at org.apache.parquet.schema.MessageType.<init>(MessageType.java:50)
  at org.apache.parquet.schema.Types$MessageTypeBuilder.named(Types.java:1256)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.<init>(ParquetSchemaConverter.scala:567)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.<clinit>(ParquetSchemaConverter.scala)
[ https://issues.apache.org/jira/browse/SPARK-23734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434219#comment-16434219 ]

Stanley Poon commented on SPARK-23734:
--------------------------------------
[~viirya] Thank you for checking into this. I have added the Spark release details where this is reproducible, and will verify that it is fixed in the next release.
[ https://issues.apache.org/jira/browse/SPARK-23734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410914#comment-16410914 ]

Liang-Chi Hsieh commented on SPARK-23734:
-----------------------------------------
I used the latest master branch and can't reproduce the reported issue.