[jira] [Commented] (SPARK-24427) Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet
[ https://issues.apache.org/jira/browse/SPARK-24427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496367#comment-16496367 ]

Ashok Rai commented on SPARK-24427:
-----------------------------------

Ok. Please let me know the mailing list. I will send my error there.
[jira] [Commented] (SPARK-24427) Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet
[ https://issues.apache.org/jira/browse/SPARK-24427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496358#comment-16496358 ]

Ashok Rai commented on SPARK-24427:
-----------------------------------

I have not specified any version in spark-submit. I am using "export SPARK_MAJOR_VERSION=2" before the spark-submit command.
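Since SPARK_MAJOR_VERSION only controls which Spark install the spark-submit wrapper dispatches to, one way to rule out a version mix-up is to log the runtime version from inside the job itself. A minimal sketch (the class name is a placeholder, and this assumes the spark-sql dependency is on the classpath):

```java
import org.apache.spark.sql.SparkSession;

public class PrintSparkVersion {
    public static void main(String[] args) {
        // Confirm at runtime which Spark build the job actually picked up,
        // independent of what SPARK_MAJOR_VERSION selected at submit time.
        SparkSession spark = SparkSession.builder().getOrCreate();
        System.out.println("Spark version: " + spark.version());
        spark.stop();
    }
}
```

If this prints a 1.x version despite SPARK_MAJOR_VERSION=2, the wrapper is not dispatching to the Spark build you expect.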
[jira] [Created] (SPARK-24427) Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet
Ashok Rai created SPARK-24427:
------------------------------

             Summary: Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet
                 Key: SPARK-24427
                 URL: https://issues.apache.org/jira/browse/SPARK-24427
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 2.2.0
            Reporter: Ashok Rai

We are getting the below error while loading into a Hive table. In our code, we use "saveAsTable", which per the documentation automatically chooses the format that the table was created with. We have now tested by creating the table as Parquet as well as ORC. In both cases, the same error occurred.

-----

2018-05-29 12:25:07,433 ERROR [main] ERROR - Exception occurred while saving table in spark.
org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat, org.apache.spark.sql.execution.datasources.parquet.DefaultSource), please specify the fully qualified class name.;
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:584) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.datasources.PreprocessTableCreation$$anonfun$apply$2.applyOrElse(rules.scala:111) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.datasources.PreprocessTableCreation$$anonfun$apply$2.applyOrElse(rules.scala:75) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.datasources.PreprocessTableCreation.apply(rules.scala:75) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.datasources.PreprocessTableCreation.apply(rules.scala:71) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:85) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:82) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57) ~[scala-library-2.11.8.jar:?]
	at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66) ~[scala-library-2.11.8.jar:?]
	at scala.collection.mutable.ArrayBuffer.foldLeft(ArrayBuffer.scala:48) ~[scala-library-2.11.8.jar:?]
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:82) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:74) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at scala.collection.immutable.List.foreach(List.scala:381) ~[scala-library-2.11.8.jar:?]
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74) ~[spark-catalyst_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:69) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:67) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:50) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:73) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:72) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:78) ~[spark-sql_2.11-2.2.0.2.6.4.25-1.jar:2.2.0.2.6.4.25-1]
	at
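The AnalysisException itself suggests one workaround: pass the fully qualified source class instead of the short name "parquet", so that DataSource.lookupDataSource never has to resolve the ambiguous alias. A hedged sketch, with placeholder table names (this is not a confirmed project-recommended fix; seeing both ParquetFileFormat and the older DefaultSource registered usually indicates two different spark-sql jars on the classpath, and removing the stale jar is likely the real remedy):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SaveWithExplicitSource {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .enableHiveSupport()
                .getOrCreate();

        Dataset<Row> df = spark.table("staging.src_table"); // placeholder source table

        // Use the fully qualified provider class so the ambiguous short name
        // "parquet" is never looked up.
        df.write()
          .mode("overwrite")
          .format("org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat")
          .saveAsTable("target_db.target_table"); // placeholder target table
    }
}
```

To locate a stale jar, it may also help to inspect the driver classpath (e.g. via `spark.driver.extraClassPath` and the YARN container logs) for more than one spark-sql artifact.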