First Spark project. I have a Java method that returns a Dataset<Row>. I want to convert this to a typed Dataset whose element type is my own class, StatusChangeDB. I have created a POJO, StatusChangeDB.java, with a field for every column found in the MySQL table. I then create an Encoder and convert the Dataset<Row> to a Dataset<StatusChangeDB>. However, when I try to .show() the values of the Dataset<StatusChangeDB>, I receive this error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`hvpinid_quad`' given input columns: [status_change_type, superLayer, loclayer, sector, locwire];
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:86)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:83)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:290)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:290)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:289)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:305)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4$$anonfun$apply$10.apply(TreeNode.scala:324)
    at scala.collection.MapLike$MappedValues$$anonfun$iterator$3.apply(MapLike.scala:246)
    at scala.collection.MapLike$MappedValues$$anonfun$iterator$3.apply(MapLike.scala:246)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.IterableLike$$anon$1.foreach(IterableLike.scala:311)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
    at scala.collection.mutable.MapBuilder.$plus$plus$eq(MapBuilder.scala:25)
    at scala.collection.TraversableViewLike$class.force(TraversableViewLike.scala:88)
    at scala.collection.IterableLike$$anon$1.force(IterableLike.scala:311)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:305)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:255)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:255)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:266)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:276)
    at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$6.apply(QueryPlan.scala:285)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:285)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:255)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:83)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:76)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:76)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.resolveAndBind(ExpressionEncoder.scala:259)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:209)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
    at org.apache.spark.sql.Dataset$.apply(Dataset.scala:58)
    at org.apache.spark.sql.Dataset.as(Dataset.scala:376)

I do not know how anyone else could replicate this, but here are the methods involved on my end. I hope the error can be spotted in the following code:

    // Loads the status_change table over JDBC as an untyped Dataset<Row>.
    public static Dataset<Row> mySqlDataset() {
        SparkSession spSession = getSession();
        spSession.sql("set spark.sql.caseSensitive=false");
        Dataset<Row> demoDf = spSession.read().format("jdbc").options(jdbcOptions()).load();
        return demoDf;
    }

where jdbcOptions() is

    // JDBC connection settings for the local MySQL instance.
    public static Map<String, String> jdbcOptions() {
        Map<String, String> jdbcOptions = new HashMap<String, String>();
        jdbcOptions.put("url", "jdbc:mysql://localhost:3306/test");
        jdbcOptions.put("driver", "com.mysql.jdbc.Driver");
        jdbcOptions.put("dbtable", "status_change");
        jdbcOptions.put("user", "root");
        jdbcOptions.put("password", "");
        return jdbcOptions;
    }

The method that fails is

    public Dataset<StatusChangeDB> compareRunII(String str) {
        // Keep five columns, restrict to the requested run, then convert to the bean type.
        Dataset<Row> tempDF = SparkManager.mySqlDataset()
                .select("loclayer", "superLayer", "sector", "locwire", "status_change_type")
                .filter(col("runno").equalTo(str));
        return tempDF.as(SparkManager.statusChangeDBEncoder());
    }

where SparkManager.statusChangeDBEncoder() is

    public static Encoder<StatusChangeDB> statusChangeDBEncoder() {
        return Encoders.bean(StatusChangeDB.class);
    }

and StatusChangeDB is just a POJO. I believe the POJO itself works, because I am able to create a Dataset<StatusChangeDB> from a data file.

I have found no help on Google or on this forum for this error.
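
One check that might help narrow this down without the database: as far as I understand, Encoders.bean works from the bean's getter/setter properties, so comparing those property names against the selected columns shows which ones have no column to bind to. Below is a minimal sketch using only java.beans from the JDK. The nested StatusChangeDB is a hypothetical stand-in (the real class is not shown in this post, and the String field types are an assumption), and BeanColumnCheck, beanProperties and missingColumns are names made up for illustration. The comparison is an exact string match, which is stricter than Spark with caseSensitive=false.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BeanColumnCheck {

    // Hypothetical stand-in for StatusChangeDB: only the names come from the
    // error message and the select(); the types and the full field list are assumed.
    public static class StatusChangeDB implements java.io.Serializable {
        private String hvpinid_quad;
        private String loclayer;
        private String superLayer;
        private String sector;
        private String locwire;
        private String status_change_type;

        public String getHvpinid_quad() { return hvpinid_quad; }
        public void setHvpinid_quad(String v) { this.hvpinid_quad = v; }
        public String getLoclayer() { return loclayer; }
        public void setLoclayer(String v) { this.loclayer = v; }
        public String getSuperLayer() { return superLayer; }
        public void setSuperLayer(String v) { this.superLayer = v; }
        public String getSector() { return sector; }
        public void setSector(String v) { this.sector = v; }
        public String getLocwire() { return locwire; }
        public void setLocwire(String v) { this.locwire = v; }
        public String getStatus_change_type() { return status_change_type; }
        public void setStatus_change_type(String v) { this.status_change_type = v; }
    }

    // Names of all properties that have both a getter and a setter,
    // as reported by the standard JavaBeans introspector.
    public static Set<String> beanProperties(Class<?> beanClass) {
        Set<String> names = new TreeSet<>();
        try {
            PropertyDescriptor[] props =
                    Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : props) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        return names;
    }

    // Bean properties for which the given column list has no exact-name match.
    public static Set<String> missingColumns(Class<?> beanClass, Collection<String> columns) {
        Set<String> missing = beanProperties(beanClass);
        missing.removeAll(columns);
        return missing;
    }

    public static void main(String[] args) {
        // The same five columns passed to select() in compareRunII.
        List<String> selected = Arrays.asList(
                "loclayer", "superLayer", "sector", "locwire", "status_change_type");
        System.out.println("Bean properties with no matching column: "
                + missingColumns(StatusChangeDB.class, selected));
    }
}
```

With this stand-in class the only unmatched property is hvpinid_quad, the same name the AnalysisException complains about. Running the check against the real StatusChangeDB would list every property the select() leaves without a column.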