spark_user created SPARK-24213:
----------------------------------

             Summary: Power Iteration Clustering in SparkML throws exception, when the ID is IntType
                 Key: SPARK-24213
                 URL: https://issues.apache.org/jira/browse/SPARK-24213
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 2.4.0
         Environment: {code:java}
{code}
            Reporter: spark_user
             Fix For: 2.4.0


Running the code below, PowerIterationClustering in Spark ML throws an exception.
{code:scala}
import org.apache.spark.ml.clustering.PowerIterationClustering

// The ids are Scala Int literals, so the "id" column is inferred as IntegerType.
val data = spark.createDataFrame(Seq(
  (0, Array(1), Array(0.9)),
  (1, Array(2), Array(0.9)),
  (2, Array(3), Array(0.9)),
  (3, Array(4), Array(0.1)),
  (4, Array(5), Array(0.9))
)).toDF("id", "neighbors", "similarities")

val result = new PowerIterationClustering()
  .setK(2)
  .setMaxIter(10)
  .setInitMode("random")
  .transform(data)
  .select("id", "prediction")  // fails: "prediction" is not in the output
{code}
{code:java}
org.apache.spark.sql.AnalysisException: cannot resolve '`prediction`' given input columns: [id, neighbors, similarities];;
'Project [id#215, 'prediction]
+- AnalysisBarrier
      +- Project [id#215, neighbors#216, similarities#217]
         +- Join Inner, (id#215 = id#234)
            :- Project [_1#209 AS id#215, _2#210 AS neighbors#216, _3#211 AS similarities#217]
            :  +- LocalRelation [_1#209, _2#210, _3#211]
            +- Project [cast(id#230L as int) AS id#234]
               +- LogicalRDD [id#230L, prediction#231], false

  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:88)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:85)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)