spark_user created SPARK-24213:
----------------------------------

             Summary: Power Iteration Clustering in SparkML throws exception, when the ID is IntType
                 Key: SPARK-24213
                 URL: https://issues.apache.org/jira/browse/SPARK-24213
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 2.4.0
            Reporter: spark_user
             Fix For: 2.4.0


Running the code below, PowerIterationClustering in Spark ML throws an exception when the id column is of IntType.
{code:scala}
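import org.apache.spark.ml.clustering.PowerIterationClustering

// Adjacency-list input; the "id" column is inferred as IntType, not LongType.
val data = spark.createDataFrame(Seq(
  (0, Array(1), Array(0.9)),
  (1, Array(2), Array(0.9)),
  (2, Array(3), Array(0.9)),
  (3, Array(4), Array(0.1)),
  (4, Array(5), Array(0.9))
)).toDF("id", "neighbors", "similarities")

// Fails at select(): the transformed DataFrame has no "prediction" column.
val result = new PowerIterationClustering()
  .setK(2)
  .setMaxIter(10)
  .setInitMode("random")
  .transform(data)
  .select("id", "prediction")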
{code}


{code:java}
org.apache.spark.sql.AnalysisException: cannot resolve '`prediction`' given input columns: [id, neighbors, similarities];;
'Project [id#215, 'prediction]
+- AnalysisBarrier
      +- Project [id#215, neighbors#216, similarities#217]
         +- Join Inner, (id#215 = id#234)
            :- Project [_1#209 AS id#215, _2#210 AS neighbors#216, _3#211 AS similarities#217]
            :  +- LocalRelation [_1#209, _2#210, _3#211]
            +- Project [cast(id#230L as int) AS id#234]
               +- LogicalRDD [id#230L, prediction#231], false

        at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:88)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:85)
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)

{code}
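
In the plan above, the right side of the join is {{Project [cast(id#230L as int) AS id#234]}} over {{LogicalRDD [id#230L, prediction#231]}}, so the prediction column appears to be dropped before the join. The following is only a sketch of that mechanism using plain DataFrame operations; it is not the PowerIterationClustering source, and the names input, assignments, payload, lossy and kept are hypothetical stand-ins chosen for illustration.
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

// Local session for the sketch; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("SPARK-24213-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the caller's input, with an Int id column.
val input = Seq((0, "a"), (1, "b")).toDF("id", "payload")

// Hypothetical stand-in for the cluster assignments: Long id plus prediction.
val assignments = Seq((0L, 0), (1L, 1)).toDF("id", "prediction")

// Mirrors the Project in the plan: selecting only the cast id drops prediction.
val lossy = assignments.select(col("id").cast(IntegerType).alias("id"))
val broken = input.join(lossy, "id")

// Casting the id in place keeps the remaining columns.
val kept = assignments.withColumn("id", col("id").cast(IntegerType))
val joined = input.join(kept, "id")

println(broken.columns.mkString(", "))
println(joined.columns.mkString(", "))
{code}
With the select-based cast the joined result has no prediction column, which matches the "cannot resolve" error above; with the in-place cast it does. This assumes the cast is applied only because the input id is not already a Long, which the plan suggests but this report does not confirm.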





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
