[
https://issues.apache.org/jira/browse/SPARK-24489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-24489:
------------------------------------
Assignee: Apache Spark
> No check for invalid input type of weight data in ml.PowerIterationClustering
> -----------------------------------------------------------------------------
>
> Key: SPARK-24489
> URL: https://issues.apache.org/jira/browse/SPARK-24489
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.4.0
> Reporter: shahid
> Assignee: Apache Spark
> Priority: Major
> Fix For: 2.4.0
>
>
> The following test case fails with the error shown below it. Currently,
> ml.PIC does not check the data type of the weight column. We should validate
> that the weight column has a valid data type.
> {code:java}
> test("invalid input types for weight") {
>   val invalidWeightData = spark.createDataFrame(Seq(
>     (0L, 1L, "a"),
>     (2L, 3L, "b")
>   )).toDF("src", "dst", "weight")
>   val pic = new PowerIterationClustering()
>     .setWeightCol("weight")
>   val result = pic.assignClusters(invalidWeightData)
> }
> {code}
> {code:java}
> Job aborted due to stage failure: Task 0 in stage 8077.0 failed 1 times, most
> recent failure: Lost task 0.0 in stage 8077.0 (TID 882, localhost, executor
> driver): scala.MatchError: [0,1,null] (of class
> org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema)
> at org.apache.spark.ml.clustering.PowerIterationClustering$$anonfun$3.apply(PowerIterationClustering.scala:178)
> at org.apache.spark.ml.clustering.PowerIterationClustering$$anonfun$3.apply(PowerIterationClustering.scala:178)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at org.apache.spark.graphx.EdgeRDD$$anonfun$1.apply(EdgeRDD.scala:107)
> at org.apache.spark.graphx.EdgeRDD$$anonfun$1.apply(EdgeRDD.scala:105)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:847)
> {code}
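
The check requested above could be sketched as follows. This is only an illustration and not the fix applied in Spark: the object WeightColumnCheck and the helper requireNumericColumn are hypothetical names, and the sketch assumes that "valid" means any Spark SQL NumericType. It uses only the public schema API so that the failure is raised on the driver, before any job starts, instead of surfacing as a scala.MatchError inside a task.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.NumericType

object WeightColumnCheck {
  // Hypothetical helper (not part of Spark): fails fast if `colName` is not numeric.
  // Assumption: any NumericType is acceptable for the weight column.
  def requireNumericColumn(df: DataFrame, colName: String): Unit = {
    val actualType = df.schema(colName).dataType
    require(
      actualType.isInstanceOf[NumericType],
      s"Column $colName must be of a numeric type but was actually of type " +
        s"${actualType.catalogString}.")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("weight-column-check")
      .getOrCreate()

    // Same data as in the reported test case: the weight column holds strings.
    val invalidWeightData = spark.createDataFrame(Seq(
      (0L, 1L, "a"),
      (2L, 3L, "b")
    )).toDF("src", "dst", "weight")

    try {
      // require(...) throws IllegalArgumentException when the check fails.
      requireNumericColumn(invalidWeightData, "weight")
    } catch {
      case e: IllegalArgumentException => println(s"Rejected: ${e.getMessage}")
    } finally {
      spark.stop()
    }
  }
}
```

With a check of this kind in place, the test case above could assert on an IllegalArgumentException rather than on a job failure.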