Alex Wozniakowski created SPARK-45910:
-----------------------------------------

             Summary: Numerical output of MulticlassClassificationEvaluator 
does not coincide with expected output
                 Key: SPARK-45910
                 URL: https://issues.apache.org/jira/browse/SPARK-45910
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 3.5.0, 3.4.1
            Reporter: Alex Wozniakowski


The following code is an example in which MulticlassClassificationEvaluator 
produces a numerical output that does not coincide with the expected output:
{code:python}
from pyspark.ml.classification import LinearSVC
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

train_data = [(0, 1.0, 2.0, 3.0), (1, 4.0, 5.0, 6.0), (0, 7.0, 8.0, 9.0)]
valid_data = [(1, 2.0, 3.0, 4.0), (0, 5.0, 6.0, 7.0), (1, 8.0, 9.0, 10.0)]

schema = ["label", "feature1", "feature2", "feature3"]

train = spark.createDataFrame(train_data, schema=schema)
valid = spark.createDataFrame(valid_data, schema=schema)

feature_columns = ["feature1", "feature2", "feature3"]
assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")
train = assembler.transform(train)
valid = assembler.transform(valid)

svm = LinearSVC(maxIter=10, regParam=0.1)
model = svm.fit(train)
predictions = model.transform(valid)

recallByLabel = MulticlassClassificationEvaluator(metricName="recallByLabel")
weightedRecall = MulticlassClassificationEvaluator(metricName="weightedRecall")

print(f"Recall by label: {recallByLabel.evaluate(predictions)}")
print(f"Weighted recall: {weightedRecall.evaluate(predictions)}") {code}
It produces:
{code:java}
Recall by label: 1.0
Weighted recall: 0.3333333333333333{code}
but predictions.show() implies the following hand-calculated confusion matrix:
{code:java}
 -----------
|  0  |  0  |
|  2  |  1  |
 -----------{code}
where the recall is 0, i.e., 0 / (0 + 2).
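
As a cross-check of the arithmetic, here is a minimal sketch in plain Python (no Spark needed) that computes per-label and weighted recall directly from label/prediction pairs. The labels are those of valid_data above. The all-zero predictions are an assumption, inferred by reading the matrix with rows as predicted labels (1, 0) and columns as actual labels (1, 0). The helper names recall_by_label and weighted_recall are hypothetical and are not part of the Spark API.

```python
from collections import Counter

# Labels are taken from valid_data in the example above.
labels = [1, 0, 1]
# ASSUMPTION: every row is predicted as 0, as inferred from the
# hand-calculated confusion matrix (not copied from predictions.show()).
predictions = [0, 0, 0]


def recall_by_label(labels, predictions):
    """Return {label: recall}, where recall = correct predictions / actual count."""
    actual = Counter(labels)
    correct = Counter(l for l, p in zip(labels, predictions) if l == p)
    return {label: correct[label] / count for label, count in actual.items()}


def weighted_recall(labels, predictions):
    """Average the per-label recalls, weighted by each label's share of the rows."""
    per_label = recall_by_label(labels, predictions)
    actual = Counter(labels)
    total = len(labels)
    return sum(per_label[label] * actual[label] / total for label in actual)


per_label = recall_by_label(labels, predictions)
print(f"Recall for label 0: {per_label[0]}")
print(f"Recall for label 1: {per_label[1]}")
print(f"Weighted recall: {weighted_recall(labels, predictions)}")
```

Under that assumption, the recall is 0.0 for label 1 and 1.0 for label 0, and the weighted recall is 1/3. Since recallByLabel reports the recall of a single label, selected by the evaluator's metricLabel parameter (default 0.0), it may be worth checking which label each number refers to.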

What is the nature of this discrepancy? Note that it is not restricted to 
recall, and other classifiers, whose predictions include a probability column, 
behave similarly.

 

Furthermore, the translation of the example to Scala, namely:
{code:scala}
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.DataFrame

val trainData = Seq((0, 1.0, 2.0, 3.0), (1, 4.0, 5.0, 6.0), (0, 7.0, 8.0, 9.0))
val validData = Seq((1, 2.0, 3.0, 4.0), (0, 5.0, 6.0, 7.0), (1, 8.0, 9.0, 10.0))

val schema = Seq("label", "feature1", "feature2", "feature3")

val train: DataFrame = spark.createDataFrame(trainData).toDF(schema: _*)
val valid: DataFrame = spark.createDataFrame(validData).toDF(schema: _*)

val featureColumns = Array("feature1", "feature2", "feature3")
val assembler = new VectorAssembler()
  .setInputCols(featureColumns)
  .setOutputCol("features")

val trainAssembled = assembler.transform(train)
val validAssembled = assembler.transform(valid)

val svm = new LinearSVC()
  .setMaxIter(10)
  .setRegParam(0.1)

val model = svm.fit(trainAssembled)
val predictions = model.transform(validAssembled)

val recallByLabel = new MulticlassClassificationEvaluator()
  .setMetricName("recallByLabel")
val weightedRecall = new MulticlassClassificationEvaluator()
  .setMetricName("weightedRecall")

println(s"Recall by label: ${recallByLabel.evaluate(predictions)}")
println(s"Weighted recall: ${weightedRecall.evaluate(predictions)}"){code}
produces the same recall by label and weighted recall, as described above.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
