[
https://issues.apache.org/jira/browse/SPARK-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiangrui Meng resolved SPARK-11343.
-----------------------------------
Resolution: Fixed
Fix Version/s: 1.6.0
Issue resolved by pull request 9296
[https://github.com/apache/spark/pull/9296]
> Regression Imposes doubles on prediction/label columns
> ------------------------------------------------------
>
> Key: SPARK-11343
> URL: https://issues.apache.org/jira/browse/SPARK-11343
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.1
> Environment: all environments
> Reporter: Dominik Dahlem
> Fix For: 1.6.0
>
>
> Using pyspark.ml and DataFrames, the ALS recommender cannot be evaluated
> with the RegressionEvaluator because of a type mismatch between the model
> transformation and the evaluation APIs: ALS emits a FloatType prediction
> column, while the evaluator requires DoubleType. One can work around this
> by casting the prediction column to double before passing it to the
> evaluator (see the sketch after the traceback), but that workaround does
> not compose with pipelines and cross-validation. Code and traceback below:
> {code}
> from pyspark.ml.recommendation import ALS
> from pyspark.ml.evaluation import RegressionEvaluator
>
> als = ALS(rank=10, maxIter=30, regParam=0.1, userCol='userID',
>           itemCol='movieID', ratingCol='rating')
> model = als.fit(training)
> predictions = model.transform(validation)
> evaluator = RegressionEvaluator(predictionCol='prediction', labelCol='rating')
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> {code}
> Traceback:
> {code}
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 63, in evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 94, in _evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>   File "/Users/dominikdahlem/projects/repositories/spark/python/pyspark/sql/utils.py", line 42, in deco
>     raise IllegalArgumentException(s.split(': ', 1)[1])
> pyspark.sql.utils.IllegalArgumentException: requirement failed: Column
> prediction must be of type DoubleType but was actually FloatType.
> {code}
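>
> A minimal sketch of the cast workaround mentioned above, reusing the model
> and evaluator from the snippet earlier; the explicit cast is illustrative
> only and is not part of the reported code:
> {code}
> from pyspark.sql.functions import col
>
> # Cast the FloatType prediction column produced by ALS to DoubleType so
> # that RegressionEvaluator accepts it. This only helps when evaluating a
> # standalone model; the cast cannot be injected when the evaluator runs
> # inside a Pipeline/CrossValidator, which is the problem reported here.
> predictions = model.transform(validation) \
>     .withColumn('prediction', col('prediction').cast('double'))
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> {code}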