Re: Is it relevant to use BinaryClassificationMetrics.aucROC / aucPR with LogisticRegressionModel ?

2015-11-25 Thread filthysocks
jmvllt wrote
> Here, because the predicted class will always be 0 or 1, there is no way
> to vary the threshold to get the aucROC, right? Or am I totally wrong?

No, you are right. If you pass a (Score,Label) tuple to
BinaryClassificationMetrics, then Score has to be a continuous score such as
the class probability, not the thresholded 0/1 prediction.

Have you seen the clearThreshold function?

spark_docu wrote
> Clears the threshold so that predict will output raw prediction scores.

https://spark.apache.org/docs/1.5.1/api/scala/index.html#org.apache.spark.mllib.classification.LogisticRegressionModel

You probably need to call it before the predict call.
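
As a rough sketch of what that could look like with the Spark 1.5.x MLlib API
(the object name AucSketch, the method name evaluate, the parameter names
training/test, and the choice of the LBFGS trainer are assumptions made for
illustration, not something taken from the original post):

```scala
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithLBFGS}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

object AucSketch {
  // Hypothetical helper: trains a binary model and returns (aucROC, aucPR).
  def evaluate(training: RDD[LabeledPoint], test: RDD[LabeledPoint]): (Double, Double) = {
    val model: LogisticRegressionModel =
      new LogisticRegressionWithLBFGS().setNumClasses(2).run(training)

    // Without this call, predict returns the thresholded class (0.0 or 1.0).
    // After it, predict returns the raw score for the positive class.
    model.clearThreshold()

    // (score, label) pairs, the input BinaryClassificationMetrics expects.
    val scoreAndLabels: RDD[(Double, Double)] =
      test.map(p => (model.predict(p.features), p.label))

    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    (metrics.areaUnderROC(), metrics.areaUnderPR())
  }
}
```

Note that clearThreshold() changes the model in place, so any later predict
call on the same model also returns scores rather than classes.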

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-relevant-to-use-BinaryClassificationMetrics-aucROC-aucPR-with-LogisticRegressionModel-tp25465p25473.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Is it relevant to use BinaryClassificationMetrics.aucROC / aucPR with LogisticRegressionModel ?

2015-11-25 Thread jmvllt
Hi filthysocks,

Thanks for the answer. Indeed, using the clearThreshold() function solved my
problem :).

Regards,
Jean.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-relevant-to-use-BinaryClassificationMetrics-aucROC-aucPR-with-LogisticRegressionModel-tp25465p25475.html



Re: Is it relevant to use BinaryClassificationMetrics.aucROC / aucPR with LogisticRegressionModel ?

2015-11-24 Thread Sean Owen
Your reasoning is correct; you need probabilities (or at least some
score) out of the model and not just a 0/1 label in order for a ROC /
PR curve to have meaning.

But you just need to call clearThreshold() on the model to make it
return a probability.
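
To make the first point concrete, here is a small self-contained sketch in
plain Scala (no Spark needed). It computes AUC as the fraction of
(positive, negative) pairs that the score ranks correctly, counting ties as
one half. The object name RocIntuition, the six example scores and the 0.5
cutoff are all made up for illustration:

```scala
object RocIntuition {
  /** AUC as the chance that a random positive outranks a random negative (ties count 1/2). */
  def auc(scoreAndLabels: Seq[(Double, Double)]): Double = {
    val pos = scoreAndLabels.collect { case (s, l) if l == 1.0 => s }
    val neg = scoreAndLabels.collect { case (s, l) if l == 0.0 => s }
    require(pos.nonEmpty && neg.nonEmpty, "need at least one example of each class")
    val wins = for (p <- pos; n <- neg) yield
      if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.size.toLong * neg.size)
  }

  // Hypothetical (score, label) pairs, as predict would give after clearThreshold().
  val scored: Seq[(Double, Double)] =
    Seq((0.9, 1.0), (0.6, 1.0), (0.4, 1.0), (0.7, 0.0), (0.3, 0.0), (0.2, 0.0))

  // The same examples after applying a 0.5 cutoff: only 0.0 or 1.0 remain.
  val thresholded: Seq[(Double, Double)] =
    scored.map { case (s, l) => (if (s > 0.5) 1.0 else 0.0, l) }

  def main(args: Array[String]): Unit = {
    println(auc(scored))      // 7 of 9 pairs ranked correctly
    println(auc(thresholded)) // 6 of 9 once ties are counted as one half
  }
}
```

With real scores the ranking is fully visible. After thresholding, every
example collapses onto one of two values, so the "curve" has a single
operating point and the number mostly reflects the cutoff that was chosen.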

On Tue, Nov 24, 2015 at 5:19 PM, jmvllt wrote:
> Hi guys,
>
> This may be a stupid question, but I'm facing an issue here.
>
> I found the class BinaryClassificationMetrics and I wanted to compute the
> aucROC or aucPR of my model.
> The thing is that the predict method of a LogisticRegressionModel only
> returns the predicted class, and not the probability of belonging to the
> positive class. So I will get:
>
> val metrics = new BinaryClassificationMetrics(predictionAndLabels)
> val aucROC = metrics.areaUnderROC
>
> with predictionAndLabels as an RDD[(predictedClass, label)].
>
> Here, because the predicted class will always be 0 or 1, there is no way to
> vary the threshold to get the aucROC, right? Or am I totally wrong?
>
> So, is it relevant to use BinaryClassificationMetrics.areaUnderROC with
> MLlib's classification models which in many cases only return the predicted
> class and not the probability ?
>
> Nevertheless, an easy solution for LogisticRegression would be to create my
> own method that takes the model's weight vector as a parameter and
> computes a predictionAndLabels with the actual class probabilities. But is
> this the only solution?
>
> Thanks in advance.
> Regards,
> Jean.
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-relevant-to-use-BinaryClassificationMetrics-aucROC-aucPR-with-LogisticRegressionModel-tp25465.html
