[
https://issues.apache.org/jira/browse/SPARK-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726364#comment-14726364
]
Ehsan Mohyedin Kermani commented on SPARK-8534:
-----------------------------------------------
I'd like to give it a shot, but first I think we need a distributed scan
function for computing the cumulative sum of the sorted predictions. Would it
be possible to add that to RegressionMetrics or perhaps mllib.util first? An
implementation was suggested here:
https://groups.google.com/forum/#!topic/spark-users/ts-FdB50ltY
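
A minimal sketch of what such a distributed scan could look like, written in
plain Python with nested lists standing in for the partitions of a sorted RDD.
It is not the implementation from the linked thread, and the name
distributed_cumsum is hypothetical. The assumption is the usual two-pass
scheme: compute one total per partition, turn those totals into per-partition
offsets, then run a local cumulative sum in each partition shifted by its
offset.

```python
from itertools import accumulate


def distributed_cumsum(partitions):
    """Inclusive cumulative sum across ordered partitions (lists of numbers).

    Hypothetical helper: each inner list plays the role of one partition.
    """
    # Pass 1: one total per partition. On a cluster this would be a small
    # result gathered on the driver, one number per partition.
    totals = [sum(p) for p in partitions]

    # Exclusive prefix sum of the totals: the offset for partition i is the
    # sum of all elements in partitions 0..i-1.
    offsets = [0] + list(accumulate(totals))[:-1]

    # Pass 2: local inclusive scan inside each partition, shifted by its offset.
    return [
        [offset + s for s in accumulate(p)]
        for p, offset in zip(partitions, offsets)
    ]


if __name__ == "__main__":
    # Four partitions, one of them empty, already in sorted order.
    print(distributed_cumsum([[1, 2], [3], [], [4, 5]]))
```

Only the per-partition totals need to leave the workers, so the second pass
can run independently in every partition. In Spark the two passes would
presumably map onto per-partition operations over the sorted RDD, but that
mapping is an assumption and is not shown here.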
> Gini for regression metrics and evaluator
> -----------------------------------------
>
> Key: SPARK-8534
> URL: https://issues.apache.org/jira/browse/SPARK-8534
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Reporter: Joseph K. Bradley
> Priority: Minor
>
> One common metric we do not have in RegressionMetrics or RegressionEvaluator
> is Gini: [https://www.kaggle.com/wiki/Gini]
> Implementing (normalized) Gini would be nice. However, it might be
> expensive; I believe it would require sorting the labels.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)