[jira] [Commented] (FLINK-32889) BinaryClassificationEvaluator gives wrong weighted AUC value
[ https://issues.apache.org/jira/browse/FLINK-32889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758393#comment-17758393 ]

Zhipeng Zhang commented on FLINK-32889:
---------------------------------------

Solved on master via 5619c3b8591b220e78a0a792c1f940e06149c8f0.

> BinaryClassificationEvaluator gives wrong weighted AUC value
> ------------------------------------------------------------
>
>                 Key: FLINK-32889
>                 URL: https://issues.apache.org/jira/browse/FLINK-32889
>             Project: Flink
>          Issue Type: Bug
>          Components: Library / Machine Learning
>   Affects Versions: ml-2.3.0
>            Reporter: Fan Hong
>            Priority: Major
>              Labels: pull-request-available
>
> BinaryClassificationEvaluator gives a wrong AUC value when a weight column
> is provided.
> Here is a case from the unit test. The (score, label, weight) tuples of the
> data are:
> {code:java}
> (0.9, 1.0, 0.8),
> (0.9, 1.0, 0.7),
> (0.9, 1.0, 0.5),
> (0.75, 0.0, 1.2),
> (0.6, 0.0, 1.3),
> (0.9, 1.0, 1.5),
> (0.9, 1.0, 1.4),
> (0.4, 0.0, 0.3),
> (0.3, 0.0, 0.5),
> (0.9, 1.0, 1.9),
> (0.2, 0.0, 1.2),
> (0.1, 1.0, 1.0)
> {code}
> PySpark and scikit-learn give an AUC score of 0.87179, while the Flink ML
> implementation gives 0.891168.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
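The expected value quoted in the issue can be checked by hand. The following is a minimal sketch in plain Python, not the Flink ML, PySpark, or scikit-learn implementation. It assumes the pairwise definition of weighted ROC AUC: the weighted fraction of (positive, negative) pairs in which the positive sample has the higher score, with tied scores counted as half. The names `DATA` and `weighted_auc` are hypothetical.

```python
# (score, label, weight) rows copied from the issue description.
DATA = [
    (0.9, 1.0, 0.8), (0.9, 1.0, 0.7), (0.9, 1.0, 0.5),
    (0.75, 0.0, 1.2), (0.6, 0.0, 1.3), (0.9, 1.0, 1.5),
    (0.9, 1.0, 1.4), (0.4, 0.0, 0.3), (0.3, 0.0, 0.5),
    (0.9, 1.0, 1.9), (0.2, 0.0, 1.2), (0.1, 1.0, 1.0),
]


def weighted_auc(rows):
    """Pairwise weighted ROC AUC; a tie in score counts as half a win."""
    pos = [(s, w) for s, y, w in rows if y == 1.0]
    neg = [(s, w) for s, y, w in rows if y == 0.0]
    wins = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            if sp > sn:
                wins += wp * wn
            elif sp == sn:
                wins += 0.5 * wp * wn
    # Normalize by the total weight of all (positive, negative) pairs.
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    return wins / total


auc = weighted_auc(DATA)
print(round(auc, 5))  # 0.87179
```

On this data every positive with score 0.9 (total weight 6.8) outranks every negative, and the single positive with score 0.1 outranks none, so the result reduces to 6.8 / 7.8, which agrees with the 0.87179 reported for PySpark and scikit-learn.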
[jira] [Commented] (FLINK-32889) BinaryClassificationEvaluator gives wrong weighted AUC value
[ https://issues.apache.org/jira/browse/FLINK-32889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755780#comment-17755780 ]

Fan Hong commented on FLINK-32889:
----------------------------------

BTW, the area under the PRC is also found to be incorrect. PySpark and scikit-learn give 0.9510202726261435, while the current implementation gives 0.9377705627705628.
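The area under the precision-recall curve can be checked by hand in the same way. The sketch below is plain Python and is not taken from any of the libraries mentioned. It rests on two assumptions: the curve has one point per distinct score threshold (highest first), and the area is integrated with the trapezoidal rule after prepending a point at recall 0 that carries the first precision value. The names `DATA` and `weighted_pr_area` are hypothetical.

```python
# (score, label, weight) rows copied from the issue description.
DATA = [
    (0.9, 1.0, 0.8), (0.9, 1.0, 0.7), (0.9, 1.0, 0.5),
    (0.75, 0.0, 1.2), (0.6, 0.0, 1.3), (0.9, 1.0, 1.5),
    (0.9, 1.0, 1.4), (0.4, 0.0, 0.3), (0.3, 0.0, 0.5),
    (0.9, 1.0, 1.9), (0.2, 0.0, 1.2), (0.1, 1.0, 1.0),
]


def weighted_pr_area(rows):
    """Trapezoidal area under the weighted precision-recall curve."""
    total_pos = sum(w for _, y, w in rows if y == 1.0)
    thresholds = sorted({s for s, _, _ in rows}, reverse=True)
    tp = fp = 0.0
    points = []  # (recall, precision), one per distinct threshold
    for t in thresholds:
        # Accumulate the weight of every row whose score equals t.
        tp += sum(w for s, y, w in rows if s == t and y == 1.0)
        fp += sum(w for s, y, w in rows if s == t and y == 0.0)
        points.append((tp / total_pos, tp / (tp + fp)))
    # Assumed convention: start the curve at recall 0 with the first precision.
    points.insert(0, (0.0, points[0][1]))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


pr_area = weighted_pr_area(DATA)
print(round(pr_area, 6))  # 0.95102
```

Under these assumptions the sketch yields about 0.95102, in line with the value reported for PySpark and scikit-learn. A different interpolation between points or a different starting point would give a different number, so the conventions matter when comparing implementations.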