[jira] [Updated] (SPARK-26166) CrossValidator.fit() bug,training and validation dataset may overlap

2018-11-29 Thread Xinyong Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyong Tian updated SPARK-26166: - Description: In the code pyspark.ml.tuning.CrossValidator.fit(), after adding random column df

[jira] [Created] (SPARK-26166) CrossValidator.fit() bug,training and validation dataset may overlap

2018-11-25 Thread Xinyong Tian (JIRA)
Xinyong Tian created SPARK-26166: Summary: CrossValidator.fit() bug,training and validation dataset may overlap Key: SPARK-26166 URL: https://issues.apache.org/jira/browse/SPARK-26166 Project: Spark

[jira] [Created] (SPARK-25441) calculate term frequency in CountVectorizer()

2018-09-15 Thread Xinyong Tian (JIRA)
Xinyong Tian created SPARK-25441: Summary: calculate term frequency in CountVectorizer() Key: SPARK-25441 URL: https://issues.apache.org/jira/browse/SPARK-25441 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24431) wrong areaUnderPR calculation in BinaryClassificationEvaluator

2018-06-06 Thread Xinyong Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504202#comment-16504202 ] Xinyong Tian commented on SPARK-24431: -- I also feel it is reasonable to set first point as (0,p).

[jira] [Commented] (SPARK-24431) wrong areaUnderPR calculation in BinaryClassificationEvaluator

2018-06-05 Thread Xinyong Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502794#comment-16502794 ] Xinyong Tian commented on SPARK-24431: -- I read more about first point of PR curve

[jira] [Commented] (SPARK-24431) wrong areaUnderPR calculation in BinaryClassificationEvaluator

2018-06-05 Thread Xinyong Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502761#comment-16502761 ] Xinyong Tian commented on SPARK-24431: -- Your understanding of event rate is what I meant. I

[jira] [Created] (SPARK-24431) wrong areaUnderPR calculation in BinaryClassificationEvaluator

2018-05-30 Thread Xinyong Tian (JIRA)
Xinyong Tian created SPARK-24431: Summary: wrong areaUnderPR calculation in BinaryClassificationEvaluator Key: SPARK-24431 URL: https://issues.apache.org/jira/browse/SPARK-24431 Project: Spark