[
https://issues.apache.org/jira/browse/SPARK-26166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xinyong Tian updated SPARK-26166:
-
Description:
In the code pyspark.ml.tuning.CrossValidator.fit(), after adding random column
df
Xinyong Tian created SPARK-26166:
Summary: CrossValidator.fit() bug, training and validation dataset
may overlap
Key: SPARK-26166
URL: https://issues.apache.org/jira/browse/SPARK-26166
Project: Spark
Xinyong Tian created SPARK-25441:
Summary: calculate term frequency in CountVectorizer()
Key: SPARK-25441
URL: https://issues.apache.org/jira/browse/SPARK-25441
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504202#comment-16504202
]
Xinyong Tian commented on SPARK-24431:
--
I also feel it is reasonable to set the first point as (0, p).
[
https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502794#comment-16502794
]
Xinyong Tian commented on SPARK-24431:
--
I read more about the first point of the PR curve
[
https://issues.apache.org/jira/browse/SPARK-24431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502761#comment-16502761
]
Xinyong Tian commented on SPARK-24431:
--
Your understanding of event rate is what I meant.
I
Xinyong Tian created SPARK-24431:
Summary: wrong areaUnderPR calculation in
BinaryClassificationEvaluator
Key: SPARK-24431
URL: https://issues.apache.org/jira/browse/SPARK-24431
Project: Spark