[ https://issues.apache.org/jira/browse/SPARK-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-16149:
---------------------------------
Labels: bulk-closed (was: )
> API consistency discussion: CountVectorizer.{minDF -> minDocFreq, minTF -> minTermFreq}
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-16149
> URL: https://issues.apache.org/jira/browse/SPARK-16149
> Project: Spark
> Issue Type: Brainstorming
> Components: MLlib
> Affects Versions: 2.0.0
> Reporter: Xiangrui Meng
> Priority: Major
> Labels: bulk-closed
>
> We use `minDF` and `minTF` in CountVectorizer but `minDocFreq` in IDF. It
> would be nice to keep the naming consistent. This was discussed in
> https://github.com/apache/spark/pull/7388, where the names were chosen for
> sklearn compatibility. However, we didn't look broadly across MLlib APIs.
> Maybe we can live with this small inconsistency, but it would be nice to
> agree on a guideline: should names be consistent with other libraries or
> with existing ones in MLlib?
> cc: [~josephkb] [~yuhaoyan]
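
For readers unfamiliar with the parameters, here is a rough plain-Python sketch of the concepts behind the names. It is not Spark's implementation: the function names and the sample documents are made up for illustration, and only absolute-count thresholds are handled. It suggests why `minDF` (CountVectorizer) and `minDocFreq` (IDF) read as the same idea under two names: both are thresholds on a term's document frequency, while `minTF` is a threshold on a term's count within a single document.

```python
from collections import Counter

def doc_freq(docs):
    """For each term, count how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): each document counts a term only once
    return df

def filter_by_doc_freq(docs, threshold):
    """Keep terms found in at least `threshold` documents.

    This is the role an absolute minDF / minDocFreq value plays (hypothetical helper).
    """
    return {term for term, n in doc_freq(docs).items() if n >= threshold}

def filter_by_term_freq(tokens, threshold):
    """Within one document, keep terms occurring at least `threshold` times.

    This is the role an absolute minTF value plays (hypothetical helper).
    """
    return {term: n for term, n in Counter(tokens).items() if n >= threshold}

# Made-up sample corpus of three tokenized documents.
docs = [
    ["spark", "spark", "ml"],
    ["spark", "idf"],
    ["ml", "ml", "tf"],
]

kept_terms = filter_by_doc_freq(docs, 2)       # document-level threshold
kept_counts = filter_by_term_freq(docs[0], 2)  # per-document threshold
print(kept_terms, kept_counts)
```

Since the document-level threshold is one statistic regardless of which estimator applies it, a single name for it across both estimators would match the concept more closely.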