Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20777
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r175184951
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,21 @@ private[feature] trait CountVectorizerParams
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r175184795
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,21 @@ private[feature] trait CountVectorizerParams
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r175184503
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,21 @@ private[feature] trait CountVectorizerParams
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174935305
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +473,26 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174911085
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,21 @@ private[feature] trait CountVectorizerParams
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174864748
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +473,26 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174863155
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,21 @@ private[feature] trait CountVectorizerParams
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174636559
--- Diff: python/pyspark/ml/tests.py ---
@@ -679,6 +679,29 @@ def test_count_vectorizer_with_binary(self):
feature, expected = r
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174627578
--- Diff: python/pyspark/ml/tests.py ---
@@ -679,6 +679,29 @@ def test_count_vectorizer_with_binary(self):
feature, expected = r
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174626899
--- Diff: python/pyspark/ml/tests.py ---
@@ -679,6 +679,29 @@ def test_count_vectorizer_with_binary(self):
feature, expected = r
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174624206
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,22 @@ private[feature] trait CountVectorizerParams
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174625203
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -70,19 +70,22 @@ private[feature] trait CountVectorizerParams
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r173369004
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +522,26 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r173336643
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +522,26 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r173336451
--- Diff: python/pyspark/ml/feature.py ---
@@ -455,6 +506,12 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r173335895
--- Diff: python/pyspark/ml/feature.py ---
@@ -408,35 +408,86 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/20777
[SPARK-23615][ML][PYSPARK] Add maxDF Parameter to Python CountVectorizer
## What changes were proposed in this pull request?
The maxDF parameter is for filtering out frequently occurring
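
To illustrate the kind of filtering a maximum-document-frequency parameter performs, here is a minimal pure-Python sketch. It is not the code from this pull request and does not use Spark. The helper name `filter_vocab_by_max_df` is hypothetical. The threshold rule is an assumption modelled on how the existing Scala `CountVectorizer` documents `maxDF`: a value >= 1 is treated as an absolute number of documents, and a value in [0, 1) as a fraction of all documents.

```python
from collections import Counter


def filter_vocab_by_max_df(docs, max_df):
    """Keep terms whose document frequency does not exceed the max_df threshold.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    n_docs = len(docs)
    # Assumption: max_df >= 1 is an absolute document count,
    # max_df in [0, 1) is a fraction of the total number of documents.
    threshold = max_df if max_df >= 1 else max_df * n_docs
    # Document frequency: each term is counted at most once per document.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return sorted(term for term, count in doc_freq.items() if count <= threshold)


docs = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
vocab_abs = filter_vocab_by_max_df(docs, 2)     # absolute threshold of 2 documents
vocab_frac = filter_vocab_by_max_df(docs, 0.5)  # fractional threshold of 50% of documents
```

In this toy corpus the term `a` appears in all three documents, so it is dropped under both thresholds, while rarer terms are kept.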