Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127550679
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127553451
--- Diff: python/pyspark/ml/tests.py ---
@@ -1229,11 +1229,30 @@ def test_output_columns(self):
(2.0
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127551356
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +326,11 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127551019
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127550735
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127553419
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,50 @@ class OneVsRestSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127550604
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127550407
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r127550217
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/HasParallelism.scala ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
Taking a look now
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r127548975
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/ValidatorParamsSuiteHelpers.scala
---
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18618
Thanks @HyukjinKwon ! I'm still in favor of adding this, partly to match
Scala and partly to have API docs for it.
I just had one question: Is there a reason fieldNames should ret
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18428
Rats, one more thing: We need to use relative paths, not absolute ones,
when we put paths in the persisted file. Could you please add a unit test
which checks this, perhaps by saving a model
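A minimal sketch of the relative-path idea, in plain Python rather than the Spark persistence code. The file name `metadata.json`, the `subModels` key, and both helper names are hypothetical and chosen only for illustration. The point is that paths stored relative to the model directory still resolve after the directory is moved.

```python
import json
import os


def save_metadata(model_dir, sub_model_dirs):
    """Write sub-model locations relative to model_dir (hypothetical layout)."""
    relative = [os.path.relpath(p, model_dir) for p in sub_model_dirs]
    with open(os.path.join(model_dir, "metadata.json"), "w") as f:
        json.dump({"subModels": relative}, f)


def load_sub_model_paths(model_dir):
    """Resolve the stored relative paths against the current model_dir."""
    with open(os.path.join(model_dir, "metadata.json")) as f:
        metadata = json.load(f)
    return [os.path.join(model_dir, p) for p in metadata["subModels"]]
```

A test along the lines requested could save into one directory, move that directory, and check that the loaded paths point inside the new location.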
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r127057083
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -183,8 +198,15 @@ private[ml] object ValidatorParams
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18428
Also, can you please add "OneVsRest" to the PR and JIRA titles since this
touches that class?
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r126849117
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -183,8 +198,14 @@ private[ml] object ValidatorParams
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18428
LGTM
I couldn't think of a great way to reduce code duplication between
JavaWrapper and OneVsRest.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176264
--- Diff: python/pyspark/ml/tests.py ---
@@ -681,6 +682,76 @@ def test_save_load(self):
self.assertEqual(loadedLrModel.uid, lrModel.uid
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176348
--- Diff: python/pyspark/ml/tuning.py ---
@@ -263,8 +301,60 @@ def copy(self, extra=None):
newCV.setEvaluator(self.getEvaluator().copy
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176320
--- Diff: python/pyspark/ml/tuning.py ---
@@ -137,8 +140,43 @@ def getEvaluator(self):
"""
return se
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176253
--- Diff: python/pyspark/ml/classification.py ---
@@ -1646,6 +1674,15 @@ class OneVsRestModel(Model, OneVsRestParams,
MLReadable, MLWritable
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176344
--- Diff: python/pyspark/ml/tuning.py ---
@@ -263,8 +301,60 @@ def copy(self, extra=None):
newCV.setEvaluator(self.getEvaluator().copy
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175799
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,52 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176362
--- Diff: python/pyspark/ml/wrapper.py ---
@@ -111,7 +111,14 @@ def _make_java_param_pair(self, param, value):
sc = SparkContext
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175225
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,52 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175803
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,52 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175405
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/TrainValidationSplitSuite.scala
---
@@ -160,8 +213,21 @@ class TrainValidationSplitSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175867
--- Diff: python/pyspark/ml/classification.py ---
@@ -1630,8 +1614,52 @@ def _to_java(self):
_java_obj.setPredictionCol(self.getPredictionCol
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125175232
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/TrainValidationSplitSuite.scala
---
@@ -134,6 +134,59 @@ class TrainValidationSplitSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r125176209
--- Diff: python/pyspark/ml/classification.py ---
@@ -1630,8 +1614,52 @@ def _to_java(self):
_java_obj.setPredictionCol(self.getPredictionCol
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124184937
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -126,10 +126,22 @@ private[ml] object ValidatorParams
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124161422
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,46 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124185775
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -183,8 +195,14 @@ private[ml] object ValidatorParams
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124185314
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -126,10 +126,22 @@ private[ml] object ValidatorParams
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124161463
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,46 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124180242
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/TrainValidationSplitSuite.scala
---
@@ -136,6 +136,29 @@ class TrainValidationSplitSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124168033
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,46 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124161468
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,46 @@ class CrossValidatorSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124185896
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -126,10 +126,22 @@ private[ml] object ValidatorParams
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18428#discussion_r124161588
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -156,6 +156,46 @@ class CrossValidatorSuite
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
Catching up here, it sounds like the current recommendations (which I'm on
board with) are to:
* Switch to Futures, including using sameThreadExecutor for the case of
parallelism=1
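The recommendation above can be sketched in Python with `concurrent.futures`. This is an analogy, not the Scala code under review: `SameThreadExecutor`, `make_executor`, and `fit_all` are hypothetical names. It assumes that parallelism=1 should run each task on the calling thread, with no pool created, while larger values use a thread pool.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class SameThreadExecutor(Executor):
    """Hypothetical executor that runs each task immediately on the calling thread."""

    def submit(self, fn, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Surface the failure through the Future, as a pool would.
            future.set_exception(exc)
        return future


def make_executor(parallelism):
    """Return a same-thread executor for parallelism == 1, else a thread pool."""
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    if parallelism == 1:
        return SameThreadExecutor()
    return ThreadPoolExecutor(max_workers=parallelism)


def fit_all(train_one, class_indices, parallelism=1):
    """Train one model per class index and return the results in input order."""
    with make_executor(parallelism) as executor:
        futures = [executor.submit(train_one, i) for i in class_indices]
        return [f.result() for f in futures]
```

Because both branches return futures, the calling code is the same whether or not a pool is used.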
to rawPrediction instead of probability. This PR changes the param
in the Scala, Python and R APIs.
## How was this patch tested?
New unit test to make sure the threshold can be set to any Double value.
Author: Joseph K. Bradley
Closes #18151 from jkbradley/ml-2.2-linearsvc-cleanup.
Project: h
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18151
Merging with master, branch-2.2
Thanks for reviewing!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r122524554
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,37 @@ class OneVsRestSuite extends
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
@BryanCutler Thanks for the thoughts! I didn't see a response w.r.t.
putting parallelism in a trait, so I'll say we won't do it for now. The usage
of par collections / Futures
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r122524395
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +350,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r122524220
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +350,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r122523766
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +350,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
One comment about putting parallelism in a trait vs. not: Would we want to
avoid creating a "threadpool" when parallelism = 1? In that (common) case,
maybe we'd want to avoid
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
You're right about Scala being an issue. This actually works with Scala
2.10 and 2.11 but not 2.12, in which Scala drops its own ForkJoinPool in favor
of the java one. As long as we drop
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121836808
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +343,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
I agree; it'd be good to match on the Param name. Do you think
"parallelism" is too vague? If not, then I like it since it's simple.
I'd vote for default pa
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121736014
--- Diff: python/pyspark/ml/classification.py ---
@@ -1510,21 +1511,26 @@ class OneVsRest(Estimator, OneVsRestParams,
MLReadable, MLWritable
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121740343
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -325,8 +343,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121733736
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -65,6 +67,12 @@ private[ml] trait OneVsRestParams extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121734558
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -283,6 +295,12 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121737179
--- Diff: python/pyspark/ml/classification.py ---
@@ -1560,14 +1566,27 @@ def trainSingleClass(index
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121735491
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,40 @@ class OneVsRestSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121736656
--- Diff: python/pyspark/ml/classification.py ---
@@ -1560,14 +1566,27 @@ def trainSingleClass(index
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121735870
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,40 @@ class OneVsRestSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121735342
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,40 @@ class OneVsRestSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121735271
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,40 @@ class OneVsRestSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121737455
--- Diff: python/pyspark/ml/tests.py ---
@@ -1229,7 +1229,35 @@ def test_output_columns(self):
(2.0
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121734092
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -65,6 +67,12 @@ private[ml] trait OneVsRestParams extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r121737343
--- Diff: python/pyspark/ml/tests.py ---
@@ -1229,7 +1229,35 @@ def test_output_columns(self):
(2.0
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
taking a look now
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18281
add to whitelist
...very easy to have an overflow in calculating the number of
partitions for ML persistence.
This modifies the calculations to use Long.
## How was this patch tested?
New unit test. I verified that the test fails before this patch.
Author: Joseph K. Bradley
Closes #18265 from jkbradley/word2
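A small sketch of the kind of overflow described here. Python integers do not overflow, so `to_int32` emulates JVM `Int` wrap-around. The size formula and every parameter name are assumptions for illustration, not the actual Word2Vec persistence code.

```python
def to_int32(value):
    """Wrap an integer into the signed 32-bit range, as JVM Int arithmetic does."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def approx_size_int32(num_words, vector_size, bytes_per_value):
    """Hypothetical size estimate where each multiplication wraps at 32 bits."""
    return to_int32(to_int32(num_words * vector_size) * bytes_per_value)


def num_partitions(num_words, vector_size, bytes_per_value, buffer_size):
    """The same estimate with wide integers, analogous to switching to Long."""
    approx_size = num_words * vector_size * bytes_per_value
    return approx_size // buffer_size + 1


# 10 million words with 100-dimensional vectors at 4 bytes per value is about 4 GB.
print(approx_size_int32(10_000_000, 100, 4))              # -294967296 (wrapped)
print(num_partitions(10_000_000, 100, 4, 64 * 1024 * 1024))  # 60
```

With the wrapped value the estimated size is negative, so any partition count derived from it is meaningless; the wide calculation gives a sensible count.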
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18265
I'm going to call this ready...but please say if you see other fixes I
should make. Thanks!
Merging with master and branch-2.2
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18265
@Krimit Thanks for taking a look! Does it look ready to merge now?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18265
looks like a spurious failure, retesting
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18265
Yep, someone hit the bug!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18265#discussion_r121315755
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -188,6 +188,15 @@ class Word2VecSuite extends SparkFunSuite with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18265#discussion_r121315677
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -188,6 +188,15 @@ class Word2VecSuite extends SparkFunSuite with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18265#discussion_r121315648
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala
---
@@ -355,9 +364,12 @@ object Word2VecModel extends MLReadable[Word2VecModel
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18265
CC @Krimit and @srowen who had worked on the previous related patch
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/18265
[SPARK-21050][ML] Word2vec persistence overflow bug fix
## What changes were proposed in this pull request?
The method calculateNumberOfPartitions() uses Int, not Long (unlike the
MLlib
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18256
Thanks! LGTM pending tests
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18151
So...good thing you asked for the test b/c transform() wasn't going through
the corrected code path. Another bit of evidence that the Prediction APIs
don't generalize that well...
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18151#discussion_r119275003
--- Diff: R/pkg/R/mllib_classification.R ---
@@ -62,7 +62,7 @@ setClass("NaiveBayesModel", representation(jo
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18151
CC @mlnick @yanboliang
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/18151
[SPARK-20929][ML] LinearSVC should use its own threshold param
## What changes were proposed in this pull request?
LinearSVC should use its own threshold param, rather than the shared
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18085
Thanks for fixing this! Just curious: did you figure out why the test was
working before?
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17891#discussion_r118175057
--- Diff: python/pyspark/ml/tests.py ---
@@ -807,6 +807,18 @@ def test_logistic_regression(self):
except OSError:
pass
Repository: spark
Updated Branches:
refs/heads/branch-2.2 d20c64695 -> 00dee3902
[SPARK-20861][ML][PYTHON] Delegate looping over paramMaps to estimators
Changes:
pyspark.ml Estimators can take either a list of param maps or a dict of params.
This change allows the CrossValidator and TrainVal
Repository: spark
Updated Branches:
refs/heads/master 4816c2ef5 -> 9434280cf
[SPARK-20861][ML][PYTHON] Delegate looping over paramMaps to estimators
Changes:
pyspark.ml Estimators can take either a list of param maps or a dict of params.
This change allows the CrossValidator and TrainValidat
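A toy illustration of the "list of param maps or a dict of params" behaviour mentioned in this commit message. `ToyEstimator` is hypothetical and does not use pyspark. It only shows the dispatch: a dict yields one fitted result, and a list or tuple yields one result per param map.

```python
class ToyEstimator:
    """Hypothetical estimator; a stand-in, not the pyspark.ml implementation."""

    def __init__(self, **defaults):
        self.defaults = dict(defaults)

    def _fit(self, dataset, params):
        # Stand-in for training: record the effective params and the data size.
        return {"params": params, "n": len(dataset)}

    def fit(self, dataset, params=None):
        if params is None:
            params = {}
        if isinstance(params, (list, tuple)):
            # The estimator itself loops over the param maps.
            return [self.fit(dataset, p) for p in params]
        if isinstance(params, dict):
            # Per-call params override the estimator's defaults.
            return self._fit(dataset, {**self.defaults, **params})
        raise TypeError("params must be a dict or a list/tuple of dicts")
```

Keeping the loop inside `fit` means a caller such as a cross-validation routine can pass the whole list in one call and let the estimator decide how to iterate.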
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18077
Merging with master and branch-2.2 which means this will get into 2.2.0
Thanks for the quick fix!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18077
Other than the tags, this LGTM
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/18077
@MrBago Can you please add the tags "[ML][PYTHON]" to the title?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17946
LGTM
Thanks for doing this!
Repository: spark
Updated Branches:
refs/heads/master d4022d495 -> dbe81633a
[SPARK-20501][ML] ML 2.2 QA: New Scala APIs, docs
## What changes were proposed in this pull request?
Review new Scala APIs introduced in 2.2.
## How was this patch tested?
Existing tests.
Author: Yanbo Liang
Clos
Repository: spark
Updated Branches:
refs/heads/branch-2.2 a869e8bfd -> 57c87cf2d
[SPARK-20501][ML] ML 2.2 QA: New Scala APIs, docs
## What changes were proposed in this pull request?
Review new Scala APIs introduced in 2.2.
## How was this patch tested?
Existing tests.
Author: Yanbo Liang
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17934
LGTM
I'll merge this with master and branch-2.2
Thanks all!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17934#discussion_r116631097
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -146,7 +146,7 @@ object StringIndexer extends
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17829
Awesome, thank you!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17944
Thanks a lot @yanboliang !
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15628#discussion_r116614596
--- Diff:
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala ---
@@ -148,7 +154,8 @@ sealed trait Matrix extends Serializable
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15628#discussion_r116596231
--- Diff:
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala ---
@@ -148,7 +154,8 @@ sealed trait Matrix extends Serializable