GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/17407
[SPARK-20043][ML][WIP] DecisionTreeModel can't recongnize Impurity "Gini"
when loading
Fix bug: DecisionTreeModel can't recognize Impurity "Gini" when loading
TODO
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17407
@jkbradley Hi, tests passed. Is it good enough to be merged?
By the way, String Params are fragile when saving/loading a model, as the
setParam and getParam methods are of no use in that case
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17407
@jkbradley Thanks. I agree with your advice. Modifying the value is a
little aggressive, while changing ImpurityCalculator.getCalculator is more
moderate. However, I'm afraid that similar bugs
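The more moderate option discussed above, making the lookup itself tolerant of case, can be sketched in a few lines of Python. This is only an illustration: the registry `_CALCULATORS` and the function `get_calculator` are hypothetical stand-ins, not Spark's `ImpurityCalculator.getCalculator`, and the returned strings merely label which calculator would be chosen.

```python
# Hypothetical name -> calculator registry (an assumption, not Spark's API).
_CALCULATORS = {
    "gini": "GiniCalculator",
    "entropy": "EntropyCalculator",
    "variance": "VarianceCalculator",
}


def get_calculator(impurity: str) -> str:
    """Return the calculator label for an impurity name, ignoring case."""
    key = impurity.lower()  # normalize so "Gini" and "gini" resolve identically
    try:
        return _CALCULATORS[key]
    except KeyError:
        raise ValueError(f"unrecognized impurity: {impurity!r}") from None


print(get_calculator("Gini"))
```

Normalizing at lookup time leaves the stored value untouched, so a model saved with "Gini" in its metadata still loads.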
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17383#discussion_r108318833
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
@@ -301,7 +302,7 @@ private[tree] class LearningNode(
* group of nodes
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17383#discussion_r108318997
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
@@ -301,7 +302,7 @@ private[tree] class LearningNode(
* group of nodes
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108320537
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,22 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108320481
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala
---
@@ -178,6 +178,22 @@ class DecisionTreeRegressorSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108320525
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,22 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108050101
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,20 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108050103
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,20 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108050099
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,20 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108050110
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,20 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17407#discussion_r108050269
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -385,6 +385,22 @@ class
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14547#discussion_r107077783
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/17383
[SPARK-3165][MLlib][WIP] DecisionTree does not use sparsity in data
## What changes were proposed in this pull request?
DecisionTree should take advantage of sparse feature vectors
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
@jkbradley @hhbyyh Could you review the PR? thanks.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/17503
[SPARK-3159][MLlib] Check for reducible DecisionTree
Add canMergeChildren param: find pairs of leaves of the same parent
that output the same prediction, and merge them.
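The merge described above can be sketched roughly as follows, assuming a binary tree in which every internal node has both children. The `Node` class and `merge_children` function are hypothetical and are not the data structures used in the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node: a leaf has no children."""
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def merge_children(node: Node) -> Node:
    """Bottom-up: collapse sibling leaves that predict the same value."""
    if node.is_leaf:
        return node
    left = merge_children(node.left)
    right = merge_children(node.right)
    if left.is_leaf and right.is_leaf and left.prediction == right.prediction:
        # Both children agree, so the split is redundant for prediction.
        return Node(prediction=left.prediction)
    return Node(node.prediction, left, right)


tree = Node(0.0, Node(1.0, Node(1.0), Node(1.0)), Node(1.0))
print(merge_children(tree).is_leaf)
```

Working bottom-up lets merges cascade: once a subtree collapses into a leaf, it can in turn be merged with its sibling.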
## How
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r110385244
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1009,10 +1009,24 @@ private[spark] object RandomForest extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r110385132
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1009,10 +1009,24 @@ private[spark] object RandomForest extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r110385526
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1009,10 +1009,24 @@ private[spark] object RandomForest extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r110386358
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -996,7 +996,7 @@ private[spark] object RandomForest extends Logging
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
Is there something wrong with Spark CI?
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
```
Test Result (1 failure / +1)
org.apache.spark.storage.TopologyAwareBlockReplicationPolicyBehavior.Peers in 2
racks
```
Does anyone know what this is?
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
@srowen Hi, I forgot the unit tests in Python and R. Where can I find
documentation about setting up a development environment? Thanks.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
I have run all the MLlib unit tests in Python. However, I am not
familiar with R, and I don't want to spend too much time setting up an R
environment.
Could CI retest the PR? We can
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r111656240
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -126,9 +138,10 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r111656235
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -112,9 +124,9 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r111656245
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -104,6 +104,18 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
@sethah Perhaps it's hard to compare R's behavior with Spark's, since many
factors are involved. I'd like to read R GBM's code and verify that both
sides' designs are consistent on split criteria. Is it OK
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
many thanks, @srowen
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
I scanned the split criteria of sklearn and xgboost.
1. sklearn
count all continuous values and split at the mean value.
commit 5147fd09c6a063188efde444f47bd006fa5f95f0
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
@srowen Hi, could you review the PR? The PR is simple, though a lot of code
for unit tests is added. Thanks.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
Hi, I have checked R GBM's code and found that:
R's gbm uses the mean value $(x + y) / 2$, not the weighted mean $(c_x * x + c_y *
y) / (c_x + c_y)$ described in [JIRA
SPARK-16957](https
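To make the difference between the two formulas concrete, here is a small Python sketch. The function names are hypothetical. `x` and `y` are two adjacent feature values, and `c_x`, `c_y` are assumed to be the number of samples observed at each value.

```python
def midpoint(x: float, y: float) -> float:
    """Plain mean of two adjacent values: (x + y) / 2."""
    return (x + y) / 2


def weighted_midpoint(x: float, c_x: int, y: float, c_y: int) -> float:
    """Count-weighted mean: (c_x * x + c_y * y) / (c_x + c_y)."""
    return (c_x * x + c_y * y) / (c_x + c_y)


# Three samples at 1.0 and one sample at 2.0.
print(midpoint(1.0, 2.0))                 # 1.5
print(weighted_midpoint(1.0, 3, 2.0, 1))  # 1.25
```

The weighted form pulls the threshold toward the value with more samples. The two coincide whenever `c_x == c_y`.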
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14547#discussion_r105576961
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14547#discussion_r105814881
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/17556
[SPARK-16957][MLlib] Use weighted midpoints for split values.
## What changes were proposed in this pull request?
Use weighted midpoints for split values.
## How was this patch
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Thanks, @yanboliang . Could you give a hand, @srowen ?
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Tests fail in pyspark.ml.tests with Python 2.6, but I don't have the
environment.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
@yanboliang Thanks, yanbo. I am not familiar with Python 2.6, which is too
outdated.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18736
Sure, @yanboliang . Thanks for your suggestion. I'll work on it later,
perhaps next week. Is it OK?
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r132618802
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -80,20 +82,31 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18736
@yanboliang Hi, yanbo. Could you help review the PR? Thanks.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r126863072
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -317,7 +318,12 @@ final class OneVsRest @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126646511
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -36,7 +36,8 @@ import org.apache.spark.util.collection.OpenHashMap
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126645714
--- Diff: python/pyspark/ml/feature.py ---
@@ -3058,26 +3035,37 @@ class RFormula(JavaEstimator, HasFeaturesCol,
HasLabelCol, JavaMLReadable, JavaM
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126643882
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -460,16 +460,16 @@ object LinearRegression extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126642928
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -36,7 +36,8 @@ import org.apache.spark.sql.types.{DoubleType
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
@srowen @yanboliang Could you help review the PR? Thanks.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126026388
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
@lins05 thanks, reasonable suggestion, I will fix it later.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126023986
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127873828
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -598,8 +598,23 @@ class LogisticRegression @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127874833
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/loss/DifferentiableRegularization.scala
---
@@ -32,40 +34,45 @@ private[ml] trait
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127972263
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -598,8 +598,23 @@ class LogisticRegression @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r128158473
--- Diff: python/pyspark/ml/tests.py ---
@@ -1255,6 +1255,17 @@ def test_output_columns(self):
output = model.transform(df
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r129562237
--- Diff: python/pyspark/ml/tests.py ---
@@ -1255,6 +1255,24 @@ def test_output_columns(self):
output = model.transform(df
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r129562189
--- Diff: python/pyspark/ml/classification.py ---
@@ -1517,20 +1517,22 @@ class OneVsRest(Estimator, OneVsRestParams,
MLReadable, MLWritable
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
ping @holdenk @yanboliang
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18736
[SPARK-21481][ML] Add indexOf method for ml.feature.HashingTF
## What changes were proposed in this pull request?
Add indexOf method for ml.feature.HashingTF.
The PR is a hotfix
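As a rough illustration of what an indexOf-style lookup does in the hashing trick, the Python sketch below maps a term to a bucket index by hashing it and reducing modulo the number of features. The function `index_of` is hypothetical, and `zlib.crc32` is used purely as a stand-in hash. It is not the hash function Spark's HashingTF uses, so the indices will not match Spark's.

```python
import zlib


def index_of(term: str, num_features: int) -> int:
    """Map a term to a bucket in [0, num_features) via the hashing trick."""
    if num_features <= 0:
        raise ValueError("num_features must be positive")
    # crc32 is a stand-in hash for this sketch (an assumption, not Spark's).
    return zlib.crc32(term.encode("utf-8")) % num_features


idx = index_of("spark", 16)
print(0 <= idx < 16)
```

Exposing such a method lets callers find which feature index a given term landed in, which is otherwise hidden inside the transformer.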
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18523
[SPARK-21285][ML] VectorAssembler reports the column name of unsupported
data type
## What changes were proposed in this pull request?
Add the column name to the exception which is raised
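The idea of naming the offending column in the error can be sketched in Python as follows. Everything here is hypothetical: `SUPPORTED_TYPES`, `check_columns`, the schema representation (a plain dict of column name to type name) and the message wording are assumptions for illustration, not VectorAssembler's actual code.

```python
from typing import Dict

# Assumed set of acceptable type names for this sketch.
SUPPORTED_TYPES = {"double", "boolean", "vector"}


def check_columns(schema: Dict[str, str]) -> None:
    """Raise an error that names the first column with an unsupported type."""
    for name, dtype in schema.items():
        if dtype not in SUPPORTED_TYPES:
            raise ValueError(
                f"Data type {dtype} of column {name} is not supported."
            )


try:
    check_columns({"age": "double", "tags": "string"})
except ValueError as err:
    print(err)
```

Including the column name saves the user from inspecting the schema by hand when a wide input has a single bad column.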
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
@jkbradley Would you have time to review the PR? I believe that it will be a
small improvement for prediction. Thanks.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125398010
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
I don't know how to write a unit test for the PR. Is it necessary?
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/17383
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
Good idea!
GitHub user facaiy reopened a pull request:
https://github.com/apache/spark/pull/17383
[SPARK-3165][MLlib][WIP] DecisionTree does not use sparsity in data
## What changes were proposed in this pull request?
DecisionTree should take advantage of sparse feature vectors
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125584572
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18554
[SPARK-21306][ML] OneVsRest should cache weightCol if necessary
## What changes were proposed in this pull request?
cache weightCol if classifier inherits HasWeightCol trait
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125860650
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,15 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
@SparkQA Jenkins, run tests again, please.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
@SparkQA test again, please.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
I'm not familiar with R; I used grep to search for "OneVsRest" and got
nothing. Hence it seems that nothing needs to be done for the R part.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126050849
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125763918
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,15 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125539040
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
@srowen I am not sure whether I understand your question clearly.
RandomForest uses LearningNode to construct the tree model when training, and
converts the nodes to Leaf or InternalNode at the end. Hence, all
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
Fixed the failing case; please retest it.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17503#discussion_r113360409
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
---
@@ -61,6 +61,8 @@ import org.apache.spark.mllib.tree.impurity
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
I have the same question as you. I guess that Impurity info is useful for
debugging and analyzing the tree model. However, as the tree is grown from root
to leaf during training, it seems needless to merge
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r114043563
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1009,10 +1009,24 @@ private[spark] object RandomForest extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r114043568
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -138,9 +169,10 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r114043558
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1037,7 +1051,10 @@ private[spark] object RandomForest extends
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
For a (training) sample of a continuous series, say {x0, x1, x2, x3, ..., x100}.
Spark currently selects quantiles as split points.
Suppose 10-quantiles are used, and x2 is the 1st quantile, and x10 is the 2nd
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/17556#discussion_r114043439
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -112,9 +138,11 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
By the way, it's safe to use the mean value as it matches the other libraries.
If requested, I'd like to modify the PR.
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17556
OK, the weight has been removed from the calculation.
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r131529693
--- Diff: python/pyspark/ml/classification.py ---
@@ -1344,7 +1346,19 @@ def _fit(self, dataset):
numClasses = int(dataset.agg({labelCol
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r131529768
--- Diff: python/pyspark/ml/classification.py ---
@@ -1423,7 +1425,18 @@ def _fit(self, dataset):
numClasses = int(dataset.agg({labelCol
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18764
[SPARK-21306][ML] For branch 2.0, OneVsRest should support setWeightCol
The PR is related to #18554, and is modified for branch 2.0.
## What changes were proposed in this pull request
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18763
[SPARK-21306][ML] OneVsRest should support setWeightCol for branch-2.1
The PR is related to #18554, and is modified for branch 2.1.
## What changes were proposed in this pull request
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130202540
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -158,7 +158,7 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r130200288
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -33,6 +33,7 @@ import
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r130200379
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -143,6 +144,16 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130200461
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -157,6 +157,16 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130213337
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -158,7 +158,7 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Sure, thanks, @yanboliang !
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18764
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18763
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18763
Thanks, all.