Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19350
The design has changed. I will create a new PR for this later. The new design is here:
https://docs.google.com/document/d/1xw5M4sp1e0eQie75yIt-r6-GTuD5vpFf_I6v-AFBM3M/edit?usp=sharing
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19988
@srowen Wait... @jkbradley seems to have more thoughts about this:
Question:
When line search fails, does it mean the model is always meaningless?
Maybe we need more discussion
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19746#discussion_r157668450
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSizeHint.scala ---
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19988
I think we can discuss the following cases:
- When the gradient is non-zero and line search fails, will the model always be
meaningless?
- When the gradient is nearly zero and line search fails. I
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19156
Jenkins retest this please.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19950#discussion_r157922929
--- Diff:
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
@@ -187,14 +187,18 @@ class KryoSerializer(conf: SparkConf
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19950
Also, the added items cannot cover the case in `MultilayerPerceptron`. Look
at `FeedForwardTrainer.train`: the persisted, stacked `trainData` has the format
`RDD[(Double, mllib.Vector)]`. The
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19950
@cloud-fan Does it work like this: if A and B are any registered classes,
then the type Tuple2[A, B] will be automatically registered for Kryo
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19979
@MrBago @jkbradley
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19994
LGTM.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20077
[SPARK-22899][ML][STREAM] Fix OneVsRestModel transform on streaming data
failed.
## What changes were proposed in this pull request?
Fix OneVsRestModel transform on streaming data
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19843#discussion_r158692700
--- Diff: mllib/src/test/scala/org/apache/spark/ml/util/MLTest.scala ---
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20088
[SPARK-22905][ML][MLLIB][CORE] Fix ChiSqSelectorModel save implementation
## What changes were proposed in this pull request?
Currently, in `ChiSqSelectorModel`, save
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20088
Currently I cannot construct a failing test for this issue, but the future
PR (changing `RoundRobinPartitioning`) by @jiangxb1987 will trigger this bug
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19979
@jkbradley
There are two cases that can use `globalCheckFunction`:
- test statistics (such as min/max) on global transformer output
- get global result array and compare it
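The two cases above can be sketched in plain Python. This is only an illustration: `run_global_check`, `min_max_scale`, and the dict-based rows are hypothetical stand-ins, not the Scala `globalCheckFunction` API discussed in the PR.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a Spark Row: a plain dict of column name -> value.
Row = Dict[str, float]


def min_max_scale(rows: Sequence[Row]) -> List[Row]:
    """Toy 'transformer': rescale column 'x' to [0, 1] in a new column 'scaled'.

    Assumes the input holds at least two distinct values of 'x'.
    """
    xs = [r["x"] for r in rows]
    lo, hi = min(xs), max(xs)
    return [{"x": r["x"], "scaled": (r["x"] - lo) / (hi - lo)} for r in rows]


def run_global_check(rows: Sequence[Row],
                     global_check: Callable[[List[Row]], None]) -> None:
    """Hand the complete output to one check function, not row by row."""
    global_check(list(rows))


def check_statistics(rows: List[Row]) -> None:
    # Case 1: test statistics (such as min/max) over the whole output.
    scaled = [r["scaled"] for r in rows]
    assert min(scaled) == 0.0 and max(scaled) == 1.0


def check_result_array(rows: List[Row]) -> None:
    # Case 2: collect the global result array and compare it with the expected one.
    assert [r["scaled"] for r in rows] == [0.0, 0.5, 1.0]


output = min_max_scale([{"x": 2.0}, {"x": 4.0}, {"x": 6.0}])
run_global_check(output, check_statistics)
run_global_check(output, check_result_array)
```

Both checks need the whole output at once, which is why a per-row check function would not be enough for them.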
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19979
@MrBago Merged your code suggestion. Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20095#discussion_r158929523
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ---
@@ -79,7 +82,51 @@ abstract class Estimator[M <: Model[M]] exte
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20095#discussion_r158931079
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ---
@@ -79,7 +82,51 @@ abstract class Estimator[M <: Model[M]] exte
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20095#discussion_r158930992
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ---
@@ -79,7 +82,51 @@ abstract class Estimator[M <: Model[M]] exte
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20058#discussion_r158932419
--- Diff: python/pyspark/ml/base.py ---
@@ -18,13 +18,40 @@
from abc import ABCMeta, abstractmethod
import copy
+import threading
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19979
@jkbradley
> When there has been a shuffle, it is likely the Rows will not follow a
fixed order.
Agreed. But we can make sure it generates a fixed order from the last shuffle
posit
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20113
LGTM. Have you checked all the model.save ?
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20113
@zhengruifeng Good work! Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20111#discussion_r159048079
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSHSuite.scala
---
@@ -98,6 +97,21 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19979#discussion_r159061186
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/IsotonicRegressionSuite.scala
---
@@ -44,13 +41,11 @@ class IsotonicRegressionSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19979#discussion_r159061148
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala
---
@@ -89,33 +88,31 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20111#discussion_r159116537
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSHSuite.scala
---
@@ -98,6 +97,21 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20111
LGTM except a tiny issue. :)
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20121
[SPARK-22927][ML][TESTS] ML test for structured streaming: ml.classification
## What changes were proposed in this pull request?
adding Structured Streaming tests for all Models
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
I have been too busy recently to fix those failing R tests. Anyone who has spare
time can take over this PR, and I will help review. Thanks
Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/19621
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20168#discussion_r160264829
--- Diff: python/pyspark/ml/image.py ---
@@ -71,9 +88,30 @@ def ocvTypes(self):
"""
if self._o
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20168#discussion_r160265175
--- Diff: python/pyspark/ml/image.py ---
@@ -55,7 +72,7 @@ def imageSchema(self):
"""
if self._imag
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20168#discussion_r160264533
--- Diff: python/pyspark/ml/image.py ---
@@ -71,9 +88,30 @@ def ocvTypes(self):
"""
if self._o
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20209
[SPARK-23008][ML] OnehotEncoderEstimator python API
## What changes were proposed in this pull request?
OnehotEncoderEstimator python API.
## How was this patch tested
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20146#discussion_r161040537
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -331,4 +357,51 @@ class StringIndexerSuite
val
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20146#discussion_r161039325
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -33,12 +33,38 @@ class StringIndexerSuite
test
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20146#discussion_r161040131
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
@@ -249,6 +249,16 @@ object ParamValidators {
def arrayLengthGt[T
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20146
@viirya
I discussed this with @jkbradley offline. We're now busy fixing some issues (e.g.
#20238) in ML structured streaming support; it looks bad after the code freeze,
and we may not be ab
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20241
[SPARK-23008][ML][FOLLOW-UP] mark OneHotEncoder python API deprecated
## What changes were proposed in this pull request?
mark OneHotEncoder python API deprecated
## How was
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20229#discussion_r161120354
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -230,16 +231,17 @@ class RFormula @Since("1.5.0") (@Si
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/20261
[SPARK-22885][ML][TEST] ML test for StructuredStreaming: spark.ml.tuning
## What changes were proposed in this pull request?
ML test for StructuredStreaming: spark.ml.tuning
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21081
@jkbradley Will this be applied to algorithms other than clustering?
And how will sparse float features be supported
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21078
@jkbradley Updated.
I would like to split `RandomForest` and `GradientBoostedTrees`
modification into another PR because it will change many methods in them
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21081
So why not design a generic vector class, and then implement Vector[Double]
and Vector[Float] via generic specification? Then it can support everything,
whether sparse or dense
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21097#discussion_r182668410
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala
---
@@ -365,6 +365,20 @@ class GBTClassifierSuite extends
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20446
@MLnick @srowen
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/21129
[SPARK-7132][ML] Add fit with validation set to spark.ml GBT
## What changes were proposed in this pull request?
Add fit with validation set to spark.ml GBT
## How was this
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183643445
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala
---
@@ -39,21 +46,28 @@ class MulticlassMetrics @Since
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21129
Jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183647533
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183646411
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183645265
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183645675
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r183647005
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21120
I suspect this will slow down the summarizer because you add a sum
statistic internally (and this sum value may overflow).
We can directly use `count * mean` to
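The `count * mean` suggestion can be illustrated with a minimal sketch in plain Python. The `RunningSummary` class is hypothetical and is not Spark's summarizer. It only shows that a sum can be derived from statistics that are already tracked, without keeping a separate running sum.

```python
import math


class RunningSummary:
    """Hypothetical summarizer that tracks only a count and a running mean."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def add(self, x: float) -> None:
        # Incremental mean update; no running sum is stored.
        self.count += 1
        self.mean += (x - self.mean) / self.count

    def derived_sum(self) -> float:
        # Recover the sum on demand from the statistics already kept.
        return self.count * self.mean


summary = RunningSummary()
values = [1.0, 2.0, 3.0, 4.0]
for v in values:
    summary.add(v)

# The derived value can differ from a direct sum by floating-point rounding,
# so compare with a tolerance rather than exact equality.
assert math.isclose(summary.derived_sum(), sum(values))
```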
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/21163
[SPARK-24097][ML] Instruments improvements - RandomForest and
GradientBoostedTree
## What changes were proposed in this pull request?
Instruments improvements for `RandomForest` and
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184344901
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184345688
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184344777
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184342231
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184346287
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21119#discussion_r184343934
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1156,201 @@ def getKeepLastCheckpoint(self):
return self.getOrDefault
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r184566012
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -95,4 +95,95 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17086#discussion_r184584878
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/evaluation/MulticlassMetricsSuite.scala
---
@@ -55,44 +60,128 @@ class MulticlassMetricsSuite
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17086
Overall good. @jkbradley Would you mind taking a look?
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21153#discussion_r184620855
--- Diff: python/pyspark/ml/util.py ---
@@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc,
extraMetadata=None, paramMap=None
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21153#discussion_r184620777
--- Diff: python/pyspark/ml/util.py ---
@@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc,
extraMetadata=None, paramMap=None
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21153#discussion_r184626842
--- Diff: python/pyspark/ml/util.py ---
@@ -523,11 +534,29 @@ def getAndSetParams(instance, metadata):
"""
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20973#discussion_r185149879
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala ---
@@ -44,26 +43,37 @@ object PrefixSpan {
*
* @param dataset
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20973
Jenkins, test this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20261
Jenkins, test this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20973
Jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21218#discussion_r185756220
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
---
@@ -378,6 +378,7 @@ class KMeans @Since("
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21218#discussion_r185756193
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -423,6 +423,8 @@ class GaussianMixture @Since("
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20261
Jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21218#discussion_r185970925
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -423,6 +423,8 @@ class GaussianMixture @Since("
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21097#discussion_r186037589
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala
---
@@ -365,6 +365,20 @@ class GBTClassifierSuite extends
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20095#discussion_r186381507
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ---
@@ -79,7 +82,52 @@ abstract class Estimator[M <: Model[M]] exte
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/13493
LGTM!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/21265
[SPARK-24146][PySpark][ML] spark.ml parity for sequential pattern mining -
PrefixSpan: Python API
## What changes were proposed in this pull request?
spark.ml parity for sequential
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21129
Jenkins, test this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21272
LGTM!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21270
@shahidki31 It seems that what you described above is another issue. You can
create another JIRA for that. :)
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21274#discussion_r186986006
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala
---
@@ -232,7 +232,7 @@ class PowerIterationClustering
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20973#discussion_r186994754
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21274
LGTM!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21163
Jenkins, test this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17086
LGTM. @jkbradley @mengxr Would you mind taking a look?
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21129
Jenkins test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20973#discussion_r188491670
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20973#discussion_r188853310
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21163
Jenkins, test this please.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/21393
[SPARK-20114][ML][FOLLOW-UP] spark.ml parity for sequential pattern mining
- PrefixSpan
## What changes were proposed in this pull request?
Change `PrefixSpan` into a class with
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21393
@mengxr @jkbradley
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21265#discussion_r191995667
--- Diff: python/pyspark/ml/fpm.py ---
@@ -243,3 +244,75 @@ def setParams(self, minSupport=0.3, minConfidence=0.8,
itemsCol="
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21265#discussion_r191996249
--- Diff: python/pyspark/ml/fpm.py ---
@@ -243,3 +244,105 @@ def setParams(self, minSupport=0.3,
minConfidence=0.8, itemsCol="
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21265#discussion_r192000596
--- Diff: python/pyspark/ml/fpm.py ---
@@ -243,3 +244,105 @@ def setParams(self, minSupport=0.3,
minConfidence=0.8, itemsCol="
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/21265
Jenkins, test this please.
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/21493
[SPARK-15784] Add Power Iteration Clustering to spark.ml
## What changes were proposed in this pull request?
According to the discussion on JIRA, I rewrote the Power Iteration