Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22788#discussion_r229354247
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22788#discussion_r229479893
--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22788#discussion_r229583440
--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22790#discussion_r228274470
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala ---
@@ -109,7 +109,7 @@ class BisectingKMeansModel private
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/21710
@felixcheung I am terribly sorry that I missed your comment about the ml doc
and example for 2.4. Is there still time to merge into 2.4? I saw one of my PRs got
merged into 2.4 last night. I can submit
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/22863
[SPARK-25859][ML]add scala/java/python example and doc for PrefixSpan
## What changes were proposed in this pull request?
add scala/java/python example and doc for PrefixSpan in branch
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22863
@felixcheung
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22788
I have a question regarding the test failure in
```ExpressionTypeCheckingSuite```. Most of the tests in this suite failed after
I changed ```UnresolvedAttribute.sql = UnresolvedAttribute.name
Github user huaxingao closed the pull request at:
https://github.com/apache/spark/pull/22863
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22863
Thanks @felixcheung
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22996
@holdenk
Yes, it is. I will include the examples in ml-clustering.md.
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/22996
add Python example code for Power Iteration Clustering in spark.ml
## What changes were proposed in this pull request?
Add python example for Power Iteration Clustering in spark.ml
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22788
@cloud-fan @dongjoon-hyun
Because of the above test failures in ```ExpressionTypeCheckingSuite```,
shall I revert to the previous change?
```
override def sql: String
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21710#discussion_r223760003
--- Diff: examples/src/main/python/ml/prefixspan_example.py ---
@@ -0,0 +1,48 @@
+#
--- End diff --
@felixcheung I don't think the doc
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22295#discussion_r221394694
--- Diff: python/pyspark/sql/session.py ---
@@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None):
or SparkSession
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22295#discussion_r223165392
--- Diff: python/pyspark/sql/session.py ---
@@ -252,6 +255,20 @@ def newSession(self):
"""
return self.__class__
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/22537
[SPARK-21291][R] add R partitionBy API in DataFrame
## What changes were proposed in this pull request?
add R partitionBy API in write.df
I didn't add bucketBy in write.df. The last
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22537
Thanks! @HyukjinKwon @felixcheung
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/22295
I just saw this fix [SPARK-25525][SQL][PYSPARK] Do not update conf for
existing SparkContext in SparkSession.getOrCreate. #22545
I will remove ```test_create_SparkContext_then_SparkSession
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22295#discussion_r221089916
--- Diff: python/pyspark/sql/session.py ---
@@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None):
or SparkSession
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23161#discussion_r237628719
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2732,13 +2732,24 @@ setMethod("union",
dataFrame(unioned)
})
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23168#discussion_r237661961
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,38 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/23168
@srowen It's not in master yet. The PR is here
https://github.com/apache/spark/pull/23072
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r237332601
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,44 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r239238873
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,44 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r239250376
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,44 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/23072
@dongjoon-hyun Thank you very much for your review. I will make the changes
soon.
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r239250335
--- Diff: R/pkg/R/mllib_clustering.R ---
@@ -610,3 +616,58 @@ setMethod("write.ml", signature(object = "LDAModel"
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/21465
@BryanCutler Thank you very much for your review! I will submit changes
soon.
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/23256
[SPARK-24207][R] follow-up PR for SPARK-24207 to fix code style problems
## What changes were proposed in this pull request?
follow-up PR for SPARK-24207 to fix code style problems
You
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21465#discussion_r239904265
--- Diff: python/pyspark/ml/param/shared.py ---
@@ -814,3 +814,25 @@ def getDistanceMeasure(self):
"""
return se
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/21465
@BryanCutler Thank you very much for your help!
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r237966508
--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/FPGrowthExample.scala ---
@@ -64,4 +64,3 @@ object FPGrowthExample {
spark.stop
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r239626824
--- Diff: R/pkg/R/mllib_clustering.R ---
@@ -610,3 +616,58 @@ setMethod("write.ml", signature(object = "LDAModel"
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r239626871
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,44 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21465#discussion_r239173098
--- Diff: python/pyspark/ml/classification.py ---
@@ -1174,9 +1165,31 @@ def trees(self):
return [DecisionTreeClassificationModel(m) for m
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/23072
retest this please
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21465#discussion_r235488017
--- Diff: python/pyspark/ml/classification.py ---
@@ -1176,8 +1176,8 @@ def trees(self):
@inherit_doc
class GBTClassifier(JavaEstimator
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/23072#discussion_r236787704
--- Diff: docs/ml-clustering.md ---
@@ -265,3 +265,44 @@ Refer to the [R API
docs](api/R/spark.gaussianMixture.html) for more details
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/23157
[SPARK-26185][PYTHON]add weightCol in python MulticlassClassificationEvaluator
## What changes were proposed in this pull request?
add weightCol for python version
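For context, a weight column lets each row count for more or less when an evaluation metric is computed. The following is only a minimal sketch in plain Python of weighted accuracy, assuming one non-negative weight per row; the function name `weighted_accuracy` is hypothetical, and this is not the code from the PR nor Spark's own implementation.

```python
from typing import Sequence


def weighted_accuracy(
    labels: Sequence[int],
    predictions: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Return the share of total weight carried by correctly predicted rows.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    if not (len(labels) == len(predictions) == len(weights)):
        raise ValueError("labels, predictions and weights must have equal length")

    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")

    # Sum the weights of the rows where the prediction matches the label.
    correct_weight = sum(
        w for y, p, w in zip(labels, predictions, weights) if y == p
    )
    return correct_weight / total_weight


if __name__ == "__main__":
    labels = [0, 1, 2, 1]
    predictions = [0, 1, 1, 1]
    # The misclassified third row carries twice the weight of the others.
    print(weighted_accuracy(labels, predictions, [1.0, 1.0, 2.0, 1.0]))
    # With unit weights the result is ordinary accuracy.
    print(weighted_accuracy(labels, predictions, [1.0, 1.0, 1.0, 1.0]))
```

With unit weights this reduces to ordinary accuracy, which is why a weight column can be treated as optional.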
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/23161
[SPARK-26189][R]Fix unionAll doc in SparkR
## What changes were proposed in this pull request?
Fix unionAll doc in SparkR
## How was this patch tested?
Manually ran
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/23168
@felixcheung Could you please review?
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/23168
[SPARK-26207][doc]add PowerIterationClustering (PIC) doc in 2.4 branch
## What changes were proposed in this pull request?
Add PIC doc in 2.4
## How was this patch tested
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/20442
retest this please
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/23072
[SPARK-19827][R]spark.ml R API for PIC
## What changes were proposed in this pull request?
Add PowerIterationClustering (PIC) in R
## How was this patch tested?
Add test case
You
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21649#discussion_r216727458
--- Diff: R/pkg/R/DataFrame.R ---
@@ -3939,7 +3929,15 @@ setMethod("hint",
signature(x = "SparkDataFrame"
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/22295#discussion_r218237306
--- Diff: python/pyspark/sql/session.py ---
@@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None):
or SparkSession
Github user huaxingao commented on the issue:
https://github.com/apache/spark/pull/21649
@felixcheung Thanks for your comments. I changed ```stopifnot```. At L3925
I could add
```
hintList <- list("hint2", "hint3", "hint4")
h