GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/22502
[SPARK-25474][SQL]Size in bytes of the query is coming in EB in case of
parquet datasource
## What changes were proposed in this pull request?
In case of CatalogFileIndex datasource table
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/22271
Thank you @jkbradley for merging.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/22271
@SparkQA Test this please
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/22271
[SPARK-25268][GraphX]runParallelPersonalizedPageRank throws serialization
Exception
## What changes were proposed in this pull request?
mapValues in scala is currently not serializable
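A minimal sketch of the serialization concern mentioned above, not the code from the pull request. It assumes plain Java serialization and uses hypothetical names (`MapValuesSerializationSketch`, `javaSerializable`, `ranks`). The strict `map` call builds a new immutable Map eagerly; the `mapValues` call returns a lazy wrapper, which on Scala 2.11/2.12 is expected not to be serializable, so its result is printed rather than asserted.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

object MapValuesSerializationSketch {
  // Returns true if Java serialization of `obj` succeeds (hypothetical helper).
  def javaSerializable(obj: AnyRef): Boolean =
    Try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(obj) finally out.close()
    }.isSuccess

  def main(args: Array[String]): Unit = {
    // Illustrative data only; not taken from the pull request.
    val ranks = Map(1L -> 0.5, 2L -> 0.25)

    // Strict transformation: a new immutable Map is built eagerly.
    val strict = ranks.map { case (id, rank) => id -> rank * 2 }
    println(s"strict map serializable: ${javaSerializable(strict)}")

    // Lazy wrapper over the original map; the outcome depends on the Scala version.
    val lazyView = ranks.mapValues(_ * 2)
    println(s"mapValues result serializable: ${javaSerializable(lazyView)}")
  }
}
```

Replacing `mapValues` with a strict `map` is one common workaround; whether the pull request uses it is not stated in the snippet above.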
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21842
[Minor][ML]Added UT for checking maximum number of features for
GeneralizedLinearRegression and WeightedLeastSquares
Currently in the GeneralizedLinearRegression and WeightedLeastSquares
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21740
Thanks @srowen. Yes, my JIRA handle is "shahid".
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21740
Hi @srowen. The build has passed.
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21740#discussion_r202547781
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -75,11 +75,29 @@ class
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21740#discussion_r202547705
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -75,11 +75,29 @@ class
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21740#discussion_r202534469
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -75,10 +75,22 @@ class
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21740#discussion_r202534384
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModelSuite.scala
---
@@ -72,6 +72,22 @@ class
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21740
@jianran please refer to the PR https://github.com/apache/spark/pull/15809.
In this PR, I am checking whether 'userFeatures.lookup(user)' is empty
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21740
[SPARK-18230][MLLib]Throw better exception,for a non-existing user/product
When invoking MatrixFactorizationModel.recommendProducts(Int, Int) with a
non-existing user
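A minimal sketch of the kind of check described here, assuming a plain Scala `Map` as a stand-in for the model's user features. It does not use the Spark API; `MissingUserCheckSketch`, `featuresFor`, and the message text are hypothetical. The idea is to fail early with a descriptive message when the lookup comes back empty, rather than let a less informative error surface later.

```scala
object MissingUserCheckSketch {
  // Hypothetical stand-in for a user-feature lookup; not MatrixFactorizationModel itself.
  def featuresFor(userFeatures: Map[Int, Array[Double]], user: Int): Array[Double] = {
    val found = userFeatures.get(user)
    // Fail with a clear message if the user id is unknown.
    require(found.nonEmpty, s"userId: $user not found in the model")
    found.get
  }

  def main(args: Array[String]): Unit = {
    // Illustrative data only.
    val userFeatures = Map(1 -> Array(0.1, 0.2), 2 -> Array(0.3, 0.4))
    println(featuresFor(userFeatures, 1).mkString(", "))
    // An unknown id raises IllegalArgumentException carrying the message above.
    println(scala.util.Try(featuresFor(userFeatures, 99)))
  }
}
```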
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21689
Thank you @srowen for merging.
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21689#discussion_r200136002
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/PowerIterationClusteringSuite.scala
---
@@ -76,23 +78,31 @@ class
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21689
Hi, @srowen . I have modified the code based on your suggestions.
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21689#discussion_r200040108
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/PowerIterationClusteringSuite.scala
---
@@ -76,23 +78,25 @@ class
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21689#discussion_r200039891
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/PowerIterationClusteringSuite.scala
---
@@ -76,23 +78,25 @@ class
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21689
Minor correction in the powerIterationSuite
## What changes were proposed in this pull request?
Currently the power iteration clustering test in ml maps the results to the
labels 0
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21627
Jenkins, test this please.
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21627
[SPARK-24484][MLLIB]Power Iteration Clustering is giving incorrect
clustering results when there are multiple leading eigenvalues.
## What changes were proposed in this pull request
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21277
Closing the PR due to the discussions in the JIRA,
https://issues.apache.org/jira/browse/SPARK-15784 and the PR
https://github.com/apache/spark/pull/21493
Github user shahidki31 closed the pull request at:
https://github.com/apache/spark/pull/21277
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21509#discussion_r194144115
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala
---
@@ -166,6 +166,7 @@ class PowerIterationClustering
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21283
Thanks @srowen
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21248
Thanks @srowen
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21509
Check for invalid input type of weight data in ml.PowerIterationClustering
## What changes were proposed in this pull request?
The test case will result in the following failure. Currently
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21248
Hi @hhbyyh @srowen ,
PowerIterationClustering API has some modifications based on the discussion
in the JIRA, https://issues.apache.org/jira/browse/SPARK-15784.
The examples also have
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21283
Hi @srowen , Thanks for the comment. I have modified the code.
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21283
Hi @hhbyyh @srowen ,
PowerIterationClustering API has some modifications based on the
discussion in the JIRA, https://issues.apache.org/jira/browse/SPARK-15784.
The examples also
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21248
Thank you @hhbyyh for the review. I have created a new dataset for the
example code, instead of a function-generated dataset. I have addressed your
review comments
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21248#discussion_r189704948
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/PowerIterationClusteringExample.scala
---
@@ -0,0 +1,114 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21248#discussion_r189704384
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/PowerIterationClusteringExample.scala
---
@@ -0,0 +1,114 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21248#discussion_r189704094
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/PowerIterationClusteringExample.scala
---
@@ -0,0 +1,114 @@
+/*
+ * Licensed
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21283
Thanks @hhbyyh for the review comments. I have modified the example code
based on the review.
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21283#discussion_r189693083
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java
---
@@ -0,0 +1,85 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21283#discussion_r189693033
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java
---
@@ -0,0 +1,85 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21283#discussion_r189692779
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java
---
@@ -0,0 +1,85 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21283#discussion_r189692700
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java
---
@@ -0,0 +1,85 @@
+/*
+ * Licensed
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21283#discussion_r189692625
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java
---
@@ -0,0 +1,85 @@
+/*
+ * Licensed
GitHub user shahidki31 reopened a pull request:
https://github.com/apache/spark/pull/21277
[SPARK-24217][ML]Power Iteration Clustering is not displaying cluster
indices corresponding to some vertices
## What changes were proposed in this pull request?
1) Currently PIC
Github user shahidki31 closed the pull request at:
https://github.com/apache/spark/pull/21270
Github user shahidki31 closed the pull request at:
https://github.com/apache/spark/pull/21277
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21277
Based on the comments in the JIRA
(https://issues.apache.org/jira/browse/SPARK-24217), I am closing the issue
Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21274#discussion_r187234165
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala
---
@@ -231,8 +231,12 @@ class PowerIterationClustering
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21283
Java example code for Power Iteration Clustering in spark.ml
## What changes were proposed in this pull request?
Java example code for Power Iteration Clustering in spark.ml
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21270
@WeichenXu123 Thanks for the comment. I have created another Jira and I
have raised a PR for that. That PR will fix this issue as well. Can you please
review the PR?
Jira : https
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21277
[ML]Power Iteration Clustering is not displaying cluster indices
corresponding to some nodes.
## What changes were proposed in this pull request?
1) Currently PIC in ML displays cluster
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21270
Thank you @jkbradly. Actually, there is one more issue. Currently we are
skipping some of the nodes which are not present in the ID column but are
present in the neighboring column. Spark MLLib
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21270
Power Iteration Clustering in SparkML throws exception, when the ID is
IntType
While running the following code, PIC throws exception.
```
val data = spark.createDataFrame(Seq
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/21248
cc @mengxr @WeichenXu123 @felixcheung. Can you please verify this patch?
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/21248
Example code for Power Iteration Clustering
## What changes were proposed in this pull request?
Added example code for Power Iteration Clustering in Spark ML examples
## How