[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569883913
 
 
   Merged to master, thanks @srowen @maropu for reviewing! Happy new year!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhengruifeng closed pull request #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
zhengruifeng closed pull request #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982
 
 
   





[GitHub] [spark] amanomer commented on issue #27052: [SPARK-30390][MLLIB] Avoid double caching in mllib.KMeans#runWithWeights.

2019-12-30 Thread GitBox
amanomer commented on issue #27052: [SPARK-30390][MLLIB] Avoid double caching 
in mllib.KMeans#runWithWeights.
URL: https://github.com/apache/spark/pull/27052#issuecomment-569882881
 
 
   cc @srowen @zhengruifeng 





[GitHub] [spark] AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add 
iterator/foreach methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569882010
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569882015
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115983/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add 
iterator/foreach methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569882015
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115983/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569882010
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #26982: [SPARK-30329][ML] add 
iterator/foreach methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865548
 
 
   **[Test build #115983 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115983/testReport)**
 for PR 26982 at commit 
[`5177fb5`](https://github.com/apache/spark/commit/5177fb5d8e3400a56f6287e256c420baca4f7820).





[GitHub] [spark] SparkQA commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
SparkQA commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569881814
 
 
   **[Test build #115983 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115983/testReport)**
 for PR 26982 at commit 
[`5177fb5`](https://github.com/apache/spark/commit/5177fb5d8e3400a56f6287e256c420baca4f7820).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569881262
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569881265
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115980/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569881262
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569881265
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115980/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
SparkQA commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569881072
 
 
   **[Test build #115980 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115980/testReport)**
 for PR 26847 at commit 
[`c391212`](https://github.com/apache/spark/commit/c391212e818bb4947276e4ee442f92963af1cedd).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #26847: [SPARK-30214][SQL]  A new framework 
to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857471
 
 
   **[Test build #115980 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115980/testReport)**
 for PR 26847 at commit 
[`c391212`](https://github.com/apache/spark/commit/c391212e818bb4947276e4ee442f92963af1cedd).





[GitHub] [spark] MaxGekk commented on a change in pull request #26559: [SPARK-29930][SQL] Remove SQL configs declared to be removed in Spark 3.0

2019-12-30 Thread GitBox
MaxGekk commented on a change in pull request #26559: [SPARK-29930][SQL] Remove 
SQL configs declared to be removed in Spark 3.0
URL: https://github.com/apache/spark/pull/26559#discussion_r362159146
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##
 @@ -720,14 +720,6 @@ object SQLConf {
 .stringConf
 .createWithDefault("_corrupt_record")
 
-  val FROM_JSON_FORCE_NULLABLE_SCHEMA = 
buildConf("spark.sql.fromJsonForceNullableSchema")
 
 Review comment:
   Sure, we could throw an exception for the 3 configs. I am just wondering why we
silently ignore nonexistent SQL configs:
   ```scala
   scala> spark.conf.set("spark.sql.abc", 1)
   
   ```
   How about throwing `AnalysisException` for nonexistent SQL configs that have
the `spark.sql` prefix but are not present in
https://github.com/apache/spark/blob/6d64fc2407e5b21a2db59c5213df438c74a31637/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L50
 ?
   
   Or are there SQL configs that we have to bypass for some reason?
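
The check proposed in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `SqlConfCheck`, `registeredKeys`, and `validate` are hypothetical names, the key set is a small stand-in for the real `SQLConf` registry, and a plain `IllegalArgumentException` is used in place of `AnalysisException` so the snippet does not depend on Spark.

```scala
// Hypothetical sketch: the names below are illustrative and are not Spark APIs.
object SqlConfCheck {
  // Stand-in for the keys registered in SQLConf (an assumption, not the real registry).
  val registeredKeys: Set[String] = Set(
    "spark.sql.shuffle.partitions",
    "spark.sql.columnNameOfCorruptRecord"
  )

  // Reject keys that carry the `spark.sql.` prefix but are not registered;
  // keys outside that prefix pass through untouched.
  def validate(key: String): Unit = {
    if (key.startsWith("spark.sql.") && !registeredKeys.contains(key)) {
      // The comment suggests AnalysisException; a plain exception keeps
      // this sketch free of Spark dependencies.
      throw new IllegalArgumentException(s"Unknown SQL config: $key")
    }
  }
}
```

Under these assumptions, `SqlConfCheck.validate("spark.sql.abc")` would throw, while a key outside the `spark.sql.` prefix would be accepted, which leaves open the question of whether some `spark.sql.*` keys must be allowed through.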





[GitHub] [spark] AmplabJenkins commented on issue #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27056: [SPARK-27217][SQL] Nested schema 
pruning with Aggregation
URL: https://github.com/apache/spark/pull/27056#issuecomment-569875748
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins removed a comment on issue #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #27056: [SPARK-27217][SQL] Nested 
schema pruning with Aggregation
URL: https://github.com/apache/spark/pull/27056#issuecomment-569874837
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] zhengruifeng commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
zhengruifeng commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo 
that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569875692
 
 
   Merged to master, thanks all!
   Happy new year!





[GitHub] [spark] zhengruifeng closed pull request #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
zhengruifeng closed pull request #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972
 
 
   





[GitHub] [spark] amanomer commented on a change in pull request #27052: [SPARK-30390][MLLIB] Avoid double caching in mllib.KMeans#runWithWeights.

2019-12-30 Thread GitBox
amanomer commented on a change in pull request #27052: [SPARK-30390][MLLIB] 
Avoid double caching in mllib.KMeans#runWithWeights.
URL: https://github.com/apache/spark/pull/27052#discussion_r362157157
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala
 ##
 @@ -232,7 +232,10 @@ class KMeans private (
 val zippedData = data.zip(norms).map { case ((v, w), norm) =>
   (new VectorWithNorm(v, norm), w)
 }
-zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+
+if (data.getStorageLevel == StorageLevel.NONE) {
+  zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+}
 
 Review comment:
   > what about caching norms if data is already cached?
   
   Won't this lead to the double caching problem that we are trying to avoid?
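
The guard in the diff above can be isolated as a small helper. This is a sketch only: `CacheOnce` and `persistIfUncached` are hypothetical names that do not appear in the pull request, and the snippet assumes Spark is on the classpath. It uses only `RDD.getStorageLevel`, `RDD.persist`, and `StorageLevel`.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical helper, not part of KMeans: persist `derived` only when the
// RDD it was built from is not already cached.
object CacheOnce {
  // Returns true when this call persisted `derived`, so the caller knows
  // whether it is responsible for unpersisting it afterwards.
  def persistIfUncached[T](parent: RDD[_], derived: RDD[T]): Boolean = {
    if (parent.getStorageLevel == StorageLevel.NONE) {
      derived.persist(StorageLevel.MEMORY_AND_DISK)
      true
    } else {
      false
    }
  }
}
```

With this shape, a caller that receives `false` leaves the derived RDD uncached, so anything computed on top of already cached input is recomputed on each pass. That is the trade-off the quoted question about caching norms points at.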





[GitHub] [spark] amanomer commented on a change in pull request #27052: [SPARK-30390][MLLIB] Avoid double caching in mllib.KMeans#runWithWeights.

2019-12-30 Thread GitBox
amanomer commented on a change in pull request #27052: [SPARK-30390][MLLIB] 
Avoid double caching in mllib.KMeans#runWithWeights.
URL: https://github.com/apache/spark/pull/27052#discussion_r362157157
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala
 ##
 @@ -232,7 +232,10 @@ class KMeans private (
 val zippedData = data.zip(norms).map { case ((v, w), norm) =>
   (new VectorWithNorm(v, norm), w)
 }
-zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+
+if (data.getStorageLevel == StorageLevel.NONE) {
+  zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+}
 
 Review comment:
   > what about caching norms if data is already cached?
   
   It will lead to the double caching problem that we are trying to avoid?





[GitHub] [spark] AmplabJenkins commented on issue #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27056: [SPARK-27217][SQL] Nested schema 
pruning with Aggregation
URL: https://github.com/apache/spark/pull/27056#issuecomment-569874837
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] amanomer commented on issue #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
amanomer commented on issue #27056: [SPARK-27217][SQL] Nested schema pruning 
with Aggregation
URL: https://github.com/apache/spark/pull/27056#issuecomment-569874706
 
 
   cc @cloud-fan 





[GitHub] [spark] amanomer removed a comment on issue #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
amanomer removed a comment on issue #27056: [SPARK-27217][SQL] Nested schema 
pruning with Aggregation
URL: https://github.com/apache/spark/pull/27056#issuecomment-569874706
 
 
   cc @cloud-fan 





[GitHub] [spark] amanomer opened a new pull request #27056: [SPARK-27217][SQL] Nested schema pruning with Aggregation

2019-12-30 Thread GitBox
amanomer opened a new pull request #27056: [SPARK-27217][SQL] Nested schema 
pruning with Aggregation
URL: https://github.com/apache/spark/pull/27056
 
 
   
   
   ### What changes were proposed in this pull request?
   Added a new rule `NestColumnAliasing.Overaggregate`, which helps push down
nested columns wrapped inside `Aggregate`.
   
   
   
   ### Why are the changes needed?
   Since Spark supports nested schema pushdown when used with `Project`
(SELECT query), we also need to support the same pushdown ability when a user
performs aggregation (such as sum) on nested columns.
   
   
   
   ### Does this PR introduce any user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Added test cases.
   
   





[GitHub] [spark] zhengruifeng commented on a change in pull request #26838: [SPARK-30144][ML][PySpark] Make MultilayerPerceptronClassificationModel extend MultilayerPerceptronParams

2019-12-30 Thread GitBox
zhengruifeng commented on a change in pull request #26838: 
[SPARK-30144][ML][PySpark] Make MultilayerPerceptronClassificationModel extend 
MultilayerPerceptronParams
URL: https://github.com/apache/spark/pull/26838#discussion_r362155650
 
 

 ##
 File path: 
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
 ##
 @@ -459,7 +459,7 @@ class StringIndexerSuite extends MLTest with 
DefaultReadWriteTest {
   }
 
   test("Load StringIndexderModel prior to Spark 3.0") {
-val modelPath = testFile("test-data/strIndexerModel")
 
 Review comment:
   strIndexerModel-2.4.4?





[GitHub] [spark] zhengruifeng commented on a change in pull request #26838: [SPARK-30144][ML][PySpark] Make MultilayerPerceptronClassificationModel extend MultilayerPerceptronParams

2019-12-30 Thread GitBox
zhengruifeng commented on a change in pull request #26838: 
[SPARK-30144][ML][PySpark] Make MultilayerPerceptronClassificationModel extend 
MultilayerPerceptronParams
URL: https://github.com/apache/spark/pull/26838#discussion_r362155666
 
 

 ##
 File path: 
mllib/src/test/scala/org/apache/spark/ml/feature/HashingTFSuite.scala
 ##
 @@ -89,7 +89,7 @@ class HashingTFSuite extends MLTest with 
DefaultReadWriteTest {
   }
 
   test("SPARK-23469: Load HashingTF prior to Spark 3.0") {
-val hashingTFPath = testFile("test-data/hashingTF-pre3.0")
+val hashingTFPath = testFile("ml-models/hashingTF-pre3.0")
 
 Review comment:
   hashingTF-2.4.4





[GitHub] [spark] 07ARB commented on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB commented on issue #27051: [SPARK-30389][SQL]Validate file type extension 
during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569872749
 
 
   I am only thinking about this point: "cases for adding something to the
classpath that isn't a JAR, like a resource file or zip file"





[GitHub] [spark] zhengruifeng closed pull request #16966: [SPARK-18409][ML]LSH approxNearestNeighbors should use approxQuantile instead of sort

2019-12-30 Thread GitBox
zhengruifeng closed pull request #16966: [SPARK-18409][ML]LSH 
approxNearestNeighbors should use approxQuantile instead of sort
URL: https://github.com/apache/spark/pull/16966
 
 
   





[GitHub] [spark] zhengruifeng commented on issue #16966: [SPARK-18409][ML]LSH approxNearestNeighbors should use approxQuantile instead of sort

2019-12-30 Thread GitBox
zhengruifeng commented on issue #16966: [SPARK-18409][ML]LSH 
approxNearestNeighbors should use approxQuantile instead of sort
URL: https://github.com/apache/spark/pull/16966#issuecomment-569872554
 
 
   resolved in https://github.com/apache/spark/pull/26415





[GitHub] [spark] zhengruifeng commented on issue #27040: [SPARK-30380][ML] Refactor RandomForest.findSplits

2019-12-30 Thread GitBox
zhengruifeng commented on issue #27040: [SPARK-30380][ML] Refactor 
RandomForest.findSplits
URL: https://github.com/apache/spark/pull/27040#issuecomment-569871975
 
 
   Merged to master





[GitHub] [spark] zhengruifeng closed pull request #27040: [SPARK-30380][ML] Refactor RandomForest.findSplits

2019-12-30 Thread GitBox
zhengruifeng closed pull request #27040: [SPARK-30380][ML] Refactor 
RandomForest.findSplits
URL: https://github.com/apache/spark/pull/27040
 
 
   





[GitHub] [spark] AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log 
weightSum in Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569871438
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115982/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26995: [SPARK-30341][SQL] Overflow 
check for interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569871344
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569871438
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115982/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log 
weightSum in Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569871434
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26995: [SPARK-30341][SQL] Overflow 
check for interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569871347
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115977/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569871434
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26995: [SPARK-30341][SQL] Overflow check for 
interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569871344
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26995: [SPARK-30341][SQL] Overflow check for 
interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569871347
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115977/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
SparkQA commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that 
has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569871313
 
 
   **[Test build #115982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115982/testReport)** for PR 26972 at commit [`7120e19`](https://github.com/apache/spark/commit/7120e195266ea9e94351a48a29f0205b72a7344a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569864866
 
 
   **[Test build #115982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115982/testReport)** for PR 26972 at commit [`7120e19`](https://github.com/apache/spark/commit/7120e195266ea9e94351a48a29f0205b72a7344a).





[GitHub] [spark] SparkQA removed a comment on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #26995: [SPARK-30341][SQL] Overflow check 
for interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569848473
 
 
   **[Test build #115977 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115977/testReport)** for PR 26995 at commit [`92e2668`](https://github.com/apache/spark/commit/92e2668c11f4d26205690d972b28b3a2d92eddf5).





[GitHub] [spark] SparkQA commented on issue #26995: [SPARK-30341][SQL] Overflow check for interval arithmetic operations

2019-12-30 Thread GitBox
SparkQA commented on issue #26995: [SPARK-30341][SQL] Overflow check for 
interval arithmetic operations
URL: https://github.com/apache/spark/pull/26995#issuecomment-569871119
 
 
   **[Test build #115977 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115977/testReport)** for PR 26995 at commit [`92e2668`](https://github.com/apache/spark/commit/92e2668c11f4d26205690d972b28b3a2d92eddf5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add 
iterator/foreach methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865666
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26982: [SPARK-30329][ML] add 
iterator/foreach methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865667
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20775/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865667
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20775/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865666
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
SparkQA commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865548
 
 
   **[Test build #115983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115983/testReport)** for PR 26982 at commit [`5177fb5`](https://github.com/apache/spark/commit/5177fb5d8e3400a56f6287e256c420baca4f7820).





[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-30 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach 
methods for Vectors
URL: https://github.com/apache/spark/pull/26982#issuecomment-569865403
 
 
   retest this please





[GitHub] [spark] AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log 
weightSum in Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569865008
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] zhengruifeng commented on a change in pull request #27052: [SPARK-30390][MLLIB] Avoid double caching in mllib.KMeans#runWithWeights.

2019-12-30 Thread GitBox
zhengruifeng commented on a change in pull request #27052: [SPARK-30390][MLLIB] 
Avoid double caching in mllib.KMeans#runWithWeights.
URL: https://github.com/apache/spark/pull/27052#discussion_r362149460
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala
 ##
 @@ -232,7 +232,10 @@ class KMeans private (
 val zippedData = data.zip(norms).map { case ((v, w), norm) =>
   (new VectorWithNorm(v, norm), w)
 }
-zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+
+if (data.getStorageLevel == StorageLevel.NONE) {
+  zippedData.persist(StorageLevel.MEMORY_AND_DISK)
+}
 
 Review comment:
   What about caching `norms` instead when `data` is already cached? Like this:
   
   ```scala
   val handlePersistence = data.getStorageLevel == StorageLevel.NONE
   val norms = ...
   val zippedData = if (handlePersistence) {
     // data is not cached: persist the zipped RDD
     data.zip(norms).map { case ((v, w), norm) =>
       (new VectorWithNorm(v, norm), w)
     }.persist(StorageLevel.MEMORY_AND_DISK)
   } else {
     // data is already cached: persist only the norms
     norms.persist(StorageLevel.MEMORY_AND_DISK)
     data.zip(norms).map { case ((v, w), norm) =>
       (new VectorWithNorm(v, norm), w)
     }
   }

   ...

   // release whichever RDD was persisted above
   if (handlePersistence) {
     zippedData.unpersist()
   } else {
     norms.unpersist()
   }
   ```





[GitHub] [spark] AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26972: [SPARK-30321][ML] Log 
weightSum in Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569865013
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20774/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569865013
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20774/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26972: [SPARK-30321][ML] Log weightSum in 
Algo that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569865008
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
SparkQA commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that 
has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569864866
 
 
   **[Test build #115982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115982/testReport)** for PR 26972 at commit [`7120e19`](https://github.com/apache/spark/commit/7120e195266ea9e94351a48a29f0205b72a7344a).





[GitHub] [spark] zhengruifeng commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo that has weights support

2019-12-30 Thread GitBox
zhengruifeng commented on issue #26972: [SPARK-30321][ML] Log weightSum in Algo 
that has weights support
URL: https://github.com/apache/spark/pull/26972#issuecomment-569864240
 
 
   retest this please





[GitHub] [spark] AmplabJenkins removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] 
Avoid to fail twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569864108
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115979/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid 
to fail twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569864103
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid 
to fail twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569864108
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115979/
   Test PASSed.





[GitHub] [spark] zhengruifeng commented on issue #27044: [SPARK-30378][ML][PySpark] Add getter/setter in Python FM

2019-12-30 Thread GitBox
zhengruifeng commented on issue #27044: [SPARK-30378][ML][PySpark] Add 
getter/setter in Python FM
URL: https://github.com/apache/spark/pull/27044#issuecomment-569864131
 
 
   Merged to master, thanks @huaxingao 





[GitHub] [spark] AmplabJenkins removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] 
Avoid to fail twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569864103
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid 
to fail twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569854102
 
 
   **[Test build #115979 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115979/testReport)** for PR 27054 at commit [`912eaf0`](https://github.com/apache/spark/commit/912eaf0a3136e33e99d3fa7975c1f8d5637cab1d).





[GitHub] [spark] zhengruifeng closed pull request #27044: [SPARK-30378][ML][PySpark] Add getter/setter in Python FM

2019-12-30 Thread GitBox
zhengruifeng closed pull request #27044: [SPARK-30378][ML][PySpark] Add 
getter/setter in Python FM
URL: https://github.com/apache/spark/pull/27044
 
 
   





[GitHub] [spark] SparkQA commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail twice in function lookup

2019-12-30 Thread GitBox
SparkQA commented on issue #27054: [SPARK-30339][SQL][branch-2.4] Avoid to fail 
twice in function lookup
URL: https://github.com/apache/spark/pull/27054#issuecomment-569864019
 
 
   **[Test build #115979 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115979/testReport)** for PR 27054 at commit [`912eaf0`](https://github.com/apache/spark/commit/912eaf0a3136e33e99d3fa7975c1f8d5637cab1d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] zhengruifeng commented on issue #27015: [SPARK-30358][ML] ML expose predictRaw and predictProbability

2019-12-30 Thread GitBox
zhengruifeng commented on issue #27015: [SPARK-30358][ML] ML expose predictRaw 
and predictProbability
URL: https://github.com/apache/spark/pull/27015#issuecomment-569863549
 
 
   Merged to master, thanks @srowen for reviewing!





[GitHub] [spark] zhengruifeng closed pull request #27015: [SPARK-30358][ML] ML expose predictRaw and predictProbability

2019-12-30 Thread GitBox
zhengruifeng closed pull request #27015: [SPARK-30358][ML] ML expose predictRaw 
and predictProbability
URL: https://github.com/apache/spark/pull/27015
 
 
   





[GitHub] [spark] AmplabJenkins removed a comment on issue #27055: [SPARK-30394]Skip DetermineTableStats rule when hive table can be converted to datasource table

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #27055: [SPARK-30394]Skip 
DetermineTableStats rule when hive table can be converted to datasource table
URL: https://github.com/apache/spark/pull/27055#issuecomment-569861748
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on issue #27055: [SPARK-30394]Skip DetermineTableStats rule when hive table can be converted to datasource table

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27055: [SPARK-30394]Skip DetermineTableStats 
rule when hive table can be converted to datasource table
URL: https://github.com/apache/spark/pull/27055#issuecomment-569861845
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on issue #27055: [SPARK-30394]Skip DetermineTableStats rule when hive table can be converted to datasource table

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #27055: [SPARK-30394]Skip DetermineTableStats 
rule when hive table can be converted to datasource table
URL: https://github.com/apache/spark/pull/27055#issuecomment-569861748
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid 
unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569861202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20773/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid 
unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569861197
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] liupc opened a new pull request #27055: [SPARK-30394]Skip DetermineTableStats rule when hive table can be converted to datasource table

2019-12-30 Thread GitBox
liupc opened a new pull request #27055: [SPARK-30394]Skip DetermineTableStats 
rule when hive table can be converted to datasource table
URL: https://github.com/apache/spark/pull/27055
 
 
   
   ### What changes were proposed in this pull request?
   This PR skips the DetermineTableStats rule when a Hive table can be 
converted to a datasource table, which avoids useless stats collection for 
these tables when `spark.sql.statistics.fallBackToHdfs` is true.
   
   ### Why are the changes needed?
   This PR can improve performance in some cases.
   
   
   ### Does this PR introduce any user-facing change?
   No
   
   
   ### How was this patch tested?
   Unit tests and tests on real clusters.
   





[GitHub] [spark] AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary 
InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569861197
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary 
InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569861202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20773/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
SparkQA commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary 
InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569861048
 
 
   **[Test build #115981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115981/testReport)** for PR 26993 at commit [`e6945e8`](https://github.com/apache/spark/commit/e6945e88a24d51551cba105b5e7e3825bc5e0a69).





[GitHub] [spark] JoshRosen commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
JoshRosen commented on a change in pull request #26993: [SPARK-30338][SQL] 
Avoid unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#discussion_r362146085
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -318,10 +318,32 @@ private[parquet] class ParquetRowConverter(
 new ParquetMapConverter(parquetType.asGroupType(), t, updater)
 
   case t: StructType =>
+val wrappedUpdater = {
 
 Review comment:
   Good idea: I added a JIRA reference in 
https://github.com/apache/spark/pull/26993/commits/e6945e88a24d51551cba105b5e7e3825bc5e0a69





[GitHub] [spark] 07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Even if you do it like this, I feel it's not correct. As per our 
documentation, we should force the end user to upload a file with the proper 
extension.
   
https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-resource-mgmt-add-jar.html
   
   ![Screenshot 2019-12-31 at 9 42 10 
AM](https://user-images.githubusercontent.com/8948111/71609865-e89c8900-2bb1-11ea-8f73-30c7545c2073.png)
   





[GitHub] [spark] 07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Even if you do it like this, I feel it's not correct as per our 
documentation; we should force the end user to upload a file with the proper 
extension.
   
https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-resource-mgmt-add-jar.html
   
   ![Screenshot 2019-12-31 at 9 42 10 
AM](https://user-images.githubusercontent.com/8948111/71609865-e89c8900-2bb1-11ea-8f73-30c7545c2073.png)
   





[GitHub] [spark] 07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Even if you do it like this, I feel it's not correct, as per our 
documentation.
   
https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-resource-mgmt-add-jar.html
   
   ![Screenshot 2019-12-31 at 9 42 10 
AM](https://user-images.githubusercontent.com/8948111/71609865-e89c8900-2bb1-11ea-8f73-30c7545c2073.png)
   





[GitHub] [spark] 07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Even if you do it like this, I feel it's not correct.





[GitHub] [spark] 07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB edited a comment on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Even if you do it like this, I feel it's not correct.





[GitHub] [spark] HyukjinKwon commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
HyukjinKwon commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary 
InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#issuecomment-569858489
 
 
   I happened to take a cursory look, and it seems pretty fine.





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter

2019-12-30 Thread GitBox
HyukjinKwon commented on a change in pull request #26993: [SPARK-30338][SQL] 
Avoid unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#discussion_r362143752
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -318,10 +318,32 @@ private[parquet] class ParquetRowConverter(
 new ParquetMapConverter(parquetType.asGroupType(), t, updater)
 
   case t: StructType =>
+val wrappedUpdater = {
 
 Review comment:
   @JoshRosen, no big deal at all but how about we put the JIRA ID somewhere in 
the comment?





[GitHub] [spark] HyukjinKwon closed pull request #27038: [SPARK-30379][Core] Avoid OOM when using collection accumulator

2019-12-30 Thread GitBox
HyukjinKwon closed pull request #27038: [SPARK-30379][Core] Avoid OOM when 
using collection accumulator
URL: https://github.com/apache/spark/pull/27038
 
 
   





[GitHub] [spark] HyukjinKwon commented on issue #27038: [SPARK-30379][Core] Avoid OOM when using collection accumulator

2019-12-30 Thread GitBox
HyukjinKwon commented on issue #27038: [SPARK-30379][Core] Avoid OOM when using 
collection accumulator
URL: https://github.com/apache/spark/pull/27038#issuecomment-569858030
 
 
   Merged to master.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857609
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857610
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20772/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857609
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857610
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20772/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
SparkQA commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569857471
 
 
   **[Test build #115980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115980/testReport)** for PR 26847 at commit [`c391212`](https://github.com/apache/spark/commit/c391212e818bb4947276e4ee442f92963af1cedd).





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569856046
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115978/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins removed a comment on issue #26847: [SPARK-30214][SQL]  A new 
framework to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569856044
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569856046
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115978/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
AmplabJenkins commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569856044
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
SparkQA removed a comment on issue #26847: [SPARK-30214][SQL]  A new framework 
to resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569849371
 
 
   **[Test build #115978 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115978/testReport)** for PR 26847 at commit [`9d10239`](https://github.com/apache/spark/commit/9d10239585e69f823cf5248196a3c74f90fe4b37).





[GitHub] [spark] SparkQA commented on issue #26847: [SPARK-30214][SQL] A new framework to resolve v2 commands

2019-12-30 Thread GitBox
SparkQA commented on issue #26847: [SPARK-30214][SQL]  A new framework to 
resolve v2 commands
URL: https://github.com/apache/spark/pull/26847#issuecomment-569856026
 
 
   **[Test build #115978 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115978/testReport)** for PR 26847 at commit [`9d10239`](https://github.com/apache/spark/commit/9d10239585e69f823cf5248196a3c74f90fe4b37).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] 07ARB commented on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
07ARB commented on issue #27051: [SPARK-30389][SQL]Validate file type extension 
during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569855412
 
 
   Then this fix is not required?





[GitHub] [spark] yaooqinn commented on issue #27051: [SPARK-30389][SQL]Validate file type extension during add jar command.

2019-12-30 Thread GitBox
yaooqinn commented on issue #27051: [SPARK-30389][SQL]Validate file type 
extension during add jar command.
URL: https://github.com/apache/spark/pull/27051#issuecomment-569854577
 
 
   The `jar` file does not seem to need a `.jar` suffix; it can be renamed to 
whatever we want, such as `example-1.0.0-SNAPSHOT.jar.123`, and still works 
fine.
   ```
   add jar hdfs://hz-cluster10/user/kyuubi/udf/example-1.0.0-SNAPSHOT.jar.123;
   create temporary function toHex5 as 'com.netease.bigdata.spark.hive.udf.ToHex';
   select toHex5(1);
   ```




