[GitHub] [spark] SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606417460 **[Test build #120627 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120627/testReport)** for PR 28078 at commit [`ee2248f`](https://github.com/apache/spark/commit/ee2248f769ad4ee5e7dbac383c2f4f4a63512a76). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398596 **[Test build #120627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120627/testReport)** for PR 28078 at commit [`ee2248f`](https://github.com/apache/spark/commit/ee2248f769ad4ee5e7dbac383c2f4f4a63512a76).
[GitHub] [spark] AmplabJenkins removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
AmplabJenkins removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606414615 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
AmplabJenkins commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606414623 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120619/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
AmplabJenkins commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606414615 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
AmplabJenkins removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606414623 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120619/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
SparkQA removed a comment on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606338420 **[Test build #120619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120619/testReport)** for PR 28071 at commit [`5768671`](https://github.com/apache/spark/commit/5768671f4e6f33b4bd6a21ea586657d78fcb8b86).
[GitHub] [spark] SparkQA commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
SparkQA commented on issue #28071: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound URL: https://github.com/apache/spark/pull/28071#issuecomment-606414095 **[Test build #120619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120619/testReport)** for PR 28071 at commit [`5768671`](https://github.com/apache/spark/commit/5768671f4e6f33b4bd6a21ea586657d78fcb8b86). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606412644 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606412650 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120626/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606412650 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120626/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606412644 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606394469 **[Test build #120626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120626/testReport)** for PR 28078 at commit [`9e708d6`](https://github.com/apache/spark/commit/9e708d6f5f12c98c1cf37dea4bd4dea73a6dafb9).
[GitHub] [spark] SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606412365 **[Test build #120626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120626/testReport)** for PR 28078 at commit [`9e708d6`](https://github.com/apache/spark/commit/9e708d6f5f12c98c1cf37dea4bd4dea73a6dafb9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606412176

> Sure, you can reopen this when you have a valid use case and a unit test case, @stczwd.

Thanks, @dongjoon-hyun @HyukjinKwon @cloud-fan
[GitHub] [spark] viirya commented on a change in pull request #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
viirya commented on a change in pull request #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
URL: https://github.com/apache/spark/pull/28077#discussion_r400650242

## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

    @@ -474,10 +474,12 @@ private[spark] class SparkSubmit extends Logging {
             args.mainClass = "org.apache.spark.deploy.PythonRunner"
             args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs
           }
    -      if (clusterManager != YARN) {
    -        // The YARN backend handles python files differently, so don't merge the lists.
    -        args.files = mergeFileLists(args.files, args.pyFiles)
    -      }
    +    }
    +
    +    // Non-PySpark applications will also need Python dependencies.
    +    if (deployMode == CLIENT && clusterManager != YARN) {

Review comment: No, I don't see we merge it for cluster deploy mode.
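As a rough illustration of the condition under review, the following is a minimal, self-contained Scala sketch. It is not the SparkSubmit code: the object `PyFilesMergeSketch`, the `effectiveFiles` helper, and the simplified `ClusterManager` and `DeployMode` types are hypothetical, and the `mergeFileLists` shown here is an assumed stand-in for the helper named in the diff, whose real implementation is not part of this thread. The sketch merges the Python file list into the general file list only in client deploy mode on a non-YARN cluster manager, as the added `if` line suggests.

```scala
// Hypothetical, simplified model of the condition in the diff above.
// None of these names are taken from Spark itself except mergeFileLists,
// and its body here is an assumption made for illustration only.
object PyFilesMergeSketch {
  sealed trait ClusterManager
  case object Yarn extends ClusterManager
  case object Standalone extends ClusterManager

  sealed trait DeployMode
  case object Client extends DeployMode
  case object Cluster extends DeployMode

  // Joins comma-separated lists, skipping null or blank inputs and empty entries.
  def mergeFileLists(lists: String*): Option[String] = {
    val merged = lists
      .filter(s => s != null && s.trim.nonEmpty)
      .flatMap(_.split(",").toSeq.map(_.trim).filter(_.nonEmpty))
    if (merged.nonEmpty) Some(merged.mkString(",")) else None
  }

  // Python files are added to the file list only for client mode on non-YARN;
  // in every other case the file list is passed through unchanged.
  def effectiveFiles(
      files: String,
      pyFiles: String,
      deployMode: DeployMode,
      clusterManager: ClusterManager): Option[String] = {
    if (deployMode == Client && clusterManager != Yarn) mergeFileLists(files, pyFiles)
    else mergeFileLists(files)
  }
}
```

Under these assumptions, `effectiveFiles("a.txt", "dep.py", Client, Standalone)` yields the merged list, while the YARN and cluster-mode cases leave `files` as it was, which matches the reviewer's remark that no merge happens for cluster deploy mode.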
[GitHub] [spark] AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606399021 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606399021 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606399024 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25330/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398962 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25329/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606399024 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25330/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398956 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398956 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398962 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25329/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606398596 **[Test build #120627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120627/testReport)** for PR 28078 at commit [`ee2248f`](https://github.com/apache/spark/commit/ee2248f769ad4ee5e7dbac383c2f4f4a63512a76).
[GitHub] [spark] SparkQA commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications
SparkQA commented on issue #28077: [SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606398594 **[Test build #120628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120628/testReport)** for PR 28077 at commit [`8b5edef`](https://github.com/apache/spark/commit/8b5edefe2ba6c4ab04a00ccb865247e93794ea4c).
[GitHub] [spark] viirya commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
viirya commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
URL: https://github.com/apache/spark/pull/28077#discussion_r400641643

## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

    @@ -474,10 +474,12 @@ private[spark] class SparkSubmit extends Logging {
             args.mainClass = "org.apache.spark.deploy.PythonRunner"
             args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs
           }
    -      if (clusterManager != YARN) {
    -        // The YARN backend handles python files differently, so don't merge the lists.
    -        args.files = mergeFileLists(args.files, args.pyFiles)
    -      }
    +    }
    +
    +    // Non-PySpark applications will also need Python dependencies.

Review comment: fixed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
AmplabJenkins removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606398014 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
AmplabJenkins commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606398019 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120623/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
AmplabJenkins removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606398019 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120623/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
AmplabJenkins commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606398014 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
SparkQA commented on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606397470 **[Test build #120623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120623/testReport)** for PR 28049 at commit [`de1df1e`](https://github.com/apache/spark/commit/de1df1e8854d33be7859ea4e08ac1fae499c20ac). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config
SparkQA removed a comment on issue #28049: [SPARK-31285][CORE] uppercase schedule mode string at config URL: https://github.com/apache/spark/pull/28049#issuecomment-606352230 **[Test build #120623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120623/testReport)** for PR 28049 at commit [`de1df1e`](https://github.com/apache/spark/commit/de1df1e8854d33be7859ea4e08ac1fae499c20ac).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606396478 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120621/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
SparkQA commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606396308 **[Test build #120621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120621/testReport)** for PR 26935 at commit [`895fe06`](https://github.com/apache/spark/commit/895fe068bd3b32ed70ef84cc68e3352306099214). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606396473 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins commented on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606396478 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120621/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606396473 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
SparkQA removed a comment on issue #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore URL: https://github.com/apache/spark/pull/26935#issuecomment-606343418 **[Test build #120621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120621/testReport)** for PR 26935 at commit [`895fe06`](https://github.com/apache/spark/commit/895fe068bd3b32ed70ef84cc68e3352306099214).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
URL: https://github.com/apache/spark/pull/27836#discussion_r400638874

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ##

@@ -796,3 +797,64 @@ case class SchemaOfJson(
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function which returns all the keys of outer JSON object.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(json_object) - returns all the keys of outer JSON object.",
+  arguments = """
+    Arguments:
+      * json_object - A JSON object. If it is an invalid string, the function returns null.
+          If it is a JSON array or null, a runtime exception will be thrown.
+  """,
+  examples = """
+    Examples:
+      > Select _FUNC_('{}');
+        []
+      > Select _FUNC_('{"key": "value"}');
+        ["key"]
+      > Select _FUNC_('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
+        ["f1","f2"]
+  """,
+  since = "3.1.0")
+case class JsonObjectKeys(child: Expression) extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = ArrayType(StringType)
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_object_keys"
+
+  override def eval(input: InternalRow): Any = {
+    try {
+      val json = child.eval(input).asInstanceOf[UTF8String]
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => getJsonKeys(parser, input)
+      }
+    } catch {
+      case _: JsonProcessingException => null
+    }

Review comment:
I checked.
```
scala> sql("select json_object_keys(null)").show
java.lang.NullPointerException

scala> sql("select json_object_keys(1)").show
java.lang.ClassCastException: class java.lang.Integer cannot be cast to class org.apache.spark.unsafe.types.UTF8String (java.lang.Integer is in module java.base of loader 'bootstrap'; org.apache.spark.unsafe.types.UTF8String is in unnamed module of loader 'app')
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606392826 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25328/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
SparkQA commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606394469 **[Test build #120626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120626/testReport)** for PR 28078 at commit [`9e708d6`](https://github.com/apache/spark/commit/9e708d6f5f12c98c1cf37dea4bd4dea73a6dafb9).
[GitHub] [spark] AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins removed a comment on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606392821 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
URL: https://github.com/apache/spark/pull/27836#discussion_r400637284

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ##

@@ -796,3 +797,64 @@ case class SchemaOfJson(
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function which returns all the keys of outer JSON object.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(json_object) - returns all the keys of outer JSON object.",
+  arguments = """
+    Arguments:
+      * json_object - A JSON object. If it is an invalid string, the function returns null.
+          If it is a JSON array or null, a runtime exception will be thrown.

Review comment:
Is there a reason why you chose a runtime exception for `null`? In the SQL world, a `null` input is expected to return `null` rather than throw a runtime exception.
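The null-in, null-out convention raised in the review comment above can be illustrated without Spark. This is only a minimal sketch, not the PR's code: `NullInNullOut`, `nullSafeKeys`, and the `extractKeys` parameter are hypothetical names, and the key extractor is passed in as a function so the example does not depend on any JSON library.

```scala
object NullInNullOut {
  // Hypothetical helper: run `extractKeys` only when the input is non-null,
  // so a null argument yields a null result instead of an exception.
  def nullSafeKeys(json: String)(extractKeys: String => Seq[String]): Seq[String] =
    if (json == null) null else extractKeys(json)
}
```

With this shape, the extractor never sees a null, so it cannot raise a `NullPointerException` for that input.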
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606392821 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
AmplabJenkins commented on issue #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML URL: https://github.com/apache/spark/pull/28078#issuecomment-606392826 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25328/ Test PASSed.
[GitHub] [spark] zhengruifeng opened a new pull request #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
zhengruifeng opened a new pull request #28078: [SPARK-31309][ML] Migrate the ChiSquareTest from MLlib to ML
URL: https://github.com/apache/spark/pull/28078

### What changes were proposed in this pull request?
1. Move the impl of ChiSq from .mllib to the .ml side;
2. In `.mllib.ChiSqTest`, call the impl in `.ml.ChiSquareTest`.

### Why are the changes needed?
We should migrate the algs from MLlib to ML.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing test suites.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
URL: https://github.com/apache/spark/pull/27836#discussion_r400636724

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala ##

@@ -791,4 +791,42 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
       checkDecimalInfer(_, """struct""")
     }
   }
+
+  test("json_object_keys") {
+    val null_object = ""
+    val empty_json_object = """{}"""
+    val simple_json_object = """{"key": 1}"""
+    val another_simple_json_object = """{"key": "value", "key2": 2}"""
+    val json_object_with_array = """{"arrayKey": [1, 2, 3]}"""
+    val another_json_object_with_array = """{"key":[1,2,3,{"key":"value"},[1,2,3]]}"""
+    val complex_json_object = """{"f1":"abc","f2":{"f3":"a", "f4":"b"}}"""
+    val another_complex_json_object = """{"k1": [1, 2, {"key": 5}], "k2": {"key2": [1, 2]}}"""
+    val empty_json_array = """[]"""
+    val invalid_json_object = """{[1,2]}"""
+    val another_invalid_json_object = """{"key": 45, "random_string"}"""
+
+    checkEvaluation(JsonObjectKeys(Literal(empty_json_object)), Seq.empty[UTF8String])
+    checkEvaluation(JsonObjectKeys(Literal(simple_json_object)), Seq("key"))
+    checkEvaluation(JsonObjectKeys(Literal(another_simple_json_object)), Seq("key", "key2"))
+    checkEvaluation(JsonObjectKeys(Literal(json_object_with_array)), Seq("arrayKey"))
+    checkEvaluation(JsonObjectKeys(Literal(another_json_object_with_array)), Seq("key"))
+    checkEvaluation(JsonObjectKeys(Literal(complex_json_object)), Seq("f1", "f2"))
+    checkEvaluation(JsonObjectKeys(Literal(another_complex_json_object)), Seq("k1", "k2"))
+    checkEvaluation(JsonObjectKeys(Literal(invalid_json_object)), null)
+    checkEvaluation(JsonObjectKeys(Literal(another_invalid_json_object)), null)
+
+    val exception = intercept[TestFailedException] {
+      checkEvaluation(JsonObjectKeys(Literal(null_object)), null)
+    }.getCause
+
+    assert(exception.isInstanceOf[IllegalArgumentException])

Review comment:
ditto. This should be checked at 818.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
URL: https://github.com/apache/spark/pull/27836#discussion_r400636651

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala ##

@@ -791,4 +791,42 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
       checkDecimalInfer(_, """struct""")
     }
   }
+
+  test("json_object_keys") {
+    val null_object = ""
+    val empty_json_object = """{}"""
+    val simple_json_object = """{"key": 1}"""
+    val another_simple_json_object = """{"key": "value", "key2": 2}"""
+    val json_object_with_array = """{"arrayKey": [1, 2, 3]}"""
+    val another_json_object_with_array = """{"key":[1,2,3,{"key":"value"},[1,2,3]]}"""
+    val complex_json_object = """{"f1":"abc","f2":{"f3":"a", "f4":"b"}}"""
+    val another_complex_json_object = """{"k1": [1, 2, {"key": 5}], "k2": {"key2": [1, 2]}}"""
+    val empty_json_array = """[]"""
+    val invalid_json_object = """{[1,2]}"""
+    val another_invalid_json_object = """{"key": 45, "random_string"}"""

Review comment:
Please use the simpler pattern which I commented as an example in your sister PR.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
dongjoon-hyun commented on a change in pull request #27836: [SPARK-31009][SQL] Support json_object_keys function
URL: https://github.com/apache/spark/pull/27836#discussion_r400636486

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ##

@@ -796,3 +797,64 @@ case class SchemaOfJson(
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function which returns all the keys of outer JSON object.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(json_object) - returns all the keys of outer JSON object.",
+  arguments = """
+    Arguments:
+      * json_object - A JSON object. If it is an invalid string, the function returns null.
+          If it is a JSON array or null, a runtime exception will be thrown.
+  """,
+  examples = """
+    Examples:
+      > Select _FUNC_('{}');
+        []
+      > Select _FUNC_('{"key": "value"}');
+        ["key"]
+      > Select _FUNC_('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
+        ["f1","f2"]
+  """,
+  since = "3.1.0")
+case class JsonObjectKeys(child: Expression) extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = ArrayType(StringType)
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_object_keys"
+
+  override def eval(input: InternalRow): Any = {
+    try {
+      val json = child.eval(input).asInstanceOf[UTF8String]
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => getJsonKeys(parser, input)
+      }
+    } catch {
+      case _: JsonProcessingException => null
+    }

Review comment:
Apparently, the current implementation seems to have the same limitations: `NullPointerException` and `ClassCastException`. Please add more unit test cases with `null` and `Integer` input.
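The two failure modes named in the comment above, a `null` input and a non-string input such as an `Integer`, can be told apart before any cast or parsing happens. The following is a minimal sketch that assumes nothing about Spark's own type-checking machinery; `InputGuard`, `Outcome`, and its three cases are hypothetical names used only for illustration.

```scala
object InputGuard {
  // Hypothetical result type describing what a guarded evaluation would do.
  sealed trait Outcome
  case object NullResult extends Outcome                       // null in, null out
  final case class TypeMismatch(found: String) extends Outcome // reject non-strings
  final case class Accepted(json: String) extends Outcome      // safe to parse

  // Classify a raw input value without an unchecked cast, so neither a
  // NullPointerException nor a ClassCastException can escape from here.
  def classify(value: Any): Outcome = value match {
    case null      => NullResult
    case s: String => Accepted(s)
    case other     => TypeMismatch(other.getClass.getName)
  }
}
```

Unit tests for the `null` and `Integer` cases then reduce to asserting on the classified outcome rather than intercepting exceptions.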
[GitHub] [spark] wangyum commented on issue #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled
wangyum commented on issue #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled URL: https://github.com/apache/spark/pull/27986#issuecomment-606391972 cc @cloud-fan @HyukjinKwon @dongjoon-hyun @maryannxue Do we need this change to make `DISTRIBUTE BY`/`GROUP BY` partitioned by the same partition number?
[GitHub] [spark] wangyum removed a comment on issue #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled
wangyum removed a comment on issue #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled URL: https://github.com/apache/spark/pull/27986#issuecomment-602942046 cc @cloud-fan
[GitHub] [spark] AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606391003 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
AmplabJenkins removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606391005 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120622/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606391005 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120622/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
AmplabJenkins commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606391003 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
SparkQA removed a comment on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606349919 **[Test build #120622 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120622/testReport)** for PR 28077 at commit [`a892907`](https://github.com/apache/spark/commit/a892907b8216a9c0934cf1cc570ddaebe707f992).
[GitHub] [spark] SparkQA commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
SparkQA commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606390509 **[Test build #120622 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120622/testReport)** for PR 28077 at commit [`a892907`](https://github.com/apache/spark/commit/a892907b8216a9c0934cf1cc570ddaebe707f992). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on issue #28076: [WIP][SQL] Benchmark dates/timestamps rebasing in ORC datasource
dongjoon-hyun commented on issue #28076: [WIP][SQL] Benchmark dates/timestamps rebasing in ORC datasource URL: https://github.com/apache/spark/pull/28076#issuecomment-606390246 Could you file a JIRA before making a PR? You can use GitHub's `DRAFT` PR feature for `WIP`.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
dongjoon-hyun commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
URL: https://github.com/apache/spark/pull/28077#discussion_r400634191

## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ##

@@ -474,10 +474,12 @@ private[spark] class SparkSubmit extends Logging {
         args.mainClass = "org.apache.spark.deploy.PythonRunner"
         args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs
       }
-      if (clusterManager != YARN) {
-        // The YARN backend handles python files differently, so don't merge the lists.
-        args.files = mergeFileLists(args.files, args.pyFiles)
-      }
+    }
+
+    // Non-PySpark applications will also need Python dependencies.
+    if (deployMode == CLIENT && clusterManager != YARN) {

Review comment:
Just a question. Did we `mergeFileLists` for `deployMode != CLIENT` already?
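The guard discussed in the review question above can be modelled in isolation. This is a rough sketch under stated assumptions, not Spark's code: `FileListSketch`, `mergeLists`, and `filesFor` are hypothetical names, deploy mode and cluster manager are modelled as plain strings, and the body of the new `if` (which the quoted hunk does not show) is assumed to perform the same merge as the removed lines.

```scala
object FileListSketch {
  // Simplified stand-in for a comma-separated list merge: skips null or blank
  // inputs, splits on commas, and returns null when nothing remains.
  def mergeLists(lists: String*): String = {
    val merged = lists
      .filter(l => l != null && l.trim.nonEmpty)
      .flatMap(l => l.split(",").map(_.trim).filter(_.nonEmpty).toSeq)
    if (merged.nonEmpty) merged.mkString(",") else null
  }

  // Hypothetical model of the guard in the diff: merge only in client mode on
  // a non-YARN cluster manager; otherwise leave `files` untouched.
  def filesFor(deployMode: String, clusterManager: String,
               files: String, pyFiles: String): String =
    if (deployMode == "client" && clusterManager != "yarn") mergeLists(files, pyFiles)
    else files
}
```

Under this model, any deploy mode other than client leaves the file list unmerged at this point, which is the case the reviewer is asking about.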
[GitHub] [spark] dongjoon-hyun commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
dongjoon-hyun commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606388764 Hi, @viirya. This PR title looks too broad. Could you be more specific by excluding the scope of SPARK-24377? > [SPARK-24377][Spark Submit] make --py-files work in non pyspark application
[GitHub] [spark] wangyum closed pull request #28068: [MINOR][CORE] Remove two unused variables in LiveListenerBus
wangyum closed pull request #28068: [MINOR][CORE] Remove two unused variables in LiveListenerBus URL: https://github.com/apache/spark/pull/28068
[GitHub] [spark] AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606386421 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120617/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606386416 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606386416 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606386421 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120617/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
SparkQA removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606317173 **[Test build #120617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120617/testReport)** for PR 27066 at commit [`e84a86d`](https://github.com/apache/spark/commit/e84a86d984fe5376aa2144c5096c7593cb322d15).
[GitHub] [spark] SparkQA commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
SparkQA commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606386084 **[Test build #120617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120617/testReport)** for PR 27066 at commit [`e84a86d`](https://github.com/apache/spark/commit/e84a86d984fe5376aa2144c5096c7593cb322d15). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context URL: https://github.com/apache/spark/pull/28048#issuecomment-606385501 Sure, you can reopen this when you have a valid use case and a unit test case, @stczwd .
[GitHub] [spark] HyukjinKwon closed pull request #28042: [SPARK-31279][SQL][DOC] Add version information to the configuration of Hive
HyukjinKwon closed pull request #28042: [SPARK-31279][SQL][DOC] Add version information to the configuration of Hive URL: https://github.com/apache/spark/pull/28042
[GitHub] [spark] HyukjinKwon closed pull request #28064: [SPARK-31295][DOC] Supplement version for configuration appear in doc
HyukjinKwon closed pull request #28064: [SPARK-31295][DOC] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28064
[GitHub] [spark] HyukjinKwon closed pull request #28035: [SPARK-31269][DOC] Supplement version for configuration only appear in configuration doc
HyukjinKwon closed pull request #28035: [SPARK-31269][DOC] Supplement version for configuration only appear in configuration doc URL: https://github.com/apache/spark/pull/28035
[GitHub] [spark] HyukjinKwon closed pull request #28044: [SPARK-31282][DOC] Supplement version for configuration appear in security doc
HyukjinKwon closed pull request #28044: [SPARK-31282][DOC] Supplement version for configuration appear in security doc URL: https://github.com/apache/spark/pull/28044
[GitHub] [spark] HyukjinKwon commented on issue #28064: [SPARK-31295][DOC] Supplement version for configuration appear in doc
HyukjinKwon commented on issue #28064: [SPARK-31295][DOC] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28064#issuecomment-606378104 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
HyukjinKwon closed pull request #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981
[GitHub] [spark] HyukjinKwon commented on issue #28044: [SPARK-31282][DOC] Supplement version for configuration appear in security doc
HyukjinKwon commented on issue #28044: [SPARK-31282][DOC] Supplement version for configuration appear in security doc URL: https://github.com/apache/spark/pull/28044#issuecomment-606378091 Merged to master.
[GitHub] [spark] HyukjinKwon commented on issue #28042: [SPARK-31279][SQL][DOC] Add version information to the configuration of Hive
HyukjinKwon commented on issue #28042: [SPARK-31279][SQL][DOC] Add version information to the configuration of Hive URL: https://github.com/apache/spark/pull/28042#issuecomment-606378079 Merged to master.
[GitHub] [spark] HyukjinKwon commented on issue #28035: [SPARK-31269][DOC] Supplement version for configuration only appear in configuration doc
HyukjinKwon commented on issue #28035: [SPARK-31269][DOC] Supplement version for configuration only appear in configuration doc URL: https://github.com/apache/spark/pull/28035#issuecomment-606378064 Merged to master.
[GitHub] [spark] HyukjinKwon commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
HyukjinKwon commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-606378046 Merged to master.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
AmplabJenkins removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606375737 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120620/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
AmplabJenkins commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606375730 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
SparkQA commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606375628 **[Test build #120620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120620/testReport)** for PR 28026 at commit [`12a17fc`](https://github.com/apache/spark/commit/12a17fcf2189796741c3e11bab2f98c1bf996c03). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
AmplabJenkins removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606375730 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
AmplabJenkins commented on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606375737 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120620/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP)
SparkQA removed a comment on issue #28026: [SPARK-31257][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-606343429 **[Test build #120620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120620/testReport)** for PR 28026 at commit [`12a17fc`](https://github.com/apache/spark/commit/12a17fcf2189796741c3e11bab2f98c1bf996c03).
[GitHub] [spark] HeartSaVioR commented on a change in pull request #27025: [SPARK-26560][SQL] Spark should be able to run Hive UDF using jar regardless of current thread context classloader
HeartSaVioR commented on a change in pull request #27025: [SPARK-26560][SQL] Spark should be able to run Hive UDF using jar regardless of current thread context classloader URL: https://github.com/apache/spark/pull/27025#discussion_r400618527 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ## @@ -66,49 +66,52 @@ private[sql] class HiveSessionCatalog( name: String, clazz: Class[_], input: Seq[Expression]): Expression = { - -Try(super.makeFunctionExpression(name, clazz, input)).getOrElse { - var udfExpr: Option[Expression] = None - try { -// When we instantiate hive UDF wrapper class, we may throw exception if the input -// expressions don't satisfy the hive UDF, such as type mismatch, input number -// mismatch, etc. Here we catch the exception and throw AnalysisException instead. -if (classOf[UDF].isAssignableFrom(clazz)) { - udfExpr = Some(HiveSimpleUDF(name, new HiveFunctionWrapper(clazz.getName), input)) - udfExpr.get.dataType // Force it to check input data types. -} else if (classOf[GenericUDF].isAssignableFrom(clazz)) { - udfExpr = Some(HiveGenericUDF(name, new HiveFunctionWrapper(clazz.getName), input)) - udfExpr.get.dataType // Force it to check input data types. -} else if (classOf[AbstractGenericUDAFResolver].isAssignableFrom(clazz)) { - udfExpr = Some(HiveUDAFFunction(name, new HiveFunctionWrapper(clazz.getName), input)) - udfExpr.get.dataType // Force it to check input data types. -} else if (classOf[UDAF].isAssignableFrom(clazz)) { - udfExpr = Some(HiveUDAFFunction( -name, -new HiveFunctionWrapper(clazz.getName), -input, -isUDAFBridgeRequired = true)) - udfExpr.get.dataType // Force it to check input data types. -} else if (classOf[GenericUDTF].isAssignableFrom(clazz)) { - udfExpr = Some(HiveGenericUDTF(name, new HiveFunctionWrapper(clazz.getName), input)) - udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema // Force it to check data types. 
+// Current thread context classloader may not be the one loaded the class. Need to switch +// context classloader to initialize instance properly. +Utils.withContextClassLoader(clazz.getClassLoader) { + Try(super.makeFunctionExpression(name, clazz, input)).getOrElse { +var udfExpr: Option[Expression] = None +try { + // When we instantiate hive UDF wrapper class, we may throw exception if the input + // expressions don't satisfy the hive UDF, such as type mismatch, input number + // mismatch, etc. Here we catch the exception and throw AnalysisException instead. + if (classOf[UDF].isAssignableFrom(clazz)) { +udfExpr = Some(HiveSimpleUDF(name, new HiveFunctionWrapper(clazz.getName), input)) +udfExpr.get.dataType // Force it to check input data types. Review comment: Oh OK. I missed the case where we don't cache the function. Thanks for the pointer! I'll try to reproduce the finding and fix it without touching the assumption.
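For readers following this thread: the change under review wraps UDF instantiation in `Utils.withContextClassLoader(clazz.getClassLoader) { ... }`. The following is a minimal sketch of the save/swap/restore pattern that such a helper implies, not a copy of Spark's implementation. The object name `ContextClassLoaderSketch` and the empty `URLClassLoader` standing in for a UDF jar's loader are illustrative assumptions.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical stand-in for the helper named in the diff; a sketch only.
object ContextClassLoaderSketch {

  // Run `body` with `loader` installed as the current thread's context
  // classloader, and always restore the previous loader afterwards.
  def withContextClassLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread()
    val original = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body
    finally thread.setContextClassLoader(original)
  }

  def main(args: Array[String]): Unit = {
    val before = Thread.currentThread().getContextClassLoader
    // An empty URLClassLoader plays the role of the loader that loaded the UDF class.
    val udfLoader = new URLClassLoader(Array.empty[URL], before)

    val seenInside = withContextClassLoader(udfLoader) {
      Thread.currentThread().getContextClassLoader
    }

    assert(seenInside eq udfLoader) // the body saw the swapped-in loader
    assert(Thread.currentThread().getContextClassLoader eq before) // restored afterwards
  }
}
```

The `finally` clause is the point of the pattern: the original loader is restored even if instantiating the UDF throws, so an analysis failure does not leak the UDF's loader into later work on the same thread.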
[GitHub] [spark] HyukjinKwon commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
HyukjinKwon commented on issue #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#issuecomment-606373680 I think it's fine. cc @vanzin and @jerryshao
[GitHub] [spark] gatorsmile commented on a change in pull request #27657: [SPARK-30899][SQL] CreateArray/CreateMap's data type should not depend on SQLConf.get
gatorsmile commented on a change in pull request #27657: [SPARK-30899][SQL] CreateArray/CreateMap's data type should not depend on SQLConf.get URL: https://github.com/apache/spark/pull/27657#discussion_r400617350 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ## @@ -37,16 +37,23 @@ import org.apache.spark.unsafe.types.UTF8String > SELECT _FUNC_(1, 2, 3); [1,2,3] """) -case class CreateArray(children: Seq[Expression]) extends Expression { +case class CreateArray(children: Seq[Expression], useStringTypeWhenEmpty: Boolean) Review comment: @iRakson When you update the code, do not forget to update the PR description, which will be part of the commit message.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606370934 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120615/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606370927 Merged build finished. Test FAILed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications
HyukjinKwon commented on a change in pull request #28077: [SPARK-31308][PySpark] Make Python dependencies available for Non-PySpark applications URL: https://github.com/apache/spark/pull/28077#discussion_r400615348 ## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ## @@ -474,10 +474,12 @@ private[spark] class SparkSubmit extends Logging { args.mainClass = "org.apache.spark.deploy.PythonRunner" args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs } - if (clusterManager != YARN) { -// The YARN backend handles python files differently, so don't merge the lists. -args.files = mergeFileLists(args.files, args.pyFiles) - } +} + +// Non-PySpark applications will also need Python dependencies. Review comment: nit: `will also` -> `can`
[GitHub] [spark] AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606370927 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
AmplabJenkins commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606370934 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120615/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
SparkQA removed a comment on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606292813 **[Test build #120615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120615/testReport)** for PR 27066 at commit [`a0eea6d`](https://github.com/apache/spark/commit/a0eea6d824b56d016963ef61a37a8c3cbeaced7a).
[GitHub] [spark] SparkQA commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class
SparkQA commented on issue #27066: [SPARK-22231][SQL] Add withField method to Column class URL: https://github.com/apache/spark/pull/27066#issuecomment-606370600 **[Test build #120615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120615/testReport)** for PR 27066 at commit [`a0eea6d`](https://github.com/apache/spark/commit/a0eea6d824b56d016963ef61a37a8c3cbeaced7a). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] bmarcott commented on a change in pull request #27207: [SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on a change in pull request #27207: [SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#discussion_r400606676 ## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala ## @@ -901,18 +1136,17 @@ class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext with B } // Here is the main check of this test -- we have the same offers again, and we schedule it -// successfully. Because the scheduler first tries to schedule with locality in mind, at first -// it won't schedule anything on executor1. But despite that, we don't abort the job. Then the -// scheduler tries for ANY locality, and successfully schedules tasks on executor1. +// successfully. Because the scheduler tries to schedule with locality in mind, at first +// it won't schedule anything on executor1. But despite that, we don't abort the job. val secondTaskAttempts = taskScheduler.resourceOffers(offers).flatten -assert(secondTaskAttempts.size == 2) -secondTaskAttempts.foreach { taskAttempt => assert("executor1" === taskAttempt.executorId) } +assert(secondTaskAttempts.isEmpty) assert(!failedTaskSet) } test("SPARK-16106 locality levels updated if executor added to existing host") { val taskScheduler = setupScheduler() +taskScheduler.resourceOffers(IndexedSeq(new WorkerOffer("executor0", "host0", 1))) Review comment: We don't need it; without it the test just behaves differently because the resources aren't scheduled the same way (more resources are accepted up front with the new code). I can also make the test pass by setting the legacy flag, or by changing more logic in the test. Previously the locality level would be reset on every task launch; now it is reset once per resourceOffers call (when certain conditions are met). Workloads that relied on the old behavior could regress.
[GitHub] [spark] AmplabJenkins commented on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions`
AmplabJenkins commented on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions` URL: https://github.com/apache/spark/pull/28056#issuecomment-606369906 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25327/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions`
AmplabJenkins removed a comment on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions` URL: https://github.com/apache/spark/pull/28056#issuecomment-606369900 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions`
AmplabJenkins commented on issue #28056: [SPARK-31288][SQL] Remove `getRawTable` in `alterPartitions` URL: https://github.com/apache/spark/pull/28056#issuecomment-606369900 Merged build finished. Test PASSed.