[GitHub] spark issue #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spark on K8...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #94379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94379/testReport)** for PR 21669 at commit [`c30ad8c`](https://github.com/apache/spark/commit/c30ad8c4be1d42e7da4992570a656099c073d745). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spar...
Github user ifilonenko commented on a diff in the pull request: https://github.com/apache/spark/pull/21669#discussion_r208257021 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -336,7 +336,7 @@ private[spark] class SparkSubmit extends Logging { val targetDir = Utils.createTempDir() // assure a keytab is available from any place in a JVM -if (clusterManager == YARN || clusterManager == LOCAL || isMesosClient) { +if (clusterManager == YARN || clusterManager == LOCAL || isMesosClient || isKubernetesCluster) { --- End diff -- This check can be removed, but I included it since I believed that the keytab shouldn't be stored as a secret for security reasons and should instead be only accessible from the JVM. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22023: [SPARK-23928][TESTS][FOLLOWUP] Set seed to avoid ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22023 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22023: [SPARK-23928][TESTS][FOLLOWUP] Set seed to avoid flakine...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22023 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22023: [SPARK-23928][TESTS][FOLLOWUP] Set seed to avoid flakine...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22023 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94366/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22023: [SPARK-23928][TESTS][FOLLOWUP] Set seed to avoid flakine...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22023 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22025: SPARK-25043: print master and appId from spark-sql on st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22025 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22023: [SPARK-23928][TESTS][FOLLOWUP] Set seed to avoid flakine...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22023 **[Test build #94366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94366/testReport)** for PR 22023 at commit [`b1713d8`](https://github.com/apache/spark/commit/b1713d8918c205e325e40ca4eb28eb08909a91f3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22025: SPARK-25043: print master and appId from spark-sql on st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22025 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22025: SPARK-25043: print master and appId from spark-sql on st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22025 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22025: SPARK-25043: print master and appId from spark-sq...
GitHub user abellina opened a pull request: https://github.com/apache/spark/pull/22025 SPARK-25043: print master and appId from spark-sql on startup ## What changes were proposed in this pull request? A small change to print the master and appId from spark-sql as with logging turned down all the way, we may not know this information easily. ## How was this patch tested? I ran spark-sql locally and saw the appId displayed as expected. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/abellina/spark SPARK-25043_print_master_and_app_id_from_sparksql Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22025.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22025 commit 42c510c5486a925e570fcd77a469ea57ac5807c3 Author: Alessandro Bellina Date: 2018-08-07T14:16:25Z SPARK-25043: print master and appId from spark-sql on startup --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22008 **[Test build #94378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94378/testReport)** for PR 22008 at commit [`1caf256`](https://github.com/apache/spark/commit/1caf2567694c56cca019e6608609b81ac70deefa). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22008 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22008 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1916/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22008 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22008 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94365/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22008 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22008 **[Test build #94365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94365/testReport)** for PR 22008 at commit [`1caf256`](https://github.com/apache/spark/commit/1caf2567694c56cca019e6608609b81ac70deefa). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait JoinHelper extends PredicateHelper ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22009 **[Test build #94377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94377/testReport)** for PR 22009 at commit [`c224999`](https://github.com/apache/spark/commit/c22499964ac759670c3629c690f77018bc79a7c1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1915/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22009 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94363/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21986 **[Test build #94363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94363/testReport)** for PR 21986 at commit [`1823fb2`](https://github.com/apache/spark/commit/1823fb279b1e5ed7b55d6e27ede27982ce94d922). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait SimpleHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes ` * `trait ArrayBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction ` * `trait MapBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94359/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94359 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94359/testReport)** for PR 20611 at commit [`5b5bb52`](https://github.com/apache/spark/commit/5b5bb52e1c334eeec49c318e4c437d04c489671b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17185: [SPARK-19602][SQL] Support column resolution of f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17185 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21933: [SPARK-24917][CORE] make chunk size configurable
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21933 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21933: [SPARK-24917][CORE] make chunk size configurable
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21933 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94358/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1914/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17185 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21933: [SPARK-24917][CORE] make chunk size configurable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21933 **[Test build #94358 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94358/testReport)** for PR 21933 at commit [`e2961eb`](https://github.com/apache/spark/commit/e2961eb86f689de83770de5c3a73838512a62001). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22022 **[Test build #94374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94374/testReport)** for PR 22022 at commit [`16233d1`](https://github.com/apache/spark/commit/16233d181b0a61d6cd45a7dc42d49a8905c964ea). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18323 **[Test build #94376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94376/testReport)** for PR 18323 at commit [`2e2b2ca`](https://github.com/apache/spark/commit/2e2b2ca39ffb595ec5c26bcec71afa9df8a612c6). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21978: SPARK-25006: Add CatalogTableIdentifier.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21978 **[Test build #94375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94375/testReport)** for PR 21978 at commit [`00295ee`](https://github.com/apache/spark/commit/00295ee6b3713995641c90a9b3b7cd4a6b79ded6). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1913/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22024 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22024 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21305 **[Test build #94373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94373/testReport)** for PR 21305 at commit [`e81790d`](https://github.com/apache/spark/commit/e81790d072ed66f1126d5918bd1a39222a9f5cfa). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22024 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21988: [SPARK-25003][PYSPARK][BRANCH-2.2] Use SessionExt...
Github user RussellSpitzer closed the pull request at: https://github.com/apache/spark/pull/21988 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21989: [SPARK-25003][PYSPARK][BRANCH-2.3] Use SessionExt...
Github user RussellSpitzer closed the pull request at: https://github.com/apache/spark/pull/21989 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1912/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22024: [SPARK-25034][CORE] Remove allocations in onBlock...
GitHub user vincent-grosbois opened a pull request: https://github.com/apache/spark/pull/22024 [SPARK-25034][CORE] Remove allocations in onBlockFetchSuccess This method is only transferring a ManagedBuffer to the caller, so there is no reason why it should allocate 2 (!) intermediate data buffers in order to do so. In this commit I'm removing the conversion from any kind of managed buffer besides FileSegment to a NioManagedBuffer. However if you check the only calling method getRemoteBytes(), you will see that here we either: - do a memory-map if we have a FileSegmentManagedBuffer - try again to call the nioByteBuffer() method otherwise So in any case the conversion will occur later. ## What changes were proposed in this pull request? Remove needless temporary allocations ## How was this patch tested? Tested this change with a few jobs You can merge this pull request into a Git repository by running: $ git pull https://github.com/vincent-grosbois/spark no-alloc-onfetchsuccess Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22024.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22024 commit 2c182b3c93c7cc70f042d7dcd82520ac2adece1c Author: Vincent Grosbois Date: 2018-08-07T12:34:32Z [SPARK-25034][CORE] Remove allocations in onBlockFetchSuccess This method is only transferring a ManagedBuffer to the caller, so there is no reason why it should allocate 2 (!) intermediate data buffers in order to do so. In this commit I'm removing the conversion from any kind of managed buffer besides FileSegment to a NioManagedBuffer. However if you check the only calling method getRemoteBytes(), you will see that here we either: - do a memory-map if we have a FileSegmentManagedBuffer - try again to call the nioByteBuffer() method otherwise So in any case the conversion will occur later. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21305 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94364/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21305 **[Test build #94364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94364/testReport)** for PR 21305 at commit [`e81790d`](https://github.com/apache/spark/commit/e81790d072ed66f1126d5918bd1a39222a9f5cfa). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22006: [SPARK-25031][SQL] Fix MapType schema print
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22006 **[Test build #94372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94372/testReport)** for PR 22006 at commit [`4328199`](https://github.com/apache/spark/commit/4328199fe3738ceec0a2e87b934a20f56e08dc28). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22006: [SPARK-25031][SQL] Fix MapType schema print
Github user invkrh commented on a diff in the pull request: https://github.com/apache/spark/pull/22006#discussion_r208215172 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala --- @@ -452,4 +452,31 @@ class DataTypeSuite extends SparkFunSuite { new StructType().add("f1", IntegerType).add("f", new StructType().add("f2", StringType, false)), new StructType().add("f2", IntegerType).add("g", new StructType().add("f1", StringType)), false) + + test("SPARK-25031: MapType should produce current formatted string for complex types") { + --- End diff -- Done --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22020: [SPARK-25041][build] upgrade genJavaDoc-plugin from 0.10...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22020 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94356/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22020: [SPARK-25041][build] upgrade genJavaDoc-plugin from 0.10...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22020 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22020: [SPARK-25041][build] upgrade genJavaDoc-plugin from 0.10...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22020 **[Test build #94356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94356/testReport)** for PR 22020 at commit [`1b41ce4`](https://github.com/apache/spark/commit/1b41ce44800310cd0ebd321f8436db5bed452935). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1911/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21986 **[Test build #94371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94371/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r208211250 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -365,3 +364,101 @@ case class ArrayAggregate( override def prettyName: String = "aggregate" } + +/** + * Merges two given maps into a single map by applying function to the pair of values with + * the same key. + */ +@ExpressionDescription( + usage = +""" + _FUNC_(map1, map2, function) - Merges two given maps into a single map by applying + function to the pair of values with the same key. For keys only presented in one map, + NULL will be passed as the value for the missing key. If an input map contains duplicated + keys, only the first entry of the duplicated key is passed into the lambda function. +""", + examples = """ +Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2)); + {1:"ax",2:"by"} + """, + since = "2.4.0") +case class MapZipWith(left: Expression, right: Expression, function: Expression) + extends HigherOrderFunction with CodegenFallback { + + @transient lazy val functionForEval: Expression = functionsForEval.head + + @transient lazy val MapType(keyType, leftValueType, _) = getMapType(left) + + @transient lazy val MapType(_, rightValueType, _) = getMapType(right) + + @transient lazy val arrayDataUnion = new ArrayDataUnion(keyType) + + @transient lazy val ordering = TypeUtils.getInterpretedOrdering(keyType) + + override def inputs: Seq[Expression] = left :: right :: Nil + + override def functions: Seq[Expression] = function :: Nil + + override def nullable: Boolean = left.nullable || right.nullable + + override def dataType: DataType = MapType(keyType, function.dataType, function.nullable) + + override def checkInputDataTypes(): TypeCheckResult = { +(left.dataType, right.dataType) match { + case (MapType(k1, _, _), MapType(k2, _, _)) if k1.sameType(k2) => +TypeUtils.checkForOrderingExpr(k1, s"function $prettyName") + case _ => TypeCheckResult.TypeCheckFailure(s"The input to function $prettyName should have " + +s"been two ${MapType.simpleString}s with the same key type, but it's " + +s"[${left.dataType.catalogString}, ${right.dataType.catalogString}].") +} + } + + private def getMapType(expr: Expression) = expr.dataType match { +case m: MapType => m +case _ => MapType.defaultConcreteType + } + + override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapZipWith = { +val arguments = Seq((keyType, false), (leftValueType, true), (rightValueType, true)) +copy(function = f(function, arguments)) + } + + override def eval(input: InternalRow): Any = { +val value1 = left.eval(input) +if (value1 == null) { + null +} else { + val value2 = right.eval(input) + if (value2 == null) { +null + } else { +nullSafeEval(input, value1, value2) + } +} + } + + @transient lazy val LambdaFunction(_, Seq( +keyVar: NamedLambdaVariable, +value1Var: NamedLambdaVariable, +value2Var: NamedLambdaVariable), +_) = function + + private def nullSafeEval(inputRow: InternalRow, value1: Any, value2: Any): Any = { +val mapData1 = value1.asInstanceOf[MapData] +val mapData2 = value2.asInstanceOf[MapData] +val keys = arrayDataUnion(mapData1.keyArray(), mapData2.keyArray()) +val values = new GenericArrayData(new Array[Any](keys.numElements())) +keys.foreach(keyType, (idx: Int, key: Any) => { + val v1 = GetMapValueUtil.getValueEval(mapData1, key, keyType, leftValueType, ordering) --- End diff -- Ok, I will change it. Thanks a lot! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21986 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94367/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22017 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22017 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94362/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21986 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22017 **[Test build #94362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94362/testReport)** for PR 22017 at commit [`ec583eb`](https://github.com/apache/spark/commit/ec583eb29ba6fdb79d0b85cbecb3f709e6648b25). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class ArrayDataUnion(elementType: DataType) extends ((ArrayData, ArrayData) => ArrayData) ` * `case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21986 **[Test build #94367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94367/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r208210338 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -365,3 +364,101 @@ case class ArrayAggregate( override def prettyName: String = "aggregate" } + +/** + * Merges two given maps into a single map by applying function to the pair of values with + * the same key. + */ +@ExpressionDescription( + usage = +""" + _FUNC_(map1, map2, function) - Merges two given maps into a single map by applying + function to the pair of values with the same key. For keys only presented in one map, + NULL will be passed as the value for the missing key. If an input map contains duplicated + keys, only the first entry of the duplicated key is passed into the lambda function. +""", + examples = """ +Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2)); + {1:"ax",2:"by"} + """, + since = "2.4.0") +case class MapZipWith(left: Expression, right: Expression, function: Expression) + extends HigherOrderFunction with CodegenFallback { + + @transient lazy val functionForEval: Expression = functionsForEval.head + + @transient lazy val MapType(keyType, leftValueType, _) = getMapType(left) + + @transient lazy val MapType(_, rightValueType, _) = getMapType(right) + + @transient lazy val arrayDataUnion = new ArrayDataUnion(keyType) + + @transient lazy val ordering = TypeUtils.getInterpretedOrdering(keyType) + + override def inputs: Seq[Expression] = left :: right :: Nil + + override def functions: Seq[Expression] = function :: Nil + + override def nullable: Boolean = left.nullable || right.nullable + + override def dataType: DataType = MapType(keyType, function.dataType, function.nullable) + + override def checkInputDataTypes(): TypeCheckResult = { +(left.dataType, right.dataType) match { + case (MapType(k1, _, _), MapType(k2, _, _)) if k1.sameType(k2) => +TypeUtils.checkForOrderingExpr(k1, s"function $prettyName") + case _ => TypeCheckResult.TypeCheckFailure(s"The input to function $prettyName should have " + +s"been two ${MapType.simpleString}s with the same key type, but it's " + +s"[${left.dataType.catalogString}, ${right.dataType.catalogString}].") +} + } + + private def getMapType(expr: Expression) = expr.dataType match { +case m: MapType => m +case _ => MapType.defaultConcreteType + } + + override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapZipWith = { +val arguments = Seq((keyType, false), (leftValueType, true), (rightValueType, true)) +copy(function = f(function, arguments)) + } + + override def eval(input: InternalRow): Any = { +val value1 = left.eval(input) +if (value1 == null) { + null +} else { + val value2 = right.eval(input) + if (value2 == null) { +null + } else { +nullSafeEval(input, value1, value2) + } +} + } + + @transient lazy val LambdaFunction(_, Seq( +keyVar: NamedLambdaVariable, +value1Var: NamedLambdaVariable, +value2Var: NamedLambdaVariable), +_) = function + + private def nullSafeEval(inputRow: InternalRow, value1: Any, value2: Any): Any = { +val mapData1 = value1.asInstanceOf[MapData] +val mapData2 = value2.asInstanceOf[MapData] +val keys = arrayDataUnion(mapData1.keyArray(), mapData2.keyArray()) +val values = new GenericArrayData(new Array[Any](keys.numElements())) +keys.foreach(keyType, (idx: Int, key: Any) => { + val v1 = GetMapValueUtil.getValueEval(mapData1, key, keyType, leftValueType, ordering) --- End diff -- I think there is no plan to have a different map implementation and anyway there is a lot of code which depends on having the array based version of MapData. Regarding the duplicated code, to be honest, I think that avoiding the refactoring introduced by that would also make this PR cleaner... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22021 **[Test build #94370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94370/testReport)** for PR 22021 at commit [`fb68910`](https://github.com/apache/spark/commit/fb68910f82b1a2729364573a1c926b5b7b5c7c12). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1910/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22022 **[Test build #94369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94369/testReport)** for PR 22022 at commit [`657d364`](https://github.com/apache/spark/commit/657d3643e63d79095c47b45ce14429e9fa08f25b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22021 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22022 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1909/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22006: [SPARK-25031][SQL] Fix MapType schema print
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22006#discussion_r208209419 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala --- @@ -452,4 +452,31 @@ class DataTypeSuite extends SparkFunSuite { new StructType().add("f1", IntegerType).add("f", new StructType().add("f2", StringType, false)), new StructType().add("f2", IntegerType).add("g", new StructType().add("f1", StringType)), false) + + test("SPARK-25031: MapType should produce current formatted string for complex types") { + --- End diff -- nit: unneeded blank line --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22021 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22022 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22021 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22006: [SPARK-25031][SQL] Fix MapType schema print
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22006 **[Test build #94368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94368/testReport)** for PR 22006 at commit [`06f656d`](https://github.com/apache/spark/commit/06f656d7fb37f8f41a213cfc861d3e2515d26d33). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22006: [SPARK-25031][SQL] Fix MapType schema print
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22006 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17185 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17185 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94353/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17185 **[Test build #94353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94353/testReport)** for PR 17185 at commit [`5f7e5d7`](https://github.com/apache/spark/commit/5f7e5d7bddca593d72818b07d71f678bd0a1982d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r208204796 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -365,3 +364,101 @@ case class ArrayAggregate( override def prettyName: String = "aggregate" } + +/** + * Merges two given maps into a single map by applying function to the pair of values with + * the same key. + */ +@ExpressionDescription( + usage = +""" + _FUNC_(map1, map2, function) - Merges two given maps into a single map by applying + function to the pair of values with the same key. For keys only presented in one map, + NULL will be passed as the value for the missing key. If an input map contains duplicated + keys, only the first entry of the duplicated key is passed into the lambda function. +""", + examples = """ +Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2)); + {1:"ax",2:"by"} + """, + since = "2.4.0") +case class MapZipWith(left: Expression, right: Expression, function: Expression) + extends HigherOrderFunction with CodegenFallback { + + @transient lazy val functionForEval: Expression = functionsForEval.head + + @transient lazy val MapType(keyType, leftValueType, _) = getMapType(left) + + @transient lazy val MapType(_, rightValueType, _) = getMapType(right) + + @transient lazy val arrayDataUnion = new ArrayDataUnion(keyType) + + @transient lazy val ordering = TypeUtils.getInterpretedOrdering(keyType) + + override def inputs: Seq[Expression] = left :: right :: Nil + + override def functions: Seq[Expression] = function :: Nil + + override def nullable: Boolean = left.nullable || right.nullable + + override def dataType: DataType = MapType(keyType, function.dataType, function.nullable) + + override def checkInputDataTypes(): TypeCheckResult = { +(left.dataType, right.dataType) match { + case (MapType(k1, _, _), MapType(k2, _, _)) if k1.sameType(k2) => +TypeUtils.checkForOrderingExpr(k1, s"function $prettyName") + case _ => TypeCheckResult.TypeCheckFailure(s"The input to function $prettyName should have " + +s"been two ${MapType.simpleString}s with the same key type, but it's " + +s"[${left.dataType.catalogString}, ${right.dataType.catalogString}].") +} + } + + private def getMapType(expr: Expression) = expr.dataType match { +case m: MapType => m +case _ => MapType.defaultConcreteType + } + + override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapZipWith = { +val arguments = Seq((keyType, false), (leftValueType, true), (rightValueType, true)) +copy(function = f(function, arguments)) + } + + override def eval(input: InternalRow): Any = { +val value1 = left.eval(input) +if (value1 == null) { + null +} else { + val value2 = right.eval(input) + if (value2 == null) { +null + } else { +nullSafeEval(input, value1, value2) + } +} + } + + @transient lazy val LambdaFunction(_, Seq( +keyVar: NamedLambdaVariable, +value1Var: NamedLambdaVariable, +value2Var: NamedLambdaVariable), +_) = function + + private def nullSafeEval(inputRow: InternalRow, value1: Any, value2: Any): Any = { +val mapData1 = value1.asInstanceOf[MapData] +val mapData2 = value2.asInstanceOf[MapData] +val keys = arrayDataUnion(mapData1.keyArray(), mapData2.keyArray()) +val values = new GenericArrayData(new Array[Any](keys.numElements())) +keys.foreach(keyType, (idx: Int, key: Any) => { + val v1 = GetMapValueUtil.getValueEval(mapData1, key, keyType, leftValueType, ordering) --- End diff -- Thanks for mentioning this! I'm not happy with the current complexity either. I've assumed that the implementation of maps will change into something with O(1) element access in future. By then, the complexity would be O(N) for types supporting equals as well and we would safe a portion of duplicated code. If you think that maps will remain like this for a long time, really like your suggestion with indexes. @ueshin What's your view on that? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21241: [SPARK-24135][K8s] Resilience to init-container errors o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21241 Build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21241: [SPARK-24135][K8s] Resilience to init-container errors o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21241 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94354/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21241: [SPARK-24135][K8s] Resilience to init-container errors o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21241 **[Test build #94354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94354/testReport)** for PR 21241 at commit [`9df84e8`](https://github.com/apache/spark/commit/9df84e87a36bd6b43b0f74c26ad3bf70a67bb467). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22021 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22021 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94357/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22021 **[Test build #94357 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94357/testReport)** for PR 22021 at commit [`fb68910`](https://github.com/apache/spark/commit/fb68910f82b1a2729364573a1c926b5b7b5c7c12). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21845 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21845 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94349/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21845 **[Test build #94349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94349/testReport)** for PR 21845 at commit [`7afc5c5`](https://github.com/apache/spark/commit/7afc5c52fa31595b1eb458100d37fe92f62e31aa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21932: [SPARK-24979][SQL] add AnalysisHelper#resolveOperatorsUp
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21932 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21932: [SPARK-24979][SQL] add AnalysisHelper#resolveOperatorsUp
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94351/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21258: [SPARK-23933][SQL] Add map_from_arrays function
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/21258#discussion_r208199133 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -235,6 +235,69 @@ case class CreateMap(children: Seq[Expression]) extends Expression { override def prettyName: String = "map" } +/** + * Returns a catalyst Map containing the two arrays in children expressions as keys and values. + */ +@ExpressionDescription( + usage = """ +_FUNC_(keys, values) - Creates a map with a pair of the given key/value arrays. All elements + in keys should not be null""", + examples = """ +Examples: + > SELECT _FUNC_([1.0, 3.0], ['2', '4']); + {1.0:"2",3.0:"4"} + """, since = "2.4.0") +case class CreateMapFromArray(left: Expression, right: Expression) +extends BinaryExpression with ExpectsInputTypes { + + override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType, ArrayType) + + override def checkInputDataTypes(): TypeCheckResult = { +(left.dataType, right.dataType) match { + case (ArrayType(_, cn), ArrayType(_, _)) => +if (!cn) { + TypeCheckResult.TypeCheckSuccess +} else { + TypeCheckResult.TypeCheckFailure("All of the given keys should be non-null") +} + case _ => +TypeCheckResult.TypeCheckFailure("The given two arguments should be an array") +} + } + + override def dataType: DataType = { +MapType( + keyType = left.dataType.asInstanceOf[ArrayType].elementType, + valueType = right.dataType.asInstanceOf[ArrayType].elementType, + valueContainsNull = left.dataType.asInstanceOf[ArrayType].containsNull) + } + + override def nullable: Boolean = false + + override def nullSafeEval(keyArray: Any, valueArray: Any): Any = { +val keyArrayData = keyArray.asInstanceOf[ArrayData] --- End diff -- I would like to err on the safe side here. `CreateMap` should be fixed IMO. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21932: [SPARK-24979][SQL] add AnalysisHelper#resolveOperatorsUp
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21932 **[Test build #94351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94351/testReport)** for PR 21932 at commit [`9d12a9e`](https://github.com/apache/spark/commit/9d12a9ee7b3c0d6037c3c8d99642fdb45638f4f2). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` s\"its class is $` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21596 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94350/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org