[GitHub] spark issue #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory operat...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21968 can we do the same thing for the columnar one? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22226: [SPARK-25252][SQL] Support arrays of any types by to_jso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22226 **[Test build #95331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95331/testReport)** for PR 22226 at commit [`e91a33c`](https://github.com/apache/spark/commit/e91a33cbf5416c93b95ca0bd594a5418ee033f15).
[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22240 **[Test build #95329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95329/testReport)** for PR 22240 at commit [`7e2975b`](https://github.com/apache/spark/commit/7e2975bd4f723869cded56dc1269e00d7e03df1f). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21546 **[Test build #95325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95325/testReport)** for PR 21546 at commit [`2fe46f8`](https://github.com/apache/spark/commit/2fe46f82dc38af972bc0974aca1fd846bcb483e5). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22250 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95328/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22240 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95329/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21790: [SPARK-24544][SQL] Print actual failure cause when look ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21790 **[Test build #95330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95330/testReport)** for PR 21790 at commit [`690035a`](https://github.com/apache/spark/commit/690035a877d21de75310011eeecc80f2ff87b4bf). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class NoSuchFunctionException(db: String, func: String, rootCause: Option[String] = None)` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/21860 cc @cloud-fan @maropu
[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22240 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95327/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22238: [SPARK-25245][DOCS][SS] Explain regarding limiting modif...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22238 **[Test build #95326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95326/testReport)** for PR 22238 at commit [`e2ee43d`](https://github.com/apache/spark/commit/e2ee43da2f9bf4fb95c938764ee3584bbae06c1b). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22246 **[Test build #95324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95324/testReport)** for PR 22246 at commit [`e0d424d`](https://github.com/apache/spark/commit/e0d424d645010108a497c057fa4ad1e198f1e3d0). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22250 **[Test build #95328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95328/testReport)** for PR 22250 at commit [`f9b32d5`](https://github.com/apache/spark/commit/f9b32d5d044a899529959ad5042f8cf95c789ea8). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21790: [SPARK-24544][SQL] Print actual failure cause when look ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21790 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22250 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22112 **[Test build #95327 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95327/testReport)** for PR 22112 at commit [`dc45157`](https://github.com/apache/spark/commit/dc45157d1a49ed8145a421422e68bcf9c32faa17). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22238: [SPARK-25245][DOCS][SS] Explain regarding limiting modif...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22238 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22238: [SPARK-25245][DOCS][SS] Explain regarding limiting modif...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22238 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95326/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21790: [SPARK-24544][SQL] Print actual failure cause when look ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21790 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95330/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95325/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22246 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95324/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22246 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 retest this please
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22112 **[Test build #95332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95332/testReport)** for PR 22112 at commit [`dc45157`](https://github.com/apache/spark/commit/dc45157d1a49ed8145a421422e68bcf9c32faa17). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22246 retest this please.
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20965 oh, made a mistake ...
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22246 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2604/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2605/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22246 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2603/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #95334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95334/testReport)** for PR 20965 at commit [`7564d32`](https://github.com/apache/spark/commit/7564d32bd9648565e1b69c540ae5256752864889). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22246 **[Test build #95333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95333/testReport)** for PR 22246 at commit [`e0d424d`](https://github.com/apache/spark/commit/e0d424d645010108a497c057fa4ad1e198f1e3d0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22250 retest this please
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22250 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2606/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22250: [SPARK-25259][SQL] left/right join support push down dur...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22250 **[Test build #95335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95335/testReport)** for PR 22250 at commit [`f9b32d5`](https://github.com/apache/spark/commit/f9b32d5d044a899529959ad5042f8cf95c789ea8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...
Github user TomaszGaweda commented on a diff in the pull request: https://github.com/apache/spark/pull/22249#discussion_r213213173

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
     StringTrimLeft(e.expr, Literal(trimString))
   }

+  /**
+   * Extracts a part from a URL.
+   *
+   * @group string_funcs
+   * @since 2.4.0
+   */
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

@HyukjinKwon Thanks for the suggestion. However, users are already complaining about the stringly-typed system in Spark, and there are libraries like Frameless from Typelevel that aim to achieve a bit more type safety. `expr` is stringly-typed, while functions in the `functions` object or accessed via `UserDefinedFunction` are a bit more type safe.
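The difference between a stringly-typed and a more type-safe call can be illustrated without Spark. The following is a minimal sketch in plain Scala, not Spark code: `UrlPart`, `UrlParts.extract`, and `UrlParts.extractByName` are hypothetical names introduced only for illustration, and `java.net.URI` stands in for the URL parsing. It shows that a misspelled part name passes the compiler in the string-based variant and only fails at run time, while the typed variant cannot be called with an unknown part.

```scala
import java.net.URI

// Hypothetical model, not Spark code: the part to extract is either a raw
// string (checked only at run time) or a sealed type (checked by the compiler).
sealed trait UrlPart
case object Host extends UrlPart
case object Path extends UrlPart
case object Query extends UrlPart

object UrlParts {
  // Typed variant: the match over the sealed trait is checked for exhaustiveness.
  def extract(url: String, part: UrlPart): Option[String] = {
    val uri = new URI(url)
    part match {
      case Host  => Option(uri.getHost)
      case Path  => Option(uri.getPath)
      case Query => Option(uri.getQuery)
    }
  }

  // Stringly-typed variant: a typo such as "HOTS" compiles and yields None.
  def extractByName(url: String, part: String): Option[String] =
    part match {
      case "HOST"  => extract(url, Host)
      case "PATH"  => extract(url, Path)
      case "QUERY" => extract(url, Query)
      case _       => None
    }
}
```

With this sketch, `UrlParts.extract(url, Host)` cannot be written with an invalid part, whereas `UrlParts.extractByName(url, "HOTS")` is accepted by the compiler and silently returns `None`.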
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2607/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #95336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95336/testReport)** for PR 20965 at commit [`f617623`](https://github.com/apache/spark/commit/f617623a5bbd5f8f00bfac4d3796c0097805b3c0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2608/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #95337 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95337/testReport)** for PR 20965 at commit [`f5ffedd`](https://github.com/apache/spark/commit/f5ffeddd9ebedfc4633fed03b86536efcfa2b02b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22251: [SPARK-25260][SQL] Fix namespace handling in Sche...
GitHub user arunmahadevan opened a pull request: https://github.com/apache/spark/pull/22251

[SPARK-25260][SQL] Fix namespace handling in SchemaConverters.toAvroType

## What changes were proposed in this pull request?

`toAvroType` converts a Spark data type to an Avro schema. It always appends the record name to the namespace, so it is impossible to have an Avro namespace that is independent of the record name.

When invoked with a Spark data type like,

```scala
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types._

val sparkSchema = StructType(Seq(
  StructField("name", StringType, nullable = false),
  StructField("address", StructType(Seq(
    StructField("city", StringType, nullable = false),
    StructField("state", StringType, nullable = false))),
    nullable = false)))

// map it to an Avro schema with record name "employee" and top-level namespace "foo.bar"
val avroSchema = SchemaConverters.toAvroType(sparkSchema, false, "employee", "foo.bar")

// result is
// avroSchema.getName = employee
// avroSchema.getNamespace = foo.bar.employee
// avroSchema.getFullName = foo.bar.employee.employee
```

The patch proposes to fix this so that the result is

```
avroSchema.getName = employee
avroSchema.getNamespace = foo.bar
avroSchema.getFullName = foo.bar.employee
```

## How was this patch tested?

New and existing unit tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arunmahadevan/spark avro-fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22251.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #22251

commit f47483951e12d563b7696940a2cfc2fdc3b27ab2
Author: Arun Mahadevan
Date: 2018-08-28T08:00:17Z

[SPARK-25260][SQL] Fix namespace handling in SchemaConverters.toAvroType
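The naming issue described in this pull request can be reproduced with a small model. The following is a sketch in plain Scala under stated assumptions: `NamedSchema` and `NamespaceSketch` are hypothetical stand-ins, not the Avro or Spark API, and the only rule modelled is that an Avro full name is the namespace and the name joined by a dot.

```scala
// Simplified stand-in for an Avro named schema; not the org.apache.avro API.
final case class NamedSchema(name: String, namespace: String) {
  // The full name is the namespace and the name joined by a dot.
  def fullName: String = if (namespace.isEmpty) name else s"$namespace.$name"
}

object NamespaceSketch {
  // Behaviour the PR describes as current: the record name is appended
  // to the namespace before the record is created.
  def before(recordName: String, nameSpace: String): NamedSchema = {
    val ns = if (nameSpace.isEmpty) recordName else s"$nameSpace.$recordName"
    NamedSchema(recordName, ns)
  }

  // Behaviour the PR proposes: the namespace is used exactly as given.
  def after(recordName: String, nameSpace: String): NamedSchema =
    NamedSchema(recordName, nameSpace)
}
```

For the record name `employee` and namespace `foo.bar`, `before` produces the doubled full name reported in the description, and `after` produces `foo.bar.employee`.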
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user arunmahadevan commented on the issue: https://github.com/apache/spark/pull/22251 cc @gengliangwang @dongjoon-hyun
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22251 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22251 **[Test build #95338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95338/testReport)** for PR 22251 at commit [`f474839`](https://github.com/apache/spark/commit/f47483951e12d563b7696940a2cfc2fdc3b27ab2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22251 **[Test build #95338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95338/testReport)** for PR 22251 at commit [`f474839`](https://github.com/apache/spark/commit/f47483951e12d563b7696940a2cfc2fdc3b27ab2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95338/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22251: [SPARK-25260][SQL] Fix namespace handling in SchemaConve...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22251 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2609/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #95339 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95339/testReport)** for PR 20965 at commit [`a737837`](https://github.com/apache/spark/commit/a73783756ffa846916d83495dc97ae613d5c7039). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22249#discussion_r213227841

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
     StringTrimLeft(e.expr, Literal(trimString))
   }

+  /**
+   * Extracts a part from a URL.
+   *
+   * @group string_funcs
+   * @since 2.4.0
+   */
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

I mean,

> I would suggest method that return handler for any registered function. So that you can write: SqlFunction something = spark.(...?).getFunction("parse_url")

Can this support strongly typed one?
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21546 retest this please
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21546 **[Test build #95340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95340/testReport)** for PR 21546 at commit [`2fe46f8`](https://github.com/apache/spark/commit/2fe46f82dc38af972bc0974aca1fd846bcb483e5). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2610/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22252: [SPARK-25261][MINOR][DOC] correct the default uni...
GitHub user ivoson opened a pull request: https://github.com/apache/spark/pull/22252

[SPARK-25261][MINOR][DOC] correct the default unit for spark.executor|driver.memory as described in configuration.md

## What changes were proposed in this pull request?

As described in [SPARK-25261](https://issues.apache.org/jira/projects/SPARK/issues/SPARK-25261), the unit of spark.executor.memory and spark.driver.memory is bytes if no unit is specified, while in https://spark.apache.org/docs/latest/configuration.html#application-properties they are described as MiB, which may lead to misunderstandings.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ivoson/spark branch-correct-configuration

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22252.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #22252

commit 8f360b97e9f0afff7e6627df15cc17f103a5c2b3
Author: Huang Tengfei
Date: 2018-08-28T07:28:06Z

update configuration.md, correct the default unit of spark.executor.memory and spark.driver.memory. Change-Id: I2935b93559aa3e016bbab7a083c8c24bbdc6f685

commit 3eb3b66a52435366f258cbb40d01d8f3b0141bff
Author: huangtengfei02
Date: 2018-08-28T08:29:25Z

fix style issue Change-Id: I73e82b8bd07064d874bf76df52cb661097532884
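Why the default unit matters can be seen with a toy parser. The following is a sketch in plain Scala and an assumption throughout: `MemoryUnitSketch.toBytes` is a hypothetical helper, not Spark's configuration parsing, and it only understands the suffixes `b`, `k`, `m`, and `g` as binary multiples. It shows how the same unit-less value changes by a factor of 2^20 depending on whether the default is bytes or MiB.

```scala
object MemoryUnitSketch {
  // Binary multipliers for the suffixes understood by this sketch.
  private val units: Map[String, Long] = Map(
    "b" -> 1L,
    "k" -> 1024L,
    "m" -> 1024L * 1024,
    "g" -> 1024L * 1024 * 1024
  )

  // Parses strings such as "512m" or "2g"; a bare number falls back to `defaultUnit`.
  def toBytes(value: String, defaultUnit: String): Long = {
    val s = value.trim.toLowerCase
    val (digits, suffix) = s.span(_.isDigit)
    require(digits.nonEmpty, s"no numeric part in '$value'")
    val unit = if (suffix.isEmpty) defaultUnit else suffix
    val factor = units.getOrElse(
      unit, throw new IllegalArgumentException(s"unknown unit '$unit'"))
    digits.toLong * factor
  }
}
```

In this model, `toBytes("1024", "b")` is 1 KiB while `toBytes("1024", "m")` is 1 GiB, which is the kind of misunderstanding the documentation fix is meant to prevent. Writing the suffix explicitly, as in `"2g"`, gives the same result under either default.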
[GitHub] spark issue #22252: [SPARK-25261][MINOR][DOC] correct the default unit for s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22252 Can one of the admins verify this patch?
[GitHub] spark issue #22236: [SPARK-10697][ML] Add lift to Association rules
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22236 thanks for pointing that out @hhbyyh . I wasn't aware of it. I think lift and confidence are more useful metrics than the support (the votes on the JIRAs seem also to agree with me on this). Anyway, after this PR, adding support too would be trivial. I checked your PR @hhbyyh and if you don't mind I'll take the naming from there as I prefer it over the one I used here. :) Thanks.
[GitHub] spark pull request #22251: [SPARK-25260][SQL] Fix namespace handling in Sche...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22251#discussion_r213231887 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala --- @@ -143,29 +143,25 @@ object SchemaConverters { val avroType = LogicalTypes.decimal(d.precision, d.scale) val fixedSize = minBytesForPrecision(d.precision) // Need to avoid naming conflict for the fixed fields -val name = prevNameSpace match { +val name = nameSpace match { case "" => s"$recordName.fixed" - case _ => s"$prevNameSpace.$recordName.fixed" + case _ => s"$nameSpace.$recordName.fixed" } avroType.addToSchema(SchemaBuilder.fixed(name).size(fixedSize)) case BinaryType => builder.bytesType() case ArrayType(et, containsNull) => builder.array() - .items(toAvroType(et, containsNull, recordName, prevNameSpace)) + .items(toAvroType(et, containsNull, recordName, nameSpace)) case MapType(StringType, vt, valueContainsNull) => builder.map() - .values(toAvroType(vt, valueContainsNull, recordName, prevNameSpace)) + .values(toAvroType(vt, valueContainsNull, recordName, nameSpace)) case st: StructType => -val nameSpace = prevNameSpace match { - case "" => recordName - case _ => s"$prevNameSpace.$recordName" -} - +val childNameSpace = if (nameSpace != "") s"$nameSpace.$recordName" else recordName val fieldsAssembler = builder.record(recordName).namespace(nameSpace).fields() --- End diff -- +1, this line is the only difference for the whole code change. The namespace here should not be the one with `recordName` at the end.
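The namespace rule discussed in this comment can be sketched in isolation. The following is a minimal illustration only, with no Avro dependency: the object `NamespaceSketch` and its helper names are hypothetical and merely mirror the string expressions visible in the diff. It assumes that a record keeps the namespace it was given, while nested records receive that namespace with `recordName` appended.

```scala
object NamespaceSketch {
  // Hypothetical helper mirroring the `childNameSpace` expression in the diff:
  // nested records get the current namespace with the record name appended.
  def childNameSpace(nameSpace: String, recordName: String): String =
    if (nameSpace != "") s"$nameSpace.$recordName" else recordName

  // Hypothetical helper mirroring the name chosen for fixed (decimal) fields,
  // which is qualified to avoid naming conflicts.
  def fixedName(nameSpace: String, recordName: String): String =
    nameSpace match {
      case "" => s"$recordName.fixed"
      case _ => s"$nameSpace.$recordName.fixed"
    }

  def main(args: Array[String]): Unit = {
    // The record itself would be declared in "foo.bar";
    // only its children move to "foo.bar.employee".
    println(childNameSpace("foo.bar", "employee")) // foo.bar.employee
    println(childNameSpace("", "employee"))        // employee
    println(fixedName("foo.bar", "employee"))      // foo.bar.employee.fixed
  }
}
```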
[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...
Github user TomaszGaweda commented on a diff in the pull request: https://github.com/apache/spark/pull/22249#discussion_r213232338 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2459,6 +2459,26 @@ object functions { StringTrimLeft(e.expr, Literal(trimString)) } + /** +* Extracts a part from a URL. +* +* @group string_funcs +* @since 2.4.0 +*/ + def parse_url(url: Column, partToExtract: String): Column = withExpr { --- End diff -- @HyukjinKwon I see now. Yeah, wrapping in the `Column` will be necessary, at least no string concatenation will be required
[GitHub] spark issue #22236: [SPARK-10697][ML] Add lift to Association rules
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22236 @holdenk @srowen anyway I realized that computing the lift for previously saved models is not feasible as we don't know the number of training records, so we cannot compute the items support. I'll keep the re-computation at read time in order to avoid saving the `itemSupport` map, but for previous models the lift is going to be null as we cannot compute it. Thanks.
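To make the dependency on the number of training records concrete, here is a minimal sketch using the standard definition lift(X => Y) = confidence(X => Y) / support(Y), with support(Y) = freq(Y) / numTrainingRecords. The object `LiftSketch`, its parameter names, and the use of `Option` in place of a null value are assumptions for illustration, not the code in the PR.

```scala
object LiftSketch {
  // lift(X => Y) = confidence(X => Y) / support(Y),
  // where support(Y) = freq(Y) / numTrainingRecords.
  // Without the record count (e.g. a model saved by an older version),
  // the support is unknown, so no lift can be returned.
  def lift(
      confidence: Double,
      consequentFreq: Long,
      numTrainingRecords: Option[Long]): Option[Double] =
    numTrainingRecords.filter(_ > 0).map { n =>
      val consequentSupport = consequentFreq.toDouble / n
      confidence / consequentSupport
    }

  def main(args: Array[String]): Unit = {
    // Record count known: support(Y) = 40 / 100, so lift = 0.8 / 0.4.
    println(lift(confidence = 0.8, consequentFreq = 40L, numTrainingRecords = Some(100L)))
    // Record count unknown: the lift stays undefined.
    println(lift(confidence = 0.8, consequentFreq = 40L, numTrainingRecords = None))
  }
}
```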
[GitHub] spark pull request #22251: [SPARK-25260][SQL] Fix namespace handling in Sche...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22251#discussion_r213232867 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala --- @@ -1099,6 +1098,27 @@ class AvroSuite extends QueryTest with SharedSQLContext with SQLTestUtils { } } + test("check namespace - toAvroType") { --- End diff -- @arunmahadevan, can we add a simple end-to-end test as well?
[GitHub] spark pull request #22251: [SPARK-25260][SQL] Fix namespace handling in Sche...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22251#discussion_r213233392 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala --- @@ -1099,6 +1098,27 @@ class AvroSuite extends QueryTest with SharedSQLContext with SQLTestUtils { } } + test("check namespace - toAvroType") { +val sparkSchema = StructType(Seq( + StructField("name", StringType, nullable = false), + StructField("address", StructType(Seq( +StructField("city", StringType, nullable = false), +StructField("state", StringType, nullable = false))), +nullable = false))) +val employeeType = SchemaConverters.toAvroType(sparkSchema, + recordName = "employee", + nameSpace = "foo.bar") --- End diff -- nit: could you also add a case for `nameSpace` as `""` ?
[GitHub] spark issue #22236: [SPARK-10697][ML] Add lift to Association rules
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22236 Merged build finished. Test PASSed.
[GitHub] spark issue #22236: [SPARK-10697][ML] Add lift to Association rules
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22236 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2611/ Test PASSed.
[GitHub] spark issue #22236: [SPARK-10697][ML] Add lift to Association rules
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22236 **[Test build #95341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95341/testReport)** for PR 22236 at commit [`44a0021`](https://github.com/apache/spark/commit/44a002103aaf89a6b17688e50ee351a1576389d5).
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2612/ Test PASSed.
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22243 Merged build finished. Test PASSed.
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22243 **[Test build #95342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95342/testReport)** for PR 22243 at commit [`393379c`](https://github.com/apache/spark/commit/393379ca56225424ee58cbbe300b6ec8c83cbc7e).
[GitHub] spark issue #22048: [SPARK-25108][SQL] Fix the show method to display the wi...
Github user xuejianbest commented on the issue: https://github.com/apache/spark/pull/22048 I displayed all the 0x-0x characters (Unicode) under Xshell, identified all the full-width characters, and derived the latest regular expression from them. @srowen @kiszk A new version has been submitted; the variable name has been changed and comments have been added. Please see below.
[GitHub] spark issue #22048: [SPARK-25108][SQL] Fix the show method to display the wi...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22048 @xuejianbest thank you for your update. Would it be possible to commit test cases, too?
[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213252905 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand( val value = ExternalCatalogUtils.unescapePathName(ps(1)) if (resolver(columnName, partitionNames.head)) { scanPartitions(spark, fs, filter, st.getPath, spec ++ Map(partitionNames.head -> value), -partitionNames.drop(1), threshold, resolver) +partitionNames.drop(1), threshold, resolver, listFilesInParallel = false) --- End diff -- @kiszk Right, all `Future`s do the same - trying to execute another `Future` on the same fixed thread pool.
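The deadlock pattern described here (tasks on a fixed thread pool each waiting on further tasks queued on the same pool) and the shape of the fix can be sketched as follows. This is an illustrative sketch only: `ScanSketch`, `Dir`, and the `parallel` flag are hypothetical stand-ins for the partition scan and the `listFilesInParallel` parameter in the diff, and the tree is an in-memory placeholder rather than a file system.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ScanSketch {
  // Hypothetical in-memory tree standing in for partition directories.
  final case class Dir(name: String, children: Seq[Dir] = Nil)

  // Returns the paths of all leaf directories under `dir`.
  def scan(dir: Dir, parallel: Boolean)(implicit ec: ExecutionContext): Seq[String] = {
    if (dir.children.isEmpty) {
      Seq(dir.name)
    } else if (parallel) {
      // Only this level submits tasks to the pool. Each task recurses with
      // parallel = false, so no pool thread blocks on another pool task.
      val futures = dir.children.map(c => Future(scan(c, parallel = false)))
      Await.result(Future.sequence(futures), 10.seconds)
        .flatten
        .map(p => s"${dir.name}/$p")
    } else {
      // Nested levels run sequentially on the calling thread.
      dir.children
        .flatMap(c => scan(c, parallel = false))
        .map(p => s"${dir.name}/$p")
    }
  }

  def main(args: Array[String]): Unit = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val tree = Dir("t", Seq(
        Dir("a=1", Seq(Dir("b=1"), Dir("b=2"))),
        Dir("a=2", Seq(Dir("b=1")))))
      scan(tree, parallel = true).foreach(println)
    } finally {
      pool.shutdown()
    }
  }
}
```

If the nested calls also used `parallel = true`, every pool thread could end up blocked in `Await.result` on children that can never be scheduled, which is the situation the comment describes.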
[GitHub] spark issue #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER PARTITI...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22233 **[Test build #95343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95343/testReport)** for PR 22233 at commit [`071de47`](https://github.com/apache/spark/commit/071de4767d762247751f530be78d6b99758bd95c).
[GitHub] spark issue #22226: [SPARK-25252][SQL] Support arrays of any types by to_jso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22226 **[Test build #95331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95331/testReport)** for PR 22226 at commit [`e91a33c`](https://github.com/apache/spark/commit/e91a33cbf5416c93b95ca0bd594a5418ee033f15). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22226: [SPARK-25252][SQL] Support arrays of any types by to_jso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22226 Merged build finished. Test FAILed.
[GitHub] spark issue #22226: [SPARK-25252][SQL] Support arrays of any types by to_jso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22226 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95331/ Test FAILed.
[GitHub] spark issue #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory operat...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/21968 @cloud-fan `VectorizedHashMapGenerator`?
[GitHub] spark issue #22215: [SPARK-25222][K8S] Improve container status logging
Github user rvesse commented on the issue: https://github.com/apache/spark/pull/22215 @liyinan926 @nrchakradhar Addressed all your comments, thanks for the reviews. Is someone able to kick off the Jenkins testing on this PR?
[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22205 **[Test build #95344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95344/testReport)** for PR 22205 at commit [`9ab1fa0`](https://github.com/apache/spark/commit/9ab1fa0979683497d1612a280888eb94330d123b).
[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213264786 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -266,7 +266,9 @@ object SQLConf { .createWithDefault(Long.MaxValue) val SHUFFLE_PARTITIONS = buildConf("spark.sql.shuffle.partitions") -.doc("The default number of partitions to use when shuffling data for joins or aggregations.") +.doc("The default number of partitions to use when shuffling data for joins or aggregations. " + + "Note: For structured streaming, this configuration cannot be changed between query " + --- End diff -- s/cannot be/must not be/
[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213264912 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -868,7 +870,9 @@ object SQLConf { .internal() .doc( "The class used to manage state data in stateful streaming queries. This class must " + - "be a subclass of StateStoreProvider, and must have a zero-arg constructor.") + "be a subclass of StateStoreProvider, and must have a zero-arg constructor. " + + "Note: For structured streaming, this configuration cannot be changed between query " + --- End diff -- s/cannot be/must not be/
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22112 **[Test build #95332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95332/testReport)** for PR 22112 at commit [`dc45157`](https://github.com/apache/spark/commit/dc45157d1a49ed8145a421422e68bcf9c32faa17). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95332/ Test FAILed.
[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22112 Merged build finished. Test FAILed.
[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22205 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2613/ Test PASSed.
[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22205 Merged build finished. Test PASSed.
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22243 **[Test build #95342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95342/testReport)** for PR 22243 at commit [`393379c`](https://github.com/apache/spark/commit/393379ca56225424ee58cbbe300b6ec8c83cbc7e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22243 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95342/ Test FAILed.
[GitHub] spark issue #22243: [MINOR] Avoid code duplication for nullable in Higher Or...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22243 Merged build finished. Test FAILed.
[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213267299 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -266,7 +266,9 @@ object SQLConf { .createWithDefault(Long.MaxValue) val SHUFFLE_PARTITIONS = buildConf("spark.sql.shuffle.partitions") -.doc("The default number of partitions to use when shuffling data for joins or aggregations.") +.doc("The default number of partitions to use when shuffling data for joins or aggregations. " + + "Note: For structured streaming, this configuration cannot be changed between query " + --- End diff -- The sentence is borrowed from existing one: https://github.com/apache/spark/blob/8198ea50192cad615071beb5510c73aa9e9178f4/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L978-L979