[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85353/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20068 Merged build finished. Test PASSed.
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20068 **[Test build #85353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85353/testReport)** for PR 20068 at commit [`ebe2900`](https://github.com/apache/spark/commit/ebe2900aadd3af0114ed71506088c6a736dd5002).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19954 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85352/ Test PASSed.
[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19954 Merged build finished. Test PASSed.
[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19954 **[Test build #85352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85352/testReport)** for PR 19954 at commit [`9d9c841`](https://github.com/apache/spark/commit/9d9c841b3528e0806280a58a0a8acaa456aa6e44).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19929#discussion_r158595341

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2075,9 +2075,10 @@ class PandasUDFType(object):
     def udf(f=None, returnType=StringType()):
    --- End diff --

I am saying this because I have had a few discussions about this before, and I am pretty sure we usually keep them the same whenever possible.
[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19929#discussion_r158595281

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2075,9 +2075,10 @@ class PandasUDFType(object):
     def udf(f=None, returnType=StringType()):
    --- End diff --

@gatorsmile, however, wouldn't it be better to keep them consistent if possible?
[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19929#discussion_r158594837

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2075,9 +2075,10 @@ class PandasUDFType(object):
     def udf(f=None, returnType=StringType()):
    --- End diff --

Scala and Python are different, because the Scala API also serves as the Java API.
[GitHub] spark issue #19982: [SPARK-22787] [TEST] [SQL] Add a TPC-H query suite
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19982 @maropu Thanks for your contribution. This looks over-engineered. We do not need such a complicated solution for this simple use case. We just need to record them in the log. We are also proposing new APIs for our logs. @jiangxb1987 is working on the design.
[GitHub] spark issue #20069: [SPARK-22895] [SQL] Push down the deterministic predicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20069 **[Test build #85355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85355/testReport)** for PR 20069 at commit [`ad6607c`](https://github.com/apache/spark/commit/ad6607c642ffac811f0fa84d9256524676c9c75e).
[GitHub] spark pull request #20069: [SPARK-22895] [SQL] Push down the deterministic p...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/20069

[SPARK-22895] [SQL] Push down the deterministic predicates that are after the first non-deterministic

## What changes were proposed in this pull request?

Currently, we do not guarantee the evaluation order of conjuncts in either the Filter or the Join operator. This is also true of mainstream RDBMS vendors such as DB2 and MS SQL Server. Thus, we should also push down the deterministic predicates that come after the first non-deterministic one, if possible.

## How was this patch tested?

Updated the existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark morePushDown

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20069.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20069

commit ad6607c642ffac811f0fa84d9256524676c9c75e
Author: gatorsmile
Date: 2017-12-24T06:25:54Z

    fix
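The proposal above can be pictured with a toy model in plain Python, independent of Catalyst. The `Predicate` class and both helper functions are hypothetical names introduced only for illustration, and the "prefix only" rule is an assumption inferred from the PR title rather than a copy of Spark's optimizer code.

```python
from dataclasses import dataclass
from itertools import takewhile
from typing import List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for one conjunct of a filter condition."""
    sql: str
    deterministic: bool


def split_prefix_only(conjuncts: List[Predicate]) -> Tuple[List[Predicate], List[Predicate]]:
    """Assumed old rule: push down only the deterministic prefix."""
    pushed = list(takewhile(lambda p: p.deterministic, conjuncts))
    return pushed, conjuncts[len(pushed):]


def split_all_deterministic(conjuncts: List[Predicate]) -> Tuple[List[Predicate], List[Predicate]]:
    """Proposed rule: push down every deterministic conjunct, wherever it sits."""
    pushed = [p for p in conjuncts if p.deterministic]
    kept = [p for p in conjuncts if not p.deterministic]
    return pushed, kept


conjuncts = [
    Predicate("a > 1", True),
    Predicate("rand() < 0.5", False),
    Predicate("b = 2", True),
]

old_pushed, old_kept = split_prefix_only(conjuncts)
new_pushed, new_kept = split_all_deterministic(conjuncts)
```

In this model the only difference is `b = 2`: the prefix-only rule leaves it above the non-deterministic conjunct, while the proposed rule pushes it down, which is acceptable precisely because no evaluation order between conjuncts is promised.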
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Thank you, @mridulm for reviewing this PR. I have addressed the latest review comments.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85354/testReport)** for PR 20002 at commit [`3b08951`](https://github.com/apache/spark/commit/3b089518e66bc4facf7bc07db1d12663dd567393).
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20068 **[Test build #85353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85353/testReport)** for PR 20068 at commit [`ebe2900`](https://github.com/apache/spark/commit/ebe2900aadd3af0114ed71506088c6a736dd5002).
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20068 ok to test
[GitHub] spark pull request #20068: [SPARK-17916][SQL] Fix empty string being parsed ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20068#discussion_r158593580

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala ---
    @@ -152,7 +152,7 @@ class CSVOptions(
         writerSettings.setIgnoreLeadingWhitespaces(ignoreLeadingWhiteSpaceFlagInWrite)
         writerSettings.setIgnoreTrailingWhitespaces(ignoreTrailingWhiteSpaceFlagInWrite)
         writerSettings.setNullValue(nullValue)
    -    writerSettings.setEmptyValue(nullValue)
    +    writerSettings.setEmptyValue("")
    --- End diff --

Can we simply expose this as an option and keep the previous behaviour if this option is not set explicitly by the user?
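The suggestion in this review comment can be sketched in plain Python. This is a toy model of a CSV field writer, not the univocity or Spark API: `render_csv_field` and its `empty_value` parameter are hypothetical names, and the fallback to `null_value` is an assumption that mirrors the removed line `setEmptyValue(nullValue)`.

```python
from typing import Optional


def render_csv_field(value: Optional[str], null_value: str,
                     empty_value: Optional[str] = None) -> str:
    """Render one field; None is a null, "" is an empty string."""
    # Hypothetical option: when empty_value is not set by the user,
    # fall back to null_value, which keeps the previous behaviour.
    if empty_value is None:
        empty_value = null_value
    if value is None:
        return null_value
    if value == "":
        return empty_value
    return value


# Previous behaviour: an empty string is written the same way as a null.
legacy = render_csv_field("", null_value="NULL")

# Opt-in behaviour: an empty string stays distinguishable from a null.
explicit = render_csv_field("", null_value="NULL", empty_value="")
```

With the option left unset, existing jobs would see no change in output; only callers who set it explicitly would get empty strings written as empty strings.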
[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19954 **[Test build #85352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85352/testReport)** for PR 19954 at commit [`9d9c841`](https://github.com/apache/spark/commit/9d9c841b3528e0806280a58a0a8acaa456aa6e44).
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20002 I left a couple of comments, @sujithjay. Overall it is looking good, thanks for working on it! We can merge it once they are addressed.
[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158592833

    --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala ---
    @@ -259,6 +259,27 @@ class PartitioningSuite extends SparkFunSuite with SharedSparkContext with Priva
         val partitioner = new RangePartitioner(22, rdd)
         assert(partitioner.numPartitions === 3)
       }
    +
    +  test("defaultPartitioner") {
    +    val rdd1 = sc.parallelize((1 to 1000).map(x => (x, x)), 150)
    +    val rdd2 = sc
    +      .parallelize(Array((1, 2), (2, 3), (2, 4), (3, 4)))
    +      .partitionBy(new HashPartitioner(10))
    +    val rdd3 = sc
    +      .parallelize(Array((1, 6), (7, 8), (3, 10), (5, 12), (13, 14)))
    +      .partitionBy(new HashPartitioner(100))
    +
    +    val partitioner1 = Partitioner.defaultPartitioner(rdd1, rdd2)
    +    val partitioner2 = Partitioner.defaultPartitioner(rdd2, rdd3)
    +    val partitioner3 = Partitioner.defaultPartitioner(rdd3, rdd1)
    +    val partitioner4 = Partitioner.defaultPartitioner(rdd1, rdd2, rdd3)
    +
    +    assert(partitioner1.numPartitions == rdd1.getNumPartitions)
    +    assert(partitioner2.numPartitions == rdd3.getNumPartitions)
    +    assert(partitioner3.numPartitions == rdd3.getNumPartitions)
    +    assert(partitioner4.numPartitions == rdd3.getNumPartitions)
    --- End diff --

Can you add a test case checking that numPartitions 9 vs 11 is not treated as an order-of-magnitude jump (to guard against future changes that end up breaking this)?
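The 9 vs 11 concern can be made concrete with a small sketch. Both functions are hypothetical illustrations; the quoted diff does not show how PR 20002 computes the check, so the `log10` difference used here is an assumption about one reasonable definition of "within an order of magnitude".

```python
import math


def within_order_of_magnitude(existing_partitions: int, max_partitions: int) -> bool:
    """True when max_partitions is less than 10x existing_partitions."""
    return math.log10(max_partitions) - math.log10(existing_partitions) < 1


def digit_count_jump(existing_partitions: int, max_partitions: int) -> bool:
    """Naive check comparing decimal digit counts; shown only as a pitfall."""
    return len(str(max_partitions)) > len(str(existing_partitions))


# 9 vs 11 crosses a power of ten but is nowhere near a 10x difference.
ratio_says_close = within_order_of_magnitude(9, 11)
digits_say_jump = digit_count_jump(9, 11)

# Partition counts taken from the test in the diff above.
ten_vs_150 = within_order_of_magnitude(10, 150)
hundred_vs_150 = within_order_of_magnitude(100, 150)
```

Under this definition, a partitioner with 10 partitions is rejected against an RDD with 150 partitions while one with 100 partitions is accepted, which matches the expectations asserted for `partitioner1` and `partitioner3` in the quoted test. A digit-count comparison would wrongly flag 9 vs 11, which is the regression the requested test case would catch.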
[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158592810

    --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
    @@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}

     import scala.collection.mutable
     import scala.collection.mutable.ArrayBuffer
    +import scala.language.existentials
    --- End diff --

If we explicitly set the type, is it still required? For example, with `val hasMaxPartitioner: Option[RDD[_]] = ...`?
[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20067 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85351/ Test PASSed.
[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20067 Merged build finished. Test PASSed.
[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20067 **[Test build #85351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85351/testReport)** for PR 20067 at commit [`ae998ec`](https://github.com/apache/spark/commit/ae998ec2b5548b7028d741da4813473dde1ad81e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19683 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85350/ Test PASSed.
[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19683 Merged build finished. Test PASSed.
[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19683 **[Test build #85350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85350/testReport)** for PR 19683 at commit [`272a059`](https://github.com/apache/spark/commit/272a059db579d11ea5f49387a36ff23a3199c494).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user aa8y commented on the issue: https://github.com/apache/spark/pull/20068 @gatorsmile I've created this PR since #12904 has not been updated in a while.
[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20068 Can one of the admins verify this patch?
[GitHub] spark pull request #20068: SPARK-17916: Fix empty string being parsed as nul...
GitHub user aa8y opened a pull request: https://github.com/apache/spark/pull/20068

SPARK-17916: Fix empty string being parsed as null when nullValue is set.

## What changes were proposed in this pull request?

When the option `nullValue` is set, the empty value is also set to the same value. As a result, empty strings get parsed as `null`, which should not happen. This PR explicitly sets the empty value to an empty string.

## How was this patch tested?

Tests were added first and confirmed to fail without the fix. With the fix applied, the tests pass.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aa8y/spark csvEmptyValue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20068.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20068

commit 1c3d2216380c9cc89ea829588305b5f31c71d6d5
Author: Jeff Zhang
Date: 2016-04-29T17:42:52Z

    Rebase with master.

commit b4eddd67234637feb1b255811d8d018b28894095
Author: Arun Allamsetty
Date: 2017-10-14T19:46:53Z

    Merge remote-tracking branch 'upstream/master'

commit f406de9fe13f96b0ee615d496c283b21f415fd2b
Author: Arun Allamsetty
Date: 2017-12-12T00:44:15Z

    Merge remote-tracking branch 'upstream/master'

commit 762c14487c762a193fd4f4359c51aaba71eca3f9
Author: Arun Allamsetty
Date: 2017-12-21T21:49:50Z

    Merge remote-tracking branch 'upstream/master'

commit ebe2900aadd3af0114ed71506088c6a736dd5002
Author: Arun Allamsetty
Date: 2017-12-21T22:52:15Z

    SPARK-17916: Fix empty string being parsed as null when nullValue is set.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85349/ Test PASSed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test PASSed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85349/testReport)** for PR 20002 at commit [`3dd1ad8`](https://github.com/apache/spark/commit/3dd1ad8e25b7c23b58d33cc422570f4cb133fd4b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20067 **[Test build #85351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85351/testReport)** for PR 20067 at commit [`ae998ec`](https://github.com/apache/spark/commit/ae998ec2b5548b7028d741da4813473dde1ad81e).
[GitHub] spark pull request #20067: [SPARK-22894][SQL] DateTimeOperations should acce...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/20067

[SPARK-22894][SQL] DateTimeOperations should accept SQL like string type

## What changes were proposed in this pull request?

`DateTimeOperations` accepts [`StringType`](https://github.com/apache/spark/blob/ae998ec2b5548b7028d741da4813473dde1ad81e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L669), but:
```
spark-sql> SELECT '2017-12-24' + interval 2 months 2 seconds;
Error in query: cannot resolve '(CAST('2017-12-24' AS DOUBLE) + interval 2 months 2 seconds)' due to data type mismatch: differing types in '(CAST('2017-12-24' AS DOUBLE) + interval 2 months 2 seconds)' (double and calendarinterval).; line 1 pos 7;
'Project [unresolvedalias((cast(2017-12-24 as double) + interval 2 months 2 seconds), None)]
+- OneRowRelation
spark-sql>
```
After this PR:
```
spark-sql> SELECT '2017-12-24' + interval 2 months 2 seconds;
2018-02-24 00:00:02
Time taken: 0.2 seconds, Fetched 1 row(s)
```

## How was this patch tested?

Unit tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangyum/spark SPARK-22894

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20067.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20067

commit ae998ec2b5548b7028d741da4813473dde1ad81e
Author: Yuming Wang
Date: 2017-12-23T19:45:31Z

    DateTimeOperations should accept SQL like string type
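The arithmetic behind the "After this PR" output can be checked with a standard-library sketch. `add_interval` is a hypothetical helper, not a Spark function; it assumes the string is first read as a timestamp at midnight, and its end-of-month clamping is an assumption of this sketch rather than something the PR text specifies.

```python
import calendar
from datetime import datetime, timedelta


def add_interval(date_text: str, months: int = 0, seconds: int = 0) -> datetime:
    """Parse a 'YYYY-MM-DD' string as a timestamp, then add months and seconds."""
    ts = datetime.strptime(date_text, "%Y-%m-%d")
    # Work in a zero-based month index so that year rollover is a plain divmod.
    month_index = ts.year * 12 + (ts.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # Assumption: clamp the day to the last day of the target month.
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day) + timedelta(seconds=seconds)


result = add_interval("2017-12-24", months=2, seconds=2)
```

Here `str(result)` is `2018-02-24 00:00:02`, the same value the patched `spark-sql` session prints, whereas the pre-patch analyzer cast the string to `DOUBLE` and failed on the type mismatch.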
[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19683 **[Test build #85350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85350/testReport)** for PR 19683 at commit [`272a059`](https://github.com/apache/spark/commit/272a059db579d11ea5f49387a36ff23a3199c494).
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Merged build finished. Test PASSed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85345/ Test PASSed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85345/testReport)** for PR 20064 at commit [`f94e9f3`](https://github.com/apache/spark/commit/f94e9f3bc3a3bd2293d1d081b02bcd0ccc1d3053).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #20060: [SPARK-22889][SPARKR] Set overwrite=T when instal...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20060
[GitHub] spark issue #20060: [SPARK-22889][SPARKR] Set overwrite=T when install Spark...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20060 merged to master/2.2
[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158586350

    --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
    @@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}

     import scala.collection.mutable
     import scala.collection.mutable.ArrayBuffer
    +import scala.language.existentials
    --- End diff --

Without this import, there was a compiler warning:
```
Warning:(63, 29) inferred existential type Option[org.apache.spark.rdd.RDD[_$2]]( forSome { type _$2 }), which cannot be expressed by wildcards, should be enabled by making the implicit value scala.language.existentials visible. This can be achieved by adding the import clause 'import scala.language.existentials' or by setting the compiler option -language:existentials. See the Scaladoc for value scala.language.existentials for a discussion why the feature should be explicitly enabled.
```
The Spark build failed because of this.
[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158586256

    --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
    @@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}

     import scala.collection.mutable
     import scala.collection.mutable.ArrayBuffer
    +import scala.language.existentials
    --- End diff --

Curious, why was this required?
[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples
Github user chetkhatri commented on the issue: https://github.com/apache/spark/pull/20018 Thanks @HyukjinKwon @wangyum
[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20022#discussion_r158585754

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }

    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --

Ahh, now I see 🙂 Sure, I'll set it soon.
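The three new tests in this diff differ only in how the two thresholds relate to the two rows in each window partition. The following toy model is a sketch of that interaction; `ToyRowBuffer` and `spilled_after` are hypothetical names, and the exact comparison rules are assumptions chosen to reproduce the three expected outcomes, not a transcription of Spark's window buffer.

```python
class ToyRowBuffer:
    """Toy two-stage buffer: an in-memory list first, then a spillable sorter."""

    def __init__(self, in_memory_threshold: int, spill_threshold: int):
        self.in_memory_threshold = in_memory_threshold
        self.spill_threshold = spill_threshold
        self.in_memory = []
        self.sorter = None  # a list standing in for a spillable sorter
        self.spilled = False

    def add(self, row) -> None:
        # Stage 1: stay in the plain list while it is below its threshold.
        if self.sorter is None and len(self.in_memory) < self.in_memory_threshold:
            self.in_memory.append(row)
            return
        # Stage 2: switch to the sorter, moving any buffered rows across.
        if self.sorter is None:
            self.sorter = []
            pending = self.in_memory + [row]
            self.in_memory = []
        else:
            pending = [row]
        for item in pending:
            # Assumed rule: spill when the sorter already holds
            # spill_threshold rows at the moment of insertion.
            if len(self.sorter) >= self.spill_threshold:
                self.spilled = True
                self.sorter = []  # pretend the rows were flushed to disk
            self.sorter.append(item)


def spilled_after(rows: int, in_memory_threshold: int, spill_threshold: int) -> bool:
    buffer = ToyRowBuffer(in_memory_threshold, spill_threshold)
    for row in range(rows):
        buffer.add(row)
    return buffer.spilled
```

With two rows per partition, `spilled_after(2, 2, 2)` and `spilled_after(2, 1, 2)` are both `False` and `spilled_after(2, 1, 1)` is `True`, mirroring the `assertNotSpilled`, `assertNotSpilled`, and `assertSpilled` expectations in the quoted tests.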
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85349/testReport)** for PR 20002 at commit [`3dd1ad8`](https://github.com/apache/spark/commit/3dd1ad8e25b7c23b58d33cc422570f4cb133fd4b).
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85348/testReport)** for PR 20002 at commit [`6623227`](https://github.com/apache/spark/commit/6623227161a660d924efae1317688c3535d82cb2).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85348/ Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85348/testReport)** for PR 20002 at commit [`6623227`](https://github.com/apache/spark/commit/6623227161a660d924efae1317688c3535d82cb2).
[GitHub] spark pull request #20031: [SPARK-22844][R] Adds date_trunc in R API
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20031
[GitHub] spark issue #20031: [SPARK-22844][R] Adds date_trunc in R API
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20031 Merged to master.
[GitHub] spark pull request #20065: [HOTFIX] Fix Scala style checks
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20065
[GitHub] spark issue #20031: [SPARK-22844][R] Adds date_trunc in R API
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20031 Thanks for the review and approval, @dongjoon-hyun and @felixcheung.
[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20065 Merged to master
[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20065 Merged build finished. Test PASSed.
[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20065 **[Test build #85347 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85347/testReport)** for PR 20065 at commit [`da55100`](https://github.com/apache/spark/commit/da55100c22754fef5076b4f15a24e4a5fd2545ae). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85347/ Test PASSed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Thank you, @HyukjinKwon. I will try again after the hotfix is merged to master.
[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20018 Thank you @wangyum :D.
[GitHub] spark pull request #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespa...
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/20066
[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/20018 Thanks @HyukjinKwon
[GitHub] spark issue #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespace to f...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/20066 https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/console
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20002 @sujithjay, I opened a hotfix. It should be fine soon (maybe after a few hours).
[GitHub] spark pull request #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespa...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/20066
[SPARK-22833][Examples][FOLLOWUP] Remove whitespace to fix scalastyle checks failed
## What changes were proposed in this pull request?
This is a followup PR for: https://github.com/apache/spark/pull/20018.
## How was this patch tested?
N/A
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/wangyum/spark SPARK-22833
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/20066.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #20066
commit df92f6ce38a14fc248d5830090dfa473371a129c
Author: Yuming Wang
Date: 2017-12-23T15:59:29Z
    Remove whitespace
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Scala style tests are failing on the file 'SparkHiveExample.scala', which is unrelated to this PR. Will rebase to master and try again.
[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20065 **[Test build #85347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85347/testReport)** for PR 20065 at commit [`da55100`](https://github.com/apache/spark/commit/da55100c22754fef5076b4f15a24e4a5fd2545ae).
[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20018 I just opened a quick hotfix, https://github.com/apache/spark/pull/20065, because I think we don't run the examples in the build and tests, so all we need is the style fix. Reverting is also fine with me, @srowen. I can close mine.
[GitHub] spark pull request #20065: [HOTFIX] Fix Scala style checks
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/20065
[HOTFIX] Fix Scala style checks
## What changes were proposed in this pull request?
This PR fixes a style violation that broke the build.
## How was this patch tested?
Manually tested.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/HyukjinKwon/spark minor-style
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/20065.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #20065
commit da55100c22754fef5076b4f15a24e4a5fd2545ae
Author: hyukjinkwon
Date: 2017-12-23T15:50:59Z
    Fix style
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85346/ Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85346/testReport)** for PR 20002 at commit [`4729d80`](https://github.com/apache/spark/commit/4729d8036e984ecb7e8143f9f1cd7a3d84ec1754). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20018 It seems this did not pass the tests; it causes a build failure: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/console
```
Running Scala style checks
Scalastyle checks failed at following occurrences:
[error] /home/jenkins/workspace/SparkPullRequestBuilder/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala:138:0: Whitespace at end of line
[error] Total time: 13 s, completed Dec 23, 2017 7:34:15 AM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/lint-scala ; received return code 1
```
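The "Whitespace at end of line" error above can be caught locally before pushing. The following is only a rough sketch of that kind of check, not Scalastyle's own implementation; the function name `find_trailing_whitespace` and the sample text are made up for illustration.

```python
from typing import List, Tuple


def find_trailing_whitespace(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines ending in spaces or tabs.

    Hypothetical helper for illustration; line numbers are 1-based to match
    the way the CI log reports them.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # A line has trailing whitespace if stripping spaces/tabs shortens it.
        if line != line.rstrip(" \t"):
            hits.append((number, line))
    return hits


# Made-up sample: lines 2 and 4 end with trailing spaces or a tab.
sample = "val x = 1\nval y = 2   \n\nval z = 3\t\n"
for number, _ in find_trailing_whitespace(sample):
    print(f"line {number}: Whitespace at end of line")
```

Running a check like this over the changed files would presumably have flagged line 138 of `SparkHiveExample.scala` before the Jenkins build did.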
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85346/testReport)** for PR 20002 at commit [`4729d80`](https://github.com/apache/spark/commit/4729d8036e984ecb7e8143f9f1cd7a3d84ec1754).
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85345/testReport)** for PR 20064 at commit [`f94e9f3`](https://github.com/apache/spark/commit/f94e9f3bc3a3bd2293d1d081b02bcd0ccc1d3053).
[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19929#discussion_r158584224
--- Diff: python/pyspark/sql/functions.py ---
@@ -2075,9 +2075,10 @@ class PandasUDFType(object):
 def udf(f=None, returnType=StringType()):
     """Creates a user defined function (UDF).

-    .. note:: The user-defined functions must be deterministic. Due to optimization,
+    .. note:: The user-defined functions are considered deterministic. Due to optimization,
         duplicate invocations may be eliminated or the function may even be invoked more times than
-        it is present in the query.
+        it is present in the query. If your function is not deterministic, call
+        `asNondeterministic`.
--- End diff --
Let's say this more explicitly like .. call `asNondeterministic()` in the user-defined function. It's partly because I think `UserDefinedFunction` is not documented in PySpark.
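For readers unfamiliar with the method under discussion, here is a minimal sketch of how `asNondeterministic()` is typically chained onto the result of `udf(...)`. It assumes a PySpark version that ships the method (2.3 or later); the helper names `make_random_udf` and `add_random_column` are hypothetical and not part of the PR.

```python
import random


def make_random_udf():
    """Build a zero-argument UDF and mark it as nondeterministic.

    Hypothetical helper. pyspark is imported inside the function so that
    loading this sketch does not itself require a Spark installation.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    # Without asNondeterministic(), the optimizer treats the UDF as
    # deterministic and may eliminate or duplicate invocations.
    return udf(lambda: random.randint(0, 100), IntegerType()).asNondeterministic()


def add_random_column(df, name="r"):
    """Return `df` with an extra column produced by the nondeterministic UDF."""
    return df.withColumn(name, make_random_udf()())
```

Given an active `SparkSession` named `spark`, something like `add_random_column(spark.range(3)).show()` would exercise it.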
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85344 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85344/testReport)** for PR 20002 at commit [`8b35452`](https://github.com/apache/spark/commit/8b3545265b534e511ac947071e416360184d740e). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85344/ Test FAILed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Merged build finished. Test FAILed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85343 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/testReport)** for PR 20064 at commit [`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85343/ Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85344/testReport)** for PR 20002 at commit [`8b35452`](https://github.com/apache/spark/commit/8b3545265b534e511ac947071e416360184d740e).
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/testReport)** for PR 20064 at commit [`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/20064 retest this please
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85342/ Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85342/testReport)** for PR 20002 at commit [`961e384`](https://github.com/apache/spark/commit/961e3848cea1dc1b6568c1612eef7bedba4270d5). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85342/testReport)** for PR 20002 at commit [`961e384`](https://github.com/apache/spark/commit/961e3848cea1dc1b6568c1612eef7bedba4270d5).
[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20022#discussion_r158584088
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
@@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
       Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
   }

+  test("Window spill with less than the inMemoryThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
+      assertSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
   test("SPARK-21258: complex object in combination with spilling") {
     // Make sure we trigger the spilling path.
-    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
--- End diff --
Yeah, I mean, how about setting it to 1 instead of 0?
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Merged build finished. Test FAILed.
[GitHub] spark issue #20062: [SPARK-22892] [SQL] Simplify some estimation logic by us...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/20062 cc @cloud-fan
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85341 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85341/testReport)** for PR 20064 at commit [`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85341/ Test FAILed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85341/testReport)** for PR 20064 at commit [`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Merged build finished. Test FAILed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85339/ Test FAILed.
[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20064 **[Test build #85339 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85339/testReport)** for PR 20064 at commit [`8540b91`](https://github.com/apache/spark/commit/8540b912e8e846f9e0fb8c94a8dcc48a05be6a57). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20052: [SPARK-20694][EXAMPLES]Update SQLDataSourceExampl...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20052
[GitHub] spark issue #20052: [SPARK-20694][EXAMPLES]Update SQLDataSourceExample.scala
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20052 Merged to master/2.2