[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13244#discussion_r64144623

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala ---
@@ -32,4 +32,4 @@ package org.apache.spark.sql.catalyst.plans.logical
  * @param sizeInBytes Physical size in bytes. For leaf operators this defaults to 1, otherwise it
  *                    defaults to the product of children's `sizeInBytes`.
  */
-private[sql] case class Statistics(sizeInBytes: BigInt)
+private[sql] case class Statistics(sizeInBytes: BigInt, isBroadcastable: Boolean = false)
--- End diff --

would be good to document isBroadcastable in the classdoc

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
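To make the review comment concrete, here is a minimal sketch, in Python rather than Scala, of a class whose documentation covers both fields and of the defaulting rule quoted in the diff. `Statistics` mirrors the case class shown above; `leaf_statistics` and `default_statistics` are hypothetical helpers, not Spark APIs, and the wording given for `is_broadcastable` is only an assumption about what the classdoc might say.

```python
from dataclasses import dataclass
from functools import reduce
from operator import mul
from typing import Iterable


@dataclass(frozen=True)
class Statistics:
    """Estimates attached to a logical plan node (illustrative only).

    size_in_bytes: physical size in bytes. Leaf operators default to 1;
        other operators default to the product of their children's sizes.
    is_broadcastable: assumed meaning: the plan was marked as safe to
        broadcast. Defaults to False, as in the diff.
    """
    size_in_bytes: int
    is_broadcastable: bool = False


def leaf_statistics() -> Statistics:
    # Leaf operators default to a size of 1.
    return Statistics(size_in_bytes=1)


def default_statistics(children: Iterable[Statistics]) -> Statistics:
    # Non-leaf default: product of the children's size_in_bytes.
    sizes = [child.size_in_bytes for child in children]
    return Statistics(size_in_bytes=reduce(mul, sizes, 1))
```

The sketch deliberately leaves `is_broadcastable` at its default in `default_statistics`, because the thread does not say how the flag should propagate from children.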
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220815712 @cloud-fan Based on my understanding, the runtime conf (`class RuntimeConfig`) is designed as the public/external interface for users to access the internal conf. If users want to change the configuration at runtime, they must use `RuntimeConfig`. In the future, we can further enhance it to block external users from changing the internal conf? It would also be easier to manage the Hadoop configuration in `RuntimeConfig`? `SQLConf` will be just an internal implementation of configuration. We do not expect external users to directly access it. You know, this is just my understanding. : )
[GitHub] spark pull request: [SPARK-15194] [ML] Add Python ML API for Multi...
GitHub user praveendareddy21 opened a pull request:

https://github.com/apache/spark/pull/13248

[SPARK-15194] [ML] Add Python ML API for MultivariateGaussian

## What changes were proposed in this pull request?

Added MultivariateGaussian in PySpark ML to match Scala's ML API.

## How was this patch tested?

Tested locally and also added test cases from Scala's test suite.

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/praveendareddy21/spark local_branch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13248.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #13248

commit a7250b4dd538be255f8220de20277d69edbeebac
Author: red
Date: 2016-05-22T05:22:05Z
added Multivariate gaussian in ML Pyspark

commit 0c58e8866498d4e42af0542819fca8a6d76af08a
Author: red
Date: 2016-05-22T05:33:56Z
added testcase for python multivariate
[GitHub] spark pull request: [SPARK-15194] [ML] Add Python ML API for Multi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13248#issuecomment-220815663 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13243#issuecomment-220815424 The fallback approach doesn't look that simple and clean. Can you try splitting the generated code like we did in `CreateExternalRow`?
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220815210 What's the difference between runtime conf and normal conf?
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220814644

Unfortunately, `Dataset` (or `DataFrame`) does not seem suitable for achieving the goal in Python.

```python
>>> spark.parallelize(range(1, 10)).toDS()
...
AttributeError: 'RDD' object has no attribute 'toDS'
>>> spark.parallelize(range(1, 10)).toDF()
...
TypeError: Can not infer schema for type:
```

I'll think about this more until tomorrow and close this if I cannot find a neat solution.
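The `TypeError` in the session above comes from schema inference being handed bare integers instead of row-like records. A minimal sketch of the usual workaround follows: wrap each scalar in a one-element tuple first. `to_rows` is a hypothetical helper introduced for illustration, and the `createDataFrame` call appears only in a comment because it needs a live session, which this sketch assumes is bound to `spark`.

```python
from typing import Iterable, List, Tuple


def to_rows(values: Iterable[int]) -> List[Tuple[int]]:
    """Wrap each scalar in a 1-tuple so it looks like a single-column row."""
    return [(v,) for v in values]


rows = to_rows(range(1, 10))

# With a running session (assumed to be bound to `spark`), these rows could
# be handed to spark.createDataFrame(rows, ["value"]); that call is not run here.
```

This does not address the missing `toDS` in Python; it only shows why the `toDF` call fails on plain integers.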
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220814526

**[Test build #59090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59090/consoleFull)** for PR 13247 at commit [`f443064`](https://github.com/apache/spark/commit/f443064bfabb9e1055d75b7ee1b33085d72b1a3f).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220814528 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59090/ Test FAILed.
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220814527 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15396] [SQL] [DOC] It can't connect hiv...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/13225#issuecomment-220814492 @rxin @jameszhouyi Do you think the document changes in this PR are clear? Please let me know if anything is missing or inappropriate. Thanks! Also CC all the committers who changed the related code. @yhuai @andrewor14 @cloud-fan @liancheng
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13247#issuecomment-220814370 **[Test build #59090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59090/consoleFull)** for PR 13247 at commit [`f443064`](https://github.com/apache/spark/commit/f443064bfabb9e1055d75b7ee1b33085d72b1a3f).
[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/13095#issuecomment-220814326 Thank you, @cloud-fan !
[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/13247

[SPARK-15470] [SQL] Unify the Configuration Interface in SQLContext

What changes were proposed in this pull request?

We introduced `RuntimeConfig` in `SQLContext` in the PR https://github.com/apache/spark/pull/12669. Now, `SQLContext` has both `conf` and `runtimeConf`. `SQLContext` is being replaced by `SparkSession`. Like `SparkSession`, it should not have two configuration interfaces. That means we should not expose `conf` to external users.

This PR contains three major parts:
1. Removed `conf` from `SQLContext`.
2. Added the missing functions to `RuntimeConfig`, including two `set` functions and one `clear` function.
3. Fixed the test cases in `SparkSessionBuilderSuite.scala`. Without this fix, the test cases cannot be run individually, because all of them require `initialSession`.

@rxin @andrewor14 @yhuai @cloud-fan Do you think this PR is valid? Thanks!

How was this patch tested?

Existing test cases cover it.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark configNew

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13247.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #13247

commit f5708f52171ef5ce04eb4358d101d55862cc2294
Author: gatorsmile
Date: 2016-05-21T21:25:38Z
initial fix.

commit 0808fa13f04a228377be8f1d17d0aa7da4a47aee
Author: gatorsmile
Date: 2016-05-22T03:29:38Z
update the test suites.

commit f443064bfabb9e1055d75b7ee1b33085d72b1a3f
Author: gatorsmile
Date: 2016-05-22T04:33:35Z
remove conf from SQLContext
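The design discussed in this PR, a single public configuration facade in front of an internal store, can be sketched roughly as follows. This is an illustration in Python, not Spark's implementation: `InternalConf` stands in for the internal configuration, the method names and the string formatting of values are assumptions, and `example.flag` is a made-up key.

```python
from typing import Dict, Optional


class InternalConf:
    """Stand-in for the internal configuration store (hypothetical)."""

    def __init__(self) -> None:
        self._settings: Dict[str, str] = {}

    def set_conf(self, key: str, value: str) -> None:
        self._settings[key] = value

    def get_conf(self, key: str) -> Optional[str]:
        return self._settings.get(key)

    def clear(self) -> None:
        self._settings.clear()


class RuntimeConfig:
    """Public facade: the only configuration object users are meant to touch."""

    def __init__(self, internal: InternalConf) -> None:
        self._internal = internal  # kept private; not exposed to callers

    def set(self, key: str, value: object) -> None:
        # Accept any value type and store its string form. Booleans are
        # lower-cased here; that formatting is an assumption of this sketch.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        self._internal.set_conf(key, text)

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        found = self._internal.get_conf(key)
        return default if found is None else found

    def clear(self) -> None:
        self._internal.clear()


conf = RuntimeConfig(InternalConf())
conf.set("spark.sql.shuffle.partitions", 8)
conf.set("example.flag", True)
```

Because callers hold only the facade, the internal store can later be restricted or restructured without changing user code, which is the motivation given in the thread.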
[GitHub] spark pull request: [SPARK-15206][SQL] add testcases for distinct ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12984
[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13095#issuecomment-220814047 **[Test build #59089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59089/consoleFull)** for PR 13095 at commit [`d7c2420`](https://github.com/apache/spark/commit/d7c2420cd21e812e08bdea7aa27adf42fe534b98).
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220814048 **[Test build #59088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59088/consoleFull)** for PR 13216 at commit [`d677105`](https://github.com/apache/spark/commit/d67710504723ef42b6719d2b242aa0527cad2584).
[GitHub] spark pull request: [SPARK-15206][SQL] add testcases for distinct ...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12984#issuecomment-220814026 thanks, merging to master and 2.0!
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13244#issuecomment-220813993

**[Test build #3009 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3009/consoleFull)** for PR 13244 at commit [`8b9bf51`](https://github.com/apache/spark/commit/8b9bf515423fa422d3c8436097acd87c4d09b733).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15379][SQL] check special invalid date
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13169#discussion_r64143999

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala ---
@@ -353,6 +353,20 @@ class DateTimeUtilsSuite extends SparkFunSuite {
       c.getTimeInMillis * 1000 + 123456)
   }

+  test("SPARK-15379: special invalid date string") {
+    // Test stringToDate
+    assert(stringToDate(
+      UTF8String.fromString("2015-02-29 00:00:00")).isEmpty)
--- End diff --

Can we try a date string (without the timestamp part) here?
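The case under review is a calendar-invalid date: 2015 is not a leap year, so February 29 does not exist, with or without a trailing time part. A minimal Python sketch of that check follows. `string_to_date` is a hypothetical stand-in written for illustration; it is not Spark's `stringToDate`, and it accepts only the `YYYY-MM-DD` form, optionally followed by a time.

```python
from datetime import date
from typing import Optional


def string_to_date(text: str) -> Optional[date]:
    """Parse 'YYYY-MM-DD', optionally followed by a time part; None if invalid."""
    date_part = text.strip().split(" ")[0].split("T")[0]
    fields = date_part.split("-")
    if len(fields) != 3:
        return None
    try:
        year, month, day = (int(f) for f in fields)
        # date() raises ValueError for impossible dates such as 2015-02-29.
        return date(year, month, day)
    except ValueError:
        return None
```

Both `"2015-02-29 00:00:00"` and the bare `"2015-02-29"` take the same path through the date constructor here, which is the pair of inputs the reviewer asks the test to cover.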
[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13095#issuecomment-220813962 LGTM, pending jenkins
[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13095#issuecomment-220813959 retest this please
[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13210#issuecomment-220813852 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59087/ Test PASSed.
[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13210#issuecomment-220813851 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13210#issuecomment-220813817

**[Test build #59087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59087/consoleFull)** for PR 13210 at commit [`abc12a5`](https://github.com/apache/spark/commit/abc12a5ee606282b069ff0c326a2f32d4ed2fbe2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220813421 I see. Thank you!
[GitHub] spark pull request: [SPARK-15464][ML][MLlib][SQL][Tests] Replace S...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13242#issuecomment-220812010 cc @andrewor14
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13244#issuecomment-220811992 **[Test build #3009 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3009/consoleFull)** for PR 13244 at commit [`8b9bf51`](https://github.com/apache/spark/commit/8b9bf515423fa422d3c8436097acd87c4d09b733).
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13244#issuecomment-220811981 Looks good at high level. Will take a closer look later!
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220811972 Hm, we are trying to avoid returning RDDs in the new APIs. One thing we can do is to introduce a parallelize API that returns a Dataset?
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13121
[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13210#issuecomment-220811819 **[Test build #59087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59087/consoleFull)** for PR 13210 at commit [`abc12a5`](https://github.com/apache/spark/commit/abc12a5ee606282b069ff0c326a2f32d4ed2fbe2).
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220811816 Merging in master/2.0. Thanks.
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13246#issuecomment-220810589 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59086/ Test PASSed.
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13246#issuecomment-220810588 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13246#issuecomment-220810568

**[Test build #59086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59086/consoleFull)** for PR 13246 at commit [`3a57975`](https://github.com/apache/spark/commit/3a5797544792557a6a143784277753f4d93dd031).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11696] [ML, MLlib] Optimization: Extend...
Github user NarineK closed the pull request at: https://github.com/apache/spark/pull/9667
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809759 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59082/ Test FAILed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809758 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809739 **[Test build #59082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59082/consoleFull)** for PR 13245 at commit [`65f9746`](https://github.com/apache/spark/commit/65f9746362ac6fb227a5c8ff59717852b5ae87c4). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809417 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809418 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59085/ Test PASSed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220809398 **[Test build #59085 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59085/consoleFull)** for PR 13245 at commit [`4f6a69e`](https://github.com/apache/spark/commit/4f6a69e75d3c96f3b2ed9d93edf8d1bf958acf1c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
Github user bomeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13246#discussion_r64142270
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -227,8 +227,8 @@ object IntegerIndex {
  *  - Unnamed grouping expressions are named so that they can be referred to across phases of
  *    aggregation
  *  - Aggregations that appear multiple times are deduplicated.
- *  - The compution of the aggregations themselves is separated from the final result. For example,
- *    the `count` in `count + 1` will be split into an [[AggregateExpression]] and a final
+ *  - The computation of the aggregations themselves is separated from the final result. For
+ *    example, the `count` in `count + 1` will be split into an [[AggregateExpression]] and a final
--- End diff --
This rewrap is only needed to stay within the 100-character line limit after the typo fix on the previous line.
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13246#issuecomment-220807733 **[Test build #59086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59086/consoleFull)** for PR 13246 at commit [`3a57975`](https://github.com/apache/spark/commit/3a5797544792557a6a143784277753f4d93dd031).
[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos
GitHub user bomeng opened a pull request: https://github.com/apache/spark/pull/13246
[SPARK-15468] [SQL] some some typos
## What changes were proposed in this pull request?
Fix some typos found while browsing the code.
## How was this patch tested?
None; the changes are obvious.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/bomeng/spark typo
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/13246.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #13246
commit ff73a8ddc036e1d8edf7eaa3be2e39db4b17d67f
Author: bomeng
Date: 2016-05-19T01:32:27Z
    fix typo
commit 6b05bc95623483f96757a917508fc3737b20bc90
Author: Bo Meng
Date: 2016-05-20T18:48:17Z
    Merge remote-tracking branch 'upstream/master' into typo
commit 3a5797544792557a6a143784277753f4d93dd031
Author: Bo Meng
Date: 2016-05-21T22:32:12Z
    Merge remote-tracking branch 'upstream/master' into typo
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220807530 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220807531 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59084/ Test FAILed.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220807510 **[Test build #59084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59084/consoleFull)** for PR 13216 at commit [`c29acae`](https://github.com/apache/spark/commit/c29acaeccc5342b51f645449ee75e8e513c89c36). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/13211#discussion_r64142098
--- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala ---
@@ -437,7 +438,9 @@ class ListAccumulator[T] extends AccumulatorV2[T, java.util.List[T]] {
       s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
   }

-  override def value: java.util.List[T] = java.util.Collections.unmodifiableList(_list)
+  override def value: java.util.List[T] = _list.synchronized {
+    java.util.Collections.unmodifiableList(new ArrayList[T](_list))
--- End diff --
I think so. Allowing users to modify the list does not seem like a good idea.
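The diff in this thread copies the list while holding its lock and returns the copy wrapped as unmodifiable. A minimal, Spark-free sketch of that pattern is below. `SnapshotList` and `SnapshotListDemo` are hypothetical names used only for illustration, not Spark classes, and the explicit locking in `add` is an assumption of the sketch rather than something shown in the diff.

```scala
import java.util.{ArrayList, Collections, List => JList}

// Hypothetical stand-in for a list-backed accumulator; not Spark's ListAccumulator.
final class SnapshotList[T] {
  private val _list: JList[T] = new ArrayList[T]()

  // Writers take the same lock that readers use for copying (assumption of this sketch).
  def add(v: T): Unit = _list.synchronized { _list.add(v); () }

  // Copy under the lock, then wrap the copy: callers get a stable, read-only
  // snapshot that later writes cannot change underneath them.
  def value: JList[T] = _list.synchronized {
    Collections.unmodifiableList(new ArrayList[T](_list))
  }
}

object SnapshotListDemo {
  def main(args: Array[String]): Unit = {
    val acc = new SnapshotList[Int]
    acc.add(1)
    val snap = acc.value
    acc.add(2)                  // happens after the snapshot was taken
    println(snap.size)          // the snapshot still holds one element
    println(acc.value.size)     // a fresh snapshot holds two
  }
}
```

Wrapping `_list` directly, as the removed line did, returns a live view: it is read-only for the caller but still reflects concurrent writes, so iterating over it while another thread adds elements can fail. The copy costs one allocation per call to `value`.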
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220806369 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59083/ Test PASSed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220806446 **[Test build #59085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59085/consoleFull)** for PR 13245 at commit [`4f6a69e`](https://github.com/apache/spark/commit/4f6a69e75d3c96f3b2ed9d93edf8d1bf958acf1c).
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220806368 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220806334 **[Test build #59083 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59083/consoleFull)** for PR 13121 at commit [`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220805756 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59081/ Test FAILed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220805754 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220805723 **[Test build #59081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59081/consoleFull)** for PR 13245 at commit [`810f08a`](https://github.com/apache/spark/commit/810f08a666c5d14a2178e329b7c1727603be485e). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220805471 **[Test build #59084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59084/consoleFull)** for PR 13216 at commit [`c29acae`](https://github.com/apache/spark/commit/c29acaeccc5342b51f645449ee75e8e513c89c36).
[GitHub] spark pull request: [SPARK-15280] [Input/Output] Refactored OrcOut...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13066
[GitHub] spark pull request: [SPARK-15280] [Input/Output] Refactored OrcOut...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/13066#issuecomment-220805144 Merging to master and branch 2.0.
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220804464 LGTM pending tests.
[GitHub] spark pull request: [SPARK-14554][SQL] disable whole stage codegen...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12322#discussion_r64141139
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
@@ -620,6 +620,12 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
     val df = streaming.join(static, Seq("b"))
     assert(df.isStreaming, "streaming Dataset returned false for 'isStreaming'.")
   }
+
+  test("SPARK-14554: Dataset.map may generate wrong java code for wide table") {
+    val wideDF = sqlContext.range(10).select(Seq.tabulate(1000) {i => ('id + i).as(s"c$i")} : _*)
+    // Make sure the generated code for this plan can compile and execute.
+    wideDF.map(_.getLong(0)).collect()
--- End diff --
Do you know why this test case is so slow? It took more than 5 minutes to finish. Is this expected?
```
- SPARK-14554: Dataset.map may generate wrong java code for wide table (5 minutes, 20 seconds)
```
See the link: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220803288 **[Test build #59083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59083/consoleFull)** for PR 13121 at commit [`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220803118 retest this please
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220802722 **[Test build #59082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59082/consoleFull)** for PR 13245 at commit [`65f9746`](https://github.com/apache/spark/commit/65f9746362ac6fb227a5c8ff59717852b5ae87c4).
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220801908 **[Test build #59081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59081/consoleFull)** for PR 13245 at commit [`810f08a`](https://github.com/apache/spark/commit/810f08a666c5d14a2178e329b7c1727603be485e).
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13245#issuecomment-220801877 Hi, @rxin. I'd like to hear your opinion about this PR.
[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13245
[SPARK-15466][SQL] Make `SparkSession` as the entry point to programming with RDD too

## What changes were proposed in this pull request?

`SparkSession` greatly reduces the number of concepts that Spark users must know. Currently, `SparkSession` is defined as the entry point to programming Spark with the Dataset and DataFrame API, and an `RDD` is easy to obtain by calling `Dataset.rdd` or `DataFrame.rdd`. However, much existing code (including the examples) extracts `SparkSession.sparkContext` and keeps it in its own variable just to call `parallelize`. It would be better for usability if `SparkSession` supported RDDs seamlessly too. We can do this by simply adding a `parallelize` API.

**Example**

```scala
object SparkPi {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder
      .appName("Spark Pi")
      .getOrCreate()
-    val sc = spark.sparkContext
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(10L * slices, Int.MaxValue).toInt // avoid overflow
-    val count = sc.parallelize(1 until n, slices).map { i =>
+    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
```

```python
spark = SparkSession\
    .builder\
    .appName("PythonPi")\
    .getOrCreate()

-sc = spark._sc
-
partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
n = 10 * partitions

def f(_):
    x = random() * 2 - 1
    y = random() * 2 - 1
    return 1 if x ** 2 + y ** 2 < 1 else 0

-count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
+count = spark.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))

spark.stop()
```

## How was this patch tested?

Passes the Jenkins tests (with a new Python test), plus manual testing.

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/dongjoon-hyun/spark SPARK-15466
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/13245.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #13245
commit 810f08a666c5d14a2178e329b7c1727603be485e
Author: Dongjoon Hyun
Date: 2016-05-21T21:36:04Z
    [SPARK-15466][SQL] Make `SparkSession` as the entry point to programming with RDD too
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220801790 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220801786 **[Test build #59080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59080/consoleFull)** for PR 13121 at commit [`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220801791 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59080/ Test FAILed.
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220801477 **[Test build #59080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59080/consoleFull)** for PR 13121 at commit [`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/13121#discussion_r64140446
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfSuite.scala ---
@@ -107,6 +107,53 @@ class SQLConfSuite extends QueryTest with SharedSQLContext {
     }
   }

+  test("reset - public conf") {
+    spark.sqlContext.conf.clear()
+    val original = spark.conf.get(SQLConf.GROUP_BY_ORDINAL)
+    try{
--- End diff --
Thanks, let me fix it.
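The test in this diff saves the original value of a conf entry and opens a `try` block before changing it. A minimal sketch of that save-and-restore pattern follows. It is not the code from the PR: `ConfRestore`, `withConf`, and the in-memory `conf` map are hypothetical names used only for illustration, and a plain mutable map stands in for the session configuration.

```scala
import scala.collection.mutable

object ConfRestore {
  // Hypothetical in-memory conf; a stand-in for a real session configuration.
  val conf: mutable.Map[String, String] = mutable.Map("groupByOrdinal" -> "true")

  // Run `body` with `key` set to `value`, then restore the previous state,
  // even if `body` throws.
  def withConf[A](key: String, value: String)(body: => A): A = {
    val original = conf.get(key) // None if the key was not set before
    conf(key) = value
    try {
      body
    } finally {
      original match {
        case Some(v) => conf(key) = v
        case None    => conf.remove(key)
      }
    }
  }
}
```

Putting the restore step in `finally` keeps one test's setting from leaking into later tests when an assertion inside the block fails.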
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220799366 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220799367 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59079/ Test FAILed.
[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13105#issuecomment-220799325 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59078/ Test PASSed.
[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13105#issuecomment-220799323 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220799329 **[Test build #59079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull)** for PR 13216 at commit [`bc0d10a`](https://github.com/apache/spark/commit/bc0d10a5103c4e82dce725be792530120c9f6ff6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class TruncateTable(`
[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13105#issuecomment-220799279 **[Test build #59078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59078/consoleFull)** for PR 13105 at commit [`87c6f27`](https://github.com/apache/spark/commit/87c6f27e8755c6f72e4821cf5cd1b77baf74ed4b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13211#discussion_r64139990
--- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala ---
@@ -437,7 +438,9 @@ class ListAccumulator[T] extends AccumulatorV2[T, java.util.List[T]] {
       s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
   }

-  override def value: java.util.List[T] = java.util.Collections.unmodifiableList(_list)
+  override def value: java.util.List[T] = _list.synchronized {
+    java.util.Collections.unmodifiableList(new ArrayList[T](_list))
--- End diff --
One last thought ... now that this is cloned, does it need to be in an unmodifiable wrapper? Maybe it's still a good idea, so that the caller doesn't somehow think modifying the list modifies the accumulator.
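The trade-off raised in this comment, copying the list under a lock and still wrapping the copy as unmodifiable, can be sketched in isolation. This is a minimal sketch and not Spark's `ListAccumulator`: `SnapshotList` and its `add` method are hypothetical stand-ins, and only the body of `value` follows the line shown in the diff.

```scala
import java.util.{ArrayList, Collections, List => JList}

// Hypothetical stand-in for the accumulator under discussion; not Spark's class.
class SnapshotList[T] {
  private val _list: JList[T] = new ArrayList[T]()

  def add(v: T): Unit = _list.synchronized { _list.add(v) }

  // Copy while holding the lock so a concurrent add cannot interleave with the
  // copy, then wrap the copy so callers cannot mistake it for live state.
  def value: JList[T] = _list.synchronized {
    Collections.unmodifiableList(new ArrayList[T](_list))
  }
}
```

With the copy in place the wrapper no longer protects the accumulator's internal state. What it adds is a failure (`UnsupportedOperationException`) when a caller tries to mutate the snapshot, instead of a silent change that never reaches the accumulator.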
[GitHub] spark pull request: [SPARK-15459][SQL] Make Range logical and phys...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13239#discussion_r64139787
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -359,8 +359,8 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
         generator, join = join, outer = outer, g.output, planLater(child)) :: Nil
     case logical.OneRowRelation =>
       execution.RDDScanExec(Nil, singleRowRdd, "OneRowRelation") :: Nil
-    case r @ logical.Range(start, end, step, numSlices, output) =>
-      execution.RangeExec(start, step, numSlices, r.numElements, output) :: Nil
+    case r : logical.Range =>
--- End diff --
nit. 'case r :' -> 'case r:' ?
[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13226
[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13226#issuecomment-220796810 Merging in master/2.0.
[GitHub] spark pull request: [SPARK-15459][SQL] Make Range logical and phys...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13239#issuecomment-220796771 cc @marmbrus
[GitHub] spark pull request: [SPARK-15327] [SQL] fix split expression in wh...
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13235#issuecomment-220796543 Thanks! Not sure if I understand it correctly, but what happens when the whole-stage codegen generates code longer than 64k? I ask because I thought about fixing this issue by passing some of the variables to the generated functions (but was not sure how to do that exactly).
[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13211#issuecomment-220796529 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13211#issuecomment-220796487 **[Test build #59074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59074/consoleFull)** for PR 13211 at commit [`3af08bd`](https://github.com/apache/spark/commit/3af08bd7c417520854971d2d14fb4ae608a9522a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220796420 **[Test build #59079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull)** for PR 13216 at commit [`bc0d10a`](https://github.com/apache/spark/commit/bc0d10a5103c4e82dce725be792530120c9f6ff6).
[GitHub] spark pull request: [SPARK-15327] [SQL] fix split expression in wh...
Github user jurriaan commented on a diff in the pull request: https://github.com/apache/spark/pull/13235#discussion_r64139152
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2477,6 +2477,30 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }

+  test("SPARK-15327: fail to compile generated code with complex data structure") {
+    withTempDir{ dir =>
+      val json =
+        """
+          |{"h": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}],
+          |"b": [{"e": "test", "count": 1}]}}, "d": {"b": {"c": [{"e": "adfgd"}],
+          |"a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}},
+          |"c": {"b": {"c": [{"e": "adfgd"}], "a": [{"count": 3}],
+          |"b": [{"e": "test", "count": 1}]}}, "a": {"b": {"c": [{"e": "adfgd"}],
+          |"a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}},
+          |"e": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}],
+          |"b": [{"e": "test", "count": 1}]}}, "g": {"b": {"c": [{"e": "adfgd"}],
+          |"a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}},
+          |"f": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}],
+          |"b": [{"e": "test", "count": 1}]}}, "b": {"b": {"c": [{"e": "adfgd"}],
+          |"a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}}}'
--- End diff --
Nice fixture, haha!
[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13226#issuecomment-220796372 LGTM
[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13216#issuecomment-220796197 retest this please
[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13243#issuecomment-220796216 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13243#issuecomment-220796188 **[Test build #59077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59077/consoleFull)** for PR 13243 at commit [`59b0a76`](https://github.com/apache/spark/commit/59b0a76a2dc2ed6484f005880001c273536088ae). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13243#issuecomment-220796217 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59077/ Test FAILed.
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13244#issuecomment-220796189 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13188#issuecomment-220796056 I'm going to cherry-pick this into 2.0 since it has caused confusion and people thought 2.0 couldn't run the queries.
[GitHub] spark pull request: [SPARK-15282][SQL] PushDownPredicate should no...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-220796055 Thank you, @cloud-fan. By the way, to be clear on this, should we revert the change to `PushDownPredicate`, too? I think that is a separate issue. If the decision on it is finalized as well, I can update this PR again. Thank you for the fast decision, @marmbrus, @markhamstra, @thunterdb, @cloud-fan.
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13244#issuecomment-220796042 @rxin Could you take a look at this?
[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...
GitHub user jurriaan opened a pull request: https://github.com/apache/spark/pull/13244

[SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 or -1

## What changes were proposed in this pull request?

This PR makes BroadcastHint more deterministic by using a special isBroadcastable property instead of setting the sizeInBytes to 1. See https://issues.apache.org/jira/browse/SPARK-15415

## How was this patch tested?

Added test cases to check that the broadcast hash join is included in the plan when the BroadcastHint is supplied, and also tests for propagation of the joins.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jurriaan/spark broadcast-hint

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13244.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #13244

commit 8b9bf515423fa422d3c8436097acd87c4d09b733
Author: Jurriaan Pruis
Date: 2016-05-21T19:18:54Z

    [SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is low or disabled

    This makes BroadcastHint more deterministic by using a special isBroadcastable property instead of setting the sizeInBytes to 1
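The approach described in this PR, carrying an explicit flag instead of faking a size of 1, can be illustrated with a small standalone sketch. It is not the patch itself: `Statistics` here is a simplified stand-in holding only the two fields the description names, and `BroadcastDecision.canBroadcast` with its threshold handling (a value of 0 or below meaning that size-based broadcasting is off) is a hypothetical helper written under that assumption.

```scala
// Simplified stand-in with the two fields named in the PR description.
case class Statistics(sizeInBytes: BigInt, isBroadcastable: Boolean = false)

object BroadcastDecision {
  // Hypothetical helper: an explicit hint always wins. Without a hint, fall back
  // to the size check, which is skipped entirely when the threshold is 0 or below.
  def canBroadcast(stats: Statistics, autoBroadcastJoinThreshold: Long): Boolean =
    stats.isBroadcastable ||
      (autoBroadcastJoinThreshold > 0 &&
        stats.sizeInBytes <= BigInt(autoBroadcastJoinThreshold))
}
```

Under these assumptions a hinted relation stays broadcastable whatever the threshold is, while a trick based on `sizeInBytes = 1` has nothing to compare against once the size check is disabled.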
[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/13121#issuecomment-220795764 LGTM