[GitHub] spark issue #13554: [Documentation] Fixed target JAR path

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13554 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #13554: [Documentation] Fixed target JAR path

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13554 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60156/ Test PASSed. ---

[GitHub] spark issue #13554: [Documentation] Fixed target JAR path

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13554 **[Test build #60156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60156/consoleFull)** for PR 13554 at commit [`ae90f1b`](https://github.com/apache/spark/commit/

[GitHub] spark issue #10706: [SPARK-12543] [SPARK-4226] [SQL] Subquery in expression

2016-06-07 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/10706 Predicate subqueries (IN, EXISTS) in SELECT are not supported in 2.0; they are only supported in WHERE/HAVING.

[GitHub] spark issue #13554: [Documentation] Fixed JAR path

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13554 **[Test build #60156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60156/consoleFull)** for PR 13554 at commit [`ae90f1b`](https://github.com/apache/spark/commit/a

[GitHub] spark pull request #13554: [Documentation] Fixed JAR path

2016-06-07 Thread prabeesh
GitHub user prabeesh opened a pull request: https://github.com/apache/spark/pull/13554 [Documentation] Fixed JAR path ## What changes were proposed in this pull request? Spark-2.0.0 uses Scala-2.11 ## How was this patch tested? n/a You can merge this

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13549 Any reason why there isn't a jira ticket?

[GitHub] spark issue #13496: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from ...

2016-06-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13496 LGTM except one minor comment, thanks for working on it!

[GitHub] spark pull request #13496: [SPARK-15753][SQL] Move Analyzer stuff to Analyze...

2016-06-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13496#discussion_r66198253 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -452,6 +452,21 @@ class Analyzer( def ap

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13537 **[Test build #60154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60154/consoleFull)** for PR 13537 at commit [`1d5574a`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13189 **[Test build #60155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60155/consoleFull)** for PR 13189 at commit [`bc8d102`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #13532: [SPARK-15204][SQL] improve nullability inference for Agg...

2016-06-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13532 can we wait for https://github.com/apache/spark/pull/13553? thanks!

[GitHub] spark pull request #13537: [SPARK-15794] Should truncate toString() of very ...

2016-06-07 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13537#discussion_r66197227 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala --- @@ -293,8 +294,8 @@ case class StructType(fields: Array[StructField]) ex

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13537 **[Test build #60153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60153/consoleFull)** for PR 13537 at commit [`c52c44d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60153/ Test FAILed. ---

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13537 Build finished. Test FAILed.

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13537 **[Test build #60153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60153/consoleFull)** for PR 13537 at commit [`c52c44d`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #13553: [SPARK-15814][SQL] Aggregator can return null result

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13553 **[Test build #60152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60152/consoleFull)** for PR 13553 at commit [`3471199`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 Ok. I got it. So I think the point is that the memory usage reduction does not justify making this change. Let me close it now.

[GitHub] spark pull request #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce ...

2016-06-07 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/13439

[GitHub] spark issue #13553: [SPARK-15814][SQL] Aggregator can return null result

2016-06-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13553 cc @yhuai @liancheng @clockfly

[GitHub] spark pull request #13553: [SPARK-15814][SQL] Aggregator can return null res...

2016-06-07 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/13553 [SPARK-15814][SQL] Aggregator can return null result ## What changes were proposed in this pull request? It's similar to the bug fixed in https://github.com/apache/spark/pull/13425, we s

[GitHub] spark issue #13537: [SPARK-15794] Should truncate toString() of very wide pl...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13537 **[Test build #60151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60151/consoleFull)** for PR 13537 at commit [`68a97dc`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13439 Going back to my original question: what's the point of this complicated pull request? How much memory would you save in practice? The column batches are not for persistent memory storage yet, and they

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 So would it be more practical to benchmark the case in which some constant and some non-constant column vectors are used together? And compare it with the original case in which all columns a

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 I see. My question is this: suppose we create 2 column vectors, one constant and one not. Because we will not re-use the column vectors, their constant flag is fixed and does not change. As th

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13439 What I meant is that if in one process you have some invocation of the function that would hit the true branch, and some other invocation of the function that would hit the false branch, the performanc

[GitHub] spark pull request #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/13548

[GitHub] spark issue #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13548 **[Test build #60150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60150/consoleFull)** for PR 13548 at commit [`41b7b79`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13548 **[Test build #3069 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3069/consoleFull)** for PR 13548 at commit [`41b7b79`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread squito
GitHub user squito reopened a pull request: https://github.com/apache/spark/pull/13548 [DO NOT MERGE] lots of blacklist testing making jenkins run the scheduler tests a lot You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/sp

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-07 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Thank you for the quick responses @sun-rui and @shivaram. Here is how the `dataframe.queryExecution.toString` printout starts: == Parsed Logical Plan == 'SerializeFromObject

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13413 **[Test build #60149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60149/consoleFull)** for PR 13413 at commit [`5145e53`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-07 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13413 @maropu Thanks for the review, addressed all the comments

[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-07 Thread techaddict
Github user techaddict commented on a diff in the pull request: https://github.com/apache/spark/pull/13413#discussion_r66192955 --- Diff: python/pyspark/sql/tests.py --- @@ -1481,17 +1481,7 @@ def test_list_functions(self): spark.sql("CREATE DATABASE some_db")

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 Besides, I just wrote this test according to other tests in `ColumnarBatchBenchmark` that benchmark on-heap, off-heap column vector access. I was thinking it might be enough. If not, any else need to

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 hmm, but as the flag is set, I think it will not be changed?

[GitHub] spark issue #13526: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedD...

2016-06-07 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/13526 could we "rewind"/undo the append for the key and change it to a map that inserts new values and key? so remove one append and replace it with another operation?

[GitHub] spark pull request #13547: Update KafkaWordCount.scala

2016-06-07 Thread ShreyasFadnavis
Github user ShreyasFadnavis closed the pull request at: https://github.com/apache/spark/pull/13547

[GitHub] spark issue #10706: [SPARK-12543] [SPARK-4226] [SQL] Subquery in expression

2016-06-07 Thread kamalcoursera
Github user kamalcoursera commented on the issue: https://github.com/apache/spark/pull/10706 Hi Davies, could you please shed more light on the status of correlated but non-scalar subqueries in the Spark 2.0 release? I would appreciate it if you could summarize any other restrictions.

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13439 I am not sure if you are really testing it correctly -- your benchmark is most likely just testing how well the CPU does branch prediction when the flag is always true or false.

[GitHub] spark issue #13526: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedD...

2016-06-07 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/13526 the tricky part with that is that (ds: Dataset[(K, V)]).groupBy(_._1).mapValues(_._2) should return a KeyValueGroupedDataset[K, V] On Tue, Jun 7, 2016 at 8:22 PM, Wenchen Fan
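
As a rough illustration of the semantics discussed in this message (not Spark's implementation or API), the sketch below uses plain Python dictionaries in place of a Dataset. The helper names `group_by_key` and `map_values` are hypothetical. It shows why grouping `(K, V)` pairs by key leaves whole pairs in each group, and how mapping the values afterwards yields groups keyed by `K` that hold only `V`.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Group (key, value) pairs by key, keeping each whole pair in its group."""
    groups = defaultdict(list)
    for pair in pairs:
        groups[pair[0]].append(pair)
    return dict(groups)


def map_values(groups, f):
    """Apply f to every element of every group; the keys are left untouched."""
    return {key: [f(item) for item in items] for key, items in groups.items()}


pairs = [("a", 1), ("b", 2), ("a", 3)]
grouped = group_by_key(pairs)                      # groups hold (K, V) pairs
values_only = map_values(grouped, lambda p: p[1])  # groups hold only V
```

Here `grouped` corresponds to the state after grouping by `_._1`, and `values_only` to the result the message says `mapValues(_._2)` should produce.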

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13549 Merged build finished. Test PASSed.

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13549 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60148/ Test PASSed. ---

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13549 **[Test build #60148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60148/consoleFull)** for PR 13549 at commit [`a287a9a`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13552: [SPARK-15813] Use past tense for the cancel container re...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13552 Can one of the admins verify this patch?

[GitHub] spark pull request #13552: [SPARK-15813] Use past tense for the cancel conta...

2016-06-07 Thread peterableda
GitHub user peterableda opened a pull request: https://github.com/apache/spark/pull/13552 [SPARK-15813] Use past tense for the cancel container request message ## What changes were proposed in this pull request? Use past tense for the cancel container request message as it is log

[GitHub] spark issue #13543: [SPARK-15806] [Documentation] update doc for SPARK_MASTE...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13543 Merged build finished. Test PASSed.

[GitHub] spark issue #13543: [SPARK-15806] [Documentation] update doc for SPARK_MASTE...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13543 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60146/ Test PASSed. ---

[GitHub] spark issue #13543: [SPARK-15806] [Documentation] update doc for SPARK_MASTE...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13543 **[Test build #60146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60146/consoleFull)** for PR 13543 at commit [`adcaaab`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13550: SPARK-15755

2016-06-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13550 @marymwu this has been fixed in https://github.com/apache/spark/commit/09b3c56c91831b3e8d909521b8f3ffbce4eb0395. Could you close this PR?

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for distinct/dropDupl...

2016-06-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13545 What do you think about `dropDuplicates`? 1. ds.select("_1", "_2", "_3").dropDuplicates(Seq("_1", "_2")).orderBy("_1", "_2").show() 2. ds.select("_1", "_2", "_3").dropDuplicates("_1", "_
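
The difference between the two call styles in this message can be sketched in plain Python, with a list of dictionaries standing in for the Dataset. The function `drop_duplicates` below is a hypothetical stand-in, not Spark's method, and it assumes the first row seen for each key is the one kept; the sketch makes no claim about which row Spark retains.

```python
def drop_duplicates(rows, *cols):
    """Keep the first row seen for each distinct combination of `cols`.

    With no column names, whole rows are compared.
    """
    seen = set()
    kept = []
    for row in rows:
        names = cols if cols else sorted(row)
        key = tuple(row[name] for name in names)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"_1": 1, "_2": "x", "_3": 10},
    {"_1": 1, "_2": "x", "_3": 20},
    {"_1": 2, "_2": "y", "_3": 30},
]

by_seq = drop_duplicates(rows, *["_1", "_2"])   # sequence form, unpacked
by_varargs = drop_duplicates(rows, "_1", "_2")  # varargs form
```

Both calls select the same rows; the varargs form only saves the caller from building a sequence first.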

[GitHub] spark pull request #13551: merge original repository

2016-06-07 Thread AllenShi
Github user AllenShi closed the pull request at: https://github.com/apache/spark/pull/13551

[GitHub] spark pull request #13551: merge original repository

2016-06-07 Thread AllenShi
GitHub user AllenShi opened a pull request: https://github.com/apache/spark/pull/13551 merge original repository ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please exp

[GitHub] spark issue #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13548 Merged build finished. Test FAILed.

[GitHub] spark issue #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13548 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60138/ Test FAILed. ---

[GitHub] spark issue #13548: [DO NOT MERGE] lots of blacklist testing

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13548 **[Test build #60138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60138/consoleFull)** for PR 13548 at commit [`5bc48f2`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13550: SPARK-15755

2016-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13550 It would be nicer if this PR followed https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and had a test.

[GitHub] spark issue #13550: SPARK-15755

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13550 Can one of the admins verify this patch?

[GitHub] spark issue #13371: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13371 cc @rxin Can you also take a look at this? It has been waiting for a while too. Thanks!

[GitHub] spark pull request #13550: SPARK-15755

2016-06-07 Thread marymwu
GitHub user marymwu opened a pull request: https://github.com/apache/spark/pull/13550 SPARK-15755 JIRA Issue: https://issues.apache.org/jira/browse/SPARK-15755 java.lang.NullPointerException when running Spark 2.0 with spark.serializer=org.apache.spark.serializer.KryoSeriali

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13439 @rxin hmm, I just think that if we can improve it by just adding a conditional check, it might be worth doing. As for the performance hit, this is the benchmark for on-heap and off-heap column vectors be

[GitHub] spark issue #12258: [SPARK-14485][CORE] ignore task finished for executor lo...

2016-06-07 Thread zhonghaihua
Github user zhonghaihua commented on the issue: https://github.com/apache/spark/pull/12258 @vanzin my JIRA username is `iward`. Thanks a lot.

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13549 **[Test build #60148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60148/consoleFull)** for PR 13549 at commit [`a287a9a`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #13549: Added support for sorting after streaming aggregation wi...

2016-06-07 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/13549 @marmbrus

[GitHub] spark pull request #13549: Added support for sorting after streaming aggrega...

2016-06-07 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/13549#discussion_r66182722 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala --- @@ -123,27 +159,6 @@ object UnsupportedOpera

[GitHub] spark pull request #13549: Added support for sorting after streaming aggrega...

2016-06-07 Thread tdas
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/13549 Added support for sorting after streaming aggregation with complete mode ## What changes were proposed in this pull request? When the output mode is complete, then the output of a streaming a

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60147/ Test PASSed. ---

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13544 **[Test build #60147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60147/consoleFull)** for PR 13544 at commit [`e86119e`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-07 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66182476 --- Diff: R/pkg/R/mllib.R --- @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) { invisible(x) }

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13544 Merged build finished. Test PASSed.

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13439 Merged build finished. Test PASSed.

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60141/ Test PASSed. ---

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13439 **[Test build #60141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60141/consoleFull)** for PR 13439 at commit [`2226efc`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13540: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13540 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60145/ Test PASSed. ---

[GitHub] spark issue #13300: [SPARK-15463][SQL] support creating dataframe out of Dat...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13300 @pjfanning we are now focusing on bug fixes and stability fixes rather than adding new features.

[GitHub] spark issue #13540: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13540 Merged build finished. Test PASSed.

[GitHub] spark issue #13540: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13540 **[Test build #60145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60145/consoleFull)** for PR 13540 at commit [`d1c00da`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13545: [SPARK-15807][SQL] Support varargs for distinct/dropDupl...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13545 For API design it would be better to be very conservative, because we cannot remove APIs. There is always value in adding something, but there is also a cost to maintenance and user experience (too man

[GitHub] spark issue #13542: [SPARK-15730][SQL][WIP] Respect the --hiveconf in the sp...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13542 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60144/ Test PASSed. ---

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13439 @viirya this is still a pretty major change for unclear benefits. There might be other more important things that need more eyes on...

[GitHub] spark issue #13542: [SPARK-15730][SQL][WIP] Respect the --hiveconf in the sp...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13542 **[Test build #60144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60144/consoleFull)** for PR 13542 at commit [`1fc8a30`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13542: [SPARK-15730][SQL][WIP] Respect the --hiveconf in the sp...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13542 Merged build finished. Test PASSed.

[GitHub] spark pull request #13545: [SPARK-15807][SQL] Support varargs for distinct/d...

2016-06-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13545#discussion_r66181659 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2262,6 +2275,19 @@ class Dataset[T] private[sql]( def distinct(): Dataset[T]

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13439 Wouldn't this hurt performance even more due to the extra branch?

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13544 **[Test build #60147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60147/consoleFull)** for PR 13544 at commit [`e86119e`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13439 Merged build finished. Test PASSed.

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60140/ Test PASSed. ---

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60143/ Test PASSed. ---

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13544 Merged build finished. Test PASSed.

[GitHub] spark issue #13439: [SPARK-15701][SQL] Modify ColumnVector to reduce memory ...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13439 **[Test build #60140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60140/consoleFull)** for PR 13439 at commit [`07ef523`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13544 **[Test build #60143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60143/consoleFull)** for PR 13544 at commit [`20bff4b`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13543: [SPARK-15806] [Documentation] update doc for SPARK_MASTE...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13543 **[Test build #60146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60146/consoleFull)** for PR 13543 at commit [`adcaaab`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #13540: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13540 **[Test build #60145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60145/consoleFull)** for PR 13540 at commit [`d1c00da`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #13540: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf...

2016-06-07 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/13540 Thanks @BryanCutler @MechCoder @MLnick for the review. I just updated the PR to make it a property. Regarding the pyspark docs, I think there's an umbrella JIRA for parity between Scala MLlib and PySpark MLlib,

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/13544 @rxin a small problem: in `HiveContext` there is a method `refreshTable` for refreshing the metadata of a Hive table. Now, using the new SparkSession API with Hive support, the method is remov

[GitHub] spark issue #13544: [SPARK-15805][SQL][Documents] update sql programming gui...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13544 **[Test build #60143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60143/consoleFull)** for PR 13544 at commit [`20bff4b`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #13542: [SPARK-15730][SQL][WIP] Respect the --hiveconf in the sp...

2016-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13542 **[Test build #60144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60144/consoleFull)** for PR 13542 at commit [`1fc8a30`](https://github.com/apache/spark/commit/1

[GitHub] spark pull request #12938: [SPARK-15162][SPARK-15164][PySpark][DOCS][ML] upd...

2016-06-07 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12938#discussion_r66177599 --- Diff: python/pyspark/ml/classification.py --- @@ -183,7 +191,7 @@ def getThresholds(self): If :py:attr:`thresholds` is set, return its value.

[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

2016-06-07 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13189 Seems it is fine to not have metrics when we use hiveResultString.
