[GitHub] spark issue #8849: [SPARK-9883][MLlib] Distance to each kmean cluster given ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/8849 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this featu

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15436 Merged build finished. Test FAILed.

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67337/ Test FAILed.

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #67337 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67337/consoleFull)** for PR 15436 at commit [`d4bb9e4`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15583: [SPARK-18047][Core] Spark worker port should be g...

2016-10-21 Thread darionyaphet
Github user darionyaphet closed the pull request at: https://github.com/apache/spark/pull/15583

[GitHub] spark pull request #15566: [SPARK-18026][SQL] should not always lowercase pa...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15566#discussion_r84491905 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -209,42 +209,41 @@ case class PreprocessTableInsertion(con

[GitHub] spark pull request #15556: [SPARK-18010][Core] Reduce work performed for bui...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15556#discussion_r84492074 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -557,7 +560,8 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark issue #15566: [SPARK-18026][SQL] should not always lowercase partition...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15566 **[Test build #67340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67340/consoleFull)** for PR 15566 at commit [`3d6f418`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #15583: [SPARK-18047][Core] Spark worker port should be greater ...

2016-10-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15583 +1 for not a problem if I understood this correctly.

[GitHub] spark issue #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for truePos...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15585 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67338/ Test PASSed.

[GitHub] spark issue #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for truePos...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15585 **[Test build #67338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67338/consoleFull)** for PR 15585 at commit [`c11ba29`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for truePos...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15585 Merged build finished. Test PASSed.

[GitHub] spark pull request #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for ...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15585#discussion_r84483984 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/RegressionMetrics.scala --- @@ -73,7 +73,7 @@ class RegressionMetrics @Since("2.0.0") (

[GitHub] spark pull request #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for ...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15585#discussion_r84484225 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/evaluation/RankingMetricsSuite.scala --- @@ -65,5 +65,4 @@ class RankingMetricsSuite extends SparkFunSu

[GitHub] spark pull request #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for ...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15585#discussion_r84484357 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/evaluation/MultilabelMetricsSuite.scala --- @@ -47,7 +47,7 @@ class MultilabelMetricsSuite extends Spar

[GitHub] spark issue #15586: [MINOR] Tiny follow-up to SPARK-16606, to correct more i...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15586 **[Test build #67339 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67339/consoleFull)** for PR 15586 at commit [`4ea0539`](https://github.com/apache/spark/commit/4

[GitHub] spark pull request #15556: [SPARK-18010][Core] Reduce work performed for bui...

2016-10-21 Thread vijoshi
Github user vijoshi commented on a diff in the pull request: https://github.com/apache/spark/pull/15556#discussion_r84482418 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -557,7 +560,8 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request #15586: [MINOR] Tiny follow-up to SPARK-16606, to correct...

2016-10-21 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/15586 [MINOR] Tiny follow-up to SPARK-16606, to correct more instances of the same log message typo ## What changes were proposed in this pull request? Tiny follow-up to SPARK-16606 / https://git

[GitHub] spark issue #10881: [SPARK-12967][Netty] Avoid NettyRpc error message during...

2016-10-21 Thread 1236897
Github user 1236897 commented on the issue: https://github.com/apache/spark/pull/10881 could i know how to merge the updated code to my project to avoid this error?

[GitHub] spark issue #13526: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedD...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13526 That's a good point, let's focus on `ds.groupBy(...).mapValues(...)` then. One thought, in `mapValues`, we will project away the previous value attributes, so the workflow should be: ``` c

[GitHub] spark pull request #13036: [SPARK-15243][ML][SQL][PYSPARK] Param methods sho...

2016-10-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/13036#discussion_r84476560 --- Diff: python/pyspark/sql/tests.py --- @@ -799,25 +799,30 @@ def test_first_last_ignorenulls(self): def test_approxQuantile(self):

[GitHub] spark issue #15297: [SPARK-9862]Handling data skew

2016-10-21 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15297 Ok so how does that affect the overall job and # of outputs? I don't know the internals of Spark SQL so sorry if I'm missing something obvious. Basically now you will have multiple tasks whereas

[GitHub] spark issue #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for truePos...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15585 **[Test build #67338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67338/consoleFull)** for PR 15585 at commit [`c11ba29`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15539 We have REFRESH TABLE/PATH because we cache things, so I think we should consider caching and refreshing together. Currently we have 4 caches: 1. **table name to `LogicalRelation` cache**:

[GitHub] spark pull request #15585: [SPARK-18049][MLLIB][TEST] Add missing tests for ...

2016-10-21 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/15585 [SPARK-18049][MLLIB][TEST] Add missing tests for truePositiveRate and weightedTruePositiveRate ## What changes were proposed in this pull request? Add missing tests for `truePositiveRate`

[GitHub] spark issue #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return duplica...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15450 **[Test build #67335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67335/consoleFull)** for PR 15450 at commit [`793e4d5`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67334/consoleFull)** for PR 15513 at commit [`f71e39e`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15435: [SPARK-17139][ML] Add model summary for Multinomi...

2016-10-21 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/15435#discussion_r84468260 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -1756,55 +1765,105 @@ class LogisticRegressionSu

[GitHub] spark issue #13526: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedD...

2016-10-21 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/13526 @cloud-fan i can try to optimize ```grouped.mapValues(...).mapValues(...)``` but its a bit of an anti-pattern (there should be no need to do mapValues twice) so i dont think there is much gain

[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67333/consoleFull)** for PR 15513 at commit [`8a773d4`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67332/consoleFull)** for PR 15582 at commit [`1c5277d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15584: [SPARK-17898] [DOCS] --repositories needs username and p...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15584 Merged build finished. Test PASSed.

[GitHub] spark issue #15584: [SPARK-17898] [DOCS] --repositories needs username and p...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15584 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67336/ Test PASSed.

[GitHub] spark issue #15584: [SPARK-17898] [DOCS] --repositories needs username and p...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15584 **[Test build #67336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67336/consoleFull)** for PR 15584 at commit [`a08b5a9`](https://github.com/apache/spark/commit/

[GitHub] spark issue #1778: [MLlib] [SPARK-2885] DIMSUM: All-pairs similarity

2016-10-21 Thread appierys
Github user appierys commented on the issue: https://github.com/apache/spark/pull/1778 Does anyone know how to extend this to the 'Cross Product' case as mentioned in the paper?

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #67337 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67337/consoleFull)** for PR 15436 at commit [`d4bb9e4`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #15584: [SPARK-17898] [DOCS] --repositories needs username and p...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15584 **[Test build #67336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67336/consoleFull)** for PR 15584 at commit [`a08b5a9`](https://github.com/apache/spark/commit/a

[GitHub] spark pull request #15584: [SPARK-17898] [DOCS] --repositories needs usernam...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15584#discussion_r84458669 --- Diff: docs/programming-guide.md --- @@ -182,7 +182,7 @@ variable called `sc`. Making your own SparkContext will not work. You can set wh context conn

[GitHub] spark pull request #15584: [SPARK-17898] [DOCS] --repositories needs usernam...

2016-10-21 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/15584 [SPARK-17898] [DOCS] --repositories needs username and password ## What changes were proposed in this pull request? Document `user:password@` syntax as possible means of specifying credenti

[GitHub] spark pull request #15536: [SPARK-13275] [Web UI] Visually clarified executo...

2016-10-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15536

[GitHub] spark issue #15581: [SPARK-18044][STREAMING] FileStreamSource should not inf...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15581 Merged build finished. Test PASSed.

[GitHub] spark issue #15536: [SPARK-13275] [Web UI] Visually clarified executors star...

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15536 Merged to master

[GitHub] spark issue #15581: [SPARK-18044][STREAMING] FileStreamSource should not inf...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15581 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67331/ Test PASSed.

[GitHub] spark issue #15581: [SPARK-18044][STREAMING] FileStreamSource should not inf...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15581 **[Test build #67331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67331/consoleFull)** for PR 15581 at commit [`ad7ef81`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15382: [SPARK-17810] [SQL] Default spark.sql.warehouse.dir is r...

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15382 I think this would have to go into 2.0.x as I think it's a regression from 2.0.0, in that the default location moved from a local file to HDFS for many users, and, it's an HDFS path that can't be cre

[GitHub] spark issue #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return duplica...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15450 **[Test build #67335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67335/consoleFull)** for PR 15450 at commit [`793e4d5`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-21 Thread vijoshi
Github user vijoshi commented on the issue: https://github.com/apache/spark/pull/15410 @ajbozarth yeah sounds useful. but note that for the very first load, we would have no 'Last Updated' value to display since that gets set only after the log scan+replay cycle completes at least onc

[GitHub] spark pull request #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return ...

2016-10-21 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15450#discussion_r84455681 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -378,10 +382,10 @@ class KMeans private ( costs.unpersist(blocki

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15513 I just addressed the comments and updated the PR description. I will remove `[WIP]` although I will double-check again tomorrow.

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67334/consoleFull)** for PR 15513 at commit [`f71e39e`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #15583: [SPARK-18047][Core] Spark worker port should be greater ...

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15583 (See JIRA)

[GitHub] spark issue #15583: [SPARK-18047][Core] Spark worker port should be greater ...

2016-10-21 Thread darionyaphet
Github user darionyaphet commented on the issue: https://github.com/apache/spark/pull/15583 Could you please point out the reason why?

[GitHub] spark issue #15580: [SPARK-18042][SQL] OutputWriter should expose file path ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67330/ Test PASSed.

[GitHub] spark issue #15580: [SPARK-18042][SQL] OutputWriter should expose file path ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15580 Merged build finished. Test PASSed.

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67333/consoleFull)** for PR 15513 at commit [`8a773d4`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #15580: [SPARK-18042][SQL] OutputWriter should expose file path ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15580 **[Test build #67330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67330/consoleFull)** for PR 15580 at commit [`1942361`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15483: [SPARK-17935][SQL]Add KafkaForeachWriter in external kaf...

2016-10-21 Thread zhangxinyu1
Github user zhangxinyu1 commented on the issue: https://github.com/apache/spark/pull/15483 @marmbrus Yes, I also think `KafkaSink` is better than `KafkaForeachWriter` for users. I'm trying to complete `KafkaSink` these days. Like `KafkaSource` in external kafka-0-10-sql, there i

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15556 Merged build finished. Test PASSed.

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15556 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67329/ Test PASSed.

[GitHub] spark issue #15583: [SPARK-18047][Core] Spark worker port should be greater ...

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15583 I disagree with this change, per the JIRA

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15556 **[Test build #67329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67329/consoleFull)** for PR 15556 at commit [`552c405`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15583: [SPARK-18047][Core] Spark worker port should be greater ...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15583 Can one of the admins verify this patch?

[GitHub] spark pull request #15583: [SPARK-18047][Core] Spark worker port should be g...

2016-10-21 Thread darionyaphet
GitHub user darionyaphet opened a pull request: https://github.com/apache/spark/pull/15583 [SPARK-18047][Core] Spark worker port should be greater than 1023 ## What changes were proposed in this pull request? The port numbers in the range from 0 to 1023 are the well-known po

[GitHub] spark issue #15417: [SPARK-17851][SQL][TESTS] Make sure all test sqls in cat...

2016-10-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15417 ping @gatorsmile

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67332/consoleFull)** for PR 15582 at commit [`1c5277d`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15556 Merged build finished. Test FAILed.

[GitHub] spark pull request #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnal...

2016-10-21 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/15582 [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSuite` to package `sql` ## What changes were proposed in this pull request? The testsuite `HiveDataFrameAnalyticsSuite` has nothin

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15556 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67328/ Test FAILed.

[GitHub] spark issue #15556: [SPARK-18010][Core] Reduce work performed for building u...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15556 **[Test build #67328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67328/consoleFull)** for PR 15556 at commit [`1a53388`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15579: Added support for extra command in front of spark.

2016-10-21 Thread sheepduke
Github user sheepduke commented on the issue: https://github.com/apache/spark/pull/15579 @mridulm Yes, I tested it in our cluster (5 nodes including 1 master). My colleague tested with some benchmarks. It seems that NUMA helps a lot for those applications that have very bad cache loca

[GitHub] spark pull request #15580: [SPARK-18042][SQL] OutputWriter should expose fil...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15580#discussion_r84442086 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOutputWriter.scala --- @@ -140,16 +143,17 @@ private[parquet]

[GitHub] spark pull request #15564: [SPARK-17331][FOLLOWUP][ML][CORE] Avoid allocatin...

2016-10-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15564

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15539 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67326/ Test PASSed.

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15539 Build finished. Test PASSed.

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15539 **[Test build #67326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67326/consoleFull)** for PR 15539 at commit [`262f6ee`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15514: [SPARK-17960][PySpark] [Upgrade to Py4J 0.10.4]

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15514 Merged to master

[GitHub] spark issue #15564: [SPARK-17331][FOLLOWUP][ML][CORE] Avoid allocating 0-len...

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15564 Merged to master

[GitHub] spark pull request #15514: [SPARK-17960][PySpark] [Upgrade to Py4J 0.10.4]

2016-10-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15514

[GitHub] spark issue #15579: Added support for extra command in front of spark.

2016-10-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15579 I'm still not clear this is worth doing just for numactl which few or no OSes will have installed by default

[GitHub] spark pull request #15580: [SPARK-18042][SQL] OutputWriter should expose fil...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15580#discussion_r84441052 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOutputWriter.scala --- @@ -80,15 +81,15 @@ private[parquet] cl

[GitHub] spark issue #15581: [SPARK-18044][STREAMING] FileStreamSource should not inf...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15581 CC @zsxwing @brkyvz @yhuai

[GitHub] spark issue #15581: [SPARK-18044][STREAMING] FileStreamSource should not inf...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15581 **[Test build #67331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67331/consoleFull)** for PR 15581 at commit [`ad7ef81`](https://github.com/apache/spark/commit/a

[GitHub] spark pull request #15581: [SPARK-18044][STREAMING] FileStreamSource should ...

2016-10-21 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15581 [SPARK-18044][STREAMING] FileStreamSource should not infer partitions in every batch ## What changes were proposed in this pull request? In `FileStreamSource.getBatch`, we will create a

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14847 ping @rxin @cloud-fan Please review this when you have time. Thanks!

[GitHub] spark issue #15484: [SPARK-17868][SQL] Do not use bitmasks during parsing an...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15484 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67325/ Test PASSed.

[GitHub] spark issue #15484: [SPARK-17868][SQL] Do not use bitmasks during parsing an...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15484 Merged build finished. Test PASSed.

[GitHub] spark issue #15484: [SPARK-17868][SQL] Do not use bitmasks during parsing an...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15484 **[Test build #67325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67325/consoleFull)** for PR 15484 at commit [`925a5ca`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14847 @ioana-delaney I've added the test using a bucketed table. Please take a look. Thanks.

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14847 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67324/ Test PASSed.

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14847 Merged build finished. Test PASSed.

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14847 **[Test build #67324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67324/consoleFull)** for PR 14847 at commit [`3c7c1b5`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15526: [SPARK-17986] [ML] SQLTransformer should remove temporar...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15526 Merged build finished. Test PASSed.

[GitHub] spark issue #15526: [SPARK-17986] [ML] SQLTransformer should remove temporar...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15526 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67327/ Test PASSed.

[GitHub] spark issue #15526: [SPARK-17986] [ML] SQLTransformer should remove temporar...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15526 **[Test build #67327 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67327/consoleFull)** for PR 15526 at commit [`d5c3b41`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15579: Added support for extra command in front of spark.

2016-10-21 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/15579 Btw, curious if you have actually tested this in YARN - I have a feeling it won't work.

[GitHub] spark issue #15580: [SPARK-18042][SQL] OutputWriter should expose file path ...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15580 **[Test build #67330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67330/consoleFull)** for PR 15580 at commit [`1942361`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #15580: [SPARK-18042][SQL] OutputWriter should expose file path ...

2016-10-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15580 cc @hvanhovell @cloud-fan and @ericl

[GitHub] spark pull request #15580: [SPARK-18042][SQL] OutputWriter should expose fil...

2016-10-21 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/15580 [SPARK-18042][SQL] OutputWriter should expose file path written ## What changes were proposed in this pull request? This patch adds a new "path" method on OutputWriter that returns the path of the

[GitHub] spark pull request #15556: [SPARK-18010][Core] Reduce work performed for bui...

2016-10-21 Thread vijoshi
Github user vijoshi commented on a diff in the pull request: https://github.com/apache/spark/pull/15556#discussion_r84430841 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ReplayListenerBus.scala --- @@ -43,38 +43,56 @@ private[spark] class ReplayListenerBus extends Spar

[GitHub] spark issue #13526: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedD...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13526 To optimize `ds.groupBy(...).mapValues(...)`, yea it's not trivial as you explained above. But for `grouped.mapValues(...).mapValues(...)`, I think it should not be that hard, as it's a pattern li
