[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3721#discussion_r22027430 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/WriteAheadLogManager.scala --- @@ -125,7 +125,7 @@ private[streaming] class WriteAheadLogM

[GitHub] spark pull request: [WIP] [SPARK-4273] [SQL] Providing ExternalSet...

2014-12-17 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3137#issuecomment-67452153 @marmbrus Thanks. I'm also trying another approach to optimize this operation. I want to discuss it with you later. --- If your project is set up for it, you can rep

[GitHub] spark pull request: [SPARK-4006] In long running contexts, we enco...

2014-12-17 Thread tsliwowicz
Github user tsliwowicz commented on the pull request: https://github.com/apache/spark/pull/2914#issuecomment-67451754 hurray :-) On Thu, Dec 18, 2014 at 12:13 AM, andrewor14 wrote: > > Finally. I'm merging this into branch-1.0 thanks for your patience > @tsliwo

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread harishreedharan
Github user harishreedharan commented on a diff in the pull request: https://github.com/apache/spark/pull/3721#discussion_r22027214 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/WriteAheadLogManager.scala --- @@ -125,7 +125,7 @@ private[streaming] class WriteA

[GitHub] spark pull request: [SPARK-4756][SQL] FIX: sessionToActivePool gro...

2014-12-17 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3617#issuecomment-67451624 This is a known issue, and the solution provided in this PR LGTM, thanks!

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67451312 [Test build #24574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24574/consoleFull) for PR 3222 at commit [`1bf1192`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67451318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance

2014-12-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-67450454 Calculating 2 squared distances between the vectors of 2 dims: * `DenseVector` vs. `SparseVector` * breezeSquaredDistance: ~25 secs * This PR: ~

[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...

2014-12-17 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/3319#discussion_r22026283 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala --- @@ -256,72 +524,297 @@ object Matrices { * Generate a `DenseMatrix` con

[GitHub] spark pull request: SPARK-2641: Passing num executors to spark arg...

2014-12-17 Thread kjsingh
Github user kjsingh commented on a diff in the pull request: https://github.com/apache/spark/pull/1657#discussion_r22026183 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala --- @@ -105,6 +105,8 @@ private[spark] class SparkSubmitArguments(args: Seq[

[GitHub] spark pull request: [SPARK-4140] Document dynamic allocation

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3731#issuecomment-67448076 [Test build #24575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24575/consoleFull) for PR 3731 at commit [`b9843f2`](https://githu

[GitHub] spark pull request: [SPARK-4140] Document dynamic allocation

2014-12-17 Thread andrewor14
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/3731 [SPARK-4140] Document dynamic allocation Once the external shuffle service is also documented, the dynamic allocation section will link to it. Let me know if the whole dynamic allocation should

[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance

2014-12-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-67447322 Yes. The additional tests are used to test these bugs we found in this PR. For example, `fastSquaredDistance(v2, norm2, v4, norm4, precision)` is used to test the case whe

[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance

2014-12-17 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/3643#discussion_r22025570 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala --- @@ -264,6 +263,84 @@ object MLUtils { } Vectors.fromBreeze(vector

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67445872 [Test build #24574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24574/consoleFull) for PR 3222 at commit [`1bf1192`](https://githu

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67445534 [Test build #24573 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24573/consoleFull) for PR 3722 at commit [`79e679f`](https://gith

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67445541 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-17 Thread tgaloppo
Github user tgaloppo commented on the pull request: https://github.com/apache/spark/pull/3022#issuecomment-67445452 Working on these changes; still a few left. Great feedback; really helping to improve my Scala!

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67445293 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67445289 [Test build #24572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24572/consoleFull) for PR 3730 at commit [`d79ed04`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67444271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67444267 **[Test build #24571 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24571/consoleFull)** for PR 3222 at commit [`dc9aa11`](https://git

[GitHub] spark pull request: [SPARK-4871][SQL] Show sql statement in spark ...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3718#issuecomment-67442452 [Test build #24570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24570/consoleFull) for PR 3718 at commit [`4d2038a`](https://gith

[GitHub] spark pull request: [SPARK-4871][SQL] Show sql statement in spark ...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3718#issuecomment-67442456 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4844][MLLIB]SGD should support cus...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3729#issuecomment-67441600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4844][MLLIB]SGD should support cus...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3729#issuecomment-67441598 [Test build #24569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24569/consoleFull) for PR 3729 at commit [`0282fcf`](https://gith

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67441188 [Test build #24573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24573/consoleFull) for PR 3722 at commit [`79e679f`](https://githu

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3661#issuecomment-67441156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3661#issuecomment-67441152 [Test build #24568 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24568/consoleFull) for PR 3661 at commit [`15357d2`](https://gith

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67441145 Also @msiddalingaiah commits should typically have a SPARK jira ticket associated with them so we can properly credit contributors when we do release notes, plus tie a com

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67441155 ok to test

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67440967 [Test build #24572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24572/consoleFull) for PR 3730 at commit [`d79ed04`](https://githu

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67440904 ok to test

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67440944 Thanks. This is a remnant from a serialization bug that GraphX used to have but was fixed a while ago, so there's no reason to set the locality wait anymore.

[GitHub] spark pull request: [HOTFIX][SQL] Fix parquet filter suite

2014-12-17 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3727#issuecomment-67440637 Thanks for helping fix this. I should have pointed out that #3644 and #3367 conflict with each other since the latter recognizes more filter expressions.

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3721#issuecomment-67439083 [Test build #24566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24566/consoleFull) for PR 3721 at commit [`d3d8a51`](https://gith

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3721#issuecomment-67439090 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4861][SQL] Refactory command in spark s...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3712#issuecomment-67438514 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4861][SQL] Refactory command in spark s...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3712#issuecomment-67438510 [Test build #24567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24567/consoleFull) for PR 3712 at commit [`0e03be8`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67438369 @avulanov `AdaDelta` and `AdaGrad` have been placed in a separate file. They can be used in #1290.

[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-67438060 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-67438057 [Test build #24565 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24565/consoleFull) for PR 3433 at commit [`372d665`](https://gith

[GitHub] spark pull request: [SPARK-4693] [SQL] PruningPredicates may be wr...

2014-12-17 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3556#issuecomment-67437985 @marmbrus Thank you for your comments. I will do it right away.

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-67437977 [Test build #24571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24571/consoleFull) for PR 3222 at commit [`dc9aa11`](https://githu

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3730#issuecomment-67437380 Can one of the admins verify this patch?

[GitHub] spark pull request: [WIP][SPARK-4844][MLLIB]SGD should support cus...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3729#issuecomment-67437126 [Test build #24569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24569/consoleFull) for PR 3729 at commit [`0282fcf`](https://githu

[GitHub] spark pull request: [SPARK-4880] remove spark.locality.wait in Ana...

2014-12-17 Thread Earne
GitHub user Earne opened a pull request: https://github.com/apache/spark/pull/3730 [SPARK-4880] remove spark.locality.wait in Analytics spark.locality.wait set to 10 in examples/graphx/Analytics.scala. Should be left to the user. You can merge this pull request into a Git re

[GitHub] spark pull request: [SPARK-4871][SQL] Show sql statement in spark ...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3718#issuecomment-67437137 [Test build #24570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24570/consoleFull) for PR 3718 at commit [`4d2038a`](https://githu

[GitHub] spark pull request: Revert https://github.com/apache/spark/pull/28...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3728#issuecomment-67437111 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread mvj101
Github user mvj101 commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67437100 https://github.com/apache/spark/pull/3728

[GitHub] spark pull request: [WIP][SPARK-4844][MLLIB]SGD should support cus...

2014-12-17 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/3729 [WIP][SPARK-4844][MLLIB]SGD should support custom sampling. JIRA: [SPARK-4844](https://issues.apache.org/jira/browse/SPARK-4844) You can merge this pull request into a Git repository by running: $

[GitHub] spark pull request: Revert https://github.com/apache/spark/pull/28...

2014-12-17 Thread mvj101
GitHub user mvj101 opened a pull request: https://github.com/apache/spark/pull/3728 Revert https://github.com/apache/spark/pull/2872 Manual revert of d12c0711faa3d4333513fcbbbee4868bcb784a26 You can merge this pull request into a Git repository by running: $ git pull https://gi

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3661#issuecomment-67436556 [Test build #24568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24568/consoleFull) for PR 3661 at commit [`15357d2`](https://githu

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3661#discussion_r22021962 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ContextWaiter.scala --- @@ -17,30 +17,63 @@ package org.apache.spark.streaming

[GitHub] spark pull request: [SPARK-4861][SQL] Refactory command in spark s...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3712#issuecomment-67436278 [Test build #24567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24567/consoleFull) for PR 3712 at commit [`0e03be8`](https://githu

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread mvj101
Github user mvj101 commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67435661 Ok, I'll send a PR to revert in a few minutes. Thanks, Mike

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67434905 Ugh, this doesn't revert cleanly due to another patch that I merged. I've got to go, so I'm just going to leave this for now. Someone else can deal with this if it's u

[GitHub] spark pull request: [DOC]: Improve Partition docs

2014-12-17 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67434853 Ok to test

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67434751 Actually, I'm going to revert this for now. Looks like the `boto` update will take a bit more work than I thought.

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67434318 > I or someone else can address that corner case as time allows. Let's just update boto. I'll submit a PR for this shortly.

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3721#issuecomment-67434030 [Test build #24566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24566/consoleFull) for PR 3721 at commit [`d3d8a51`](https://githu

[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...

2014-12-17 Thread tkyaw
Github user tkyaw commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-67433948 Merged with master as requested.

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3721#issuecomment-67433896 Used `Future` to fix the test and override my previous commits.

[GitHub] spark pull request: [SPARK-4790][STREAMING] Fix ReceivedBlockTrack...

2014-12-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3726#discussion_r2202 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/WriteAheadLogManager.scala --- @@ -146,10 +163,15 @@ private[streaming] class WriteAheadLo

[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-67433690 [Test build #24565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24565/consoleFull) for PR 3433 at commit [`372d665`](https://githu

[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-17 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-67433680 Thanks @marmbrus. In Hive, `||` is the logical OR operator, and Hive has a `concat` function instead of `||`, so I don't think we need to change `HiveQL.scala`.

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-12-17 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-67433589 Oh, sorry I thought you could reopen your own PRs. I guess that is not the case. Either way, please open a new one when ready.

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-12-17 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-67433471 Thanks for the comments! Sorry for letting it go out of date. I have no authority to reopen PRs, so I'll start a new session for this.

[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-67433409 [Test build #24564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24564/consoleFull) for PR 3407 at commit [`19115ad`](https://gith

[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-67433411 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4861][SQL] Refactory command in spark s...

2014-12-17 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3712#discussion_r22020964 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -52,9 +49,7 @@ case class InsertIntoHiveTable(

[GitHub] spark pull request: [SPARK-4861][SQL] Refactory command in spark s...

2014-12-17 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3712#discussion_r22020870 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -52,9 +49,7 @@ case class InsertIntoHiveTable( t

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread mvj101
Github user mvj101 commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67432970 Oops, apologies for this breakage. I haven't worked with spot instances. Feel free to revert this pull request and I or someone else can address that corner case as time a

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3661#discussion_r22020714 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ContextWaiter.scala --- @@ -17,30 +17,63 @@ package org.apache.spark.streaming

[GitHub] spark pull request: [SPARK-2663] [SQL] Support the Grouping Set

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1567#issuecomment-67432292 [Test build #24563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24563/consoleFull) for PR 1567 at commit [`fe65fcc`](https://gith

[GitHub] spark pull request: [SPARK-2096][SQL] support dot notation on arra...

2014-12-17 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/2405#issuecomment-67432312 Hi @marmbrus, the key reason I want to introduce `UnResolvedGetField` is this: for something like `a.b[0].c.d`, we first parse it to `GetField(GetField(GetItem(Unre

[GitHub] spark pull request: [SPARK-2663] [SQL] Support the Grouping Set

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1567#issuecomment-67432296 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4790][STREAMING] Fix ReceivedBlockTrack...

2014-12-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3726#discussion_r22020330 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/WriteAheadLogManager.scala --- @@ -146,10 +163,15 @@ private[streaming] class WriteAheadLo

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2014-12-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3661#discussion_r22020241 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ContextWaiter.scala --- @@ -17,30 +17,63 @@ package org.apache.spark.streaming

[GitHub] spark pull request: [SPARK-4873][Streaming] WriteAheadLogBasedBloc...

2014-12-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3721#issuecomment-67430734 My first attempt was making `cleanupOldBatches` return a Future so that the caller can use it to wait. But I looked at other places and found they used `eventually`. So I foll

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67429749 It looks like this PR may have broken the ability to launch spot clusters: ```python Traceback (most recent call last): File "./spark_ec2.py", line 114

[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an Arti...

2014-12-17 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-67429185 You're right that optimization APIs are not great currently. The changes you suggest (removing private[mllib] and changing the Optimizer APIs) should be done at some p

[GitHub] spark pull request: SPARK-4547 [MLLIB] [WIP] OOM when making bins ...

2014-12-17 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3702#issuecomment-67428833 We're in agreement. My earlier statement "The simplistic approach should never be off by more than numPartitions." meant that the total count would never be off by mor

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread dreid93
Github user dreid93 commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67428485 @changetip does not appear to be picking up my mentions and sending the appropriate tip. :/

[GitHub] spark pull request: [SPARK-4822] Use sphinx tags for Python doc an...

2014-12-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3685

[GitHub] spark pull request: [SPARK-4822] Use sphinx tags for Python doc an...

2014-12-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3685#issuecomment-67428413 Merged into master. Thanks!

[GitHub] spark pull request: [SPARK-4822] Use sphinx tags for Python doc an...

2014-12-17 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3685#issuecomment-67428216 Oops, I was wrong. LGTM Thanks!

[GitHub] spark pull request: [SPARK-4790][STREAMING] Fix ReceivedBlockTrack...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3726#issuecomment-67428130 [Test build #24562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24562/consoleFull) for PR 3726 at commit [`af00fd1`](https://gith

[GitHub] spark pull request: [SPARK-4790][STREAMING] Fix ReceivedBlockTrack...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3726#issuecomment-67428135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

2014-12-17 Thread dreid93
Github user dreid93 commented on the pull request: https://github.com/apache/spark/pull/2872#issuecomment-67428078 @jontg a coffee for you sir @ChangeTip

[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-67428050 [Test build #24564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24564/consoleFull) for PR 3407 at commit [`19115ad`](https://githu

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-12-17 Thread lianhuiwang
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-67427901 Yes, I think we can close this PR.

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-12-17 Thread lianhuiwang
Github user lianhuiwang closed the pull request at: https://github.com/apache/spark/pull/1113

[GitHub] spark pull request: [SPARK-4822] Use sphinx tags for Python doc an...

2014-12-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3685#issuecomment-67427419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [SPARK-4822] Use sphinx tags for Python doc an...

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3685#issuecomment-67427413 [Test build #24561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24561/consoleFull) for PR 3685 at commit [`88a0fd9`](https://gith

[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling

2014-12-17 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1269#issuecomment-67427124 I've been looking at the various topic modeling PRs (3 currently) to try to get a sense of how they compare in terms of accuracy and speed. By "scaling," I really mean

[GitHub] spark pull request: [SPARK-2663] [SQL] Support the Grouping Set

2014-12-17 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/1567#issuecomment-67427000 Thank you @marmbrus, I've finished updating and will add "WIP" next time. :) Can you review the code again?

[GitHub] spark pull request: [SPARK-2663] [SQL] Support the Grouping Set

2014-12-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1567#issuecomment-67426864 [Test build #24563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24563/consoleFull) for PR 1567 at commit [`fe65fcc`](https://githu

[GitHub] spark pull request: Added setMinCount to Word2Vec.scala

2014-12-17 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3693#issuecomment-67426600 Oh, and yes, I meant to use 4 spaces inside the function.

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-17 Thread tgaloppo
Github user tgaloppo commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22018162 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Soft
