[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....

2015-01-22 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/4129#discussion_r23368844 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -40,6 +40,14 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request: https://github.com/apache/spark/pull/4157#discussion_r23370062 --- Diff: core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala --- @@ -375,16 +375,22 @@ private[nio] class ConnectionManager(

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/4104#issuecomment-71021795 Thanks TD, done with code rebase. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4104#issuecomment-71022025 [Test build #25967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25967/consoleFull) for PR 4104 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-71004965 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-71004957 [Test build #25964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25964/consoleFull) for PR 4032 at commit

[GitHub] spark pull request: [SPARK-4813][Streaming] Fix the issue that Con...

2015-01-22 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3661#discussion_r23372367 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ContextWaiter.scala --- @@ -17,30 +17,63 @@ package org.apache.spark.streaming

[GitHub] spark pull request: SPARK-5308 [BUILD] MD5 / SHA1 hash format does...

2015-01-22 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/4161 SPARK-5308 [BUILD] MD5 / SHA1 hash format doesn't match standard Maven output Here's one way to make the hashes match what Maven's plugins would create. It takes a little extra footwork since OS X
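
A minimal sketch of the format question raised in this pull request. It assumes the "standard Maven output" is a checksum file holding only the lowercase hexadecimal digest, with no file name or grouping. The helper name `bare_hex_digest` is hypothetical, and this is not the script from the PR itself.

```python
import hashlib


def bare_hex_digest(data: bytes, algorithm: str) -> str:
    """Return only the lowercase hex digest, with no file name or spacing."""
    # hashlib.new accepts algorithm names such as "md5" and "sha1".
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    payload = b"abc"  # stand-in for the bytes of a release artifact
    print(bare_hex_digest(payload, "md5"))
    print(bare_hex_digest(payload, "sha1"))
```

Writing the digest alone keeps the output independent of the platform's command-line hashing tool, which is where formats tend to differ.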

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-71004690 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-71004681 [Test build #25963 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25963/consoleFull) for PR 3820 at commit

[GitHub] spark pull request: SPARK-4506 [DOCS] Addendum: Update more docs t...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4160#issuecomment-71008499 [Test build #25965 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25965/consoleFull) for PR 4160 at commit

[GitHub] spark pull request: SPARK-4506 [DOCS] Addendum: Update more docs t...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4160#issuecomment-71008502 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-5308 [BUILD] MD5 / SHA1 hash format does...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4161#issuecomment-71027549 [Test build #25968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25968/consoleFull) for PR 4161 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23373392 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -107,8 +107,14 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-71020806 [Test build #25966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25966/consoleFull) for PR 4032 at commit

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2015-01-22 Thread musicx
Github user musicx commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-70984348 Hi @witgo, where can I find your email? (To communicate in Chinese.)

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70984720 @tdas I think this PR is almost ready, please follow the example to double check it, thanks!

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70985022 [Test build #25958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25958/consoleFull) for PR 3715 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23360371 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -106,6 +106,12 @@ private[streaming] class

[GitHub] spark pull request: SPARK-5357: Update commons-codec version to 1....

2015-01-22 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4153#issuecomment-70985727 Jenkins, this is ok to test.

[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4158#issuecomment-70986035 [Test build #25959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25959/consoleFull) for PR 4158 at commit

[GitHub] spark pull request: SPARK-5357: Update commons-codec version to 1....

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4153#issuecomment-70986011 [Test build #25960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25960/consoleFull) for PR 4153 at commit

[GitHub] spark pull request: [SPARK-5353] Log failures in REPL class loadin...

2015-01-22 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4130#discussion_r23361106 --- Diff: repl/src/main/scala/org/apache/spark/repl/ExecutorClassLoader.scala --- @@ -91,7 +91,14 @@ class ExecutorClassLoader(conf: SparkConf, classUri:

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread shenh062326
GitHub user shenh062326 opened a pull request: https://github.com/apache/spark/pull/4157 [SPARK-4934][CORE] Print remote address in ConnectionManager Connection key is hard to read : key already cancelled ? sun.nio.ch.SelectionKeyImpl@52b0e278. It’s hard to solve problem by

[GitHub] spark pull request: [SQL] SPARK-5309: Use Dictionary for Binary-S...

2015-01-22 Thread MickDavies
Github user MickDavies commented on the pull request: https://github.com/apache/spark/pull/4139#issuecomment-70984197 I've looked through ParquetQuerySuite and ParquetQuerySuite2 and it's not obvious that there are tests that will exercise this change. I.e. where Parquet uses

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70984126 [Test build #25956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25956/consoleFull) for PR 3715 at commit

[GitHub] spark pull request: [SPARK-5147][Streaming] Delete the received da...

2015-01-22 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/4037#issuecomment-70984564 OK.

[GitHub] spark pull request: [SPARK-5147][Streaming] Delete the received da...

2015-01-22 Thread jerryshao
Github user jerryshao closed the pull request at: https://github.com/apache/spark/pull/4037

[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2015-01-22 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-70984655 witgo#qq.com

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4157#issuecomment-70984613 [Test build #25957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25957/consoleFull) for PR 4157 at commit

[GitHub] spark pull request: SPARK-5357: Update commons-codec version to 1....

2015-01-22 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4153#issuecomment-70987116 I think it is pretty safe to update. In general, however, the right thing is to depend on commons codec in your app and set your classpath to take precedence.

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23361754 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -106,6 +106,12 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...

2015-01-22 Thread OopsOutOfMemory
Github user OopsOutOfMemory commented on a diff in the pull request: https://github.com/apache/spark/pull/4127#discussion_r23359955 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala --- @@ -178,3 +180,34 @@ case class DescribeCommand(

[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-70983891 [Test build #25950 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25950/consoleFull) for PR 3999 at commit

[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-70983903 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5297][Streaming][backport] Backport SPA...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4154#issuecomment-70984233 [Test build #25954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25954/consoleFull) for PR 4154 at commit

[GitHub] spark pull request: [SPARK-5297][Streaming][backport] Backport SPA...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4154#issuecomment-70984236 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-01-22 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4015#issuecomment-70985149 cc @marmbrus @liancheng

[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/4158 [SPARK-5364] [SQL] HiveQL transform doesn't support the non output clause This is a quick fix for query (in HiveContext) like: ``` SELECT transform(key + 1, value) USING '/bin/cat'

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23360875 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -106,6 +106,12 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-5307] SerializationDebugger - take 2

2015-01-22 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4098#issuecomment-70988349 This is really cool.

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70988894 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-7099 [Test build #25951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25951/consoleFull) for PR 3715 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23361948 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -106,6 +106,12 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4157#discussion_r23365112 --- Diff: core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala --- @@ -375,16 +375,22 @@ private[nio] class ConnectionManager(

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4104#discussion_r23365189 --- Diff: project/MimaExcludes.scala --- @@ -82,6 +82,10 @@ object MimaExcludes { // SPARK-5166 Spark SQL API stabilization

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4104#discussion_r23365219 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala --- @@ -211,7 +211,9 @@ trait JavaDStreamLike[T, This :

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-70997210 [Test build #25964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25964/consoleFull) for PR 4032 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-70997541 Hey @tdas, I've updated the code and rebased the branch according to your comments, also with several rounds of testing in my local set, the previous exception I

[GitHub] spark pull request: [SPARK-5365][MLlib] Refactor KMeans to reduce ...

2015-01-22 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4159#issuecomment-70997638 So this returns `(p, (r1, r2, r3, ...))` instead of `(r1, p), (r2, p), (r3, p), ...` Makes sense to me, especially if you have reason to believe this is a bottleneck
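
The change of shape described in this comment can be sketched as follows. This is an illustration only, not the MLlib code: the input dictionary `chosen` (point to the runs that selected it) and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def per_run_pairs(chosen: Dict[Point, List[int]]) -> List[Tuple[int, Point]]:
    """Old shape: one (run, point) pair for every run that picked the point."""
    return [(run, point) for point, runs in chosen.items() for run in runs]


def grouped_by_point(
    chosen: Dict[Point, List[int]]
) -> List[Tuple[Point, Tuple[int, ...]]]:
    """New shape: each point appears once, next to all runs that picked it."""
    return [(point, tuple(runs)) for point, runs in chosen.items() if runs]


def expand(grouped):
    """Recover the per-run pairs from the grouped records."""
    return [(run, point) for point, runs in grouped for run in runs]


if __name__ == "__main__":
    chosen = {(1.0, 2.0): [0, 1, 2], (3.0, 4.0): [1]}
    pairs = per_run_pairs(chosen)
    grouped = grouped_by_point(chosen)
    print(len(pairs), "point copies in the old shape")
    print(len(grouped), "point copies in the new shape")
    # Both shapes carry the same information.
    assert expand(grouped) == pairs
```

In the old shape a point picked by k runs is copied k times. In the grouped shape it is carried once, which matters most when points are high dimensional.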

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4104#discussion_r23365454 --- Diff: project/MimaExcludes.scala --- @@ -82,6 +82,10 @@ object MimaExcludes { // SPARK-5166 Spark SQL API stabilization

[GitHub] spark pull request: [SPARK-5315][Streaming] Fix reduceByWindow Jav...

2015-01-22 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4104#issuecomment-70997871 Can you update this patch to deal with the merge conflicts?

[GitHub] spark pull request: [SPARK-5365][MLlib] Refactor KMeans to reduce ...

2015-01-22 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4159#issuecomment-70998144 Especially when there are many runs to use and p is also high dimensional and selected in more than one run. Then collecting redundant p would be too useless and

[GitHub] spark pull request: [SPARK-3726] [MLlib] Allow sampling_rate not e...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4073#issuecomment-70998096 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-3726] [MLlib] Allow sampling_rate not e...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4073#issuecomment-70998086 [Test build #25961 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25961/consoleFull) for PR 4073 at commit

[GitHub] spark pull request: [SPARK-5365][MLlib] Refactor KMeans to reduce ...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4159#issuecomment-70998595 [Test build #25962 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25962/consoleFull) for PR 4159 at commit

[GitHub] spark pull request: [SPARK-5365][MLlib] Refactor KMeans to reduce ...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4159#issuecomment-70998601 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5347][CORE] Change FileSplit to InputSp...

2015-01-22 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4150#issuecomment-70999157 My only question was whether `getLength()` is indeed defined in the `InputSplit` interface in older Hadoop versions, but it looks like it is. This change compiles with

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23366311 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala --- @@ -206,9 +208,13 @@ class JobGenerator(jobScheduler:

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23366414 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceivedBlockTrackerSuite.scala --- @@ -82,15 +82,13 @@ class ReceivedBlockTrackerSuite

[GitHub] spark pull request: SPARK-4506 [DOCS] Addendum: Update more docs t...

2015-01-22 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/4160 SPARK-4506 [DOCS] Addendum: Update more docs to reflect that standalone works in cluster mode This is a trivial addendum to SPARK-4506, which was already resolved. Noted by Asim Jalis in

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23366552 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -107,8 +107,14 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-4654][CORE] Clean up DAGScheduler getMi...

2015-01-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/4134#discussion_r23366593 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -349,34 +349,7 @@ class DAGScheduler( } private def

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-71000469 Looks almost good. Some minor comments.

[GitHub] spark pull request: [SPARK-4939] move to next locality when no pen...

2015-01-22 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3779#issuecomment-71000700 @kayousterhout @davies @pwendell can this be merged for 1.2.1?

[GitHub] spark pull request: [SPARK-5147][Streaming] Delete the received da...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4149#issuecomment-70989118 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5147][Streaming] Delete the received da...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4149#issuecomment-70989110 [Test build #25952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25952/consoleFull) for PR 4149 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-70989477 I understand this patch now. Please update it based on my comments, and test it in your harness to make sure that it addresses the exception problem.

[GitHub] spark pull request: [SPARK-3726] [MLlib] Allow sampling_rate not e...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4073#issuecomment-70989958 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: Refactor KMeans to reduce redundant data

2015-01-22 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4159 Refactor KMeans to reduce redundant data If a point is selected as a new center in many runs, a lot of redundant data is collected. This PR refactors it. You can merge this pull request into a Git

[GitHub] spark pull request: Refactor KMeans to reduce redundant data

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4159#issuecomment-70990653 [Test build #25962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25962/consoleFull) for PR 4159 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23362757 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -106,6 +106,12 @@ private[streaming] class

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70992015 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70992003 [Test build #25958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25958/consoleFull) for PR 3715 at commit

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-22 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-70992950 I have fixed the bug. This is quite embarrassing: I forgot to set those factors (NANOS_PER_SECOND, SECONDS_PER_MINUTE, MINUTES_PER_HOUR) to Long when dividing, so

[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4158#issuecomment-70992965 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-70992996 [Test build #25963 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25963/consoleFull) for PR 3820 at commit

[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4158#issuecomment-70992962 [Test build #25959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25959/consoleFull) for PR 4158 at commit

[GitHub] spark pull request: SPARK-5357: Update commons-codec version to 1....

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4153#issuecomment-70993462 [Test build #25960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25960/consoleFull) for PR 4153 at commit

[GitHub] spark pull request: SPARK-5357: Update commons-codec version to 1....

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4153#issuecomment-70993475 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4158#issuecomment-70989376 Hi @chenghao-intel, I already did this, along with support for a custom field delimiter and SerDe, in PR #4014.

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/4032#discussion_r23361989 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceivedBlockTrackerSuite.scala --- @@ -82,11 +82,6 @@ class ReceivedBlockTrackerSuite

[GitHub] spark pull request: [SPARK-3726] [MLlib] Allow sampling_rate not e...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4073#issuecomment-70989944 [Test build #25955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25955/consoleFull) for PR 4073 at commit

[GitHub] spark pull request: [SPARK-3726] [MLlib] Allow sampling_rate not e...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4073#issuecomment-70990119 [Test build #25961 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25961/consoleFull) for PR 4073 at commit

[GitHub] spark pull request: [SPARK-5233][Streaming] Fix error replaying of...

2015-01-22 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/4032#issuecomment-70990301 OK, will do, thanks a lot for your comments.

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70990806 [Test build #25956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25956/consoleFull) for PR 3715 at commit

[GitHub] spark pull request: [SPARK-5154] [PySpark] [Streaming] Kafka strea...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3715#issuecomment-70990814 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4157#issuecomment-70991309 [Test build #25957 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25957/consoleFull) for PR 4157 at commit

[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4157#issuecomment-70991317 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-4506 [DOCS] Addendum: Update more docs t...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4160#issuecomment-71000852 [Test build #25965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25965/consoleFull) for PR 4160 at commit

[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-01-22 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r23367855 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -106,7 +106,22 @@ private[spark] abstract class Task[T](val stageId: Int, var

[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-01-22 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r23367928 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -106,7 +106,22 @@ private[spark] abstract class Task[T](val stageId: Int, var

[GitHub] spark pull request: [SPARK-5063] More helpful error messages for s...

2015-01-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3884#discussion_r23397600 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -955,6 +977,11 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-5063] More helpful error messages for s...

2015-01-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3884#discussion_r23397684 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -76,10 +76,25 @@ import org.apache.spark.util.random.{BernoulliSampler, PoissonSampler,

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399719 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399756 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399765 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -280,6 +280,48 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399804 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +86,68 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode=k-means||

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399766 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -280,6 +280,48 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399715 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under
