[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4675 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-20 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-75217984 Good:) Merged into master and branch-1.3. Thanks!

[GitHub] spark pull request: [SPAR-5814][MLLIB][GRAPHX] Remove JBLAS from r...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4699#issuecomment-75218599 [Test build #27774 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27774/consoleFull) for PR 4699 at commit

[GitHub] spark pull request: Fixed overflow on large range with high number...

2015-02-20 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4646#issuecomment-75218633 This appears to be superseded by https://github.com/apache/spark/pull/4701 Do you mind closing this PR?

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4677#discussion_r25062779 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala --- @@ -76,8 +77,42 @@ class GradientBoostedTrees(private val

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4677#discussion_r25062776 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala --- @@ -76,8 +77,42 @@ class GradientBoostedTrees(private val

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75223810 @mengxr Fixed !

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75224879 [Test build #27776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27776/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/4677#discussion_r25062848 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala --- @@ -76,8 +77,42 @@ class GradientBoostedTrees(private val

[GitHub] spark pull request: [SPARK-5522] Accelerate the Histroty Server st...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4525#issuecomment-75223664 [Test build #27773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27773/consoleFull) for PR 4525 at commit

[GitHub] spark pull request: [SPARK-5522] Accelerate the Histroty Server st...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4525#issuecomment-75223672 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-4588 [MLLIB] [WIP] Add API for feature a...

2015-02-20 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4460#issuecomment-75223695 @srowen If we mark a string column categorical, it may be hard to answer how many categories it has without looking at the data. If a column is marked categorical, it

[GitHub] spark pull request: SPARK-4588 [MLLIB] [WIP] Add API for feature a...

2015-02-20 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4460#issuecomment-75225746 Don't you always have to look at the data to determine how many unique values a column has, regardless of type? String and int are encodings, but attribute types like

[GitHub] spark pull request: [SPARK-5016] Distribute Gaussian Initializatio...

2015-02-20 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/4654#issuecomment-75225799 Just to clarify, by cluster mode do you mean running `./bin/spark-shell --master spark://manoj-X550LD:7077` where the url is generated by doing

[GitHub] spark pull request: [SPAR-5814][MLLIB][GRAPHX] Remove JBLAS from r...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4699#issuecomment-75227371 [Test build #27774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27774/consoleFull) for PR 4699 at commit

[GitHub] spark pull request: [SPAR-5814][MLLIB][GRAPHX] Remove JBLAS from r...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4699#issuecomment-75227377 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75224376 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75224251 [Test build #27775 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27775/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75224373 [Test build #27775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27775/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4688#issuecomment-75203105 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25054004 --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala --- @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag]( // bounds are inclusive, hence the +

[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4688#issuecomment-75203094 [Test build #27769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27769/consoleFull) for PR 4688 at commit

[GitHub] spark pull request: [SPARK-4423] Improve foreach() documentation t...

2015-02-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4696#discussion_r25054127 --- Diff: docs/programming-guide.md --- @@ -728,6 +728,61 @@ def doStuff(self, rdd): /div +### Understanding closures +One of the

[GitHub] spark pull request: [SPARK-3454] Expose JSON representation of dat...

2015-02-20 Thread sarutak
Github user sarutak closed the pull request at: https://github.com/apache/spark/pull/2333

[GitHub] spark pull request: [SPARK-3454] Expose JSON representation of dat...

2015-02-20 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2333#issuecomment-75205810 OK. I close this PR.

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25058069 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -484,7 +505,7 @@ class DAGScheduler( Total number of

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4698#issuecomment-75202758 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-5909][SQL] Add a clearCache command to ...

2015-02-20 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/4694#issuecomment-75203984 @marmbrus Should we backport this to branch-1.3?

[GitHub] spark pull request: [SPARK-5909][SQL] Add a clearCache command to ...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4694

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25055713 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -47,26 +47,19 @@ import org.apache.spark.util.CallSite * be updated for

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056356 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -132,9 +88,7 @@ private[spark] class Stage( } def

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25057714 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -229,41 +227,56 @@ class DAGScheduler( /** *

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25057992 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -306,26 +319,31 @@ class DAGScheduler( } }

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4698#issuecomment-75210764 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4698#issuecomment-75210755 [Test build #27772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27772/consoleFull) for PR 4698 at commit

[GitHub] spark pull request: SPARK-5841 [CORE] [HOTFIX 2] Memory leak in Di...

2015-02-20 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4690#issuecomment-75213802 I am also OK with just using the try-catch in both places it's needed, rather than make a utility function. I wouldn't remove `Utils.inShutdown`. I assume there was some

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4698

[GitHub] spark pull request: [SPARK-5909][SQL] Add a clearCache command to ...

2015-02-20 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/4694#issuecomment-75203829 LGTM, merging to master, thanks!

[GitHub] spark pull request: [SPARK-2103][Streaming] Change to ClassTag for...

2015-02-20 Thread salex89
Github user salex89 commented on the pull request: https://github.com/apache/spark/pull/1508#issuecomment-75203789 OK, sorry for necroposting then. Should I open an issue on JIRA with the snippet?

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056104 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,53 +70,16 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4688#issuecomment-75207681 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4688#issuecomment-75207674 [Test build #27770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27770/consoleFull) for PR 4688 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056762 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -229,41 +227,56 @@ class DAGScheduler( /** *

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4698#issuecomment-75202770 lgtm

[GitHub] spark pull request: SPARK-5744 [CORE] Take 2. RDD.isEmpty / take f...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4698#issuecomment-75202844 [Test build #27772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27772/consoleFull) for PR 4698 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25058302 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -912,6 +959,196 @@ class DAGScheduler( } /** +

[GitHub] spark pull request: [SPARK-5522] Accelerate the Histroty Server st...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4525#issuecomment-75215693 [Test build #27773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27773/consoleFull) for PR 4525 at commit

[GitHub] spark pull request: [SPARK-5016] Distribute Gaussian Initializatio...

2015-02-20 Thread tgaloppo
Github user tgaloppo commented on the pull request: https://github.com/apache/spark/pull/4654#issuecomment-75236020 @MechCoder I mean making sure this is run on a cluster and not just on a single machine. My hypothesis is that cost of distributing the tasks to the cluster nodes (and

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25068816 --- Diff: core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala --- @@ -29,22 +29,42 @@ class JdbcRDDSuite extends FunSuite with BeforeAndAfter with

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-75228839 @wangxiaojing I'd like to revive this PR and get it committed. There have been a number of requests for this functionality, and several JIRAs and PRs about it. Would you

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066040 --- Diff: python/pyspark/streaming/context.py --- @@ -242,14 +242,14 @@ def socketTextStream(self, hostname, port, storageLevel=StorageLevel.MEMORY_AND_

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066019 --- Diff: docs/streaming-programming-guide.md --- @@ -659,6 +659,7 @@ methods for creating DStreams from files and Akka actors as input sources. +

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75233412 [Test build #27776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27776/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-75233417 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2312] Logging Unhandled messages

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2055#discussion_r25068858 --- Diff: core/src/main/scala/org/apache/spark/util/ActorLogReceive.scala --- @@ -43,7 +43,13 @@ private[spark] trait ActorLogReceive { private

[GitHub] spark pull request: [SPARK-2312] Logging Unhandled messages

2015-02-20 Thread isaias
Github user isaias commented on a diff in the pull request: https://github.com/apache/spark/pull/2055#discussion_r25069002 --- Diff: core/src/main/scala/org/apache/spark/util/ActorLogReceive.scala --- @@ -43,7 +43,13 @@ private[spark] trait ActorLogReceive { private

[GitHub] spark pull request: Spark-5708: Add Slf4jSink to Spark Metrics

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4644#discussion_r25068991 --- Diff: docs/monitoring.md --- @@ -176,6 +176,7 @@ Each instance can report to zero or more _sinks_. Sinks are contained in the * `JmxSink`: Registers

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066028 --- Diff: examples/src/main/python/streaming/hdfs_wordcount.py --- @@ -39,7 +39,7 @@ sc = SparkContext(appName=PythonStreamingHDFSWordCount)

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25065970 --- Diff: docs/streaming-programming-guide.md --- @@ -641,17 +641,17 @@ methods for creating DStreams from files and Akka actors as input sources.

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066003 --- Diff: docs/streaming-programming-guide.md --- @@ -641,17 +641,17 @@ methods for creating DStreams from files and Akka actors as input sources.

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066177 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -34,8 +34,10 @@ import

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066204 --- Diff: streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java --- @@ -1739,7 +1739,11 @@ public Integer call(String s) throws Exception {

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r25066151 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala --- @@ -204,9 +204,10 @@ class JavaStreamingContext(val

[GitHub] spark pull request: Spark-5708: Add Slf4jSink to Spark Metrics

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4644#discussion_r25069044 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/Slf4jSink.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: Spark-5708: Add Slf4jSink to Spark Metrics

2015-02-20 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4644#issuecomment-75237302 This looks pretty fine to me.

[GitHub] spark pull request: [SPARK-5775] BugFix: GenericRow cannot be cast...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4697#issuecomment-75242649 [Test build #2 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/2/consoleFull) for PR 4697 at commit

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread hotou
Github user hotou commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25071707 --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala --- @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag]( // bounds are inclusive, hence the +

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread hotou
Github user hotou commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25071164 --- Diff: core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala --- @@ -29,22 +29,42 @@ class JdbcRDDSuite extends FunSuite with BeforeAndAfter with

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread hotou
Github user hotou commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25072880 --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala --- @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag]( // bounds are inclusive, hence the +

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25073470 --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala --- @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag]( // bounds are inclusive, hence the

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25073572 --- Diff: core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala --- @@ -29,22 +29,42 @@ class JdbcRDDSuite extends FunSuite with BeforeAndAfter with

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread hotou
Github user hotou commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25073802 --- Diff: core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala --- @@ -29,22 +29,42 @@ class JdbcRDDSuite extends FunSuite with BeforeAndAfter with

[GitHub] spark pull request: [SPARK-5860][CORE] JdbcRDD: overflow on large ...

2015-02-20 Thread hotou
Github user hotou commented on a diff in the pull request: https://github.com/apache/spark/pull/4701#discussion_r25074390 --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala --- @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag]( // bounds are inclusive, hence the +

[GitHub] spark pull request: [SPARK-5158] [core] [security] Spark standalon...

2015-02-20 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/4106#issuecomment-75323707 Ah, I mixed up the history server with the event log directory. Let me try getting the history server up and see.

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4214#issuecomment-75325198 [Test build #27791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27791/consoleFull) for PR 4214 at commit

[GitHub] spark pull request: [SPARK-5158] [core] [security] Spark standalon...

2015-02-20 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/4106#issuecomment-75325126 I get the same exception when I try to start the history server without running kinit first. My settings are: spark.history.kerberos.enabled true

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4214#discussion_r25109304 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -230,6 +250,45 @@ private[history] class

[GitHub] spark pull request: SPARK-5841 [CORE] [HOTFIX 2] Memory leak in Di...

2015-02-20 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/4690#issuecomment-75337749 @srowen @andrewor14 I can potentially issue a different PR for the performance issue and back it out from here (it does seem somewhat unrelated). In general, I

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/731#discussion_r25113210 --- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala --- @@ -20,13 +20,14 @@ package org.apache.spark.deploy

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-20 Thread avulanov
GitHub user avulanov opened a pull request: https://github.com/apache/spark/pull/4709 [MLLIB] SPARK-5912 Programming guide for feature selection Added description of ChiSqSelector and few words about feature selection in general. I could add a code example, however it would not

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4027#discussion_r25100490 --- Diff: docs/running-on-mesos.md --- @@ -226,6 +226,20 @@ See the [configuration page](configuration.html) for information on Spark config The

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4027#discussion_r25100554 --- Diff: core/src/test/scala/org/apache/spark/scheduler/mesos/CoarseMesosSchedulerBackendSuite.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4027#discussion_r25100423 --- Diff: docs/running-on-mesos.md --- @@ -226,6 +226,20 @@ See the [configuration page](configuration.html) for information on Spark config The

[GitHub] spark pull request: [SPARK-4423] Improve foreach() documentation t...

2015-02-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4696#discussion_r25101597 --- Diff: docs/programming-guide.md --- @@ -728,6 +728,63 @@ def doStuff(self, rdd): /div +### Understanding closures +One of the

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-75318322 [Test build #27786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27786/consoleFull) for PR 3916 at commit

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-75318334 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4423] Improve foreach() documentation t...

2015-02-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4696#discussion_r25102665 --- Diff: docs/programming-guide.md --- @@ -728,6 +728,63 @@ def doStuff(self, rdd): /div +### Understanding closures +One of the

[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/4311#issuecomment-75321260 @twinkle-sachdeva Would you mind creating an equivalent PR on the master branch? It will speed up the review/merge process for us. Thanks. --- If your

[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4311#issuecomment-75321476 [Test build #27790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27790/consoleFull) for PR 4311 at commit

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4214#discussion_r25109562 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -230,6 +250,45 @@ private[history] class

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4027#discussion_r25112596 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala --- @@ -63,20 +63,25 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4423] Improve foreach() documentation t...

2015-02-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4696#discussion_r25101857 --- Diff: docs/programming-guide.md --- @@ -728,6 +728,63 @@ def doStuff(self, rdd): /div +### Understanding closures +One of the

[GitHub] spark pull request: [SPARK-4423] Improve foreach() documentation t...

2015-02-20 Thread ilganeli
Github user ilganeli commented on a diff in the pull request: https://github.com/apache/spark/pull/4696#discussion_r25107441 --- Diff: docs/programming-guide.md --- @@ -728,6 +728,63 @@ def doStuff(self, rdd): /div +### Understanding closures +One of the

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4214#discussion_r25109254 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -73,27 +103,15 @@ private[history] class

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/731#discussion_r25113240 --- Diff: docs/configuration.md --- @@ -1207,6 +1207,25 @@ Apart from these, the following properties are also available, and may be useful tr

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/731#discussion_r25113277 --- Diff: docs/configuration.md --- @@ -1207,6 +1207,25 @@ Apart from these, the following properties are also available, and may be useful tr

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4027#discussion_r25100639 --- Diff: core/src/test/scala/org/apache/spark/scheduler/mesos/CoarseMesosSchedulerBackendSuite.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/4027#issuecomment-75314717 Hi @tnachen, this looks like a reasonably straightforward change. My inline comments have mostly to do with naming and docs. One high-level question: does it make

[GitHub] spark pull request: SPARK-5841 [CORE] [HOTFIX 2] Memory leak in Di...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4690#discussion_r25103499 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -174,11 +174,8 @@ class NewHadoopRDD[K, V]( }

[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-20 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/4311#issuecomment-75324157 @twinkle-sachdeva I just noticed that there are a lot of style guide violations in this patch. Please look at how the rest of the code is formatted for reference.
