[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4725#issuecomment-75521453 Ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4725#issuecomment-75519690 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-5724] fix the misconfiguration in AkkaU...

2015-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4512

[GitHub] spark pull request: [SPARK-5943][Streaming] Update the test to use...

2015-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4722

[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...

2015-02-23 Thread GenTang
Github user GenTang commented on a diff in the pull request: https://github.com/apache/spark/pull/3920#discussion_r25155799 --- Diff: examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala --- @@ -18,20 +18,34 @@ package

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4725#discussion_r25157906 --- Diff: examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala --- @@ -36,7 +36,7 @@ object HBaseTest { // Initialize hBase table if

[GitHub] spark pull request: [SPARK-5943][Streaming] Update the test to use...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4722#issuecomment-75526163 LGTM. The new method is in branch-1.3, so can be back-ported, and I think this qualifies as a good tiny fix. I verified these are all the occurrences.

[GitHub] spark pull request: [SPARK-4730][YARN] Warn against deprecated YAR...

2015-02-23 Thread zuxqoj
Github user zuxqoj commented on a diff in the pull request: https://github.com/apache/spark/pull/3590#discussion_r25155375 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -78,11 +79,25 @@ private[spark] class

[GitHub] spark pull request: [SPARK-5174][SPARK-5175] provide more APIs in ...

2015-02-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/3984#issuecomment-75528392 sure, thanks

[GitHub] spark pull request: [SPARK-5802][MLLIB] cache transformed data in ...

2015-02-23 Thread joshdevins
Github user joshdevins commented on the pull request: https://github.com/apache/spark/pull/4593#issuecomment-75540678 I have the same concern as @dbtsai in his comment. Most consumers of this API will already be caching their dataset before the learning phase. Without user care, this

[GitHub] spark pull request: [SPARK-5802][MLLIB] cache transformed data in ...

2015-02-23 Thread petro-rudenko
Github user petro-rudenko commented on the pull request: https://github.com/apache/spark/pull/4593#issuecomment-75550855 @dbtsai, @joshdevins here's an issue I have. I'm using the new ml pipeline with hyperparameter grid search. Because the folds don't depend on the hyperparameters, I've

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw closed the pull request at: https://github.com/apache/spark/pull/4728

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
GitHub user xunyuw opened a pull request: https://github.com/apache/spark/pull/4728 Merge pull request #1 from apache/master SYNC 2015-02-08 20:00 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xunyuw/spark master Alternatively

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread potix2
Github user potix2 closed the pull request at: https://github.com/apache/spark/pull/4725

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread potix2
Github user potix2 commented on a diff in the pull request: https://github.com/apache/spark/pull/4725#discussion_r25161690 --- Diff: examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala --- @@ -36,7 +36,7 @@ object HBaseTest { // Initialize hBase table if

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4725#discussion_r25162029 --- Diff: examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala --- @@ -36,7 +36,7 @@ object HBaseTest { // Initialize hBase table if

[GitHub] spark pull request: Merge pull request #2 from apache/master

2015-02-23 Thread xunyuw
GitHub user xunyuw reopened a pull request: https://github.com/apache/spark/pull/4727 Merge pull request #2 from apache/master SYNC 2015-02-23 20:00 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xunyuw/spark master

[GitHub] spark pull request: Merge pull request #2 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw closed the pull request at: https://github.com/apache/spark/pull/4727

[GitHub] spark pull request: Merge pull request #2 from apache/master

2015-02-23 Thread xunyuw
GitHub user xunyuw opened a pull request: https://github.com/apache/spark/pull/4727 Merge pull request #2 from apache/master SYNC 2015-02-23 20:00 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xunyuw/spark master Alternatively

[GitHub] spark pull request: Merge pull request #2 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw closed the pull request at: https://github.com/apache/spark/pull/4727

[GitHub] spark pull request: [SPARK-3147][MLLib] A/B testing

2015-02-23 Thread feynmanliang
Github user feynmanliang commented on the pull request: https://github.com/apache/spark/pull/4716#issuecomment-75543022 `[error] * abstract method numDim()Int in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary does not have a correspondent in old version`

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw closed the pull request at: https://github.com/apache/spark/pull/4726

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw commented on the pull request: https://github.com/apache/spark/pull/4726#issuecomment-75531993 SYNC 2015-02-23 20:00

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4726#issuecomment-75532001 Mind closing this PR?

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
GitHub user xunyuw opened a pull request: https://github.com/apache/spark/pull/4726 Merge pull request #1 from apache/master SYNC 2015-02-23 20:00 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xunyuw/spark master Alternatively

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
Github user xunyuw closed the pull request at: https://github.com/apache/spark/pull/4726

[GitHub] spark pull request: [SPARK 5280] RDF Loader added + documentation

2015-02-23 Thread lukovnikov
Github user lukovnikov commented on the pull request: https://github.com/apache/spark/pull/4650#issuecomment-75533258 @maropu tests are added and build tests passed. Is it ready for merging now?

[GitHub] spark pull request: [SPARK-2087] [SQL] Multiple thriftserver sessi...

2015-02-23 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-75535316 /cc @liancheng can you review this for me?

[GitHub] spark pull request: [SPARK-3147][MLLib] A/B testing

2015-02-23 Thread feynmanliang
Github user feynmanliang closed the pull request at: https://github.com/apache/spark/pull/4716

[GitHub] spark pull request: [SPARK-3147][MLLib] A/B testing

2015-02-23 Thread feynmanliang
GitHub user feynmanliang reopened a pull request: https://github.com/apache/spark/pull/4716 [SPARK-3147][MLLib] A/B testing Implementation of A/B testing using Streaming API. This contribution is my original work and I license the work to the project under the project's

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4726#issuecomment-75532361 Can one of the admins verify this patch?

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-23 Thread xunyuw
GitHub user xunyuw reopened a pull request: https://github.com/apache/spark/pull/4726 Merge pull request #1 from apache/master SYNC 2015-02-23 20:00 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xunyuw/spark master

[GitHub] spark pull request: [SPARK-5926] [SQL] make DataFrame.explain leve...

2015-02-23 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/4707#issuecomment-75536583 retest please

[GitHub] spark pull request: [WIP][SPARK-4902][CORE] gap-sampling performan...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3744#issuecomment-75561628 @witgo is this still live and have you followed up on Xiangrui's comment?

[GitHub] spark pull request: [MLLIB] SPARK-4362: Added classProbabilities m...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3626#issuecomment-75564514 @alanctgardner have you had a look at @jkbradley's feedback? I'm wondering whether this is still live. It needs a rebase if so.

[GitHub] spark pull request: [SPARK-4845][Core] Adding a parallelismRatio t...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3694#issuecomment-75563929 I am also not clear this is a good thing. As a default, it doesn't change anything. There is probably not a globally correct ratio, even if it's not 1, but this implies

[GitHub] spark pull request: [SPARK-4006] Block Manager - Double Register C...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2854#issuecomment-75564774 Mind closing this PR?

[GitHub] spark pull request: [SPARK-4340] [Core] add java opts argument sub...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3205#issuecomment-75564995 ok to test

[GitHub] spark pull request: [MLLIB] SPARK-4362: Added classProbabilities m...

2015-02-23 Thread alanctgardner
Github user alanctgardner commented on a diff in the pull request: https://github.com/apache/spark/pull/3626#discussion_r25172516 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala --- @@ -65,6 +66,25 @@ class NaiveBayesModel private[mllib] (

[GitHub] spark pull request: [SPARK-4340] [Core] add java opts argument sub...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3205#issuecomment-75565369 [Test build #27852 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27852/consoleFull) for PR 3205 at commit

[GitHub] spark pull request: [SPARK-3147][MLLib] A/B testing

2015-02-23 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4716#issuecomment-75580259 Let's remove `numDim`.

[GitHub] spark pull request: [SPARK-5946][Streaming] Add Python API for dir...

2015-02-23 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/4723#issuecomment-75581979 Hi @tdas , do we need to add a Python version of `createRDD` for direct Kafka stream? Seems this API requires Python wrapper of Java object like `OffsetRange`.

[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...

2015-02-23 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-75582752 @GenTang This PR looks good to me now, thanks @JoshRosen I think it's ready to go.

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-23 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25183946 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java --- @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-23 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25184021 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilder.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-5946][Streaming] Add Python API for dir...

2015-02-23 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/4723#discussion_r25178444 --- Diff: examples/src/main/python/streaming/direct_kafka_wordcount.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-5939][MLLib] make FPGrowth example app ...

2015-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4714

[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...

2015-02-23 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/3920#discussion_r25180080 --- Diff: examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala --- @@ -18,20 +18,34 @@ package

[GitHub] spark pull request: [SPARK-4845][Core] Adding a parallelismRatio t...

2015-02-23 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3694#issuecomment-75581853 You can implement this by expressing parallelism as a function of the parent RDD, right? Yeah, you have to write the expression, but does an alternative multiplier arg do

[GitHub] spark pull request: [SPARK-4845][Core] Adding a parallelismRatio t...

2015-02-23 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3694#issuecomment-75584312 @srowen good point. I think a ratio argument is prettier than an expression, but arguably not enough to warrant clogging up the API.

[GitHub] spark pull request: [SPARK-5950][SQL] Enable inserting array into ...

2015-02-23 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4729 [SPARK-5950][SQL] Enable inserting array into Hive table saved as Parquet using DataSource API Currently `ParquetConversions` in `HiveMetastoreCatalog` does not really work. One reason is that

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-23 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25183693 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -0,0 +1,684 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75593300 [Test build #27854 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27854/consoleFull) for PR 4708 at commit

[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...

2015-02-23 Thread GenTang
Github user GenTang commented on a diff in the pull request: https://github.com/apache/spark/pull/3920#discussion_r25182187 --- Diff: examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala --- @@ -18,20 +18,34 @@ package

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread ilganeli
Github user ilganeli commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25183148 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -830,39 +836,39 @@ class DAGScheduler( try { // For

[GitHub] spark pull request: [SPARK-5939][MLLib] make FPGrowth example app ...

2015-02-23 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4714#issuecomment-75580799 LGTM. Merged into master and branch-1.3. Thanks!

[GitHub] spark pull request: [SPARK-5950][SQL] Enable inserting array into ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4729#issuecomment-75588235 [Test build #27853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27853/consoleFull) for PR 4729 at commit

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-23 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-75593031 @pwendell I see what you mean about compatibility. Let me play with the code a bit, it might not be hard to do something like that as part of this patch.

[GitHub] spark pull request: SPARK-5951

2015-02-23 Thread zuxqoj
GitHub user zuxqoj opened a pull request: https://github.com/apache/spark/pull/4730 SPARK-5951 Remove unreachable driver memory properties in yarn client mode You can merge this pull request into a Git repository by running: $ git pull https://github.com/zuxqoj/spark master

[GitHub] spark pull request: [SPARK-4340] [Core] add java opts argument sub...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3205#issuecomment-75581446 [Test build #27852 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27852/consoleFull) for PR 3205 at commit

[GitHub] spark pull request: [SPARK-4845][Core] Adding a parallelismRatio t...

2015-02-23 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3694#issuecomment-75580971 In general, a fixed number of partitions is very difficult to work with when configuring a shuffle. Suppose I have a job where I know a `flatMap` is going to blow up the

[GitHub] spark pull request: [SPARK-4340] [Core] add java opts argument sub...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3205#issuecomment-75581461 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5951][YARN] Remove unreachable driver m...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4730#issuecomment-75594177 Can one of the admins verify this patch?

[GitHub] spark pull request: [Examples] fix deprecated method use in HBaseT...

2015-02-23 Thread potix2
GitHub user potix2 opened a pull request: https://github.com/apache/spark/pull/4725 [Examples] fix deprecated method use in HBaseTest HTableDescriptor(String name) is deprecated. https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html

[GitHub] spark pull request: [SPARK-5944] [PySpark] fix version in Python A...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4731#issuecomment-75629173 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5944] [PySpark] fix version in Python A...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4731#issuecomment-75629156 [Test build #27861 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27861/consoleFull) for PR 4731 at commit

[GitHub] spark pull request: [SPARK-5912] [docs] [mllib] Small fixes to Chi...

2015-02-23 Thread jkbradley
GitHub user jkbradley opened a pull request: https://github.com/apache/spark/pull/4732 [SPARK-5912] [docs] [mllib] Small fixes to ChiSqSelector docs Fixes: * typo in Scala example * Removed comment usually applied on sparse data since that is debatable * small edits to

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75637201 Looks pretty good to me, but left a few more comments. Also, please take a look at the various logging strings to see whether some of them can be expressed more

[GitHub] spark pull request: [SPARK-5912] [docs] [mllib] Small fixes to Chi...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4732#issuecomment-75637616 [Test build #27862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27862/consoleFull) for PR 4732 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25202931 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -210,40 +210,58 @@ class DAGScheduler( * The jobId value

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75638966 Oops, did not realize that a test was still running (glad it passed)

[GitHub] spark pull request: [SPARK-5946][Streaming] Add Python API for dir...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4723#issuecomment-75594343 [Test build #27855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27855/consoleFull) for PR 4723 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186634 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,52 +71,9 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-5253] [ML] LinearRegression with L1/L2 ...

2015-02-23 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4259#issuecomment-75600260 @dbtsai I'd like to make a pass over this, but I realized that it has conflicts because of the developer api PR committed last week:

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75604216 [Test build #27858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27858/consoleFull) for PR 4708 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75609741 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75610748 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75614718 [Test build #27857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27857/consoleFull) for PR 4709 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186416 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -47,26 +47,20 @@ import org.apache.spark.util.CallSite * be updated for

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75599168 [Test build #27857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27857/consoleFull) for PR 4709 at commit

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-75604198 [Test build #27859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27859/consoleFull) for PR 4048 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75594276 [Test build #27856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27856/consoleFull) for PR 4708 at commit

[GitHub] spark pull request: [SPARK-5950][SQL] Enable inserting array into ...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4729#issuecomment-75597599 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186558 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,52 +71,9 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25188257 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -228,22 +227,41 @@ class DAGScheduler( } /** -

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/4709#discussion_r25188678 --- Diff: docs/mllib-feature-extraction.md --- @@ -375,3 +375,55 @@ data2 = labels.zip(normalizer2.transform(features)) {% endhighlight %} /div

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75603592 I think that last issue is the only one--thanks!

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75609726 [Test build #27854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27854/consoleFull) for PR 4708 at commit

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75610731 [Test build #27856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27856/consoleFull) for PR 4708 at commit

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread avulanov
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75610561 Sorry for this, still sleeping...

[GitHub] spark pull request: [MLLIB] SPARK-5912 Programming guide for featu...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4709#issuecomment-75611280 [Test build #27860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27860/consoleFull) for PR 4709 at commit

[GitHub] spark pull request: [SPARK-5924] Add the ability to specify withMe...

2015-02-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4704#discussion_r25192351 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StandardScaler.scala --- @@ -29,7 +29,18 @@ import org.apache.spark.sql.types.{StructField,

[GitHub] spark pull request: [SPARK-5944] fix version in Python API docs

2015-02-23 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/4731 [SPARK-5944] fix version in Python API docs use RELEASE_VERSION when building the Python API docs You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-5950][SQL] Enable inserting array into ...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4729#issuecomment-75597588 [Test build #27853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27853/consoleFull) for PR 4729 at commit

[GitHub] spark pull request: [SPARK-5927][MLlib] Modify FPGrowth's partitio...

2015-02-23 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4706#issuecomment-75599387 @viirya Your proposal definitely works better in some cases, while the current implementation works better in some others. I think we both agree on this. The question is

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75602215 @JoshRosen I'm happy to take a look at this but won't be able to get to it until Friday. Feel free to merge it sooner than that if you're eager to get it in;

[GitHub] spark pull request: [MLLIB] SPARK-4362: Added classProbabilities m...

2015-02-23 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3626#issuecomment-75608187 @alanctgardner That will be great if you change it to predictProbabilities; thanks. I agree with what @jatinpreet was saying about the correctness, and with @srowen

[GitHub] spark pull request: [SPARK-5924] Add the ability to specify withMe...

2015-02-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4704#discussion_r25192356 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StandardScaler.scala --- @@ -44,12 +55,18 @@ class StandardScaler extends

[GitHub] spark pull request: [SPARK-5946][Streaming] Add Python API for dir...

2015-02-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4723#issuecomment-75610987 [Test build #27855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27855/consoleFull) for PR 4723 at commit

[GitHub] spark pull request: [SPARK-5946][Streaming] Add Python API for dir...

2015-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4723#issuecomment-75610996 Test PASSed. Refer to this link for build results (access rights to CI server needed):
