[GitHub] spark pull request: [SPARK-4808] Configurable spillable memory thr...

2015-02-18 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/4420#issuecomment-74827293 If every single object is large though, then in that case after we've spilled the 32nd object, there would still be an OOM before we check for spilling again, right? I
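The concern raised above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Spark's spilling code: the 32-element check interval comes from the comment, while the function name, the byte sizes, and the threshold are hypothetical values chosen for illustration.

```python
def peak_in_memory(object_sizes, check_every=32, threshold=1000):
    """Simulate a buffer that only considers spilling every `check_every` inserts.

    Returns the largest number of bytes held in memory at any point.
    All names and numbers here are illustrative assumptions.
    """
    in_memory = 0
    peak = 0
    for count, size in enumerate(object_sizes, start=1):
        in_memory += size
        peak = max(peak, in_memory)
        # The spill decision is only evaluated on every `check_every`-th insert.
        if count % check_every == 0 and in_memory >= threshold:
            in_memory = 0  # pretend everything was spilled to disk
    return peak


# With large objects, memory overshoots the threshold between checks.
print(peak_in_memory([100] * 64, check_every=32, threshold=1000))
# Checking on every insert keeps the peak at the threshold in this toy model.
print(peak_in_memory([100] * 64, check_every=1, threshold=1000))
```

In this toy model the periodic check lets the buffer grow to several times the threshold before any spill is considered, which is the overshoot the comment describes.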

[GitHub] spark pull request: [SPARK-4808] Configurable spillable memory thr...

2015-02-18 Thread mingyukim
Github user mingyukim commented on the pull request: https://github.com/apache/spark/pull/4420#issuecomment-74828351 @pwendell, I mentioned this above, but at a high level, wouldn't it be better to control the frequency of spills by how much memory you acquire from the shuffle memory

[GitHub] spark pull request: [SPARK-5878] fix DataFrame.repartition() in Py...

2015-02-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4667#issuecomment-74832254 Thanks. Merging. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: Avoid deprecation warnings in JDBCSuite.

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4668

[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...

2015-02-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3781#discussion_r24890677 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -148,19 +152,16 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3781#issuecomment-74846885 [Test build #27678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27678/consoleFull) for PR 3781 at commit

[GitHub] spark pull request: [SPARK-5840][SQL] HiveContext cannot be serial...

2015-02-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4628#issuecomment-74833670 @marmbrus I updated it with test cases.

[GitHub] spark pull request: [Minor] Minor doc fix in GBT classification ex...

2015-02-18 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/4672 [Minor] Minor doc fix in GBT classification example numClassesForClassification has been renamed to numClasses. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74834330 ping @jkbradley ? I was not sure if I had to open a JIRA for this, as it is minor.

[GitHub] spark pull request: [SPARK-5825] [Spark Submit] Remove the double ...

2015-02-18 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4611#issuecomment-74842751 As I say on OS X you get the whole binary path, not just `java`: ``` ps -p ... -o comm= ...

[GitHub] spark pull request: [SPARK-5878] fix DataFrame.repartition() in Py...

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4667

[GitHub] spark pull request: [SPARK-5840][SQL] HiveContext cannot be serial...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4628#issuecomment-74834156 [Test build #27676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27676/consoleFull) for PR 4628 at commit

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74843700 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5840][SQL] HiveContext cannot be serial...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4628#issuecomment-74843574 [Test build #27676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27676/consoleFull) for PR 4628 at commit

[GitHub] spark pull request: [SPARK-5840][SQL] HiveContext cannot be serial...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4628#issuecomment-74843582 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74843689 [Test build #27677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27677/consoleFull) for PR 4672 at commit

[GitHub] spark pull request: [SPARK 5280] RDF Loader added + documentation

2015-02-18 Thread lukovnikov
Github user lukovnikov commented on the pull request: https://github.com/apache/spark/pull/4650#issuecomment-74851018 style errors fixed

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74834704 [Test build #27677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27677/consoleFull) for PR 4672 at commit

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74834754 I will merge this back to 1.2. It really should just be an addendum to https://issues.apache.org/jira/browse/SPARK-4610

[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...

2015-02-18 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/3781#discussion_r24893296 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -148,19 +152,16 @@ private[spark] class

[GitHub] spark pull request: SPARK-5669 [BUILD] [HOTFIX] Spark assembly inc...

2015-02-18 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/4673 SPARK-5669 [BUILD] [HOTFIX] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS Correct exclusion path for JBLAS native libs. (More explanation coming soon on the

[GitHub] spark pull request: Avoid deprecation warnings in JDBCSuite.

2015-02-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4668#issuecomment-74832153 This is great. Thanks!

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4672

[GitHub] spark pull request: [SPARK 5280] RDF Loader added + documentation

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4650#issuecomment-74851295 [Test build #27680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27680/consoleFull) for PR 4650 at commit

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-18 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74908505 Sorry, but this patch is not correct. As @growse mentions, when `SPARK_LOCAL_DIRS` is not set, this code will try to change the permissions of `/tmp` on Unix machines. It

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-18 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74910370 Ah, wait, there's a second problem (which would result in the cascading directories, I think). `getLocalDir` should cache the local directory it returns, to avoid having

[GitHub] spark pull request: [SPARK-5507] Added documentation for BlockMatr...

2015-02-18 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4664#issuecomment-74915682 LGTM. Merged into master and branch-1.3. Thanks!

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-18 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74916479 Hi, me again, sorry for the spam. Regarding my last comment, it's probably better if `getOrCreateLocalRootDirs()` caches its return value instead of `getLocalDir()`,
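The caching idea suggested above might look roughly like the following. This is a loose Python analogue for illustration only, not the Scala code under review: the function names mirror `getOrCreateLocalRootDirs()` and `getLocalDir()` from the comment, while the module-level cache variable and the use of a temporary directory are assumptions.

```python
import tempfile

# Hypothetical module-level cache; None means "not created yet".
_local_root_dirs = None


def get_or_create_local_root_dirs():
    """Create the root directories once and return the same list afterwards."""
    global _local_root_dirs
    if _local_root_dirs is None:
        # Assumption for the sketch: a single temporary directory stands in
        # for the configured local directories.
        _local_root_dirs = [tempfile.mkdtemp(prefix="local-dir-sketch-")]
    return _local_root_dirs


def get_local_dir():
    """Return the first cached root directory without creating a new one."""
    return get_or_create_local_root_dirs()[0]
```

Because the cache lives in the function that creates the directories, every caller sees the same paths, and repeated calls do not create nested or duplicate directories.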

[GitHub] spark pull request: [SPARK-5507] Added documentation for BlockMatr...

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4664

[GitHub] spark pull request: [SPARK-4903][SQL]Backport the bug fix for SPAR...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4671#issuecomment-74913992 [Test build #27682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27682/consoleFull) for PR 4671 at commit

[GitHub] spark pull request: [SPARK-5519][MLLIB] add user guide with exampl...

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4661

[GitHub] spark pull request: [SPARK-5821] [SQL] JSON CTAS command should th...

2015-02-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/4610#discussion_r24922476 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JSONRelation.scala --- @@ -66,9 +66,17 @@ private[sql] class DefaultSource mode match {

[GitHub] spark pull request: [SPARK-5519][MLLIB] add user guide with exampl...

2015-02-18 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4661#issuecomment-74915467 Merged into master and branch-1.3.

[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...

2015-02-18 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-74919610 test this please

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread jkbradley
GitHub user jkbradley opened a pull request: https://github.com/apache/spark/pull/4675 [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] Doc cleanups for 1.3 release For SPARK-5867: * The spark.ml programming guide needs to be updated to use the new SQL DataFrame API instead of the

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74931339 [Test build #27684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27684/consoleFull) for PR 4675 at commit

[GitHub] spark pull request: [Minor] [MLlib] Minor doc fix in GBT classific...

2015-02-18 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4672#issuecomment-74932384 (belatedly) Thanks!

[GitHub] spark pull request: [Spark-5889] Remove pid file after stopping se...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4676#issuecomment-74932367 [Test build #27685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27685/consoleFull) for PR 4676 at commit

[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

2015-02-18 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/4583#issuecomment-74931875 @florianverhein - Sorry for the delay. I just tested this out and it seemed to work okay. One thing that I was confused by is that it's not very clear where the files

[GitHub] spark pull request: [Spark 5889] Remove pid file after stopping se...

2015-02-18 Thread zhzhan
GitHub user zhzhan opened a pull request: https://github.com/apache/spark/pull/4676 [Spark 5889] Remove pid file after stopping service. Currently the pid file is not deleted, which may cause problems after the service is stopped. The fix removes the pid file after service

[GitHub] spark pull request: [SPARK-4903][SQL]Backport the bug fix for SPAR...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4671#issuecomment-74932737 [Test build #27682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27682/consoleFull) for PR 4671 at commit

[GitHub] spark pull request: [SPARK-4903][SQL]Backport the bug fix for SPAR...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4671#issuecomment-74932750 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-74934591 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-74934575 [Test build #27683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27683/consoleFull) for PR 3850 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74931104 Note: The altered examples in the spark.ml guide were copied from executable examples in the examples/ directory.

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74932285 [Test build #27686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27686/consoleFull) for PR 4675 at commit

[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-74920638 [Test build #27683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27683/consoleFull) for PR 3850 at commit

[GitHub] spark pull request: SPARK-5570: No docs stating that `new SparkCon...

2015-02-18 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/4665#issuecomment-74920473 Hey @ilganeli thanks for doing this. Can you also do this for the other `spark.driver.*` options? Like extra java opts, class paths etc.

[GitHub] spark pull request: SPARK-5548: Fix for AkkaUtilsSuite failure - a...

2015-02-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4653#discussion_r24927231 --- Diff: core/src/test/scala/org/apache/spark/util/AkkaUtilsSuite.scala --- @@ -370,9 +371,13 @@ class AkkaUtilsSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-5559] [Streaming] [Test] Remove oppotun...

2015-02-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4337#discussion_r24929712 --- Diff: external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala --- @@ -113,7 +115,8 @@ class MQTTStreamSuite extends

[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-02-18 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4553#issuecomment-74928572 It looks like this was opened by mistake; do you mind closing this issue?

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread gurvindersingh
Github user gurvindersingh commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74931493 It would be nice to have this patch merged in for the 1.3 release, as we plan to use this feature with Mesos and Spark.

[GitHub] spark pull request: SPARK-5548: Fix for AkkaUtilsSuite failure - a...

2015-02-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/4653#discussion_r24927173 --- Diff: core/src/test/scala/org/apache/spark/util/AkkaUtilsSuite.scala --- @@ -370,9 +371,13 @@ class AkkaUtilsSuite extends FunSuite with

[GitHub] spark pull request: SPARK-5570: No docs stating that `new SparkCon...

2015-02-18 Thread ilganeli
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/4665#issuecomment-74931272 Sure @andrewor14, I presume their behavior is identical?

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74936971 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74936955 [Test build #27687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27687/consoleFull) for PR 3074 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4675#discussion_r24935321 --- Diff: docs/ml-guide.md --- @@ -171,12 +171,12 @@ import org.apache.spark.sql.{Row, SQLContext} val conf = new

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74936967 [Test build #27687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27687/consoleFull) for PR 3074 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4675#discussion_r24936197 --- Diff: python/pyspark/ml/pipeline.py --- @@ -18,7 +18,8 @@ from abc import ABCMeta, abstractmethod from pyspark.ml.param import Param,

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread tnachen
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74940888 @mateiz where do you suggest putting this Dockerfile? I have a Dockerfile that builds Spark from source that depends on the Mesos image here:

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mbofb
Github user mbofb commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74948488 description of RowMatrix.computeSVD and mllib-dimensionality-reduction.html: We assume n is smaller than m. Is this just a recommendation or a hard requirement? This

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mbofb
Github user mbofb commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74949165 description of RowMatrix.computePrincipalComponents or RowMatrix in general: I got an exception: java.lang.IllegalArgumentException: Argument with more than 65535

[GitHub] spark pull request: [SPARK-5016] Distribute Gaussian Initializatio...

2015-02-18 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4654#issuecomment-74951424 @MechCoder Could you share some performance comparison results?

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-74953365 [Test build #27689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27689/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-4286] Integrate external shuffle servic...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3861#issuecomment-74950927 We spoke a bit offline about this, but my feeling was that the best thing here might be to add a way to launch the shuffle service as a standalone application

[GitHub] spark pull request: [SPARK-4286] Integrate external shuffle servic...

2015-02-18 Thread tnachen
Github user tnachen closed the pull request at: https://github.com/apache/spark/pull/3861

[GitHub] spark pull request: [SPARK-4286] Integrate external shuffle servic...

2015-02-18 Thread tnachen
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/3861#issuecomment-74951847 Agree and it's currently being worked on. We can close this PR too.

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74951728 The docker folder is for test images, but it could be a good place for this one. I'll let @pwendell comment on it. Does Apache Mesos publish a base Docker image?

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4221#issuecomment-74952993 Yeah, our auto-close doesn't work on PRs into release branches like this.

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/4677 [SPARK-5436] [MLlib] Validate GradientBoostedTrees during train One can stop early if the decrease in error rate is less than a certain tolerance, or if the error increases if the training data is
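The early-stopping rule described above can be sketched as follows. This is a minimal illustration, not the implementation in the pull request: the function name, the default tolerance, and the idea of passing a precomputed list of per-iteration validation errors are all assumptions.

```python
def iterations_to_keep(validation_errors, tol=1e-3):
    """Return how many boosting iterations to keep under a simple stopping rule.

    Stops at the first iteration whose validation error fails to improve on
    the previous one by at least `tol`. An increase in error gives a negative
    improvement, so it also triggers the stop.
    """
    for i in range(1, len(validation_errors)):
        improvement = validation_errors[i - 1] - validation_errors[i]
        if improvement < tol:
            return i  # keep iterations 0 .. i-1
    return len(validation_errors)


# Stops when the improvement becomes negligible (0.35 -> 0.3499).
print(iterations_to_keep([0.5, 0.4, 0.35, 0.3499, 0.2]))
# Stops when the validation error goes back up (0.4 -> 0.45).
print(iterations_to_keep([0.5, 0.4, 0.45]))
```

A single comparison against the tolerance covers both cases named in the description, since a rising error is simply an improvement below zero.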

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-74953724 @jkbradley I just wanted to know if this is in the right direction.

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-74954211 [Test build #27690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27690/consoleFull) for PR 4677 at commit

[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

2015-02-18 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/4583#issuecomment-74954531 Hmm okay - My other concern was also that the directory itself wasn't maintained. i.e. it might be better to put the deploy-root-dir into `/` as a directory

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread tnachen
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74956760 Mesosphere does publish a Mesos image on each release (mesosphere/mesos), with each version tagged. We don't tag the latest release with the :latest tag, I could

[GitHub] spark pull request: [SPARK-4903][SQL]Backport the bug fix for SPAR...

2015-02-18 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4671#issuecomment-74957281 Thanks, merged.

[GitHub] spark pull request: example of python converter for avrò output f...

2015-02-18 Thread daria-sukhareva
GitHub user daria-sukhareva opened a pull request: https://github.com/apache/spark/pull/4678 example of python converter for avrò output format I actually wanted to know if I am doing it right rather than suggest pulling it to spark repo You can merge this pull request into a Git

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-74957382 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: example of python converter for avrò output f...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4678#issuecomment-74957691 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-5710][SQL] Combines two adjacent Cast e...

2015-02-18 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4497#issuecomment-74957611 Agreed. If there are concrete proposals for eliminating redundant casts then we should discuss on JIRA. However, as is, this could change the answer and thus is an

[GitHub] spark pull request: [SPARK-4903][SQL]Backport the bug fix for SPAR...

2015-02-18 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/4671#issuecomment-74957462 Thanks!!

[GitHub] spark pull request: [SPARK-5436] [MLlib] Validate GradientBoostedT...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4677#issuecomment-74957377 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4675#discussion_r24935349 --- Diff: docs/mllib-guide.md --- @@ -90,6 +90,21 @@ version 1.4 or newer. # Migration Guide +## From 1.2 to 1.3 + +In the

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4675#discussion_r24935325 --- Diff: docs/ml-guide.md --- @@ -300,19 +302,21 @@ ListLabeledPoint localTest = Lists.newArrayList( new LabeledPoint(1.0, Vectors.dense(-1.0, 1.5,

[GitHub] spark pull request: [SPARK-5673] [MLlib] Implement Streaming wrapp...

2015-02-18 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4456#issuecomment-74936361 @catap Can you please add a description for this PR? Thanks!

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread hellertime
Github user hellertime commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74944182 @tnachen That Dockerfile you have is actually all that is needed for an example image; that it's based on the mesosphere image is even better! I had hoped that

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74946826 [Test build #27686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27686/consoleFull) for PR 4675 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74946839 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74939602 [Test build #27688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27688/consoleFull) for PR 3074 at commit

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74939621 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74945101 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

2015-02-18 Thread florianverhein
Github user florianverhein commented on the pull request: https://github.com/apache/spark/pull/4583#issuecomment-74946543 Thanks @shivaram. I'm not sure I follow 100%. With that argument they should have ended up at, e.g., /.vimrc (unless root is a subdirectory of dotfiles). The contents if

[GitHub] spark pull request: [SPARK-4808] Configurable spillable memory thr...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4420#issuecomment-74946838 @mccheah @mingyukim yeah, there isn't an OOM-proof solution at all because these are all heuristics. Even checking every element is not OOM-proof since memory

[GitHub] spark pull request: [Spark-5889] Remove pid file after stopping se...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4676#issuecomment-74946963 [Test build #27685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27685/consoleFull) for PR 4676 at commit

[GitHub] spark pull request: [Spark-5889] Remove pid file after stopping se...

2015-02-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4676#issuecomment-74946981 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread hellertime
Github user hellertime commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74937819 So perhaps putting an example Dockerfile in the `docker` subdirectory is not an appropriate thing to do... any suggestions on a better location for examples such as

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4675#discussion_r24936278 --- Diff: python/pyspark/mllib/__init__.py --- @@ -33,3 +34,20 @@ random.__name__ = 'random' random.RandomRDDs.__module__ = __name__ + '.random'

[GitHub] spark pull request: [SPARK-2691][Mesos] Support for Mesos DockerIn...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-74939617 [Test build #27688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27688/consoleFull) for PR 3074 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74945089 [Test build #27684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27684/consoleFull) for PR 4675 at commit

[GitHub] spark pull request: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] D...

2015-02-18 Thread mbofb
Github user mbofb commented on the pull request: https://github.com/apache/spark/pull/4675#issuecomment-74948403 The description of RowMatrix.computeSVD and mllib-dimensionality-reduction.html should be more precise/explicit regarding the m x n matrix. In the current description I

[GitHub] spark pull request: [SPARK-5840][SQL] HiveContext cannot be serial...

2015-02-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4628
