date:20141222

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67811183 [Test build #24699 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24699/consoleFull) for PR 3758 at commit

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67811187 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4916][SQL][DOCS]Update SQL programming ...

2014-12-22 Thread luogankun

GitHub user luogankun opened a pull request: https://github.com/apache/spark/pull/3759 [SPARK-4916][SQL][DOCS]Update SQL programming guide about cache section `SchemaRDD.cache()` now uses in-memory columnar storage. You can merge this pull request into a Git repository by running:

[GitHub] spark pull request: [SPARK-4916][SQL][DOCS]Update SQL programming ...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3759#issuecomment-67811391 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-4917] Add a function to convert into a ...

2014-12-22 Thread maropu

GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3760 [SPARK-4917] Add a function to convert into a graph with canonical edges in GraphOps Convert bi-directional edges into uni-directional ones instead of 'canonicalOrientation' in
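A minimal sketch of what collapsing bi-directional edges into canonical ones could look like, in plain Python rather than GraphX. The function name `canonical_edges`, the "smaller vertex id first" orientation, and merging duplicate attributes by summing are all assumptions made for illustration; the PR description above is truncated and its actual semantics are not shown here.

```python
def canonical_edges(edges, merge=lambda a, b: a + b):
    """Collapse (src, dst, attr) edges so each vertex pair appears once.

    Assumption: the canonical orientation puts the smaller vertex id first,
    and attributes of duplicate edges are combined with `merge`.
    """
    merged = {}
    for src, dst, attr in edges:
        # Orient every edge the same way so (1, 2) and (2, 1) share a key.
        key = (src, dst) if src < dst else (dst, src)
        merged[key] = merge(merged[key], attr) if key in merged else attr
    return sorted((s, d, a) for (s, d), a in merged.items())


result = canonical_edges([(1, 2, 1), (2, 1, 1), (3, 1, 5)])
```

Here the two edges between vertices 1 and 2 become a single edge whose attribute is the merged value, and the edge from 3 to 1 is re-oriented as 1 to 3.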

[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an Arti...

2014-12-22 Thread loachli

Github user loachli commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-67812105 @avulanov: *Could you write a brief description to the ANN test called Gradient of ANN to let the reader understand more clearly what we are testing?* The test

[GitHub] spark pull request: [SPARK-4917] Add a function to convert into a ...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3760#issuecomment-67812097 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread rxin

Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67814128 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread liancheng

Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67814308 Ah, sorry, forgot that the golden answer file name is generated by the MD5 of the query string. Then let's revert the last space change. I think this minor issue
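To illustrate why a whitespace-only change to a query matters when a file name is derived from the MD5 of the query string, here is a small Python sketch. The helper `golden_file_name` and its `name-digest` layout are hypothetical; the actual naming scheme used by the Spark test harness is not shown in this thread.

```python
import hashlib


def golden_file_name(test_name: str, query: str) -> str:
    """Hypothetical scheme: test name plus the MD5 hex digest of the query text."""
    digest = hashlib.md5(query.encode("utf-8")).hexdigest()
    return f"{test_name}-{digest}"


# Two queries that differ only by one space hash to different file names,
# so an existing golden answer file would no longer be found.
a = golden_file_name("not_op", "SELECT !true")
b = golden_file_name("not_op", "SELECT ! true")
```

Under this assumption, reverting the space change restores the original digest and therefore the original file name.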

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67814511 [Test build #24700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24700/consoleFull) for PR 3758 at commit

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread scwf

GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/3761 [SQL] spark-sql aborted if passed in a wrong sql If we passed in a wrong sql like ```abdcdfsfs```, the spark-sql script aborted. You can merge this pull request into a Git repository by running:

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD

Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/3022#issuecomment-67816287 Sorry for the late reply. predictLabels() and predictMembership() look fine. But what about moving computeSoftAssignments() to the GaussianMixtureModelEM class (in KMeans,

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67816416 [Test build #24701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24701/consoleFull) for PR 3761 at commit

[GitHub] spark pull request: [SPARK-1953][YARN]yarn client mode Application...

2014-12-22 Thread WangTaoTheTonic

Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3607#issuecomment-67816606 @andrewor14 Ok, I got what you mean. I think I misunderstood before. To solve this problem, should we just delete `(--driver-memory,

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread YanTangZhai

Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67816709 @liancheng I will revert the last space change. Thanks for your comment.

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67817199 [Test build #24702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24702/consoleFull) for PR 3555 at commit

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67818470 [Test build #24703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24703/consoleFull) for PR 3761 at commit

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread liancheng

Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/3752#discussion_r22159720 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -55,8 +56,60 @@ private[hive] class

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread liancheng

Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/3752#discussion_r22160548 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -55,8 +56,60 @@ private[hive] class

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67822551 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67822544 [Test build #24701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24701/consoleFull) for PR 3761 at commit

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67822642 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67822635 [Test build #24700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24700/consoleFull) for PR 3758 at commit

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread liancheng

Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/3752#discussion_r22161200 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -55,8 +56,60 @@ private[hive] class

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67823381 [Test build #24702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24702/consoleFull) for PR 3555 at commit

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67823389 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67824715 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SQL] spark-sql aborted if passed in a wrong s...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3761#issuecomment-67824711 [Test build #24703 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24703/consoleFull) for PR 3761 at commit

[GitHub] spark pull request: [SPARK-4907][MLlib] Inconsistent loss and grad...

2014-12-22 Thread bryanyang0528

Github user bryanyang0528 commented on the pull request: https://github.com/apache/spark/pull/3746#issuecomment-67825123 In my opinion, whether the parameter of the cost function is 1/m or 1/2m is not the critical difference. Across the cost function L = alpha * 1/2n ||A

[GitHub] spark pull request: [SPARK-4907][MLlib] Inconsistent loss and grad...

2014-12-22 Thread srowen

Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3746#issuecomment-67826137 @bryanyang0528 I don't think anyone's suggesting that the extra factor of 1/2 is more or less correct or desirable per se. The solution doesn't depend on the absolute
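The point about the constant factor can be written out as a short worked identity. The symbols below (a loss $L$, weights $w$, a positive constant $c$ such as $1/2$, a regularizer $R$ with strength $\lambda$) are generic notation assumed for illustration, not the exact objective discussed in the PR.

```latex
% Scaling an objective by c > 0 leaves the minimizer unchanged
% and only rescales the gradient:
\arg\min_{w}\; c\,L(w) = \arg\min_{w}\; L(w),
\qquad
\nabla_w \bigl(c\,L(w)\bigr) = c\,\nabla_w L(w),
\qquad c > 0.

% With a regularizer, scaling only the loss term is equivalent to
% rescaling the regularization strength:
\arg\min_{w}\; \bigl[c\,L(w) + \lambda R(w)\bigr]
= \arg\min_{w}\; \bigl[L(w) + \tfrac{\lambda}{c}\,R(w)\bigr].
```

So the factor changes the reported loss value and the effective regularization or step scale, but not which $w$ minimizes an unregularized loss; what matters is that the loss and the gradient use the same factor.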

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread liancheng

Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67827365 Thanks for the update, this now LGTM.

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD

Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22163213 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/DenseGmmEM.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD

Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22163250 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: Reuse Text in saveAsTextFile

2014-12-22 Thread zsxwing

GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/3762 Reuse Text in saveAsTextFile Reuse Text in saveAsTextFile to reduce GC. /cc @rxin You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67832465 [Test build #24704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24704/consoleFull) for PR 3762 at commit

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread srowen

Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67832813 I think it's a small but OK optimization. Hadoop won't save the `Text` object itself, so it's safe, here in the 'save' method.
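The object-reuse idea, and the safety condition mentioned above, can be sketched in plain Python. The `Text` class here is a hypothetical stand-in for a mutable holder, not the Hadoop class, and `write_partition` is an illustrative name rather than Spark's implementation.

```python
class Text:
    """Hypothetical mutable holder, standing in for a reusable writable object."""

    def __init__(self):
        self.value = b""

    def set(self, s):
        self.value = s.encode("utf-8")


def write_partition(records, sink):
    # One holder is allocated per partition and overwritten for each record,
    # instead of allocating a new object per record.
    text = Text()
    for record in records:
        text.set(record)
        sink(text)


# Safe: the sink copies the bytes out immediately and does not keep the holder.
copied = []
write_partition(["a", "b"], lambda t: copied.append(t.value))

# Unsafe: a sink that retains the holder sees only the last value written.
retained = []
write_partition(["a", "b"], lambda t: retained.append(t))
```

Reuse is only correct in the first case, which is the condition the comment relies on: the consumer must not hold on to the object between records.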

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread zsxwing

Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67834037 Looks there is some issue in HiveThriftServer2 in the branch-1.2? @liancheng ``` Exception in thread main java.lang.RuntimeException:

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread liancheng

Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67834393 This is probably caused by SPARK-4914, which is a bug in `dev/run-tests` and doesn't affect production code. PR #3756 was opened to fix this.

[GitHub] spark pull request: [SPARK-4914][Build] Cleans lib_managed before ...

2014-12-22 Thread liancheng

Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3756#issuecomment-67834656 @pwendell @JoshRosen Would you please take a look at this? This issue is causing random PR build failures. Thanks!

[GitHub] spark pull request: [SPARK-4907][MLlib] Inconsistent loss and grad...

2014-12-22 Thread bryanyang0528

Github user bryanyang0528 commented on the pull request: https://github.com/apache/spark/pull/3746#issuecomment-67836176 @srowen I agree that we need an absolute value that can be compared with other software. Maybe we could add a parameter to control the extra factor?

[GitHub] spark pull request: #SPARK-2808 update kafka to version 0.8.2

2014-12-22 Thread helena

Github user helena commented on the pull request: https://github.com/apache/spark/pull/3631#issuecomment-67836412 @JoshRosen Ticket name updated :) Sorry for the delay, I was away.

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67839700 [Test build #24704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24704/consoleFull) for PR 3762 at commit

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67839708 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4907][MLlib] Inconsistent loss and grad...

2014-12-22 Thread dbtsai

Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/3746#issuecomment-67842962 @bryanyang0528 The learning rate issue here is different story. With modern optimization algorithms like LBFGS and OWLQN, the learning rate is not required. The

[GitHub] spark pull request: [SPARK-4907][MLlib] Inconsistent loss and grad...

2014-12-22 Thread bryanyang0528

Github user bryanyang0528 commented on the pull request: https://github.com/apache/spark/pull/3746#issuecomment-67847818 @dbtsai Thank you for your clear explanation, which helps me a lot!

[GitHub] spark pull request: [SPARK-2505][MLlib] Weighted Regularizer for G...

2014-12-22 Thread witgo

Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1518#discussion_r22171070 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/Regularizer.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-22 Thread uncleGen

GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/3763 [SPARK-4920][UI]:current spark version in UI is not striking. It is not convenient to see the Spark version. We can keep the same style as the Spark website.

[GitHub] spark pull request: [SPARK-2505][MLlib] Weighted Regularizer for G...

2014-12-22 Thread dbtsai

Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/1518#discussion_r22173571 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/Regularizer.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3763#issuecomment-67852386 [Test build #24705 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24705/consoleFull) for PR 3763 at commit

[GitHub] spark pull request: [SPARK-4860][pyspark][sql] speeding up `sample...

2014-12-22 Thread jbencook

GitHub user jbencook opened a pull request: https://github.com/apache/spark/pull/3764 [SPARK-4860][pyspark][sql] speeding up `sample()` and `takeSample()` This PR modifies the python `SchemaRDD` to use `sample()` and `takeSample()` from Scala instead of the slower python

[GitHub] spark pull request: [SPARK-4860][pyspark][sql] speeding up `sample...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3764#issuecomment-67860230 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread sryza

Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67860552 Great idea, LGTM

[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3763#issuecomment-67863235 [Test build #24705 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24705/consoleFull) for PR 3763 at commit

[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3763#issuecomment-67863240 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67868189 Hmm, guess I missed this in my testing. Anyway, I think this is the wrong place for the fix. The right fix in my view should be in `SparkDeploySchedulerBackend`,

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-67868333 Thanks Josh!

[GitHub] spark pull request: [SPARK-1507][YARN]specify num of cores for AM

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3686#issuecomment-67869210 It says it's standalone mode only because it's never been implemented anywhere else. You're now implementing it for Yarn, I don't see a reason why you wouldn't just reuse

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67870864 Here's what I think is a better approach, feel free to use / adapt it: https://gist.github.com/vanzin/e1910b11ce00630fe9d4

[GitHub] spark pull request: [Minor] Fix scala doc

2014-12-22 Thread ash211

Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/3751#issuecomment-67871664 This is a very minor change -- do we need a Jira ticket for it?

[GitHub] spark pull request: [SPARK-2309][MLlib] Generalize the binary logi...

2014-12-22 Thread avulanov

Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1379#issuecomment-67872100 @dbtsai I did local experiment on mnist and your new implementation seems to be more than 2x faster than the previous one! I am going to perform bigger experiments. In

[GitHub] spark pull request: [SPARK-4860][pyspark][sql] speeding up `sample...

2014-12-22 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3764#issuecomment-67875459 ok to test

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3752#discussion_r22183385 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -55,8 +56,60 @@ private[hive] class HiveMetastoreCatalog(hive:

[GitHub] spark pull request: [SPARK-4860][pyspark][sql] speeding up `sample...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3764#issuecomment-67875679 [Test build #24706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24706/consoleFull) for PR 3764 at commit

[GitHub] spark pull request: SPARK-4547 [MLLIB] OOM when making bins in Bin...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3702#discussion_r22183573 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala --- @@ -28,9 +28,23 @@ import

[GitHub] spark pull request: SPARK-4547 [MLLIB] OOM when making bins in Bin...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3702#discussion_r22183579 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetricsSuite.scala --- @@ -124,4 +124,36 @@ class

[GitHub] spark pull request: SPARK-4547 [MLLIB] OOM when making bins in Bin...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3702#discussion_r22183575 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala --- @@ -103,7 +117,37 @@ class

[GitHub] spark pull request: SPARK-4547 [MLLIB] OOM when making bins in Bin...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3702#discussion_r22183580 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetricsSuite.scala --- @@ -124,4 +124,36 @@ class

[GitHub] spark pull request: SPARK-4547 [MLLIB] OOM when making bins in Bin...

2014-12-22 Thread jkbradley

Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3702#issuecomment-67876179 @srowen The logic test look fine; I just added a couple of comments.

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3752#issuecomment-67876256 [Test build #24707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24707/consoleFull) for PR 3752 at commit

[GitHub] spark pull request: [SPARK-1507][YARN]specify num of cores for AM

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3686#issuecomment-67876245 So, after actually reading the code :-), the current implementation uses `spark.yarn.am.cores` for both client and cluster mode. I think that's bad, because if

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3752#discussion_r22183673 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -55,8 +56,60 @@ private[hive] class HiveMetastoreCatalog(hive:

[GitHub] spark pull request: [SPARK-4912][SQL] Persistent tables for the Sp...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3752#issuecomment-67876806 [Test build #24708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24708/consoleFull) for PR 3752 at commit

[GitHub] spark pull request: [SPARK-4917] Add a function to convert into a ...

2014-12-22 Thread rxin

Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3760#issuecomment-67876851 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread rxin

Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67876912 I'm merging this since I really only wanted to check for compilation. Thanks.

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67877038 retest this please @vanzin I believe we separated the definition of `isEventLogEnabled` from that of `eventLogger` because of the following initialization

[GitHub] spark pull request: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread rxin

Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3758#issuecomment-67877226 Alright I've merged this. Do you mind closing the PR? Github doesn't close it unless the commit is merged into master.

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67877360 I see. Hmm. That sucks. :-/ A comment there would help at least, but even better would be to avoid this tight coupling altogether.

[GitHub] spark pull request: [SPARK-4917] Add a function to convert into a ...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3760#issuecomment-67877352 [Test build #24709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24709/consoleFull) for PR 3760 at commit

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread rxin

Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3762#issuecomment-67877459 LGTM. Merging in master. Thanks.

[GitHub] spark pull request: [SPARK-4918][Core] Reuse Text in saveAsTextFil...

2014-12-22 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3762

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67877780 Hey @viirya I believe the right fix here is to change the `eventLogFile` field back to an `eventLogDir` (because it refers to the base logging directory, not the

[GitHub] spark pull request: [SPARK-3382] GradientDescent convergence toler...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3636#discussion_r22184450 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala --- @@ -77,6 +80,17 @@ class GradientDescent private[mllib]

[GitHub] spark pull request: [SPARK-4749] [mllib]: Allow initializing KMean...

2014-12-22 Thread jkbradley

Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3610#issuecomment-67877997 failure in a streaming test...retesting

[GitHub] spark pull request: [SPARK-4749] [mllib]: Allow initializing KMean...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3610#issuecomment-67878052 [Test build #551 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/551/consoleFull) for PR 3610 at commit

[GitHub] spark pull request: [SPARK-4749] [mllib]: Allow initializing KMean...

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3610#discussion_r22184615 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -43,13 +43,14 @@ class KMeans private ( private var runs:

[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3763#issuecomment-67878388 +1. I also thought the bottom greyed out text is too obscure. @JoshRosen any thoughts?

[GitHub] spark pull request: [SPARK-4915][YARN] Fix classname to be specifi...

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3757#issuecomment-67878534 I'm merging this since this is just docs

[GitHub] spark pull request: [SPARK-4915][YARN] Fix classname to be specifi...

2014-12-22 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3757#issuecomment-67878503 [Test build #24710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24710/consoleFull) for PR 3757 at commit

[GitHub] spark pull request: [SPARK-4915][YARN] Fix classname to be specifi...

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3757#issuecomment-67878474 LGTM

[GitHub] spark pull request: [SPARK-4915][YARN] Fix classname to be specifi...

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3757#issuecomment-67878448 ok to test

[GitHub] spark pull request: [SPARK-4915][YARN] Fix classname to be specifi...

2014-12-22 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3757

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22184915 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala --- @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67878956 Andrew's suggestion sounds good. Long term, I think it would be better to send this log path later (as some sort of application stopping message maybe?), instead of

[GitHub] spark pull request: [SPARK-4881] Use SparkConf#getBoolean instead ...

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3733#issuecomment-67878993 No worries. Once you make those changes I will merge this. By the way for issues as minor as this one I don't think filing a JIRA is necessary. I would just put

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22185023 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModelEM.scala --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4870] Add spark version to driver log

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3717#issuecomment-67879567 This looks fine. I'm going to tweak the log format a little bit when I merge it. Thanks

[GitHub] spark pull request: [SPARK-4870] Add spark version to driver log

2014-12-22 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3717

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread jkbradley

Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22185641 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModelEM.scala --- @@ -0,0 +1,242 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread jkbradley

Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3022#issuecomment-67880399 @tgaloppo MLUtils.EPSILON is actually private[util]. I think it would be fine to change it to be private[mllib]. CC: @mengxr @tgaloppo I strongly recommend

[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path

2014-12-22 Thread andrewor14

Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-67880836 For instance... https://github.com/andrewor14/spark/compare/fix-event-log-suggestion
