[GitHub] spark pull request: Add a Community Projects page

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2219#issuecomment-53979567 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19534/consoleFull) for PR 2219 at commit [`613b021`](https://github.com/ap

[GitHub] spark pull request: Add a Community Projects page

2014-08-30 Thread velvia
GitHub user velvia opened a pull request: https://github.com/apache/spark/pull/2219 Add a Community Projects page This adds a new page to the docs listing community projects -- those created outside of Apache Spark that are of interest to the community of Spark users. Anybody can

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-08-30 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-53979487 Yes, I will do that next week. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not hav

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-53979384 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19533/consoleFull) for PR 60 at commit [`27df6ce`](https://github.com/apache

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53978638 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19532/consoleFull) for PR 1992 at commit [`b2a044a`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3280] Made sort-based shuffle the defau...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2178#issuecomment-53978615 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19531/consoleFull) for PR 2178 at commit [`1445ef2`](https://github.com/a

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53978586 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53978314 Hi @pwendell, Jenkins fetch error.

[GitHub] spark pull request: [SPARK-3280] Made sort-based shuffle the defau...

2014-08-30 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2178#issuecomment-53978226 @pwendell I think some Spark SQL tests are failing. Spark SQL isn't completely compatible with sort-based shuffle yet.

[GitHub] spark pull request: [SPARK-3280] Made sort-based shuffle the defau...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2178#issuecomment-53978007 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19531/consoleFull) for PR 2178 at commit [`1445ef2`](https://github.com/ap

[GitHub] spark pull request: Update building-with-maven.md

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2102#issuecomment-53977986 Can you create a JIRA for this issue? I'm going to re-word this a bit when I merge it, I think it's fine to say that certain users have reported issues building behind p

[GitHub] spark pull request: [SPARK-3280] Made sort-based shuffle the defau...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2178#issuecomment-53977954 Jenkins, test this please.

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-53977940 Made a few comments inline. On building docs, my favorite idea is just to have the README link to the upstream docs, and then change the upstream docs to be called "Buil

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2014#discussion_r16933038 --- Diff: README.md --- @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given. ## Running Tests -Test

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2014#discussion_r16933034 --- Diff: README.md --- @@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given. ## Running Tests -Test

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2014#discussion_r16933031 --- Diff: CONTRIBUTING.md --- @@ -0,0 +1,12 @@ +## Contributing to Spark --- End diff -- Yeah, seems fine to have this here. It might make it

[GitHub] spark pull request: [SPARK-3287] When ResourceManager High Availab...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2212#issuecomment-53977840 Can you add `[YARN]` to the title so that it gets sorted properly? Thanks

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53977827 I added a comment about the experimental formatting

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2217#issuecomment-53977801 Hi there, The immutability of broadcast variables might be assumed in other places in the code base. Since this approach requires re-broadcasting the entire cont

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2184

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53977717 Cool - thanks Nick!

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977588 Jenkins, test this please.

[GitHub] spark pull request: Add normalizeByCol method to mllib.util.MLUtil...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1698

[GitHub] spark pull request: Update spark-daemon.sh

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/254

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1356

[GitHub] spark pull request: SPARK-3009: Reverted readObject method in Appl...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1922

[GitHub] spark pull request: [SPARK-3229] spark.shuffle.safetyFraction and ...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2135

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-53977557 okay!

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977519 @mateiz please retest this again; tests failed in Spark Streaming. Thanks.

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977395 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19529/consoleFull) for PR 1992 at commit [`b2a044a`](https://github.com/a

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-53977396 Okay I think this is no longer necessary now that we fixed the issue causing lag in processing events. So I'd like to close this issue for now.

[GitHub] spark pull request: [SPARK-2558][DOCS] Add spark.yarn.queue descri...

2014-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2218#issuecomment-53977030 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2558][DOCS] Add spark.yarn.queue descri...

2014-08-30 Thread kramimus
GitHub user kramimus opened a pull request: https://github.com/apache/spark/pull/2218 [SPARK-2558][DOCS] Add spark.yarn.queue description to YARN doc Put original YARN queue spark-submit arg description in running-on-yarn html table and example command line You can merge this pu

[GitHub] spark pull request: [SPARK-2489] [SQL] Parquet support for fixed_l...

2014-08-30 Thread joesu
Github user joesu commented on the pull request: https://github.com/apache/spark/pull/1737#issuecomment-53976828 Another way is to include max length information in the BinaryType type, just like the FixedLenByteArray type in this pull request. Thus we can maintain only one binary dat

[GitHub] spark pull request: [SPARK-2489] [SQL] Parquet support for fixed_l...

2014-08-30 Thread joesu
Github user joesu commented on the pull request: https://github.com/apache/spark/pull/1737#issuecomment-53976708 It's not that straightforward to reuse BinaryType for handling parquet's binary type and fixed_len_byte_array types because these two types are incompatible in the parquet

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53976440 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19528/consoleFull) for PR 1778 at commit [`75a0b51`](https://github.com/a

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53976161 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19527/consoleFull) for PR 1778 at commit [`0f12ade`](https://github.com/a

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53976069 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19529/consoleFull) for PR 1992 at commit [`b2a044a`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53976026 Jenkins, test this please

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975553 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19528/consoleFull) for PR 1778 at commit [`75a0b51`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53975519 @mateiz, retest this please; tests failed because a forked process exited with a non-zero code. https://github.com/apache/spark/pull/2108

[GitHub] spark pull request: [SPARK-2890][SQL] Allow reading of data when c...

2014-08-30 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/2209#issuecomment-53975391 Sounds good. I was not sure how to correctly query those results with ambiguous schemas when I added that check. Seems a more informative logging entry is better than an e

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975264 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19527/consoleFull) for PR 1778 at commit [`0f12ade`](https://github.com/ap

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread rezazadeh
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975250 Style changes made. Experimental results below. We run DIMSUM daily on a production-scale ads dataset. After replacing the traditional cosine similarity computa

[GitHub] spark pull request: [SPARK-2890][SQL] Allow reading of data when c...

2014-08-30 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2209#issuecomment-53975089 I actually encountered the error with a jsonRDD, but yeah it could happen with parquet files as well. Your comment about joins though makes me think that we should just

[GitHub] spark pull request: [SPARK-2973][SQL] Lightweight SQL commands wit...

2014-08-30 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2215#discussion_r16932497 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala --- @@ -90,10 +90,9 @@ case class SetCommand( throw new IllegalArg

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53975007 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19526/consoleFull) for PR 2210 at commit [`a25d27a`](https://github.com/a

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53974654 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19525/consoleFull) for PR 1992 at commit [`b2a044a`](https://github.com/a

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-53974143 Apart from the discussion around the correct place for centroid calculations and some minor code style comments, it looks good to me. If it's too much work to change i

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2210

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974083 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19526/consoleFull) for PR 2210 at commit [`a25d27a`](https://github.com/ap

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974057 Alright, thanks. Going to merge this.

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974032 @mateiz Thanks, completely forgot to check the javadoc.

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973967 (And please add [SPARK-3318] at the beginning of your PR title)

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16932280 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Softw

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973932 Actually you missed JavaSparkContext; it has the same issue

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973919 Looks good, thanks!

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread naftaliharris
Github user naftaliharris commented on the pull request: https://github.com/apache/spark/pull/1057#issuecomment-53973576 @mateiz oh yeah, no problem. Thanks again for the fixes!

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread naftaliharris
Github user naftaliharris closed the pull request at: https://github.com/apache/spark/pull/1057

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973578 Yeah up to you, you should either update it or close the PR if you think everything is there already.

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973561 Alright, thanks for taking a look at this.

[GitHub] spark pull request: [SPARK-3205] add EscapedTextInputFormat

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2118#discussion_r16932221 --- Diff: core/src/main/scala/org/apache/spark/input/EscapedTextInputFormat.scala --- @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundat

[GitHub] spark pull request: [SPARK-3205] add EscapedTextInputFormat

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2118#discussion_r16932218 --- Diff: core/src/main/scala/org/apache/spark/input/EscapedTextInputFormat.scala --- @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundat

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread rcompton
Github user rcompton closed the pull request at: https://github.com/apache/spark/pull/915

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread rcompton
Github user rcompton commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973422 @mateiz no, for the reasons mentioned by Sean as well as the new work by Sigmoid, you don't need this patch.

[GitHub] spark pull request: [SPARK-2237][CORE]Add ZLIBCompressionCodec cod...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1121#issuecomment-53973370 @YanjieGao do you see a major tradeoff in compressed size and speed with this codec over our current ones? Also, I'm not sure your patch will compile as written. T

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973167 @rcompton I believe this has been addressed by Sigmoid's recent work for Pig on Spark: https://issues.apache.org/jira/browse/PIG-4059. Given that, do we still need this pat

[GitHub] spark pull request: SPARK-2461. Add a toString method to Generaliz...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1388#issuecomment-53973123 @sryza just wondering, will you have time to update this for Python? As I said it would be useful to include.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973073 Sure. Other people told me some of the parameters are not supposed to be configurable, so I put the work on hold here. I can go through it again to check the missing

[GitHub] spark pull request: MetadataCleaner - fine control cleanup documen...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/89#issuecomment-53973053 I agree, we should not expose these to the user given the recent changes. Would it be okay to close this PR?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973039 @CodingCat are you still working on this patch? The doc page changed significantly in 1.0, so maybe a lot of this info is still in, but it would be good to look over it and

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53973019 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19525/consoleFull) for PR 1992 at commit [`b2a044a`](https://github.com/ap

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1057#issuecomment-53972967 Hey @naftaliharris, mind closing this pull request now that this has been fixed in other PRs?

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53972940 Looks good to me pending tests

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53972938 Jenkins, test this please

[GitHub] spark pull request: [SPARK-1919] Fix Windows spark-shell --jars

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2211#discussion_r16932123 --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala --- @@ -965,11 +966,9 @@ class SparkILoop(in0: Option[BufferedReader], protected val out:

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1843

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1843#issuecomment-53971505 Thanks Marcelo! I've merged this in.

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53971091 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19524/consoleFull) for PR 2184 at commit [`33786ac`](https://github.com/a

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-53971034 The ordered categorical features are not binned and the centroids are re-calculated using the entire bin aggregate at every level. I can see the improvement in accuracy h

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-53970492 @davies just curious, do all the unit tests run if you do `run-tests` with `pypy`? We should make sure they do, and add a command in there to test this in Jenkins (ask Pat

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16931670 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Softw

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16931674 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Softw

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53969763 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19524/consoleFull) for PR 2184 at commit [`33786ac`](https://github.com/ap

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53969665 @pwendell I think we're all set now.

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53969391 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19523/consoleFull) for PR 2214 at commit [`983d030`](https://github.com/a

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53968973 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19522/consoleFull) for PR 2184 at commit [`638c0e4`](https://github.com/a

[GitHub] spark pull request: [SPARK-3300][SQL] No need to call clear() and ...

2014-08-30 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/2195#issuecomment-53967657 ok to test please.

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2217#issuecomment-53967605 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/2217 [SPARK-3327] Make broadcasted value mutable for caching useful information This PR makes broadcasted value mutable for caching useful information when implementing some algorithms that iteratively ru

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53967407 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19523/consoleFull) for PR 2214 at commit [`983d030`](https://github.com/ap

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53967270 test this please

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53966890 Hmm, looks like I need to fix something now that this doesn't merge cleanly anymore. Investigating.

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53966846 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19522/consoleFull) for PR 2184 at commit [`638c0e4`](https://github.com/ap

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/2184#discussion_r16930998 --- Diff: dev/run-tests-jenkins --- @@ -138,7 +141,7 @@ function post_message () { test_result="$?" if [ "$test_result" -eq "124" ]; then

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2184#discussion_r16930734 --- Diff: dev/run-tests-jenkins --- @@ -138,7 +141,7 @@ function post_message () { test_result="$?" if [ "$test_result" -eq "124" ]; then

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53965382 Minor style note, but otherwise LGTM

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2176#discussion_r16930621 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -574,4 +574,15 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extend

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53961548 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19521/consoleFull) for PR 2176 at commit [`5536d55`](https://github.com/a

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread lirui-intel
Github user lirui-intel commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53960113 Thanks @rxin, @vanzin for the review. I've added the experimental mark in the Javadoc. I see that MiMa can automatically exclude DeveloperApi and Experimental classes,
