[GitHub] spark pull request: [yarn]The method has a never used parameter

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1761#issuecomment-51020215 In general, I'm wary of merging this type of very-small-scale code cleanup; I don't think this makes the code any easier to understand and it may cause merge conflicts

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-51020347 Since this same locking pattern occurs at several places in the code, I think it might make sense to abstract it behind a function or macro, which would give us a

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51020440 QA results for PR 1719: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51020596 This looks okay, but I still wonder whether there's a simpler approach. Have you looked at how [dill](https://github.com/uqfoundation/dill) handles namedtuples?

[GitHub] spark pull request: [SPARK-2583] ConnectionManager error reporting

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1758#issuecomment-51020658 Jenkins, test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51020851 It's easy to extend pickle to support namedtuple; cloudpickle and dill do it this way, but they are slow. We want to use cPickle for datasets; it should be fast by

[GitHub] spark pull request: [SPARK-2583] ConnectionManager error reporting

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1758#issuecomment-51020991 QA tests have started for PR 1758. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17845/consoleFull

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51021254 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2817] add show create table support

2014-08-04 Thread tianyi
Github user tianyi commented on the pull request: https://github.com/apache/spark/pull/1760#issuecomment-51021277 @chenghao-intel are these files all right?

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51021399 Here's another (contrived) example that breaks: ```python from collections import namedtuple as nt from pyspark import SparkContext from

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51021526 QA tests have started for PR 1719. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17846/consoleFull

[GitHub] spark pull request: [SPARK-1812] Enable cross build for scala 2.11...

2014-08-04 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/996#discussion_r15741984 --- Diff: assembly/pom.xml --- @@ -26,7 +26,7 @@ </parent> <groupId>org.apache.spark</groupId> - <artifactId>spark-assembly_2.10</artifactId>

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-08-04 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-51022369 Thanks for commenting Josh. I will see about putting together something on this including solid testcases. ETA later in the coming week.

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51022848 Yes, it's easy to break it. Having a solution that works in 99% of cases is better than no solution, or a much slower solution that works in 100% of cases.

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51023016 QA results for PR 1719: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51023029 This feature is not a blocker, because we prefer to use Row() instead of namedtuple for inferSchema(). If users really want to use namedtuple or a customized class in

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-08-04 Thread avulanov
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-51023586 @mengxr Could you review or comment on this? Thanks!

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51023600 LGTM. Merged into both master and branch-1.1. @Ishiihara Thanks a lot for implementing word2vec! Please help improve its performance during the QA period. One task left

[GitHub] spark pull request: JIRA issue: [SPARK-1405] Gibbs sampling based ...

2014-08-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/476#discussion_r15742427 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala --- @@ -233,4 +235,60 @@ object MLUtils { } sqDist } +

[GitHub] spark pull request: Remove support for waiting for executors in st...

2014-08-04 Thread kayousterhout
GitHub user kayousterhout opened a pull request: https://github.com/apache/spark/pull/1762 Remove support for waiting for executors in standalone mode. Current code waits until some minimum fraction of expected executors have registered before beginning scheduling. The current

[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...

2014-08-04 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-51024346 @pwendell I created https://github.com/apache/spark/pull/1762 for your judgment of what the right thing to do here is!

[GitHub] spark pull request: Remove support for waiting for executors in st...

2014-08-04 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/1762#discussion_r15742607 --- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala --- @@ -0,0 +1,63 @@ +/* + * Licensed to

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1719

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-08-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-51024568 Sure. We had some transformers implemented under `mllib.feature`, similar to sk-learn's approach. For feature selection, we can follow the same approach if we view

[GitHub] spark pull request: fix GraphX EdgeRDD zipPartitions

2014-08-04 Thread luluorta
GitHub user luluorta opened a pull request: https://github.com/apache/spark/pull/1763 fix GraphX EdgeRDD zipPartitions If users set “spark.default.parallelism” and the value differs from the EdgeRDD partition number, GraphX jobs will throw:

[GitHub] spark pull request: [SPARK-2823]fix GraphX EdgeRDD zipPartitions

2014-08-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1763#issuecomment-51024735 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51024718 @liancheng why not just have the jars listed on the classpath in the order they are given to us? This is also how classpaths work in general, when I run a java command,

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51024798 @andrewor14 I still don't understand how this is different. Basically, the JVM works such that you put a set of jars in order (indicating precedence) and then you can

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-08-04 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/1481#discussion_r15742765 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -29,7 +29,7 @@ import akka.actor.{ActorSystem, Cancellable, Props}

[GitHub] spark pull request: [SPARK-1687] [PySpark] pickable namedtuple

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1623#issuecomment-51024991 I found another technique that may be more robust to `namedtuple` being accessible under different names. We can replace `namedtuple`'s code object at runtime in

[GitHub] spark pull request: [MLlib] [SPARK-2510]Word2Vec: Distributed Repr...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-51025157 QA results for PR 1719: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2817] add show create table support

2014-08-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/1760#issuecomment-51025307 Jenkins, test this please.

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-08-04 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1481#issuecomment-51025499 Updated patch addresses @pwendell's and @kayousterhout's comments and adds tests.

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1481#issuecomment-51025769 QA tests have started for PR 1481. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17848/consoleFull

[GitHub] spark pull request: [SPARK-1812] Enable cross build for scala 2.11...

2014-08-04 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/996#discussion_r15743162 --- Diff: assembly/pom.xml --- @@ -26,7 +26,7 @@ </parent> <groupId>org.apache.spark</groupId> -

[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-51026034 Okay let me run it by some more people tomorrow and figure it out.

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1764#discussion_r15743214 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala --- @@ -266,7 +266,8 @@ package object dsl { object plans

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread aarondav
GitHub user aarondav opened a pull request: https://github.com/apache/spark/pull/1764 [SPARK-2824/2825][SQL] Work towards separating data location from format Currently, there is a fundamental assumption in SparkSQL that a Parquet table is stored at a certain Hadoop path and

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1764#discussion_r15743237 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/TableFormat.scala --- @@ -0,0 +1,53 @@ +/* + * Licensed to the

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1481#issuecomment-51026321 Cool - thanks Sandy. Let's see if tests pass. Can likely merge this tomorrow and fix any remaining issues (if they exist).

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1764#discussion_r15743336 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -353,15 +356,14 @@ private[parquet] object ParquetTypesConverter

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1764#issuecomment-51026425 QA tests have started for PR 1764. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17849/consoleFull

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1764#issuecomment-51026471 QA results for PR 1764: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1764#issuecomment-51026733 QA tests have started for PR 1764. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17850/consoleFull

[GitHub] spark pull request: [SPARK-2824/2825][SQL] Work towards separating...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1764#issuecomment-51026778 QA results for PR 1764: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1754#issuecomment-51027051 I think this is an intermediate YARN version that is different from both the yarn-alpha and yarn-stable API's. @witgo what if you apply the patch here - does it work?

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1754#issuecomment-51028168 This PR makes sbt's behavior consistent with the description in [building-with-maven.md](https://github.com/apache/spark/blob/master/docs/building-with-maven.md).

[GitHub] spark pull request: [SPARK-1779] add warning when memoryFraction i...

2014-08-04 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/714#issuecomment-51029448 Hey @andrewor14, I cannot see the FAILED unit tests info, so I do not know how to resolve it. Can you help me?

[GitHub] spark pull request: Optionally parallelize the Spark build.

2014-08-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1752#issuecomment-51030127 @pwendell Ah, so that's what's causing it. Yes, fix forward by all means, but can this be disabled until that time? It looks like about half or more of all test runs are

[GitHub] spark pull request: [SQL] [SPARK-2826] Reduce the memory copy whil...

2014-08-04 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/1765 [SQL] [SPARK-2826] Reduce the memory copy while building the hashmap for HashOuterJoin This is a follow-up to #1147; this PR improves performance by about 10%-15% in my local

[GitHub] spark pull request: [SQL] [SPARK-2826] Reduce the memory copy whil...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1765#issuecomment-51030910 QA tests have started for PR 1765. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17851/consoleFull

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1754#discussion_r15745620 --- Diff: project/SparkBuild.scala --- @@ -71,7 +71,7 @@ object SparkBuild extends PomBuild { } Properties.envOrNone(SPARK_HADOOP_VERSION)

[GitHub] spark pull request: [SPARK-1986][GraphX]move lib.Analytics to org....

2014-08-04 Thread larryxiao
GitHub user larryxiao opened a pull request: https://github.com/apache/spark/pull/1766 [SPARK-1986][GraphX]move lib.Analytics to org.apache.spark.examples to support ~/spark/bin/run-example GraphXAnalytics triangles /soc-LiveJournal1.txt --numEPart=256 You can merge this pull

[GitHub] spark pull request: [SPARK-1986][GraphX]move lib.Analytics to org....

2014-08-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1766#issuecomment-51032872 Can one of the admins verify this patch?

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-08-04 Thread avulanov
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-51033011 @mengxr 1. Do I understand correctly that you propose that `fit(dataset: RDD[LabeledPoint])` should compute feature scores according to the feature selection

[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2014-08-04 Thread luluorta
GitHub user luluorta opened a pull request: https://github.com/apache/spark/pull/1767 [SPARK-2827][GraphX]Add degree distribution operators in GraphOps for GraphX You can merge this pull request into a Git repository by running: $ git pull https://github.com/luluorta/spark

[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2014-08-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1767#issuecomment-51033669 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1754#issuecomment-51033818 @pwendell #151 compilation fails. There seems to be an infinite loop: ` SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt clean

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1754#issuecomment-51034387 Do we need to explicitly point out that Spark does not support versions `2.0.x` and `2.1.x` of YARN?

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-51037911 QA tests have started for PR 1616. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17852/consoleFull

[GitHub] spark pull request: [SQL] [SPARK-2826] Reduce the memory copy whil...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1765#issuecomment-51039744 QA results for PR 1765: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-51044768 QA results for PR 1616: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2792. Fix reading too much or too little...

2014-08-04 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/1722#discussion_r15750212 --- Diff: core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala --- @@ -35,16 +35,15 @@ private[spark] class JavaSerializationStream(out:

[GitHub] spark pull request: SPARK-2792. Fix reading too much or too little...

2014-08-04 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1722#issuecomment-51047651 LGTM!

[GitHub] spark pull request: [SPARK-2815]: Compilation failed upon the hado...

2014-08-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1754#issuecomment-51048750 @witgo I don't think #151 is to be committed, if I understand correctly. It's not 100% clear which versions of YARN 2.0.x actually work with `yarn-alpha`, and which if

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-08-04 Thread li-zhihui
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-51051752 @JoshRosen added comment.

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51055752 @pwendell OK, the `java -firstCpElement` example really convinced me :) I used to think asking users to care about the order of the jars is a little too much, but

[GitHub] spark pull request: SPARK-2813: [SQL] Implement SQRT() directly in...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1750#issuecomment-51064528 QA tests have started for PR 1750. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17853/consoleFull

[GitHub] spark pull request: SPARK-2686 Add Length and OctetLen support to ...

2014-08-04 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-51065193 Hi, For some reason the CORE module testing has ballooned in overall testing time: it took over 7.5 hours to run. There was one timeout error out of 736 tests -

[GitHub] spark pull request: Fix postfixOps warnings in the test suite

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-51068459 QA tests have started for PR 1323. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17854/consoleFull

[GitHub] spark pull request: Fix postfixOps warnings in the test suite

2014-08-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-51069490 Related work #1330

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-51070559 QA tests have started for PR 1330. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17855/consoleFull

[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling

2014-08-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1269#discussion_r15759585 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/topicmodeling/utils/serialization/TObjectIntHashMapSerializer.scala --- @@ -0,0 +1,51 @@

[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling

2014-08-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1269#discussion_r15759818 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/clustering/topicmodeling/topicmodels/RobustPLSASuite.scala --- @@ -0,0 +1,40 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-2817] add show create table support

2014-08-04 Thread tianyi
Github user tianyi commented on the pull request: https://github.com/apache/spark/pull/1760#issuecomment-51072215 What's wrong with Jenkins?

[GitHub] spark pull request: [SPARK-2583] ConnectionManager cannot distingu...

2014-08-04 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/1490#issuecomment-51072948 Thanks for your backup, @JoshRosen.

[GitHub] spark pull request: SPARK-2813: [SQL] Implement SQRT() directly in...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1750#issuecomment-51075115 QA results for PR 1750: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): case class

[GitHub] spark pull request: [SPARK-2817] add show create table support

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1760#issuecomment-51075309 Jenkins, test this please.

[GitHub] spark pull request: Fix postfixOps warnings in the test suite

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-51076281 QA results for PR 1323: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-51078602 QA results for PR 1330: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-51081881 Thanks for commenting. I now realize that my concern about advisory locking was a little misguided, since only cooperating Spark processes will be coordinating

[GitHub] spark pull request: [SPARK-2817] add show create table support

2014-08-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1760#issuecomment-51083724 LGTM :)

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-51085554 This seems like an alright fix and I'd like to get it into a release, but I'm concerned that this doesn't correctly handle every possible feature of `fetchFile`.

[GitHub] spark pull request: [SPARK-2806] core - upgrade to json4s-jackson ...

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1702#issuecomment-51085913 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2806] core - upgrade to json4s-jackson ...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1702#issuecomment-51086273 QA tests have started for PR 1702. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17856/consoleFull

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-08-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-51086836 @avulanov I have the same concern about calling `transform` before `fit`. There are two options: 1) throw an error, 2) fit on the same dataset and then transform

[GitHub] spark pull request: [SPARK-2179][SQL] Public API for DataTypes and...

2014-08-04 Thread chutium
Github user chutium commented on a diff in the pull request: https://github.com/apache/spark/pull/1346#discussion_r15766362 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -89,6 +88,44 @@ class SQLContext(@transient val sparkContext: SparkContext)

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51087364 @rxin @pwendell This PR is ready for review.

[GitHub] spark pull request: [SPARK-2583] ConnectionManager error reporting

2014-08-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1758#issuecomment-51087933 Jenkins, retest this please. @JoshRosen it appears something timed out or failed during the tests.

[GitHub] spark pull request: [SPARK-2583] ConnectionManager error reporting

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1758#issuecomment-51088288 QA tests have started for PR 1758. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17857/consoleFull

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51089001 @pwendell What about Python files? What if I have one.py and two.py that reference each other, and I want spark-submit to run the main method of one.py but not

[GitHub] spark pull request: [SPARK-2179][SQL] Public API for DataTypes and...

2014-08-04 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/1346#discussion_r15767384 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -89,6 +88,44 @@ class SQLContext(@transient val sparkContext: SparkContext)

[GitHub] spark pull request: SPARK-2380: Support displaying accumulator val...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1309#issuecomment-51089309 QA tests have started for PR 1309. This patch DID NOT merge cleanly! View progress:

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51089761 @andrewor14 I believe Patrick means: 1. For Scala/Java applications, the primary jar should appear as the 1st entry of `--jars` 2. For Python applications,

[GitHub] spark pull request: [SPARK-2583] ConnectionManager error reporting

2014-08-04 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1758#discussion_r15768094 --- Diff: core/src/main/scala/org/apache/spark/network/ConnectionManager.scala --- @@ -41,16 +42,26 @@ import org.apache.spark.util.{SystemClock, Utils}

[GitHub] spark pull request: [SPARK-2678][Core] Added -- to prevent spark...

2014-08-04 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1715#issuecomment-51090902 Hm, I see. Even then we still need some kind of separator, right? I thought the whole point of handling primary resources differently here (either under `--primary` or

[GitHub] spark pull request: [SPARK-2806] core - upgrade to json4s-jackson ...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1702#issuecomment-51091836 QA results for PR 1702: This patch FAILED unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2806] core - upgrade to json4s-jackson ...

2014-08-04 Thread avati
Github user avati commented on the pull request: https://github.com/apache/spark/pull/1702#issuecomment-51092475 It is not clear how the failure is related to this patch..?

[GitHub] spark pull request: SPARK-2380: Support displaying accumulator val...

2014-08-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1309#issuecomment-51092478 QA tests have started for PR 1309. This patch DID NOT merge cleanly! View progress:

[GitHub] spark pull request: SPARK-2380: Support displaying accumulator val...

2014-08-04 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/1309#discussion_r15769145 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -42,6 +44,13 @@ class TaskInfo( var gettingResultTime: Long = 0
