[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50257258 Since all process-local tasks are also node-local, rack-local and any, we will incur the node-local delay as well.

[GitHub] spark pull request: [SPARK-2490] Change recursive visiting on RDD ...

2014-07-27 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/1418#issuecomment-50257662 @mateiz Thanks for the suggestion. I am leaving the PageRank example as-is. These braces are added to comply with the code style. --- If your project is set up for it, you can reply

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50259076 Okay, in that case, how about just calling the method with NO_PREFS after you've called it with all other levels (up to and including ANY)? Then we can have some

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50259298 (BTW this would require calling with NO_PREFS even if some of the other levels returned no task; but it still seems less complicated than the current approach.)

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-27 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1607#discussion_r15438269 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -110,42 +110,56 @@ class ExternalAppendOnlyMap[K, V, C](

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1524#issuecomment-50259995 QA tests have started for PR 1524. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17240/consoleFull

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-27 Thread javadba
Github user javadba commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50260024 After a fair bit of struggling with testing inconsistencies and maven and git, I have the updates in place. Please take a look whenever you have a chance - no rush ;)

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438411 --- Diff: external/hbase/pom.xml --- @@ -0,0 +1,217 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0"

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438415 --- Diff: external/hbase/pom.xml --- @@ -0,0 +1,217 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0"

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438420 --- Diff: external/hbase/pom.xml --- @@ -0,0 +1,217 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0"

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438418 --- Diff: external/hbase/pom.xml --- @@ -0,0 +1,217 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0"

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438426 --- Diff: external/hbase/pom.xml --- @@ -0,0 +1,217 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0"

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438429 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/HBaseContext.scala --- @@ -0,0 +1,544 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438438 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/HBaseContext.scala --- @@ -0,0 +1,544 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438442 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/HBaseContext.scala --- @@ -0,0 +1,544 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438453 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/HConnectionStaticCache.scala --- @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438461 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/HConnectionStaticCache.scala --- @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438469 --- Diff: external/hbase/src/main/scala/org/apache/spark/hbase/example/HBaseStreamingBulkPutExample.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438472 --- Diff: external/hbase/src/test/scala/org/apache/spark/hbase/HBaseContextSuite.scala --- @@ -0,0 +1,296 @@ +package org.apache.spark.hbase --- End

[GitHub] spark pull request: Spark-2447 : Spark on HBase

2014-07-27 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1608#discussion_r15438477 --- Diff: external/hbase/src/test/scala/org/apache/spark/hbase/LocalSparkContext.scala --- @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1524#issuecomment-50260857 QA results for PR 1524: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50260912 @mateiz for skipping the levels in computeValidLocalityLevels, that's straightforward, however, the current TaskSetManager does not update the valid locality after the

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1524#issuecomment-50262166 QA tests have started for PR 1524. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17241/consoleFull

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-27 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/1439#discussion_r15438771 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveTableScanSuite.scala --- @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1524#issuecomment-50263068 QA results for PR 1524: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2679] [MLLib] Ser/De for Double

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1581

[GitHub] spark pull request: [SPARK-2479][MLlib] Comparing floating-point n...

2014-07-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1425#issuecomment-50265751 @dbtsai I think it is very uncommon to combine the scientific notation with percentage, like `1e-10 percent`. Shall we switch to `absTol` and `relTol` instead? I feel
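
The distinction between the two tolerances proposed above can be illustrated with a small sketch. This is not the code from the pull request: the object name and both helper functions are hypothetical, and only the parameter names `absTol` and `relTol` come from the comment. It assumes an absolute tolerance bounds the raw difference, while a relative tolerance scales the bound by the larger magnitude of the two operands.

```scala
object ToleranceSketch {
  // Hypothetical helper: true when |x - y| is within a fixed bound.
  def approxEqualAbs(x: Double, y: Double, absTol: Double): Boolean =
    math.abs(x - y) <= absTol

  // Hypothetical helper: true when |x - y| is small relative to the
  // larger magnitude of the two operands.
  def approxEqualRel(x: Double, y: Double, relTol: Double): Boolean = {
    val scale = math.max(math.abs(x), math.abs(y))
    math.abs(x - y) <= relTol * scale
  }

  def main(args: Array[String]): Unit = {
    // Large values: a difference of 1.0 is tiny in relative terms
    // but far outside an absolute tolerance of 1e-10.
    println(approxEqualRel(1e10, 1e10 + 1.0, 1e-9))
    println(approxEqualAbs(1e10, 1e10 + 1.0, 1e-10))
  }
}
```

A relative check of this form becomes very strict when both values are close to zero, which is one reason a test suite might want both kinds of tolerance available.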

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439509 --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala --- @@ -102,6 +100,34 @@ class PartitioningSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439511 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439516 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439518 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50266091 @rxin @mateiz I have one question about using `rdd.id` as random seed shift to avoid sampling the same sequence in each partition. It is a constant within a session. But

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439751 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1607#discussion_r15439805 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -110,42 +110,56 @@ class ExternalAppendOnlyMap[K, V,

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439847 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread mridulm
GitHub user mridulm opened a pull request: https://github.com/apache/spark/pull/1609 [SPARK-2532] WIP Consolidated shuffle fixes Status of the PR - [X] Cherry pick and merge changes from internal branch to spark master - [X] Remove WIP comments and 2G branch references.

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50267547 QA results for PR 1609: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50267540 QA tests have started for PR 1609. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17242/consoleFull

[GitHub] spark pull request: [SPARK-1550] [PySpark] Allow SparkContext crea...

2014-07-27 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1606#issuecomment-50269547 +1 lgtm i've been hitting this issue repeatedly, but assumed it was a corner case that wouldn't get much attention. my assumption is that use of spark-submit and

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50274622 QA results for PR 1609: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50274618 QA tests have started for PR 1609. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17243/consoleFull

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50277674 QA tests have started for PR 1609. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17244/consoleFull

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50278406 QA results for PR 1609: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2648] through shuffling blocksByAddress...

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1549#issuecomment-50279212 Wait, I'm confused now - if these are randomized already in the `BlockFetcherIterator`, then why do we need to randomize them in the `BlockStoreShuffleFetcher`?

[GitHub] spark pull request: SPARK-2651: Add maven scalastyle plugin

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1550#issuecomment-50279335 After some thought, I'd prefer not to force style checking when the `package` target is run. Many distributors fork the Spark build and modify things, I don't want to

[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-07-27 Thread kanzhang
Github user kanzhang commented on the pull request: https://github.com/apache/spark/pull/1082#issuecomment-50279342 @JoshRosen one use case I could imagine is a user creates a series of RDDs along the way (e.g., in a shell) and she later wants to find out which RDDs are persisted and

[GitHub] spark pull request: SPARK-2651: Add maven scalastyle plugin

2014-07-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1550#issuecomment-50279475 Packaging wouldn't generally mean changing code checked by scalastyle, I think. I wouldn't mind it being enabled on package; it feels like something to run with the test

[GitHub] spark pull request: SPARK-2651: Add maven scalastyle plugin

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1550#issuecomment-50279563 @srowen your perspective could be helpful here. My concern was that say Cloudera has some custom patches or back ports in a Spark package. Would they find it annoying

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15440366 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-07-27 Thread kanzhang
Github user kanzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/1082#discussion_r15440360 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala --- @@ -559,6 +559,19 @@ class JavaSparkContext(val sc: SparkContext) extends

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50279680 I think it's fine to make it random. Actually it would be better to do something like `idx | (rdd.id << 16)` to have them overlap in fewer bits, since both `idx` and

[GitHub] spark pull request: SPARK-2651: Add maven scalastyle plugin

2014-07-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1550#issuecomment-50280326 Cloudera only back-ports and doesn't add extra code. Back-porting style-compliant code should result in style-compliant code. Of course a back-port sometimes requires a

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-27 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-50280386 I have addressed your latest comments and rebased to master. Anything else?

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-50280496 QA tests have started for PR 1165. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17245/consoleFull

[GitHub] spark pull request: SPARK-2651: Add maven scalastyle plugin

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1550#issuecomment-50280763 okay then - perhaps it's fine to stick with this. We can always make it less stringent if people complain. My main concern was just frustrating downstream packagers -

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50280775 Actually I meant '^', not '|'
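
A minimal sketch of the seed-mixing idea discussed in this thread, using XOR as in the correction above. It assumes the RDD id is shifted left by 16 bits so that it mostly occupies different bits than the partition index. The object name, the helper `mixSeed`, and the sample values are hypothetical; this is not the code that was merged into `RangePartitioner`.

```scala
import scala.util.Random

object SeedMixSketch {
  // Hypothetical helper: combine a partition index with an RDD id.
  // Shifting the id left by 16 bits (an assumption) keeps it away from
  // the low bits where small partition indices live; XOR then merges them.
  def mixSeed(idx: Int, rddId: Int): Int = idx ^ (rddId << 16)

  def main(args: Array[String]): Unit = {
    val rddId = 7 // illustrative value only
    // Each partition index gets its own seed, so the per-partition
    // random generators do not replay the same sequence.
    (0 until 4).foreach { idx =>
      val seed = mixSeed(idx, rddId)
      val first = new Random(seed).nextInt(100)
      println(s"partition=$idx seed=$seed firstDraw=$first")
    }
  }
}
```

With this layout, partition indices below 65536 never collide with the shifted id bits, so distinct (index, id) pairs in that range yield distinct seeds.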

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-27 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1607#issuecomment-50281189 LGTM, merging into master.

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1607

[GitHub] spark pull request: SPARK-2425 Don't kill a still-running Applicat...

2014-07-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1360#issuecomment-50281668 ping Probably too late for a 1.0.2-rc, but this should go into 1.0.3 and 1.1.0.

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-50281752 QA results for PR 1165: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): case class

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-27 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50281863 test this please

[GitHub] spark pull request: [SPARK-2674] [SQL] [PySpark] support datetime ...

2014-07-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1601#discussion_r15440847 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -357,16 +357,52 @@ class SQLContext(@transient val sparkContext:

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50281975 QA tests have started for PR 1586. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17246/consoleFull

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-07-27 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-50282111 I think reflection is definitely the right way to go here

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50283185 QA tests have started for PR 1562. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17248/consoleFull

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1524#issuecomment-50283221 Looks good, thanks @liancheng!

[GitHub] spark pull request: [SPARK-2705][CORE] Fixed stage description in ...

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1524

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50283761 QA tests have started for PR 1562. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17249/consoleFull

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15441176 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15441180 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V](

[GitHub] spark pull request: [SPARK-2410][SQL] Merging Hive Thrift/JDBC ser...

2014-07-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1600#issuecomment-50284058 Okay - let's try this again. I can merge it!

[GitHub] spark pull request: [SPARK-2410][SQL] Merging Hive Thrift/JDBC ser...

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1600

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50284527 QA results for PR 1562: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50284646 QA results for PR 1586: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): case class

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1562#issuecomment-50285111 QA results for PR 1562: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2420. Illustration of downgrading to Gua...

2014-07-27 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/1610 SPARK-2420. Illustration of downgrading to Guava 11 from 14 This PR illustrates a change being proposed for https://issues.apache.org/jira/browse/SPARK-2420 See the JIRA. You can merge this pull

[GitHub] spark pull request: SPARK-2420. Illustration of downgrading to Gua...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1610#issuecomment-50285700 QA tests have started for PR 1610. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17250/consoleFull

[GitHub] spark pull request: [SPARK-2054][SQL] Code Generation for Expressi...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/993#issuecomment-50287338 QA tests have started for PR 993. This patch DID NOT merge cleanly! View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17251/consoleFull

[GitHub] spark pull request: [Build] SPARK-2614: (2nd patch) Create a spark...

2014-07-27 Thread tzolov
GitHub user tzolov opened a pull request: https://github.com/apache/spark/pull/1611 [Build] SPARK-2614: (2nd patch) Create a spark-XXX-examples.deb package in addition to the spark-XXX-all.deb (-Pdeb profile) New patch that takes into consideration Mark Hamstra's suggestion

[GitHub] spark pull request: [Build] SPARK-2614: (2nd patch) Create a spark...

2014-07-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1611#issuecomment-50287433 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2054][SQL] Code Generation for Expressi...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/993#issuecomment-50287472 QA tests have started for PR 993. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17252/consoleFull

[GitHub] spark pull request: SPARK-2420. Illustration of downgrading to Gua...

2014-07-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1610#issuecomment-50287800 QA results for PR 1610: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [Build] SPARK-2614: Add the spark-examples-xxx...

2014-07-27 Thread tzolov
Github user tzolov commented on the pull request: https://github.com/apache/spark/pull/1527#issuecomment-50288291 This pull request should be discarded and replaced by https://github.com/apache/spark/pull/1611

[GitHub] spark pull request: [Build] SPARK-2614: Add the spark-examples-xxx...

2014-07-27 Thread tzolov
Github user tzolov closed the pull request at: https://github.com/apache/spark/pull/1527

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50288916 I see, my point was that we shouldn't try to make optimizations or major refactorings here. It makes it much harder to tell whether the code is correct, and this is code

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50288953 BTW one other thing I should add is that it would be good to update the comments of these methods when you make a PR. For example, if you go with the way I suggested

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-50289038 Thanks Andrew! The changes look good to me -- I've merged this in.

[GitHub] spark pull request: [SPARK-2490] Change recursive visiting on RDD ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1418#issuecomment-50289064 Jenkins, test this please

[GitHub] spark pull request: SPARK-1680: use configs for specifying environ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1512#issuecomment-50289111 @tgravescs I think it's fine to expose that. We actually already document it in the comments of SparkConf.setExecutorEnv and such.

[GitHub] spark pull request: [SPARK-2572] Delete the local dir on executor ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1480#issuecomment-50289129 Can you explain what happens without this? I thought the DiskStore's shutdown hook is still called when the executor exits, so it will still clean up blocks.

[GitHub] spark pull request: SPARK-2710 Build SchemaRDD from a JdbcRDD with...

2014-07-27 Thread chutium
GitHub user chutium opened a pull request: https://github.com/apache/spark/pull/1612 SPARK-2710 Build SchemaRDD from a JdbcRDD with MetaData SPARK-2710 Build SchemaRDD from a JdbcRDD with MetaData and a small bug fix on JdbcRDD, line 109 it seems conn will never be

[GitHub] spark pull request: [SPARK-2572] Delete the local dir on executor ...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1480#issuecomment-50289133 Jenkins, test this please

[GitHub] spark pull request: [SPARK-2325] Utils.getLocalDir had better chec...

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1281#issuecomment-50289215 I see, that makes sense, but in that case we need to do a couple more things to make this complete: 1) We should have a max limit of broken dirs we tolerate, after

[GitHub] spark pull request: [SPARK-2710] [SQL] Build SchemaRDD from a Jdbc...

2014-07-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1612#issuecomment-50289234 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1520

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1520#issuecomment-50289239 LGTM. Merged into master. Thanks for adding random RDD generators!!

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1609#discussion_r15442565 --- Diff: core/src/test/scala/org/apache/spark/storage/DiskBlockObjectWriterSuite.scala --- @@ -0,0 +1,296 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2532] WIP Consolidated shuffle fixes

2014-07-27 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1609#issuecomment-50289340 @adav @andrewor14 would be good if you two take a look at this when it's merging correctly.

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-50289440 @mateiz thanks, I'm working on simplifying this, just one thing to confirm, if we call resourceOffer with NO_PREF only after NODE_LOCAL, we will have the

[GitHub] spark pull request: [Build] SPARK-2614: (2nd patch) Create a spark...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1611#discussion_r15442657 --- Diff: assembly/src/deb/control/examples/control --- @@ -0,0 +1,8 @@ +Package: [[deb.pkg.name]]-examples +Version: [[version]]-[[buildNumber]]
