[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-21 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2432#issuecomment-56290329 In current implementation, MesosSchedulerBackend and CoarseGrainedMesosSchedulerBackend always return None when we invoke applicationId method. But if we apply the

[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2432#issuecomment-56290403 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20617/consoleFull) for PR 2432 at commit

[GitHub] spark pull request: SPARK-1597: Add a version of reduceByKey that ...

2014-09-21 Thread techaddict
Github user techaddict closed the pull request at: https://github.com/apache/spark/pull/550 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: Add ValueIncrementableHashMapAccumulator

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2314

[GitHub] spark pull request: SPARK-2387: remove stage barrier

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1328

[GitHub] spark pull request: SPARK-1630: Make PythonRDD handle NULL element...

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/554

[GitHub] spark pull request: SPARK-2582. Make Block Manager Master pluggabl...

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1506

[GitHub] spark pull request: [MLLIB] fix a unresolved reference variable 'n...

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2423

[GitHub] spark pull request: [SPARK-3584] sbin/slaves doesn't work when we ...

2014-09-21 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2444#discussion_r17823617 --- Diff: sbin/slaves.sh --- @@ -67,20 +69,26 @@ fi if [ "$HOSTLIST" = "" ]; then if [ "$SPARK_SLAVES" = "" ]; then -export

[GitHub] spark pull request: [SPARK-3584] sbin/slaves doesn't work when we ...

2014-09-21 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2444#discussion_r17823625 --- Diff: sbin/slaves.sh --- @@ -67,20 +69,26 @@ fi if [ "$HOSTLIST" = "" ]; then if [ "$SPARK_SLAVES" = "" ]; then -export

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56290881 Hmm - O(seconds) is a lot for new tests considering the quantity of tests we have. Is there a constant overhead in launching selenium that gets amortized over future

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2310#issuecomment-56290971 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20616/consoleFull) for PR 2310 at commit

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-21 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56290962 If the only issue here is test speed, maybe we can disable the slower tests by default on Jenkins.

[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2432#issuecomment-56291394 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20617/consoleFull) for PR 2432 at commit

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56292164 It really depends on the number of zero-sized blocks. One thing we can possibly do is to create a compressed bitmap to track zero sized blocks, as discussed here:

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56292398 @lemire our requirements here are very simple. We just need to have a bitmap to track the position of zero-sized blocks in Spark shuffle. Things we need from the bitmap
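
The idea discussed in this thread (record only an average block size, plus a bitmap marking which shuffle blocks are zero-sized) can be sketched roughly as follows. This is a minimal illustration, not Spark's implementation: the names `AvgSizeStatus`, `summarize`, and `size_of` are hypothetical, and a plain Python integer stands in for the compressed bitmap that the thread proposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AvgSizeStatus:
    """Hypothetical summary of shuffle block sizes (not a Spark class)."""
    num_blocks: int
    avg_nonzero_size: int  # average size of the non-empty blocks
    empty_bits: int        # bit i is set when block i is zero-sized

    def size_of(self, block_id: int) -> int:
        """Estimated size of a block: 0 if marked empty, else the average."""
        if not 0 <= block_id < self.num_blocks:
            raise IndexError(block_id)
        if (self.empty_bits >> block_id) & 1:
            return 0
        return self.avg_nonzero_size


def summarize(sizes: List[int]) -> AvgSizeStatus:
    """Collapse exact per-block sizes into an average plus an empty-block bitmap."""
    empty_bits = 0
    total = 0
    nonzero = 0
    for i, size in enumerate(sizes):
        if size == 0:
            empty_bits |= 1 << i  # remember the position of the empty block
        else:
            total += size
            nonzero += 1
    avg = total // nonzero if nonzero else 0
    return AvgSizeStatus(len(sizes), avg, empty_bits)
```

With this layout a reader can skip fetching blocks known to be empty while still getting a size estimate for the rest; how much space the bitmap saves depends, as noted above, on the number of zero-sized blocks.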

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread Ishiihara
Github user Ishiihara commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56292536 @rxin I am definitely interested in working on adding a compressed bitmap. What is the first step? Thanks.

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread scwf
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/2476 [Minor]ignore .idea_modules You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwf/spark patch-4 Alternatively you can review and apply these

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2476#issuecomment-56292832 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56293506 @mridulm Very good thoughts! I totally agree that replication is not only for streaming, and the implications of this patch in other scenarios are important to understand.

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56293724 @tdas handling (1) deterministically will make (2) in line with what we currently have. And that should be sufficient imo. (3) was not in context of this

[GitHub] spark pull request: [Build] SPARK-3624: Failed to find Spark assem...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2477#issuecomment-56294819 Can one of the admins verify this patch?

[GitHub] spark pull request: [Build] SPARK-3624: Failed to find Spark assem...

2014-09-21 Thread tzolov
GitHub user tzolov opened a pull request: https://github.com/apache/spark/pull/2477 [Build] SPARK-3624: Failed to find Spark assembly in /usr/share/spark/lib... Define a 'lib' symlink like this: lib -> /usr/share/spark/jars This required jdeb maven plugin update

[GitHub] spark pull request: [YARN] SPARK-2668: Add variable of yarn log di...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1573#issuecomment-56297461 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20619/consoleFull) for PR 1573 at commit

[GitHub] spark pull request: [YARN] SPARK-2668: Add variable of yarn log di...

2014-09-21 Thread renozhang
Github user renozhang commented on the pull request: https://github.com/apache/spark/pull/1573#issuecomment-56297532 @tgravescs patch updated, thanks for your review.

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825042 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -403,16 +402,20 @@ class BlockManagerMasterActor(val isLocal:

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825062 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1228,4 +1240,212 @@ class BlockManagerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825064 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -62,6 +63,7 @@ class BlockManagerSuite extends FunSuite with Matchers

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56299008 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20620/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [YARN] SPARK-2668: Add variable of yarn log di...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1573#issuecomment-56299060 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20619/consoleFull) for PR 1573 at commit

[GitHub] spark pull request: [SPARK-3580] add 'partitions' property to PySp...

2014-09-21 Thread mattf
GitHub user mattf opened a pull request: https://github.com/apache/spark/pull/2478 [SPARK-3580] add 'partitions' property to PySpark RDD 'rdd.partitions' is available in scala/java, primarily used for its size() method to get the number of partitions. pyspark instead has a

[GitHub] spark pull request: [SPARK-3580] add 'partitions' property to PySp...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2478#issuecomment-56299526 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20621/consoleFull) for PR 2478 at commit

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825538 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1228,4 +1240,212 @@ class BlockManagerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825562 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1228,4 +1240,212 @@ class BlockManagerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17825572 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1228,4 +1240,212 @@ class BlockManagerSuite extends FunSuite with

[GitHub] spark pull request: Update docs to use jsonRDD instead of wrong js...

2014-09-21 Thread gregakespret
GitHub user gregakespret opened a pull request: https://github.com/apache/spark/pull/2479 Update docs to use jsonRDD instead of wrong jsonRdd. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gregakespret/spark patch-1

[GitHub] spark pull request: Update docs to use jsonRDD instead of wrong js...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2479#issuecomment-56300581 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56301104 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20622/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56301129 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20622/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56301209 @JoshRosen, here I named the configuration options after how Hadoop does it (actually just using ```spark``` instead of ```hadoop```). ```spark.client``` is the browser which

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56301425 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20623/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-3580] add 'partitions' property to PySp...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2478#issuecomment-56301525 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20621/consoleFull) for PR 2478 at commit

[GitHub] spark pull request: SPARK-3625'In some cases, the RDD.checkpoint d...

2014-09-21 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2480 SPARK-3625'In some cases, the RDD.checkpoint does not work You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3625 Alternatively

[GitHub] spark pull request: [SPARK-3625] In some cases, the RDD.checkpoint...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2480#issuecomment-56301739 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20624/consoleFull) for PR 2480 at commit

[GitHub] spark pull request: [SPARK-3625] In some cases, the RDD.checkpoint...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2480#issuecomment-56302378 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20625/consoleFull) for PR 2480 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56302384 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20626/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56302482 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20623/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56302557 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20620/consoleFull)** after a configured wait of `120m`.

[GitHub] spark pull request: [SPARK-3625] In some cases, the RDD.checkpoint...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2480#issuecomment-56303360 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20624/consoleFull) for PR 2480 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56303535 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20626/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56303934 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20627/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56304208 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20628/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-3625] In some cases, the RDD.checkpoint...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2480#issuecomment-56304525 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20625/consoleFull) for PR 2480 at commit

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2476#issuecomment-56304894 Jenkins, test this please. LGTM thanks

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56304928 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20627/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2476#issuecomment-56305114 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20629/consoleFull) for PR 2476 at commit

[GitHub] spark pull request: SPARK-2058: Overriding SPARK_HOME/conf with SP...

2014-09-21 Thread EugenCepoi
GitHub user EugenCepoi opened a pull request: https://github.com/apache/spark/pull/2481 SPARK-2058: Overriding SPARK_HOME/conf with SPARK_CONF_DIR Update of PR #997. With this PR, setting SPARK_CONF_DIR overrides SPARK_HOME/conf (not only spark-defaults.conf and
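
The override described in this PR summary can be sketched as a simple lookup: use `SPARK_CONF_DIR` when it is set, otherwise fall back to `SPARK_HOME/conf`. This is only an illustration of the stated behaviour, not the PR's actual code; the helper name `resolve_conf_dir` is hypothetical.

```python
import os
from typing import Mapping


def resolve_conf_dir(env: Mapping[str, str]) -> str:
    """Return the configuration directory implied by the environment.

    Hypothetical helper: SPARK_CONF_DIR wins when set and non-empty,
    otherwise the default is the conf directory under SPARK_HOME.
    """
    conf_dir = env.get("SPARK_CONF_DIR")
    if conf_dir:
        return conf_dir
    return os.path.join(env["SPARK_HOME"], "conf")
```

Calling `resolve_conf_dir(os.environ)` would apply the same rule to the current process environment, assuming `SPARK_HOME` is set.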

[GitHub] spark pull request: SPARK-2058: Overriding SPARK_HOME/conf with SP...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2481#issuecomment-56305629 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20630/consoleFull) for PR 2481 at commit

[GitHub] spark pull request: SPARK-2058: Overriding config from SPARK_HOME ...

2014-09-21 Thread EugenCepoi
Github user EugenCepoi commented on the pull request: https://github.com/apache/spark/pull/997#issuecomment-56305844 @andrewor14 @pwendell I updated it and opened a new PR #2481 against master.

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2450#issuecomment-56306137 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2450#issuecomment-56306228 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/138/consoleFull) for PR 2450 at commit

[GitHub] spark pull request: [SPARK-2750] support https in spark web ui

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-56306346 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20628/consoleFull) for PR 1980 at commit

[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...

2014-09-21 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2464#discussion_r17826891 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -859,18 +864,27 @@ private[spark] object Utils extends Logging { } }

[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...

2014-09-21 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2464#discussion_r17826897 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -859,18 +864,27 @@ private[spark] object Utils extends Logging { } }

[GitHub] spark pull request: SPARK-2621. Update task InputMetrics increment...

2014-09-21 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/2087#issuecomment-56307027 @aarondav @sryza Did you consider using reader.getPos() to get the correct metrics for older versions of Hadoop (as in here:

[GitHub] spark pull request: [SPARK-3609][SQL] Adds sizeInBytes statistics ...

2014-09-21 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2468#discussion_r17827025 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala --- @@ -122,6 +122,16 @@ object NativeType {

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2476#issuecomment-56307280 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20629/consoleFull) for PR 2476 at commit

[GitHub] spark pull request: [SPARK-927] detect numpy at time of use

2014-09-21 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2313#issuecomment-56307508 fyi - re passing warnings to driver: https://issues.apache.org/jira/browse/SPARK-516 and https://issues.apache.org/jira/browse/SPARK-593

[GitHub] spark pull request: SPARK-2058: Overriding SPARK_HOME/conf with SP...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2481#issuecomment-56307689 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20630/consoleFull) for PR 2481 at commit

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2450#issuecomment-56308294 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/138/consoleFull) for PR 2450 at commit

[GitHub] spark pull request: [SPARK-3293] yarn's web show SUCCEEDED when ...

2014-09-21 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2311#issuecomment-56308622 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3293] yarn's web show SUCCEEDED when ...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2311#issuecomment-56308708 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20631/consoleFull) for PR 2311 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread JoshRosen
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/2482 [SPARK-3626] [WIP] Replace AsyncRDDActions with a more general runAsync() mechanism ### Background The `AsyncRDDActions` methods were introduced in

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-21 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56309633 I've opened #2482 , a pull request (WIP) illustrating my proposal to remove `AsyncRDDActions` and replace it with a more general mechanism for asynchronously launching

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread lemire
Github user lemire commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56309772 @rxin We are currently working with the Druid.io guys to integrate Roaring (http://roaringbitmap.org). We get good results and even support memory mapped bitmaps (with

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56309790 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20632/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56309826 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20632/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56309919 I don't think we can just wipe the old one out. At the very least, we need to deprecate it. Even that is debatable because some applications might prefer this async model.

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56310103 +1 @rxin Just scanned through the code quickly, and I didn't immediately see anything that would preclude retaining and deprecating the old code while

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56310100 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20633/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56310133 Fair enough, although the `AsyncRDDActions` class was marked as `@Experimental` and the documentation for that annotation explicitly warns that experimental APIs might

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread Ishiihara
Github user Ishiihara commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56310242 @rxin @lemire Starting to look at Roaring.

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2450

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2476#issuecomment-56310938 Thanks I've merged this.

[GitHub] spark pull request: [Minor]ignore .idea_modules

2014-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2476

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2482#discussion_r17827924 --- Diff: core/src/main/scala/org/apache/spark/RunAsyncResult.scala --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-21 Thread debasish83
Github user debasish83 commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-56311340 The colMags right now have sqrt(sum(column1)^2 + sum(column2)^2 + ... + sum(columnN)^2). It will be good to have (sum(column1) + sum(column2) + ... + sum(columnN))

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56311396 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20634/consoleFull) for PR 2366 at commit

[GitHub] spark pull request: [SPARK-3293] yarn's web show SUCCEEDED when ...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2311#issuecomment-56311778 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20631/consoleFull) for PR 2311 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56312285 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20633/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-21 Thread jeffsteinmetz
Github user jeffsteinmetz commented on the pull request: https://github.com/apache/spark/pull/2473#issuecomment-56312949 now consistent, using lower-case

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56313463 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20634/consoleFull) for PR 2366 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56314921 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20635/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56315058 I've taken another pass at this. This time, I kept AsyncRDDActions but re-implemented it using `runAsync`, but I'm actually on the fence about that change. The one

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-21 Thread rezazadeh
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-56316095 Why do you say normL1 is not implemented? I have implemented normL1 in MultivariateOnlineSummarizer, with tests. Do you want a version without absolute values? If so,

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56316673 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20635/consoleFull) for PR 2482 at commit

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56317365 @mridulm I implemented (1) and also added a unit test for that behavior.

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56317539 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20636/consoleFull) for PR 2366 at commit

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829791 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager( } /**

[GitHub] spark pull request: [SPARK-1545] [mllib] Add Random Forests

2014-09-21 Thread chouqin
Github user chouqin commented on the pull request: https://github.com/apache/spark/pull/2435#issuecomment-56319639 @jkbradley thanks, it looks good to me except comments in the code.
