[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-15 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-55609759 There really should be at least some mention of zinc (https://github.com/typesafehub/zinc) in our maven build instructions, since using zinc greatly improves

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-15 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-55611558 Yes, I know that the scala-maven-plugin will throw warnings if zinc isn't being used. I also know that many users are either confused by those warnings or ignore

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-15 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2014#discussion_r17570971 --- Diff: docs/building-spark.md --- @@ -159,4 +160,21 @@ then ship it over to the cluster. We are investigating the exact cause

[GitHub] spark pull request: SPARK-3580: New public method for RDD's to hav...

2014-09-18 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2447#discussion_r17741897 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -208,6 +208,23 @@ abstract class RDD[T: ClassTag

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56310103 +1 @rxin Just scanned through the code quickly, and I didn't immediately see anything that would preclude retaining and deprecating the old code while

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-09-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1689#issuecomment-57043930 Have either of you thought about how to coordinate this with Josh's work on SPARK-3626? https://github.com/apache/spark/pull/2482 --- If your project is set up

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-28 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2470#discussion_r18136843 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -24,22 +24,123 @@ import org.apache.spark.storage.BlockManagerId

[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2524#discussion_r18193968 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -112,6 +112,10 @@ class DAGScheduler( // stray messages

[GitHub] spark pull request: Fix race condition at SchedulerBackend.isReady...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15240513 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -47,19 +47,19 @@ class

[GitHub] spark pull request: Fix race condition at SchedulerBackend.isReady...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15242266 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -108,4 +108,8 @@ private[spark] class

[GitHub] spark pull request: Fix race condition at SchedulerBackend.isReady...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15242315 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -108,4 +108,8 @@ private[spark] class

[GitHub] spark pull request: [SPARK-2490] Change recursive visiting on RDD ...

2014-07-22 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1418#issuecomment-49773532 ok to test

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1528#discussion_r15246213 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -778,8 +778,10 @@ class DAGScheduler( logInfo(Submitting

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1528#discussion_r15248491 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SchedulingAlgorithm.scala --- @@ -32,11 +32,21 @@ private[spark] class FIFOSchedulingAlgorithm

[GitHub] spark pull request: Add caching information to rdd.toDebugString

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15258957 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1294,7 +1307,11 @@ abstract class RDD[T: ClassTag]( val partitionStr

[GitHub] spark pull request: Fix race condition at SchedulerBackend.isReady...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15268935 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -47,19 +47,19 @@ class

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1528#discussion_r15270158 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SchedulingAlgorithm.scala --- @@ -17,6 +17,8 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1528#discussion_r15270239 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSet.scala --- @@ -27,9 +27,18 @@ private[spark] class TaskSet( val tasks: Array[Task

[GitHub] spark pull request: [SPARK-2635] Fix race condition at SchedulerBa...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15270679 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -47,19 +47,19 @@ class

[GitHub] spark pull request: [SPARK-2635] Fix race condition at SchedulerBa...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15270909 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -47,19 +47,19 @@ class

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-22 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1528#discussion_r15272274 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSet.scala --- @@ -27,9 +27,18 @@ private[spark] class TaskSet( val tasks: Array[Task

[GitHub] spark pull request: [SPARK-2635] Fix race condition at SchedulerBa...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1525#discussion_r15272634 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -268,14 +264,18 @@ class

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1528#issuecomment-49848091 This looks like a clean implementation, but you still need to open a JIRA issue to explain why you want this; then edit the description of this PR to reference

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1528#issuecomment-49848309 Sorry, looks like you already have SPARK-2618, so change the title of this PR to include that.

[GitHub] spark pull request: Add caching information to rdd.toDebugString

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49874874 Jenkins, test this please

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1548#discussion_r15289573 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1202,8 +1202,12 @@ private[scheduler] class

[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-49911032 This appears to be a reversion of d58502a1562bbfb1bb4e517ebcc8239efd639297 while ignoring and misapplying the comment regarding ordering (which I'm not completely

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1528#issuecomment-49916553 Yeah, I'm wondering whether the actual problem is that creation and use of scheduler pools with different weights is unclear or too difficult; and that if we could

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15327897 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -56,6 +58,16 @@ private[spark] class Stage( val numPartitions

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15328234 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -315,13 +309,14 @@ class DAGScheduler( */ private def

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15328329 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -315,13 +309,14 @@ class DAGScheduler( */ private def

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15328468 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -341,8 +336,9 @@ class DAGScheduler

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15328544 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -341,8 +336,9 @@ class DAGScheduler

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15328773 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -992,13 +974,14 @@ class DAGScheduler( } private

[GitHub] spark pull request: Removed some HashMaps from DAGScheduler by sto...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1561#issuecomment-49967238 JIRA?

[GitHub] spark pull request: [SPARK-1726] [SPARK-2567] Eliminate zombie sta...

2014-07-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1566#issuecomment-49968268 Makes sense. LGTM

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15331775 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -355,14 +351,13 @@ class DAGScheduler( logDebug

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15331819 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -22,6 +22,8 @@ import org.apache.spark.rdd.RDD import

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15332026 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -56,6 +58,16 @@ private[spark] class Stage( val numPartitions

[GitHub] spark pull request: [SPARK-2054][SQL] Code Generation for Expressi...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r15334407 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,458

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15356652 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -341,8 +336,9 @@ class DAGScheduler

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-24 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1561#discussion_r15361883 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -341,8 +336,9 @@ class DAGScheduler

[GitHub] spark pull request: SPARK-1715: Ensure actor is self-contained in ...

2014-07-24 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/637#issuecomment-50057749 This rebases cleanly on top of https://github.com/apache/spark/pull/1561, so let's get that one in first.

[GitHub] spark pull request: [SPARK-2529] Clean closures in foreach and for...

2014-07-24 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1583#issuecomment-50105915 Does anyone recall why we lost the closure cleaning in https://github.com/apache/spark/commit/6b288b75d4c05f42ad3612813dc77ff824bb6203 ?

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1499#issuecomment-50160338 After installing `hub` you can also do a bunch of new stuff on the command line, including `hub checkout https://github.com/apache/spark/pull/1499` https

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1548#issuecomment-50174114 @YanTangZhai If you are searching for another solution and abandoning this PR, could you please close this PR and open a new one when you have something different

[GitHub] spark pull request: Part of [SPARK-2456] Removed some HashMaps fro...

2014-07-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1561#issuecomment-50201015 @JoshRosen Why a trait instead of an abstract class? We're not expecting to need to mix in Stage outside of the Stage class hierarchy, right?

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1607#discussion_r15437121 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -110,42 +110,56 @@ class ExternalAppendOnlyMap[K, V, C

[GitHub] spark pull request: SPARK-2684: Update ExternalAppendOnlyMap to ta...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1607#discussion_r15439805 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -110,42 +110,56 @@ class ExternalAppendOnlyMap[K, V, C

[GitHub] spark pull request: [SPARK-2568] RangePartitioner should run only ...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1562#discussion_r15439847 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V

[GitHub] spark pull request: SPARK-2425 Don't kill a still-running Applicat...

2014-07-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1360#issuecomment-50281668 ping Probably too late for a 1.0.2-rc, but this should go into 1.0.3 and 1.1.0.

[GitHub] spark pull request: [Build] SPARK-2614: (2nd patch) Create a spark...

2014-07-27 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1611#discussion_r15442657 --- Diff: assembly/src/deb/control/examples/control --- @@ -0,0 +1,8 @@ +Package: [[deb.pkg.name]]-examples +Version: [[version]]-[[buildNumber

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-28 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1498#discussion_r15498156 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -17,7 +17,7 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-28 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1498#discussion_r15499105 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -691,25 +689,41 @@ class DAGScheduler

[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1056#discussion_r15538656 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-29 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1542#issuecomment-50511022 I dunno, merging a PR with no changed files doesn't sound too scary to me. Something is definitely messed up in this PR, with both `Commits` and `Files

[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1056#discussion_r15542781 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -348,4 +353,48 @@ private[spark] class Executor

[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1056#discussion_r15543587 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -38,8 +37,10 @@ import org.apache.spark._ import

[GitHub] spark pull request: SPARK-2380: Support displaying accumulator val...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1309#discussion_r15557746 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -809,12 +810,25 @@ class DAGScheduler( listenerBus.post

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-07-29 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1617#issuecomment-50552553 What is the need to expose the jobId after the job is finished?

[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1056#discussion_r15559948 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -348,4 +353,48 @@ private[spark] class Executor

[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-07-29 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1056#discussion_r15560393 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -348,4 +353,48 @@ private[spark] class Executor

[GitHub] spark pull request: [SPARK-1812][wip]

2014-08-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/996#discussion_r15702768 --- Diff: assembly/pom.xml --- @@ -26,7 +26,7 @@ </parent> <groupId>org.apache.spark</groupId> - <artifactId>spark-assembly_2.10

[GitHub] spark pull request: [SPARK-1812] Enable cross build for scala 2.11...

2014-08-04 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/996#discussion_r15743162 --- Diff: assembly/pom.xml --- @@ -26,7 +26,7 @@ </parent> <groupId>org.apache.spark</groupId> - <artifactId>spark-assembly_2.10

[GitHub] spark pull request: [SPARK-1812] Enable cross build for scala 2.11...

2014-08-04 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/996#discussion_r15770288 --- Diff: assembly/pom.xml --- @@ -26,7 +26,7 @@ </parent> <groupId>org.apache.spark</groupId> - <artifactId>spark-assembly_2.10

[GitHub] spark pull request: [SPARK-1022][Streaming] Add Kafka real unit te...

2014-08-05 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1751#issuecomment-51257040 @tdas @pwendell This broke the Maven build: ``` ~/Apache/spark(branch-1.1|✔) ➤ mvn -U -DskipTests clean install . . . [error] Apache/spark

[GitHub] spark pull request: [SPARK-2897][SPARK-2920]TorrentBroadcast does ...

2014-08-08 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1836#discussion_r16006978 --- Diff: core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala --- @@ -17,14 +17,14 @@ package org.apache.spark.broadcast

[GitHub] spark pull request: [SPARK-2886] Use more specific actor system na...

2014-08-09 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1810#discussion_r16028829 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -146,9 +146,9 @@ object SparkEnv extends Logging { } val

[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-12 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1031#issuecomment-51945400 It's definitely not cross-platform.

[GitHub] spark pull request: SPARK-2425 Don't kill a still-running Applicat...

2014-08-12 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1360#issuecomment-51959659 @pwendell Still should go into 1.1.0... The change is fairly small, and the unpatched behavior is pretty nasty for long-running applications.

[GitHub] spark pull request: SPARK-2830

2014-08-12 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1908#issuecomment-51962565 nit: That's not really an adequate title for this PR, Ameet. It should include enough description so that we can tell what it is about in the corresponding subject

[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

2014-08-12 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1909#issuecomment-51965072 Erik, you've been doing some great work on making non-lazy transforms lazy! I haven't had time to thoroughly review your recent PRs, but can you do some checks

[GitHub] spark pull request: [Build] SPARK-3624: Failed to find Spark assem...

2014-10-07 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2477#issuecomment-58293963 This certainly works, but I'm not sure that we need to maintain the complexity of having both a `jars` directory and a `lib` symlink to it. What we want

[GitHub] spark pull request: [SPARK-3944][Core] Using Option[String] where ...

2014-10-14 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2795#discussion_r18861574 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -340,8 +340,8 @@ private[spark] object Utils extends Logging { val

[GitHub] spark pull request: [SPARK-3944][Core] Code re-factored as suggest...

2014-10-15 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2810#issuecomment-59223588 ok to test

[GitHub] spark pull request: [SPARK-3944][Core] Code re-factored as suggest...

2014-10-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2810#issuecomment-59400293 Yup, LGTM. And as a general rule for the future, avoid pattern matching on Some and None. In most cases you should instead use a map, flatMap or foreach over
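
The comment above is cut off, so the following is only a minimal sketch of the general advice, assuming it refers to the standard combinators on Scala's `Option`. The object and method names (`OptionStyle`, `greet`, `parsePort`, `appendIfPresent`) are hypothetical and are not taken from the pull request.

```scala
import scala.util.Try

// Hypothetical illustration; none of these names come from the Spark code base.
object OptionStyle {
  // Style the comment advises against: matching on Some and None.
  def greetWithMatch(name: Option[String]): String = name match {
    case Some(n) => s"Hello, $n"
    case None    => "Hello, stranger"
  }

  // Same result using map plus getOrElse.
  def greet(name: Option[String]): String =
    name.map(n => s"Hello, $n").getOrElse("Hello, stranger")

  // flatMap chains a step that may itself yield no value.
  def parsePort(raw: Option[String]): Option[Int] =
    raw.flatMap(s => Try(s.trim.toInt).toOption)

  // foreach performs a side effect only when a value is present.
  def appendIfPresent(value: Option[String], sink: StringBuilder): Unit =
    value.foreach(v => sink.append(v))
}
```

Both `greetWithMatch` and `greet` return the same string for every input; the combinator form simply leaves the empty case to `getOrElse` rather than spelling out both branches.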

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-16 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18985880 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -243,6 +249,10 @@ private[spark] class Worker( System.exit

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-20 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59810037 @CodingCat, Worker is private[spark], so what is the nature of your concern? In fact, I'm wondering whether we really want the changes in this PR that make some

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-20 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59811803 A legitimate concern, and certainly something that could be worked up into a JIRA issue and separate pull request. But it's not a very pressing issue since nothing

[GitHub] spark pull request: [SQL] Update SparkSQL and ScalaTest in branch-...

2014-06-13 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1078#issuecomment-46071589 FYI Bumping all the way to the current scalatest 2.2.0 also works.

[GitHub] spark pull request: SPARK-1715: Ensure actor is self-contained in ...

2014-06-14 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/637#issuecomment-46104906 It means that the check for binary compatibility after your patch is applied has failed because the checker thinks that there previously was a default/automatic

[GitHub] spark pull request: SPARK-2158 Clean up core/stdout file from File...

2014-06-16 Thread markhamstra
GitHub user markhamstra opened a pull request: https://github.com/apache/spark/pull/1100 SPARK-2158 Clean up core/stdout file from FileAppenderSuite @tdas You can merge this pull request into a Git repository by running: $ git pull https://github.com/markhamstra/spark SPARK

[GitHub] spark pull request: SPARK-2158 Clean up core/stdout file from File...

2014-06-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1100#issuecomment-46265806 jenkins, test this please

[GitHub] spark pull request: [SPARK-2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/999#issuecomment-46389597 Is that the basic strategy we are going to use with AlphaComponents -- merging new APIs at both the minor and maintenance levels? I don't know that I have any

[GitHub] spark pull request: [SPARK-2060][SQL] Querying JSON Datasets with ...

2014-06-18 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/999#issuecomment-46436035 Hmmm, that doesn't precisely match my recollection or understanding. Certainly we discussed that alpha components aren't required to maintain a stable API, but I

[GitHub] spark pull request: Branch 1.0 Add ZLIBCompressionCodec code

2014-06-18 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1115#issuecomment-46479466 Yes, this PR is not in a useful state right now. It's hard to even find the proposed changes because of all the clutter of unnecessary commits, but it looks to me

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-20 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/686#issuecomment-46684950 ping: This should go into 1.0.1 @pwendell

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/686#discussion_r14032954 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -313,6 +314,47 @@ class DAGSchedulerSuite extends TestKit

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/686#discussion_r14040631 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1062,10 +1062,15 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/686#discussion_r14041296 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1062,10 +1062,15 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/686#discussion_r14041700 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1062,10 +1062,15 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-06-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/686#discussion_r14211667 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1062,10 +1062,15 @@ class DAGScheduler

[GitHub] spark pull request: [MLLIB] SPARK-2329 Add multi-label evaluation ...

2014-07-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1270#discussion_r14428640 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/MultilabelMetrics.scala --- @@ -0,0 +1,172 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1749] Job cancellation when SchedulerBa...

2014-07-04 Thread markhamstra
Github user markhamstra closed the pull request at: https://github.com/apache/spark/pull/686

[GitHub] spark pull request: SPARK-2425 Don't kill a still-running Applicat...

2014-07-10 Thread markhamstra
GitHub user markhamstra opened a pull request: https://github.com/apache/spark/pull/1360 SPARK-2425 Don't kill a still-running Application because of some misbehaving Executors Introduces a LOADING -> RUNNING ApplicationState transition and prevents Master from removing

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r14845258 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,421

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r14845309 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,421

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r14846043 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,421

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r14846216 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,421

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r14847035 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -0,0 +1,421
