[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59807896 @JoshRosen , this is awesome to test Spark integration with Docker @mccheah , this PR is LGTM now, except that we exposed too many should-be-private members in

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59810532 @markhamstra , yeah, my concern is just this, though Worker is marked as private[spark], is it a good practice to expose every detail in the implementation to the

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59814026 sure, I created the JIRA: https://issues.apache.org/jira/browse/SPARK-4011 --- If your project is set up for it, you can reply to this email and have your reply

[GitHub] spark pull request: SPARK-4012: call tryOrExit instead of logUncau...

2014-10-20 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/2864 SPARK-4012: call tryOrExit instead of logUncaughtExceptions in ContextCleaner When running a "might-be-memory-intensive" application locally, I received the following

[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

2014-10-21 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r19185382 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -30,7 +30,8 @@ import org.apache.spark.util.ActorLogReceive private[spark

[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

2014-10-21 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60039787 ![image](https://cloud.githubusercontent.com/assets/678008/4731666/589ce496-59af-11e4-99fd-01e4b37d7fef.png) ![image](https://cloud.githubusercontent.com

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60075293 @rxin I'm not sure, maybe we don't need that, because currently all RDD blocks are not reported via network, but by calling post(...) from the driver -

[GitHub] spark pull request: [SPARK-4012] call tryOrExit instead of logUnca...

2014-10-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2864#issuecomment-60179940 Hi, @andrewor14, the issue here is that the JVM's default UncaughtExceptionHandler does not seem to handle the exception correctly, as I said in the PR description, it will request user

[GitHub] spark pull request: [SPARK-4012] call tryOrExit instead of logUnca...

2014-10-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2864#issuecomment-60181397 this is also very similar to https://github.com/apache/spark/pull/622/, where the main thread cannot handle the exception thrown by the akka's scheduler t

[GitHub] spark pull request: [SPARK-4012] call tryOrExit instead of logUnca...

2014-10-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2864#issuecomment-60181601 but I don't mind to grab that ExecutorUncaughtExceptionHandler to somewhere else to make it more general

[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-60237457 ping

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60308342 @andrewor14 do you want to take a look at this patch?

[GitHub] spark pull request: SPARK-4067: refactor ExecutorUncaughtException...

2014-10-23 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/2913 SPARK-4067: refactor ExecutorUncaughtExceptionHandler https://issues.apache.org/jira/browse/SPARK-4067 currently , we call Utils.tryOrExit everywhere AppClient Executor

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60315631 @rxin, I see. Then I will try to refactor the reporting mechanism (currently piggybacked on the heartbeat) to make it more general

[GitHub] spark pull request: [SPARK-4012] call tryOrExit instead of logUnca...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2864#issuecomment-60316623 @andrewor14, (just met a LiveListenerBus uncaught exception this afternoon) personally, I feel that we shall stop the driver when such things happen

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60319128 Hi, @shivaram, do you mean we send the report with tell instead of askDriverWithReply? hmmm... what's the original motivation to send Bloc

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-60319425 I mean Akka's tell, not ``` private def tell(message: Any) { if (!askDriverWithReply[Boolean](message)) { throw new SparkExce

[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] elimin...

2014-10-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-60382812 Hi, @mateiz @markhamstra, do you want to review this further?

[GitHub] spark pull request: [SPARK-4067] refactor ExecutorUncaughtExceptio...

2014-10-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2913#issuecomment-60447582 @andrewor14 thanks

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-10-30 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-61180867 haven't forgotten this, I will get it done tomorrow

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-61484584 so, this one is ready for further review @shivaram @rxin @andrewor14

[GitHub] spark pull request: [SPARK-4012] call tryOrExit instead of logUnca...

2014-11-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2864#issuecomment-61489536 Hey, @andrewor14 , Any further thoughts about this PR?

[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] elimin...

2014-11-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-61494925 still flaky test...@mateiz, shall we get this merged before 1.2 release?

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r19747011 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd( extends

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r19752830 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockId.scala --- @@ -30,12 +30,13 @@ import org.apache.spark.annotation.DeveloperApi * If your

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r19755610 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -19,7 +19,7 @@ package org.apache.spark import akka.actor.Actor

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r19769048 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd( extends

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-03 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-61557580 just addressed the other comments, thanks @shivaram

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2014-11-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-61877783 ping

[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] elimin...

2014-11-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-61878006 ping

[GitHub] spark pull request: [SPARK-1771] stop all executors when driverAct...

2014-11-05 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/3129 [SPARK-1771] stop all executors when driverActor restarts In current implementation, when driverActor restarts due to some error, it will clean the executorDataMap but without explicitly sending

[GitHub] spark pull request: [SPARK-1771] stop all executors when driverAct...

2014-11-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/3129#issuecomment-61922823 one thing I am unsure about: when sending StopExecutor, shall we use the ask pattern to ensure reliable delivery?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973073 sure, because other people told me some of the parameters are not supposed to be configurable...so I pend the work here...I can go through it again to check the

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-09-07 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-54745191 will update the PR today, partially finished

[GitHub] spark pull request: [SPARK-1191][RESUBMIT]Missing document of some...

2014-09-07 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/2312 [SPARK-1191][RESUBMIT]Missing document of some parameters in Spark Core It's a resubmission of https://github.com/apache/spark/pull/85 according to @mateiz 's suggestion

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-09-07 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-54747068 Hi, @mateiz I made a new version: https://github.com/apache/spark/pull/2312 close this one, thanks

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-09-07 Thread CodingCat
Github user CodingCat closed the pull request at: https://github.com/apache/spark/pull/85

[GitHub] spark pull request: [SPARK-1191][RESUBMIT]Missing document of some...

2014-09-07 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2312#issuecomment-54747100 ignore the first 6 parameters...they duplicate the ones in spark-standalone.md...

[GitHub] spark pull request: SPARK-1143: refactor TaskSchedulerImplSuite

2014-09-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/339#issuecomment-56505176 I will rebase the PR...I think I was waiting for the review?

[GitHub] spark pull request: SPARK-3642. Document the nuances of shared var...

2014-09-23 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2490#discussion_r17949828 --- Diff: docs/programming-guide.md --- @@ -1183,6 +1188,10 @@ running on the cluster can then add to it using the `add` method or the `+=` ope

[GitHub] spark pull request: [SPARK-732][RESUBMIT] make if allowing duplica...

2014-09-24 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/2524 [SPARK-732][RESUBMIT] make if allowing duplicate update as an option of accumulator In current implementation, the accumulator will be updated for every successfully finished task, even the task

[GitHub] spark pull request: [SPARK-732][SPARK-3628][RESUBMIT] make if allo...

2014-09-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-56719035 @mateiz @mridulm @kayousterhout @markhamstra @pwendell I proposed this as a resubmission of https://github.com/apache/spark/pull/228 Expecting your review

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-09-24 Thread CodingCat
Github user CodingCat closed the pull request at: https://github.com/apache/spark/pull/228

[GitHub] spark pull request: [SPARK-732][SPARK-3628][RESUBMIT] make if allo...

2014-09-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2524#issuecomment-56748597 OK...I will make MIMA happy.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-16 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37778677 Hi, @mateiz , I think the current implementation allows the fraction == 0 case, or did I misunderstand something?

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-18 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37933248 Hi, @mateiz, thank you for reviewing this, just added the test cases for both old&new API-based saveAsHadoopDataset

[GitHub] spark pull request: SPARK-1235: fail all jobs when DAGScheduler cr...

2014-03-19 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/186 SPARK-1235: fail all jobs when DAGScheduler crashes for some reason https://spark-project.atlassian.net/browse/SPARK-1235 In the current implementation, the running job will hang if the

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-38162769 Hi, @aarondav , I think it's ready for further review, thank you!

[GitHub] spark pull request: SPARK-1235: fail all jobs when DAGScheduler cr...

2014-03-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38164160 hey, @markhamstra, that's my bad, I just added the logging lines BTW, I also thought to restart the running stage which causes the restart of the dagSche

[GitHub] spark pull request: SPARK-1235: fail all jobs when DAGScheduler cr...

2014-03-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38205376 Hey, @markhamstra, how about the current one?

[GitHub] spark pull request: SPARK-1235: fail the problematic job(s) when D...

2014-03-21 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38340804 Hi, all, I read all your comments carefully, and I agree that killing the SparkContext may be a better solution for now... Thank you very much for your suggestions

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38352546 Jenkins was dead last time; I just rebased the code (to trigger Jenkins) and all test cases passed.

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38380394 @mateiz I fixed that, I checked if DAGScheduler is null in runJob of SparkContext

[GitHub] spark pull request: SPARK-1298: remove duplicate partitionID check...

2014-03-23 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/210 SPARK-1298: remove duplicate partitionID checking In the current implementation, we check whether partitionIDs is >=0 & < maxPartitionNum in SparkContext.runJob() https://

[GitHub] spark pull request: SPARK-1299: making comments of RDD.doCheckpoin...

2014-03-23 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/211 SPARK-1299: making comments of RDD.doCheckpoint consistent with its usage another trivial thing I found occasionally, the comments of consistent is saying that /** Performs the

[GitHub] spark pull request: SPARK-1298: remove duplicate partitionID check...

2014-03-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/210#issuecomment-38395638 it's weird, how can my modification affect the streaming part, especially this case..."recovery with file input stream.recovery with file input stream "

[GitHub] spark pull request: SPARK-1299: making comments of RDD.doCheckpoin...

2014-03-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/211#issuecomment-38395789 Jenkins doesn't like me...cannot pull remote repo. ``` GitHub pull request #211 of commit 79559076ea6facdc903c5a6d45bef4ae78adaf95 automatically m

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-23 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38412099 Done

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884042 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -93,6 +96,10 @@ private[spark] class TaskSchedulerImpl( val

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884024 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -93,6 +96,10 @@ private[spark] class TaskSchedulerImpl( val

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884203 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -243,12 +275,18 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884189 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -243,12 +275,18 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884300 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884378 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884645 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -30,6 +30,9 @@ import scala.util.Random import org.apache.spark

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10884928 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38476549 I adjusted the code to capture the exception inside the processEvent function, so that we can easily test the function in DAGSchedulerSuite

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10896811 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-38483021 @aarondav Done. One additional change is that I changed the code of SparkHiveHadoopWriter in the sql project, since we changed the package of SparkHadoopWriter

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10899913 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -31,6 +32,7 @@ import org.apache.spark._ import

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-38499820 thank you @aarondav , it seems that Jenkins has not been happy these weeks, I met some weird errors in other PRs

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-38530991 Thank you, Aaron

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10918466 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/214#issuecomment-38533226 Hi, @kayousterhout, you mean CoarseClusterSchedulerBackend block, instead of DAG?

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10918503 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/214#issuecomment-38533621 Oh, sorry, it's DAG

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-24 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/214#discussion_r10919090 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-1141] [WIP] Parallelize Task Serializat...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/214#issuecomment-38556226 Hey, @qqsun8819 , after the second thought on whether task serialization function should call the function directly or send a message to the ClusterSchedulerBackend, I

[GitHub] spark pull request: SPARK-1299: making comments of RDD.doCheckpoin...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/211#issuecomment-38578886 anyone wants to look at this small fix?

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/228 SPARK-732: eliminate duplicate update of the accumulator https://spark-project.atlassian.net/browse/SPARK-732?filter=-1 In current implementation, the accumulator will be updated for every

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-38605061 Hi, @JoshRosen mind reviewing this?

[GitHub] spark pull request: SPARK-1298: remove duplicate partitionID check...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/210#issuecomment-38605165 Hi, anyone wants to look at this?

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/228#discussion_r10948890 --- Diff: core/src/main/scala/org/apache/spark/Accumulators.scala --- @@ -43,8 +44,10 @@ class Accumulable[R, T] ( param: AccumulableParam[R, T

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/228#discussion_r10949250 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -925,6 +939,10 @@ class DAGScheduler

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38623391 @mateiz @kayousterhout @markhamstra I think the running jobs have been canceled in this patch? SparkContext.stop() will call DAGScheduler.stop(), which sends a

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-38624402 hehe, Jenkins didn't work. I will address all of your comments, thank you @kayousterhout @mridulm

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38624179 sure, I can cancel all jobs immediately after the exception is captured, but wait for Mark's response on whether we want to use akka native fault-tolerance meth

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-38636442 @kayousterhout @mridulm addressed the concurrency issue in Accumulator and handled the speculative case in DAGScheduler (identify the speculative task by checking if there is

[GitHub] spark pull request: SPARK-1324: SparkUI Should Not Bind to SPARK_P...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/231#issuecomment-38637237 Will this cause problems on EC2, where the bindHost is usually the private IP? If the JettyServer is started with the private IP address, the user cannot

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-25 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-38642649 Did Jenkins die again?

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-26 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38674968 Working on an Akka-based solution.

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-26 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38769355 Hi @kayousterhout @markhamstra @mateiz @pwendell, I just committed the supervisor-based solution; it should work as expected (the supervised crashes due to an

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-27 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38883437 It seems they are migrating the CI infrastructure?

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38917608 Finally, Jenkins is back to normal; thank you @pwendell for your work to fix this. @kayousterhout @mateiz @markhamstra how about the current one? I think

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38934195 eh... the "recovery with file input stream" case in the streaming part failed, which is a bit weird; I have hit it 3 times now, I think Mark also m

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38955584 Can anyone help to re-test it? @kayousterhout @markhamstra

[GitHub] spark pull request: [WIP]SPARK-732: eliminate duplicate update of ...

2014-03-28 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-38987421 1. refactored the code, 2. cleared stageIdtoAccumulator when the stage is aborted, 3. Hi @markhamstra, I checked the code; the locally running job does not seem to have the

[GitHub] spark pull request: SPARK-1235: kill the application when DAGSched...

2014-03-29 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/186#issuecomment-38997523 It seems that when Jenkins is very busy, some weird things can happen; in the last test, DAGScheduler even failed to create eventProcessingActor... I'm rete

[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...

2014-03-29 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-39007448 Hi Kay, I will think about it and see if we can move accumulator-related functionality to tm entirely.
