[GitHub] spark pull request #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-03 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16790 [SPARK-19450] Replace askWithRetry with askSync. ## What changes were proposed in this pull request? `askSync` is already added in `RpcEndpointRef` (see SPARK-19347 and https

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 Thanks a lot for reviewing this PR~ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16779: [SPARK-19437] Rectify spark executor id in HeartbeatRece...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16779 Thanks a lot for reviewing this.

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-31 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98819685 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1212,8 +1223,9 @@ class DAGScheduler

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for continuing to review this~ Your comments are very helpful ~ Thank you so much for your help ~~ -when we encounter the condition where there are no pending

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin Thanks a lot for helping with this PR~ I've already refined it~ Please take another look~

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @srowen @jasonmoore2k Thanks a lot for reviewing this PR~ >Should successful and tasksSuccessful be renamed to completed and tasksCompleted? What do you think about ab

[GitHub] spark pull request #16807: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16807 [SPARK-19398] Change one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark pull request #16738: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
GitHub user jinxing64 reopened a pull request: https://github.com/apache/spark/pull/16738 [SPARK-19398] Change one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark pull request #16807: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16807

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-04 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 I just changed the log message, but I'm not sure if it is clear enough.

[GitHub] spark pull request #16738: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16738

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks a lot for the comments. I've already refined accordingly. I still have one concern: > If this is a correct description, I’d ar

[GitHub] spark pull request #16831: [SPARK-19263] Fix race in SchedulerIntegrationSui...

2017-02-07 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16831 [SPARK-19263] Fix race in SchedulerIntegrationSuite. ## What changes were proposed in this pull request? The whole process of offering resources and generating `TaskDescription` should

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout @squito This was originally raised by @squito when reviewing https://github.com/apache/spark/pull/16620. Sorry for being so eager to make this small PR.

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @squito Thanks a lot for the review.

[GitHub] spark pull request #16780: [SPARK-19438] Both reading and updating executorD...

2017-02-02 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16780

[GitHub] spark issue #16780: [SPARK-19438] Both reading and updating executorDataMap ...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16780 Thanks a lot for looking into this~ @zsxwing You are right. My understanding about this is incorrect. `CoarseGrainedSchedulerBackend: DriverEndpoint` is a `ThreadSafeRpcEndpoint`, thus

[GitHub] spark issue #16779: [SPARK-19437] Rectify spark executor id in HeartbeatRece...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16779 @zsxwing Thanks a lot for reviewing this. Not sure why the test doesn't start automatically.

[GitHub] spark pull request #16779: [SPARK-19437] Rectify spark executor id in Heartb...

2017-02-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16779 [SPARK-19437] Rectify spark executor id in HeartbeatReceiverSuite. ## What changes were proposed in this pull request? In the current code in `HeartbeatReceiverSuite`, executorId is set

[GitHub] spark pull request #16780: [SPARK-19438] Both reading and updating executorD...

2017-02-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16780 [SPARK-19438] Both reading and updating executorDataMap should be guarded by CoarseGrainedSchedulerBackend.this.synchronized when handle RegisterExecutor. ## What changes were proposed

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @srowen Thanks a lot. I'll refine : )

[GitHub] spark pull request #16808: [SPARK-19461] Remove some unused imports.

2017-02-05 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16808 [SPARK-19461] Remove some unused imports. ## What changes were proposed in this pull request? Remove some unused imports in `CoarseGrainedSchedulerBackend` and `YarnSchedulerBackend

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 Thanks a lot for the comments. Actually I'm still not sure whether to change this log or just remove it. I just think the log is confusing. It is printed on every FetchFailed. Please give some

[GitHub] spark pull request #16808: [SPARK-19461] Remove some unused imports.

2017-02-05 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16808

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 As @squito mentioned: >Before this, the DAGScheduler didn't really know anything about taskSetManagers. (In its current form, this pr uses a "leaked" handle via rootPool.getSorte

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks a lot for reviewing this pr thus far. I do think the approach, which throws away task results from earlier attempts that were running on executors

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @markhamstra @squito @kayousterhout It would be great if you could give more comments on the above so I can continue working on this : )

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout Thanks a lot for the review. I've already refined it.

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout Thanks a lot. I'm not sure how to start the unit tests automatically; do I have the right to do that? BTW, may I ask: what is the proper way to run the unit test

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @squito Many thanks for your help. You are such a kind person : )

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Thanks a lot again : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for helping with this PR thus far. I've added a unit test in `DAGSchedulerSuite`, but I'm not sure if it is exactly what you suggested. I created a `mockTaskSchedulerImpl

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Thanks a lot for helping with this PR thus far. I think the proposal is quite clear. I've already refined it. Please take another look.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Would you please take another look at this? Please give some advice if possible so I can continue working on this : )

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Would you please take a look at this? It would be great if you could help review this : )

[GitHub] spark pull request #16690: [SPARK-19347] ReceiverSupervisorImpl can add bloc...

2017-01-24 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16690 [SPARK-19347] ReceiverSupervisorImpl can add block to ReceiverTracker multiple times because of askWithRetry. ## What changes were proposed in this pull request? `ReceiverSupervisorImpl

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for your comments, they are very helpful. I've already refined the code, please take another look : ) When handling `Success` of `ShuffleMapTask`, what I want

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 >hmm, this is a nuisance. I don't see any good way to get rid of this sleep ... but now that I think about it, why can't you do this in DAGSchedulerSuite? it seems like this can be entir

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98043010 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -718,6 +703,21 @@ private[spark] class TaskSetManager

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin @zsxwing ping for review~

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin ping for review

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito ping for review~~

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98488916 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -718,6 +703,21 @@ private[spark] class TaskSetManager

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 I'm very sorry if this is disturbing : ) @vanzin Thanks a lot for continuing to review this; I'll be more patient : ) Sorry again~~

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-01-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 Should `successful` and `tasksSuccessful` be renamed to `completed` and `tasksCompleted`? I think that makes more sense.

[GitHub] spark pull request #16738: [SPARK-19398] remove one misleading log in TaskSe...

2017-01-29 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16738 [SPARK-19398] remove one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 This fails to pass the unit tests. I will keep working on it.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @markhamstra Thanks a lot for your comment. I've already refined it; please take another look ~

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Could you please take another look at this? : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks for all of your work on this patch. I really appreciate your help : )

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-02-21 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @squito Thanks a lot for your comments : ) Yes, there must be a design doc for discussion. I will prepare one and post a PDF to JIRA.

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @kayousterhout @squito Would you mind taking a look at this when you have time?

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout I'll close this since this functionality is already tested.

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-20 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16901

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-02-19 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16989 [SPARK-19659] Fetch big blocks to disk when shuffle-read. ## What changes were proposed in this pull request? Currently the whole block is fetched into memory (off-heap by default) when

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-02-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @vanzin @squito Would you mind taking a look at this when you have time?

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 @srowen @vanzin Thanks a lot for the work on this ~

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @srowen What do you think about https://github.com/apache/spark/pull/16790?

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 https://github.com/apache/spark/pull/16690#discussion_r101616883 causes the build to produce lots of deprecation warnings. @srowen @vanzin What do you think about this?

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout Thanks a lot for the clear explanation. It makes great sense to me and helps me understand the logic a lot. Also I think the way of testing is very good and makes the code very

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-12 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16901 [SPARK-19565] Improve DAGScheduler tests. ## What changes were proposed in this pull request? This is related to #16620. When a fetch fails, the stage will be resubmitted. There can

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout @squito @markhamstra As mentioned in #16620, I think it might make sense to make this PR. Please take a look. If you think it is too trivial, I will close it.

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-02-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r100953546 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2161,6 +2161,58 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout I've refined accordingly, please take another look : )

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout I've refined accordingly. Sorry for the stupid mistake I made. Please take another look at this : )

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16901#discussion_r100968529 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2161,6 +2161,48 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-02-08 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16867 [SPARK-16929] Improve performance when check speculatable tasks. ## What changes were proposed in this pull request? When checking speculatable tasks in `TaskSetManager`, the current code scans

[GitHub] spark issue #16876: [SPARK-19537] Move pendingPartitions to ShuffleMapStage.

2017-02-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16876 It's great to have pendingPartitions in ShuffleMapStage.

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout Thanks a lot. Sorry for this and I'll be careful in the future.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot. I've refined the comment, please take another look.

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @squito Thanks a lot for your comments. I've refined the comment.

[GitHub] spark issue #16876: [SPARK-19537] Move pendingPartitions to ShuffleMapStage.

2017-02-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16876 @kayousterhout It's great to give a definition of `pendingPartitions` in `ShuffleMapStage`. May I ask a question to make my understanding of `pendingPartitions` clear? It means

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 Yes, refined : )

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 Both `askSync` and `askWithRetry` are blocking; the only difference is the "retry" (default is 3 times) when the RPC fails. Callers of this method do not necessarily rely on t
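
To illustrate the difference described in the comment above, here is a minimal sketch in plain Scala. It does not use Spark's `RpcEndpointRef`; `FlakyTracker`, `askOnce`, and `askRetrying` are hypothetical names invented for this example, and a lost reply is simulated by throwing an exception after the side effect has already happened. The default of 3 attempts mirrors the figure quoted in the comment.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.{Failure, Try}

object RetryDemo {
  // Hypothetical stand-in for a remote endpoint: it records every block it is
  // asked to add, but the reply to the first request is "lost" (simulated here
  // by throwing after the block has already been recorded).
  final class FlakyTracker {
    val blocks: ArrayBuffer[String] = ArrayBuffer.empty[String]
    private var repliesLost = 0

    def addBlock(id: String): Boolean = {
      blocks += id // the side effect always happens
      if (repliesLost < 1) {
        repliesLost += 1
        throw new RuntimeException("reply timed out")
      }
      true
    }
  }

  // One blocking attempt; a failure is surfaced to the caller.
  def askOnce(tracker: FlakyTracker, id: String): Try[Boolean] =
    Try(tracker.addBlock(id))

  // Blocking ask that re-sends the same message, up to maxAttempts times.
  def askRetrying(tracker: FlakyTracker, id: String, maxAttempts: Int = 3): Try[Boolean] = {
    var attempt = 0
    var result: Try[Boolean] = Failure(new IllegalStateException("not attempted"))
    while (attempt < maxAttempts && result.isFailure) {
      result = Try(tracker.addBlock(id))
      attempt += 1
    }
    result
  }
}
```

In this sketch the retrying call eventually succeeds but leaves the block recorded twice, while the single attempt fails visibly and records it once. That is the kind of duplicated side effect a retry can cause when the receiving side is not idempotent.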

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should handle stage's pending...

2017-01-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito `SchedulerIntegrationSuite` is very helpful. I like it very much; I can reproduce this issue in `SchedulerIntegrationSuite` now. Fixing this issue is more complicated than I

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should handle stage's ...

2017-01-17 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16620 [SPARK-19263] DAGScheduler should handle stage's pendingPartitions properly in handleTaskCompletion. ## What changes were proposed in this pull request? In current `DAGScheduler

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @vanzin Thanks for your comments. I have changed the unit test. Could you take another look?

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 ping

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @vanzin Sorry for the stupid mistake I made. I've changed. Please take another look.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @markhamstra @squito Thanks a lot for your helpful comments. I made a unit test for this fix and changed the patch. Now it can pass all unit tests for me locally. In this fix: add

[GitHub] spark issue #16867: [WIP][SPARK-16929] Improve performance when check specul...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @squito Thanks a lot for your comments : ) >When checking speculatable tasks in TaskSetManager, the current code scans all task infos and sorts the durations of successful tasks in O(N log N) t
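
The quoted description is cut off before it says how the sort is avoided, so the following is only a sketch of one common technique, not necessarily what the pull request does: keeping a running median with two heaps, so that each new task duration costs O(log N) to insert and the median can be read in O(1). `RunningMedian` and its method names are hypothetical.

```scala
import scala.collection.mutable

object MedianDemo {
  // Hypothetical helper for illustration; not taken from the pull request.
  final class RunningMedian {
    // Max-heap holding the smaller half of the durations.
    private val lower = mutable.PriorityQueue.empty[Long]
    // Min-heap holding the larger half of the durations.
    private val upper = mutable.PriorityQueue.empty[Long](Ordering[Long].reverse)

    def insert(durationMs: Long): Unit = {
      if (lower.isEmpty || durationMs <= lower.head) lower.enqueue(durationMs)
      else upper.enqueue(durationMs)
      // Rebalance so the two heap sizes differ by at most one.
      if (lower.size > upper.size + 1) upper.enqueue(lower.dequeue())
      else if (upper.size > lower.size + 1) lower.enqueue(upper.dequeue())
    }

    def median: Double = {
      require(lower.nonEmpty || upper.nonEmpty, "no durations recorded")
      if (lower.size > upper.size) lower.head.toDouble
      else if (upper.size > lower.size) upper.head.toDouble
      else (lower.head + upper.head) / 2.0
    }
  }
}
```

A caller would insert each successful task's duration as it finishes and read `median` whenever speculation is checked, instead of re-sorting all durations on every check.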

[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @squito I've uploaded a design doc to jira, please take a look when you have time :)

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r103391138 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -911,14 +916,14 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #17133: [SPARK-19793] Use clock.getTimeMillis when mark t...

2017-03-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17133 [SPARK-19793] Use clock.getTimeMillis when mark task as finished in TaskSetManager. ## What changes were proposed in this pull request? TaskSetManager is now using

[GitHub] spark issue #17133: [SPARK-19793] Use clock.getTimeMillis when mark task as ...

2017-03-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17133 I found this while working on https://github.com/apache/spark/pull/17112, which is for measuring the approach I proposed in https://github.com/apache/spark/pull/16867.

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-03-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17111 @kayousterhout Thanks for merging. (btw, I made some measurements for https://github.com/apache/spark/pull/16867 SPARK-16929, please take a look when you have time :) )

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @kayousterhout @squito It's great to open a new jira for this change. Please take a look at https://github.com/apache/spark/pull/17111.

[GitHub] spark issue #17112: Measurement for SPARK-16929.

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17112 The added unit test "Measurement for SPARK-16929." is the measurement. In TaskSetManagerSuite.scala line 1049, if `newAlgorithm=true`, `successfulTaskIdsSet` will be used to get

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 I added a measurement for this PR in #17112. Results are below; `newAlgorithm` indicates whether we use `TreeSet` to get the median duration or not, and `time cost` is the time used when get

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17111 cc @kayousterhout @squito

[GitHub] spark pull request #17112: Measurement for SPARK-16929.

2017-02-28 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17112 Measurement for SPARK-16929. ## What changes were proposed in this pull request? This PR is not intended for merging. It's a measurement for https://github.com/apache/spark/pull/16867

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 ping @zsxwing @vanzin Could you take another look at this, please?

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @vanzin @zsxwing Thanks a lot for your comments. I will file another JIRA to add a blocking version of `ask`. What else can I do for this PR : ) ?

[GitHub] spark issue #16503: [SPARK-18113] canCommit should return same when called b...

2017-01-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @zsxing, @vanzin I think using `ask` in method `canCommit` may not be suitable, because `ask` returns a Future, but it should be a blocking process to get result

[GitHub] spark issue #16503: [SPARK-18113] canCommit should return same when called b...

2017-01-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 >If we can remove uses of askWithRetry as we find these issues, we can, at some point, finally get rid of the API altogether. What do you think about providing a *"blockin

[GitHub] spark issue #16503: [SPARK-18113] canCommit should return same when called b...

2017-01-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @vanzin Thanks a lot for your comment. It's very helpful. I'll change it to `ask`. I think it makes sense to keep receiver idempotent when handling `AskPermissionToCommitOutput`, even

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @ash211 Thanks a lot for your comment. I've already fixed the failing Scala style tests. Running `./dev/scalastyle` passed. Could you take another look?

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @ash211 Thank you so much for your comment. I've changed it accordingly. Could you please take another look?

[GitHub] spark pull request #16503: [SPARK-18113] Use ask to replace askWithRetry in ...

2017-01-14 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16503#discussion_r96120047 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -221,6 +232,17 @@ private case class
