[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-14 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-859233951 sgtm.

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-10 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-858718082 I see. And to ensure we're on the same page - for the `RequiredResourceLocation`, how would you provide the PVC info there? IIUC, you want to put the PVC info there,
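For context, a hedged sketch of what carrying PVC info in the proposed `RequiredResourceLocation` might look like. Neither the class nor its fields existed in Spark at this point; everything below is illustrative guesswork based on this thread:

```scala
// Illustrative sketch only: `RequiredResourceLocation` was a proposal in this
// discussion, not an existing Spark class. The fields are hypothetical.
case class RequiredResourceLocation(
    resourceName: String, // e.g. "pvc"
    resourceId: String)   // e.g. the Kubernetes PVC claim name backing the state store
```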

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-10 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-858375487 > The mapping could be between specific resources (e.g. PVC) and task (i.e. state store). Your rephrase looks good except for one point here. "task (i.e. state

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-09 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-857475592 @tgravescs what's your opinion on the `StateStoreTaskLocation` proposal?
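For readers following along, a rough sketch of the `StateStoreTaskLocation` idea, modeled on Spark's existing `TaskLocation` subtypes such as `ExecutorCacheTaskLocation`. The real `TaskLocation` is a `private[spark]` sealed trait, so an actual implementation would live in `org.apache.spark.scheduler`; the standalone trait below is only a stand-in:

```scala
// Stand-in for Spark's private sealed TaskLocation trait, for illustration only.
trait TaskLocationSketch { def host: String }

// Hypothetical location type binding a task to the executor that holds its
// state store; all names here reflect the proposal, not merged Spark code.
case class StateStoreTaskLocation(
    host: String,
    executorId: String,
    stateStoreId: String) extends TaskLocationSketch {
  override def toString: String = s"statestore_${host}_${executorId}_$stateStoreId"
}
```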

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-09 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-857475125 > Isn't the mapping still executor id <-> statestore? Executor id could change due to executor loss. A more robust mapping, e.g. for our use-case, might be PVC id <->
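A minimal sketch (all identifiers hypothetical) of why keying the mapping by PVC id is more robust than keying it by executor id: when an executor dies and its PVC is re-mounted elsewhere, the state store association survives the re-mount:

```scala
import scala.collection.mutable

// state store id -> PVC id holding its data (stable across executor loss)
val stateStoreToPvc = mutable.Map("store-part-0" -> "pvc-a")

// PVC id -> executor currently mounting it (changes as executors come and go)
var pvcToExecutor = Map("pvc-a" -> "exec-1")

def preferredExecutor(storeId: String): Option[String] =
  stateStoreToPvc.get(storeId).flatMap(pvcToExecutor.get)

pvcToExecutor = Map("pvc-a" -> "exec-7") // exec-1 lost, PVC re-mounted on exec-7
assert(preferredExecutor("store-part-0").contains("exec-7"))
```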

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-04 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-854768806 > if user can't specify it themselves with stage level api, you are saying Spark would internally do it for the user? Yes. And we can add a new conf for users to control

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-04 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-853697668 (Sorry about the late reply.) > There is an assertion that dynamic allocation should be enabled under stage-level scheduling. I mean if we remove such an assertion, will it

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-24 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-847498828 > So, does it mean I can remove the dynamic allocation check for our case without affecting classic stage-level scheduling? Which check are you referring to? >>

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-24 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-846905399 > BTW, @Ngone51 Is dynamic allocation required for stage-level scheduling? It's required for the classic use case (when you really need to change executor resources
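The classic path referred to here uses the real stage-level scheduling API added in Spark 3.1; a new `ResourceProfile` generally means requesting new executors, which is why dynamic allocation is asserted. A minimal example (`sc` is assumed to be an existing `SparkContext`):

```scala
import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}

// Build a profile asking for different executor/task resources for one stage.
val profile = new ResourceProfileBuilder()
  .require(new ExecutorResourceRequests().cores(4).memory("8g"))
  .require(new TaskResourceRequests().cpus(2))
  .build()

// Stages computed from this RDD are scheduled on executors matching the profile;
// satisfying it typically requires dynamic allocation to spin up new executors.
val rdd = sc.parallelize(1 to 100, numSlices = 10).withResources(profile)
```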

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-23 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-846713976 > If I understand correctly about stage level scheduling, you still need to specify "all" resources needed for "all" tasks in StateRDD; while that may prevent Spark from scheduling
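To make the limitation concrete: a `ResourceProfile` (real API; the "pvc" resource name is an assumption for this use-case) applies uniformly to every task in the stage, so it can only express "each task needs one pvc", never which specific one:

```scala
import org.apache.spark.resource.{ResourceProfileBuilder, TaskResourceRequests}

// One profile for the whole stage: every task gets the same requirement, so a
// per-partition binding (this task -> that PVC) cannot be expressed here.
val uniformProfile = new ResourceProfileBuilder()
  .require(new TaskResourceRequests().cpus(1).resource("pvc", 1))
  .build()
```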

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-846336246 > If the persistent volume is a resource, then it will have to be there on executor startup, so I guess a new executor checks for it on startup and advertises it. At that
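For reference, this is roughly how custom resources are advertised today via Spark's real discovery-script mechanism; the "pvc" resource name and script path below are assumptions for this use-case, not shipped defaults:

```scala
import org.apache.spark.SparkConf

// Each executor runs the discovery script at startup and reports the resource
// addresses it finds back to the driver, which then schedules against them.
val conf = new SparkConf()
  .set("spark.executor.resource.pvc.amount", "1")
  .set("spark.executor.resource.pvc.discoveryScript", "/opt/spark/scripts/discoverPvc.sh")
```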

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-845771355 > Let me know if I misunderstand it. So I think the idea is, for example, we set a flag on the task set of the first micro-batch. The flag tells the Spark scheduler that we want

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-845752591 > Oh, I mean we don’t use HDFS for state store reconstruction. Ah, for an HDFS-based state store, I was trying to explain how stage-level scheduling should work for it

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-845745942 > Oh, I mean we don't use HDFS for state store reconstruction.

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-845740021 > No. In our use-case, we want to get rid of HDFS for state store checkpoint. So the task will wait until the PVC re-mounts to another new executor. Our state store is

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-21 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-845694478 > For the stage level scheduling option, is the state store essentially the same across all executors? No. Tasks with different partition ids must use different
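A small sketch of the per-partition identity being described; it loosely mirrors Spark's real `StateStoreId`, which is keyed by checkpoint location, operator id, and partition id (field names simplified here):

```scala
// Simplified stand-in for Spark's StateStoreId; two tasks with different
// partition ids never share a store, so their preferred locations differ too.
case class StoreKey(checkpointLocation: String, operatorId: Long, partitionId: Int)

val p0 = StoreKey("/chk", operatorId = 0, partitionId = 0)
val p1 = StoreKey("/chk", operatorId = 0, partitionId = 1)
assert(p0 != p1)
```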

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-20 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-844858158 > However I'm not sure if stage level scheduling can deal with the executor-lost case. Based on the above comment, it seems it cannot. That will be a major concern for the use-case here.

[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-20 Thread GitBox
Ngone51 commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-844775582 Thanks for the ping @xuanyuanking > does it fit into stage level scheduling This sounds feasible to me. We can treat the state store as a resource for the