Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-859233951
sgtm.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-858718082
I see.
And to ensure we're on the same page - for the `RequiredResourceLocation`,
how would you provide the PVC info there? IIUC, you want to put the PVC info
there,
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-858375487
> The mapping could be between specific resources (e.g. PVC) and task (i.e.
state store).
Your rephrase looks good except for one point here. "task (i.e. state
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-857475592
@tgravescs what's your opinion on the `StateStoreTaskLocation` proposal?
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-857475125
> Isn't the mapping still executor id <-> state store? The executor id could
change due to executor loss. A more robust mapping, e.g. for our use-case,
might be PVC id <->
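The robustness argument above is that a mapping keyed by PVC id survives executor replacement, while one keyed by executor id does not. A minimal sketch of that idea in plain Python (all names here are hypothetical, not Spark internals):

```python
# Illustrative sketch only: a state-store mapping keyed by PVC id rather
# than executor id, so it survives executor replacement.

class PvcStateStoreMap:
    def __init__(self):
        self._partition_by_pvc = {}   # pvc_id -> state store partition id
        self._pvc_of_executor = {}    # executor_id -> pvc_id

    def register_executor(self, executor_id, pvc_id):
        # Called whenever an executor (re)registers with its mounted PVC.
        self._pvc_of_executor[executor_id] = pvc_id

    def assign(self, pvc_id, partition_id):
        self._partition_by_pvc[pvc_id] = partition_id

    def partition_for(self, executor_id):
        # Lookup goes through the PVC, so losing an executor and re-mounting
        # the PVC on a new executor does not invalidate the mapping.
        pvc_id = self._pvc_of_executor.get(executor_id)
        return self._partition_by_pvc.get(pvc_id)
```

If `exec-1` is lost and `pvc-0` re-mounts on `exec-9`, `partition_for("exec-9")` still resolves to the same partition, which is the property an executor-id-keyed map lacks.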
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-854768806
> if user can't specify it themselves with stage level api, you are saying
Spark would internally do it for the user?
Yes. And we can add a new conf for users to control
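No concrete conf name appears in the quoted text; the one below is purely hypothetical, only to illustrate the shape such a user-facing switch could take:

```properties
# Hypothetical name, for illustration only -- not an existing Spark conf.
spark.sql.streaming.stateStore.preferredLocation.enabled  true
```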
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-853697668
(Sorry about the late reply.)
> There is an assertion that dynamic allocation should be enabled under
stage-level scheduling. I mean, if we remove such an assertion, will it
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-847498828
> So, does it mean I can remove the dynamic allocation check for our case
without affecting classic stage-level scheduling?
Which check are you referring to?
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-846905399
> BTW, @Ngone51 Is dynamic allocation required for stage-level scheduling?
It's required for the classic use case (when you really need to change
executor resources
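The classic use case of changing executor resources between stages relies on dynamic allocation to acquire executors with the new resource profile. These are existing Spark confs that the classic case typically depends on:

```properties
# Dynamic allocation lets Spark acquire executors that match a new
# resource profile between stages (classic stage-level scheduling).
spark.dynamicAllocation.enabled                  true
spark.dynamicAllocation.shuffleTracking.enabled  true
```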
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-846713976
> If I understand correctly about stage level scheduling, you still need to
specify "all" resources needed for "all" tasks in StateRDD; while that may
block Spark from scheduling
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-846336246
> If the persistent volume is a resource, then it will have to be there on
executor startup, so I guess a new executor checks for it on startup and
advertises it. At that
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845771355
> Let me know if I misunderstand it. So I think the idea is, for example, we
set a flag on the task set of the first micro batch. The flag tells the Spark
scheduler that we want
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845752591
> Oh, I mean we don’t use HDFS for state store reconstruction.
Ah, for the HDFS-based state store, I was trying to explain how stage level
scheduling should work for it
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845745942
>
Oh, I mean we don't use HDFS for state store reconstruction.
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845740021
> No. In our use-case, we want to get rid of HDFS for state store
checkpoint. So the task will wait until the PVC re-mounts to a new
executor. Our state store is
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845694478
> For the stage level scheduling option, is the state store essentially the
same across all executors?
No. Tasks with different partition ids must use different
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-844858158
> However I'm not sure if stage level scheduling can deal with the executor
lost case. Based on the above comment, it seems it cannot. That will be a
major concern for the use-case here.
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-844775582
Thanks for the ping @xuanyuanking
> does it fit into stage level scheduling
This sounds feasible to me. We can treat the state store as a resource for
the
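Treating the state store as a resource would roughly mean executors advertise it (e.g. the PVCs they have mounted) and the scheduler only places a task requiring that resource on a matching executor. A toy sketch of that matching in plain Python (hypothetical names; this is not Spark's scheduler code):

```python
# Toy sketch: executors advertise resource ids (e.g. mounted PVCs), and a
# task that requires a resource may only run where it is advertised.

def candidate_executors(required_resource, executors):
    """executors: dict of executor_id -> set of advertised resource ids."""
    if required_resource is None:
        return sorted(executors)  # unconstrained task: any executor will do
    return sorted(e for e, res in executors.items()
                  if required_resource in res)
```

Under this model, a task bound to `pvc-1` schedules only on the executor advertising `pvc-1`, while a task with no required resource keeps the full candidate set.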