HeartSaVioR commented on pull request #30812:
URL: https://github.com/apache/spark/pull/30812#issuecomment-750544363
I have to agree that this approach isn't ideal, as the logic is blind to the
executors' status and just tries to distribute stateful tasks evenly across
executors. (And it's not guaranteed that the task scheduler follows the requests.)
The logic probably needs to be updated to reflect the actual state distribution
on every batch, as it wouldn't know what decision the task scheduler finally
makes.
But I also have to agree that there don't seem to be other feasible
approaches without making a major change.
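To make "reflect the actual state distribution per batch" a bit more concrete, here is a rough sketch of what I mean. It is not the code in this PR and not a Spark API; `plan_preferred_executors`, `last_seen` and the executor ids are all hypothetical names for illustration. The idea is to keep each state partition where its task actually ran last time (if that executor is still alive) and spread only the remaining partitions onto the least-loaded executors.

```python
def plan_preferred_executors(partitions, executors, last_seen):
    """Return {partition: executor} preferences for the next batch.

    partitions: iterable of state partition ids
    executors:  list of currently live executor ids (hypothetical ids)
    last_seen:  {partition: executor} observed in the previous batch,
                i.e. where the scheduler actually ran each stateful task
    """
    if not executors:
        return {}

    live = set(executors)
    load = {e: 0 for e in executors}
    plan = {}
    pending = []

    # Keep a partition where its state is already loaded, as long as
    # that executor is still alive.
    for p in partitions:
        e = last_seen.get(p)
        if e in live:
            plan[p] = e
            load[e] += 1
        else:
            pending.append(p)

    # Spread the remaining partitions onto the least-loaded executors.
    # Ties go to the executor listed first.
    for p in pending:
        e = min(executors, key=load.__getitem__)
        plan[p] = e
        load[e] += 1

    return plan
```

The result is still only a preference; the task scheduler may ignore it, which is why `last_seen` would have to be refreshed from what actually happened after every batch.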
> Ideally, we should let the Spark task scheduler to do its work rather than
> doing the task scheduling work in SS because we don't have the full context of
> the executors. For example, this PR has to assume each executor has the same
> capability, while the task scheduler knows more about slow and fast executors.
The same applies to the task scheduler. The task scheduler doesn't have the
full context of the characteristics of SS (preferred locations are not
enforced), and given that the cost of reloading state (retrieving the file(s)
from the remote file system, decompressing them, and loading them into memory)
is not trivial compared to the ideal micro-batch execution time, locality is no
longer just guidance.
The point here is probably how we view the cost: whether it's negligible or
not compared to the actual execution. The Kafka data source has the same
characteristic (the Kafka client and unread fetched data are cached in the
executor) but is probably less costly. With large state, the cost is no longer
negligible.
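As a back-of-the-envelope way to look at that cost, one could compare the time to reload one partition's state against the micro-batch time. This is only a sketch under simplifying assumptions: the function names and the throughput parameters are made up for illustration, fetching and decompress-plus-load are modeled as two sequential steps with constant throughput, and the numbers in the example are not measurements.

```python
def reload_seconds(state_bytes, fetch_bytes_per_s, load_bytes_per_s):
    """Rough time to bring one partition's state back onto an executor.

    fetch_bytes_per_s: assumed throughput of reading from the remote FS
    load_bytes_per_s:  assumed throughput of decompressing and loading
                       the state into memory
    """
    return state_bytes / fetch_bytes_per_s + state_bytes / load_bytes_per_s


def reload_overhead(state_bytes, fetch_bytes_per_s, load_bytes_per_s,
                    batch_seconds):
    """Reload time expressed as a multiple of the micro-batch time."""
    return reload_seconds(
        state_bytes, fetch_bytes_per_s, load_bytes_per_s) / batch_seconds


# Purely illustrative inputs: 1 GB of state, 100 MB/s fetch,
# 500 MB/s decompress+load, 2 s micro-batch.
ratio = reload_overhead(1000e6, 100e6, 500e6, batch_seconds=2.0)
```

When the ratio is well below 1 the scheduler can treat locality as a hint; once it goes above 1, a single misplaced task costs more than the batch itself.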
If we want to draw the ideal picture here, IMO it is to pin executors and
force them to serve these stateful tasks for the lifetime of the query. That
would guarantee these stateful tasks never have to reload state unless there is
a crash. This would not be ideal if the application runs multiple queries where
batch and streaming are mixed and the streaming queries have longer trigger
intervals, hence a chance of sitting idle. Either the query should wait to be
assigned to the executor, or the executor should be allowed to stay idle for
the query.