Github user sitalkedia commented on the issue:
https://github.com/apache/spark/pull/19048
>> Why? Because of the idle timeout? If that's your point, then the change
>> I referenced above should avoid that.
Yes, because of the idle timeout. Note that `numExecutorsTarget` is 5 and the
EAM has 10 executors available, so it is fine to kill 2 of them. That is not
the issue.
>> How? The scheduler (a.k.a. CGSB) does not kill executors on its own. It
>> has to be told to do so in some way
Because the EAM asks it to kill 2 of them. But please note that while
killing 2 executors the EAM did not reduce its own target to 3; it is still 5.
Since the scheduler keeps its own internal target, it reduces that target from
5 to 3, and the EAM and the scheduler get out of sync.
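
To make the sequence concrete, here is a toy model of what I am describing. It is only a sketch, not Spark's actual code: the class and method names (`ToyScheduler`, `ToyAllocationManager`, `kill_executors`, `on_idle_timeout`) are made up for illustration, and the rule that a kill lowers the scheduler's own target is an assumption taken from the description above.

```python
class ToyScheduler:
    """Hypothetical stand-in for the scheduler backend (not a Spark class)."""

    def __init__(self, target, executors):
        self.target = target              # scheduler's own internal target
        self.executors = set(executors)   # currently registered executors

    def kill_executors(self, ids):
        killed = self.executors & set(ids)
        self.executors -= killed
        # Assumed behavior: every kill also lowers the scheduler's target.
        self.target = max(self.target - len(killed), 0)
        return killed


class ToyAllocationManager:
    """Hypothetical stand-in for the EAM (not a Spark class)."""

    def __init__(self, scheduler, target):
        self.scheduler = scheduler
        self.num_executors_target = target

    def on_idle_timeout(self, idle_ids):
        # The EAM asks for the kill but leaves its own target untouched.
        return self.scheduler.kill_executors(idle_ids)


scheduler = ToyScheduler(target=5, executors=range(10))
eam = ToyAllocationManager(scheduler, target=5)
eam.on_idle_timeout([0, 1])

# EAM target, scheduler target, executors still alive
print(eam.num_executors_target, scheduler.target, len(scheduler.executors))
```

In this model the EAM still believes the target is 5 while the scheduler now holds 3, which is the mismatch in question.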
>> If you can actually provide logs that show what you're trying to say
>> that would probably be easier.
Actually, I added a lot of debug logging to find this issue, so the logs are
probably not going to be of any help to you.