httfighter edited a comment on issue #23437: [SPARK-26524] If the application 
directory fails to be created on the SPARK_WORKER_…
URL: https://github.com/apache/spark/pull/23437#issuecomment-452553044
 
 
   @srowen Not yet. I added a property worker.isblack to workerInfo that marks 
whether the worker is blacklisted, i.e. excluded from executor allocation. The 
default value is false. 
   When a worker fails to launch an executor for an application, I record the 
number of failures. Once that number reaches 
“spark.deploy.executorFailedPerWorkerThreshold”, worker.isblack is set to 
true. When the master allocates executors, it decides whether a worker is 
usable based on both its available resources and worker.isblack.
   I also added a timeout parameter “spark.worker.black.timeout” that 
periodically resets worker.isblack to false. The user can repair the worker dir 
within that time limit to make the worker available again. If this solution is 
acceptable, I will also add a log message reminding the user to repair the 
damaged worker dir.
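   To make the proposal concrete, here is a rough sketch of the logic, written 
in Python rather than as the actual Scala change. The field and function names 
(is_black, executor_failures, record_executor_failure, reset_expired, 
usable_workers), the default values of 3 failures and 600 seconds, and the 
choice to clear the failure count on reset are all assumptions made for 
illustration, not what the patch necessarily does.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

log = logging.getLogger("master")

# Hypothetical defaults; the real values are not stated in this comment.
EXECUTOR_FAILED_PER_WORKER_THRESHOLD = 3  # spark.deploy.executorFailedPerWorkerThreshold
WORKER_BLACK_TIMEOUT_SECONDS = 600.0      # spark.worker.black.timeout


@dataclass
class WorkerInfo:
    worker_id: str
    cores_free: int
    is_black: bool = False               # worker.isblack, false by default
    executor_failures: int = 0           # executor launch failures seen so far
    black_since: Optional[float] = None  # time at which the worker was blacklisted


def record_executor_failure(worker: WorkerInfo, now: float,
                            threshold: int = EXECUTOR_FAILED_PER_WORKER_THRESHOLD) -> None:
    """Count a failed executor launch; blacklist the worker at the threshold."""
    worker.executor_failures += 1
    if not worker.is_black and worker.executor_failures >= threshold:
        worker.is_black = True
        worker.black_since = now
        log.warning("Worker %s is blacklisted; please repair its worker dir",
                    worker.worker_id)


def reset_expired(workers: List[WorkerInfo], now: float,
                  timeout: float = WORKER_BLACK_TIMEOUT_SECONDS) -> None:
    """Periodic task: make blacklisted workers usable again after the timeout."""
    for w in workers:
        if w.is_black and now - w.black_since >= timeout:
            w.is_black = False
            w.black_since = None
            w.executor_failures = 0  # assumption: start counting from zero again


def usable_workers(workers: List[WorkerInfo], cores_needed: int) -> List[WorkerInfo]:
    """A worker is usable only if it has the resources and is not blacklisted."""
    return [w for w in workers if not w.is_black and w.cores_free >= cores_needed]
```

   In this sketch a worker whose dir is still broken after the timeout would 
simply fail again and be blacklisted again once it reaches the threshold.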
   Is this solution feasible? Or is there a better approach?

