[ https://issues.apache.org/jira/browse/SPARK-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhoukang updated SPARK-19934:
-----------------------------
Description:
{code}
def handleRemovedExecutor(executorId: String): Unit = {
  // We intentionally do not clean up executors that are already blacklisted in
  // nodeToBlacklistedExecs, so that if another executor on the same node gets
  // blacklisted, we can blacklist the entire node. We also can't clean up
  // executorIdToBlacklistStatus, so we can eventually remove the executor after
  // the timeout. Despite not clearing those structures here, we don't expect
  // they will grow too big since you won't get too many executors on one node,
  // and the timeout will clear it up periodically in any case.
  executorIdToFailureList -= executorId
}
{code}
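For readers without the source at hand, here is a rough sketch of the three
tracking structures the comment refers to (field types simplified and
approximate; the real declarations are in BlacklistTracker.scala):
{code}
import scala.collection.mutable.{HashMap, HashSet}

object TrackerStateSketch {
  // executor -> (node, blacklist expiry time); kept so the executor can still
  // be removed from the blacklist once the timeout fires
  val executorIdToBlacklistStatus = new HashMap[String, (String, Long)]()
  // node -> executors blacklisted on that node; kept so later blacklisted
  // executors on the same node can still push it over the node threshold
  val nodeToBlacklistedExecs = new HashMap[String, HashSet[String]]()
  // executor -> its recent task failures; safe to drop once the executor is gone
  val executorIdToFailureList = new HashMap[String, List[String]]()
}
{code}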
I think the comment should be:
{code}
// We intentionally do not clean up executors that are already blacklisted in
// nodeToBlacklistedExecs, so that if
// {spark.blacklist.application.maxFailedExecutorsPerNode} - 1 executors on the
// same node get blacklisted, we can blacklist the entire node.
{code}
For reference, the design doc
https://docs.google.com/document/d/1R2CVKctUZG9xwD67jkRdhBR4sCgccPR2dhTYSRXFEmg/edit
gives the rule for promoting a node to the application-level blacklist:
nodes are placed into a blacklist for the entire application when the number of
blacklisted executors goes over
spark.blacklist.application.maxFailedExecutorsPerNode (default 2).
The existing comment only describes the behavior for the default value.
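To make the rule concrete, here is a minimal sketch of the threshold check
(illustrative names and simplified state, not the exact Spark implementation):
{code}
import scala.collection.mutable.{HashMap, HashSet}

object NodeBlacklistSketch {
  // spark.blacklist.application.maxFailedExecutorsPerNode (default 2)
  val maxFailedExecutorsPerNode = 2
  val nodeToBlacklistedExecs = new HashMap[String, HashSet[String]]()
  val nodeIdToBlacklistExpiryTime = new HashMap[String, Long]()

  def onExecutorBlacklisted(node: String, executorId: String, expiryTime: Long): Unit = {
    val execsOnNode = nodeToBlacklistedExecs.getOrElseUpdate(node, new HashSet[String]())
    execsOnNode += executorId
    // Rule from the design doc: once the number of blacklisted executors on a
    // node reaches the configured threshold, blacklist the entire node.
    if (execsOnNode.size >= maxFailedExecutorsPerNode) {
      nodeIdToBlacklistExpiryTime.put(node, expiryTime)
    }
  }
}
{code}
With the default of 2, "another executor" in the existing comment is accurate,
but for larger configured values it understates how many additional blacklisted
executors are needed, which is the point of this issue.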
> code comments are not very clear in BlacklistTracker.scala
> ------------------------------------------------------------
>
> Key: SPARK-19934
> URL: https://issues.apache.org/jira/browse/SPARK-19934
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 2.1.0
> Reporter: zhoukang
> Priority: Trivial