[jira] [Commented] (SPARK-33031) scheduler with blacklisting doesn't appear to pick up new executor added
[ https://issues.apache.org/jira/browse/SPARK-33031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371592#comment-17371592 ]

Thomas Graves commented on SPARK-33031:
---------------------------------------

I do not think it's resolved, but I haven't tried it lately. It sounds like it's just a UI issue, so if you want to try it out and still see the problem, feel free to work on it.

> scheduler with blacklisting doesn't appear to pick up new executor added
> -------------------------------------------------------------------------
>
>                 Key: SPARK-33031
>                 URL: https://issues.apache.org/jira/browse/SPARK-33031
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Thomas Graves
>            Priority: Critical
>
> I was running a test with blacklisting in standalone mode, and all the
> executors were initially blacklisted. Then one of the executors died and we
> got allocated another one. The scheduler did not appear to pick up the new
> one and try to schedule on it, though.
> You can reproduce this by starting a master and a worker on a single node,
> then launching a shell where you will get multiple executors (in this case
> I got 3):
> $SPARK_HOME/bin/spark-shell --master spark://yourhost:7077 --executor-cores 4 --conf spark.blacklist.enabled=true
> From the shell, run:
> {code:scala}
> import org.apache.spark.TaskContext
> // Fail every task on its first two attempts so each executor accumulates
> // task failures and gets blacklisted; the third attempt succeeds.
> val rdd = sc.makeRDD(1 to 1000, 5).mapPartitions { it =>
>   val context = TaskContext.get()
>   if (context.attemptNumber() < 2) {
>     throw new Exception("test attempt num")
>   }
>   it
> }
> rdd.collect()
> {code}
> Note that I tried both with and without dynamic allocation enabled.
> You can see a related screenshot on
> https://issues.apache.org/jira/browse/SPARK-33029
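One quick way to separate a UI-only problem from a scheduler problem, from the same spark-shell session (a hedged sketch: getExecutorMemoryStatus is keyed by block-manager endpoints, so it is only an approximation of live executors and includes the driver):

{code:scala}
// List the executors the driver currently knows about. If the replacement
// executor shows up here (and tasks eventually run on it) while the
// Executors page never reflects it, the remaining bug is likely
// presentation-only.
sc.getExecutorMemoryStatus.keys.foreach(println)
{code}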
[jira] [Commented] (SPARK-33031) scheduler with blacklisting doesn't appear to pick up new executor added
[ https://issues.apache.org/jira/browse/SPARK-33031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371398#comment-17371398 ]

shubhangi priya commented on SPARK-33031:
------------------------------------------

[~tgraves] Hi, I want to work on this issue. If it is unresolved, may I work on it?
[jira] [Commented] (SPARK-33031) scheduler with blacklisting doesn't appear to pick up new executor added
[ https://issues.apache.org/jira/browse/SPARK-33031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258498#comment-17258498 ]

Thomas Graves commented on SPARK-33031:
---------------------------------------

Ah, that could be the case, but if that is true we probably need to fix something in the UI to indicate it.
[jira] [Commented] (SPARK-33031) scheduler with blacklisting doesn't appear to pick up new executor added
[ https://issues.apache.org/jira/browse/SPARK-33031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255685#comment-17255685 ]

Baohe Zhang commented on SPARK-33031:
--------------------------------------

New tasks won't be scheduled because the node is marked as blacklisted once 2 executors on that node have been blacklisted. The behavior seems correct if the experiment is done on a single node.
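If node-level blacklisting is indeed what blocks the replacement executor, raising the per-node thresholds should let the repro schedule on it. A minimal sketch, assuming the single-node setup from the description; both properties below default to 2 in Spark 3.0:

{code:bash}
# Allow up to 3 blacklisted executors per node before the whole node is
# blacklisted, so a replacement executor on the same host stays schedulable.
$SPARK_HOME/bin/spark-shell --master spark://yourhost:7077 --executor-cores 4 \
  --conf spark.blacklist.enabled=true \
  --conf spark.blacklist.stage.maxFailedExecutorsPerNode=3 \
  --conf spark.blacklist.application.maxFailedExecutorsPerNode=3
{code}

If tasks then run on the new executor, that supports the node-blacklist explanation rather than a scheduler bug.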
[jira] [Commented] (SPARK-33031) scheduler with blacklisting doesn't appear to pick up new executor added
[ https://issues.apache.org/jira/browse/SPARK-33031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204265#comment-17204265 ]

Thomas Graves commented on SPARK-33031:
---------------------------------------

I tried this again on YARN, and it now seems to work there, so the problem might be specific to standalone mode.