[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300926#comment-16300926 ] Xuefu Zhang commented on SPARK-22765: -

Did some benchmarking with a set of 20 queries, comparing upfront allocation against exponential ramp-up. No clear trend is seen: upfront allocation offers better efficiency for some of the queries, similar efficiency for others, and worse for the rest. These variations might just be noise, which is abundant in our production cluster. Thus, I tend to agree that upfront allocation offers limited benefit for efficiency, if any. (On the other hand, it seems to benefit performance somewhat.)

I also noticed that when the scheduler schedules a task, it doesn't necessarily pick an available core in an executor that's already running other tasks. I speculate that efficiency improves if busy executors are favored for a new task, so that other idle executors can idle out. (To be tested.)

Making idleTime=0 valid is a good thing to have. I will create a separate ticket for that.

> Create a new executor allocation scheme based on that of MR
> -----------------------------------------------------------
>
> Key: SPARK-22765
> URL: https://issues.apache.org/jira/browse/SPARK-22765
> Project: Spark
> Issue Type: Improvement
> Components: Scheduler
> Affects Versions: 1.6.0
> Reporter: Xuefu Zhang
>
> Many users migrating their workload from MR to Spark find a significant
> resource consumption hike (e.g., SPARK-22683). While this might not be a
> concern for users that are more performance-centric, for others conscious
> about cost, such a hike creates a migration obstacle. This situation can get
> worse as more users move to the cloud.
> Dynamic allocation makes it possible for Spark to be deployed in a
> multi-tenant environment. With its performance-centric design, its
> inefficiency has also unfortunately shown up, especially when compared with
> MR. Thus, it's believed that an MR-styled scheduler still has its merit.
> Based on our research, the inefficiency associated with dynamic allocation
> comes from many aspects, such as executors idling out, bigger executors, and
> many stages (rather than only 2 stages in MR) in a Spark job.
> Rather than fine-tuning dynamic allocation for efficiency, the proposal here
> is to add a new, efficiency-centric scheduling scheme based on that of MR.
> Such an MR-based scheme can be further enhanced and better adapted to the
> Spark execution model. This alternative is expected to offer good performance
> improvement (compared to MR) with similar or even better efficiency than MR.
> Inputs are greatly welcome!

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298634#comment-16298634 ] Xuefu Zhang commented on SPARK-22765: -

bq. at least based on this one experiment up front allocation didn't help

I'm not sure how that conclusion is drawn, but with #5, which differs from #3 only in the additional upfront allocation, we see resource usage decrease from 2X to 1.4X. That doesn't seem to support the claim. I will do more testing on this to make my observation more evident.

For idle=0, I think it's a valid case and should be helpful as well.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298589#comment-16298589 ] Thomas Graves commented on SPARK-22765: ---

OK, so at least based on this one experiment, upfront allocation didn't help. The things that did help are already available (SPARK-21656) and idleTime=1s. The other thing that helped is SPARK-22683, which is a separate jira. So with this there isn't anything left to do unless you want to support idleTime=0. If that is the case, I would be interested to hear whether it helps at all. Personally I'd be OK with supporting 0 as the idle time. If you want to do that with this jira, I suggest changing the description, and feel free to put up a PR. Note that it's the holidays, so it might not get reviewed immediately.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297914#comment-16297914 ] Xuefu Zhang commented on SPARK-22765: -

Alright, I tested upfront allocation and its combinations with other improvement ideas, and here is what I found:

1. The query: just one query, fairly complicated, represented by one of the main Spark jobs:
{code}
Status: Running (Hive on Spark job[4])
----------------------------------------------------------------------------------
    STAGES   ATTEMPT     STATUS   TOTAL   COMPLETED   RUNNING   PENDING   FAILED
----------------------------------------------------------------------------------
  Stage-10         0   FINISHED     339         339         0         0        0
  Stage-11         0   FINISHED     201         201         0         0        0
  Stage-12         0   FINISHED     191         191         0         0        0
  Stage-13         0   FINISHED     178         178         0         0        0
  Stage-14         0   FINISHED     115         115         0         0        0
  Stage-15         0   FINISHED     105         105         0         0        0
  Stage-16         0   FINISHED     592         592         0         0        0
  Stage-17         0   FINISHED     191         191         0         0        0
  Stage-4          0   FINISHED     178         178         0         0        0
  Stage-5          0   FINISHED     115         115         0         0        0
  Stage-6          0   FINISHED     105         105         0         0        0
  Stage-7          0   FINISHED     339         339         0         0        0
  Stage-8          0   FINISHED     201         201         0         0        0
  Stage-9          0   FINISHED     191         191         0         0        0
----------------------------------------------------------------------------------
{code}
2. Without any improvement, with the default 60s idleTime, Spark uses more than 3X the resources compared to MR.
3. With idleTime=5s and the improvement in SPARK-21656, Spark uses about 2X the resources.
4. Same as #3, but with idleTime=1s, Spark uses 1.4X the resources.
5. Same as #3, but with additional upfront allocation, Spark also uses 1.4X the resources.
6. Same as #4, but with the additional improvement in SPARK-22683 (factor=2), Spark uses 1.2X the resources.
7. Same as #6, but with factor=3, Spark uses 0.8X the resources.

While this is just one query, far from conclusive, we can see how much those considerations might impact efficiency. I'm sure the mileage varies, but this at least shows there is a lot of room for Spark to improve resource utilization efficiency wrt scheduling.
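The X-factors above can be reproduced from allocation traces by integrating executor count over time into executor-seconds and comparing against an MR baseline. A minimal sketch of that accounting; all names and numbers here are illustrative, not the actual benchmark data:

```python
# Toy resource-accounting sketch: integrate executor count over time to get
# executor-seconds, then compare a configuration against an MR baseline.
# The traces below are made up purely to illustrate the computation.

def executor_seconds(intervals):
    """intervals: list of (start_s, end_s, num_executors) tuples."""
    return sum((end - start) * n for start, end, n in intervals)

# Hypothetical traces for the same job under two settings: the Spark trace
# holds 100 executors for an extra 300 s idle tail after the work finishes.
mr_baseline = executor_seconds([(0, 600, 100)])                    # 60000 exec-s
spark_idle60 = executor_seconds([(0, 600, 100), (600, 900, 100)])  # 90000 exec-s

ratio = spark_idle60 / mr_baseline
print(f"Spark/MR resource ratio: {ratio:.1f}X")  # 1.5X in this toy trace
```

This is why shrinking the idle tail (idleTime=1s, or factor-based sizing as in SPARK-22683) moves the ratio so much: the tail is pure executor-seconds with no work attached.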
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297645#comment-16297645 ] Xuefu Zhang commented on SPARK-22765: -

Haven't got a chance to try upfront allocation yet. Tried one query (runs for a couple of minutes) with a 1s idle time. The resource usage is further cut down by as much as half, very close to that of MR in this case. I think we should allow a 0s idle time, if only for completeness. I will try upfront allocation and update.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297549#comment-16297549 ] Thomas Graves commented on SPARK-22765: ---

With SPARK-21656, does upfront allocation vs. exponential ramp-up make a difference? Does 1 second vs. 0 seconds in the idle timeout make a difference? I guess it could add up eventually, but currently I think we have a timer scheduled every 100 millis, so unless you bypass that in onExecutorIdle you will have some delay.
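The 100 ms polling granularity mentioned above bounds how quickly even a 0 s idle timeout could release an executor. A small simulation of that interaction; the poll interval and the tick-aligned release model are assumptions for illustration, not taken from the Spark source:

```python
# Simulate an idle-timeout check driven by a fixed-interval timer. With a
# poll every 100 ms, an executor idle since time t is released at the first
# tick at or after t + timeout, so even timeout=0 can incur up to one full
# poll interval of extra delay.

POLL_MS = 100  # assumed polling interval

def release_time_ms(idle_since_ms, timeout_ms, poll_ms=POLL_MS):
    deadline = idle_since_ms + timeout_ms
    ticks = -(-deadline // poll_ms)  # ceiling division to the next tick
    return ticks * poll_ms

print(release_time_ms(idle_since_ms=250, timeout_ms=0))     # released at 300 ms
print(release_time_ms(idle_since_ms=250, timeout_ms=1000))  # released at 1300 ms
```

Under this model, 0 s vs. 1 s differs by roughly the timeout itself per executor release, which only adds up for jobs that cycle many executors between stages.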
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297529#comment-16297529 ] Xuefu Zhang commented on SPARK-22765: -

{quote}
SPARK-21656 and the dynamic allocation should handle that, the target number in the dynamic allocation manager is supposed to be based on all running stages for pending and running tasks. Are you saying that is not true?
{quote}

I verified, and it does seem that SPARK-21656 covers concurrent stages. My job ran too fast, so I had to limit maxExecutors to a smaller number to observe more closely. It's no issue after all. This is great!
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297453#comment-16297453 ] Nan Zhu commented on SPARK-22765: -

I took a look at the code; one of the possibilities is as follows: we add the new executor id (https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L598) and immediately filter idle executors (https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L419).

Right after this step, since executorIdToTaskIds doesn't yet contain the executor id, onExecutorIdle is applied to the newly added executor (https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L479). In onExecutorIdle, the newly added executor is also vulnerable to being assigned a removeTimes value, since removeTimes and executorsPendingToRemove would not contain the newly added executor ID. Based on the calculation method, the removeTime would be 60s later, so we will remove the executor after 60s (https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L266).

The potential fix is to update executorIdToTaskIds before we filter idle executors (but SPARK-21656 should prevent this from happening; still trying to figure out why SPARK-21656 is not working).
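The ordering issue described above can be reduced to a toy model: if the idle filter runs before the executor-to-tasks map is updated, a brand-new executor looks idle and immediately gets a removal deadline. A simplified sketch; the class and field names loosely echo ExecutorAllocationManager but are illustrative, not the actual Scala code:

```python
# Toy model of the race: on_executor_added registers the executor, but the
# idle check consults executor_to_tasks before any task has been recorded,
# so the new executor is immediately scheduled for removal after the
# idle timeout, exactly as if it had been sitting idle.

IDLE_TIMEOUT_S = 60  # default idle timeout in the discussion above

class ToyAllocationManager:
    def __init__(self):
        self.executor_ids = set()
        self.executor_to_tasks = {}   # only updated when a task launches
        self.remove_times = {}        # executor id -> removal deadline (s)

    def on_executor_added(self, exec_id, now_s):
        self.executor_ids.add(exec_id)
        # The bug being illustrated: the idle filter runs right away,
        # before any task could possibly be assigned to the new executor.
        for e in self.executor_ids:
            if e not in self.executor_to_tasks and e not in self.remove_times:
                self.remove_times[e] = now_s + IDLE_TIMEOUT_S

    def on_task_start(self, exec_id, task_id):
        self.executor_to_tasks.setdefault(exec_id, set()).add(task_id)
        self.remove_times.pop(exec_id, None)  # busy executors are not removed

mgr = ToyAllocationManager()
mgr.on_executor_added("exec-1", now_s=0)
print(mgr.remove_times)  # {'exec-1': 60} -- marked for removal before running anything
```

The suggested fix corresponds to recording the executor in executor_to_tasks (or otherwise deferring the idle check) before the idle filter runs.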
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297346#comment-16297346 ] Xuefu Zhang commented on SPARK-22765: - I'm not 100% positive, but that seems to be what I saw. I will verify further.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297335#comment-16297335 ] Thomas Graves commented on SPARK-22765: --- SPARK-21656 and the dynamic allocation should handle that; the target number in the dynamic allocation manager is supposed to be based on pending and running tasks across all running stages. Are you saying that is not true?
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297316#comment-16297316 ] Xuefu Zhang commented on SPARK-22765: -

Actually I meant across parallel stages (those connected by a union transformation), not serial stages, which I care much less about. The following is what I meant (in the Hive on Spark context):
{code}
Status: Running (Hive on Spark job[4])
----------------------------------------------------------------------------------
    STAGES   ATTEMPT     STATUS   TOTAL   COMPLETED   RUNNING   PENDING   FAILED
----------------------------------------------------------------------------------
  Stage-10         0    RUNNING     340         126       214         0        0
  Stage-11         0    PENDING     201           0         0       201        0
  Stage-12         0    PENDING     191           0         0       191        0
  Stage-13         0    RUNNING     178          33       145         0        0
  Stage-14         0    PENDING     115           0         0       115        0
  Stage-15         0    PENDING     105           0         0       105        0
  Stage-16         0    PENDING     592           0         0       592        0
  Stage-17         0    PENDING     191           0         0       191        0
  Stage-4          0    RUNNING     178         157        21         0        4
  Stage-5          0    PENDING     115           0         0       115        0
  Stage-6          0    PENDING     105           0         0       105        0
  Stage-7          0    RUNNING     340         232       108         0        0
  Stage-8          0    PENDING     201           0         0       201        0
  Stage-9          0    PENDING     191           0         0       191        0
----------------------------------------------------------------------------------
{code}
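Under the model discussed for SPARK-21656, the allocation target is driven by pending plus running tasks across all running stages, which is what should keep executors alive for parallel stages like those above. A hedged sketch of that computation; the real manager also applies ramp-up and other policies, and the numbers below are illustrative:

```python
# Compute a dynamic-allocation target from all concurrently running stages:
# demand is the total of pending + running tasks, and the executor target is
# the ceiling of demand / tasks-per-executor, capped at maxExecutors.

def target_executors(stages, tasks_per_executor=4, max_executors=500):
    demand = sum(s["pending"] + s["running"] for s in stages)
    needed = -(-demand // tasks_per_executor)  # ceiling division
    return min(needed, max_executors)

# Illustrative demand loosely echoing the table above: two running stages
# plus one pending stage all contribute to the target.
stages = [
    {"name": "Stage-10", "running": 214, "pending": 0},
    {"name": "Stage-13", "running": 145, "pending": 0},
    {"name": "Stage-11", "running": 0,   "pending": 201},
]
print(target_executors(stages))  # 140
```

Because every running stage's tasks count toward demand, an executor freed by one completed stage stays within the target as long as a sibling stage still has pending tasks, which is the reuse behavior being asked about.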
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297307#comment-16297307 ] Thomas Graves commented on SPARK-22765: ---

Yes, between stages becomes a problem with a lower timeout. We can certainly look at extending across stages, but that isn't really what you are proposing here; MR style is to release immediately and not reuse. That also kind of implies a slightly higher timeout would be good, unless you have a very large gap between stages, which would again probably favor a small timeout to release and then reacquire. How much time do you have between stages?

Or perhaps we need two timeouts, one for within a running stage and one for between stages, but that is somewhat contradictory unless a lot of tasks finish at the same time. Even then, a timeout of 0 for running stages would have released the container immediately anyway, so I'm a bit confused by your statement.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297276#comment-16297276 ] Xuefu Zhang commented on SPARK-22765: - Hi [~CodingCat], yes, we adapted both #1 and #2, but we still needed 60s as minimum w/o SPARK-21656. Hi @Tomas Graves, idle=1 seems working, but idle=0 seems illegal. {code} org.apache.spark.SparkException: spark.dynamicAllocation.executorIdleTimeout must be > 0! {code} Now I think it, idle=0 should be permitted because a user might not want any idle executor at all. {quote} with SPARK-21656 executors shouldn't timeout unless you are between stages {quote} This seems functioning correctly. However, I think this needs to be extended across concurrent stages. Based on what I'm seeing, executors are not reused across such stages, which means if an executor working for a completed stage should be recycled if there are other running stages that have pending tasks. > Create a new executor allocation scheme based on that of MR > --- > > Key: SPARK-22765 > URL: https://issues.apache.org/jira/browse/SPARK-22765 > Project: Spark > Issue Type: Improvement > Components: Scheduler >Affects Versions: 1.6.0 >Reporter: Xuefu Zhang > > Many users migrating their workload from MR to Spark find a significant > resource consumption hike (i.e, SPARK-22683). While this might not be a > concern for users that are more performance centric, for others conscious > about cost, such hike creates a migration obstacle. This situation can get > worse as more users are moving to cloud. > Dynamic allocation make it possible for Spark to be deployed in multi-tenant > environment. With its performance-centric design, its inefficiency has also > unfortunately shown up, especially when compared with MR. Thus, it's believed > that MR-styled scheduler still has its merit. 
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296811#comment-16296811 ] Thomas Graves commented on SPARK-22765: --- [~CodingCat] with SPARK-21656 executors shouldn't timeout unless you are between stages. [~xuefuz] I would be interested to see whether allocating everything up front helps, if you can simply make a temporary change, build, and try running. Did you try a timeout of 0 or 1? Wondering whether we handle that properly or don't allow it.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295958#comment-16295958 ] Nan Zhu commented on SPARK-22765: - [~xuefuz] Regarding this, "The symptom is that newly allocated executors are idled out before completing a single tasks! I suspected that this is caused by a busy scheduler. As a result, we have to keep 60s as a minimum." did you try to tune parameters such as: 1. spark.driver.cores, to assign more threads to the driver; 2. "spark.locality.wait.process", "spark.locality.wait.node", and "spark.locality.wait.rack", to let executors get filled at a faster pace?
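As a sketch, the knobs mentioned above could be set at submission time like this (the class and jar names are placeholders, and the values are only illustrative, not recommendations):

```shell
spark-submit \
  --conf spark.driver.cores=4 \
  --conf spark.locality.wait.process=1s \
  --conf spark.locality.wait.node=1s \
  --conf spark.locality.wait.rack=1s \
  --class com.example.MyApp my-app.jar
```

Lower locality waits let the scheduler give up on locality sooner and fill idle executors faster, at the cost of more non-local task placement.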
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295814#comment-16295814 ] Xuefu Zhang commented on SPARK-22765: - As an update, I managed to backport SPARK-21656, among a few others, into our codebase and measured the efficiency improvement with a smaller idle time (reduced from 60s to 5s). Our tests show that the efficiency gain is significant (a consistent 2X) for small jobs, especially those with many stages. For large, long-running jobs, the gain is less significant (about 10% better than with 60s). I'd like to point out that even with a 2X efficiency improvement for small jobs, Spark is still behind: with a 60s idle time, MR uses only about 35% of the resources used by Spark; with 5s, MR uses about 70% of what Spark uses. I suspect the additional overhead comes from: 1. exponential ramp-up allocation; 2. bigger containers (I have 4 cores per container). The desired allocation scheme optimized for efficiency now seems clearer to me: 1. Upfront allocation instead of exponential ramp-up. 2. Zero idle time (reuse containers if there are pending tasks, or kill them right away if there are none). 3. Optimizations for smaller containers (like 1 core per container). In combination with the executor conservation factor from SPARK-22683, the new scheme, which diverges widely from dynamic allocation, should offer better resource efficiency.
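The reported ratios can be cross-checked with quick arithmetic (the percentages are taken from the comment above; this is illustrative, not a new measurement):

```python
# MR's resource usage as a fraction of Spark's, per the comment above.
mr_share_60s = 0.35  # with a 60s idle timeout
mr_share_5s = 0.70   # with a 5s idle timeout

# Flip the ratios: how much more resource Spark uses than MR.
spark_overhead_60s = 1 / mr_share_60s  # ~2.86x MR
spark_overhead_5s = 1 / mr_share_5s    # ~1.43x MR

# The improvement from shrinking the idle timeout, consistent with the
# "consistent 2X" gain reported for small jobs.
improvement = spark_overhead_60s / spark_overhead_5s
print(round(improvement, 2))  # → 2.0
```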
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292594#comment-16292594 ] Thomas Graves commented on SPARK-22765: --- I'm not sure how MR style and 4-core executors go together. Either way, if you are allocated executors with 4 cores and only assign 4 tasks to each (like MR style), 3 of those tasks might finish quickly while one lasts much longer, and you waste resources. How does MR style solve scheduler inefficiencies? One way or another you have to schedule tasks to the containers you get. That seems like a scheduler issue, not a container allocation issue, and it would apply to both schemes, unless you happen to get lucky in that the scheduler is not as busy when scheduling up front; but that is going to depend on when you get containers. If you want to investigate improving this, that would be great. I'm not really sure what this proposal does that you can't do now (with SPARK-21656), other than kill an executor after running X tasks rather than waiting for it to idle out. This might be useful, but it might also hurt you, especially in busy clusters where you might only get a percentage of your executors up front. Let me know if I'm missing something in your proposal though. Note that we saw a significant increase in cluster utilization (meaning we were using resources better, not wasting them) when we moved to Pig on Tez with container reuse vs. single-container MR style. Some of this was due to not having to run multiple jobs; some was the container reuse. This was across a cluster with mixed workloads, though. I would expect container reuse in Spark to do the same, but there are obviously specialized workloads. Let me know what you find with SPARK-21656, and if it's not sufficient please add more specifics (a design) on what you propose to change.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292048#comment-16292048 ] Xuefu Zhang commented on SPARK-22765: - [~tgraves], I think it would help if SPARK-21656 can make a close-to-zero idle time work. This is one source of inefficiency. Our version is too old to backport the fix, but we will try this out when we upgrade. The second source of inefficiency is that Spark favors bigger containers. A 4-core container might be running one task while wasting the other cores/memory; the executor cannot die as long as one task is running. One might argue that a user could configure 1-core containers under dynamic allocation, but that is probably not optimal in other aspects. The third reason one might favor MR-styled scheduling is its simplicity and efficiency. We frequently found that, for heavy workloads, the scheduler cannot really keep up with the task churn, especially when tasks finish fast. For cost-conscious users, cluster-level resource efficiency is probably what matters. My suspicion is that an enhanced MR-styled scheduling scheme, simple and performant, will significantly improve resource efficiency over a typical use of dynamic allocation, without sacrificing much performance. As a starting point, we will first benchmark with SPARK-21656 when possible.
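To make the ramp-up point concrete, here is a simplified model of exponential executor ramp-up versus upfront allocation (real dynamic allocation also factors in backlog timeouts, pending-task counts, and max-executor caps, so this is only a sketch):

```python
def rampup_totals(target: int) -> list:
    """Cumulative executor counts per request round when the number of
    executors asked for doubles each round (simplified model)."""
    totals, total, to_add = [], 0, 1
    while total < target:
        total = min(total + to_add, target)
        totals.append(total)
        to_add *= 2
    return totals

# Reaching 10 executors takes 4 request rounds under ramp-up...
print(rampup_totals(10))  # → [1, 3, 7, 10]
# ...versus a single round with upfront allocation; the early rounds
# run well below the job's eventual parallelism.
```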
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289793#comment-16289793 ] Thomas Graves commented on SPARK-22765: --- OK, so before you do anything else, I would suggest trying a Spark version with SPARK-21656 (or backporting it) together with a small idle timeout, to see if that meets your needs. I assume that even if you put a new feature in, you would have to configure it for different types of jobs, so I don't see how that would be any different from setting the idle timeout differently per job.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289726#comment-16289726 ] Xuefu Zhang commented on SPARK-22765: - Yes, we are using Hive on Spark. Our Spark version is 1.6.1, which is old, so obviously it doesn't have the fix from SPARK-21656. As commented in SPARK-22683, our comparison was made, for individual queries, between all MR jobs and all Spark jobs (usually just 1).
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289700#comment-16289700 ] Thomas Graves commented on SPARK-22765: --- OK, so it's basically that executors idle out during DAG computation, or the scheduler is not fast enough to deploy tasks. What version of Spark are you using? We actually recently made a change to dynamic allocation so that it won't idle-timeout executors when there are tasks to run on them: https://issues.apache.org/jira/browse/SPARK-21656 Are you using Hive with Spark? Did you compare the resource utilization of the Spark job against the multiple MR jobs that get run for a single query?
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289676#comment-16289676 ] Xuefu Zhang commented on SPARK-22765: - Hi [~tgraves], thanks for your input. In our busy, heavily loaded cluster environment, we have found that any idle time less than 60s is a problem. 30s works for small jobs but starts causing problems for bigger ones. The symptom is that newly allocated executors are idled out before completing a single task! I suspect this is caused by a busy scheduler. As a result, we have to keep 60s as a minimum. Having said that, I'm not against container reuse; I also used the word "enhanced" to mean improving on MR scheduling. Reuse is good, but in my opinion the speculative factor in dynamic allocation goes against efficiency. That is, you set an idle time just in case a new task comes within that period; when that doesn't happen, you waste the executor for a minute. (This is good for performance.) Please note that this happens a lot at the end of each stage, because no tasks from the next stage will be scheduled until the current stage finishes. If we can remove the speculative aspect of the scheduling, efficiency should improve significantly with some compromise on performance. This would be a good starting point, which is the main purpose of my proposal of an enhanced MR-style scheduling scheme, open to many other possible improvements.
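The end-of-stage waste described above can be bounded with a back-of-the-envelope model (the numbers are illustrative; it assumes the worst case where every executor sits idle for the full timeout at each stage boundary):

```python
def max_idle_waste(executors: int, stage_boundaries: int,
                   idle_timeout_s: int) -> int:
    """Worst-case executor-seconds spent idling while waiting for tasks
    that never arrive (simplified model of the 'speculative' cost)."""
    return executors * stage_boundaries * idle_timeout_s

# 100 executors, 10 stage boundaries, 60s idle timeout:
print(max_idle_waste(100, 10, 60))  # → 60000 (executor-seconds)
```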
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289613#comment-16289613 ] Thomas Graves commented on SPARK-22765: --- Why doesn't a very small idle timeout (< 5 seconds) work? If there is no work to be done, the executor should exit soon after a task finishes, similar to MR. Note that Tez added container reuse as well, which is similar to Spark's scheme. Basically, I think you are proposing a change that essentially adds a config that disables container reuse. I'm not sure I agree with this or that it will help resource utilization, especially when looking at the whole ecosystem. Without reuse you have to go back to YARN to ask for more, which, depending on cluster usage, could cause significant overhead while you wait for more containers. You are bringing up and killing processes, re-downloading things into the distributed cache, etc. So in vcore/memory-seconds on YARN it might be better, but it affects other things as well. Just something to keep in mind. Are you seeing this on very short-running tasks?
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288414#comment-16288414 ] Xuefu Zhang commented on SPARK-22765: - I wouldn't say that MR is static, at least not static in Spark's sense. MR allocates an executor for each map or reduce task, and the executor exits when the running task completes. This avoids the inefficiency in dynamic allocation where an executor has to slowly idle out (0 idle time doesn't really work). Secondly, this proposal doesn't really intend to go back to the MR paradigm. Instead, I propose a scheduling scheme similar to MR's but enhanced to fit Spark's DAG execution model. To be clear, the proposal here is not to replace dynamic allocation; rather, it provides an alternative that is more efficiency-centric than dynamic allocation. I understand a lot of details are lacking, but I'd like to start the discussion now, and hopefully something concrete will come out soon.
[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
[ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288353#comment-16288353 ] Sean Owen commented on SPARK-22765: --- I'm not clear why this needs another allocation scheme. You say dynamic allocation has overhead at runtime -- yes -- and M/R doesn't because it's static. So why not disable dynamic allocation? The things you're identifying as "problems" exist because Spark is a generalization; you can write a bunch of independent 2-stage map-reduce jobs if you want. Killing idle executors is the point of dynamic allocation, not a problem. I don't see any detail on how this differs from anything else in Spark.