[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300926#comment-16300926
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Did some benchmarking with a set of 20 queries, comparing upfront allocation 
against exponential ramp-up. No clear trend emerged: upfront allocation offers 
better efficiency for some of the queries, similar efficiency for others, and 
worse for the rest. These variations might just be noise, which is abundant in 
our production cluster. Thus, I tend to agree that upfront allocation offers 
limited efficiency benefit, if any. (On the other hand, it does seem to benefit 
performance somewhat.)

I also noticed that when the scheduler schedules a task, it doesn't necessarily 
pick an available core on an executor that's already running other tasks. I 
speculate that efficiency would improve if busy executors were favored for new 
tasks, so that other idle executors could idle out. (To be tested.)

Making idleTime=0 valid is a good thing to have. I will create a separate 
ticket for that.


> Create a new executor allocation scheme based on that of MR
> ---
>
> Key: SPARK-22765
> URL: https://issues.apache.org/jira/browse/SPARK-22765
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 1.6.0
>Reporter: Xuefu Zhang
>
> Many users migrating their workloads from MR to Spark see a significant 
> resource consumption hike (e.g., SPARK-22683). While this might not be a 
> concern for users who are more performance-centric, for others conscious 
> about cost, such a hike creates a migration obstacle. This situation can get 
> worse as more users move to the cloud.
> Dynamic allocation makes it possible for Spark to be deployed in a 
> multi-tenant environment. With its performance-centric design, its 
> inefficiency has unfortunately also shown up, especially when compared with 
> MR. Thus, it's believed that an MR-style scheduler still has its merits. 
> Based on our research, the inefficiency associated with dynamic allocation 
> has many aspects, such as executors idling out, bigger executors, and many 
> stages in a Spark job (rather than only 2 stages in MR).
> Rather than fine-tuning dynamic allocation for efficiency, the proposal here 
> is to add a new, efficiency-centric scheduling scheme based on that of MR. 
> Such an MR-based scheme can be further enhanced and better adapted to the 
> Spark execution model. This alternative is expected to offer good performance 
> improvement (compared to MR) with efficiency similar to or even better than 
> MR's.
> Inputs are greatly welcome!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-20 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298634#comment-16298634
 ] 

Xuefu Zhang commented on SPARK-22765:
-

bq. at least based on this one experiment up front allocation didn't help.

I'm not sure how that conclusion was drawn: with #5, which differs from #3 
only in adding upfront allocation, we see resource usage decrease from 2X to 
1.4X. That doesn't seem to support the claim. I will do more testing on this 
to make my observation more evident.

For idle=0, I think it's a valid case and should be helpful as well.




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-20 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298589#comment-16298589
 ] 

Thomas Graves commented on SPARK-22765:
---

OK, so at least based on this one experiment, upfront allocation didn't help. 
The things that did help are already available (SPARK-21656) and idleTime=1s. 
The other thing that helped is SPARK-22683, which is a separate jira. So with 
this there isn't anything left to do, unless you want to support idleTime=0. If 
that is the case, I would be interested to hear whether it helps at all. 
Personally, I'd be OK with supporting 0 as the idle time. If you want to do 
that with this jira, I suggest changing the description, and feel free to put 
up a PR. Note that it's the holidays, so it might not get reviewed immediately.




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297914#comment-16297914
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Alright, I tested upfront allocation and its combinations with other 
improvement ideas, and here is what I found:

1. The query: just one query, fairly complicated, represented by one of its 
main Spark jobs:
{code}
Status: Running (Hive on Spark job[4])
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
--------------------------------------------------------------------------------------
Stage-10 ...             0  FINISHED    339        339        0        0       0
Stage-11 ...             0  FINISHED    201        201        0        0       0
Stage-12 ...             0  FINISHED    191        191        0        0       0
Stage-13 ...             0  FINISHED    178        178        0        0       0
Stage-14 ...             0  FINISHED    115        115        0        0       0
Stage-15 ...             0  FINISHED    105        105        0        0       0
Stage-16 ...             0  FINISHED    592        592        0        0       0
Stage-17 ...             0  FINISHED    191        191        0        0       0
Stage-4                  0  FINISHED    178        178        0        0       0
Stage-5                  0  FINISHED    115        115        0        0       0
Stage-6                  0  FINISHED    105        105        0        0       0
Stage-7                  0  FINISHED    339        339        0        0       0
Stage-8                  0  FINISHED    201        201        0        0       0
Stage-9                  0  FINISHED    191        191        0        0       0
--------------------------------------------------------------------------------------
{code}
2. Without any improvements and the default 60s idleTime, Spark uses more than 
3X the resources compared to MR.
3. With idleTime=5s and the improvement in SPARK-21656, Spark uses about 2X 
the resources.
4. Same as #3, but with idleTime=1s: Spark uses 1.4X the resources.
5. Same as #3, but with additional upfront allocation: Spark also uses 1.4X 
the resources.
6. Same as #4, but with the additional improvement in SPARK-22683 (factor=2): 
Spark uses 1.2X the resources.
7. Same as #6, but with factor=3: Spark uses 0.8X the resources.

While this is just one query, far from conclusive, we can see how much these 
considerations might impact efficiency. Mileage will surely vary, but this at 
least shows there is a lot of room for Spark to improve resource utilization 
efficiency with respect to scheduling.
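Collecting the measurements above into one place makes the comparison easier to scan. The numbers are exactly those reported in the list (resource use relative to MR; the 3.0 entry is a lower bound for ">3X"):

```python
# Configurations from the experiment above and their measured resource
# usage relative to MR (1.0 = same as MR; lower is better).
results = {
    "default (60s idleTimeout)":                3.0,  # reported as ">3X"
    "SPARK-21656 + idleTimeout=5s":             2.0,
    "SPARK-21656 + idleTimeout=1s":             1.4,
    "SPARK-21656 + 5s + upfront allocation":    1.4,
    "1s + SPARK-22683 (factor=2)":              1.2,
    "1s + SPARK-22683 (factor=3)":              0.8,
}

# The best configuration in this experiment actually beats MR on efficiency.
best = min(results, key=results.get)
```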




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297645#comment-16297645
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Haven't gotten a chance to try upfront allocation yet. Tried one query (runs 
for a couple of minutes) with a 1s idle time. The resource usage is further 
cut down by as much as half, very close to that of MR in this case.

I think we should allow a 0s idle time, if only for completeness.

I will try upfront allocation and report back.




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297549#comment-16297549
 ] 

Thomas Graves commented on SPARK-22765:
---

With SPARK-21656, does upfront allocation vs. exponential ramp-up make a 
difference?

Does 1 second vs. 0 seconds of idle timeout make a difference? I guess it 
could add up eventually, but currently I think we have a timer scheduled every 
100 millis, so unless you bypass that in onExecutorIdle you will have some 
delay.
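The polling effect described here can be modeled with a small sketch (illustrative, not Spark code): dynamic allocation re-checks executor state on a fixed interval, so even with idleTimeout=0 an executor is only removed at the next tick after it becomes idle, not instantly.

```python
import math

# Approximate re-check interval of the allocation manager's timer, per the
# "timer scheduled for every 100 millis" remark above (an assumption here).
POLL_INTERVAL_MS = 100

def removal_delay_ms(idle_timeout_ms, became_idle_at_ms):
    """Time from an executor becoming idle until the first poll tick at or
    after (became_idle_at + idle_timeout), i.e. the earliest removal."""
    deadline = became_idle_at_ms + idle_timeout_ms
    next_tick = math.ceil(deadline / POLL_INTERVAL_MS) * POLL_INTERVAL_MS
    return next_tick - became_idle_at_ms

# Even with idleTimeout=0, an executor going idle at t=130ms is not removed
# until the t=200ms tick: a 70ms delay imposed by polling alone.
```

So the practical difference between 0s and 1s is bounded by the tick granularity plus the 1s itself, which is why the question of whether it matters at all is worth testing empirically.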




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297529#comment-16297529
 ] 

Xuefu Zhang commented on SPARK-22765:
-

{quote}
SPARK-21656 and the dynamic allocation should handle that, the target number in 
the dynamic allocation manager is supposed to be based on all running stages 
for pending and running tasks. Are you saying that is not true?
{quote}
I verified, and it does seem that SPARK-21656 covers concurrent stages. My job 
ran too fast, so I had to limit maxExecutors to a smaller number to observe 
more closely. It's not an issue after all. This is great!




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Nan Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297453#comment-16297453
 ] 

Nan Zhu commented on SPARK-22765:
-

I took a look at the code; one possibility is the following:

We add the new executor id:
https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L598

and immediately filter idle executors:

https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L419

Right after this step, since executorIdToTaskIds does not yet contain the 
executor id, onExecutorIdle is applied to the newly added executor.

In onExecutorIdle:
https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L479

the newly added executor can also be assigned a removeTimes value, since 
removeTimes and executorsPendingToRemove would not contain the newly added 
executor id. Based on the calculation, the removeTime would be 60s later. We 
then remove the executor after 60s 
(https://github.com/apache/spark/blob/a233fac0b8bf8229d938a24f2ede2d9d8861c284/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L266).

The potential fix is to update executorIdToTaskIds before we filter idle 
executors (but SPARK-21656 should prevent this from happening... still trying 
to figure out why SPARK-21656 is not working).
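The ordering issue and the proposed fix can be condensed into a minimal model (hypothetical Python, not Spark's Scala code; `executor_to_tasks` and `remove_times` stand in for executorIdToTaskIds and removeTimes):

```python
# Minimal model of the race: a new executor registered before its task is
# recorded looks idle to the filter and gets a removal deadline.
class AllocationManagerModel:
    def __init__(self):
        self.executor_to_tasks = {}  # models executorIdToTaskIds
        self.remove_times = {}       # models removeTimes

    def on_executor_added(self, eid, now, idle_timeout, task_id=None):
        # Proposed fix: record the task BEFORE the idle check, so the
        # newly added executor is never mistaken for an idle one.
        if task_id is not None:
            self.executor_to_tasks.setdefault(eid, set()).add(task_id)
        if eid not in self.executor_to_tasks:
            self.on_executor_idle(eid, now, idle_timeout)

    def on_executor_idle(self, eid, now, idle_timeout):
        if eid not in self.remove_times:
            self.remove_times[eid] = now + idle_timeout  # removal deadline

mgr = AllocationManagerModel()
# Buggy ordering: no task recorded yet -> scheduled for removal at t=60.
mgr.on_executor_added("exec-1", now=0, idle_timeout=60)
# Fixed ordering: task recorded first -> no removal deadline assigned.
mgr.on_executor_added("exec-2", now=0, idle_timeout=60, task_id=7)
```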





[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297346#comment-16297346
 ] 

Xuefu Zhang commented on SPARK-22765:
-

I'm not 100% positive, but that seems to be what I saw. I will verify further.




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297335#comment-16297335
 ] 

Thomas Graves commented on SPARK-22765:
---

SPARK-21656 and the dynamic allocation should handle that, the target number in 
the dynamic allocation manager is supposed to be based on all running stages 
for pending and running tasks.  Are you saying that is not true?




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297316#comment-16297316
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Actually, I meant across parallel stages (those connected by a union 
transformation), not serial stages, which I care much less about. The 
following is what I meant (in a Hive on Spark context):
{code}
Status: Running (Hive on Spark job[4])
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
--------------------------------------------------------------------------------------
Stage-10 ..              0   RUNNING    340        126      214        0       0
Stage-11                 0   PENDING    201          0        0      201       0
Stage-12                 0   PENDING    191          0        0      191       0
Stage-13 .               0   RUNNING    178         33      145        0       0
Stage-14                 0   PENDING    115          0        0      115       0
Stage-15                 0   PENDING    105          0        0      105       0
Stage-16                 0   PENDING    592          0        0      592       0
Stage-17                 0   PENDING    191          0        0      191       0
Stage-4  ...             0   RUNNING    178        157       21        0       4
Stage-5                  0   PENDING    115          0        0      115       0
Stage-6                  0   PENDING    105          0        0      105       0
Stage-7  .               0   RUNNING    340        232      108        0       0
Stage-8                  0   PENDING    201          0        0      201       0
Stage-9                  0   PENDING    191          0        0      191       0
--------------------------------------------------------------------------------------
{code}




[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297307#comment-16297307
 ] 

Thomas Graves commented on SPARK-22765:
---

Yes, between stages becomes a problem with a lower timeout. We can certainly 
look at keeping executors across stages, but that isn't really what you are 
proposing here; MR style is to release immediately and not reuse. That also 
somewhat implies a slightly higher timeout would be good, unless you have a 
very large gap between stages, which would again probably favor a small 
timeout: release and then reacquire.

How much time are you seeing between stages?

Or perhaps we need two timeouts, one for within a running stage and one for 
between stages, but that is somewhat contradictory unless a lot of tasks 
finish at the same time. Even then, a running-stage timeout of 0 would have 
released the container immediately anyway. So I'm a bit confused by your 
statement.






[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297276#comment-16297276
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Hi [~CodingCat], yes, we adopted both #1 and #2, but we still needed 60s as 
the minimum without SPARK-21656.

Hi [~tgraves], idle=1 seems to work, but idle=0 is rejected:
{code}
org.apache.spark.SparkException: spark.dynamicAllocation.executorIdleTimeout 
must be > 0!
{code}

Now that I think about it, idle=0 should be permitted, because a user might 
not want any idle executors at all.
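The rejection comes from a validation check on the config value. A minimal sketch of relaxing that check from `> 0` to `>= 0` (this mirrors the error message above but is not the actual Spark source):

```python
def validate_idle_timeout(seconds: int, allow_zero: bool = False) -> int:
    """Validate spark.dynamicAllocation.executorIdleTimeout (sketch).

    Current behavior (allow_zero=False) rejects 0; with allow_zero=True,
    idle=0 would mean "release an executor as soon as it goes idle".
    """
    lower = 0 if allow_zero else 1
    if seconds < lower:
        raise ValueError(
            "spark.dynamicAllocation.executorIdleTimeout must be "
            + (">= 0!" if allow_zero else "> 0!"))
    return seconds
```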

{quote}
with SPARK-21656 executors shouldn't timeout unless you are between stages
{quote}
This seems to function correctly. However, I think it needs to be extended 
across concurrent stages. Based on what I'm seeing, executors are not reused 
across such stages; an executor that worked for a completed stage should be 
recycled if there are other running stages that have pending tasks.







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296811#comment-16296811
 ] 

Thomas Graves commented on SPARK-22765:
---

[~CodingCat] with SPARK-21656 executors shouldn't timeout unless you are 
between stages.  

[~xuefuz] I would be interested to see whether allocating everything up front 
helps, if you can make a quick temporary change, build, and try running it.
Did you try a timeout of 0 or 1? I'm wondering whether we handle those 
properly or disallow them.







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-18 Thread Nan Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295958#comment-16295958
 ] 

Nan Zhu commented on SPARK-22765:
-

[~xuefuz] Regarding this, "The symptom is that newly allocated executors are 
idled out before completing a single tasks! I suspected that this is caused by 
a busy scheduler. As a result, we have to keep 60s as a minimum."

did you try to tune parameters like:

1. spark.driver.cores, to assign more threads to the driver, and

2. spark.locality.wait.process, spark.locality.wait.node, and 
spark.locality.wait.rack, to let executors get filled at a faster pace?
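The keys above are real Spark configuration settings; the values below are illustrative assumptions only, not tested recommendations:

```python
# Illustrative settings for the tuning suggested above: more driver
# threads, and zero locality wait so tasks fill executors faster.
# Values are assumptions for illustration, not recommendations.
tuning = {
    "spark.driver.cores": "4",            # more threads for the driver
    "spark.locality.wait.process": "0s",  # don't hold tasks for process locality
    "spark.locality.wait.node": "0s",     # ...or node locality
    "spark.locality.wait.rack": "0s",     # ...or rack locality
}

# These would typically be passed on the command line, e.g.:
#   spark-submit --conf spark.driver.cores=4 \
#                --conf spark.locality.wait.node=0s ...
```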







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-18 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295814#comment-16295814
 ] 

Xuefu Zhang commented on SPARK-22765:
-

As an update, I managed to backport SPARK-21656, among a few other patches, 
into our codebase and measured the efficiency improvement with a smaller idle 
time (from 60s down to 5s). Our tests show that the efficiency gain is 
significant (a consistent 2X) for small jobs, especially those with many 
stages. For large, long-running jobs, the gain is less significant (about 10% 
better than with 60s).

I'd like to point out that even with a 2X efficiency improvement for small 
jobs, Spark is still behind. With a 60s idle time, MR uses only about 35% of 
the resources used by Spark; with 5s, MR uses about 70%. I suspect the 
additional overhead comes from: 1. exponential ramp-up allocation; 2. bigger 
containers (I have 4 cores per container).
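To see why exponential ramp-up costs efficiency for short jobs, consider how the requested-executor target grows per scheduling interval. This is a simplified model of the growth curve (Spark's actual policy doubles the target on a backlog timer, and the details differ):

```python
def ramp_up_targets(needed: int, start: int = 1) -> list:
    """Executor targets under a simplified exponential ramp-up:
    the target doubles each scheduling interval until it covers demand."""
    targets, t = [], start
    while True:
        t = min(t, needed)
        targets.append(t)
        if t == needed:
            return targets
        t *= 2

# With demand for 100 executors, upfront allocation asks for all 100 in
# interval 1; ramp-up takes 8 intervals to get there, running far below
# capacity in the meantime.
print(ramp_up_targets(100))  # [1, 2, 4, 8, 16, 32, 64, 100]
```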

It now seems clearer to me what an allocation scheme optimized for efficiency 
should look like:

1. Upfront allocation instead of exponential ramp-up
2. Zero idle time (reuse containers if there are pending tasks, or kill them 
right away if there are none)
3. Optimizations for smaller containers (e.g., 1 core per container).

In combination with the executor conservation factor from SPARK-22683, the 
new scheme, which diverges widely from dynamic allocation, should offer 
better resource efficiency.
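Parts of this scheme can already be approximated with existing dynamic-allocation settings. The config keys below are real Spark settings; the values, and the idea of pinning the initial executor count to the cap, are assumptions for illustration (and item 2's true zero idle time is exactly what the current validation disallows):

```python
# Approximating the efficiency-centric scheme with existing configs.
# Keys are real Spark configs; values are illustrative assumptions.
max_executors = 200  # hypothetical per-job cap

scheme = {
    # 1. upfront allocation: start at the cap instead of ramping up
    "spark.dynamicAllocation.initialExecutors": str(max_executors),
    "spark.dynamicAllocation.maxExecutors": str(max_executors),
    # 2. near-zero idle time (0 itself is currently rejected)
    "spark.dynamicAllocation.executorIdleTimeout": "1s",
    # 3. smaller containers
    "spark.executor.cores": "1",
}
```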







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-15 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292594#comment-16292594
 ] 

Thomas Graves commented on SPARK-22765:
---

I'm not sure how MR style and 4-core executors go together. Either way, if 
you are allocated executors with 4 cores and only assign 4 tasks to each 
(MR style), 3 of those tasks might finish quickly while one lasts much 
longer, and you waste resources.

How does MR style solve scheduler inefficiencies? One way or another you have 
to schedule tasks onto the containers you get. That seems like a scheduler 
issue, not a container allocation issue, and it would apply to both schemes, 
unless you happen to get lucky in that the scheduler is not as busy up front, 
but that is going to depend on when you get containers. If you want to 
investigate improving this, that would be great.

I'm not really sure what this proposal does that you can't do now (with 
SPARK-21656), other than kill an executor after it runs X tasks rather than 
waiting for it to idle. That might be useful, but it might also hurt you, 
especially in busy clusters where you might only get a percentage of your 
executors up front. Let me know if I'm missing something in your proposal, 
though.
Note that we saw a significant increase in cluster utilization (meaning we 
were making better use of resources, not wasting them) when we moved to Pig 
on Tez with container reuse vs. single-container MR style. Some of this was 
due to not having to run multiple jobs; some was the container reuse. This 
was across a cluster with mixed workloads, though. I would expect container 
reuse on Spark to do the same, but there are obviously specialized workloads.

Let me know what you find with SPARK-21656, and if it's not sufficient, 
please add more specifics (a design) on what you propose to change.







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292048#comment-16292048
 ] 

Xuefu Zhang commented on SPARK-22765:
-

[~tgraves], I think it would help if SPARK-21656 can make a close-to-zero 
idle time work. This is one source of inefficiency. Our version is too old to 
backport the fix, but we will try it out when we upgrade.

The second source of inefficiency is that Spark favors bigger containers. A 
4-core container might be running one task while wasting the other 
cores/memory, and the executor cannot die as long as one task is running. One 
might argue that a user could configure 1-core containers under dynamic 
allocation, but that is probably suboptimal in other respects.

The third reason one might favor MR-styled scheduling is its simplicity and 
efficiency. We frequently found that under heavy workload the scheduler 
cannot really keep up with the task churn, especially when tasks finish fast.

For cost-conscious users, cluster-level resource efficiency is probably what 
matters. My suspicion is that an enhanced MR-styled scheduling scheme, simple 
and performant, would significantly improve resource efficiency over a 
typical use of dynamic allocation, without sacrificing much performance.

As a starting point, we will first benchmark with SPARK-21656 when possible.







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-13 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289793#comment-16289793
 ] 

Thomas Graves commented on SPARK-22765:
---

OK, so before you do anything else, I would suggest trying a Spark version 
with SPARK-21656 (or backporting it) and then using a small idle timeout, to 
see if that meets your needs.

I assume that even with a new feature you would have to configure it 
differently for different types of jobs, so I don't see how that would be any 
different from setting the idle timeout per job.







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289726#comment-16289726
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Yes, we are using Hive on Spark. Our Spark version is 1.6.1, which is old; 
obviously it doesn't have the fix in SPARK-21656.

As commented in SPARK-22683, for each individual query our comparison was 
made between all the MR jobs and all the Spark jobs (usually just one).







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-13 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289700#comment-16289700
 ] 

Thomas Graves commented on SPARK-22765:
---

OK, so basically executors idle out while the DAG is being computed, or the 
scheduler is not fast enough to deploy tasks. What version of Spark are you 
using? We recently made a change to dynamic allocation so that it won't 
idle-timeout executors when there are tasks to run on them:
https://issues.apache.org/jira/browse/SPARK-21656

Are you using Hive with Spark? Did you compare the resource utilization of 
the Spark job against the multiple MR jobs that get run for a single query?







[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289676#comment-16289676
 ] 

Xuefu Zhang commented on SPARK-22765:
-

Hi [~tgraves], thanks for your input.

In our busy, heavily loaded cluster environment, we have found that any idle 
time less than 60s is a problem. 30s works for small jobs but starts causing 
problems for bigger ones. The symptom is that newly allocated executors are 
idled out before completing a single task! I suspect this is caused by a busy 
scheduler. As a result, we have to keep 60s as a minimum.

Having said that, I'm not against container reuse, and I used the word 
"enhanced" to mean improving on MR scheduling. Reuse is good, but in my 
opinion the speculative aspect of dynamic allocation works against 
efficiency: you set an idle time just in case a new task arrives within that 
window, and when it doesn't, you waste the executor for a minute. (This is 
good for performance.) Please note that this happens a lot at the end of each 
stage, because no tasks from the next stage are scheduled until the current 
stage finishes.
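A back-of-the-envelope calculation of that end-of-stage waste (the numbers below are hypothetical, and this is a worst-case model):

```python
def end_of_stage_waste(executors: int, idle_timeout_s: int, stages: int) -> int:
    """Worst-case executor-seconds spent idle at stage boundaries:
    each executor may sit idle for the full timeout at the end of each stage."""
    return executors * idle_timeout_s * stages

# Hypothetical job: 100 executors, 60s idle timeout, 5 stages.
print(end_of_stage_waste(100, 60, 5))  # 30000 executor-seconds (~8.3 executor-hours)
```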

If we can remove the speculative aspect of the scheduling, efficiency should 
improve significantly with some compromise on performance. That would be a 
good starting point for my proposed enhanced MR-style scheduling, which is 
open to many other possible improvements.








[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-13 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289613#comment-16289613
 ] 

Thomas Graves commented on SPARK-22765:
---

Why doesn't a very small idle timeout (< 5 seconds) work? If there is no work 
to be done, the executor should exit soon after its task finishes, similar to 
MR. Note that Tez also added container reuse, which is similar to Spark's 
scheme.

Basically, I think you are proposing a change that essentially adds a config 
to not reuse containers.
I'm not sure I agree that this will help resource utilization, especially 
when looking at the whole ecosystem. Without reuse you have to go back to 
YARN to ask for more containers, which, depending on cluster usage, could add 
significant overhead waiting for them. You are bringing up and killing 
processes, re-downloading things into the distributed cache, etc. So in 
vcore/memory-seconds on YARN it might be better, but it affects other things 
as well. Just something to keep in mind.

Are you seeing this on very short-running tasks?




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-12 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288414#comment-16288414
 ] 

Xuefu Zhang commented on SPARK-22765:
-

I wouldn't say that MR is static, at least not static in Spark's sense. MR 
allocates an executor for each map or reduce task, and the executor exits when 
the running task completes. This avoids the inefficiency in dynamic allocation 
where an executor has to slowly idle out (an idle timeout of 0 doesn't really 
work). Secondly, this proposal doesn't really intend to go back to the MR 
paradigm. Instead, I propose a scheduling scheme similar to MR but enhanced to 
fit Spark's DAG execution model.

To be clear, the proposal here is not to replace dynamic allocation. Rather, it 
provides an alternative that's more efficiency-centric than dynamic allocation. 
I understand a lot of details are lacking, but I'd like to start a discussion 
now, and hopefully something concrete will come out soon.
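
The cost difference between exiting after each task (MR-style) and keeping an 
executor alive through an idle timeout can be sketched with a toy 
back-of-the-envelope model. This is illustrative only: the function names, the 
per-executor startup cost, and the timings are made-up assumptions, not Spark 
behavior or Spark code.

```python
# Toy model (illustrative, not Spark code): executor-seconds consumed by
# per-task allocation (MR-style) vs. a single reused executor with an
# idle timeout, for a sequence of tasks separated by gaps.

def per_task_cost(durations, startup=2.0):
    """MR-style: one fresh executor per task, paying startup each time."""
    return sum(d + startup for d in durations)

def reuse_cost(durations, gaps, idle_timeout=60.0, startup=2.0):
    """One reused executor: pays startup once, idles between tasks (up to
    idle_timeout per gap), and exits only after a final idle_timeout."""
    busy = sum(durations)
    idle = sum(min(g, idle_timeout) for g in gaps) + idle_timeout
    return startup + busy + idle

durations = [5.0, 5.0, 5.0]   # three short tasks
gaps = [30.0, 30.0]           # seconds between consecutive tasks

print(per_task_cost(durations))     # 21.0 executor-seconds
print(reuse_cost(durations, gaps))  # 137.0 executor-seconds
```

With sparse short tasks the idle time dominates the reused executor's bill, 
which is the inefficiency described above; with back-to-back tasks the repeated 
startup cost would tip the comparison the other way, which is Thomas's point 
about container reuse.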



[jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR

2017-12-12 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288353#comment-16288353
 ] 

Sean Owen commented on SPARK-22765:
---

I'm not clear why this needs another allocation scheme. You say dynamic 
allocation has overhead at runtime -- yes -- and M/R doesn't because it's 
static. So why not disable dynamic allocation? Things you're identifying as 
"problems" are just because Spark is a generalization; you can write a bunch of 
independent 2-stage map-reduce jobs if you want. Killing idle executors is the 
point of dynamic allocation, not a problem.

I don't see any detail on how this differs from anything else in Spark. 
