[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-25 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490979#comment-16490979
 ] 

Eric Yang commented on YARN-8255:
-

[~suma.shivaprasad] Doesn't restart_policy=ON_FAILURE already cover the Spark 
use case of starting more executors based on workload demand?

> Allow option to disable flex for a service component 
> -
>
> Key: YARN-8255
> URL: https://issues.apache.org/jira/browse/YARN-8255
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
>
> YARN-8080 implements restart capabilities for service component instances. 
> YARN service components should add an option to disallow flexing, to support 
> workloads that are essentially batch/iterative jobs which terminate with 
> restart_policy=NEVER/ON_FAILURE. Flexing could be disabled by default for 
> components where restart_policy=NEVER/ON_FAILURE and enabled by default when 
> restart_policy=ALWAYS (which is the default restart_policy), unless explicitly 
> set in the service spec.
> The option could be exposed as part of the component spec as "allow_flexing". 
> cc [~billie.rinaldi] [~gsaha] [~eyang] [~csingh] [~wangda]
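
The defaulting described in the issue text above could look roughly like the following. This is only a sketch under assumptions: the class, enum, and method names are hypothetical and are not taken from the YARN service code, and the explicit "allow_flexing" value is modeled as an Optional.

```java
import java.util.Optional;

// Hypothetical sketch of the proposed defaulting; not the actual YARN service code.
final class AllowFlexingDefaultSketch {

  // The three restart_policy values named in the issue description.
  enum RestartPolicy { ALWAYS, ON_FAILURE, NEVER }

  // Default when the spec does not set allow_flexing:
  // enabled for ALWAYS, disabled for NEVER and ON_FAILURE.
  static boolean defaultAllowFlexing(RestartPolicy policy) {
    return policy == RestartPolicy.ALWAYS;
  }

  // An explicit allow_flexing value from the spec wins over the policy default.
  static boolean resolveAllowFlexing(RestartPolicy policy, Optional<Boolean> explicit) {
    return explicit.orElseGet(() -> defaultAllowFlexing(policy));
  }

  public static void main(String[] args) {
    for (RestartPolicy policy : RestartPolicy.values()) {
      System.out.println(policy + " default allow_flexing = "
          + resolveAllowFlexing(policy, Optional.empty()));
    }
    System.out.println("NEVER with explicit true = "
        + resolveAllowFlexing(RestartPolicy.NEVER, Optional.of(true)));
  }
}
```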



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-24 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490231#comment-16490231
 ] 

Suma Shivaprasad commented on YARN-8255:


One use case I can think of is Spark applications that have Dynamic Allocation 
enabled on the Spark driver, where executors need to flex up and down at the 
driver's discretion. There could be two components here, one for the Spark 
driver and one for the Spark executors, and the driver needs to flex the 
executor component instances up and down based on the workload at that point in 
time or an idle timeout. 

Thoughts [~eyang] [~leftnoteasy] [~billie.rinaldi]




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-08 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467635#comment-16467635
 ] 

Billie Rinaldi commented on YARN-8255:
--

I agree with Eric's suggestion as well.




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466857#comment-16466857
 ] 

Wangda Tan commented on YARN-8255:
--

[~eyang], 

Thanks for commenting. Your suggestion makes sense and has less dev/testing 
overhead. I think we can do as you suggested: allow flexing when restart_policy 
= ALWAYS / ON_FAILURE, and disallow flexing when restart_policy = NEVER.

We can add a separate allow_flexing flag to the spec once we see solid 
requirements from users.

[~suma.shivaprasad], does this make sense to you? Please feel free to share 
your opinions.




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466800#comment-16466800
 ] 

Eric Yang commented on YARN-8255:
-

[~leftnoteasy] Recomputable and expandable are intertwined; they are the same 
thing. At a conceptual level, teragen has no dependency on the input format: 
you can add more partitions to get more data generated. Hadoop's own 
implementation prevents this from happening, but that does not mean docker 
containers should be subject to the same initialization-time limitation. On 
the other hand, we must optimize the framework for general-purpose usage and 
avoid offering too many untested and unsupported options. I think it makes 
sense to reduce the flex options to 2 main types instead of offering all 6 
options.




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466462#comment-16466462
 ] 

Wangda Tan commented on YARN-8255:
--

Thanks [~suma.shivaprasad] for filing the JIRA, and [~eyang] / 
[~billie.rinaldi] for the suggestions.

I think service flexing is different from restart policy. As mentioned by 
[~eyang], restart policy = on_failure / always means some part of the job can 
be *recomputed*. *Recomputable* is different from *Expandable*. An example is 
MapReduce: the number of mappers and reducers is determined by the InputFormat, 
which is fixed before the job is launched. Allocating more mappers or reducers 
than pre-calculated while the job is running isn't helpful. Many computation 
frameworks follow this pattern, such as TensorFlow and OpenMPI.

Considering this, I would prefer what Suma suggested: allow the user to specify 
allow_flexing. Sometimes adding a new instance to a component could lead to 
task or even master failure because it is unexpected. I tend to agree with 
making allow_flexing=false by default, but I'm also fine with the opposite.
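
To illustrate keeping "recomputable" and "expandable" as separate properties, here is a minimal sketch in which a component carries its own allow_flexing flag independently of its restart policy. The Component class, its fields, and the flex method are hypothetical stand-ins for illustration, not the real service spec classes.

```java
// Hypothetical sketch: allow_flexing as an explicit per-component flag,
// independent of restart_policy. Not the actual YARN service code.
final class ExplicitAllowFlexingSketch {

  enum RestartPolicy { ALWAYS, ON_FAILURE, NEVER }

  // Hypothetical stand-in for a component entry in a service spec.
  static final class Component {
    final String name;
    final RestartPolicy restartPolicy;
    final boolean allowFlexing;
    long numberOfContainers;

    Component(String name, RestartPolicy restartPolicy,
              boolean allowFlexing, long numberOfContainers) {
      this.name = name;
      this.restartPolicy = restartPolicy;
      this.allowFlexing = allowFlexing;
      this.numberOfContainers = numberOfContainers;
    }
  }

  // Applies a flex request only if the component opted in to flexing.
  static void flex(Component component, long target) {
    if (target < 0) {
      throw new IllegalArgumentException("negative container count: " + target);
    }
    if (!component.allowFlexing) {
      throw new IllegalStateException(
          "flexing is disabled for component " + component.name);
    }
    component.numberOfContainers = target;
  }

  public static void main(String[] args) {
    // Recomputable (ON_FAILURE) but not expandable: task count is fixed up front.
    Component mapper = new Component("mapper", RestartPolicy.ON_FAILURE, false, 10);
    Component web = new Component("web", RestartPolicy.ALWAYS, true, 2);

    flex(web, 4);
    System.out.println(web.name + " containers: " + web.numberOfContainers);
    try {
      flex(mapper, 20);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```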




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466448#comment-16466448
 ] 

Eric Yang commented on YARN-8255:
-

Instead of introducing another field to enable or disable flex, we can 
determine whether the workload can perform a flex operation based on 
restart_policy.

When restart_policy=ON_FAILURE or ALWAYS, the data can be recomputed, or the 
process can resume from failure. The flex operation can be enabled.

When restart_policy=NEVER, the data is stateful and cannot be reprocessed 
(e.g., MapReduce writing to HBase without transactional properties). 
Containers of this type are not allowed to flex.

By this reasoning, it is possible to reduce the combinations that will be 
supported. This also implies that restart_policy=NEVER doesn't have to support 
upgrade.
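
A minimal sketch of this rule, deriving flex permission purely from restart_policy with no extra field. All names here (the class, the enum, validateFlex) are hypothetical and only illustrate the idea; this is not the actual YARN service implementation.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: flex permission derived from restart_policy alone.
final class RestartPolicyFlexSketch {

  enum RestartPolicy { ALWAYS, ON_FAILURE, NEVER }

  // Policies whose work can be recomputed or resumed, so flexing is permitted.
  private static final Set<RestartPolicy> FLEXIBLE =
      EnumSet.of(RestartPolicy.ALWAYS, RestartPolicy.ON_FAILURE);

  static boolean isFlexAllowed(RestartPolicy policy) {
    return FLEXIBLE.contains(policy);
  }

  // Returns the requested count if the flex is acceptable, otherwise throws.
  static long validateFlex(String component, RestartPolicy policy, long requested) {
    if (requested < 0) {
      throw new IllegalArgumentException("negative container count: " + requested);
    }
    if (!isFlexAllowed(policy)) {
      throw new IllegalStateException("component " + component
          + " has restart_policy=" + policy + " and cannot be flexed");
    }
    return requested;
  }

  public static void main(String[] args) {
    for (RestartPolicy policy : RestartPolicy.values()) {
      System.out.println(policy + " -> flex allowed: " + isFlexAllowed(policy));
    }
  }
}
```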




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466423#comment-16466423
 ] 

Billie Rinaldi commented on YARN-8255:
--

If people disagree with me and we do allow this to be configured, it should be 
through a configuration property read with YarnServiceConf.getBoolean rather 
than a new field. I'd prefer it to default to true (allowing flexing) for all 
types, but am flexible on its default for the NEVER type.
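
As a rough illustration of the configuration-property approach, the sketch below reads a boolean property that defaults to true (flexing allowed) when unset. It deliberately does not call YarnServiceConf; the Map-based getBoolean helper is a simplified stand-in, and the property name is hypothetical since none is proposed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a property-based switch; not the actual YARN service code.
final class FlexPropertySketch {

  // Hypothetical property name, invented for this example only.
  static final String ALLOW_FLEXING = "example.component.allow-flexing";

  // Simplified stand-in for a boolean property lookup:
  // returns the default when the property is not set.
  static boolean getBoolean(Map<String, String> props, String name, boolean defaultValue) {
    String value = props.get(name);
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }

  // Flexing is allowed unless the component configuration turns it off.
  static boolean isFlexAllowed(Map<String, String> componentProps) {
    return getBoolean(componentProps, ALLOW_FLEXING, true);
  }

  public static void main(String[] args) {
    Map<String, String> unset = new HashMap<>();
    Map<String, String> disabled = new HashMap<>();
    disabled.put(ALLOW_FLEXING, "false");

    System.out.println("unset    -> " + isFlexAllowed(unset));
    System.out.println("disabled -> " + isFlexAllowed(disabled));
  }
}
```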




[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466268#comment-16466268
 ] 

Billie Rinaldi commented on YARN-8255:
--

I'm not sure this configuration parameter is necessary. Only the launching user 
can flex the service, so this user should know whether flexing makes sense for 
components of their service. I am also not sure about having different defaults 
for different policies; it seems like this will be confusing and require 
complex documentation.
