[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490979#comment-16490979 ]

Eric Yang commented on YARN-8255:
---------------------------------

[~suma.shivaprasad] Restart_policy = ON_FAILURE covers the Spark use case of starting more executors based on workload demand, no?

> Allow option to disable flex for a service component
> ----------------------------------------------------
>
>                 Key: YARN-8255
>                 URL: https://issues.apache.org/jira/browse/YARN-8255
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>            Reporter: Suma Shivaprasad
>            Assignee: Suma Shivaprasad
>            Priority: Major
>
> YARN-8080 implements restart capabilities for service component instances.
> YARN service components should add an option to disallow flexing to support
> workloads which are essentially batch/iterative jobs which terminate with
> restart_policy=NEVER/ON_FAILURE. This could be disabled by default for
> components where restart_policy=NEVER/ON_FAILURE and enabled by default when
> restart_policy=ALWAYS (which is the default restart_policy) unless explicitly
> set at the service spec.
> The option could be exposed as part of the component spec as "allow_flexing".
> cc [~billie.rinaldi] [~gsaha] [~eyang] [~csingh] [~wangda]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
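The defaulting rule in the quoted issue description can be sketched roughly as follows. This is only an illustration in Python, not the YARN service code: the function name `effective_allow_flexing` and the `RestartPolicy` enum are hypothetical, and `allow_flexing` is the field name the description proposes, not an existing spec field. An explicit setting wins; otherwise flexing is on only for ALWAYS.

```python
from enum import Enum
from typing import Optional


class RestartPolicy(Enum):
    """Restart policies named in the issue description."""
    ALWAYS = "ALWAYS"
    ON_FAILURE = "ON_FAILURE"
    NEVER = "NEVER"


def effective_allow_flexing(
    restart_policy: RestartPolicy = RestartPolicy.ALWAYS,
    allow_flexing: Optional[bool] = None,
) -> bool:
    """Hypothetical helper: resolve whether a component may be flexed.

    An explicit allow_flexing value in the spec takes precedence. When it is
    absent, flexing defaults to enabled for ALWAYS (the default policy) and
    disabled for ON_FAILURE and NEVER, as the description proposes.
    """
    if allow_flexing is not None:
        return allow_flexing
    return restart_policy is RestartPolicy.ALWAYS
```

Under this sketch a batch component with restart_policy=NEVER is not flexible unless its spec sets allow_flexing to true explicitly.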
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490231#comment-16490231 ]

Suma Shivaprasad commented on YARN-8255:
----------------------------------------

One use case I can think of is Spark applications that have Dynamic Allocation enabled on the Spark driver, where executors need to flex up and down at the driver's discretion. There could be two components here, one for the Spark driver and one for the Spark executors, and the driver needs to flex the executor component instances up and down based on the workload at that point in time or on an idle timeout.

Thoughts, [~eyang] [~leftnoteasy] [~billie.rinaldi]?
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467635#comment-16467635 ]

Billie Rinaldi commented on YARN-8255:
--------------------------------------

I agree with Eric's suggestion as well.
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466857#comment-16466857 ]

Wangda Tan commented on YARN-8255:
----------------------------------

[~eyang], thanks for commenting. Your suggestion makes sense and has less dev/testing overhead. I think we can do as you suggested: allow flexing when restart_policy = ALWAYS / ON_FAILURE, and disallow flexing when restart_policy = NEVER. We can add a separate allow_flexing flag to the spec once we see solid requirements from users.

[~suma.shivaprasad], does this make sense to you? Please feel free to share your opinions.
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466800#comment-16466800 ]

Eric Yang commented on YARN-8255:
---------------------------------

[~leftnoteasy] Recomputable and expandable are intertwined. They are the same thing. At a conceptual level, teragen has no dependency on the input format. You can add more partitions to get more data generated. Hadoop's own implementation prevents this from happening, but that does not mean the same initialization-time limitation should be imposed on Docker containers.

On the other hand, we must optimize the framework for general-purpose usage and keep ourselves from offering too many untested and unsupported options. I think it makes sense to reduce the flex options to 2 main types instead of giving all 6 options.
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466462#comment-16466462 ]

Wangda Tan commented on YARN-8255:
----------------------------------

Thanks [~suma.shivaprasad] for filing the JIRA, and [~eyang] / [~billie.rinaldi] for the suggestions.

I think service flexing is different from restart policy. As mentioned by [~eyang], restart_policy = ON_FAILURE / ALWAYS means some part of the job can be *recomputed*. *Recomputable* is different from *expandable*. An example is MapReduce: the number of mappers and reducers is determined by the InputFormat, which is fixed before the job is launched. Allocating more mappers or reducers than were pre-calculated while the job is running isn't helpful. Many computation frameworks follow this pattern, such as TensorFlow and OpenMPI; adding tasks while the job is running isn't helpful.

Considering this, I would prefer what Suma suggested: allow the user to specify allow_flexing. Sometimes adding a new instance to a component could lead to task or even master failure because it is unexpected. I tend to agree with making allow_flexing=false the default, but I'm also fine with the opposite.
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466448#comment-16466448 ]

Eric Yang commented on YARN-8255:
---------------------------------

Instead of introducing another field to enable or disable flex, we can determine whether the workload can perform a flex operation based on restart_policy.

When restart_policy=ON_FAILURE or ALWAYS, the data can be recomputed or the process can resume from failure, so the flex operation can be enabled. When restart_policy=NEVER, the data is stateful and cannot be reprocessed (e.g., MapReduce writing to HBase without transactional guarantees). Containers of this type are not allowed to be flexed.

By this reasoning, it is possible to reduce the number of combinations that will be supported. This also implies that restart_policy=NEVER doesn't have to support upgrade.
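The rule proposed in this comment, deriving flex permission from restart_policy alone, could look something like the sketch below. It is a Python illustration under stated assumptions, not the YARN implementation: `Component`, `flex`, and `FlexNotAllowedError` are hypothetical names, and `number_of_containers` is used here simply as the instance count being changed.

```python
from dataclasses import dataclass

# Policies under which, per the comment, work can be recomputed or resumed.
FLEXIBLE_POLICIES = frozenset({"ALWAYS", "ON_FAILURE"})
KNOWN_POLICIES = FLEXIBLE_POLICIES | {"NEVER"}


class FlexNotAllowedError(Exception):
    """Raised when a flex request targets a component that must not be resized."""


@dataclass(frozen=True)
class Component:
    """Hypothetical, minimal stand-in for a service component."""
    name: str
    restart_policy: str
    number_of_containers: int


def flex(component: Component, target: int) -> Component:
    """Return the component resized to `target`, or raise if flexing is disallowed."""
    policy = component.restart_policy.upper()
    if policy not in KNOWN_POLICIES:
        raise ValueError(f"unknown restart_policy: {component.restart_policy!r}")
    if target < 0:
        raise ValueError("target container count must be non-negative")
    # A request that leaves the count unchanged is a no-op and is always accepted.
    if target != component.number_of_containers and policy not in FLEXIBLE_POLICIES:
        raise FlexNotAllowedError(
            f"component {component.name!r} has restart_policy=NEVER and cannot be flexed"
        )
    return Component(component.name, component.restart_policy, target)
```

The appeal of this approach is that no new spec field is needed; the cost, raised elsewhere in the thread, is that recomputable workloads are treated as expandable even when they are not.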
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466423#comment-16466423 ]

Billie Rinaldi commented on YARN-8255:
--------------------------------------

If people disagree with me and we do allow this to be configured, it should be through a configuration property read with YarnServiceConf.getBoolean rather than a new field. I'd prefer it to default to true (allowing flexing) for all types, but am flexible on its default for the NEVER type.
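As a rough picture of the property-based alternative, the sketch below mimics a boolean lookup with a default of true. It is a Python stand-in, not the Hadoop API: the property key is hypothetical, and the lookup order (per-service or per-component configuration first, then system configuration, then the default) is an assumption about how such a helper would behave rather than a description of YarnServiceConf.getBoolean itself.

```python
from typing import Mapping

# Hypothetical property name, for illustration only.
ALLOW_FLEXING_KEY = "yarn.service.component.allow-flexing"


def get_boolean(
    name: str,
    default: bool,
    user_conf: Mapping[str, str],
    system_conf: Mapping[str, str],
) -> bool:
    """Look up a boolean property, preferring user_conf over system_conf.

    Only the strings "true" and "false" (case-insensitive) are recognized;
    any other value is ignored and the search continues.
    """
    for conf in (user_conf, system_conf):
        raw = conf.get(name)
        if raw is None:
            continue
        value = raw.strip().lower()
        if value == "true":
            return True
        if value == "false":
            return False
    return default


def flexing_allowed(user_conf: Mapping[str, str], system_conf: Mapping[str, str]) -> bool:
    """Default to true (flexing allowed), as the comment above prefers."""
    return get_boolean(ALLOW_FLEXING_KEY, True, user_conf, system_conf)
```

With this shape a service owner can opt a single component out of flexing by setting the property to false in its configuration, without any change to the spec schema.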
[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466268#comment-16466268 ]

Billie Rinaldi commented on YARN-8255:
--------------------------------------

I'm not sure this configuration parameter is necessary. Only the launching user can flex the service, so this user should know whether flexing makes sense for components of their service. I am also not sure about having different defaults for different policies; it seems like this will be confusing and require complex documentation.