[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690682#comment-16690682 ]

ASF GitHub Bot commented on FLINK-10880:

tillrohrmann closed pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/_includes/generated/job_manager_configuration.html b/docs/_includes/generated/job_manager_configuration.html
index 0458af24c06..99eec1dcec7 100644
--- a/docs/_includes/generated/job_manager_configuration.html
+++ b/docs/_includes/generated/job_manager_configuration.html
@@ -17,11 +17,6 @@
 16
 The maximum number of prior execution attempts kept in history.
-
-jobmanager.execution.failover-strategy
-"full"
-This option specifies how the job computation recovers from task failures. Accepted values are:'full': Restarts all tasks.'individual': Restarts only the failed task. Should only be used if all tasks are independent components.'region': Restarts all tasks that could be affected by the task failure.
-
 jobmanager.heap.size
 "1024m"
diff --git a/docs/release-notes/flink-1.5.md b/docs/release-notes/flink-1.5.md
index 4cee5774cdf..ed5f2c2ad26 100644
--- a/docs/release-notes/flink-1.5.md
+++ b/docs/release-notes/flink-1.5.md
@@ -80,5 +80,13 @@ The Kinesis dependencies of Flink's Kinesis connector have been updated to the
 0.12.9
 ```
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details.
 {% top %}
diff --git a/docs/release-notes/flink-1.6.md b/docs/release-notes/flink-1.6.md
index 34cd6135511..7c22b3f4de6 100644
--- a/docs/release-notes/flink-1.6.md
+++ b/docs/release-notes/flink-1.6.md
@@ -29,6 +29,15 @@ The default value of the slot idle timeout `slot.idle.timeout` is set to the def
 ### Changed ElasticSearch 5.x Sink API
 Previous APIs in the Flink ElasticSearch 5.x Sink's `RequestIndexer` interface have been deprecated in favor of new signatures.
-When adding requests to the `RequestIndexer`, the requests now must be of type `IndexRequest`, `DeleteRequest`, or `UpdateRequest`, instead of the base `ActionRequest`.
+When adding requests to the `RequestIndexer`, the requests now must be of type `IndexRequest`, `DeleteRequest`, or `UpdateRequest`, instead of the base `ActionRequest`.
+
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details.
 {% top %}
diff --git a/docs/release-notes/flink-1.7.md b/docs/release-notes/flink-1.7.md
index f9e7425601a..8cdfe9d5400 100644
--- a/docs/release-notes/flink-1.7.md
+++ b/docs/release-notes/flink-1.7.md
@@ -30,4 +30,13 @@ Therefore, the module `flink-scala-shell` is not being released for Scala 2.12.
 See [FLINK-10911](https://issues.apache.org/jira/browse/FLINK-10911) for more details.
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details.
+
 {% top %}
diff --git
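
The release note in the diff above recommends removing `jobmanager.execution.failover-strategy` from `flink-conf.yaml` or setting it to `"full"`. A minimal sketch of how one might check a configuration file for this is shown below. It is not part of Flink: the class `FailoverStrategyCheck` and its methods are hypothetical, and the parser assumes flat `key: value` lines without inline comments. Treating an absent key as `full` follows the default shown in the removed documentation entry.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not a Flink class: scans flink-conf.yaml style lines.
class FailoverStrategyCheck {

    static final String KEY = "jobmanager.execution.failover-strategy";

    // Returns the configured strategy, if the key is present (last occurrence wins).
    static Optional<String> configuredStrategy(List<String> lines) {
        Optional<String> result = Optional.empty();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and full-line comments
            }
            int colon = line.indexOf(':');
            if (colon < 0 || !KEY.equals(line.substring(0, colon).trim())) {
                continue; // not a "key: value" line for the option of interest
            }
            String value = line.substring(colon + 1).trim();
            // Accept both full and "full" by stripping one pair of double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            result = Optional.of(value);
        }
        return result;
    }

    // True if the option is absent (default "full") or explicitly set to "full".
    static boolean usesFullRestart(List<String> lines) {
        return configuredStrategy(lines).map("full"::equals).orElse(true);
    }

    public static void main(String[] args) {
        List<String> conf = Arrays.asList(
            "jobmanager.heap.size: 1024m",
            "jobmanager.execution.failover-strategy: region");
        if (!usesFullRestart(conf)) {
            System.out.println("Non-default failover strategy configured; see FLINK-10880.");
        }
    }
}
```

In practice the lines would come from reading the actual `flink-conf.yaml`; the in-memory list here only keeps the sketch self-contained.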
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689673#comment-16689673 ]

ASF GitHub Bot commented on FLINK-10880:

tillrohrmann commented on a change in pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234281426

## File path: flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java ##
@@ -106,6 +106,7 @@
 /**
  * This option specifies the failover strategy, i.e. how the job computation recovers from task failures.
  */
+ @Documentation.ExcludeFromDocumentation

Review comment:
Good idea. Will change it.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Failover strategies should not be applied to Batch Execution
>
> Key: FLINK-10880
> URL: https://issues.apache.org/jira/browse/FLINK-10880
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.6.2
> Reporter: Stephan Ewen
> Assignee: Till Rohrmann
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.6.3, 1.7.0
>
> When configuring a failover strategy other than "full", DataSet/Batch
> execution is currently not correct.
> This is expected, the failover region strategy is an experimental WIP feature
> for streaming that has not been extended to the DataSet API.
> We need to document this and prevent execution of DataSet features with other
> failover strategies than "full".

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689671#comment-16689671 ]

ASF GitHub Bot commented on FLINK-10880:

tillrohrmann commented on a change in pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234281191

## File path: flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java ##
@@ -213,6 +213,12 @@ static void processConfigOptions(String rootDir, String module, String packageNa
 }
 }
+ private static boolean isValidConfigOption(Field field) {

Review comment:
Will split it up.
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689669#comment-16689669 ]

ASF GitHub Bot commented on FLINK-10880:

zentol commented on a change in pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234278176

## File path: flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java ##
@@ -106,6 +106,7 @@
 /**
  * This option specifies the failover strategy, i.e. how the job computation recovers from task failures.
  */
+ @Documentation.ExcludeFromDocumentation

Review comment:
Do we maybe want to include a reason for why this is hidden similar to deprecation notices?
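
One way the suggestion in this review comment could look is an exclusion annotation that carries an optional reason string, much like the text that accompanies a deprecation notice. The sketch below is only an illustration under that assumption: the annotation defined here is a local stand-in, not the actual `Documentation.ExcludeFromDocumentation` from the PR (whose definition is not shown in this thread), and `ExclusionReasonSketch`, `SampleOptions`, and `exclusionReason` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Optional;

class ExclusionReasonSketch {

    // Local stand-in for an exclusion marker; the optional value holds the reason.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ExcludeFromDocumentation {
        String value() default "";
    }

    // Illustrative holder class; plain strings stand in for config options.
    static final class SampleOptions {
        @ExcludeFromDocumentation("Experimental; not correct for DataSet/batch jobs, see FLINK-10880.")
        static final String FAILOVER_STRATEGY = "jobmanager.execution.failover-strategy";

        static final String DOCUMENTED = "example.documented.key"; // made-up key
    }

    // Returns the recorded reason if the named field is excluded, otherwise empty.
    static Optional<String> exclusionReason(Class<?> holder, String fieldName) {
        try {
            Field field = holder.getDeclaredField(fieldName);
            ExcludeFromDocumentation marker = field.getAnnotation(ExcludeFromDocumentation.class);
            return marker == null ? Optional.empty() : Optional.of(marker.value());
        } catch (NoSuchFieldException e) {
            return Optional.empty(); // unknown field: nothing to report
        }
    }

    public static void main(String[] args) {
        System.out.println(exclusionReason(SampleOptions.class, "FAILOVER_STRATEGY").orElse("(documented)"));
        System.out.println(exclusionReason(SampleOptions.class, "DOCUMENTED").orElse("(documented)"));
    }
}
```

Keeping the reason on the annotation places the explanation next to the hidden option in the source, so a reader of the options class does not need to consult the issue tracker to learn why the option is missing from the generated documentation.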
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689670#comment-16689670 ]

ASF GitHub Bot commented on FLINK-10880:

zentol commented on a change in pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234280027

## File path: flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java ##
@@ -213,6 +213,12 @@ static void processConfigOptions(String rootDir, String module, String packageNa
 }
 }
+ private static boolean isValidConfigOption(Field field) {

Review comment:
I'm not fond of the naming, as a deprecated option is still valid. Maybe split this into `isConfigOption` that does the `ConfigOption` check and `shouldBeDocumented` that does the deprecation/exclusion checks.
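
The split proposed in this review comment can be sketched as two small predicates: one that answers "is this field a config option at all?" and one that answers "should it appear in the generated docs?". The code below is a self-contained illustration, not the code from the PR: `Option` and `ExcludeFromDocumentation` are local stand-ins for Flink's `ConfigOption` and the new exclusion annotation, the option keys other than the failover-strategy key are made up, and `documentedFieldNames` is a hypothetical helper added only to exercise the predicates.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DocFilterSketch {

    // Stand-in for Flink's ConfigOption type (assumption for this sketch).
    static final class Option {
        final String key;

        Option(String key) {
            this.key = key;
        }
    }

    // Stand-in for the exclusion marker introduced by the PR.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ExcludeFromDocumentation {}

    // Illustrative options holder with one field of each kind.
    static final class SampleOptions {
        static final Option VISIBLE = new Option("example.visible");

        @ExcludeFromDocumentation
        static final Option FAILOVER_STRATEGY = new Option("jobmanager.execution.failover-strategy");

        @Deprecated
        static final Option OLD = new Option("example.old");

        static final String NOT_AN_OPTION = "ignored";
    }

    // Type check only: a deprecated or hidden option is still a config option.
    static boolean isConfigOption(Field field) {
        return field.getType().equals(Option.class);
    }

    // Documentation policy only: skip deprecated and explicitly excluded fields.
    static boolean shouldBeDocumented(Field field) {
        return field.getAnnotation(Deprecated.class) == null
            && field.getAnnotation(ExcludeFromDocumentation.class) == null;
    }

    // Names of the fields that pass both checks, sorted for a stable result.
    static List<String> documentedFieldNames(Class<?> holder) {
        List<String> names = new ArrayList<>();
        for (Field field : holder.getDeclaredFields()) {
            if (isConfigOption(field) && shouldBeDocumented(field)) {
                names.add(field.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(documentedFieldNames(SampleOptions.class)); // prints [VISIBLE]
    }
}
```

Separating the two checks addresses the naming concern directly: `isConfigOption` stays true for deprecated and hidden options, while the decision to omit them lives entirely in `shouldBeDocumented`.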
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689661#comment-16689661 ]

ASF GitHub Bot commented on FLINK-10880:

tillrohrmann opened a new pull request #7129: [FLINK-10880] Exclude JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129

## What is the purpose of the change

This commit excludes the `JobManagerOptions#EXECUTION_FAILOVER_STRATEGY` from Flink's configuration documentation.

## Brief change log

- Introduce `Documentation#ExcludeFromDocumentation`
- Exclude `JobManagerOptions#EXECUTION_FAILOVER_STRATEGY` from documentation

## Verifying this change

- Added `ConfigOptionsDocGeneratorTest#testConfigOptionExclusion`

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689606#comment-16689606 ]

TisonKun commented on FLINK-10880:

[~till.rohrmann] Alright, I agree with your quick fix and hope we can make the release ASAP :-)
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689594#comment-16689594 ]

Till Rohrmann commented on FLINK-10880:

I think one step is to exclude {{JobManagerOptions#EXECUTION_FAILOVER_STRATEGY}} from our documentation. This makes this feature effectively a hidden feature again. Moreover, I would like to include a release note that this feature is experimental to make people aware of the limitations.

Concerning preventing the combination of {{DataSet}} and other failover strategies than "full", I think this is not that easy to do. The failover strategy is configured for the cluster and we only instantiate it when we create the {{ExecutionGraph}}. At this stage, it is not that easy to distinguish whether the job is a streaming or batch job.

Since this is the last blocker issue for Flink 1.7.0 I'd like to take this issue over [~Tison]. I hope this is ok for you.

> Failover strategies should not be applied to Batch Execution
>
> Key: FLINK-10880
> URL: https://issues.apache.org/jira/browse/FLINK-10880
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.6.2
> Reporter: Stephan Ewen
> Assignee: TisonKun
> Priority: Blocker
> Fix For: 1.6.3, 1.7.0
>
> When configuring a failover strategy other than "full", DataSet/Batch
> execution is currently not correct.
> This is expected, the failover region strategy is an experimental WIP feature
> for streaming that has not been extended to the DataSet API.
> We need to document this and prevent execution of DataSet features with other
> failover strategies than "full".

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution
[ https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686519#comment-16686519 ]

TisonKun commented on FLINK-10880:

Hi [~StephanEwen], I'd like to reach consensus on where to document this. My suggestion is the option description at {{JobManagerOptions#EXECUTION_FAILOVER_STRATEGY}} and the Javadoc at {{RestartPipelinedRegionStrategy}}. What do you think? Are these enough?