[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690682#comment-16690682
 ] 

ASF GitHub Bot commented on FLINK-10880:


tillrohrmann closed pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/docs/_includes/generated/job_manager_configuration.html b/docs/_includes/generated/job_manager_configuration.html
index 0458af24c06..99eec1dcec7 100644
--- a/docs/_includes/generated/job_manager_configuration.html
+++ b/docs/_includes/generated/job_manager_configuration.html
@@ -17,11 +17,6 @@
 16
 The maximum number of prior execution attempts kept in history.
 
-
-jobmanager.execution.failover-strategy
-"full"
-This option specifies how the job computation recovers from task failures. Accepted values are: 'full': Restarts all tasks. 'individual': Restarts only the failed task. Should only be used if all tasks are independent components. 'region': Restarts all tasks that could be affected by the task failure.
-
 
 jobmanager.heap.size
 "1024m"
diff --git a/docs/release-notes/flink-1.5.md b/docs/release-notes/flink-1.5.md
index 4cee5774cdf..ed5f2c2ad26 100644
--- a/docs/release-notes/flink-1.5.md
+++ b/docs/release-notes/flink-1.5.md
@@ -80,5 +80,13 @@ The Kinesis dependencies of Flink’s Kinesis connector have been updated to the
 0.12.9
 ```
 
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details.
 
 {% top %}
diff --git a/docs/release-notes/flink-1.6.md b/docs/release-notes/flink-1.6.md
index 34cd6135511..7c22b3f4de6 100644
--- a/docs/release-notes/flink-1.6.md
+++ b/docs/release-notes/flink-1.6.md
@@ -29,6 +29,15 @@ The default value of the slot idle timeout `slot.idle.timeout` is set to the def
 ### Changed ElasticSearch 5.x Sink API
 
 Previous APIs in the Flink ElasticSearch 5.x Sink's `RequestIndexer` interface have been deprecated in favor of new signatures. 
-When adding requests to the `RequestIndexer`, the requests now must be of type `IndexRequest`, `DeleteRequest`, or `UpdateRequest`, instead of the base `ActionRequest`. 
+When adding requests to the `RequestIndexer`, the requests now must be of type `IndexRequest`, `DeleteRequest`, or `UpdateRequest`, instead of the base `ActionRequest`.
+
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details. 
 
 {% top %}
diff --git a/docs/release-notes/flink-1.7.md b/docs/release-notes/flink-1.7.md
index f9e7425601a..8cdfe9d5400 100644
--- a/docs/release-notes/flink-1.7.md
+++ b/docs/release-notes/flink-1.7.md
@@ -30,4 +30,13 @@ Therefore, the module `flink-scala-shell` is not being released for Scala 2.12.
 
 See [FLINK-10911](https://issues.apache.org/jira/browse/FLINK-10911) for more details.  
 
+
+### Limitations of failover strategies
+Flink's non-default failover strategies are still a very experimental feature which come with a set of limitations.
+You should only use this feature if you are executing a stateless streaming job.
+In any other cases, it is highly recommended to remove the config option `jobmanager.execution.failover-strategy` from your `flink-conf.yaml` or set it to `"full"`.
+
+In order to avoid future problems, this feature has been removed from the documentation until it will be fixed.
+See [FLINK-10880](https://issues.apache.org/jira/browse/FLINK-10880) for more details. 
+
 {% top %}
diff --git 

[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689673#comment-16689673
 ] 

ASF GitHub Bot commented on FLINK-10880:


tillrohrmann commented on a change in pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234281426
 
 

 ##
 File path: 
flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
 ##
 @@ -106,6 +106,7 @@
/**
 * This option specifies the failover strategy, i.e. how the job 
computation recovers from task failures.
 */
+   @Documentation.ExcludeFromDocumentation
 
 Review comment:
   Good idea. Will change it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Failover strategies should not be applied to Batch Execution
> 
>
> Key: FLINK-10880
> URL: https://issues.apache.org/jira/browse/FLINK-10880
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.2
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.6.3, 1.7.0
>
>
> When configuring a failover strategy other than "full", DataSet/Batch 
> execution is currently not correct.
> This is expected, the failover region strategy is an experimental WIP feature 
> for streaming that has not been extended to the DataSet API.
> We need to document this and prevent execution of DataSet features with other 
> failover strategies than "full".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689671#comment-16689671
 ] 

ASF GitHub Bot commented on FLINK-10880:


tillrohrmann commented on a change in pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234281191
 
 

 ##
 File path: 
flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
 ##
 @@ -213,6 +213,12 @@ static void processConfigOptions(String rootDir, String 
module, String packageNa
}
}
 
+   private static boolean isValidConfigOption(Field field) {
 
 Review comment:
   Will split it up.




[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689669#comment-16689669
 ] 

ASF GitHub Bot commented on FLINK-10880:


zentol commented on a change in pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234278176
 
 

 ##
 File path: 
flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
 ##
 @@ -106,6 +106,7 @@
/**
 * This option specifies the failover strategy, i.e. how the job 
computation recovers from task failures.
 */
+   @Documentation.ExcludeFromDocumentation
 
 Review comment:
   Do we maybe want to include a reason for why this is hidden, similar to deprecation notices?
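
A reason attribute of the kind suggested in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the nested `ExcludeFromDocumentation` annotation, its `value()` element, the `SampleOptions` holder and the `reasonFor` helper are hypothetical stand-ins written for this example, not Flink's actual `Documentation` class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class ExclusionReasonSketch {

    /** Hypothetical stand-in for the exclusion annotation, extended with an optional reason. */
    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface ExcludeFromDocumentation {
        /** Why the option is hidden; empty if no reason is given. */
        String value() default "";
    }

    /** Hypothetical options holder; the fields only mimic real option keys. */
    static final class SampleOptions {
        @ExcludeFromDocumentation("Experimental; see FLINK-10880.")
        static final String EXECUTION_FAILOVER_STRATEGY = "jobmanager.execution.failover-strategy";

        static final String HEAP_SIZE = "jobmanager.heap.size";
    }

    /** Returns the exclusion reason of the named field, or null if the field is not excluded. */
    static String reasonFor(String fieldName) {
        try {
            Field field = SampleOptions.class.getDeclaredField(fieldName);
            ExcludeFromDocumentation annotation = field.getAnnotation(ExcludeFromDocumentation.class);
            return annotation == null ? null : annotation.value();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reasonFor("EXECUTION_FAILOVER_STRATEGY"));
        System.out.println(reasonFor("HEAP_SIZE"));
    }
}
```

Keeping the reason on the annotation itself, much like the text of a deprecation notice, means anyone reading the options class sees why the option is hidden without having to look up the issue.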




[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689670#comment-16689670
 ] 

ASF GitHub Bot commented on FLINK-10880:


zentol commented on a change in pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129#discussion_r234280027
 
 

 ##
 File path: 
flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
 ##
 @@ -213,6 +213,12 @@ static void processConfigOptions(String rootDir, String 
module, String packageNa
}
}
 
+   private static boolean isValidConfigOption(Field field) {
 
 Review comment:
   I'm not fond of the naming as a deprecated option is still valid.
   
   Maybe split this into `isConfigOption` that does the `ConfigOption` check 
and `shouldBeDocumented` that does the deprecation/exclusion checks.
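
The split proposed in this comment can be sketched as follows. This is a minimal, self-contained illustration, not the code from the pull request: `ConfigOption`, `ExcludeFromDocumentation`, `SampleOptions`, `documentedKeys` and the key `sample.deprecated.key` are hypothetical stand-ins, and only the method names `isConfigOption` and `shouldBeDocumented` come from the review comment.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class DocFilterSketch {

    /** Hypothetical stand-in for Flink's ConfigOption type. */
    static final class ConfigOption {
        final String key;

        ConfigOption(String key) {
            this.key = key;
        }
    }

    /** Hypothetical stand-in for the exclusion annotation introduced by the PR. */
    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface ExcludeFromDocumentation {}

    /** Hypothetical options holder used only to exercise the two checks. */
    static final class SampleOptions {
        static final ConfigOption HEAP_SIZE = new ConfigOption("jobmanager.heap.size");

        @ExcludeFromDocumentation
        static final ConfigOption FAILOVER_STRATEGY =
            new ConfigOption("jobmanager.execution.failover-strategy");

        @Deprecated
        static final ConfigOption OLD_OPTION = new ConfigOption("sample.deprecated.key");

        static final String NOT_AN_OPTION = "plain constant";
    }

    /** Type check only: a deprecated or hidden option is still a config option. */
    static boolean isConfigOption(Field field) {
        return field.getType().equals(ConfigOption.class);
    }

    /** Documentation check: skip deprecated and explicitly excluded fields. */
    static boolean shouldBeDocumented(Field field) {
        return field.getAnnotation(Deprecated.class) == null
            && field.getAnnotation(ExcludeFromDocumentation.class) == null;
    }

    /** Collects the keys of all fields that pass both checks. */
    static List<String> documentedKeys(Class<?> optionsClass) {
        List<String> keys = new ArrayList<>();
        for (Field field : optionsClass.getDeclaredFields()) {
            if (isConfigOption(field) && shouldBeDocumented(field)) {
                try {
                    field.setAccessible(true);
                    keys.add(((ConfigOption) field.get(null)).key);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Only HEAP_SIZE passes both checks in this sample.
        System.out.println(documentedKeys(SampleOptions.class));
    }
}
```

With the two predicates separated, each name says what it checks: `isConfigOption` answers a question about the field's type, while `shouldBeDocumented` decides about visibility in the generated docs.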




[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689661#comment-16689661
 ] 

ASF GitHub Bot commented on FLINK-10880:


tillrohrmann opened a new pull request #7129: [FLINK-10880] Exclude 
JobManagerOptions#EXECUTION_FAILOVER_STRATEGY from documentation
URL: https://github.com/apache/flink/pull/7129
 
 
   ## What is the purpose of the change
   
   This commit excludes the `JobManagerOptions#EXECUTION_FAILOVER_STRATEGY` from Flink's configuration documentation.
   
   ## Brief change log
   
   - Introduce `Documentation#ExcludeFromDocumentation`
   - Exclude `JobManagerOptions#EXECUTION_FAILOVER_STRATEGY` from documentation
   
   ## Verifying this change
   
   - Added `ConfigOptionsDocGeneratorTest#testConfigOptionExclusion`
   
   ## Does this pull request potentially affect one of the following parts:
   
   - Dependencies (does it add or upgrade a dependency): (no)
   - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
   - The serializers: (no)
   - The runtime per-record code paths (performance sensitive): (no)
   - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
   - The S3 file system connector: (no)
   
   ## Documentation
   
   - Does this pull request introduce a new feature? (no)
   - If yes, how is the feature documented? (not applicable)
   




[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread TisonKun (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689606#comment-16689606
 ] 

TisonKun commented on FLINK-10880:
--

[~till.rohrmann] Alright, I agree with your quick fix and hope we can make the 
release ASAP :-)



[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-16 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689594#comment-16689594
 ] 

Till Rohrmann commented on FLINK-10880:
---

I think one step is to exclude 
{{JobManagerOptions#EXECUTION_FAILOVER_STRATEGY}} from our documentation. This 
makes this feature effectively a hidden feature again.

Moreover, I would like to include a release note that this feature is 
experimental to make people aware of the limitations.

Concerning preventing the combination of {{DataSet}} jobs with failover 
strategies other than "full", I think this is not that easy to do. The failover 
strategy is configured for the whole cluster, and we only instantiate it when we 
create the {{ExecutionGraph}}. At that stage, it is not easy to 
distinguish whether the job is a streaming or a batch job.
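
To illustrate the difficulty described in this paragraph, here is a minimal sketch of a strategy lookup that sees only the cluster configuration. Everything in it is an assumption made for illustration: `FailoverStrategySketch`, `Strategy` and `load` are hypothetical names, a plain `Map` stands in for Flink's configuration object, and rejecting unknown values is a choice made for this sketch rather than a description of Flink's behaviour. Only the key, the three accepted values and the default of "full" come from the documentation entry removed by the pull request.

```java
import java.util.HashMap;
import java.util.Map;

final class FailoverStrategySketch {

    /** Config key taken from the removed documentation entry. */
    static final String KEY = "jobmanager.execution.failover-strategy";

    /** The three accepted values listed in the removed documentation entry. */
    enum Strategy { FULL, INDIVIDUAL, REGION }

    /**
     * Hypothetical loader. Its only input is the cluster configuration, so
     * nothing here tells it whether the job is a streaming or a batch job.
     */
    static Strategy load(Map<String, String> clusterConfig) {
        String value = clusterConfig.getOrDefault(KEY, "full");
        switch (value) {
            case "full":
                return Strategy.FULL;
            case "individual":
                return Strategy.INDIVIDUAL;
            case "region":
                return Strategy.REGION;
            default:
                // Assumption of this sketch: unknown values are rejected.
                throw new IllegalArgumentException("Unknown failover strategy: " + value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(load(config)); // key not set, falls back to the default

        config.put(KEY, "region");
        System.out.println(load(config)); // applies to every job on the cluster
    }
}
```

Because the value applies to every job submitted to the cluster, a DataSet job would get the "region" strategy just as a streaming job would, unless some job-type information were passed in as well.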

Since this is the last blocker issue for Flink 1.7.0 I'd like to take this 
issue over [~Tison]. I hope this is ok for you.

> Failover strategies should not be applied to Batch Execution
> 
>
> Key: FLINK-10880
> URL: https://issues.apache.org/jira/browse/FLINK-10880
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.2
>Reporter: Stephan Ewen
>Assignee: TisonKun
>Priority: Blocker
> Fix For: 1.6.3, 1.7.0
>
>
> When configuring a failover strategy other than "full", DataSet/Batch 
> execution is currently not correct.
> This is expected, the failover region strategy is an experimental WIP feature 
> for streaming that has not been extended to the DataSet API.
> We need to document this and prevent execution of DataSet features with other 
> failover strategies than "full".





[jira] [Commented] (FLINK-10880) Failover strategies should not be applied to Batch Execution

2018-11-14 Thread TisonKun (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686519#comment-16686519
 ] 

TisonKun commented on FLINK-10880:
--

Hi [~StephanEwen], I'd like to reach consensus on where to document this.

In my opinion, it belongs in the option description of 
{{JobManagerOptions#EXECUTION_FAILOVER_STRATEGY}} and in the Javadoc of 
{{RestartPipelinedRegionStrategy}}. What do you think? Is that enough?
