[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4774


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-30 Thread NicoK
Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/4774#discussion_r147743853
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/akka/AkkaUtils.scala ---
@@ -257,6 +257,20 @@ object AkkaUtils {
 ConfigFactory.parseString(config)
   }
 
+  private def validateHeartbeat(pauseParamName: String,
+pauseValue: String,
+intervalParamName: String,
+intervalValue: String) = {
+if (Duration.apply(pauseValue).lteq(Duration.apply(intervalValue))) {
+  throw new IllegalConfigurationException(
+"%s [%s] must greater then %s [%s]",
--- End diff --

this should actually be "than", not "then"


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-09 Thread pnowojski
Github user pnowojski commented on a diff in the pull request:

https://github.com/apache/flink/pull/4774#discussion_r143416770
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/restart/FixedDelayRestartStrategy.java
 ---
@@ -82,7 +82,7 @@ public static FixedDelayRestartStrategyFactory 
createFactory(Configuration confi
int maxAttempts = 
configuration.getInteger(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_ATTEMPTS, 
1);
 
String timeoutString = configuration.getString(
-   AkkaOptions.WATCH_HEARTBEAT_INTERVAL);
+   AkkaOptions.WATCH_HEARTBEAT_PAUSE);
--- End diff --

I will fix the inconsistency in other way: fixing exception message instead 
of this getter.


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-09 Thread pnowojski
Github user pnowojski commented on a diff in the pull request:

https://github.com/apache/flink/pull/4774#discussion_r143416489
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/IllegalConfigurationException.java
 ---
@@ -42,6 +42,17 @@ public IllegalConfigurationException(String message) {
 
/**
 * Constructs an new IllegalConfigurationException with the given error 
message
+* format and arguments.
+*
+* @param format The error message format for the exception.
+* @param arguments The arguments for the format.
+*/
+   public IllegalConfigurationException(String format, Object... 
arguments) {
--- End diff --

Convenience - reduces boiler plate. I prefer this:
```
throw new IllegalConfigurationException("%s %s", foo, bar);
```
to that:
```
throw new IllegalConfigurationException(String.format("%s %s", foo, bar));
```
same as it is done in loggers.


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-06 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/4774#discussion_r143248194
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/restart/FixedDelayRestartStrategy.java
 ---
@@ -82,7 +82,7 @@ public static FixedDelayRestartStrategyFactory 
createFactory(Configuration confi
int maxAttempts = 
configuration.getInteger(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_ATTEMPTS, 
1);
 
String timeoutString = configuration.getString(
-   AkkaOptions.WATCH_HEARTBEAT_INTERVAL);
+   AkkaOptions.WATCH_HEARTBEAT_PAUSE);
--- End diff --

We cannot make this change, this introduces crazy delay on each recovery.


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-06 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/4774#discussion_r143246933
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/IllegalConfigurationException.java
 ---
@@ -42,6 +42,17 @@ public IllegalConfigurationException(String message) {
 
/**
 * Constructs an new IllegalConfigurationException with the given error 
message
+* format and arguments.
+*
+* @param format The error message format for the exception.
+* @param arguments The arguments for the format.
+*/
+   public IllegalConfigurationException(String format, Object... 
arguments) {
--- End diff --

Curious: Why introduce the extra constructor and not call 
`String.format(...)` where the exception is created?


---


[GitHub] flink pull request #4774: [FLINK-6495] Fix Akka's default value for heartbea...

2017-10-04 Thread pnowojski
GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4774

[FLINK-6495] Fix Akka's default value for heartbeat pause

## Brief change log

This PR consists of two hotfixes regarding akka's heartbeat pause. The 
critical one is reverting it's default value from 10s back to 60s (bug 
introduced by #3935)

## Verifying this change

This change is already covered by existing tests

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes (default config values)** / no)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink akka

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4774.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4774


commit 0f17a2998d343bc78bb01533a6fcf847d657c38c
Author: Piotr Nowojski 
Date:   2017-10-02T17:33:46Z

[hotfix][config] Revert heartbeat pause back to 60s

This fixes an important bug introduced by FLINK-6495. Heartbeat pause MUST 
be
significantly larger then heartbeat interval.

commit 07fd359be00c8de6c1473c6c5631b4bdeee0c586
Author: Piotr Nowojski 
Date:   2017-10-04T11:26:16Z

[hotfix][runtime] Fix default value for restart delay

1. Previously default value didn't match with an exception message being 
thrown.
2. HEARTBEAT_PAUSE is more sane default value for the delay, becauce we 
want to
wait long enough before restart, for actors to realize about previous crash.




---