GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/319
[FLINK-1415] Akka cleanups
This PR removes the akka.jobmanager.url config parameter, which was used to
override the JobManager address and port set in the configuration, and
replaces it with a non-exposed mechanism.
Adds timeout heuristics to deduce the different timeout values from a
single value if no other values are specified. This makes it easier to
configure the system.
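To illustrate the idea of deriving several timeouts from one value, here is a
minimal sketch. It is not the code from this PR: the class name, the config
keys, the default of 100 s and the scaling factors (1/10 and 6x) are all
assumptions made up for the example.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: keys, default and factors are illustrative assumptions,
// not the values used in the Flink code base.
final class TimeoutSettings {
    final Duration ask;
    final Duration heartbeatInterval;
    final Duration acceptableHeartbeatPause;

    private TimeoutSettings(Duration ask, Duration heartbeatInterval,
                            Duration acceptableHeartbeatPause) {
        this.ask = ask;
        this.heartbeatInterval = heartbeatInterval;
        this.acceptableHeartbeatPause = acceptableHeartbeatPause;
    }

    // Reads millisecond values; any value that is not set explicitly is
    // deduced from the value it depends on.
    static TimeoutSettings fromConfig(Map<String, Long> config) {
        Duration ask = Duration.ofMillis(config.getOrDefault("ask.timeout.ms", 100_000L));
        Duration heartbeat = lookup(config, "heartbeat.interval.ms")
                .orElse(ask.dividedBy(10));
        Duration pause = lookup(config, "heartbeat.pause.ms")
                .orElse(heartbeat.multipliedBy(6));
        return new TimeoutSettings(ask, heartbeat, pause);
    }

    // Returns the configured duration for the key, or empty if it is not set.
    private static Optional<Duration> lookup(Map<String, Long> config, String key) {
        return Optional.ofNullable(config.get(key)).map(Duration::ofMillis);
    }
}
```

With only the base value set to 50 s, this sketch yields a 5 s heartbeat
interval and a 30 s acceptable pause; an explicitly configured value always
takes precedence over the derived one.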
Adds test cases to check that the JobManager detects a failing TaskManager
using Akka's death watch. The test cases also check that the TaskManager
detects a failing JobManager and then tries to reconnect to it.
Removes the notifyExecutionStateChange method which gave access to the
internal actor state of the TaskManager. Replaced by sending a message to the
TaskManager. This removes the NullPointerExceptions which occurred when
shutting the system down.
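The difference between calling into an actor's state and sending it a message
can be sketched without Akka as follows. This is only an illustration of the
pattern: the class TaskManagerSketch, the message class and the
single-threaded "mailbox" are hypothetical stand-ins, not Flink's or Akka's
API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for an actor; all names are illustrative.
final class TaskManagerSketch {
    // Message that replaces the former direct method call.
    static final class UpdateTaskExecutionState {
        final String executionId;
        final String newState;

        UpdateTaskExecutionState(String executionId, String newState) {
            this.executionId = executionId;
            this.newState = newState;
        }
    }

    // Internal state: only ever touched on the mailbox thread.
    private final Map<String, String> executionStates = new HashMap<>();
    // A single thread processes messages one at a time, in submission order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Callers enqueue a message; they never access executionStates directly.
    void tell(UpdateTaskExecutionState msg) {
        mailbox.execute(() -> executionStates.put(msg.executionId, msg.newState));
    }

    // Queries also go through the mailbox, so they see all earlier messages.
    CompletableFuture<String> ask(String executionId) {
        return CompletableFuture.supplyAsync(
                () -> executionStates.get(executionId), mailbox);
    }

    // Stops accepting new messages; already queued ones are still processed.
    void shutdown() {
        mailbox.shutdown();
    }
}
```

Because every read and write of the state runs on the same thread, no outside
caller can observe or modify it while it is being torn down.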
Harmonizes the Scala code and adds comments to the important classes.
This PR is based on the PR #317, which is required by the refactoring of
the notifyExecutionStateChange method.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink akka_polishing
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/319.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #319
----
commit f5aac7f7753a396be35c879f55c850ea59b8be8d
Author: Till Rohrmann <[email protected]>
Date: 2015-01-12T09:58:45Z
[FLINK-1376] [runtime] Add proper shared slot release in case of a fatal
TaskManager failure.
Fixes concurrent modification exception of SharedSlot's subSlots field by
synchronizing all state changing operations through the associated assignment
group. Fixes a deadlock where Instance.markDead first acquires the instance
lock and then, by releasing the associated slots, the assignment group lock.
This can block against a direct releaseSlot call on a SharedSlot, which first
acquires the assignment group lock and then the instance lock in order to
return the slot to the instance.
Fixes colocation shared slot releasing. A colocation constraint is now
realized as a SharedSlot in a SharedSlot where the colocated tasks allocate sub
slots.
commit e27764579a644a0263f9d6c1841234a1c6e76c1c
Author: Till Rohrmann <[email protected]>
Date: 2015-01-06T10:15:30Z
Replace akka.jobmanager.url with a non-exposed mechanism. Add heuristics to
calculate different timeouts based on a single value.
commit 0784d1f272e99119824482f0c2333e821ec3d113
Author: Till Rohrmann <[email protected]>
Date: 2015-01-06T15:10:01Z
Harmonize Scala coding style: remove redundant braces and parentheses,
meaningless code statements, unnecessary semicolons, and unnecessary braces in
the import section; standardize access patterns; name boolean parameters
commit 5b19443264e5d130e0ae747ef1d3d6febb6a0348
Author: Till Rohrmann <[email protected]>
Date: 2015-01-08T14:41:11Z
Adds death watch test cases: Test if JobManager detects failing
TaskManager. Test if the TaskManager detects failing JobManager and tries to
reconnect to the JobManager.
commit 0acacb7abe3aa9ac530d387546844b3166a89c0f
Author: Till Rohrmann <[email protected]>
Date: 2015-01-08T15:55:11Z
Refactors notifyExecutionStateChange method to avoid access of the
TaskManager's internal state from outside
----
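The deadlock described in the FLINK-1376 commit above arises because two code
paths take the same two locks in opposite order. One general way to avoid
this is to make every path acquire the locks in a single fixed order. The
sketch below shows that technique only; it is not necessarily how the commit
resolves the problem, and the class, field and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of consistent lock ordering; not Flink's classes.
final class LockOrderingSketch {
    private final ReentrantLock groupLock = new ReentrantLock();
    private final ReentrantLock instanceLock = new ReentrantLock();
    private int freeSlots = 0;
    private boolean dead = false;

    // Returns a slot to the instance: group lock first, then instance lock.
    void releaseSlot() {
        groupLock.lock();
        try {
            instanceLock.lock();
            try {
                if (!dead) {
                    freeSlots++;
                }
            } finally {
                instanceLock.unlock();
            }
        } finally {
            groupLock.unlock();
        }
    }

    // Marks the instance dead, using the SAME order as releaseSlot, so the
    // two methods can never wait on each other in a cycle.
    void markDead() {
        groupLock.lock();
        try {
            instanceLock.lock();
            try {
                dead = true;
                freeSlots = 0;
            } finally {
                instanceLock.unlock();
            }
        } finally {
            groupLock.unlock();
        }
    }

    // Reads the counter under the instance lock.
    int freeSlots() {
        instanceLock.lock();
        try {
            return freeSlots;
        } finally {
            instanceLock.unlock();
        }
    }
}
```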