GitHub user yew1eb opened a pull request:
https://github.com/apache/flink/pull/4614
[FLINK-7547] AsyncFunction.scala extends Function, serialized
Fixes [FLINK-7547](https://issues.apache.org/jira/browse/FLINK-7547).
Details:
org.apache.flink.streaming.api.scala.async.AsyncFunction is not declared
Serializable, whereas
org.apache.flink.streaming.api.functions.async.AsyncFunction is. As a result,
the job fails to start because the async function can't be serialized during
initialization. This change makes the Scala AsyncFunction extend Function, so
that it is serializable as well.
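The failure mode can be reproduced outside Flink with plain Java serialization. The following is a minimal sketch, not the Flink code: `PlainAsyncFn`, `SerializableAsyncFn`, their implementations, and `SerializationCheck` are hypothetical stand-ins, assuming only that the function object is written with an `ObjectOutputStream` before the job is deployed.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a function trait that is NOT declared Serializable.
trait PlainAsyncFn[IN, OUT] {
  def asyncInvoke(input: IN, callback: OUT => Unit): Unit
}

// Hypothetical stand-in for the same trait once it extends Serializable.
trait SerializableAsyncFn[IN, OUT] extends Serializable {
  def asyncInvoke(input: IN, callback: OUT => Unit): Unit
}

class PlainImpl extends PlainAsyncFn[Int, String] {
  def asyncInvoke(input: Int, callback: String => Unit): Unit =
    callback(input.toString)
}

class SerializableImpl extends SerializableAsyncFn[Int, String] {
  def asyncInvoke(input: Int, callback: String => Unit): Unit =
    callback(input.toString)
}

object SerializationCheck {
  // Returns true if Java serialization accepts the object, false if it
  // rejects it with NotSerializableException.
  def canSerialize(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    println(canSerialize(new PlainImpl))        // false
    println(canSerialize(new SerializableImpl)) // true
  }
}
```

Declaring serializability on the trait rather than on each implementation means user-defined functions inherit it automatically, which matches how the Java AsyncFunction behaves.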
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yew1eb/flink FLINK-7547
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4614.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4614
----
commit 93edc636d5804e4a50a818cd60199d25be3f073e
Author: yew1eb <[email protected]>
Date: 2017-08-29T12:25:49Z
AsyncFunction.scala extends Function, serialized
----