GitHub user czhxmz opened a pull request: https://github.com/apache/flink/pull/5191
Release 1.4

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/flink release-1.4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5191.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #5191

----
commit 760d1a6bb75eb9519a4b93eb3cf34ad1605621da
Author: yew1eb <yew1eb@...>
Date:   2017-11-07T01:06:45Z

    [hotfix][docs] Add type for numLateRecordsDropped metric in docs

commit 07830e7897a42b5d12f0b33c42933c6ca78e70d3
Author: zentol <chesnay@...>
Date:   2017-11-07T11:16:04Z

    [hotfix][rat] Add missing rat exclusions

    Another set of RAT exclusions to prevent errors on Windows.

commit aab36f934548a5697c5c461b2a79c7cf3fd0d756
Author: kkloudas <kkloudas@...>
Date:   2017-11-06T11:43:18Z

    [FLINK-7823][QS] Update Queryable State configuration parameters.

commit 819995454611be6a85e2933318d053b2c25a18f7
Author: kkloudas <kkloudas@...>
Date:   2017-11-06T16:21:45Z

    [FLINK-7822][QS][doc] Update Queryable State docs.

commit 564c9934fd3aaba462a7415788b3d55486146f9b
Author: Aljoscha Krettek <aljoscha.krettek@...>
Date:   2017-11-07T17:27:16Z

    [hotfix] Use correct commit id in GenericWriteAheadSink.notifyCheckpoint

commit 3cbf467ebdf639df4d7d4da78b7bc2929aa4b5d9
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-11-06T13:03:16Z

    [hotfix][kafka] Extract TransactionalIdsGenerator class from FlinkKafkaProducer011

    This is pure refactor without any functional changes.

commit 460e27aeb5e246aff0f8137448441c315123608c
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-11-06T13:14:01Z

    [FLINK-7978][kafka] Ensure that transactional ids will never clash

    Previously transactional ids to use and to abort could clash between subtasks. This could lead to a race condition between initialization and writing the data, where one subtask is still initializing/aborting some transactional id while different subtask is already trying to write the data using the same transactional id.

commit b677c8d69b81fb3594798ba2761fdb7e2edea5db
Author: Fabian Hueske <fhueske@...>
Date:   2017-11-07T22:43:45Z

    [hotfix] [docs] Improve Supported Types section of Table API & SQL docs.

commit dc1ca78a4e4cb339e9fbf0c90700f3204e091c53
Author: Fabian Hueske <fhueske@...>
Date:   2017-11-07T23:12:49Z

    [hotfix] [docs] Fix UDTF join description in SQL docs.

commit 5af710080eb72d23d8d2f6a77d1825f3d8a009ae
Author: zentol <chesnay@...>
Date:   2017-11-07T10:40:15Z

    [FLINK-8004][metrics][docs] Fix usage examples

commit 49dc380697627189f6ac2e8bf5a084ac85c21ed5
Author: zentol <chesnay@...>
Date:   2017-11-07T14:36:49Z

    [FLINK-8010][build] Bump remaining flink-shaded versions

commit 17aae5af4a7973348067d5786cd4f16fc9da2639
Author: Tzu-Li (Gordon) Tai <tzulitai@...>
Date:   2017-11-07T11:35:33Z

    [FLINK-8001] [kafka] Prevent PeriodicWatermarkEmitter from violating IDLE status

    Prior to this commit, a bug exists such that if a Kafka consumer subtask initially marks itself as idle because it didn't have any partitions to subscribe to, that idleness status will be violated when the PeriodicWatermarkEmitter is fired. The problem is that the PeriodicWatermarkEmitter incorrectly yields a Long.MAX_VALUE watermark even when there are no partitions to subscribe to. This commit fixes this by additionally ensuring that the aggregated watermark in the PeriodicWatermarkEmitter is an effective one (i.e., is really aggregated from some partition).

commit f5a0b4bdfb623852cd5790223fd38732ff985de9
Author: zentol <chesnay@...>
Date:   2017-11-07T15:58:53Z

    [FLINK-8009][build][runtime] Remove transitive dependency promotion

    This closes #4972.

commit a126bd3e7d9614749f61692fbb53c5b284f17091
Author: Xpray <leonxpray@...>
Date:   2017-11-03T07:19:42Z

    [FLINK-7971] [table] Fix potential NPE in non-windowed aggregation.

    This closes #4941.

commit 8c60f97a43defc57bb1bfaabdd6081b329db53b8
Author: Rong Rong <rongr@...>
Date:   2017-10-31T18:05:38Z

    [FLINK-7922] [table] Fix FlinkTypeFactory.leastRestrictive for composite types.

    This closes #4929.

commit 1b20f70dea3fddfaeaf00ceae44e4dc0fcb4f47b
Author: Xingcan Cui <xingcanc@...>
Date:   2017-11-07T17:17:57Z

    [FLINK-7996] [table] Add support for (left.time = right.time) predicates to window join.

    This closes #4977.

commit 51657fc6deaf28115020db86d031d536b09bf384
Author: Fabian Hueske <fhueske@...>
Date:   2017-11-07T16:57:39Z

    [FLINK-8012] [table] Fix TableSink config for tables with time attributes.

    This closes #4974.

commit c7943291599260003304f003e89725352ae7d836
Author: Fabian Hueske <fhueske@...>
Date:   2017-11-06T20:22:35Z

    [FLINK-8002] [table] Fix join window boundary for LESS_THAN and GREATER_THAN predicates.

    This closes #4962.

commit 02a19a14fad1ef928038f4971bdcacf4d0642d88
Author: Dan Kelley <dan.kelley@...>
Date:   2017-11-08T01:27:44Z

    [FLINK-8017] Fix High availability cluster-id key in documentation

commit d302c652f1b52aac29ff6d09c817bd7f9e5e00e7
Author: Aljoscha Krettek <aljoscha.krettek@...>
Date:   2017-11-09T14:34:44Z

    [hotfix] Fix formatting in windowing documentation

commit 7df7fc457618d371b4c1f9623ac7fc2cab37cb1f
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-10-05T13:17:13Z

    [hotfix][build] Deduplicate maven-enforcer version

commit 005a871771ce73bef9c78ee04a61817fa9a31e99
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-11-07T11:13:59Z

    [FLINK-7765][build] Enable dependency convergence by default

    Disable it in most modules.

commit 2117eb77bb9d34da4288b5dd4455ef06c583ce7c
Author: gyao <gary@...>
Date:   2017-11-08T10:46:45Z

    [FLINK-8005] Set user-code class loader as context loader before snapshot

    During checkpointing, user code may dynamically load classes from the user code jar. This is a problem if the thread invoking the snapshot callbacks does not have the user code class loader set as its context class loader. This commit makes sure that the correct class loader is set.

commit 896f13da1d35fcae46600eb54a055bbfd6f6e8fc
Author: Michael Fong <mcfong.open@...>
Date:   2017-08-14T12:57:06Z

    [FLINK-4500] CassandraSinkBase implements CheckpointedFunction

    This closes #4605.

commit 5f992e8dec2b9349627385d4188a6975c619de9d
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-11-10T12:57:51Z

    [hotfix][build] Disable dependency convergence in flink-dist

    Previously mvn javadoc:aggregate goal was failing

commit da435f121821fd1107c41352a54ee804f10cf7e3
Author: Aljoscha Krettek <aljoscha.krettek@...>
Date:   2017-11-10T09:54:16Z

    [FLINK-6163] Document per-window state in ProcessWindowFunction

commit b3df579f0fd36f8b4a235a994caaaffe6f2b2a0d
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2017-11-10T14:15:11Z

    [hotfix][docs] Change mailing list link in quickstart to flink-user

    Previously it was pointing to flink-dev

commit 6aeac3fb77c1053e344d08c0cc68e84a88623a43
Author: Aljoscha Krettek <aljoscha.krettek@...>
Date:   2017-11-10T15:28:46Z

    [FLINK-7702] Remove Javadoc aggregation for Scala code

    genjavadoc generated some Java code that was making Javadoc fail.

commit 431ae36f787adcaac2e1071753d2dc2af299f528
Author: Aljoscha Krettek <aljoscha.krettek@...>
Date:   2017-11-10T17:13:26Z

    [FLINK-7702] Add maven-bundle-plugin to root pom

    Before, we had it in places that require it. This doesn't work when running mvn javadoc:aggregate because this will only run for the root pom and can then not find the "bundle" dependencies.

commit e2b92f22c2686f8d842d371a17c36c5d28f9b247
Author: Stefan Richter <s.richter@...>
Date:   2017-11-13T10:50:07Z

    [FLINK-8040] [tests] Fix test instability in ResourceGuardTest

    (cherry picked from commit ad8ef6d)

----

---