[jira] [Commented] (FLINK-8573) Print JobID for failed jobs
[ https://issues.apache.org/jira/browse/FLINK-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418520#comment-16418520 ] ASF GitHub Bot commented on FLINK-8573: --- Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/5421 Sup ? > Print JobID for failed jobs > --- > > Key: FLINK-8573 > URL: https://issues.apache.org/jira/browse/FLINK-8573 > Project: Flink > Issue Type: Improvement > Components: Client >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang >Priority: Major > > When a job is successfully run, the client prints something along the lines > of "Job with successfully run". If the job fails, however, we only > print the exception but not the JobID. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5421: [FLINK-8573] [client] Add more information for printing J...
Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/5421 Sup ? ---
[jira] [Closed] (FLINK-8908) MapSerializer creates new serializer even if key and value serializers are stateless
[ https://issues.apache.org/jira/browse/FLINK-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas closed FLINK-8908. - Resolution: Fixed Fixed on master with eea887b7d9a3cd43416feca568c9815d8362e8d4 and on release-1.5 with d4a2bc545a1e993130479813d4194727a56d86f9 > MapSerializer creates new serializer even if key and value serializers are > stateless > > > Key: FLINK-8908 > URL: https://issues.apache.org/jira/browse/FLINK-8908 > Project: Flink > Issue Type: Bug > Components: Type Serialization System >Affects Versions: 1.5.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas >Priority: Major > Fix For: 1.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9097) Jobs can be dropped in HA when job submission fails
[ https://issues.apache.org/jira/browse/FLINK-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419265#comment-16419265 ] ASF GitHub Bot commented on FLINK-9097: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5774 > Jobs can be dropped in HA when job submission fails > --- > > Key: FLINK-9097 > URL: https://issues.apache.org/jira/browse/FLINK-9097 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > Jobs can be dropped in HA mode if the job submission step fails. In such a > case, we should fail fatally to let the {{Dispatcher}} restart and retry to > recover all jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-9097) Jobs can be dropped in HA when job submission fails
[ https://issues.apache.org/jira/browse/FLINK-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-9097. Resolution: Fixed Fixed via master: cc190a6457cdf6186ea8e449f7b38be04761af8b 1.5.0: 74f4d55be1edbb2fcbd25bff89de3d6ee790fea5 > Jobs can be dropped in HA when job submission fails > --- > > Key: FLINK-9097 > URL: https://issues.apache.org/jira/browse/FLINK-9097 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > Jobs can be dropped in HA mode if the job submission step fails. In such a > case, we should fail fatally to let the {{Dispatcher}} restart and retry to > recover all jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-8928) Improve error message on server binding error.
[ https://issues.apache.org/jira/browse/FLINK-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas closed FLINK-8928. - Resolution: Fixed Fixed on master with 6a8172a9c654589faa01f1fccb2dec5e008fe532 and on release-1.5 with c342487466abc70447550e05e2abe30e9f01e368 > Improve error message on server binding error. > -- > > Key: FLINK-8928 > URL: https://issues.apache.org/jira/browse/FLINK-8928 > Project: Flink > Issue Type: Bug > Components: Queryable State >Affects Versions: 1.5.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas >Priority: Major > Fix For: 1.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5790: [FLINK-9107][docs] document timer coalescing for ProcessF...
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/5790 Actually, more advanced schemes using `current watermark + 1` (which fires with the next watermark) for the event time timer should also go into the documentation. I'll extend the PR ... ---
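The `current watermark + 1` scheme mentioned above can be illustrated without Flink; below is a minimal sketch with a toy timer service (the class and method names are illustrative, not Flink's API) showing that all timers registered between two watermarks collapse into one firing:

```java
import java.util.TreeSet;

// Toy event-time timer service (NOT Flink's API) to show why registering a
// timer at `current watermark + 1` coalesces: timers live in a set, so all
// events arriving between two watermarks collapse into a single timer that
// fires with the next watermark.
public class WatermarkPlusOne {
    final TreeSet<Long> timers = new TreeSet<>();
    long currentWatermark = Long.MIN_VALUE;

    void registerEventTimeTimer(long t) {
        timers.add(t); // set semantics: duplicate registrations collapse
    }

    // Advance the watermark, fire every timer <= the new watermark, and
    // return how many timers fired.
    int advanceWatermark(long wm) {
        currentWatermark = wm;
        int fired = 0;
        while (!timers.isEmpty() && timers.first() <= wm) {
            timers.pollFirst();
            fired++;
        }
        return fired;
    }

    public static void main(String[] args) {
        WatermarkPlusOne svc = new WatermarkPlusOne();
        svc.advanceWatermark(1000);
        // 100 events each register a timer at currentWatermark + 1 ...
        for (int i = 0; i < 100; i++) {
            svc.registerEventTimeTimer(svc.currentWatermark + 1);
        }
        // ... but only one timer exists, and it fires with the next watermark.
        System.out.println(svc.advanceWatermark(2000)); // prints 1
    }
}
```

In a real `ProcessFunction` the same effect comes from `ctx.timerService().registerEventTimeTimer(ctx.timerService().currentWatermark() + 1)`, since Flink's timer service also deduplicates identical timestamps per key.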
[GitHub] flink pull request #5784: [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5784 ---
[jira] [Commented] (FLINK-9111) SSL Passwords written to log file in plain text
[ https://issues.apache.org/jira/browse/FLINK-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419474#comment-16419474 ] Vinay commented on FLINK-9111: -- [~aljoscha] that's right. Sorry, I thought this issue had not been reported earlier. > SSL Passwords written to log file in plain text > --- > > Key: FLINK-9111 > URL: https://issues.apache.org/jira/browse/FLINK-9111 > Project: Flink > Issue Type: Bug > Components: Security >Affects Versions: 1.4.2 >Reporter: Vinay >Priority: Major > > Hi, > The SSL passwords are written to the log file in plain text. > They should either be masked or not included in the logs. > The GlobalConfiguration file prints all keys and values in the loadYAMLResource > method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9113) Data loss in BucketingSink when writing to local filesystem
Timo Walther created FLINK-9113: --- Summary: Data loss in BucketingSink when writing to local filesystem Key: FLINK-9113 URL: https://issues.apache.org/jira/browse/FLINK-9113 Project: Flink Issue Type: Bug Components: Streaming Connectors Reporter: Timo Walther Assignee: Timo Walther This issue is closely related to FLINK-7737. By default the bucketing sink uses HDFS's {{org.apache.hadoop.fs.FSDataOutputStream#hflush}} for performance reasons. However, this leads to data loss in case of TaskManager failures when writing to a local filesystem {{org.apache.hadoop.fs.LocalFileSystem}}. We should use {{hsync}} by default in local filesystem cases and make it possible to disable this behavior if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-9012) Shaded Hadoop S3A end-to-end test failing with S3 bucket in Ireland
[ https://issues.apache.org/jira/browse/FLINK-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber closed FLINK-9012. -- Resolution: Cannot Reproduce Fix Version/s: (was: 1.5.0) Yes, I now tested locally and it was running fine with the same S3 settings. Maybe a temporary glitch with S3. > Shaded Hadoop S3A end-to-end test failing with S3 bucket in Ireland > --- > > Key: FLINK-9012 > URL: https://issues.apache.org/jira/browse/FLINK-9012 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.5.0 >Reporter: Nico Kruber >Assignee: Nico Kruber >Priority: Critical > Labels: test-stability > > https://api.travis-ci.org/v3/job/354259892/log.txt > {code} > Found AWS bucket [secure], running Shaded Hadoop S3A e2e tests. > Flink dist directory: /home/travis/build/NicoK/flink/build-target > TEST_DATA_DIR: > /home/travis/build/NicoK/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-05775180416 > % Total% Received % Xferd Average Speed TimeTime Time > Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 91 4930 4490 0 2476 0 --:--:-- --:--:-- --:--:-- 2467 > > TemporaryRedirectPlease re-send this request to > the specified temporary endpoint. Continue to use the original request > endpoint for future > requests.[secure][secure].s3-eu-west-1.amazonaws.com1FCEC82C3EBF7C7ENG5dxnXQ0Y5mK2X/m3bU+Z7Fqt0mNVL2JsjyVjGZUmpDmNuBDfKJACh7VI6tCTYEZsw65W057lA=Starting > cluster. > Starting standalonesession daemon on host > travis-job-087822e3-2f4c-46b7-b9bd-b6d4aba6136c. > Starting taskexecutor daemon on host > travis-job-087822e3-2f4c-46b7-b9bd-b6d4aba6136c. > Dispatcher/TaskManagers are not yet up > Waiting for dispatcher REST endpoint to come up... > Dispatcher/TaskManagers are not yet up > Waiting for dispatcher REST endpoint to come up... > Dispatcher/TaskManagers are not yet up > Waiting for dispatcher REST endpoint to come up... 
> Dispatcher/TaskManagers are not yet up > Waiting for dispatcher REST endpoint to come up... > Dispatcher/TaskManagers are not yet up > Waiting for dispatcher REST endpoint to come up... > Waiting for dispatcher REST endpoint to come up... > Dispatcher REST endpoint is up. > Starting execution of program > > The program finished with the following exception: > org.apache.flink.client.program.ProgramInvocationException: Could not > retrieve the execution result. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:246) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:458) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:446) > at > org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62) > at > org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:86) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:398) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:780) > at > org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:274) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:209) > at > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1019) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1095) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1095) > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. >
[jira] [Commented] (FLINK-9107) Document timer coalescing for ProcessFunctions
[ https://issues.apache.org/jira/browse/FLINK-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419302#comment-16419302 ] ASF GitHub Bot commented on FLINK-9107: --- Github user NicoK commented on the issue: https://github.com/apache/flink/pull/5790 Actually, more advanced schemes using `current watermark + 1` (which fires with the next watermark) for the event time timer should also go into the documentation. I'll extend the PR ... > Document timer coalescing for ProcessFunctions > -- > > Key: FLINK-9107 > URL: https://issues.apache.org/jira/browse/FLINK-9107 > Project: Flink > Issue Type: Improvement > Components: Documentation, Streaming >Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0 >Reporter: Nico Kruber >Assignee: Nico Kruber >Priority: Major > Fix For: 1.5.0, 1.4.3, 1.3.4 > > > In a {{ProcessFunction}}, registering timers for each event via > {{ctx.timerService().registerEventTimeTimer()}} using times like > {{ctx.timestamp() + timeout}} will get a millisecond accuracy and may thus > create one timer per millisecond which may lead to some overhead in the > {{TimerService}}. > This problem can be mitigated by using timer coalescing if the desired > accuracy of the timer can be larger than 1ms. A timer firing at full seconds > only, for example, can be realised like this: > {code} > coalescedTime = ((ctx.timestamp() + timeout) / 1000) * 1000; > ctx.timerService().registerEventTimeTimer(coalescedTime); > {code} > As a result, only a single timer may exist for every second since we do not > add timers for timestamps that are already there. > This should be documented in the {{ProcessFunction}} docs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
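The coalescing arithmetic quoted in the {code} block above can be checked standalone; a minimal sketch follows (the helper name `coalesceToFullSecond` is illustrative, not a Flink method):

```java
// Standalone check of the timer-coalescing arithmetic from the snippet above.
public class TimerCoalescing {
    // Round the timer target down to the previous full second, so every event
    // whose (timestamp + timeout) falls within the same second registers the
    // same timer, and the TimerService stores it only once.
    static long coalesceToFullSecond(long timestamp, long timeout) {
        return ((timestamp + timeout) / 1000) * 1000;
    }

    public static void main(String[] args) {
        long timeout = 5_000L;
        // Three events a few hundred milliseconds apart map to one timer time.
        System.out.println(coalesceToFullSecond(1_522_000_123L, timeout)); // 1522005000
        System.out.println(coalesceToFullSecond(1_522_000_456L, timeout)); // 1522005000
        System.out.println(coalesceToFullSecond(1_522_000_789L, timeout)); // 1522005000
    }
}
```

Without the rounding, those three events would register three distinct millisecond-accurate timers; with it, the timer service keeps a single entry per second.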
[jira] [Commented] (FLINK-9080) Flink Scheduler goes OOM, suspecting a memory leak
[ https://issues.apache.org/jira/browse/FLINK-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419305#comment-16419305 ] Rohit Singh commented on FLINK-9080: Hi Till, Tried Flink 1.4.2, but not 1.5.0. Posted this on Stack Overflow: https://stackoverflow.com/questions/49530333/getting-following-class-cast-exception-while-adding-job-jar-to-flink-home-lib Based on that, found out that our job contains compile group: 'org.apache.commons', name: 'commons-collections4', version: '4.1' as a dependency, while Flink uses compile group: 'commons-collections', name: 'commons-collections', version: '3.2.2', i.e. version 3.2.2. Removed the dependency and used the same dependency which Flink uses, but was still getting the same error. I can try out the 1.5.0 release branch fix and share the results; is there any fix targeted around this issue? Also, in the long term, is there any plan to avoid dynamic class loading in Mesos, or any other workaround to overcome the issue apart from adding the jar to the Flink lib? Please let me know your thoughts on this. > Flink Scheduler goes OOM, suspecting a memory leak > -- > > Key: FLINK-9080 > URL: https://issues.apache.org/jira/browse/FLINK-9080 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 1.4.0 >Reporter: Rohit Singh >Priority: Blocker > Fix For: 1.5.0 > > Attachments: Top Level packages.JPG, Top level classes.JPG, > classesloaded vs unloaded.png > > > Running Flink version 1.4.0 on Mesos, with the scheduler running along with the job > manager in a single container, whereas task managers run in separate > containers. A couple of jobs were running continuously, and the Flink scheduler was working > properly along with the task managers. Due to some change in data, one of the jobs > started failing continuously. In the meantime, there was a surge in Flink > scheduler memory usage, and it eventually died of OOM.
> > Memory dump analysis was done. > Findings were as follows: !Top Level packages.JPG!!Top level classes.JPG! > * The majority of top loaded packages retaining heap pointed towards > Flinkuserclassloader, glassfish (jersey library), and Finalizer classes. (Top > level package image) > * Top level classes were of Flinkuserclassloader. (Top Level class image) > * The number of classes loaded vs. unloaded was quite low (PFA), in spite of > adding the JVM options -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled; > PFA classes loaded vs. unloaded graph; the scheduler was restarted 3 times > * There were custom classes as well, which were duplicated during subsequent > class uploads > PFA all the images of the heap dump. Can you suggest some pointers as to how > to overcome this issue? > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5791: [FLINK-8507][Table API & SQL] upgrade calcite depe...
GitHub user suez1224 reopened a pull request: https://github.com/apache/flink/pull/5791 [FLINK-8507][Table API & SQL] upgrade calcite dependency to 1.16 ## What is the purpose of the change Upgrade Flink table's Calcite dependency to 1.16. ## Brief change log - Update pom. - Fix HepPlanner issue. ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5791.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5791 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 ---
[GitHub] flink pull request #5791: [FLINK-8507][Table API & SQL] upgrade calcite depe...
Github user suez1224 closed the pull request at: https://github.com/apache/flink/pull/5791 ---
[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16
[ https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419796#comment-16419796 ] ASF GitHub Bot commented on FLINK-8507: --- Github user suez1224 closed the pull request at: https://github.com/apache/flink/pull/5791 > Upgrade Calcite dependency to 1.16 > -- > > Key: FLINK-8507 > URL: https://issues.apache.org/jira/browse/FLINK-8507 > Project: Flink > Issue Type: Task > Components: Table API SQL >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5791: [FLINK-8507][Table API & SQL] upgrade calcite dependency ...
Github user suez1224 commented on the issue: https://github.com/apache/flink/pull/5791 Close and reopen to trigger travis. SavepointITCase.testSavepointForJobWithIteration is unstable. ---
[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16
[ https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419797#comment-16419797 ] ASF GitHub Bot commented on FLINK-8507: --- GitHub user suez1224 reopened a pull request: https://github.com/apache/flink/pull/5791 [FLINK-8507][Table API & SQL] upgrade calcite dependency to 1.16 ## What is the purpose of the change Upgrade Flink table's Calcite dependency to 1.16. ## Brief change log - Update pom. - Fix HepPlanner issue. ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? 
(not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5791.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5791 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 > Upgrade Calcite dependency to 1.16 > -- > > Key: FLINK-8507 > URL: https://issues.apache.org/jira/browse/FLINK-8507 > Project: Flink > Issue Type: Task > Components: Table API SQL >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16
[ https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419795#comment-16419795 ] ASF GitHub Bot commented on FLINK-8507: --- Github user suez1224 commented on the issue: https://github.com/apache/flink/pull/5791 Close and reopen to trigger travis. SavepointITCase.testSavepointForJobWithIteration is unstable. > Upgrade Calcite dependency to 1.16 > -- > > Key: FLINK-8507 > URL: https://issues.apache.org/jira/browse/FLINK-8507 > Project: Flink > Issue Type: Task > Components: Table API SQL >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5794: [Flink-8509][Table API & SQL] Remove SqlGroupedWin...
GitHub user suez1224 opened a pull request: https://github.com/apache/flink/pull/5794 [Flink-8509][Table API & SQL] Remove SqlGroupedWindowFunction copy from flink ## What is the purpose of the change Remove SqlGroupedWindowFunction copy from flink. This depends on https://github.com/apache/flink/pull/5791. ## Brief change log - remove code. ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8509 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5794.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5794 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 commit 379111d0f2a8e2e54b5370884742fa5615d39b68 Author: Shuyi Chen Date: 2018-03-29T18:48:28Z Remove SqlGroupedWindowFunction from Flink repo ---
[jira] [Closed] (FLINK-8802) Concurrent serialization without duplicating serializers in state server.
[ https://issues.apache.org/jira/browse/FLINK-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas closed FLINK-8802. - Resolution: Fixed Fixed on master with db8e1f09bd7dcd9f392bf987e96cddcb34665b6c and on release-1.5 with c17c3b60381b454e101d10b5b285b489775bfa71 > Concurrent serialization without duplicating serializers in state server. > - > > Key: FLINK-8802 > URL: https://issues.apache.org/jira/browse/FLINK-8802 > Project: Flink > Issue Type: Bug > Components: Queryable State >Affects Versions: 1.5.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas >Priority: Blocker > Fix For: 1.5.0 > > > The `getSerializedValue()` may be called by multiple threads but serializers > are not duplicated, which may lead to exceptions thrown when a serializer is > stateful. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
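The concurrency pattern the fix above relies on can be sketched with toy types (these interfaces and names are illustrative, not Flink's actual TypeSerializer API): duplicate the serializer before each concurrent use, and let stateless serializers make the duplication free by returning `this`.

```java
import java.nio.charset.StandardCharsets;

// Toy serializer contract (NOT Flink's TypeSerializer API): duplicate()
// returns a fresh copy for stateful serializers, or `this` for stateless ones.
interface Serializer<T> {
    byte[] serialize(T value);
    Serializer<T> duplicate();
}

// Stateless example: safe to share across threads, so duplicate() is a no-op.
class Utf8StringSerializer implements Serializer<String> {
    public byte[] serialize(String v) { return v.getBytes(StandardCharsets.UTF_8); }
    public Serializer<String> duplicate() { return this; }
}

public class StateServerSketch {
    // Duplicating up front makes concurrent calls safe even when the
    // underlying serializer keeps internal buffers (i.e. is stateful).
    static <T> byte[] getSerializedValue(Serializer<T> serializer, T value) {
        return serializer.duplicate().serialize(value);
    }

    public static void main(String[] args) {
        byte[] bytes = getSerializedValue(new Utf8StringSerializer(), "ok");
        System.out.println(bytes.length); // prints 2
    }
}
```

Note the connection to the FLINK-8908 entry above: when key and value serializers report themselves stateless, composite serializers like MapSerializer can likewise skip creating new instances.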
[jira] [Commented] (FLINK-9106) Add UnfencedMainThreadExecutor to FencedRpcEndpoint
[ https://issues.apache.org/jira/browse/FLINK-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419264#comment-16419264 ] ASF GitHub Bot commented on FLINK-9106: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5784 > Add UnfencedMainThreadExecutor to FencedRpcEndpoint > --- > > Key: FLINK-9106 > URL: https://issues.apache.org/jira/browse/FLINK-9106 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Major > Labels: flip-6 > Fix For: 1.5.0 > > > In order to run unfenced operation it would be convenient to also have an > {{UnfencedMainThreadExecutor}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5774: [FLINK-9097] Fail fatally if job submission fails ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5774 ---
[GitHub] flink pull request #5792: [Flink-8563][Table API & SQL] add unittest for con...
GitHub user suez1224 opened a pull request: https://github.com/apache/flink/pull/5792 [Flink-8563][Table API & SQL] add unittest for consecutive dot access of composite array element in SQL ## What is the purpose of the change add unittest for consecutive dot access of composite array element in SQL. This depends on https://github.com/apache/flink/pull/5791. ## Brief change log - add unittest ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8563 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5792.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5792 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 commit aea021cb9efc869872595f64467f8d2ec8071ea4 Author: Shuyi Chen Date: 2018-03-29T19:15:15Z add unittest for consecutive dot access of composite array element in SQL ---
[GitHub] flink pull request #5793: [Flink-8508][Table API & SQL] Remove RexSimplify c...
GitHub user suez1224 opened a pull request: https://github.com/apache/flink/pull/5793 [Flink-8508][Table API & SQL] Remove RexSimplify copy from flink ## What is the purpose of the change Remove RexSimplify copy from flink. This depends on https://github.com/apache/flink/pull/5791. ## Brief change log - remove code. ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8508 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5793.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5793 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 commit 0b63d25f604ed8c5320156b0846a1fdfa7553639 Author: Shuyi Chen Date: 2018-03-29T18:46:38Z remove RexSimplify copy from calcite ---
[jira] [Commented] (FLINK-8507) Upgrade Calcite dependency to 1.16
[ https://issues.apache.org/jira/browse/FLINK-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419553#comment-16419553 ] ASF GitHub Bot commented on FLINK-8507: --- GitHub user suez1224 opened a pull request: https://github.com/apache/flink/pull/5791 [FLINK-8507][Table API & SQL] upgrade calcite dependency to 1.16 ## What is the purpose of the change Upgrade Flink table's Calcite dependency to 1.16. ## Brief change log - Update pom. - Fix HepPlanner issue. ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? 
(not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5791.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5791 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 > Upgrade Calcite dependency to 1.16 > -- > > Key: FLINK-8507 > URL: https://issues.apache.org/jira/browse/FLINK-8507 > Project: Flink > Issue Type: Task > Components: Table API SQL >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5791: [FLINK-8507][Table API & SQL] upgrade calcite depe...
GitHub user suez1224 opened a pull request: https://github.com/apache/flink/pull/5791 [FLINK-8507][Table API & SQL] upgrade calcite dependency to 1.16 ## What is the purpose of the change Upgrade Flink table's Calcite dependency to 1.16. ## Brief change log - Update pom. - Fix HepPlanner issue. ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (y no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5791.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5791 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi ChenDate: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 ---
[GitHub] flink pull request #5784: [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor ...
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/5784 [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor to FencedRpcEndpoint ## What is the purpose of the change The UnfencedMainThreadExecutor executes Runnables in the main thread context without checking the fencing token. This is important for setting a new fencing token, for example. ## Verifying this change - Added `AsyncCallsTest#testUnfencedMainThreadExecutor` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink addUnfencedMainThreadExecutor Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5784.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5784 commit 71bb6d898055f9b166148544b63a895b62125085 Author: Till Rohrmann Date: 2018-03-29T09:40:19Z [hotfix] Add FutureUtils.supplyAsync with SupplierWithException commit 2e7d6fcf5b648932026371956f781d6131a94a3b Author: Till Rohrmann Date: 2018-03-29T09:43:03Z [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor to FencedRpcEndpoint The UnfencedMainThreadExecutor executes Runnables in the main thread context without checking the fencing token. This is important for setting a new fencing token, for example. ---
[jira] [Commented] (FLINK-9097) Jobs can be dropped in HA when job submission fails
[ https://issues.apache.org/jira/browse/FLINK-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418711#comment-16418711 ] ASF GitHub Bot commented on FLINK-9097: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5774 I had to put another commit on top of it to fix a problem with the failing `DispatcherTest#testWaitingForJobMasterLeadership` @GJL. The new commit makes sure that we first recover all jobs before we set the fencing token of the `Dispatcher`. That way, no other action can interfere with the job recovery, e.g. another job submission. > Jobs can be dropped in HA when job submission fails > --- > > Key: FLINK-9097 > URL: https://issues.apache.org/jira/browse/FLINK-9097 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > Jobs can be dropped in HA mode if the job submission step fails. In such a > case, we should fail fatally to let the {{Dispatcher}} restart and retry to > recover all jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9107) Document timer coalescing for ProcessFunctions
Nico Kruber created FLINK-9107: -- Summary: Document timer coalescing for ProcessFunctions Key: FLINK-9107 URL: https://issues.apache.org/jira/browse/FLINK-9107 Project: Flink Issue Type: Improvement Components: Documentation, Streaming Affects Versions: 1.4.0, 1.3.0, 1.5.0, 1.6.0 Reporter: Nico Kruber Assignee: Nico Kruber Fix For: 1.5.0, 1.4.3, 1.3.4 In a {{ProcessFunction}}, registering timers for each event via {{ctx.timerService().registerEventTimeTimer()}} with times like {{ctx.timestamp() + timeout}} has millisecond accuracy and may thus create one timer per millisecond, which can add noticeable overhead in the {{TimerService}}. This can be mitigated with timer coalescing if a timer accuracy coarser than 1 ms is acceptable. A timer firing only at full seconds, for example, can be realised like this: {code} coalescedTime = ((ctx.timestamp() + timeout) / 1000) * 1000; ctx.timerService().registerEventTimeTimer(coalescedTime); {code} As a result, at most a single timer exists per second, since timers are not added for timestamps that are already registered. This should be documented in the {{ProcessFunction}} docs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
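Editorial illustration (not part of the ticket above): the coalescing described in FLINK-9107 is plain integer rounding, so it can be sketched outside of Flink. The class name, the `coalesce` helper, and the `granularityMillis` parameter below are all hypothetical; in a real job the returned timestamp would be handed to {{ctx.timerService().registerEventTimeTimer()}} inside a {{ProcessFunction}}.

```java
// Hypothetical sketch of the rounding arithmetic from FLINK-9107; this is
// not Flink code, only the timestamp math that makes timers coalesce.
public class TimerCoalescing {

    /** Rounds (timestamp + timeout) down to the given granularity in ms. */
    static long coalesce(long timestamp, long timeout, long granularityMillis) {
        return ((timestamp + timeout) / granularityMillis) * granularityMillis;
    }

    public static void main(String[] args) {
        // All timers that would fire within the same second collapse onto
        // one timestamp, so at most one timer per second gets registered.
        assert coalesce(1234L, 500L, 1000L) == 1000L;
        assert coalesce(1999L, 1L, 1000L) == 2000L;
        System.out.println("coalesced");
    }
}
```

Because Flink keeps at most one timer per distinct timestamp (per key), rounding to a coarser granularity bounds the number of live timers.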
[GitHub] flink issue #5774: [FLINK-9097] Fail fatally if job submission fails when re...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5774 I had to put another commit on top of it to fix a problem with the failing `DispatcherTest#testWaitingForJobMasterLeadership` @GJL. The new commit makes sure that we first recover all jobs before we set the fencing token of the `Dispatcher`. That way, no other action can interfere with the job recovery, e.g. another job submission. ---
[jira] [Commented] (FLINK-9106) Add UnfencedMainThreadExecutor to FencedRpcEndpoint
[ https://issues.apache.org/jira/browse/FLINK-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418706#comment-16418706 ] ASF GitHub Bot commented on FLINK-9106: --- GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/5784 [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor to FencedRpcEndpoint ## What is the purpose of the change The UnfencedMainThreadExecutor executes Runnables in the main thread context without checking the fencing token. This is important for setting a new fencing token, for example. ## Verifying this change - Added `AsyncCallsTest#testUnfencedMainThreadExecutor` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? 
(not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink addUnfencedMainThreadExecutor Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5784.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5784 commit 71bb6d898055f9b166148544b63a895b62125085 Author: Till Rohrmann Date: 2018-03-29T09:40:19Z [hotfix] Add FutureUtils.supplyAsync with SupplierWithException commit 2e7d6fcf5b648932026371956f781d6131a94a3b Author: Till Rohrmann Date: 2018-03-29T09:43:03Z [FLINK-9106] [rpc] Add UnfencedMainThreadExecutor to FencedRpcEndpoint The UnfencedMainThreadExecutor executes Runnables in the main thread context without checking the fencing token. This is important for setting a new fencing token, for example. > Add UnfencedMainThreadExecutor to FencedRpcEndpoint > --- > > Key: FLINK-9106 > URL: https://issues.apache.org/jira/browse/FLINK-9106 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Major > Labels: flip-6 > Fix For: 1.5.0 > > > In order to run unfenced operations it would be convenient to also have an > {{UnfencedMainThreadExecutor}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9104) Re-generate REST API documentation for FLIP-6
Gary Yao created FLINK-9104: --- Summary: Re-generate REST API documentation for FLIP-6 Key: FLINK-9104 URL: https://issues.apache.org/jira/browse/FLINK-9104 Project: Flink Issue Type: Bug Components: REST Affects Versions: 1.5.0 Reporter: Gary Yao The API documentation is missing for several handlers, e.g., {{SavepointHandlers}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9104) Re-generate REST API documentation for FLIP-6
[ https://issues.apache.org/jira/browse/FLINK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Yao updated FLINK-9104: Labels: flip-6 (was: ) > Re-generate REST API documentation for FLIP-6 > -- > > Key: FLINK-9104 > URL: https://issues.apache.org/jira/browse/FLINK-9104 > Project: Flink > Issue Type: Bug > Components: REST >Affects Versions: 1.5.0 >Reporter: Gary Yao >Priority: Blocker > Labels: flip-6 > > The API documentation is missing for several handlers, e.g., > {{SavepointHandlers}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9105) Table program compiles failed
Bob Lau created FLINK-9105: -- Summary: Table program compiles failed Key: FLINK-9105 URL: https://issues.apache.org/jira/browse/FLINK-9105 Project: Flink Issue Type: Bug Components: DataStream API Affects Versions: 1.4.2, 1.4.1, 1.4.0, 1.5.0 Reporter: Bob Lau ExceptionStack: org.apache.flink.client.program.ProgramInvocationException: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a bug. Please file an issue. at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:253) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:463) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456) at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219) at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178) at com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:150) at com.tydic.tysc.rest.SubmitJobController.submitJob(SubmitJobController.java:55) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.boot.web.filter.ApplicationContextHeaderFilter.doFilterInternal(ApplicationContextHeaderFilter.java:55) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:112) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) 
at com.tydic.tysc.filter.ShiroSessionFilter.doFilter(ShiroSessionFilter.java:51) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61) at com.tydic.tysc.filter.OauthFilter.doFilter(OauthFilter.java:48) at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66) at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449) at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365) at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90) at
[jira] [Created] (FLINK-9106) Add UnfencedMainThreadExecutor to FencedRpcEndpoint
Till Rohrmann created FLINK-9106: Summary: Add UnfencedMainThreadExecutor to FencedRpcEndpoint Key: FLINK-9106 URL: https://issues.apache.org/jira/browse/FLINK-9106 Project: Flink Issue Type: Sub-task Components: Distributed Coordination Affects Versions: 1.5.0 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 1.5.0 In order to run unfenced operations it would be convenient to also have an {{UnfencedMainThreadExecutor}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-6567) ExecutionGraphMetricsTest fails on Windows CI
[ https://issues.apache.org/jira/browse/FLINK-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418724#comment-16418724 ] ASF GitHub Bot commented on FLINK-6567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5782#discussion_r178017403 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphMetricsTest.java --- @@ -140,6 +140,9 @@ public void testExecutionGraphRestartTimeMetric() throws JobException, IOExcepti assertTrue(currentRestartingTime >= previousRestartingTime); previousRestartingTime = currentRestartingTime; + + // add some pause to let the currentRestartingTime increase + Thread.sleep(1L); --- End diff -- Yes can do. > ExecutionGraphMetricsTest fails on Windows CI > - > > Key: FLINK-6567 > URL: https://issues.apache.org/jira/browse/FLINK-6567 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: Till Rohrmann >Priority: Blocker > Labels: test-stability > Fix For: 1.6.0 > > > The {{testExecutionGraphRestartTimeMetric}} fails every time i run it on > AppVeyor. It also very rarely failed for me locally. > The test fails at Line 235 if the RUNNING timestamp is equal to the > RESTARTING timestamp, which may happen when combining a fast test with a low > resolution clock. > A simple fix would be to increase the timestamp between RUNNING and > RESTARTING by adding a 50ms sleep timeout into the > {{TestingRestartStrategy#canRestart()}} method, as this one is called before > transitioning to the RESTARTING state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
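Editorial note on the fix discussed above (not from the original thread): the flakiness stems from two reads of a millisecond-resolution clock landing in the same tick, so the measured restarting time fails a strict comparison. Even a 1 ms pause forces the second read to be strictly larger. The sketch below is illustrative plain Java, not the Flink test code.

```java
// Illustrative demo (not Flink code): back-to-back reads of a millisecond
// clock can be equal; a short pause guarantees a strictly larger second read.
public class ClockResolutionDemo {

    /** Returns the elapsed wall-clock millis across a pause of pauseMillis. */
    static long elapsedAfterPause(long pauseMillis) {
        long before = System.currentTimeMillis();
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return System.currentTimeMillis() - before;
    }

    public static void main(String[] args) {
        // With no pause, both reads frequently fall in the same millisecond,
        // which is exactly what made the test's >= comparison degenerate to ==.
        // After sleeping at least 1 ms the difference is strictly positive.
        assert elapsedAfterPause(2L) > 0L;
        System.out.println("clock advanced");
    }
}
```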
[jira] [Commented] (FLINK-8981) End-to-end test: Kerberos security
[ https://issues.apache.org/jira/browse/FLINK-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418784#comment-16418784 ] Till Rohrmann commented on FLINK-8981: -- Hi [~suez1224], if the {{MiniKDC}} offers the same functionality as the real Hadoop cluster, then this might work. But also in this case we should use the {{CliFrontend}} to submit a job to YARN. > End-to-end test: Kerberos security > -- > > Key: FLINK-8981 > URL: https://issues.apache.org/jira/browse/FLINK-8981 > Project: Flink > Issue Type: Sub-task > Components: Security, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Shuyi Chen >Priority: Blocker > Fix For: 1.5.0 > > > We should add an end-to-end test which verifies Flink's integration with > Kerberos security. In order to do this, we should start a Kerberos secured > Hadoop, ZooKeeper and Kafka cluster. Then we should start a Flink cluster > with HA enabled and run a job which reads from and writes to Kafka. We could > use a simple pipe job for that purpose which has some state for checkpointing > to HDFS. > See [security docs| > https://ci.apache.org/projects/flink/flink-docs-master/ops/security-kerberos.html] > for more information about Flink's Kerberos integration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9109) Add flink modify command to documentation
Till Rohrmann created FLINK-9109: Summary: Add flink modify command to documentation Key: FLINK-9109 URL: https://issues.apache.org/jira/browse/FLINK-9109 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.5.0 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 1.5.0 We should add documentation for the {{flink modify}} command. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5786: [FLINK-9109] [doc] Update documentation for CLI
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/5786 [FLINK-9109] [doc] Update documentation for CLI ## What is the purpose of the change Update documentation for CLI. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink addDocumentationForModifyCommand Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5786.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5786 commit 2822c0eadfd5469fb293d73b2c9466def1f38607 Author: Till Rohrmann Date: 2018-03-29T11:36:34Z [FLINK-9109] [doc] Update documentation for CLI ---
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418811#comment-16418811 ] ASF GitHub Bot commented on FLINK-8813: --- Github user pnowojski commented on the issue: https://github.com/apache/flink/pull/5756 Rebased and solved conflict. > AutoParallellismITCase fails with Flip6 > --- > > Key: FLINK-8813 > URL: https://issues.apache.org/jira/browse/FLINK-8813 > Project: Flink > Issue Type: Bug > Components: JobManager, Tests >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: Piotr Nowojski >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > The {{AutoParallelismITCase}} fails when running against flip6. > ([https://travis-ci.org/zentol/flink/jobs/347373854)] > It appears that the {{JobMaster}} does not properly handle > {{ExecutionConfig#PARALLELISM_AUTO_MAX}}. > > Exception: > {code:java} > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not > start JobManager. > at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:210) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:66) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132) > at > akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544) > at akka.actor.Actor$class.aroundReceive(Actor.scala:502) > at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) > at 
akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) > at akka.actor.ActorCell.invoke(ActorCell.scala:495) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) > at akka.dispatch.Mailbox.run(Mailbox.scala:224) > at akka.dispatch.Mailbox.exec(Mailbox.scala:234) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not > set up JobManager > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:181) > at > org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:747) > at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:243) > ... 20 more > Caused by: java.lang.IllegalArgumentException: The parallelism must be at > least one. > at > org.apache.flink.runtime.jobgraph.JobVertex.setParallelism(JobVertex.java:290) > at > org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:162) > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:295) > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:170) > ... 
22 more{code} > > The likely culprit is this call to {{ExecutionGraphBuilder#buildGraph}} in > the {{JobMaster}} constructor: > {code:java} > this.executionGraph = ExecutionGraphBuilder.buildGraph( >null, >jobGraph, >jobMasterConfiguration.getConfiguration(), >scheduledExecutorService, >scheduledExecutorService, >slotPool.getSlotProvider(), >userCodeLoader, >highAvailabilityServices.getCheckpointRecoveryFactory(), >rpcTimeout, >restartStrategy, >jobMetricGroup, >-1, // parallelismForAutoMax >blobServer, >jobMasterConfiguration.getSlotRequestTimeout(), >log);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
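Editorial sketch of the kind of guard implied by the stack trace above: the AUTO_MAX sentinel must be resolved to a concrete value before it reaches `JobVertex.setParallelism`, which rejects anything below one. `PARALLELISM_AUTO_MAX`, `normalize`, and `maxAvailableSlots` below are assumptions for illustration, not Flink's actual constants or wiring.

```java
// Hypothetical normalization step (not Flink's implementation): resolve the
// AUTO_MAX sentinel to a concrete parallelism before building the graph.
public class ParallelismNormalizer {

    // Assumed sentinel value for "use maximum available parallelism".
    static final int PARALLELISM_AUTO_MAX = Integer.MAX_VALUE;

    static int normalize(int requested, int maxAvailableSlots) {
        int parallelism =
                (requested == PARALLELISM_AUTO_MAX) ? maxAvailableSlots : requested;
        if (parallelism < 1) {
            // Mirrors the check seen in the stack trace above.
            throw new IllegalArgumentException("The parallelism must be at least one.");
        }
        return parallelism;
    }

    public static void main(String[] args) {
        assert normalize(PARALLELISM_AUTO_MAX, 8) == 8;
        assert normalize(4, 8) == 4;
        System.out.println("parallelism resolved");
    }
}
```

Passing a raw `-1` through, as the quoted `buildGraph` call appears to do, skips this substitution and triggers exactly the `IllegalArgumentException` shown.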
[GitHub] flink issue #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX for fli...
Github user pnowojski commented on the issue: https://github.com/apache/flink/pull/5756 Rebased and solved conflict. ---
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418810#comment-16418810 ] ASF GitHub Bot commented on FLINK-8813: --- Github user pnowojski commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178029070 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- I would prefer to keep the enum. Name `isLegacyDeployment` would deprecate faster then the enum type. Also enums are more flexible (adding/removing more values in the future). Regardless there is no big advantage of one over the other, so if you want, I can change it either way. > AutoParallellismITCase fails with Flip6 > --- > > Key: FLINK-8813 > URL: https://issues.apache.org/jira/browse/FLINK-8813 > Project: Flink > Issue Type: Bug > Components: JobManager, Tests >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: Piotr Nowojski >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > The {{AutoParallelismITCase}} fails when running against flip6. > ([https://travis-ci.org/zentol/flink/jobs/347373854)] > It appears that the {{JobMaster}} does not properly handle > {{ExecutionConfig#PARALLELISM_AUTO_MAX}}. > > Exception: > {code:java} > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not > start JobManager. 
> at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:210) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:66) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132) > at > akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544) > at akka.actor.Actor$class.aroundReceive(Actor.scala:502) > at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) > at akka.actor.ActorCell.invoke(ActorCell.scala:495) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) > at akka.dispatch.Mailbox.run(Mailbox.scala:224) > at akka.dispatch.Mailbox.exec(Mailbox.scala:234) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not > set up JobManager > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:181) > at > org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:747) > at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:243) > ... 
20 more > Caused by: java.lang.IllegalArgumentException: The parallelism must be at > least one. > at > org.apache.flink.runtime.jobgraph.JobVertex.setParallelism(JobVertex.java:290) > at > org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:162) > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:295) > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:170) > ... 22 more{code} > > The likely culprit is this call to {{ExecutionGraphBuilder#buildGraph}} in > the {{JobMaster}} constructor: > {code:java} > this.executionGraph = ExecutionGraphBuilder.buildGraph( >null, >jobGraph, >jobMasterConfiguration.getConfiguration(), >scheduledExecutorService, >scheduledExecutorService, >slotPool.getSlotProvider(), >userCodeLoader, >highAvailabilityServices.getCheckpointRecoveryFactory(), >rpcTimeout, >restartStrategy,
[jira] [Commented] (FLINK-9031) DataSet Job result changes when adding rebalance after union
[ https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418836#comment-16418836 ] ASF GitHub Bot commented on FLINK-9031: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5742 > DataSet Job result changes when adding rebalance after union > > > Key: FLINK-9031 > URL: https://issues.apache.org/jira/browse/FLINK-9031 > Project: Flink > Issue Type: Bug > Components: DataSet API, Local Runtime, Optimizer >Affects Versions: 1.3.1 >Reporter: Fabian Hueske >Priority: Critical > Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt > > > A user [reported this issue on the user mailing > list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E]. > {quote}I am using Flink 1.3.1 and I have found a strange behavior on running > the following logic: > # Read data from file and store into DataSet > # Split dataset in two, by checking if "field1" of POJOs is empty or not, so > that the first dataset contains only elements with non empty "field1", and > the second dataset will contain the other elements. > # Each dataset is then grouped by, one by "field1" and other by another > field, and subsequently reduced. > # The 2 datasets are merged together by union. > # The final dataset is written as json. > What I was expected, from output, was to find only one element with a > specific value of "field1" because: > # Reducing the first dataset grouped by "field1" should generate only one > element with a specific value of "field1". > # The second dataset should contain only elements with empty "field1". > # Making an union of them should not duplicate any record. > This does not happen. When i read the generated jsons i see some duplicate > (non empty) values of "field1". > Strangely this does not happen when the union between the two datasets is > not computed. 
In this case the first dataset produces elements only with > distinct values of "field1", while the second dataset produces only records with > empty field "value1". > {quote} > The user has not enabled object reuse. > Later he reports that injecting a rebalance() after the union resolves the > problem. I had a look at the execution plans for > both cases (attached to this issue) but could not identify a problem. > Hence I assume this might be an issue with the runtime code, but we need to > look deeper into this. The user also provided an example program consisting > of two classes which are attached to the issue as well. > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418853#comment-16418853 ]

ASF GitHub Bot commented on FLINK-8813:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/5756

> AutoParallellismITCase fails with Flip6
> ---------------------------------------
>
>                 Key: FLINK-8813
>                 URL: https://issues.apache.org/jira/browse/FLINK-8813
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager, Tests
>    Affects Versions: 1.5.0
>            Reporter: Chesnay Schepler
>            Assignee: Piotr Nowojski
>            Priority: Blocker
>              Labels: flip-6
>             Fix For: 1.5.0
>
> The {{AutoParallelismITCase}} fails when running against flip6.
> [https://travis-ci.org/zentol/flink/jobs/347373854]
> It appears that the {{JobMaster}} does not properly handle {{ExecutionConfig#PARALLELISM_AUTO_MAX}}.
>
> Exception:
> {code:java}
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not start JobManager.
>     at org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:287)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:210)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:66)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132)
>     at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
>     at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:181)
>     at org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:747)
>     at org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:243)
>     ... 20 more
> Caused by: java.lang.IllegalArgumentException: The parallelism must be at least one.
>     at org.apache.flink.runtime.jobgraph.JobVertex.setParallelism(JobVertex.java:290)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:162)
>     at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:295)
>     at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:170)
>     ... 22 more
> {code}
>
> The likely culprit is this call to {{ExecutionGraphBuilder#buildGraph}} in the {{JobMaster}} constructor:
> {code:java}
> this.executionGraph = ExecutionGraphBuilder.buildGraph(
>     null,
>     jobGraph,
>     jobMasterConfiguration.getConfiguration(),
>     scheduledExecutorService,
>     scheduledExecutorService,
>     slotPool.getSlotProvider(),
>     userCodeLoader,
>     highAvailabilityServices.getCheckpointRecoveryFactory(),
>     rpcTimeout,
>     restartStrategy,
>     jobMetricGroup,
>     -1, // parallelismForAutoMax
>     blobServer,
>     jobMasterConfiguration.getSlotRequestTimeout(),
>     log);
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
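The pull request referenced above disallows the auto-max marker under flip-6 instead of letting a placeholder value reach {{JobVertex#setParallelism}}, which rejects anything below one. The sketch below is a hedged illustration of that validation direction only; the class, method, marker value, and messages are hypothetical, not the actual Flink code.

```java
// Hypothetical sketch of the fix direction: reject the auto-max marker up
// front rather than passing a sentinel (-1 here, for illustration) into graph
// construction, where setParallelism() only accepts values >= 1.
public class ParallelismCheck {

    // illustrative stand-in for the real marker in ExecutionConfig
    static final int PARALLELISM_AUTO_MAX = -1;

    static int validate(int parallelism) {
        if (parallelism == PARALLELISM_AUTO_MAX) {
            // flip-6 deployment: fail fast with a clear message
            throw new IllegalArgumentException(
                "PARALLELISM_AUTO_MAX is not supported by this deployment mode.");
        }
        if (parallelism < 1) {
            throw new IllegalArgumentException("The parallelism must be at least one.");
        }
        return parallelism;
    }
}
```

With this shape the failure surfaces at submission time with an actionable message, instead of the opaque "parallelism must be at least one" deep inside the JobMaster constructor.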
[GitHub] flink pull request #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5756 ---
[GitHub] flink pull request #5751: [FLINK-9060][state] Deleting state using KeyedStat...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5751 ---
[GitHub] flink issue #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX for fli...
Github user pnowojski commented on the issue: https://github.com/apache/flink/pull/5756 Thanks :) ---
[jira] [Closed] (FLINK-9060) Deleting state using KeyedStateBackend.getKeys() throws Exception
[ https://issues.apache.org/jira/browse/FLINK-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kostas Kloudas closed FLINK-9060.
---------------------------------
    Resolution: Fixed

Merged on master with 62bbada10847609884f14ceee74588ab2c8f3d8c and on release-1.5 with caac61833ddc5110070346590f14dcf9db2b

> Deleting state using KeyedStateBackend.getKeys() throws Exception
> -----------------------------------------------------------------
>
>                 Key: FLINK-9060
>                 URL: https://issues.apache.org/jira/browse/FLINK-9060
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Aljoscha Krettek
>            Assignee: Sihua Zhou
>            Priority: Blocker
>             Fix For: 1.5.0
>
> Adding this test to {{StateBackendTestBase}} showcases the problem:
> {code}
> @Test
> public void testConcurrentModificationWithGetKeys() throws Exception {
>     AbstractKeyedStateBackend<Integer> backend = createKeyedBackend(IntSerializer.INSTANCE);
>     try {
>         ListStateDescriptor<String> listStateDescriptor =
>             new ListStateDescriptor<>("foo", StringSerializer.INSTANCE);
>
>         backend.setCurrentKey(1);
>         backend
>             .getPartitionedState(VoidNamespace.INSTANCE, VoidNamespaceSerializer.INSTANCE, listStateDescriptor)
>             .add("Hello");
>
>         backend.setCurrentKey(2);
>         backend
>             .getPartitionedState(VoidNamespace.INSTANCE, VoidNamespaceSerializer.INSTANCE, listStateDescriptor)
>             .add("Ciao");
>
>         Stream<Integer> keys = backend
>             .getKeys(listStateDescriptor.getName(), VoidNamespace.INSTANCE);
>
>         keys.forEach((key) -> {
>             backend.setCurrentKey(key);
>             try {
>                 backend
>                     .getPartitionedState(VoidNamespace.INSTANCE, VoidNamespaceSerializer.INSTANCE, listStateDescriptor)
>                     .clear();
>             } catch (Exception e) {
>                 e.printStackTrace();
>             }
>         });
>     } finally {
>         IOUtils.closeQuietly(backend);
>         backend.dispose();
>     }
> }
> {code}
> This should work because one of the use cases of {{getKeys()}} and {{applyToAllKeys()}} is to do stuff for every key, which includes deleting them.
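The failure pattern in the test above — mutating state while still consuming the key stream that was derived from it — is the classic concurrent-modification shape. The snippet below is a plain {{java.util}} illustration of the usual workaround (materialize the keys before mutating), not Flink's backend code; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain java.util illustration (not the Flink state backend) of the safe
// pattern: copy the keys into a list first, then mutate, so the mutation
// never races with the traversal that produced the keys.
public class ClearAllKeys {

    static void clearAll(Map<Integer, List<String>> state) {
        // materialize the key view before mutating the map
        List<Integer> keys = new ArrayList<>(state.keySet());
        for (Integer key : keys) {
            state.remove(key); // safe: we iterate the copy, not the live view
        }
    }
}
```

The fix merged for FLINK-9060 makes the pattern from the test work directly on the backend's key stream; the copy-first pattern above is simply the general-purpose workaround when a container does not support deletion during traversal.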
[GitHub] flink pull request #5782: [FLINK-6567] [tests] Harden ExecutionGraphMetricsT...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5782#discussion_r178017114 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphMetricsTest.java --- @@ -140,6 +140,9 @@ public void testExecutionGraphRestartTimeMetric() throws JobException, IOExcepti assertTrue(currentRestartingTime >= previousRestartingTime); previousRestartingTime = currentRestartingTime; + + // add some pause to let the currentRestartingTime increase + Thread.sleep(1L); --- End diff -- I think we have to keep it here, because otherwise the loop might finish so fast that we don't see an increase in `previousRestartingTime` because this value is effectively `System.currentTimeMillis - timestamps[RESTARTING]`. ---
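The quantity discussed in this review thread is a wall-clock difference, which is why consecutive loop iterations can observe the same value unless the test pauses. The stand-alone model below illustrates only that arithmetic; it is not the actual ExecutionGraph metric class, and the names are invented for the example.

```java
// Simplified model (not Flink's actual metric code) of why the test needs a
// pause: restarting time is just "now minus the RESTARTING timestamp", so two
// reads within the same millisecond yield equal values and no visible increase.
public class RestartingTimeModel {

    static long restartingTime(long nowMillis, long restartingTimestampMillis) {
        return nowMillis - restartingTimestampMillis;
    }
}
```

With a `Thread.sleep(1L)` between reads, the second `nowMillis` is strictly larger, which is exactly what the one-millisecond pause in the diff guarantees.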
[jira] [Commented] (FLINK-8986) End-to-end test: REST
[ https://issues.apache.org/jira/browse/FLINK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418786#comment-16418786 ]

Till Rohrmann commented on FLINK-8986:
--------------------------------------

Hi [~walterddr], sorry for the late reply. The REST API has actually been extended a bit and we still have to update the docs for it (see FLINK-9104). But the majority of REST calls stayed the same. Actually, the subclasses of {{MessageHeaders}} contain all the valid REST calls we currently support.

> End-to-end test: REST
> ---------------------
>
>                 Key: FLINK-8986
>                 URL: https://issues.apache.org/jira/browse/FLINK-8986
>             Project: Flink
>          Issue Type: Sub-task
>          Components: REST, Tests
>            Reporter: Till Rohrmann
>            Assignee: Rong Rong
>            Priority: Major
>
> We should add an end-to-end test which verifies that we can use the REST interface to obtain information about a running job.
[GitHub] flink pull request #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX ...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178030866 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- This is definitely not a big issue. I will merge. ---
[jira] [Closed] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kostas Kloudas closed FLINK-8813.
---------------------------------
    Resolution: Fixed

Merged on master with d3c489d36b02c851ba4f8add1ea3ac69277281a0 and on release-1.5 with cd4a1e92b52af115862e3ce8a3c2be5a5b11aa7b
[jira] [Resolved] (FLINK-9031) DataSet Job result changes when adding rebalance after union
[ https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-9031.
---------------------------------
       Resolution: Fixed
         Assignee: Fabian Hueske
    Fix Version/s: 1.4.3
                   1.5.0

Fixed in
  - 1.4.3 via 492dcb1765e86f1e7d66ee08db714992e7a31f4e
  - 1.5.0 via ef2038755f3af8cc356727fee3f0f28c3cfffc64
  - 1.6.0 via 87ff6eb86e7130309aab8e4ff322f0fdd2cdc020

> DataSet Job result changes when adding rebalance after union
> ------------------------------------------------------------
>
>                 Key: FLINK-9031
>                 URL: https://issues.apache.org/jira/browse/FLINK-9031
>             Project: Flink
>          Issue Type: Bug
>          Components: DataSet API, Local Runtime, Optimizer
>    Affects Versions: 1.3.1
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>            Priority: Critical
>             Fix For: 1.5.0, 1.4.3
>
>         Attachments: Person.java, RunAll.java, newplan.txt, oldplan.txt
>
> A user [reported this issue on the user mailing list|https://lists.apache.org/thread.html/075f1a487b044079b5d61f199439cb77dd4174bd425bcb3327ed7dfc@%3Cuser.flink.apache.org%3E].
> {quote}I am using Flink 1.3.1 and I have found a strange behavior when running the following logic:
> # Read data from file and store it in a DataSet.
> # Split the dataset in two by checking whether "field1" of the POJOs is empty, so that the first dataset contains only elements with a non-empty "field1", and the second dataset contains the other elements.
> # Each dataset is then grouped, one by "field1" and the other by another field, and subsequently reduced.
> # The two datasets are merged together by union.
> # The final dataset is written as JSON.
> What I expected in the output was to find only one element with a specific value of "field1" because:
> # Reducing the first dataset grouped by "field1" should generate only one element per distinct value of "field1".
> # The second dataset should contain only elements with an empty "field1".
> # Taking the union of them should not duplicate any record.
> This does not happen. When I read the generated JSON files I see some duplicate (non-empty) values of "field1".
> Strangely, this does not happen when the union between the two datasets is not computed. In that case the first dataset produces only elements with distinct values of "field1", while the second dataset produces only records with empty field "value1".
> {quote}
> The user has not enabled object reuse.
> Later he reports that injecting a rebalance() after the union resolves the problem. I had a look at the execution plans for both cases (attached to this issue) but could not identify a problem.
> Hence I assume this might be an issue with the runtime code, and we need to look deeper into it. The user also provided an example program consisting of two classes, which are attached to the issue as well.
[jira] [Commented] (FLINK-8708) Unintended integer division in StandaloneThreadedGenerator
[ https://issues.apache.org/jira/browse/FLINK-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418873#comment-16418873 ] ASF GitHub Bot commented on FLINK-8708: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5776 Thanks, merging this... > Unintended integer division in StandaloneThreadedGenerator > -- > > Key: FLINK-8708 > URL: https://issues.apache.org/jira/browse/FLINK-8708 > Project: Flink > Issue Type: Bug >Reporter: Ted Yu >Assignee: vinoyang >Priority: Minor > > In > flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/statemachine/generator/StandaloneThreadedGenerator.java > : > {code} > double factor = (ts - lastTimeStamp) / 1000; > {code} > Proper casting should be done before the integer division -- This message was sent by Atlassian JIRA (v7.6.3#76005)
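The bug quoted in the issue is easy to reproduce in isolation: assigning to a `double` does not prevent truncation, because the division is evaluated in `long` arithmetic before the widening assignment. A minimal sketch of the bug and one of several equivalent fixes (using a floating-point divisor instead of an explicit cast):

```java
// Minimal reproduction of the integer-division bug from FLINK-8708: the
// division runs in long arithmetic before the widening assignment, so
// sub-second differences truncate.
public class FactorFix {

    static double factorBuggy(long ts, long lastTimeStamp) {
        return (ts - lastTimeStamp) / 1000;   // long / int -> long, truncates
    }

    static double factorFixed(long ts, long lastTimeStamp) {
        return (ts - lastTimeStamp) / 1000.0; // long / double -> double
    }
}
```

`((double) (ts - lastTimeStamp)) / 1000`, the cast suggested in the issue, produces the same result as `factorFixed`.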
[GitHub] flink issue #5624: [FLINK-8402][s3][tests] fix hadoop/presto S3 IT cases for...
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/5624 Indeed, Presto-S3 does better in `com.facebook.presto.hive.PrestoS3FileSystem#create()`: ``` if ((!overwrite) && exists(path)) { throw new IOException("File already exists:" + path); } // file creation ``` But if `overwrite = false`, it will also check for existence first. Also, contrary to my initial analysis, the retries when retrieving the file status during the existence check do not cover non-existence. I can adapt the tests to only use `overwrite = true`, but actual code outside the tests makes use of both variants. It's therefore a good idea to make the distinction between `flink-s3-fs-hadoop` and `flink-s3-fs-presto` but only for the existence check, not for checking that a file/directory was deleted since > Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions. I adapted the code accordingly which effectively boiled down to removing some of the new eventual consistent existence checks in `PrestoS3FileSystemITCase`. Regarding the two implementations you provided: for doing the existence check, there should not be a difference between a single `fs.exists()` call vs. `fs.open()` in terms of consistency. ---
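The create semantics discussed in this comment can be mimicked on a local filesystem. The sketch below uses `java.nio`, not `PrestoS3FileSystem`, purely to illustrate the distinction being made: with `overwrite = false` an existence check runs and may fail, while with `overwrite = true` creation proceeds unconditionally.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-filesystem analogy (java.nio, not the S3 filesystem code) of the two
// create variants discussed above: overwrite = false performs the existence
// check and may throw; overwrite = true skips it entirely.
public class CreateSemantics {

    static void create(Path path, boolean overwrite) throws IOException {
        if (!overwrite && Files.exists(path)) {
            throw new IOException("File already exists: " + path);
        }
        // "file creation": Files.write truncates and overwrites by default
        Files.write(path, new byte[0]);
    }
}
```

On an eventually consistent store like S3, the `exists` probe in the `overwrite = false` path is exactly the step that can return stale answers, which is why the test adaptation above restricts the special handling to the existence check.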
[jira] [Commented] (FLINK-9060) Deleting state using KeyedStateBackend.getKeys() throws Exception
[ https://issues.apache.org/jira/browse/FLINK-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418877#comment-16418877 ]

ASF GitHub Bot commented on FLINK-9060:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/5751
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418887#comment-16418887 ]

ASF GitHub Bot commented on FLINK-8813:
---------------------------------------

Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/5756

    Thanks :)
[jira] [Resolved] (FLINK-8708) Unintended integer division in StandaloneThreadedGenerator
[ https://issues.apache.org/jira/browse/FLINK-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-8708.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Fixed in
  - 1.5.0 via 063895c24cb37cabb71a22bbe3eb09bbd22b09a4
  - 1.6.0 via 98924f0ab5eac1fea361277715560bb7d519a369
[jira] [Closed] (FLINK-8708) Unintended integer division in StandaloneThreadedGenerator
[ https://issues.apache.org/jira/browse/FLINK-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-8708.
-------------------------------
[jira] [Commented] (FLINK-8708) Unintended integer division in StandaloneThreadedGenerator
[ https://issues.apache.org/jira/browse/FLINK-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418895#comment-16418895 ]

ASF GitHub Bot commented on FLINK-8708:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/5776
[jira] [Created] (FLINK-9108) invalid ProcessWindowFunction link in Document
Matrix42 created FLINK-9108: --- Summary: invalid ProcessWindowFunction link in Document Key: FLINK-9108 URL: https://issues.apache.org/jira/browse/FLINK-9108 Project: Flink Issue Type: Bug Components: Documentation Reporter: Matrix42 Assignee: Matrix42 Attachments: QQ截图20180329184203.png !QQ截图20180329184203.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9108) invalid ProcessWindowFunction link in Document
[ https://issues.apache.org/jira/browse/FLINK-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418752#comment-16418752 ]

ASF GitHub Bot commented on FLINK-9108:
---------------------------------------

GitHub user Matrix42 opened a pull request:

    https://github.com/apache/flink/pull/5785

    [FLINK-9108][docs] Fix invalid link

    ## What is the purpose of the change

    Fix invalid link

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Matrix42/flink FLINK-9108

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5785.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #5785

commit a1497d313ca79794a44c134a3a034064ac441977
Author: Matrix42 <934336389@...>
Date:   2018-03-29T10:47:55Z

    [FLINK-9108][docs] Fix invalid link
[GitHub] flink pull request #5785: [FLINK-9108][docs] Fix invalid link
GitHub user Matrix42 opened a pull request: https://github.com/apache/flink/pull/5785 [FLINK-9108][docs] Fix invalid link ## What is the purpose of the change Fix invalid link You can merge this pull request into a Git repository by running: $ git pull https://github.com/Matrix42/flink FLINK-9108 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5785.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5785 commit a1497d313ca79794a44c134a3a034064ac441977 Author: Matrix42 <934336389@...> Date: 2018-03-29T10:47:55Z [FLINK-9108][docs] Fix invalid link ---
[GitHub] flink pull request #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX ...
Github user pnowojski commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178029070 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- I would prefer to keep the enum. Name `isLegacyDeployment` would deprecate faster then the enum type. Also enums are more flexible (adding/removing more values in the future). Regardless there is no big advantage of one over the other, so if you want, I can change it either way. ---
[jira] [Closed] (FLINK-9031) DataSet Job result changes when adding rebalance after union
[ https://issues.apache.org/jira/browse/FLINK-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-9031.
-------------------------------
[jira] [Commented] (FLINK-8402) HadoopS3FileSystemITCase.testDirectoryListing fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418879#comment-16418879 ]

ASF GitHub Bot commented on FLINK-8402:
---------------------------------------

Github user NicoK commented on the issue:

    https://github.com/apache/flink/pull/5624

> HadoopS3FileSystemITCase.testDirectoryListing fails on Travis
> -------------------------------------------------------------
>
>                 Key: FLINK-8402
>                 URL: https://issues.apache.org/jira/browse/FLINK-8402
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystem, Tests
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Nico Kruber
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.5.0, 1.4.3
>
> The test {{HadoopS3FileSystemITCase.testDirectoryListing}} fails on Travis.
> https://travis-ci.org/tillrohrmann/flink/jobs/327021175
[jira] [Commented] (FLINK-6571) InfiniteSource in SourceStreamOperatorTest should deal with InterruptedExceptions
[ https://issues.apache.org/jira/browse/FLINK-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418896#comment-16418896 ] ASF GitHub Bot commented on FLINK-6571: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5394 What's the state @zentol? Would Stephan's proposal work? > InfiniteSource in SourceStreamOperatorTest should deal with > InterruptedExceptions > - > > Key: FLINK-6571 > URL: https://issues.apache.org/jira/browse/FLINK-6571 > Project: Flink > Issue Type: Improvement > Components: Tests >Affects Versions: 1.3.0, 1.4.0, 1.5.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Blocker > Labels: test-stability > Fix For: 1.5.0 > > > So this is a new one: I got a failing test > ({{testNoMaxWatermarkOnAsyncStop}}) due to an uncaught InterruptedException. > {code} > [00:28:15] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 0.828 sec <<< FAILURE! - in > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest > [00:28:15] > testNoMaxWatermarkOnAsyncStop(org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest) > Time elapsed: 0 sec <<< ERROR! > [00:28:15] java.lang.InterruptedException: sleep interrupted > [00:28:15]at java.lang.Thread.sleep(Native Method) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest$InfiniteSource.run(StreamSourceOperatorTest.java:343) > [00:28:15]at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest.testNoMaxWatermarkOnAsyncStop(StreamSourceOperatorTest.java:176) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-6571) InfiniteSource in SourceStreamOperatorTest should deal with InterruptedExceptions
[ https://issues.apache.org/jira/browse/FLINK-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-6571: - Priority: Critical (was: Blocker) > InfiniteSource in SourceStreamOperatorTest should deal with > InterruptedExceptions > - > > Key: FLINK-6571 > URL: https://issues.apache.org/jira/browse/FLINK-6571 > Project: Flink > Issue Type: Improvement > Components: Tests >Affects Versions: 1.3.0, 1.4.0, 1.5.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Labels: test-stability > Fix For: 1.5.0 > > > So this is a new one: I got a failing test > ({{testNoMaxWatermarkOnAsyncStop}}) due to an uncaught InterruptedException. > {code} > [00:28:15] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 0.828 sec <<< FAILURE! - in > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest > [00:28:15] > testNoMaxWatermarkOnAsyncStop(org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest) > Time elapsed: 0 sec <<< ERROR! > [00:28:15] java.lang.InterruptedException: sleep interrupted > [00:28:15]at java.lang.Thread.sleep(Native Method) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest$InfiniteSource.run(StreamSourceOperatorTest.java:343) > [00:28:15]at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest.testNoMaxWatermarkOnAsyncStop(StreamSourceOperatorTest.java:176) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-6571) InfiniteSource in SourceStreamOperatorTest should deal with InterruptedExceptions
[ https://issues.apache.org/jira/browse/FLINK-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418900#comment-16418900 ] Till Rohrmann commented on FLINK-6571: -- Unblocked 1.5.0 because the problem seems to have existed for longer. > InfiniteSource in SourceStreamOperatorTest should deal with > InterruptedExceptions > - > > Key: FLINK-6571 > URL: https://issues.apache.org/jira/browse/FLINK-6571 > Project: Flink > Issue Type: Improvement > Components: Tests >Affects Versions: 1.3.0, 1.4.0, 1.5.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Labels: test-stability > Fix For: 1.5.0 > > > So this is a new one: I got a failing test > ({{testNoMaxWatermarkOnAsyncStop}}) due to an uncaught InterruptedException. > {code} > [00:28:15] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 0.828 sec <<< FAILURE! - in > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest > [00:28:15] > testNoMaxWatermarkOnAsyncStop(org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest) > Time elapsed: 0 sec <<< ERROR! > [00:28:15] java.lang.InterruptedException: sleep interrupted > [00:28:15]at java.lang.Thread.sleep(Native Method) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest$InfiniteSource.run(StreamSourceOperatorTest.java:343) > [00:28:15]at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87) > [00:28:15]at > org.apache.flink.streaming.runtime.operators.StreamSourceOperatorTest.testNoMaxWatermarkOnAsyncStop(StreamSourceOperatorTest.java:176) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5394: [FLINK-6571][tests] Catch InterruptedException in StreamS...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5394 What's the state @zentol? Would Stephan's proposal work? ---
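The fix being discussed (an infinite test source that "deals with" InterruptedException instead of letting it escape) can be sketched as follows. The class and names are hypothetical, not the actual StreamSourceOperatorTest code, and "Stephan's proposal" is not visible in this thread; this merely illustrates the general pattern of treating an interrupt as a stop signal.

```java
// Hypothetical sketch of an interrupt-tolerant infinite test source:
// instead of propagating InterruptedException out of run(), it restores
// the interrupt flag and exits the loop cleanly.
public class InterruptTolerantSource {
    private volatile boolean running = true;

    public void run() {
        while (running) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                // restore the interrupt flag for the caller and stop looping
                Thread.currentThread().interrupt();
                running = false;
            }
        }
    }

    public void cancel() {
        running = false;
    }

    public boolean isRunning() {
        return running;
    }
}
```

With this shape, an interrupt delivered by the test harness ends the source without producing the "sleep interrupted" test error quoted in the issue.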
[GitHub] flink pull request #5776: [FLINK-8708] Unintended integer division in Standa...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5776 ---
[jira] [Commented] (FLINK-8402) HadoopS3FileSystemITCase.testDirectoryListing fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418903#comment-16418903 ] Till Rohrmann commented on FLINK-8402: -- Unblocked 1.5.0, because this is a test instability caused by the underlying filesystem's eventual consistency. > HadoopS3FileSystemITCase.testDirectoryListing fails on Travis > - > > Key: FLINK-8402 > URL: https://issues.apache.org/jira/browse/FLINK-8402 > Project: Flink > Issue Type: Bug > Components: FileSystem, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Nico Kruber >Priority: Critical > Labels: test-stability > Fix For: 1.5.0, 1.4.3 > > > The test {{HadoopS3FileSystemITCase.testDirectoryListing}} fails on Travis. > https://travis-ci.org/tillrohrmann/flink/jobs/327021175 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8402) HadoopS3FileSystemITCase.testDirectoryListing fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-8402: - Priority: Critical (was: Blocker) > HadoopS3FileSystemITCase.testDirectoryListing fails on Travis > - > > Key: FLINK-8402 > URL: https://issues.apache.org/jira/browse/FLINK-8402 > Project: Flink > Issue Type: Bug > Components: FileSystem, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Nico Kruber >Priority: Critical > Labels: test-stability > Fix For: 1.5.0, 1.4.3 > > > The test {{HadoopS3FileSystemITCase.testDirectoryListing}} fails on Travis. > https://travis-ci.org/tillrohrmann/flink/jobs/327021175 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8707) Excessive amount of files opened by flink task manager
[ https://issues.apache.org/jira/browse/FLINK-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-8707: - Priority: Critical (was: Blocker) > Excessive amount of files opened by flink task manager > -- > > Key: FLINK-8707 > URL: https://issues.apache.org/jira/browse/FLINK-8707 > Project: Flink > Issue Type: Bug > Components: TaskManager >Affects Versions: 1.3.2 > Environment: NAME="Red Hat Enterprise Linux Server" > VERSION="7.3 (Maipo)" > Two boxes, each with a Job Manager & Task Manager, using Zookeeper for HA. > flink.yaml below with some settings (removed exact box names) etc: > env.log.dir: ...some dir...residing on the same box > env.pid.dir: some dir...residing on the same box > metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter > metrics.reporters: jmx > state.backend: filesystem > state.backend.fs.checkpointdir: file:///some_nfs_mount > state.checkpoints.dir: file:///some_nfs_mount > state.checkpoints.num-retained: 3 > high-availability.cluster-id: /tst > high-availability.storageDir: file:///some_nfs_mount/ha > high-availability: zookeeper > high-availability.zookeeper.path.root: /flink > high-availability.zookeeper.quorum: ...list of zookeeper boxes > env.java.opts.jobmanager: ...some extra jar args > jobmanager.archive.fs.dir: some dir...residing on the same box > jobmanager.web.submit.enable: true > jobmanager.web.tmpdir: some dir...residing on the same box > env.java.opts.taskmanager: some extra jar args > taskmanager.tmp.dirs: some dir...residing on the same box/var/tmp > taskmanager.network.memory.min: 1024MB > taskmanager.network.memory.max: 2048MB > blob.storage.directory: some dir...residing on the same box >Reporter: Alexander Gardner >Assignee: Piotr Nowojski >Priority: Critical > Fix For: 1.5.0 > > Attachments: AfterRunning-3-jobs-Box2-TM-JCONSOLE.png, > AfterRunning-3-jobs-TM-FDs-BOX2.jpg, AfterRunning-3-jobs-lsof-p.box2-TM, > AfterRunning-3-jobs-lsof.box2-TM, 
AterRunning-3-jobs-Box1-TM-JCONSOLE.png, > box1-jobmgr-lsof, box1-taskmgr-lsof, box2-jobmgr-lsof, box2-taskmgr-lsof > > > The job manager has fewer FDs than the task manager. > > Hi > A support alert indicated that there were a lot of open files for the boxes > running Flink. > There were 4 flink jobs that were dormant but had consumed a number of msgs > from Kafka using the FlinkKafkaConsumer010. > A simple general lsof: > $ lsof | wc -l -> returned 153114 open file descriptors. > Focusing on the TaskManager process (process ID = 12154): > $ lsof | grep 12154 | wc -l -> returned 129322 open FDs > $ lsof -p 12154 | wc -l -> returned 531 FDs > There were 228 threads running for the task manager. > > Drilling down a bit further, looking at a_inode and FIFO entries: > $ lsof -p 12154 | grep a_inode | wc -l = 100 FDs > $ lsof -p 12154 | grep FIFO | wc -l = 200 FDs > $ /proc/12154/maps = 920 entries. > Apart from lsof identifying lots of JARs and SOs being referenced there were > also 244 child processes for the task manager process. > Noticed that in each environment, a creep of file descriptors...are the above > figures deemed excessive for the number of FDs in use? I know Flink uses Netty - > is it using a separate Selector for reads & writes? > Additionally, Flink uses memory-mapped files or direct byte buffers; are these > skewing the numbers of FDs shown? > Example of one child process ID 6633: > java 12154 6633 dfdev 387u a_inode 0,9 0 5869 [eventpoll] > java 12154 6633 dfdev 388r FIFO 0,8 0t0 459758080 pipe > java 12154 6633 dfdev 389w FIFO 0,8 0t0 459758080 pipe > Lastly, I cannot yet identify the reason for the creep in FDs even if Flink is > pretty dormant or has dormant jobs. Production nodes are not experiencing > excessive amounts of throughput yet either. > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8707) Excessive amount of files opened by flink task manager
[ https://issues.apache.org/jira/browse/FLINK-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418909#comment-16418909 ] Till Rohrmann commented on FLINK-8707: -- Unblocked for 1.5.0 release since so far we could not reproduce the problem while testing. We should definitely keep an eye on it, though. > Excessive amount of files opened by flink task manager > -- > > Key: FLINK-8707 > URL: https://issues.apache.org/jira/browse/FLINK-8707 > Project: Flink > Issue Type: Bug > Components: TaskManager >Affects Versions: 1.3.2 > Environment: NAME="Red Hat Enterprise Linux Server" > VERSION="7.3 (Maipo)" > Two boxes, each with a Job Manager & Task Manager, using Zookeeper for HA. > flink.yaml below with some settings (removed exact box names) etc: > env.log.dir: ...some dir...residing on the same box > env.pid.dir: some dir...residing on the same box > metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter > metrics.reporters: jmx > state.backend: filesystem > state.backend.fs.checkpointdir: file:///some_nfs_mount > state.checkpoints.dir: file:///some_nfs_mount > state.checkpoints.num-retained: 3 > high-availability.cluster-id: /tst > high-availability.storageDir: file:///some_nfs_mount/ha > high-availability: zookeeper > high-availability.zookeeper.path.root: /flink > high-availability.zookeeper.quorum: ...list of zookeeper boxes > env.java.opts.jobmanager: ...some extra jar args > jobmanager.archive.fs.dir: some dir...residing on the same box > jobmanager.web.submit.enable: true > jobmanager.web.tmpdir: some dir...residing on the same box > env.java.opts.taskmanager: some extra jar args > taskmanager.tmp.dirs: some dir...residing on the same box/var/tmp > taskmanager.network.memory.min: 1024MB > taskmanager.network.memory.max: 2048MB > blob.storage.directory: some dir...residing on the same box >Reporter: Alexander Gardner >Assignee: Piotr Nowojski >Priority: Blocker > Fix For: 1.5.0 > > Attachments: 
AfterRunning-3-jobs-Box2-TM-JCONSOLE.png, > AfterRunning-3-jobs-TM-FDs-BOX2.jpg, AfterRunning-3-jobs-lsof-p.box2-TM, > AfterRunning-3-jobs-lsof.box2-TM, AterRunning-3-jobs-Box1-TM-JCONSOLE.png, > box1-jobmgr-lsof, box1-taskmgr-lsof, box2-jobmgr-lsof, box2-taskmgr-lsof > > > The job manager has fewer FDs than the task manager. > > Hi > A support alert indicated that there were a lot of open files for the boxes > running Flink. > There were 4 flink jobs that were dormant but had consumed a number of msgs > from Kafka using the FlinkKafkaConsumer010. > A simple general lsof: > $ lsof | wc -l -> returned 153114 open file descriptors. > Focusing on the TaskManager process (process ID = 12154): > $ lsof | grep 12154 | wc -l -> returned 129322 open FDs > $ lsof -p 12154 | wc -l -> returned 531 FDs > There were 228 threads running for the task manager. > > Drilling down a bit further, looking at a_inode and FIFO entries: > $ lsof -p 12154 | grep a_inode | wc -l = 100 FDs > $ lsof -p 12154 | grep FIFO | wc -l = 200 FDs > $ /proc/12154/maps = 920 entries. > Apart from lsof identifying lots of JARs and SOs being referenced there were > also 244 child processes for the task manager process. > Noticed that in each environment, a creep of file descriptors...are the above > figures deemed excessive for the number of FDs in use? I know Flink uses Netty - > is it using a separate Selector for reads & writes? > Additionally, Flink uses memory-mapped files or direct byte buffers; are these > skewing the numbers of FDs shown? > Example of one child process ID 6633: > java 12154 6633 dfdev 387u a_inode 0,9 0 5869 [eventpoll] > java 12154 6633 dfdev 388r FIFO 0,8 0t0 459758080 pipe > java 12154 6633 dfdev 389w FIFO 0,8 0t0 459758080 pipe > Lastly, I cannot yet identify the reason for the creep in FDs even if Flink is > pretty dormant or has dormant jobs. Production nodes are not experiencing > excessive amounts of throughput yet either. 
> > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
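As a lighter-weight complement to the lsof commands quoted in the report above, a JVM can count its own open file descriptors by listing /proc/self/fd (Linux only; the class name here is illustrative):

```java
// Sketch: count this JVM's open file descriptors via /proc/self/fd.
// Returns -1 where /proc is unavailable (e.g. macOS, Windows).
import java.io.File;

public class FdCount {
    public static int openFdCount() {
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }
}
```

Sampling this value periodically from inside a TaskManager would make an FD creep like the one described visible without shell access to the box.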
[GitHub] flink pull request #5782: [FLINK-6567] [tests] Harden ExecutionGraphMetricsT...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5782#discussion_r178017403 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphMetricsTest.java --- @@ -140,6 +140,9 @@ public void testExecutionGraphRestartTimeMetric() throws JobException, IOExcepti assertTrue(currentRestartingTime >= previousRestartingTime); previousRestartingTime = currentRestartingTime; + + // add some pause to let the currentRestartingTime increase + Thread.sleep(1L); --- End diff -- Yes can do. ---
[jira] [Commented] (FLINK-6567) ExecutionGraphMetricsTest fails on Windows CI
[ https://issues.apache.org/jira/browse/FLINK-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418720#comment-16418720 ] ASF GitHub Bot commented on FLINK-6567: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5782 Thanks for the review @zentol. Let me know once you've run the test on your machine so that I can merge this PR. > ExecutionGraphMetricsTest fails on Windows CI > - > > Key: FLINK-6567 > URL: https://issues.apache.org/jira/browse/FLINK-6567 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: Till Rohrmann >Priority: Blocker > Labels: test-stability > Fix For: 1.6.0 > > > The {{testExecutionGraphRestartTimeMetric}} fails every time I run it on > AppVeyor. It also very rarely failed for me locally. > The test fails at Line 235 if the RUNNING timestamp is equal to the > RESTARTING timestamp, which may happen when combining a fast test with a low > resolution clock. > A simple fix would be to increase the timestamp between RUNNING and > RESTARTING by adding a 50ms sleep timeout into the > {{TestingRestartStrategy#canRestart()}} method, as this one is called before > transitioning to the RESTARTING state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-6567) ExecutionGraphMetricsTest fails on Windows CI
[ https://issues.apache.org/jira/browse/FLINK-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418717#comment-16418717 ] ASF GitHub Bot commented on FLINK-6567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5782#discussion_r178017114 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphMetricsTest.java --- @@ -140,6 +140,9 @@ public void testExecutionGraphRestartTimeMetric() throws JobException, IOExcepti assertTrue(currentRestartingTime >= previousRestartingTime); previousRestartingTime = currentRestartingTime; + + // add some pause to let the currentRestartingTime increase + Thread.sleep(1L); --- End diff -- I think we have to keep it here, because otherwise the loop might finish so fast that we don't see an increase in `previousRestartingTime` because this value is effectively `System.currentTimeMillis - timestamps[RESTARTING]`. > ExecutionGraphMetricsTest fails on Windows CI > - > > Key: FLINK-6567 > URL: https://issues.apache.org/jira/browse/FLINK-6567 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: Till Rohrmann >Priority: Blocker > Labels: test-stability > Fix For: 1.6.0 > > > The {{testExecutionGraphRestartTimeMetric}} fails every time I run it on > AppVeyor. It also very rarely failed for me locally. > The test fails at Line 235 if the RUNNING timestamp is equal to the > RESTARTING timestamp, which may happen when combining a fast test with a low > resolution clock. > A simple fix would be to increase the timestamp between RUNNING and > RESTARTING by adding a 50ms sleep timeout into the > {{TestingRestartStrategy#canRestart()}} method, as this one is called before > transitioning to the RESTARTING state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5782: [FLINK-6567] [tests] Harden ExecutionGraphMetricsTest
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/5782 Thanks for the review @zentol. Let me know once you've run the test on your machine so that I can merge this PR. ---
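The point made in the diff discussion above can be reduced to a small sketch (field and method names assumed for illustration, not the actual ExecutionGraph code): the restarting time is effectively `System.currentTimeMillis()` minus the RESTARTING timestamp, so two reads within the same millisecond return the same value, and only a `>=` assertion plus a short sleep makes the monotonicity check reliable on a low-resolution clock.

```java
// Hypothetical probe mimicking how the restarting-time metric is computed:
// an elapsed-time value derived from a millisecond clock, which can return
// the same value on two successive calls within the same millisecond.
public class RestartTimeProbe {
    private final long restartTimestamp = System.currentTimeMillis();

    public long currentRestartingTime() {
        return System.currentTimeMillis() - restartTimestamp;
    }
}
```

A test asserting strict increase between two immediate calls would be flaky for exactly the reason described; sleeping even 1 ms between reads forces the clock to advance.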
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418747#comment-16418747 ] ASF GitHub Bot commented on FLINK-8813: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178019320 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- I think enums are preferable to booleans. > AutoParallellismITCase fails with Flip6 > --- > > Key: FLINK-8813 > URL: https://issues.apache.org/jira/browse/FLINK-8813 > Project: Flink > Issue Type: Bug > Components: JobManager, Tests >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: Piotr Nowojski >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > The {{AutoParallelismITCase}} fails when running against flip6. > ([https://travis-ci.org/zentol/flink/jobs/347373854)] > It appears that the {{JobMaster}} does not properly handle > {{ExecutionConfig#PARALLELISM_AUTO_MAX}}. > > Exception: > {code:java} > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not > start JobManager. 
> at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:210) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:66) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132) > at > akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544) > at akka.actor.Actor$class.aroundReceive(Actor.scala:502) > at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) > at akka.actor.ActorCell.invoke(ActorCell.scala:495) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) > at akka.dispatch.Mailbox.run(Mailbox.scala:224) > at akka.dispatch.Mailbox.exec(Mailbox.scala:234) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not > set up JobManager > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:181) > at > org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:747) > at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:243) > ... 
20 more > Caused by: java.lang.IllegalArgumentException: The parallelism must be at > least one. > at > org.apache.flink.runtime.jobgraph.JobVertex.setParallelism(JobVertex.java:290) > at > org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:162) > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:295) > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:170) > ... 22 more{code} > > The likely culprit is this call to {{ExecutionGraphBuilder#buildGraph}} in > the {{JobMaster}} constructor: > {code:java} > this.executionGraph = ExecutionGraphBuilder.buildGraph( >null, >jobGraph, >jobMasterConfiguration.getConfiguration(), >scheduledExecutorService, >scheduledExecutorService, >slotPool.getSlotProvider(), >userCodeLoader, >highAvailabilityServices.getCheckpointRecoveryFactory(), >rpcTimeout, >restartStrategy, >jobMetricGroup, >-1, // parallelismForAutoMax >blobServer, >jobMasterConfiguration.getSlotRequestTimeout(), >log);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5756: [FLINK-8813][flip6] Disallow PARALLELISM_AUTO_MAX ...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178019320 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- I think enums are preferable to booleans. ---
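The review preference for an enum over a boolean flag can be illustrated with a small sketch (the class and constant names here are illustrative, not Flink's actual `MiniClusterResource` API): an enum documents intent at every call site and leaves room for a third mode later without changing method signatures.

```java
// Hypothetical sketch: a typed cluster-mode enum instead of a boolean flag.
// new ClusterSpec(MiniClusterType.NEW) is self-describing at the call site,
// whereas new ClusterSpec(true) tells the reader nothing.
public class ClusterSpec {
    public enum MiniClusterType { LEGACY, NEW }

    private final MiniClusterType type;

    public ClusterSpec(MiniClusterType type) {
        this.type = type;
    }

    public MiniClusterType getMiniClusterType() {
        return type;
    }
}
```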
[jira] [Commented] (FLINK-8985) End-to-end test: CLI
[ https://issues.apache.org/jira/browse/FLINK-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418795#comment-16418795 ] Till Rohrmann commented on FLINK-8985: -- Hi [~walterddr], yes, starting with these commands is a good idea. The only other command we have introduced is {{flink modify -p }}. > End-to-end test: CLI > > > Key: FLINK-8985 > URL: https://issues.apache.org/jira/browse/FLINK-8985 > Project: Flink > Issue Type: Sub-task > Components: Client, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test that verifies that all client commands are > working correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9109) Add flink modify command to documentation
[ https://issues.apache.org/jira/browse/FLINK-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418813#comment-16418813 ] ASF GitHub Bot commented on FLINK-9109: --- GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/5786 [FLINK-9109] [doc] Update documentation for CLI ## What is the purpose of the change Update documentation for CLI. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink addDocumentationForModifyCommand Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5786.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5786 commit 2822c0eadfd5469fb293d73b2c9466def1f38607 Author: Till Rohrmann Date: 2018-03-29T11:36:34Z [FLINK-9109] [doc] Update documentation for CLI > Add flink modify command to documentation > - > > Key: FLINK-9109 > URL: https://issues.apache.org/jira/browse/FLINK-9109 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > We should add documentation for the {{flink modify}} command. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8813) AutoParallellismITCase fails with Flip6
[ https://issues.apache.org/jira/browse/FLINK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418831#comment-16418831 ] ASF GitHub Bot commented on FLINK-8813: --- Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/5756#discussion_r178030866 --- Diff: flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java --- @@ -93,6 +93,10 @@ private MiniClusterResource( this.enableClusterClient = enableClusterClient; } + public MiniClusterType getMiniClusterType() { --- End diff -- This is definitely not a big issue. I will merge. > AutoParallellismITCase fails with Flip6 > --- > > Key: FLINK-8813 > URL: https://issues.apache.org/jira/browse/FLINK-8813 > Project: Flink > Issue Type: Bug > Components: JobManager, Tests >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: Piotr Nowojski >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > > The {{AutoParallelismITCase}} fails when running against flip6. > ([https://travis-ci.org/zentol/flink/jobs/347373854)] > It appears that the {{JobMaster}} does not properly handle > {{ExecutionConfig#PARALLELISM_AUTO_MAX}}. > > Exception: > {code:java} > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not > start JobManager. 
> at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:210) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:66) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132) > at > akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544) > at akka.actor.Actor$class.aroundReceive(Actor.scala:502) > at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) > at akka.actor.ActorCell.invoke(ActorCell.scala:495) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) > at akka.dispatch.Mailbox.run(Mailbox.scala:224) > at akka.dispatch.Mailbox.exec(Mailbox.scala:234) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not > set up JobManager > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:181) > at > org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:747) > at > org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:243) > ... 
20 more > Caused by: java.lang.IllegalArgumentException: The parallelism must be at > least one. > at > org.apache.flink.runtime.jobgraph.JobVertex.setParallelism(JobVertex.java:290) > at > org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:162) > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:295) > at > org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:170) > ... 22 more{code} > > The likely culprit is this call to {{ExecutionGraphBuilder#buildGraph}} in > the {{JobMaster}} constructor: > {code:java} > this.executionGraph = ExecutionGraphBuilder.buildGraph( >null, >jobGraph, >jobMasterConfiguration.getConfiguration(), >scheduledExecutorService, >scheduledExecutorService, >slotPool.getSlotProvider(), >userCodeLoader, >highAvailabilityServices.getCheckpointRecoveryFactory(), >rpcTimeout, >restartStrategy, >jobMetricGroup, >-1, // parallelismForAutoMax >blobServer, >jobMasterConfiguration.getSlotRequestTimeout(), >log);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
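The failure mode in the stack trace above is the -1 `parallelismForAutoMax` sentinel reaching `JobVertex#setParallelism`, which requires a value of at least one. A small sketch of the idea that the sentinel must be resolved (or rejected) up front, before validation (this helper is hypothetical, not the actual ExecutionGraphBuilder; the linked PR's approach is to disallow the sentinel in flip6):

```java
// Hypothetical helper: resolve the auto-max sentinel to a concrete value
// before JobVertex validation, instead of letting -1 reach setParallelism
// and fail with an opaque "must be at least one" error.
public class ParallelismUtil {
    // mirrors ExecutionConfig#PARALLELISM_AUTO_MAX
    public static final int PARALLELISM_AUTO_MAX = -1;

    public static int resolve(int requested, int maxAvailableSlots) {
        if (requested == PARALLELISM_AUTO_MAX) {
            return maxAvailableSlots;
        }
        if (requested < 1) {
            throw new IllegalArgumentException("The parallelism must be at least one.");
        }
        return requested;
    }
}
```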
[GitHub] flink pull request #5742: [FLINK-9031] Fix DataSet Union operator translatio...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5742 ---
[GitHub] flink issue #5776: [FLINK-8708] Unintended integer division in StandaloneThr...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5776 Thanks, merging this... ---
[GitHub] flink pull request #5787: Release 1.5
GitHub user 386587793 opened a pull request: https://github.com/apache/flink/pull/5787 Release 1.5 *Thank you very much for contributing to Apache Flink - we are happy that you want to help us improve Flink. To help the community review your contribution in the best possible way, please go through the checklist below, which will get the contribution into a shape in which it can be best reviewed.* *Please understand that we do not do this to make contributions to Flink a hassle. In order to uphold a high standard of quality for code contributions, while at the same time managing a large number of contributions, we need contributors to prepare the contributions well, and give reviewers enough contextual information for the review. Please also understand that contributions that do not follow this guide will take longer to review and thus typically be picked up with lower priority by the community.* ## Contribution Checklist - Make sure that the pull request corresponds to a [JIRA issue](https://issues.apache.org/jira/projects/FLINK/issues). Exceptions are made for typos in JavaDoc or documentation files, which need no JIRA issue. - Name the pull request in the form "[FLINK-] [component] Title of the pull request", where *FLINK-* should be replaced by the actual issue number. Skip *component* if you are unsure about which is the best component. Typo fixes that have no associated JIRA issue should be named following this pattern: `[hotfix] [docs] Fix typo in event time introduction` or `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`. - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review. - Make sure that the change passes the automated tests, i.e., `mvn clean verify` passes. You can set up Travis CI to do that following [this guide](http://flink.apache.org/contribute-code.html#best-practices). 
- Each pull request should address only one issue, not mix up code from multiple issues. - Each commit in the pull request has a meaningful commit message (including the JIRA id) - Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below. **(The sections below can be removed for hotfixes of typos)** ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. 
*(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/flink release-1.5 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5787.patch To close
[jira] [Commented] (FLINK-8804) Bump flink-shaded dependency to 3.0
[ https://issues.apache.org/jira/browse/FLINK-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418912#comment-16418912 ] Till Rohrmann commented on FLINK-8804: -- Do we strictly need flink-shaded 3.0 for the Flink 1.5.0 release [~Zentol]? > Bump flink-shaded dependency to 3.0 > --- > > Key: FLINK-8804 > URL: https://issues.apache.org/jira/browse/FLINK-8804 > Project: Flink > Issue Type: Improvement > Components: Build System >Affects Versions: 1.5.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Blocker > Fix For: 1.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8533) Support MasterTriggerRestoreHook state reinitialization
[ https://issues.apache.org/jira/browse/FLINK-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419889#comment-16419889 ] ASF GitHub Bot commented on FLINK-8533: --- Github user EronWright commented on the issue: https://github.com/apache/flink/pull/5427 @tillrohrmann @StephanEwen sorry about the long delay here, would you please take another look? I followed Stephan's suggestion of not introducing a new method. However, the semantics that I was shooting for with `initializeState` is that it would be called on both _start_ and _restart_. I adjusted `JobManager` to call `restoreLatestCheckpointedState` on first execution (as does `JobMaster`). Are you OK with that? > Support MasterTriggerRestoreHook state reinitialization > --- > > Key: FLINK-8533 > URL: https://issues.apache.org/jira/browse/FLINK-8533 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.0 >Reporter: Eron Wright >Assignee: Eron Wright >Priority: Major > > {{MasterTriggerRestoreHook}} enables coordination with an external system for > taking or restoring checkpoints. When execution is restarted from a > checkpoint, {{restoreCheckpoint}} is called to restore or reinitialize the > external system state. There's an edge case where the external state is not > adequately reinitialized, that is when execution fails _before the first > checkpoint_. In that case, the hook is not invoked and has no opportunity to > restore the external state to initial conditions. > The impact is a loss of exactly-once semantics in this case. For example, in > the Pravega source function, the reader group state (e.g. stream position > data) is stored externally. In the normal restore case, the reader group > state is forcibly rewound to the checkpointed position. In the edge case > where no checkpoint has yet been successful, the reader group state is not > rewound and consequently some amount of stream data is not reprocessed. 
> A possible fix would be to introduce an {{initializeState}} method on the > hook interface. Similar to {{CheckpointedFunction::initializeState}}, this > method would be invoked unconditionally upon hook initialization. The Pravega > hook would, for example, initialize or forcibly reinitialize the reader group > state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
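To make the edge case concrete, here is a minimal sketch (hypothetical names, not Flink's actual MasterTriggerRestoreHook API) of a hook whose external state is reset only through the restore path; invoking that path unconditionally on (re)start, even when no checkpoint completed yet, restores the initial conditions:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the edge case: a hook whose external state is only
// reset when the coordinator calls the restore path.
public class HookResetSketch {
    interface RestoreHook {
        // checkpointId == null means "no checkpoint yet": reset to initial state
        void restoreCheckpoint(Long checkpointId);
    }

    static class ExternalPosition implements RestoreHook {
        final AtomicInteger position = new AtomicInteger(42); // drifted external state
        @Override public void restoreCheckpoint(Long checkpointId) {
            position.set(checkpointId == null ? 0 : checkpointId.intValue());
        }
    }

    /** Restart path that invokes the hook even when no checkpoint completed. */
    public static int restart(ExternalPosition hook, Long latestCheckpoint) {
        hook.restoreCheckpoint(latestCheckpoint); // unconditional: covers the edge case
        return hook.position.get();
    }
}
```

If the restart path skipped the call when `latestCheckpoint` is null, the drifted external position (42) would survive into the new execution, which is the exactly-once violation described above.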
[jira] [Comment Edited] (FLINK-9105) Table program compiles failed
[ https://issues.apache.org/jira/browse/FLINK-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420044#comment-16420044 ] Bob Lau edited comment on FLINK-9105 at 3/30/18 1:55 AM: - [~twalthr] Thank you for your response! The SQL statement I'm trying to compile is 'insert into zdry_wbyj select a.certificate_code,a.user_name from wb_swry a inner join ry_bc_duibi_all b on a.certificate_code = b.zjhm where b.bc_end_time > a.offline_time and b.yxx=1'. I'm using the streaming environment API, as follows: StreamExecutionEnvironment environment = StreamExecutionEnvironment.createRemoteEnvironment(host, port, depend); The variable 'depend' is an array of absolute paths to the jar packages that contain the whole current project. My operating system is Mac OS X; the standalone Flink cluster runs CentOS 6.8. The project's jars are compiled on Mac OS X, while the remote Flink cluster is built on CentOS 6.8. Do I need to copy the Linux Flink jars into the current project and add them as Maven dependencies? It's surprising that I can debug successfully in local mode. was (Author: bob365): @[~twalthr] Thank you for your response! The SQL statement I'm trying to compile is 'insert into zdry_wbyj select a.certificate_code,a.user_name from wb_swry a inner join ry_bc_duibi_all b on a.certificate_code = b.zjhm where b.bc_end_time > a.offline_time and b.yxx=1'. I'm using the streaming environment API, as follows: StreamExecutionEnvironment environment = StreamExecutionEnvironment.createRemoteEnvironment(host, port, depend); The variable 'depend' is an array of absolute paths to the jar packages that contain the whole current project. My operating system is Mac OS X; the standalone Flink cluster runs CentOS 6.8. The project's jars are compiled on Mac OS X, while the remote Flink cluster is built on CentOS 6.8. Do I need to copy the Linux Flink jars into the current project and add them as Maven dependencies? 
> Table program compiles failed > - > > Key: FLINK-9105 > URL: https://issues.apache.org/jira/browse/FLINK-9105 > Project: Flink > Issue Type: Bug > Components: DataStream API >Affects Versions: 1.4.0, 1.5.0, 1.4.1, 1.4.2 >Reporter: Bob Lau >Priority: Major > > ExceptionStack: > org.apache.flink.client.program.ProgramInvocationException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:253) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:463) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178) > at > com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:150) > at > com.tydic.tysc.rest.SubmitJobController.submitJob(SubmitJobController.java:55) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) > at > org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) > at > 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) > at > org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) > at > org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) > at >
[GitHub] flink issue #5427: [FLINK-8533] [checkpointing] Support MasterTriggerRestore...
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/5427 @tillrohrmann @StephanEwen sorry about the long delay here, would you please take another look? I followed Stephan's suggestion of not introducing a new method. However, the semantics that I was shooting for with `initializeState` is that it would be called on both _start_ and _restart_. I adjusted `JobManager` to call `restoreLatestCheckpointedState` on first execution (as does `JobMaster`). Are you OK with that? ---
[jira] [Created] (FLINK-9115) Support addition of part suffix in BucketingSink
Lakshmi Rao created FLINK-9115: -- Summary: Support addition of part suffix in BucketingSink Key: FLINK-9115 URL: https://issues.apache.org/jira/browse/FLINK-9115 Project: Flink Issue Type: Improvement Components: filesystem-connector Reporter: Lakshmi Rao Currently the BucketingSink allows setting a part prefix, pending prefix/suffix, and in-progress prefix/suffix via setter methods. Can we also support setting part suffixes? An instance where this may be useful: I am currently writing GZIP-compressed output to S3 using the BucketingSink, and I would want the uploaded files to have a ".gz" or ".zip" extension. An easy way to do this would be setting a part file suffix with the required file extension. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
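A sketch of how such a suffix could compose with BucketingSink's existing "prefix-subtaskIndex-partCounter" naming scheme; `setPartSuffix` here is the proposed setter, not an existing Flink API at the time of this issue:

```java
// Hypothetical sketch of the proposed part-suffix feature: part file names
// follow BucketingSink's "prefix-subtask-counter" scheme, with the suffix
// (e.g. ".gz") appended at the end. setPartSuffix is the proposed setter.
public class PartFileNaming {
    private String partPrefix = "part"; // BucketingSink's default prefix
    private String partSuffix = null;   // proposed: e.g. ".gz" for compressed output

    public PartFileNaming setPartSuffix(String suffix) {
        this.partSuffix = suffix;
        return this;
    }

    /** Builds "prefix-subtask-counter", plus the suffix when one is set. */
    public String partFileName(int subtaskIndex, int partCounter) {
        String name = partPrefix + "-" + subtaskIndex + "-" + partCounter;
        return partSuffix == null ? name : name + partSuffix;
    }
}
```

With a ".gz" suffix, subtask 3 writing its 7th part file would produce "part-3-7.gz" instead of "part-3-7", which is exactly what an S3 consumer expecting compressed extensions needs.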
[jira] [Commented] (FLINK-9105) Table program compiles failed
[ https://issues.apache.org/jira/browse/FLINK-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420044#comment-16420044 ] Bob Lau commented on FLINK-9105: Thank you for your response! The SQL statement I'm trying to compile is 'insert into zdry_wbyj select a.certificate_code,a.user_name from wb_swry a inner join ry_bc_duibi_all b on a.certificate_code = b.zjhm where b.bc_end_time > a.offline_time and b.yxx=1'. I'm using the streaming environment API, as follows: StreamExecutionEnvironment environment = StreamExecutionEnvironment.createRemoteEnvironment(host, port, depend); The variable 'depend' is an array of absolute paths to the jar packages that contain the whole current project. My operating system is Mac OS X; the standalone Flink cluster runs CentOS 6.8. The project's jars are compiled on Mac OS X, while the remote Flink cluster is built on CentOS 6.8. Do I need to copy the Linux Flink jars into the current project and add them as Maven dependencies? > Table program compiles failed > - > > Key: FLINK-9105 > URL: https://issues.apache.org/jira/browse/FLINK-9105 > Project: Flink > Issue Type: Bug > Components: DataStream API >Affects Versions: 1.4.0, 1.5.0, 1.4.1, 1.4.2 >Reporter: Bob Lau >Priority: Major > > ExceptionStack: > org.apache.flink.client.program.ProgramInvocationException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. 
> at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:253) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:463) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178) > at > com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:150) > at > com.tydic.tysc.rest.SubmitJobController.submitJob(SubmitJobController.java:55) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) > at > org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) > at > org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) > at > 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) > at > org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.springframework.boot.web.filter.ApplicationContextHeaderFilter.doFilterInternal(ApplicationContextHeaderFilter.java:55) > at > org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at >
[jira] [Assigned] (FLINK-8825) Disallow new String() without charset in checkstyle
[ https://issues.apache.org/jira/browse/FLINK-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned FLINK-8825: --- Assignee: vinoyang > Disallow new String() without charset in checkstyle > --- > > Key: FLINK-8825 > URL: https://issues.apache.org/jira/browse/FLINK-8825 > Project: Flink > Issue Type: Bug > Components: Tests >Reporter: Aljoscha Krettek >Assignee: vinoyang >Priority: Major > Fix For: 1.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
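For reference, the pattern the checkstyle rule would enforce: `new String(bytes)` decodes with the platform-default charset, so results can differ across machines, while an explicit charset makes the decoding deterministic. A small self-contained illustration:

```java
import java.nio.charset.StandardCharsets;

// Why the checkstyle rule matters: new String(bytes) uses the JVM's default
// charset, which varies per platform; passing an explicit charset makes the
// decoding deterministic and portable.
public class CharsetExample {
    /** Flagged by the rule: decoding depends on the platform-default charset. */
    public static String decodeDefault(byte[] bytes) {
        return new String(bytes);
    }

    /** Compliant: the charset is explicit, so the result is deterministic. */
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The difference only shows with non-ASCII data, which is exactly why such bugs slip through tests that run on a single platform.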
[jira] [Updated] (FLINK-9105) Table program compiles failed
[ https://issues.apache.org/jira/browse/FLINK-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Lau updated FLINK-9105: --- Component/s: Table API & SQL > Table program compiles failed > - > > Key: FLINK-9105 > URL: https://issues.apache.org/jira/browse/FLINK-9105 > Project: Flink > Issue Type: Bug > Components: DataStream API, Table API SQL >Affects Versions: 1.5.0 >Reporter: Bob Lau >Priority: Major > > ExceptionStack: > org.apache.flink.client.program.ProgramInvocationException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:253) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:463) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178) > at > com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:150) > at > com.tydic.tysc.rest.SubmitJobController.submitJob(SubmitJobController.java:55) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) > at > 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) > at > org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) > at > org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) > at > org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.springframework.boot.web.filter.ApplicationContextHeaderFilter.doFilterInternal(ApplicationContextHeaderFilter.java:55) > at > org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:112) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > com.tydic.tysc.filter.ShiroSessionFilter.doFilter(ShiroSessionFilter.java:51) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61) > at com.tydic.tysc.filter.OauthFilter.doFilter(OauthFilter.java:48) > at > org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66) > at >
[jira] [Updated] (FLINK-9105) Table program compiles failed
[ https://issues.apache.org/jira/browse/FLINK-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Lau updated FLINK-9105: --- Affects Version/s: (was: 1.4.2) (was: 1.4.1) (was: 1.4.0) > Table program compiles failed > - > > Key: FLINK-9105 > URL: https://issues.apache.org/jira/browse/FLINK-9105 > Project: Flink > Issue Type: Bug > Components: DataStream API, Table API SQL >Affects Versions: 1.5.0 >Reporter: Bob Lau >Priority: Major > > ExceptionStack: > org.apache.flink.client.program.ProgramInvocationException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:253) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:463) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:219) > at > org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:178) > at > com.tydic.tysc.job.service.SubmitJobService.submitJobToStandaloneCluster(SubmitJobService.java:150) > at > com.tydic.tysc.rest.SubmitJobController.submitJob(SubmitJobController.java:55) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) > at > 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) > at > org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) > at > org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) > at > org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.springframework.boot.web.filter.ApplicationContextHeaderFilter.doFilterInternal(ApplicationContextHeaderFilter.java:55) > at > org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:112) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > com.tydic.tysc.filter.ShiroSessionFilter.doFilter(ShiroSessionFilter.java:51) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) > at > org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61) > at com.tydic.tysc.filter.OauthFilter.doFilter(OauthFilter.java:48) > at >
[GitHub] flink issue #5792: [Flink-8563][Table API & SQL] add unittest for consecutiv...
Github user suez1224 commented on the issue: https://github.com/apache/flink/pull/5792 retry build ---
[GitHub] flink pull request #5792: [Flink-8563][Table API & SQL] add unittest for con...
GitHub user suez1224 reopened a pull request: https://github.com/apache/flink/pull/5792 [Flink-8563][Table API & SQL] add unittest for consecutive dot access of composite array element in SQL ## What is the purpose of the change add unittest for consecutive dot access of composite array element in SQL. This depends on https://github.com/apache/flink/pull/5791. ## Brief change log - add unittest ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) You can merge this pull request into a Git repository by running: $ git pull https://github.com/suez1224/flink FLINK-8563 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5792.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5792 commit 637b7462ac049d347fb5bbb8e68ed4fca7b0ec93 Author: Shuyi Chen Date: 2018-03-29T18:39:09Z upgrade calcite dependency to 1.16 commit aea021cb9efc869872595f64467f8d2ec8071ea4 Author: Shuyi Chen Date: 2018-03-29T19:15:15Z add unittest for consecutive dot access of composite array element in SQL ---
[GitHub] flink pull request #5792: [Flink-8563][Table API & SQL] add unittest for con...
Github user suez1224 closed the pull request at: https://github.com/apache/flink/pull/5792 ---