[jira] [Updated] (FLINK-29453) Add uidHash support to State Processor API
[ https://issues.apache.org/jira/browse/FLINK-29453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29453: - Issue Type: New Feature (was: Improvement) > Add uidHash support to State Processor API > --- > > Key: FLINK-29453 > URL: https://issues.apache.org/jira/browse/FLINK-29453 > Project: Flink > Issue Type: New Feature > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > The State Processor API is currently limited to working with uids. > We should change this since this is a good application for the API. > The API should be extended to support uidHashes wherever a uid is supported, > and we should add a method to map uid[hashes] to a different uid[hash]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29511) Sort properties/schemas in OpenAPI spec
[ https://issues.apache.org/jira/browse/FLINK-29511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29511: - Description: The properties/schema order is currently based on whatever order they were looked up, which varies as the spec is being extended. Sort them by name to prevent this. was: The properties order is currently based on whatever order properties were looked up, which varies as the spec is being extended. Sort the properties by name to prevent this. > Sort properties/schemas in OpenAPI spec > --- > > Key: FLINK-29511 > URL: https://issues.apache.org/jira/browse/FLINK-29511 > Project: Flink > Issue Type: Technical Debt > Components: Documentation, Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > The properties/schema order is currently based on whatever order they were > looked up, which varies as the spec is being extended. > Sort them by name to prevent this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29511) Sort properties/schemas in OpenAPI spec
[ https://issues.apache.org/jira/browse/FLINK-29511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29511: - Summary: Sort properties/schemas in OpenAPI spec (was: Sort properties in OpenAPI spec) > Sort properties/schemas in OpenAPI spec > --- > > Key: FLINK-29511 > URL: https://issues.apache.org/jira/browse/FLINK-29511 > Project: Flink > Issue Type: Technical Debt > Components: Documentation, Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > The properties order is currently based on whatever order properties were > looked up, which varies as the spec is being extended. > Sort the properties by name to prevent this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29511) Sort properties in OpenAPI spec
[ https://issues.apache.org/jira/browse/FLINK-29511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29511: - Description: The properties order is currently based on whatever order properties were looked up, which varies as the spec is being extended. Sort the properties by name to prevent this. was: The properties order is currently based on whatever order a HashMap provides, which varies as the spec is being extended. Sort the properties by name to prevent this. > Sort properties in OpenAPI spec > --- > > Key: FLINK-29511 > URL: https://issues.apache.org/jira/browse/FLINK-29511 > Project: Flink > Issue Type: Technical Debt > Components: Documentation, Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > The properties order is currently based on whatever order properties were > looked up, which varies as the spec is being extended. > Sort the properties by name to prevent this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29511) Sort properties in OpenAPI spec
Chesnay Schepler created FLINK-29511: Summary: Sort properties in OpenAPI spec Key: FLINK-29511 URL: https://issues.apache.org/jira/browse/FLINK-29511 Project: Flink Issue Type: Technical Debt Components: Documentation, Runtime / REST Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 The properties order is currently based on whatever order a HashMap provides, which varies as the spec is being extended. Sort the properties by name to prevent this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
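The fix described in FLINK-29511 amounts to iterating the collected properties in a deterministic order instead of HashMap order. A minimal sketch of the idea (illustrative only, not the actual Flink change):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SortedSpecProperties {
    // Copying into a TreeMap yields alphabetical iteration order, so the
    // emitted spec no longer depends on the order properties were looked up.
    static Map<String, String> sortByName(Map<String, String> properties) {
        return new TreeMap<>(properties);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("status", "string");
        props.put("jobId", "string");
        props.put("duration", "integer");
        // Iteration order is now duration, jobId, status regardless of
        // insertion or lookup order.
        System.out.println(sortByName(props).keySet());
    }
}
```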
[jira] [Closed] (FLINK-29456) Add methods that accept OperatorIdentifier
[ https://issues.apache.org/jira/browse/FLINK-29456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29456. Resolution: Fixed master: 8d72490377551a35851a3319c0f49b408d31a566 > Add methods that accept OperatorIdentifier > -- > > Key: FLINK-29456 > URL: https://issues.apache.org/jira/browse/FLINK-29456 > Project: Flink > Issue Type: Sub-task > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > Add new variants of all methods in the SavepointReader/-Writer that accept an > OperatorIdentifier. -- This message was sent by Atlassian Jira (v8.20.10#820010)
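The SavepointReader/-Writer variants above take an identifier that wraps either a uid or a uidHash. A hedged sketch of what such a wrapper could look like (class shape and the hash derivation are illustrative, not Flink's actual implementation):

```java
public final class OperatorIdentifier {
    private final String uid;     // null when built from a raw hash
    private final String uidHash;

    private OperatorIdentifier(String uid, String uidHash) {
        this.uid = uid;
        this.uidHash = uidHash;
    }

    // Build from a user-assigned uid; the hash below is a placeholder,
    // Flink derives operator hashes differently.
    public static OperatorIdentifier forUid(String uid) {
        return new OperatorIdentifier(uid, Integer.toHexString(uid.hashCode()));
    }

    // Build directly from a known operator hash, e.g. read off a savepoint.
    public static OperatorIdentifier forUidHash(String uidHash) {
        return new OperatorIdentifier(null, uidHash);
    }

    public String getUidHash() {
        return uidHash;
    }

    public static void main(String[] args) {
        System.out.println(forUid("my-map-operator").getUidHash());
        System.out.println(forUidHash("deadbeef").getUidHash());
    }
}
```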
[jira] [Closed] (FLINK-29499) DispatcherOperationCaches should implement AutoCloseableAsync
[ https://issues.apache.org/jira/browse/FLINK-29499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29499. Resolution: Fixed master: 3730e24cc4283f877ac35b189dff355579d1de68 > DispatcherOperationCaches should implement AutoCloseableAsync > - > > Key: FLINK-29499 > URL: https://issues.apache.org/jira/browse/FLINK-29499 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29510) Add NoticeFileChecker tests
Chesnay Schepler created FLINK-29510: Summary: Add NoticeFileChecker tests Key: FLINK-29510 URL: https://issues.apache.org/jira/browse/FLINK-29510 Project: Flink Issue Type: Technical Debt Components: Build System, Tests Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 The NoticeFileChecker is too important to not be covered by tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29508) Some NOTICE files are not checked for correctness
Chesnay Schepler created FLINK-29508: Summary: Some NOTICE files are not checked for correctness Key: FLINK-29508 URL: https://issues.apache.org/jira/browse/FLINK-29508 Project: Flink Issue Type: Technical Debt Components: Build System Affects Versions: 1.16.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.16.0 We have 3 modules that are not being deployed (and thus auto-excluded since FLINK-29301) but are still relevant for production. We should amend the checker to take into account whether a non-deployed module is bundled by another deployed module. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29487) RpcService should implement AutoCloseableAsync
[ https://issues.apache.org/jira/browse/FLINK-29487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29487. Resolution: Fixed master: e9f3ec93aad7cec795c765c937ee71807f5478cf > RpcService should implement AutoCloseableAsync > -- > > Key: FLINK-29487 > URL: https://issues.apache.org/jira/browse/FLINK-29487 > Project: Flink > Issue Type: Sub-task > Components: Runtime / RPC >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] Chesnay Schepler edited comment on FLINK-29501 at 10/5/22 10:25 AM: ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_, presumably by a standardized .parallelism argument), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) ??redeploy the Flink JobGraph?? I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. was (Author: zentol): ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) ??redeploy the Flink JobGraph?? I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? 
On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph. For example, because they discover that the job needs more > or less resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping {{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. > One way to achieve this would be to add an optional map field to the REST > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] Chesnay Schepler edited comment on FLINK-29501 at 10/5/22 10:25 AM: ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_, presumably by a standardized .parallelism argument), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this approach has problems :) ??redeploy the Flink JobGraph?? I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. was (Author: zentol): ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_, presumably by a standardized .parallelism argument), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) ??redeploy the Flink JobGraph?? I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? 
Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph. For example, because they discover that the job needs more > or less resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping {{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. > One way to achieve this would be to add an optional map field to the REST > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] Chesnay Schepler edited comment on FLINK-29501 at 10/5/22 10:21 AM: ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) ??redeploy the Flink JobGraph?? I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. was (Author: zentol): ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) > redeploy the Flink JobGraph I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? 
On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph. For example, because they discover that the job needs more > or less resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping {{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. > One way to achieve this would be to add an optional map field to the REST > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] Chesnay Schepler edited comment on FLINK-29501 at 10/5/22 10:21 AM: ??The user application doesn't have access to the JobGraph during the normal execution flow?? It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) > redeploy the Flink JobGraph I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. was (Author: zentol): > The user application doesn't have access to the JobGraph during the normal > execution flow It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) > redeploy the Flink JobGraph I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? 
On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph. For example, because they discover that the job needs more > or less resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping {{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. > One way to achieve this would be to add an optional map field to the REST > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] Chesnay Schepler commented on FLINK-29501: -- > The user application doesn't have access to the JobGraph during the normal > execution flow It would set the parallelism like user applications usually do, in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (could force certain operations to run with a specific parallelism) Yes, this isn't a good approach :) > redeploy the Flink JobGraph I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration? Or is this something meant to be specific to the YARN per-job mode (which loads the jobgraph from a file)? On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though. > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph. For example, because they discover that the job needs more > or less resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping {{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. 
> One way to achieve this would be to add an optional map field to the REST > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
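The proposed configuration parameter would carry a jobVertexId => parallelism mapping. A hypothetical sketch of parsing such a value (the "vertexA:4,vertexB:2" syntax is invented for illustration; the actual option name and format were still under discussion):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParallelismOverrides {
    // Parse a value like "vertexA:4,vertexB:2" into a
    // jobVertexId -> parallelism map.
    static Map<String, Integer> parse(String value) {
        Map<String, Integer> overrides = new LinkedHashMap<>();
        for (String entry : value.split(",")) {
            String[] kv = entry.split(":", 2);
            overrides.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return overrides;
    }

    public static void main(String[] args) {
        System.out.println(parse("vertexA:4,vertexB:2"));
    }
}
```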
[jira] [Updated] (FLINK-29503) Add backpressureLevel field without hyphens
[ https://issues.apache.org/jira/browse/FLINK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29503: - Summary: Add backpressureLevel field without hyphens (was: [OpenAPI] enum `backpressure-level` should be backpressurelevel so code generator recognizes it) > Add backpressureLevel field without hyphens > > > Key: FLINK-29503 > URL: https://issues.apache.org/jira/browse/FLINK-29503 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > Fix For: 1.16.0, 1.15.3 > > > Install nodejs and run > {{$ npx --yes --package openapi-typescript-codegen openapi --input https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml --output .}} > {{$ npx --package typescript tsc}} > The only thing it complains about is: > {{src/models/JobVertexBackPressureInfo.ts:21:17 - error TS1003: Identifier expected.}} at line 21: {{export enum 'backpressure-level'}}. > This is because in TypeScript an enum name must not contain a hyphen. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
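Since the hyphenated key has to stay for existing clients, "adding the field without hyphens" means serializing the same value under both names. A hedged sketch of that idea (not the actual Flink response class, which uses Jackson-annotated POJOs):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BackPressureLevelFields {
    // Emit the same value under the legacy hyphenated key and a camelCase
    // key that TypeScript code generators can turn into a valid identifier.
    static Map<String, String> toJsonFields(String level) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("backpressure-level", level); // kept for compatibility
        fields.put("backpressureLevel", level);  // codegen-friendly duplicate
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toJsonFields("ok"));
    }
}
```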
[jira] [Updated] (FLINK-29503) [OpenAPI] enum `backpressure-level` should be backpressurelevel so code generator recognizes it
[ https://issues.apache.org/jira/browse/FLINK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29503: - Fix Version/s: 1.16.0 1.15.3 > [OpenAPI] enum `backpressure-level` should be backpressurelevel so code > generator recognizes it > --- > > Key: FLINK-29503 > URL: https://issues.apache.org/jira/browse/FLINK-29503 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > Fix For: 1.16.0, 1.15.3 > > > Install nodejs and run > {{$ npx --yes --package openapi-typescript-codegen openapi --input https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml --output .}} > {{$ npx --package typescript tsc}} > The only thing it complains about is: > {{src/models/JobVertexBackPressureInfo.ts:21:17 - error TS1003: Identifier expected.}} at line 21: {{export enum 'backpressure-level'}}. > This is because in TypeScript an enum name must not contain a hyphen. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29503) [OpenAPI] enum `backpressure-level` should be backpressurelevel so code generator recognizes it
[ https://issues.apache.org/jira/browse/FLINK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612919#comment-17612919 ] Chesnay Schepler commented on FLINK-29503: -- This is less an issue with the spec than with the REST API itself. We'll need to duplicate the field. > [OpenAPI] enum `backpressure-level` should be backpressurelevel so code > generator recognizes it > --- > > Key: FLINK-29503 > URL: https://issues.apache.org/jira/browse/FLINK-29503 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes --package openapi-typescript-codegen openapi --input https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml --output .}} > {{$ npx --package typescript tsc}} > The only thing it complains about is: > {{src/models/JobVertexBackPressureInfo.ts:21:17 - error TS1003: Identifier expected.}} at line 21: {{export enum 'backpressure-level'}}. > This is because in TypeScript an enum name must not contain a hyphen. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-29503) [OpenAPI] enum `backpressure-level` should be backpressurelevel so code generator recognizes it
[ https://issues.apache.org/jira/browse/FLINK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-29503: Assignee: Chesnay Schepler > [OpenAPI] enum `backpressure-level` should be backpressurelevel so code > generator recognizes it > --- > > Key: FLINK-29503 > URL: https://issues.apache.org/jira/browse/FLINK-29503 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes --package openapi-typescript-codegen openapi --input https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml --output .}} > {{$ npx --package typescript tsc}} > The only thing it complains about is: > {{src/models/JobVertexBackPressureInfo.ts:21:17 - error TS1003: Identifier expected.}} at line 21: {{export enum 'backpressure-level'}}. > This is because in TypeScript an enum name must not contain a hyphen. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29504) Jar upload spec should define a schema
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29504: - Fix Version/s: 1.16.0 1.15.3 > Jar upload spec should define a schema > -- > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > Fix For: 1.16.0, 1.15.3 > > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml -g typescript-axios -o .}} > > Then it outputs the error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. Possible cause: missing schema in body parameter (OAS v2): class RequestBody { description: null, content: class Content { {application/x-java-archive=class MediaType { schema: null, examples: null, example: null, encoding: null }} }, required: true }}} > > This is because in the YAML, the {{application/x-java-archive}} entry under {{requestBody: content:}} is followed by an empty object instead of a schema. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29504) Jar upload spec should define a schema
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29504: - Component/s: Runtime / REST (was: API / Core) > Jar upload spec should define a schema > -- > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > Fix For: 1.16.0, 1.15.3 > > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml -g typescript-axios -o .}} > > Then it outputs the error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. Possible cause: missing schema in body parameter (OAS v2): class RequestBody { description: null, content: class Content { {application/x-java-archive=class MediaType { schema: null, examples: null, example: null, encoding: null }} }, required: true }}} > > This is because in the YAML, the {{application/x-java-archive}} entry under {{requestBody: content:}} is followed by an empty object instead of a schema. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29504) Jar upload spec should define a schema
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29504: - Summary: Jar upload spec should define a schema (was: Jar upload spec should define a shema) > Jar upload spec should define a schema > -- > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i > [https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml] > -g typescript-axios -o .}} > > Then it outputs error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. > Possible cause: missing schema in body parameter (OAS v2): class RequestBody > {}} > {{ description: null}} > {{ content: class Content {}} > {{ {application/x-java-archive=class MediaType {}} > {{ schema: null}} > {{ examples: null}} > {{ example: null}} > {{ encoding: null}} > {\{ > {\{ }}} > {{ required: true}} > {{}}} > > This is because in the YAML: > {{}}{{ requestBody:}} > {{ content:}} > {{{} application/x-java-archive: {{ > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29504) Jar upload spec should define a shema
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29504: - Summary: Jar upload spec should define a shema (was: [OpenAPI] `application/x-java-archive` should be followed by a structure instead of `{}`) > Jar upload spec should define a shema > - > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i > [https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml] > -g typescript-axios -o .}} > > Then it outputs error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. > Possible cause: missing schema in body parameter (OAS v2): class RequestBody > {}} > {{ description: null}} > {{ content: class Content {}} > {{ {application/x-java-archive=class MediaType {}} > {{ schema: null}} > {{ examples: null}} > {{ example: null}} > {{ encoding: null}} > {\{ > {\{ }}} > {{ required: true}} > {{}}} > > This is because in the YAML: > {{}}{{ requestBody:}} > {{ content:}} > {{{} application/x-java-archive: {{ > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-29504) [OpenAPI] `application/x-java-archive` should be followed by a structure instead of `{}`
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-29504: Assignee: Chesnay Schepler > [OpenAPI] `application/x-java-archive` should be followed by a structure > instead of `{}` > > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Assignee: Chesnay Schepler >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i > [https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml] > -g typescript-axios -o .}} > > Then it outputs error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. > Possible cause: missing schema in body parameter (OAS v2): class RequestBody > {}} > {{ description: null}} > {{ content: class Content {}} > {{ {application/x-java-archive=class MediaType {}} > {{ schema: null}} > {{ examples: null}} > {{ example: null}} > {{ encoding: null}} > {\{ > {\{ }}} > {{ required: true}} > {{}}} > > This is because in the YAML: > {{}}{{ requestBody:}} > {{ content:}} > {{{} application/x-java-archive: {{ > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29504) [OpenAPI] `application/x-java-archive` should be followed by a structure instead of `{}`
[ https://issues.apache.org/jira/browse/FLINK-29504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612908#comment-17612908 ] Chesnay Schepler commented on FLINK-29504: -- Should be straight forward to fix by adding this in {{OpenApiSpecGenerator#injectFileUploadRequest}}: {code:java} schema: type: string format: binary {code} > [OpenAPI] `application/x-java-archive` should be followed by a structure > instead of `{}` > > > Key: FLINK-29504 > URL: https://issues.apache.org/jira/browse/FLINK-29504 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes @openapitools/openapi-generator-cli generate -i > [https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml] > -g typescript-axios -o .}} > > Then it outputs error > > {{Caused by: java.lang.RuntimeException: Request body cannot be null. > Possible cause: missing schema in body parameter (OAS v2): class RequestBody > {}} > {{ description: null}} > {{ content: class Content {}} > {{ {application/x-java-archive=class MediaType {}} > {{ schema: null}} > {{ examples: null}} > {{ example: null}} > {{ encoding: null}} > {\{ > {\{ }}} > {{ required: true}} > {{}}} > > This is because in the YAML: > {{}}{{ requestBody:}} > {{ content:}} > {{{} application/x-java-archive: {{ > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
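The one-line snippet in the comment above expands to a spec fragment along these lines — a sketch only: the surrounding `requestBody` layout is inferred from the error output quoted in the issue, not taken from the generated `rest_v1_dispatcher.yml`:

```yaml
requestBody:
  content:
    application/x-java-archive:
      # Declaring the payload as a binary string gives code generators a
      # schema to work with, instead of the empty object they currently reject.
      schema:
        type: string
        format: binary
  required: true
```

With a schema present, generators such as openapi-generator can model the upload body (typically as a file/blob parameter) instead of failing with "Request body cannot be null".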
[jira] [Updated] (FLINK-29505) [OpenAPI] Need `operationId` on certain operations
[ https://issues.apache.org/jira/browse/FLINK-29505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29505: - Component/s: Runtime / REST > [OpenAPI] Need `operationId` on certain operations > -- > > Key: FLINK-29505 > URL: https://issues.apache.org/jira/browse/FLINK-29505 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Priority: Major > Labels: openapi > Attachments: image-2022-10-04-21-32-28-903.png > > > Install nodejs and run > $ npx --yes autorest > --input-file=https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml > --typescript --output-folder=. > > It returns with error > !image-2022-10-04-21-32-28-903.png! > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29505) [OpenAPI] Need `operationId` on certain operations
[ https://issues.apache.org/jira/browse/FLINK-29505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29505: - Component/s: (was: API / Core) > [OpenAPI] Need `operationId` on certain operations > -- > > Key: FLINK-29505 > URL: https://issues.apache.org/jira/browse/FLINK-29505 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Priority: Major > Labels: openapi > Attachments: image-2022-10-04-21-32-28-903.png > > > Install nodejs and run > $ npx --yes autorest > --input-file=https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml > --typescript --output-folder=. > > It returns with error > !image-2022-10-04-21-32-28-903.png! > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29505) [OpenAPI] Need `operationId` on certain operations
[ https://issues.apache.org/jira/browse/FLINK-29505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29505. Resolution: Duplicate Operation IDs will be available in 1.16.0. > [OpenAPI] Need `operationId` on certain operations > -- > > Key: FLINK-29505 > URL: https://issues.apache.org/jira/browse/FLINK-29505 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Priority: Major > Labels: openapi > Attachments: image-2022-10-04-21-32-28-903.png > > > Install nodejs and run > $ npx --yes autorest > --input-file=https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml > --typescript --output-folder=. > > It returns with error > !image-2022-10-04-21-32-28-903.png! > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29503) [OpenAPI] enum `backpressure-level` should be backpressurelevel so code generator recognizes it
[ https://issues.apache.org/jira/browse/FLINK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29503: - Component/s: Runtime / REST (was: API / Core) > [OpenAPI] enum `backpressure-level` should be backpressurelevel so code > generator recognizes it > --- > > Key: FLINK-29503 > URL: https://issues.apache.org/jira/browse/FLINK-29503 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.15.2 >Reporter: Tiger (Apache) Wang >Priority: Major > Labels: openapi > > Install nodejs and run > {{$ npx --yes --package openapi-typescript-codegen openapi --input > [https://nightlies.apache.org/flink/flink-docs-release-1.15/generated/rest_v1_dispatcher.yml] > --output .}} > {{$ npx --package typescript tsc }} > The only thing it complains about is: > {{{}src/models/JobVertexBackPressureInfo.ts:21:17 - error TS1003: Identifier > expected.{}}}{{{}21 export enum 'backpressure-level' {{}}} > This is because for TypeScript, enum name should not have a hyphen in it. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
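The generator-facing fix is to expose an identifier-safe name for the enum instead of the hyphenated property name. A minimal sketch of such a sanitization step — the class and method names are illustrative, not the actual Flink code:

```java
public class EnumNameSanitizer {

    /**
     * Converts a hyphen/underscore-separated property name such as
     * "backpressure-level" into a CamelCase identifier that is valid
     * in TypeScript (and most other target languages).
     */
    public static String toIdentifier(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        boolean upperNext = true; // capitalize the first letter and each letter after a separator
        for (char c : name.toCharArray()) {
            if (c == '-' || c == '_') {
                upperNext = true; // drop the separator itself
            } else {
                sb.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return sb.toString();
    }
}
```

Applied to the spec, `backpressure-level` would become `BackpressureLevel`, which `tsc` accepts as an enum name.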
[jira] [Commented] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission
[ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612640#comment-17612640 ] Chesnay Schepler commented on FLINK-29501: -- Why can't this be handled within the application with a main method argument? > Allow overriding JobVertex parallelisms during job submission > - > > Key: FLINK-29501 > URL: https://issues.apache.org/jira/browse/FLINK-29501 > Project: Flink > Issue Type: New Feature > Components: Runtime / Configuration, Runtime / REST >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > It is a common scenario that users want to make changes to the parallelisms > in the JobGraph, for example because they discover that the job needs more > or fewer resources. There is the option to do this globally via the job > parallelism. However, for fine-tuned jobs with potentially many > branches, tuning on the job vertex level is required. > This is to propose a way such that users can apply a mapping \{{jobVertexId > => parallelism}} before the job is submitted without having to modify the > JobGraph manually. > One way of achieving this would be to add an optional map field to the Rest > API jobs endpoint. However, in deployment modes like the application mode, > this might not make sense because users do not have control over the rest endpoint. > Similarly to how other job parameters are passed in the application mode, we > propose to add the overrides as a configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
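One shape the proposed configuration parameter could take is a flat string of `vertexId:parallelism` pairs. The following is a hedged sketch of parsing such a value — the option format and class name are assumptions for illustration, not the actual Flink option:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParallelismOverrides {

    /**
     * Parses a string such as "a1b2:4,c3d4:8" into a
     * vertexId -> parallelism map, preserving entry order.
     */
    public static Map<String, Integer> parse(String value) {
        Map<String, Integer> overrides = new LinkedHashMap<>();
        if (value == null || value.isEmpty()) {
            return overrides;
        }
        for (String entry : value.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed override: " + entry);
            }
            overrides.put(parts[0], Integer.parseInt(parts[1]));
        }
        return overrides;
    }
}
```

Applying such a map before submission would then amount to iterating the JobGraph's vertices and setting the parallelism where the vertex id matches an entry.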
[jira] [Assigned] (FLINK-29399) TableITCase is unstable
[ https://issues.apache.org/jira/browse/FLINK-29399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-29399: Assignee: Chesnay Schepler > TableITCase is unstable > --- > > Key: FLINK-29399 > URL: https://issues.apache.org/jira/browse/FLINK-29399 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner, Tests >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > > {code:java} > val it = tableResult.collect() > it.close() > val jobStatus = > try { > Some(tableResult.getJobClient.get().getJobStatus.get()) > } catch { > // ignore the exception, > // because the MiniCluster may already have been shut down when getting > job status > case _: Throwable => None > } > if (jobStatus.isDefined) { > assertNotEquals(jobStatus.get, JobStatus.RUNNING) > } > {code} > There's no guarantee that the cancellation already went through. The test > should periodically poll the job status until another state is reached. > Or even better, use the new collect API, call execute in a separate thread, > close the iterator and wait for the thread to terminate. -- This message was sent by Atlassian Jira (v8.20.10#820010)
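The suggested fix — periodically poll the job status until it leaves RUNNING — can be sketched as a generic wait loop; the helper below is illustrative, not the actual test utility Flink uses:

```java
import java.time.Duration;
import java.util.function.Supplier;

public class PollUntil {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns whether the condition held before the deadline.
     */
    public static boolean pollUntil(Supplier<Boolean> condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.get()) {
            if (System.nanoTime() >= deadline) {
                return false; // timed out without the condition holding
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In the test this would be used as something like `pollUntil(() -> jobClient.getJobStatus().get() != JobStatus.RUNNING, timeout, interval)` (hypothetical usage), instead of asserting on a single snapshot of the status.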
[jira] [Created] (FLINK-29499) DispatcherOperationCaches should implement AutoCloseableAsync
Chesnay Schepler created FLINK-29499: Summary: DispatcherOperationCaches should implement AutoCloseableAsync Key: FLINK-29499 URL: https://issues.apache.org/jira/browse/FLINK-29499 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination, Runtime / REST Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29456) Add methods that accept OperatorIdentifier
[ https://issues.apache.org/jira/browse/FLINK-29456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29456: - Summary: Add methods that accept OperatorIdentifier (was: Add methods that acept OperatorIdentifier) > Add methods that accept OperatorIdentifier > -- > > Key: FLINK-29456 > URL: https://issues.apache.org/jira/browse/FLINK-29456 > Project: Flink > Issue Type: Sub-task > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > Add new variants of all methods in the SavepointReader/-Writer that accept an > OperatorIdentifier. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29403) Streamline SimpleCondition usage
[ https://issues.apache.org/jira/browse/FLINK-29403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29403. Resolution: Fixed master: 990a7dac90af5ea7e8450521a63802c2563f4548 > Streamline SimpleCondition usage > > > Key: FLINK-29403 > URL: https://issues.apache.org/jira/browse/FLINK-29403 > Project: Flink > Issue Type: Improvement > Components: Library / CEP >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > CEP SimpleConditions are essentially filter functions, but since it's an > abstract class it ends up being incredibly verbose. We can add a simple > factory method to streamline this. > Additionally, the class should not be annotated with {{@Internal}} given how > much it is advertised in the docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
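The streamlining described above amounts to a static factory that wraps a lambda in the abstract class. A self-contained sketch with simplified stand-ins for the Flink/CEP types (the real `FilterFunction` is `Serializable` and may throw; these mimics are assumptions for illustration):

```java
public class ConditionFactoryDemo {

    /** Simplified stand-in for Flink's FilterFunction. */
    public interface FilterFunction<T> {
        boolean filter(T value);
    }

    /** Stand-in for CEP's abstract SimpleCondition, with the proposed factory method. */
    public abstract static class SimpleCondition<T> implements FilterFunction<T> {

        /** Wraps a lambda so callers no longer need an anonymous subclass. */
        public static <T> SimpleCondition<T> of(FilterFunction<T> filter) {
            return new SimpleCondition<T>() {
                @Override
                public boolean filter(T value) {
                    return filter.filter(value);
                }
            };
        }
    }
}
```

Instead of a multi-line anonymous subclass, a pattern condition then reads as a one-liner, e.g. `SimpleCondition.of(event -> event.startsWith("a"))`.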
[jira] [Closed] (FLINK-29455) Add OperatorIdentifier
[ https://issues.apache.org/jira/browse/FLINK-29455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29455. Resolution: Fixed master: 306c3c4ce3025bb29b84e5f6045cb3c745feb893 > Add OperatorIdentifier > -- > > Key: FLINK-29455 > URL: https://issues.apache.org/jira/browse/FLINK-29455 > Project: Flink > Issue Type: Sub-task > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > Add a class for identifying operators, that supports both uids and uidhashes, > and integrate into the low-level APIs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29488) MetricRegistryImpl should implement AutoCloseableAsync
[ https://issues.apache.org/jira/browse/FLINK-29488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29488. Resolution: Fixed master: 149785a5725a4727287d2f133b3b3abbb23f99a9 > MetricRegistryImpl should implement AutoCloseableAsync > -- > > Key: FLINK-29488 > URL: https://issues.apache.org/jira/browse/FLINK-29488 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Metrics >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29387) IntervalJoinITCase.testIntervalJoinSideOutputRightLateData failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-29387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611543#comment-17611543 ] Chesnay Schepler commented on FLINK-29387: -- Running the test multiple times in parallel locally makes this pretty easy to reproduce. > IntervalJoinITCase.testIntervalJoinSideOutputRightLateData failed with > AssertionError > - > > Key: FLINK-29387 > URL: https://issues.apache.org/jira/browse/FLINK-29387 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.17.0 >Reporter: Huang Xingbo >Priority: Critical > Labels: test-stability > > {code:java} > 2022-09-22T04:40:21.9296331Z Sep 22 04:40:21 [ERROR] > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.testIntervalJoinSideOutputRightLateData > Time elapsed: 2.46 s <<< FAILURE! > 2022-09-22T04:40:21.9297487Z Sep 22 04:40:21 java.lang.AssertionError: > expected:<[(key,2)]> but was:<[]> > 2022-09-22T04:40:21.9298208Z Sep 22 04:40:21 at > org.junit.Assert.fail(Assert.java:89) > 2022-09-22T04:40:21.9298927Z Sep 22 04:40:21 at > org.junit.Assert.failNotEquals(Assert.java:835) > 2022-09-22T04:40:21.9299655Z Sep 22 04:40:21 at > org.junit.Assert.assertEquals(Assert.java:120) > 2022-09-22T04:40:21.9300403Z Sep 22 04:40:21 at > org.junit.Assert.assertEquals(Assert.java:146) > 2022-09-22T04:40:21.9301538Z Sep 22 04:40:21 at > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.expectInAnyOrder(IntervalJoinITCase.java:521) > 2022-09-22T04:40:21.9302578Z Sep 22 04:40:21 at > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.testIntervalJoinSideOutputRightLateData(IntervalJoinITCase.java:280) > 2022-09-22T04:40:21.9303641Z Sep 22 04:40:21 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-09-22T04:40:21.9304472Z Sep 22 04:40:21 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-09-22T04:40:21.9305371Z Sep 22 04:40:21 at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-09-22T04:40:21.9306195Z Sep 22 04:40:21 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-09-22T04:40:21.9307011Z Sep 22 04:40:21 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-09-22T04:40:21.9308077Z Sep 22 04:40:21 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-09-22T04:40:21.9308968Z Sep 22 04:40:21 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-09-22T04:40:21.9309849Z Sep 22 04:40:21 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-09-22T04:40:21.9310704Z Sep 22 04:40:21 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-09-22T04:40:21.9311533Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-22T04:40:21.9312386Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 2022-09-22T04:40:21.9313231Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 2022-09-22T04:40:21.9314985Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 2022-09-22T04:40:21.9315857Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 2022-09-22T04:40:21.9316633Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 2022-09-22T04:40:21.9317450Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 2022-09-22T04:40:21.9318209Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 2022-09-22T04:40:21.9318949Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 2022-09-22T04:40:21.9319680Z Sep 22 
04:40:21 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 2022-09-22T04:40:21.9320401Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-22T04:40:21.9321130Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 2022-09-22T04:40:21.9321822Z Sep 22 04:40:21 at > org.junit.runner.JUnitCore.run(JUnitCore.java:137) > 2022-09-22T04:40:21.9322498Z Sep 22 04:40:21 at > org.junit.runner.JUnitCore.run(JUnitCore.java:115) > 2022-09-22T04:40:21.9323248Z Sep 22 04:40:21 at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > 2022-09-22T04:40:21.9324080Z Sep 22 04:40:21 at > org.junit.vintage.engine.V
[jira] [Updated] (FLINK-28291) Add kerberos delegation token renewer feature instead of logged from keytab individually
[ https://issues.apache.org/jira/browse/FLINK-28291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-28291: - Fix Version/s: (was: 1.13.5) > Add kerberos delegation token renewer feature instead of logged from keytab > individually > > > Key: FLINK-28291 > URL: https://issues.apache.org/jira/browse/FLINK-28291 > Project: Flink > Issue Type: New Feature > Components: Deployment / YARN >Affects Versions: 1.13.5 >Reporter: jiulong.zhu >Priority: Minor > Labels: PatchAvailable, patch-available > Attachments: FLINK-28291.0001.patch > > > h2. 1. Design > LifeCycle of delegation token in RM: > # Container starts with DT given by client. > # Enable delegation token renewer by: > ## set {{security.kerberos.token.renew.enabled}} true, default false. And > ## specify {{security.kerberos.login.keytab}} and > {{security.kerberos.login.principal}} > # When enabled delegation token renewer, the renewer thread will re-obtain > tokens from DelegationTokenProvider(only HadoopFSDelegationTokenProvider > now). Then the renewer thread will broadcast new tokens to RM locally, all > JMs and all TMs by RPCGateway. > # RM process adds new tokens in context by UserGroupInformation. > LifeCycle of delegation token in JM / TM: > # TaskManager starts with keytab stored in remote hdfs. > # When registered successfully, JM / TM get the current tokens of RM boxed > by {{JobMasterRegistrationSuccess}} / {{{}TaskExecutorRegistrationSuccess{}}}. > # JM / TM process add new tokens in context by UserGroupInformation. > It’s too heavy and unnecessary to retrieval leader of ResourceManager by > HAService, so DelegationTokenManager is instanced by ResourceManager. So > DelegationToken can hold the reference of ResourceManager, instead of RM > RPCGateway or self gateway. > h2. 2. Test > # No local junit test. It’s too heavy to build junit environments including > KDC and local hadoop. 
> # Cluster test > step 1: Specify krb5.conf with short token lifetime(ticket_lifetime, > renew_lifetime) when submitting flink application. > ``` > {{flink run -yD security.kerberos.token.renew.enabled=true -yD > security.kerberos.krb5-conf.path= /home/work/krb5.conf -yD > security.kerberos.login.use-ticket-cache=false ...}} > ``` > step 2: Watch token identifier changelog and synchronizer between rm and > worker. > >> > In RM / JM log, > 2022-06-28 15:13:03,509 INFO org.apache.flink.runtime.util.HadoopUtils [] - > New token (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) > created in KerberosDelegationToken, and next schedule delay is 64799880 ms. > 2022-06-28 15:13:03,529 INFO org.apache.flink.runtime.util.HadoopUtils [] - > Updating delegation tokens for current user. 2022-06-28 15:13:04,729 INFO > org.apache.flink.runtime.util.HadoopUtils [] - JobMaster receives new token > (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) from RM. > … > 2022-06-29 09:13:03,732 INFO org.apache.flink.runtime.util.HadoopUtils [] - > New token (HDFS_DELEGATION_TOKEN token 52310 for work on ha-hdfs:newfyyy) > created in KerberosDelegationToken, and next schedule delay is 64800045 ms. > 2022-06-29 09:13:03,805 INFO org.apache.flink.runtime.util.HadoopUtils [] - > Updating delegation tokens for current user. > 2022-06-29 09:13:03,806 INFO org.apache.flink.runtime.util.HadoopUtils [] - > JobMaster receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on > ha-hdfs:newfyyy) from RM. > >> > In TM log, > 2022-06-28 15:13:17,983 INFO org.apache.flink.runtime.util.HadoopUtils [] - > TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52101 for work on > ha-hdfs:newfyyy) from RM. > 2022-06-28 15:13:18,016 INFO org.apache.flink.runtime.util.HadoopUtils [] - > Updating delegation tokens for current user. 
> … > 2022-06-29 09:13:03,809 INFO org.apache.flink.runtime.util.HadoopUtils [] - > TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on > ha-hdfs:newfyyy) from RM. > 2022-06-29 09:13:03,836 INFO org.apache.flink.runtime.util.HadoopUtils [] - > Updating delegation tokens for current user. -- This message was sent by Atlassian Jira (v8.20.10#820010)
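The renewal scheduling visible in the logs above (a "next schedule delay" of 64,800,000 ms for a 24 h ticket) is consistent with renewing at a fixed fraction of the token lifetime. A minimal sketch of that delay computation — the 0.75 ratio and the method name are assumptions inferred from the log output, not taken from the patch:

```java
public class RenewalDelay {

    /**
     * Computes the delay until the next renewal attempt as a fraction of
     * the token lifetime, so renewal happens well before expiry.
     */
    public static long nextDelayMillis(long issueTimeMillis, long expiryTimeMillis, double ratio) {
        long lifetime = expiryTimeMillis - issueTimeMillis;
        return Math.max(0L, (long) (lifetime * ratio));
    }
}
```

The renewer thread would then re-obtain tokens after this delay and broadcast them to the JobMasters and TaskManagers, as described in the design above.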
[jira] [Commented] (FLINK-29330) Provide better logs of MiniCluster shutdown procedure
[ https://issues.apache.org/jira/browse/FLINK-29330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611520#comment-17611520 ] Chesnay Schepler commented on FLINK-29330: -- Should LeaderRetrievalService really implement it though? It doesn't seem to be doing asynchronously; or are you proposing to keep that as an implementation detail? > Provide better logs of MiniCluster shutdown procedure > - > > Key: FLINK-29330 > URL: https://issues.apache.org/jira/browse/FLINK-29330 > Project: Flink > Issue Type: Technical Debt > Components: Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > I recently ran into an issue where the shutdown of a MiniCluster timed out. > The logs weren't helpful at all and I had to go in and check every > asynchronous component for whether _that_ component was the cause. > The main issues were that various components don't log anything at all, or > that when they did it wasn't clear who owned that component. > I'd like to add a util that makes it easier for us to log the start/stop of a > shutdown procedure, > {code:java} > public class ShutdownLog { > /** > * Logs the beginning and end of the shutdown procedure for the given > component. > * > * This method accepts a {@link Supplier} instead of a {@link > CompletableFuture} because the > * latter usually implies the shutdown to already have begun. 
> * > * @param log Logger of owning component > * @param component component that will be shut down > * @param shutdownTrigger component shutdown trigger > * @return termination future of the component > */ > public static CompletableFuture<Void> logShutdown( > Logger log, String component, Supplier<CompletableFuture<Void>> > shutdownTrigger) { > log.debug("Starting shutdown of {}.", component); > return FutureUtils.logCompletion(log, "shutdown of " + component, > shutdownTrigger.get()); > } > } > public class FutureUtils { > public static <T> CompletableFuture<T> logCompletion( > Logger log, String action, CompletableFuture<T> future) { > future.handle( > (t, throwable) -> { > if (throwable == null) { > log.debug("Completed {}.", action); > } else { > log.debug("Failed {}.", action, throwable); > } > return null; > }); > return future; > } > ... > {code} > and extend the AutoCloseableAsync interface for an easy opt-in and customized > logging: > {code:java} > default CompletableFuture<Void> closeAsync(Logger log) { > return ShutdownLog.logShutdown(log, getClass().getSimpleName(), > this::closeAsync); > } > {code} > MiniCluster example usages: > {code:java} > -terminationFutures.add(dispatcherResourceManagerComponent.closeAsync()) > +terminationFutures.add(dispatcherResourceManagerComponent.closeAsync(LOG)) > {code} > {code:java} > -return ExecutorUtils.nonBlockingShutdown( > -executorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, ioExecutor); > +return ShutdownLog.logShutdown( > +LOG, > +"ioExecutor", > +() -> > +ExecutorUtils.nonBlockingShutdown( > +executorShutdownTimeoutMillis, > +TimeUnit.MILLISECONDS, > +ioExecutor)); > {code} > [~mapohl] I'm interested in what you think about this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29330) Provide better logs of MiniCluster shutdown procedure
[ https://issues.apache.org/jira/browse/FLINK-29330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611520#comment-17611520 ] Chesnay Schepler edited comment on FLINK-29330 at 9/30/22 11:20 AM: Should LeaderRetrievalService really implement it though? It doesn't seem to be doing asynchronously; or are you proposing to change that or keep that as an implementation detail? was (Author: zentol): Should LeaderRetrievalService really implement it though? It doesn't seem to be doing asynchronously; or are you proposing to keep that as an implementation detail? > Provide better logs of MiniCluster shutdown procedure > - > > Key: FLINK-29330 > URL: https://issues.apache.org/jira/browse/FLINK-29330 > Project: Flink > Issue Type: Technical Debt > Components: Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > I recently ran into an issue where the shutdown of a MiniCluster timed out. > The logs weren't helpful at all and I had to go in and check every > asynchronous component for whether _that_ component was the cause. > The main issues were that various components don't log anything at all, or > that when they did it wasn't clear who owned that component. > I'd like to add a util that makes it easier for us to log the start/stop of a > shutdown procedure, > {code:java} > public class ShutdownLog { > /** > * Logs the beginning and end of the shutdown procedure for the given > component. > * > * This method accepts a {@link Supplier} instead of a {@link > CompletableFuture} because the > * latter usually implies the shutdown to already have begun. 
> * > * @param log Logger of owning component > * @param component component that will be shut down > * @param shutdownTrigger component shutdown trigger > * @return termination future of the component > */ > public static CompletableFuture<Void> logShutdown( > Logger log, String component, Supplier<CompletableFuture<Void>> > shutdownTrigger) { > log.debug("Starting shutdown of {}.", component); > return FutureUtils.logCompletion(log, "shutdown of " + component, > shutdownTrigger.get()); > } > } > public class FutureUtils { > public static <T> CompletableFuture<T> logCompletion( > Logger log, String action, CompletableFuture<T> future) { > future.handle( > (t, throwable) -> { > if (throwable == null) { > log.debug("Completed {}.", action); > } else { > log.debug("Failed {}.", action, throwable); > } > return null; > }); > return future; > } > ... > {code} > and extend the AutoCloseableAsync interface for an easy opt-in and customized > logging: > {code:java} > default CompletableFuture<Void> closeAsync(Logger log) { > return ShutdownLog.logShutdown(log, getClass().getSimpleName(), > this::closeAsync); > } > {code} > MiniCluster example usages: > {code:java} > -terminationFutures.add(dispatcherResourceManagerComponent.closeAsync()) > +terminationFutures.add(dispatcherResourceManagerComponent.closeAsync(LOG)) > {code} > {code:java} > -return ExecutorUtils.nonBlockingShutdown( > -executorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, ioExecutor); > +return ShutdownLog.logShutdown( > +LOG, > +"ioExecutor", > +() -> > +ExecutorUtils.nonBlockingShutdown( > +executorShutdownTimeoutMillis, > +TimeUnit.MILLISECONDS, > +ioExecutor)); > {code} > [~mapohl] I'm interested in what you think about this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29488) MetricRegistryImpl should implement AutoCloseableAsync
Chesnay Schepler created FLINK-29488: Summary: MetricRegistryImpl should implement AutoCloseableAsync Key: FLINK-29488 URL: https://issues.apache.org/jira/browse/FLINK-29488 Project: Flink Issue Type: Sub-task Components: Runtime / Metrics Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29487) RpcService should implement AutoCloseableAsync
Chesnay Schepler created FLINK-29487: Summary: RpcService should implement AutoCloseableAsync Key: FLINK-29487 URL: https://issues.apache.org/jira/browse/FLINK-29487 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29487) RpcService should implement AutoCloseableAsync
[ https://issues.apache.org/jira/browse/FLINK-29487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29487: - Component/s: Runtime / RPC (was: Runtime / Coordination) > RpcService should implement AutoCloseableAsync > -- > > Key: FLINK-29487 > URL: https://issues.apache.org/jira/browse/FLINK-29487 > Project: Flink > Issue Type: Sub-task > Components: Runtime / RPC >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29471) Create a flink-connector-parent pom
[ https://issues.apache.org/jira/browse/FLINK-29471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611467#comment-17611467 ] Chesnay Schepler commented on FLINK-29471: -- elasticsearch-main: 6e30d5d63d395b2f731418c34f5838231dcab6b8 > Create a flink-connector-parent pom > --- > > Key: FLINK-29471 > URL: https://issues.apache.org/jira/browse/FLINK-29471 > Project: Flink > Issue Type: Sub-task > Components: Build System >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: elasticsearch-3.0.0 > > > Create a shared parent pom for connectors, reducing the overhead of creating > new repos and easing plugin maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29372) Add a suffix to keys that violate YAML spec
[ https://issues.apache.org/jira/browse/FLINK-29372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29372. Resolution: Fixed > Add a suffix to keys that violate YAML spec > --- > > Key: FLINK-29372 > URL: https://issues.apache.org/jira/browse/FLINK-29372 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > We have a few options where the key is a prefix of other options (e.g., > {{high-availability}} and {{high-availability.cluster-id}}). > Add a suffix to these options and keep the old key as deprecated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
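For illustration (the values and the suffixed key name below are assumptions, not necessarily what was implemented), the underlying conflict is that the YAML spec does not allow one key to be both a scalar and a mapping once the flat keys are nested:

```yaml
# Flat options as they exist today (illustrative values):
#   high-availability: zookeeper
#   high-availability.cluster-id: /default
#
# Nested as YAML, "high-availability" would need to be a scalar and a
# mapping at the same time, which YAML forbids. A suffixed key avoids
# the collision:
high-availability:
  type: zookeeper        # hypothetical suffixed key replacing the bare prefix
  cluster-id: /default
```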
[jira] [Updated] (FLINK-29471) Create a flink-connector-parent pom
[ https://issues.apache.org/jira/browse/FLINK-29471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29471: - Fix Version/s: elasticsearch-3.0.0 > Create a flink-connector-parent pom > --- > > Key: FLINK-29471 > URL: https://issues.apache.org/jira/browse/FLINK-29471 > Project: Flink > Issue Type: Sub-task > Components: Build System >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: elasticsearch-3.0.0 > > > Create a shared parent pom for connectors, reducing the overhead of creating > new repos and easing plugin maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29454) Deduplicate code in SavepointReader
[ https://issues.apache.org/jira/browse/FLINK-29454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29454. Resolution: Fixed master: 178cc4e56fb94fe0aa68aeaee780b6227625768d > Deduplicate code in SavepointReader > --- > > Key: FLINK-29454 > URL: https://issues.apache.org/jira/browse/FLINK-29454 > Project: Flink > Issue Type: Sub-task > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28326) ResultPartitionTest.testIdleAndBackPressuredTime failed with AssertError
[ https://issues.apache.org/jira/browse/FLINK-28326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-28326. Fix Version/s: 1.17.0 (was: 1.16.0) Resolution: Fixed master: f543b8ac690b1dee58bc3cb345a1c8ad0db0941e > ResultPartitionTest.testIdleAndBackPressuredTime failed with AssertError > > > Key: FLINK-28326 > URL: https://issues.apache.org/jira/browse/FLINK-28326 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Assignee: Weijie Guo >Priority: Major > Labels: pull-request-available, stale-assigned, test-stability > Fix For: 1.17.0 > > > {code:java} > 2022-06-30T09:23:24.0469768Z Jun 30 09:23:24 [INFO] > 2022-06-30T09:23:24.0470382Z Jun 30 09:23:24 [ERROR] Failures: > 2022-06-30T09:23:24.0471581Z Jun 30 09:23:24 [ERROR] > ResultPartitionTest.testIdleAndBackPressuredTime:414 > 2022-06-30T09:23:24.0472898Z Jun 30 09:23:24 Expected: a value greater than > <0L> > 2022-06-30T09:23:24.0474090Z Jun 30 09:23:24 but: <0L> was equal to <0L> > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37406&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-19358) HA should not always use jobid 0000000000
[ https://issues.apache.org/jira/browse/FLINK-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-19358: - Release Note: The JobID in application mode is no longer 0... but is instead based on the cluster ID. > HA should not always use jobid 00 > - > > Key: FLINK-19358 > URL: https://issues.apache.org/jira/browse/FLINK-19358 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Jun Zhang >Assignee: Yangze Guo >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available, > usability > Fix For: 1.16.0 > > > when submitting a flink job in application mode with HA, the flink job id will be > , and when I have many jobs they have the same > job id, which will lead to a checkpoint error -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29051) Disable dependency-reduced-pom creation in quickstarts
[ https://issues.apache.org/jira/browse/FLINK-29051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29051: - Summary: Disable dependency-reduced-pom creation in quickstarts (was: Disable dependency-reduced-pom creationg quickstarts) > Disable dependency-reduced-pom creation in quickstarts > -- > > Key: FLINK-29051 > URL: https://issues.apache.org/jira/browse/FLINK-29051 > Project: Flink > Issue Type: Improvement > Components: Quickstarts >Affects Versions: 1.15.1 >Reporter: Sergey Nuyanzin >Assignee: Sergey Nuyanzin >Priority: Minor > Labels: pull-request-available > Fix For: 1.16.0 > > > now maven-shade-plugin puts it in the project folder and version control detects > it as a change. The proposal is to put it into the target folder, as is done for > table-planner -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27667) YARNHighAvailabilityITCase fails with "Failed to delete temp directory /tmp/junit1681"
[ https://issues.apache.org/jira/browse/FLINK-27667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-27667: - Issue Type: Technical Debt (was: Bug) > YARNHighAvailabilityITCase fails with "Failed to delete temp directory > /tmp/junit1681" > -- > > Key: FLINK-27667 > URL: https://issues.apache.org/jira/browse/FLINK-27667 > Project: Flink > Issue Type: Technical Debt > Components: Deployment / YARN >Affects Versions: 1.16.0 >Reporter: Martijn Visser >Assignee: Ferenc Csaky >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=35733&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=29208 > > {code:bash} > May 17 08:36:22 [INFO] Results: > May 17 08:36:22 [INFO] > May 17 08:36:22 [ERROR] Errors: > May 17 08:36:22 [ERROR] YARNHighAvailabilityITCase » IO Failed to delete temp > directory /tmp/junit1681... > May 17 08:36:22 [INFO] > May 17 08:36:22 [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0 > May 17 08:36:22 [INFO] > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28735) Deprecate host/web-ui-port parameter of jobmanager.sh
[ https://issues.apache.org/jira/browse/FLINK-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-28735: - Release Note: The host/web-ui-port parameters of the jobmanager.sh script have been deprecated. These can (and should) be specified with the corresponding options as dynamic properties. > Deprecate host/web-ui-port parameter of jobmanager.sh > - > > Key: FLINK-28735 > URL: https://issues.apache.org/jira/browse/FLINK-28735 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Minor > Labels: pull-request-available > Fix For: 1.16.0 > > > If we fix FLINK-28733 we could, while we're at it, deprecate these 2 > parameters, since you can then also control them via dynamic properties. > This would also subsume FLINK-21038. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27599) Setup flink-connector-rabbitmq repository
[ https://issues.apache.org/jira/browse/FLINK-27599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-27599: - Fix Version/s: (was: 1.16.0) > Setup flink-connector-rabbitmq repository > - > > Key: FLINK-27599 > URL: https://issues.apache.org/jira/browse/FLINK-27599 > Project: Flink > Issue Type: New Feature > Components: Connectors/ RabbitMQ >Reporter: Martijn Visser >Assignee: Martijn Visser >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27472) Setup flink-connector-redis repository
[ https://issues.apache.org/jira/browse/FLINK-27472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-27472: - Fix Version/s: (was: 1.16.0) > Setup flink-connector-redis repository > -- > > Key: FLINK-27472 > URL: https://issues.apache.org/jira/browse/FLINK-27472 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Martijn Visser >Assignee: Martijn Visser >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29396) Race condition in JobMaster shutdown can leak resource requirements
[ https://issues.apache.org/jira/browse/FLINK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611045#comment-17611045 ] Chesnay Schepler commented on FLINK-29396: -- Yes, it shouldn't be a blocker. > Race condition in JobMaster shutdown can leak resource requirements > --- > > Key: FLINK-29396 > URL: https://issues.apache.org/jira/browse/FLINK-29396 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.0 >Reporter: Chesnay Schepler >Priority: Critical > > When a JobMaster is stopped it > a) sends a message to the RM informing it of the final job status > b) removes itself as the leader. > Once the JM loses leadership the RM is also informed about that. > With that we have 2 messages being sent to the RM at about the same time. > If the shutdown notification arrives first (and the job is in a terminal state) > we wipe the resource requirements, and the leader loss notification is > effectively ignored. > If the leader loss notification arrives first we keep the resource > requirements, assuming that another JM will pick the job up later on, and the > shutdown notification will be ignored. > This can cause a session cluster to essentially do nothing until the job > timeout is triggered due to no leader being present (default 5 minutes). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29396) Race condition in JobMaster shutdown can leak resource requirements
[ https://issues.apache.org/jira/browse/FLINK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29396: - Priority: Critical (was: Blocker) > Race condition in JobMaster shutdown can leak resource requirements > --- > > Key: FLINK-29396 > URL: https://issues.apache.org/jira/browse/FLINK-29396 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.0 >Reporter: Chesnay Schepler >Priority: Critical > > When a JobMaster is stopped it > a) sends a message to the RM informing it of the final job status > b) removes itself as the leader. > Once the JM loses leadership the RM is also informed about that. > With that we have 2 messages being sent to the RM at about the same time. > If the shutdown notification arrives first (and the job is in a terminal state) > we wipe the resource requirements, and the leader loss notification is > effectively ignored. > If the leader loss notification arrives first we keep the resource > requirements, assuming that another JM will pick the job up later on, and the > shutdown notification will be ignored. > This can cause a session cluster to essentially do nothing until the job > timeout is triggered due to no leader being present (default 5 minutes). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28733) jobmanager.sh should support dynamic properties
[ https://issues.apache.org/jira/browse/FLINK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-28733: - Fix Version/s: 1.15.3 > jobmanager.sh should support dynamic properties > --- > > Key: FLINK-28733 > URL: https://issues.apache.org/jira/browse/FLINK-28733 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.15.3 > > > {{jobmanager.sh}} throws away all arguments after the host/webui-port > settings, in contrast to other scripts like {{taskmanager.sh}},{{ > historyserver.sh}} or {{standalone-job.sh}}. > This prevents users from using dynamic properties. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-28733) jobmanager.sh should support dynamic properties
[ https://issues.apache.org/jira/browse/FLINK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576059#comment-17576059 ] Chesnay Schepler edited comment on FLINK-28733 at 9/29/22 1:01 PM: --- master: ff2f4cb624ab84c43f2fd0daea2daa8ba74b4169 1.15: dbfdebaa1c0eac5346871e023a02662b8301c295 was (Author: zentol): master: ff2f4cb624ab84c43f2fd0daea2daa8ba74b4169 > jobmanager.sh should support dynamic properties > --- > > Key: FLINK-28733 > URL: https://issues.apache.org/jira/browse/FLINK-28733 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.15.3 > > > {{jobmanager.sh}} throws away all arguments after the host/webui-port > settings, in contrast to other scripts like {{taskmanager.sh}},{{ > historyserver.sh}} or {{standalone-job.sh}}. > This prevents users from using dynamic properties. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29473) Move Flink CI utils into separate project
Chesnay Schepler created FLINK-29473: Summary: Move Flink CI utils into separate project Key: FLINK-29473 URL: https://issues.apache.org/jira/browse/FLINK-29473 Project: Flink Issue Type: Sub-task Components: Build System, Build System / CI Reporter: Chesnay Schepler Assignee: Chesnay Schepler We need the ability to improve the CI tools independently of Flink releases. Move flink-ci-tools into its own repo. This will require some code changes, for example to allow exclusions to be passed via the command line. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29472) Create shared release scripts
Chesnay Schepler created FLINK-29472: Summary: Create shared release scripts Key: FLINK-29472 URL: https://issues.apache.org/jira/browse/FLINK-29472 Project: Flink Issue Type: Sub-task Components: Release System Reporter: Chesnay Schepler Assignee: Chesnay Schepler With the versioning & branching model being identical we should be able to share all release scripts. Put them into a central location that projects can rely on (e.g., via a git submodule). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29471) Create a flink-connector-parent pom
[ https://issues.apache.org/jira/browse/FLINK-29471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611020#comment-17611020 ] Chesnay Schepler commented on FLINK-29471: -- I've published a prototype pom as {{io.github.zentol.flink:flink-connector-parent}} that we can use while we still iron things out. In the end this should be moved under the Flink umbrella. > Create a flink-connector-parent pom > --- > > Key: FLINK-29471 > URL: https://issues.apache.org/jira/browse/FLINK-29471 > Project: Flink > Issue Type: Sub-task > Components: Build System >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > Create a shared parent pom for connectors, reducing the overhead of creating > new repos and easing plugin maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29471) Create a flink-connector-parent pom
Chesnay Schepler created FLINK-29471: Summary: Create a flink-connector-parent pom Key: FLINK-29471 URL: https://issues.apache.org/jira/browse/FLINK-29471 Project: Flink Issue Type: Sub-task Components: Build System Reporter: Chesnay Schepler Assignee: Chesnay Schepler Create a shared parent pom for connectors, reducing the overhead of creating new repos and easing plugin maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29470) Setup shared utils for connector externalization
Chesnay Schepler created FLINK-29470: Summary: Setup shared utils for connector externalization Key: FLINK-29470 URL: https://issues.apache.org/jira/browse/FLINK-29470 Project: Flink Issue Type: Technical Debt Components: Build System Reporter: Chesnay Schepler Assignee: Chesnay Schepler -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29436) Upgrade Spotless Maven Plugin to 2.27.1
[ https://issues.apache.org/jira/browse/FLINK-29436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29436: - Summary: Upgrade Spotless Maven Plugin to 2.27.1 (was: Upgrade Spotless Maven Plugin to 2.27.0 for running on Java 17) > Upgrade Spotless Maven Plugin to 2.27.1 > --- > > Key: FLINK-29436 > URL: https://issues.apache.org/jira/browse/FLINK-29436 > Project: Flink > Issue Type: Sub-task >Reporter: Zili Chen >Assignee: Zili Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > This blocker is fixed by: https://github.com/diffplug/spotless/pull/1224 and > https://github.com/diffplug/spotless/pull/1228. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29436) Upgrade Spotless Maven Plugin to 2.27.0 for running on Java 17
[ https://issues.apache.org/jira/browse/FLINK-29436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29436. Fix Version/s: 1.17.0 Resolution: Fixed master: e1e9ee81d5a0cca12156aaf9640c79de5e48f118 > Upgrade Spotless Maven Plugin to 2.27.0 for running on Java 17 > -- > > Key: FLINK-29436 > URL: https://issues.apache.org/jira/browse/FLINK-29436 > Project: Flink > Issue Type: Sub-task >Reporter: Zili Chen >Assignee: Zili Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > This blocker is fixed by: https://github.com/diffplug/spotless/pull/1224 and > https://github.com/diffplug/spotless/pull/1228. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29339) JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager blocks main thread
[ https://issues.apache.org/jira/browse/FLINK-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29339: - Priority: Critical (was: Blocker) > JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager > blocks main thread > - > > Key: FLINK-29339 > URL: https://issues.apache.org/jira/browse/FLINK-29339 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Xuannan Su >Priority: Critical > Labels: pull-request-available > Fix For: 1.16.0 > > > {code:java} > private List<ShuffleDescriptor> requestShuffleDescriptorsFromResourceManager( > IntermediateDataSetID intermediateDataSetID) { > Preconditions.checkNotNull( > resourceManagerGateway, "JobMaster is not connected to > ResourceManager"); > try { > return this.resourceManagerGateway > .getClusterPartitionsShuffleDescriptors(intermediateDataSetID) > .get(); // <-- there's your problem > } catch (Throwable e) { > throw new RuntimeException( > String.format( > "Failed to get shuffle descriptors of intermediate > dataset %s from ResourceManager", > intermediateDataSetID), > e); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
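The blocking get() ties up the caller's main thread until the ResourceManager responds. A minimal sketch of the non-blocking shape (all names and types below are placeholders for illustration, not Flink's actual API or the eventual fix): return the future and let the caller compose follow-up work on it.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative only: contrasts blocking get() with returning/composing
// the future; askResourceManager is a made-up stand-in for the RPC call.
class ShuffleDescriptorLookup {

    // Stand-in for resourceManagerGateway.getClusterPartitionsShuffleDescriptors(...)
    static CompletableFuture<List<String>> askResourceManager(String dataSetId) {
        return CompletableFuture.supplyAsync(() -> List.of(dataSetId + "-descriptor"));
    }

    // Non-blocking variant: propagate the future with error handling attached,
    // instead of calling get() on the caller's (main) thread.
    static CompletableFuture<List<String>> requestShuffleDescriptors(String dataSetId) {
        return askResourceManager(dataSetId)
                .exceptionally(
                        t -> {
                            throw new RuntimeException(
                                    "Failed to get shuffle descriptors of intermediate dataset "
                                            + dataSetId,
                                    t);
                        });
    }

    public static void main(String[] args) {
        // The caller chains work; join() here only keeps the demo deterministic.
        System.out.println(requestShuffleDescriptors("ds-1").join());
    }
}
```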
[jira] [Created] (FLINK-29457) Add a uid(hash) remapping function
Chesnay Schepler created FLINK-29457: Summary: Add a uid(hash) remapping function Key: FLINK-29457 URL: https://issues.apache.org/jira/browse/FLINK-29457 Project: Flink Issue Type: Sub-task Components: API / State Processor Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 Expose functionality for modifying the uid[hash] of a state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29456) Add methods that accept OperatorIdentifier
Chesnay Schepler created FLINK-29456: Summary: Add methods that accept OperatorIdentifier Key: FLINK-29456 URL: https://issues.apache.org/jira/browse/FLINK-29456 Project: Flink Issue Type: Sub-task Components: API / State Processor Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 Add new variants of all methods in the SavepointReader/-Writer that accept an OperatorIdentifier. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29455) Add OperatorIdentifier
Chesnay Schepler created FLINK-29455: Summary: Add OperatorIdentifier Key: FLINK-29455 URL: https://issues.apache.org/jira/browse/FLINK-29455 Project: Flink Issue Type: Sub-task Components: API / State Processor Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 Add a class for identifying operators that supports both uids and uidHashes, and integrate it into the low-level APIs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29453) Add uidHash support to State Processor API
[ https://issues.apache.org/jira/browse/FLINK-29453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29453: - Description: The state processor API is currently limited to working with uids. We should change this since this is a good application for the API. The API should be extended to support uidHashes wherever a uid is supported, and we should add a method to map uid[hashes] to a different uid[hash]. was: The state process API is currently limited to working with uids. We should change this since this is a good application for the API. > Add uidHash support to State Processor API > --- > > Key: FLINK-29453 > URL: https://issues.apache.org/jira/browse/FLINK-29453 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > The state processor API is currently limited to working with uids. > We should change this since this is a good application for the API. > The API should be extended to support uidHashes wherever a uid is supported, > and we should add a method to map uid[hashes] to a different uid[hash]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29408) HiveCatalogITCase failed with NPE
[ https://issues.apache.org/jira/browse/FLINK-29408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610521#comment-17610521 ] Chesnay Schepler edited comment on FLINK-29408 at 9/28/22 11:57 AM: If you change the {{test_pool_definition}} you run the tests on azure machines, and not our CI machines. Naturally you shouldn't do that. Note that it may just be a timing thing. I saw plenty of those NPEs in my personal azure which also uses azure machines. was (Author: zentol): If you change the {{test_pool_definition}} you run the tests on azure machines, and not our CI machines. Naturally you shouldn't do that. > HiveCatalogITCase failed with NPE > - > > Key: FLINK-29408 > URL: https://issues.apache.org/jira/browse/FLINK-29408 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.16.0, 1.17.0 >Reporter: Huang Xingbo >Assignee: luoyuxia >Priority: Critical > Labels: test-stability > > {code:java} > 2022-09-25T03:41:07.4212129Z Sep 25 03:41:07 [ERROR] > org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf Time > elapsed: 0.098 s <<< ERROR! 
> 2022-09-25T03:41:07.4212662Z Sep 25 03:41:07 java.lang.NullPointerException > 2022-09-25T03:41:07.4213189Z Sep 25 03:41:07 at > org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf(HiveCatalogUdfITCase.java:109) > 2022-09-25T03:41:07.4213753Z Sep 25 03:41:07 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-09-25T03:41:07.4224643Z Sep 25 03:41:07 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-09-25T03:41:07.4225311Z Sep 25 03:41:07 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-09-25T03:41:07.4225879Z Sep 25 03:41:07 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-09-25T03:41:07.4226405Z Sep 25 03:41:07 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-09-25T03:41:07.4227201Z Sep 25 03:41:07 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-09-25T03:41:07.4227807Z Sep 25 03:41:07 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-09-25T03:41:07.4228394Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-09-25T03:41:07.4228966Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-09-25T03:41:07.4229514Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4230066Z Sep 25 03:41:07 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > 2022-09-25T03:41:07.4230587Z Sep 25 03:41:07 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > 2022-09-25T03:41:07.4231258Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-25T03:41:07.4231823Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 
2022-09-25T03:41:07.4232384Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 2022-09-25T03:41:07.4232930Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 2022-09-25T03:41:07.4233511Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 2022-09-25T03:41:07.4234039Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 2022-09-25T03:41:07.4234546Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 2022-09-25T03:41:07.4235057Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 2022-09-25T03:41:07.4235573Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 2022-09-25T03:41:07.4236087Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 2022-09-25T03:41:07.4236635Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-09-25T03:41:07.4237314Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-09-25T03:41:07.4238211Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4238775Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4239277Z Sep 25 03:41:07 at > org.junit.rules.RunRules.evaluate(RunRules.java:20)
[jira] [Commented] (FLINK-29408) HiveCatalogITCase failed with NPE
[ https://issues.apache.org/jira/browse/FLINK-29408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610521#comment-17610521 ] Chesnay Schepler commented on FLINK-29408: -- If you change the {{test_pool_definition}} you run the tests on azure machines, and not our CI machines. Naturally you shouldn't do that. > HiveCatalogITCase failed with NPE > - > > Key: FLINK-29408 > URL: https://issues.apache.org/jira/browse/FLINK-29408 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.16.0, 1.17.0 >Reporter: Huang Xingbo >Assignee: luoyuxia >Priority: Critical > Labels: test-stability > > {code:java} > 2022-09-25T03:41:07.4212129Z Sep 25 03:41:07 [ERROR] > org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf Time > elapsed: 0.098 s <<< ERROR! > 2022-09-25T03:41:07.4212662Z Sep 25 03:41:07 java.lang.NullPointerException > 2022-09-25T03:41:07.4213189Z Sep 25 03:41:07 at > org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf(HiveCatalogUdfITCase.java:109) > 2022-09-25T03:41:07.4213753Z Sep 25 03:41:07 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-09-25T03:41:07.4224643Z Sep 25 03:41:07 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-09-25T03:41:07.4225311Z Sep 25 03:41:07 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-09-25T03:41:07.4225879Z Sep 25 03:41:07 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-09-25T03:41:07.4226405Z Sep 25 03:41:07 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-09-25T03:41:07.4227201Z Sep 25 03:41:07 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-09-25T03:41:07.4227807Z Sep 25 03:41:07 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-09-25T03:41:07.4228394Z Sep 25 
03:41:07 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-09-25T03:41:07.4228966Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-09-25T03:41:07.4229514Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4230066Z Sep 25 03:41:07 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > 2022-09-25T03:41:07.4230587Z Sep 25 03:41:07 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > 2022-09-25T03:41:07.4231258Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-25T03:41:07.4231823Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 2022-09-25T03:41:07.4232384Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 2022-09-25T03:41:07.4232930Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 2022-09-25T03:41:07.4233511Z Sep 25 03:41:07 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 2022-09-25T03:41:07.4234039Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 2022-09-25T03:41:07.4234546Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 2022-09-25T03:41:07.4235057Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 2022-09-25T03:41:07.4235573Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 2022-09-25T03:41:07.4236087Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 2022-09-25T03:41:07.4236635Z Sep 25 03:41:07 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-09-25T03:41:07.4237314Z Sep 25 03:41:07 at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-09-25T03:41:07.4238211Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4238775Z Sep 25 03:41:07 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-09-25T03:41:07.4239277Z Sep 25 03:41:07 at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2022-09-25T03:41:07.4239769Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-25T03:41:07.4240265Z Sep 25 03:41:07 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 2022-09-25T03:41:07.4240731Z Sep 25 03:41:07 at > org.junit.runner.JUnitCore.run(JUnitCore.java:137)
[jira] [Created] (FLINK-29454) Deduplicate code in SavepointReader
Chesnay Schepler created FLINK-29454: Summary: Deduplicate code in SavepointReader Key: FLINK-29454 URL: https://issues.apache.org/jira/browse/FLINK-29454 Project: Flink Issue Type: Sub-task Components: API / State Processor Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29453) Add uidHash support to State Processor API
Chesnay Schepler created FLINK-29453: Summary: Add uidHash support to State Processor API Key: FLINK-29453 URL: https://issues.apache.org/jira/browse/FLINK-29453 Project: Flink Issue Type: Improvement Components: API / State Processor Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 The State Processor API is currently limited to working with uids. We should change this since this is a good application for the API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-25002) Setup required --add-opens/--add-exports
[ https://issues.apache.org/jira/browse/FLINK-25002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610471#comment-17610471 ] Chesnay Schepler commented on FLINK-25002: -- Several projects have started to open/export packages at runtime via reflection. https://github.com/JetBrains/kotlin/commit/52e45062cff913306872183ca38afe29fb7bfc79 https://github.com/diffplug/spotless/pull/1224 It's quite a hack, but we should consider it for the future nonetheless. > Setup required --add-opens/--add-exports > > > Key: FLINK-25002 > URL: https://issues.apache.org/jira/browse/FLINK-25002 > Project: Flink > Issue Type: Sub-task > Components: Build System, Build System / CI, Documentation, Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > Java 17 actually enforces the encapsulation of the JDK (opposed to Java 11 > which just printed warnings), requiring us to explicitly open/export any > package that we access illegally. > The following is a list of opens/exports that I needed to get most tests to > pass, also with some comments which component needed them. Overall the > ClosureCleaner and FieldSerializer result in the most offenses, as they try > to access private fields. > These properties need to be set _for all JVMs in which we run Flink_, > including surefire forks, other tests processes > (TestJvmProcess/TestProcessBuilder/Yarn) and the distribution. > This needs some thought on how we can share this list across poms (surefire), > code (test processes / yarn) and the configuration (distribution). 
> {code:xml} > --add-exports java.base/sun.net.util=ALL-UNNAMED --add-exports java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED > --add-exports jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.lang.invoke=ALL-UNNAMED --add-opens java.base/java.math=ALL-UNNAMED --add-opens java.base/java.net=ALL-UNNAMED --add-opens java.base/java.io=ALL-UNNAMED --add-opens java.base/java.lang.reflect=ALL-UNNAMED --add-opens java.sql/java.sql=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/java.text=ALL-UNNAMED --add-opens java.base/java.time=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.util.concurrent=ALL-UNNAMED --add-opens java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens java.base/java.util.concurrent.locks=ALL-UNNAMED --add-opens java.base/java.util.stream=ALL-UNNAMED --add-opens java.base/sun.util.calendar=ALL-UNNAMED > > {code} > Additionally, the following JVM arguments must be supplied when running Maven: > {code} > export MAVEN_OPTS="\ > --add-exports jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED \ > --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED \ > --add-exports jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED \ > --add-exports jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED \ > --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \ > --add-exports java.security.jgss/sun.security.krb5=ALL-UNNAMED" > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-21564) CommonTestUtils.waitUntilCondition could fail with condition meets before
[ https://issues.apache.org/jira/browse/FLINK-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-21564. Resolution: Duplicate > CommonTestUtils.waitUntilCondition could fail with condition meets before > - > > Key: FLINK-21564 > URL: https://issues.apache.org/jira/browse/FLINK-21564 > Project: Flink > Issue Type: Bug > Components: Tests >Reporter: Kezhu Wang >Assignee: Pedro Silva >Priority: Minor > Labels: auto-unassigned, pull-request-available, stale-assigned > > {code} > public static void waitUntilCondition( > SupplierWithException<Boolean, Exception> condition, > Deadline timeout, > long retryIntervalMillis, > String errorMsg) > throws Exception { > while (timeout.hasTimeLeft() && !condition.get()) { > final long timeLeft = Math.max(0, timeout.timeLeft().toMillis()); > Thread.sleep(Math.min(retryIntervalMillis, timeLeft)); > } > if (!timeout.hasTimeLeft()) { > throw new TimeoutException(errorMsg); > } > } > {code} > The timeout could expire between the condition becoming true and the last check of {{hasTimeLeft()}}. > Besides this, some tests also use blocking conditions with their own timeouts; the > combination could be worse. > Not a big issue, but worth being aware of and fixing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
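The race in the quoted {{waitUntilCondition}} can be closed by re-checking the condition once after the deadline expires. The sketch below is illustrative only (it uses plain millisecond deadlines and {{BooleanSupplier}} instead of Flink's {{Deadline}} and {{SupplierWithException}}), not the actual Flink fix:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitUtil {

    // Race-free variant: after the deadline passes, check the condition one
    // last time before throwing, so a condition that became true between the
    // final poll and the timeout is not reported as a failure.
    public static void waitUntilCondition(
            BooleanSupplier condition,
            long deadlineMillis,
            long retryIntervalMillis,
            String errorMsg)
            throws InterruptedException, TimeoutException {
        while (System.currentTimeMillis() < deadlineMillis) {
            if (condition.getAsBoolean()) {
                return; // condition met within the deadline
            }
            long timeLeft = Math.max(0, deadlineMillis - System.currentTimeMillis());
            Thread.sleep(Math.min(retryIntervalMillis, Math.max(1, timeLeft)));
        }
        // Final check closes the gap between the last poll and the timeout.
        if (!condition.getAsBoolean()) {
            throw new TimeoutException(errorMsg);
        }
    }

    public static void main(String[] args) throws Exception {
        waitUntilCondition(() -> true, System.currentTimeMillis() + 1000, 10, "never fires");
        System.out.println("condition met");
    }
}
```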
[jira] [Commented] (FLINK-29372) Add a suffix to keys that violate YAML spec
[ https://issues.apache.org/jira/browse/FLINK-29372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610453#comment-17610453 ] Chesnay Schepler commented on FLINK-29372: -- master: 6cce68dcdc1baf4be2a9e1549983d010644b5ee3 > Add a suffix to keys that violate YAML spec > --- > > Key: FLINK-29372 > URL: https://issues.apache.org/jira/browse/FLINK-29372 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > We have a few options where the key is a prefix of other options (e.g., > {{high-availability}} and {{high-availability.cluster-id}}. > Add a suffix to these options and keep the old key as deprecated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29339) JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager blocks main thread
[ https://issues.apache.org/jira/browse/FLINK-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610447#comment-17610447 ] Chesnay Schepler commented on FLINK-29339: -- > the actual numbers of the rpc request is limited. The numbers are irrelevant though, as is the size of the response. You just need a single request to time out (e.g., due to the RM actor crashing) to potentially crash the entire cluster because of heartbeat timeouts. You make a good point that the fix is quite involved though, and it is indeed only called if cluster partitions are actually used. So I'm fine with not treating it as a blocker. > JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager > blocks main thread > - > > Key: FLINK-29339 > URL: https://issues.apache.org/jira/browse/FLINK-29339 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Xuannan Su >Priority: Blocker > Labels: pull-request-available > Fix For: 1.16.0 > > > {code:java} > private List requestShuffleDescriptorsFromResourceManager( > IntermediateDataSetID intermediateDataSetID) { > Preconditions.checkNotNull( > resourceManagerGateway, "JobMaster is not connected to > ResourceManager"); > try { > return this.resourceManagerGateway > .getClusterPartitionsShuffleDescriptors(intermediateDataSetID) > .get(); // <-- there's your problem > } catch (Throwable e) { > throw new RuntimeException( > String.format( > "Failed to get shuffle descriptors of intermediate > dataset %s from ResourceManager", > intermediateDataSetID), > e); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
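The problem flagged in the quoted code is the blocking {{.get()}} on the RPC main thread. One way to avoid it (a sketch with hypothetical names, not the actual Flink patch) is to compose on the returned {{CompletableFuture}} instead of blocking, so an RM timeout fails only this request rather than stalling the main thread and its heartbeats:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncLookup {

    // Hypothetical stand-in for the ResourceManager gateway call; the real
    // gateway method also returns a CompletableFuture.
    static CompletableFuture<List<String>> getShuffleDescriptors(String dataSetId) {
        return CompletableFuture.supplyAsync(() -> List.of("descriptor-for-" + dataSetId));
    }

    // Instead of calling .get() on the main thread, return the future and
    // translate failures asynchronously; callers attach their continuations.
    static CompletableFuture<List<String>> requestShuffleDescriptors(String dataSetId) {
        return getShuffleDescriptors(dataSetId)
                .exceptionally(t -> {
                    throw new RuntimeException(
                            String.format(
                                    "Failed to get shuffle descriptors of intermediate dataset %s",
                                    dataSetId),
                            t);
                });
    }

    public static void main(String[] args) {
        System.out.println(requestShuffleDescriptors("ds-1").join());
    }
}
```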
[jira] [Updated] (FLINK-29339) JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager blocks main thread
[ https://issues.apache.org/jira/browse/FLINK-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29339: - Fix Version/s: 1.16.0 > JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager > blocks main thread > - > > Key: FLINK-29339 > URL: https://issues.apache.org/jira/browse/FLINK-29339 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Xuannan Su >Priority: Blocker > Labels: pull-request-available > Fix For: 1.16.0 > > > {code:java} > private List requestShuffleDescriptorsFromResourceManager( > IntermediateDataSetID intermediateDataSetID) { > Preconditions.checkNotNull( > resourceManagerGateway, "JobMaster is not connected to > ResourceManager"); > try { > return this.resourceManagerGateway > .getClusterPartitionsShuffleDescriptors(intermediateDataSetID) > .get(); // <-- there's your problem > } catch (Throwable e) { > throw new RuntimeException( > String.format( > "Failed to get shuffle descriptors of intermediate > dataset %s from ResourceManager", > intermediateDataSetID), > e); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29339) JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager blocks main thread
[ https://issues.apache.org/jira/browse/FLINK-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29339: - Priority: Blocker (was: Critical) > JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager > blocks main thread > - > > Key: FLINK-29339 > URL: https://issues.apache.org/jira/browse/FLINK-29339 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Xuannan Su >Priority: Blocker > Labels: pull-request-available > > {code:java} > private List requestShuffleDescriptorsFromResourceManager( > IntermediateDataSetID intermediateDataSetID) { > Preconditions.checkNotNull( > resourceManagerGateway, "JobMaster is not connected to > ResourceManager"); > try { > return this.resourceManagerGateway > .getClusterPartitionsShuffleDescriptors(intermediateDataSetID) > .get(); // <-- there's your problem > } catch (Throwable e) { > throw new RuntimeException( > String.format( > "Failed to get shuffle descriptors of intermediate > dataset %s from ResourceManager", > intermediateDataSetID), > e); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29396) Race condition in JobMaster shutdown can leak resource requirements
[ https://issues.apache.org/jira/browse/FLINK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610038#comment-17610038 ] Chesnay Schepler commented on FLINK-29396: -- The question is when the RM can send that notification. The disconnect calls are weirdly cyclic and have no notion of who initiated them. The JM calls {{ResourceManagerGateway#disconnectJobManager}} which ends with the RM calling {{JobMasterGateway#disconnectResourceManager}}. But this sequence can also happen in reverse; so who waits for whom? Maybe we should rework these methods to actually be an {{ask}}. The shutdown sequence is actually super annoying in general because even an orderly shutdown by the mini cluster invariably leads to someone getting an exception because the other party has already shut down. > Race condition in JobMaster shutdown can leak resource requirements > --- > > Key: FLINK-29396 > URL: https://issues.apache.org/jira/browse/FLINK-29396 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.0 >Reporter: Chesnay Schepler >Priority: Blocker > > When a JobMaster is stopped it > a) sends a message to the RM informing it of the final job status > b) removes itself as the leader. > Once the JM loses leadership the RM is also informed about that. > With that we have 2 messages being sent to the RM at about the same time. > If the shutdown notification arrives first (and the job is in a terminal state) > we wipe the resource requirements, and the leader loss notification is > effectively ignored. > If the leader loss notification arrives first we keep the resource > requirements, assuming that another JM will pick the job up later on, and the > shutdown notification will be ignored. > This can cause a session cluster to essentially do nothing until the job > timeout is triggered due to no leader being present (default 5 minutes). -- This message was sent by Atlassian Jira (v8.20.10#820010)
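The "ask" idea floated in the comment above can be sketched as follows. This is purely illustrative (the method name and signature are assumptions, not Flink's actual RPC interfaces): the initiator awaits an acknowledgement future instead of the callee firing a symmetric disconnect call back, which removes the cycle and makes the ordering explicit:

```java
import java.util.concurrent.CompletableFuture;

public class DisconnectProtocol {

    // Hypothetical "ask"-style disconnect: the callee acknowledges by
    // completing the returned future rather than invoking a mirror-image
    // disconnect method on the caller.
    static CompletableFuture<Void> disconnectJobManager(String jobId) {
        return CompletableFuture.runAsync(
                () -> System.out.println("RM released resources for " + jobId));
    }

    public static void main(String[] args) {
        // The initiator now has a well-defined point at which the other side
        // has finished its part of the disconnect.
        disconnectJobManager("job-1")
                .thenRun(() -> System.out.println("JM can now finish its shutdown"))
                .join();
    }
}
```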
[jira] [Commented] (FLINK-29422) Production tests return/argument types do not take transitivity into account
[ https://issues.apache.org/jira/browse/FLINK-29422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609965#comment-17609965 ] Chesnay Schepler commented on FLINK-29422: -- Alright this is "solved" for the IterativeCondition with an exclusion. _great_ > Production tests return/argument types do not take transitivity into account > > > Key: FLINK-29422 > URL: https://issues.apache.org/jira/browse/FLINK-29422 > Project: Flink > Issue Type: Technical Debt > Components: API / DataStream >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Priority: Major > > In FLINK-29403 I'm marking {{SimpleCondition}} as {{PublicEvolving}}, but the > production tests reject it: > {code:java} > Architecture Violation [Priority: MEDIUM] - Rule 'Return and argument types > of methods annotated with @PublicEvolving must be annotated with > @Public(Evolving).' was violated (1 times): > Sep 26 15:20:12 > org.apache.flink.cep.pattern.conditions.SimpleCondition.filter(java.lang.Object, > org.apache.flink.cep.pattern.conditions.IterativeCondition$Context): > Argument leaf type > org.apache.flink.cep.pattern.conditions.IterativeCondition$Context does not > satisfy: reside outside of package 'org.apache.flink..' or reside in any > package ['..shaded..'] or annotated with @Public or annotated with > @PublicEvolving or annotated with @Deprecated > {code} > This doesn't make any sense given that {{IterativeCondition}} itself is > already {{PublicEvolving}} and contains the exact same method. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29431) Exceptions during job graph serialization lock up client
[ https://issues.apache.org/jira/browse/FLINK-29431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29431: - Description: The parallel serialization introduced in FLINK-26675 does not properly handle errors. {{StreamConfig#triggerSerializationAndReturnFuture}} returns a future that is never completed if the serialization fails, because all we do is throw an exception but that isn't propagated anywhere. As a result {{StreamingJobGraphGenerator#createJobGraph}} blocks indefinitely. was: The parallel serialization introduced in FLINK-26675 does not properly handle errors. {{StreamConfig#triggerSerializationAndReturnFuture}} returns a future that is never completed if the serialization fails. As a result {{StreamingJobGraphGenerator#createJobGraph}} blocks indefinitely. > Exceptions during job graph serialization lock up client > > > Key: FLINK-29431 > URL: https://issues.apache.org/jira/browse/FLINK-29431 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Priority: Blocker > Fix For: 1.16.0 > > > The parallel serialization introduced in FLINK-26675 does not properly handle > errors. > {{StreamConfig#triggerSerializationAndReturnFuture}} returns a future that is > never completed if the serialization fails, because all we do is throw an > exception but that isn't propagated anywhere. > As a result {{StreamingJobGraphGenerator#createJobGraph}} blocks indefinitely. -- This message was sent by Atlassian Jira (v8.20.10#820010)
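The failure mode described above — a thrown exception that never reaches the future's consumer — can be sketched in miniature. Names here are illustrative, not Flink's actual classes; the point is that the worker must call {{completeExceptionally}} in its catch block, otherwise the caller blocked on the future hangs forever:

```java
import java.util.concurrent.CompletableFuture;

public class SerializationTrigger {

    // Sketch of the fix: catch any serialization failure and complete the
    // future exceptionally so the error propagates to the waiting caller.
    static CompletableFuture<byte[]> triggerSerialization(Object config) {
        CompletableFuture<byte[]> result = new CompletableFuture<>();
        new Thread(() -> {
            try {
                result.complete(serialize(config));
            } catch (Throwable t) {
                // Without this line the future never completes and the client
                // blocks indefinitely, as described in the ticket.
                result.completeExceptionally(t);
            }
        }).start();
        return result;
    }

    // Stand-in serializer that always fails, to exercise the error path.
    static byte[] serialize(Object config) {
        throw new UnsupportedOperationException("serialization failed");
    }

    public static void main(String[] args) {
        try {
            triggerSerialization(new Object()).join();
        } catch (Exception e) {
            System.out.println("error propagated to caller: " + e.getCause().getMessage());
        }
    }
}
```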
[jira] [Updated] (FLINK-29431) Exceptions during job graph serialization lock up client
[ https://issues.apache.org/jira/browse/FLINK-29431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29431: - Summary: Exceptions during job graph serialization lock up client (was: Exceptions during job grapg serialization lock up client) > Exceptions during job graph serialization lock up client > > > Key: FLINK-29431 > URL: https://issues.apache.org/jira/browse/FLINK-29431 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Priority: Blocker > Fix For: 1.16.0 > > > The parallel serialization introduced in FLINK-26675 does not properly handle > errors. > {{StreamConfig#triggerSerializationAndReturnFuture}} returns a future that is > never completed if the serialization fails. > As a result {{StreamingJobGraphGenerator#createJobGraph}} blocks indefinitely. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29431) Exceptions during job grapg serialization lock up client
Chesnay Schepler created FLINK-29431: Summary: Exceptions during job grapg serialization lock up client Key: FLINK-29431 URL: https://issues.apache.org/jira/browse/FLINK-29431 Project: Flink Issue Type: Bug Components: Client / Job Submission Affects Versions: 1.16.0 Reporter: Chesnay Schepler Fix For: 1.16.0 The parallel serialization introduced in FLINK-26675 does not properly handle errors. {{StreamConfig#triggerSerializationAndReturnFuture}} returns a future that is never completed if the serialization fails. As a result {{StreamingJobGraphGenerator#createJobGraph}} blocks indefinitely. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29418) Update flink-shaded dependencies
[ https://issues.apache.org/jira/browse/FLINK-29418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29418: - Description: Do a round of updates in flink-shaded. [asm] Upgrade 9.2 -> 9.3 [jackson] Upgrade 2.12.4 -> 2.13.4 [swagger] Upgrade 2.1.11 -> 2.2.2 [zookeeper] Upgrade 3.5.9 -> 3.5.10 [curator] Upgrade 5.2.0 -> 5.3.0 [netty] Upgrade 4.1.70 -> 4.1.82 [zookeeper] Add 3.7 support [zookeeper] Add 3.8 support was:Do a round of updates in flink-shaded. > Update flink-shaded dependencies > > > Key: FLINK-29418 > URL: https://issues.apache.org/jira/browse/FLINK-29418 > Project: Flink > Issue Type: Technical Debt > Components: BuildSystem / Shaded >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: shaded-1.16 > > > Do a round of updates in flink-shaded. > [asm] Upgrade 9.2 -> 9.3 > [jackson] Upgrade 2.12.4 -> 2.13.4 > [swagger] Upgrade 2.1.11 -> 2.2.2 > [zookeeper] Upgrade 3.5.9 -> 3.5.10 > [curator] Upgrade 5.2.0 -> 5.3.0 > [netty] Upgrade 4.1.70 -> 4.1.82 > [zookeeper] Add 3.7 support > [zookeeper] Add 3.8 support -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29418) Update flink-shaded dependencies
[ https://issues.apache.org/jira/browse/FLINK-29418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29418. Resolution: Fixed > Update flink-shaded dependencies > > > Key: FLINK-29418 > URL: https://issues.apache.org/jira/browse/FLINK-29418 > Project: Flink > Issue Type: Technical Debt > Components: BuildSystem / Shaded >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: shaded-1.16 > > > Do a round of updates in flink-shaded. > [asm] Upgrade 9.2 -> 9.3 > [jackson] Upgrade 2.12.4 -> 2.13.4 > [swagger] Upgrade 2.1.11 -> 2.2.2 > [zookeeper] Upgrade 3.5.9 -> 3.5.10 > [curator] Upgrade 5.2.0 -> 5.3.0 > [netty] Upgrade 4.1.70 -> 4.1.82 > [zookeeper] Add 3.7 support > [zookeeper] Add 3.8 support -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29418) Update flink-shaded dependencies
[ https://issues.apache.org/jira/browse/FLINK-29418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29418: - Summary: Update flink-shaded dependencies (was: Update flink-shaded) > Update flink-shaded dependencies > > > Key: FLINK-29418 > URL: https://issues.apache.org/jira/browse/FLINK-29418 > Project: Flink > Issue Type: Technical Debt > Components: BuildSystem / Shaded >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: shaded-1.16 > > > Do a round of updates in flink-shaded. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29418) Update flink-shaded dependencies
[ https://issues.apache.org/jira/browse/FLINK-29418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609905#comment-17609905 ] Chesnay Schepler commented on FLINK-29418: -- master: d630f8b1e8d19f70cd2a967e1d210454ef1c7849..bfb2c5abf75e73ab4e8159de19c379c3324695f0 > Update flink-shaded dependencies > > > Key: FLINK-29418 > URL: https://issues.apache.org/jira/browse/FLINK-29418 > Project: Flink > Issue Type: Technical Debt > Components: BuildSystem / Shaded >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: shaded-1.16 > > > Do a round of updates in flink-shaded. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-29423) JobDetails is incorrect OpenAPI spec
[ https://issues.apache.org/jira/browse/FLINK-29423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-29423: Assignee: Chesnay Schepler > JobDetails is incorrect OpenAPI spec > - > > Key: FLINK-29423 > URL: https://issues.apache.org/jira/browse/FLINK-29423 > Project: Flink > Issue Type: Bug > Components: Documentation, Runtime / REST >Affects Versions: 1.16.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > The JobDetails class uses custom serialization, but the introspection ignores > that and analyzes the class as-is, resulting in various fields being documented > that shouldn't be. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29423) JobDetails is incorrect OpenAPI spec
Chesnay Schepler created FLINK-29423: Summary: JobDetails is incorrect OpenAPI spec Key: FLINK-29423 URL: https://issues.apache.org/jira/browse/FLINK-29423 Project: Flink Issue Type: Bug Components: Documentation, Runtime / REST Affects Versions: 1.16.0 Reporter: Chesnay Schepler The JobDetails class uses custom serialization, but the introspection ignores that and analyzes the class as-is, resulting in various fields being documented that shouldn't be. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29422) Production tests return/argument types do not take transitivity into account
Chesnay Schepler created FLINK-29422: Summary: Production tests return/argument types do not take transitivity into account Key: FLINK-29422 URL: https://issues.apache.org/jira/browse/FLINK-29422 Project: Flink Issue Type: Technical Debt Components: API / DataStream Affects Versions: 1.16.0 Reporter: Chesnay Schepler In FLINK-29403 I'm marking {{SimpleCondition}} as {{PublicEvolving}}, but the production tests reject it: {code:java} Architecture Violation [Priority: MEDIUM] - Rule 'Return and argument types of methods annotated with @PublicEvolving must be annotated with @Public(Evolving).' was violated (1 times): Sep 26 15:20:12 org.apache.flink.cep.pattern.conditions.SimpleCondition.filter(java.lang.Object, org.apache.flink.cep.pattern.conditions.IterativeCondition$Context): Argument leaf type org.apache.flink.cep.pattern.conditions.IterativeCondition$Context does not satisfy: reside outside of package 'org.apache.flink..' or reside in any package ['..shaded..'] or annotated with @Public or annotated with @PublicEvolving or annotated with @Deprecated {code} This doesn't make any sense given that {{IterativeCondition}} itself is already {{PublicEvolving}} and contains the exact same method. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29420) Add support for Zookeeper 3.7 / 3.8
Chesnay Schepler created FLINK-29420: Summary: Add support for Zookeeper 3.7 / 3.8 Key: FLINK-29420 URL: https://issues.apache.org/jira/browse/FLINK-29420 Project: Flink Issue Type: Improvement Components: BuildSystem / Shaded, Runtime / Coordination Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29414) Setup license checks
[ https://issues.apache.org/jira/browse/FLINK-29414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-29414. Resolution: Fixed master: 575912de77153544a5e4211f504cc67e5ee776a8 > Setup license checks > > > Key: FLINK-29414 > URL: https://issues.apache.org/jira/browse/FLINK-29414 > Project: Flink > Issue Type: Technical Debt > Components: BuildSystem / Shaded >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: shaded-1.16 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29418) Update flink-shaded
Chesnay Schepler created FLINK-29418: Summary: Update flink-shaded Key: FLINK-29418 URL: https://issues.apache.org/jira/browse/FLINK-29418 Project: Flink Issue Type: Technical Debt Components: BuildSystem / Shaded Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: shaded-1.16 Do a round of updates in flink-shaded. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI
[ https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29315: - Fix Version/s: (was: 1.16.0) (was: 1.17.0) > HDFSTest#testBlobServerRecovery fails on CI > --- > > Key: FLINK-29315 > URL: https://issues.apache.org/jira/browse/FLINK-29315 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / FileSystem, Tests >Affects Versions: 1.16.0, 1.15.2 >Reporter: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > > The test started failing 2 days ago on different branches. I suspect > something's wrong with the CI infrastructure. > {code:java} > Sep 15 09:11:22 [ERROR] Failures: > Sep 15 09:11:22 [ERROR] HDFSTest.testBlobServerRecovery Multiple Failures > (2 failures) > Sep 15 09:11:22 java.lang.AssertionError: Test failed Error while > running command to get file permissions : java.io.IOException: Cannot run > program "ls": error=1, Operation not permitted > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.runCommand(Shell.java:913) > Sep 15 09:11:22 at org.apache.hadoop.util.Shell.run(Shell.java:869) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1264) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1246) > Sep 15 09:11:22 at > org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233) > Sep 15 09:11:22 at > 
org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444) > Sep 15 09:11:22 at > org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93) > Sep 15 09:11:22 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Sep 15 09:11:22 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Sep 15 09:11:22 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) > Sep 15 09:11:22 ... 67 more > Sep 15 09:11:22 > Sep 15 09:11:22 java.lang.NullPointerException: > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29414) Setup license checks
Chesnay Schepler created FLINK-29414: Summary: Setup license checks Key: FLINK-29414 URL: https://issues.apache.org/jira/browse/FLINK-29414 Project: Flink Issue Type: Technical Debt Components: BuildSystem / Shaded Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: shaded-1.16 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29405) InputFormatCacheLoaderTest is unstable
Chesnay Schepler created FLINK-29405: Summary: InputFormatCacheLoaderTest is unstable Key: FLINK-29405 URL: https://issues.apache.org/jira/browse/FLINK-29405 Project: Flink Issue Type: Technical Debt Components: Table SQL / Runtime Affects Versions: 1.17.0 Reporter: Chesnay Schepler #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably when run in a loop. {code} java.lang.AssertionError: Expecting AtomicInteger(0) to have value: 0 but did not. at org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29403) Streamline SimpleCondition usage
[ https://issues.apache.org/jira/browse/FLINK-29403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-29403: - Description: A CEP SimpleCondition is essentially a filter function, but since it's an abstract class, using it ends up being incredibly verbose. We can add a simple factory method to streamline this. Additionally, the class should not be annotated with {{@Internal}} given how much it is advertised in the docs. was: A CEP SimpleCondition is essentially a filter function, but since it's an abstract class, using it ends up being incredibly verbose. Additionally, the class should not be annotated with {{@Internal}} given how much it is advertised in the docs. > Streamline SimpleCondition usage > > > Key: FLINK-29403 > URL: https://issues.apache.org/jira/browse/FLINK-29403 > Project: Flink > Issue Type: Improvement > Components: Library / CEP >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > A CEP SimpleCondition is essentially a filter function, but since it's an > abstract class, using it ends up being incredibly verbose. We can add a simple > factory method to streamline this. > Additionally, the class should not be annotated with {{@Internal}} given how > much it is advertised in the docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
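To illustrate the proposal above, here is a minimal self-contained sketch of what such a factory method could look like. The {{Condition}} class and its {{of}} method are hypothetical stand-ins for CEP's {{SimpleCondition}}, not the actual Flink API; the point is only to contrast the verbose anonymous-subclass form with a lambda-based factory.

```java
import java.io.Serializable;
import java.util.function.Predicate;

// Hypothetical mirror of CEP's SimpleCondition, for illustration only.
class SimpleConditionDemo {

    /** Stand-in for an abstract condition class that wraps a filter check. */
    public abstract static class Condition<T> implements Serializable {
        public abstract boolean filter(T value);

        /** The proposed factory: wrap a lambda instead of subclassing. */
        public static <T> Condition<T> of(Predicate<T> predicate) {
            return new Condition<T>() {
                @Override
                public boolean filter(T value) {
                    return predicate.test(value);
                }
            };
        }
    }

    public static void main(String[] args) {
        // Without the factory: an anonymous subclass for a trivial check.
        Condition<String> verbose = new Condition<String>() {
            @Override
            public boolean filter(String value) {
                return value.startsWith("error");
            }
        };

        // With the factory method: a one-liner.
        Condition<String> concise = Condition.of(v -> v.startsWith("error"));

        System.out.println(verbose.filter("error: disk full")); // true
        System.out.println(concise.filter("all good"));          // false
    }
}
```

The factory keeps the abstract class (and thus binary compatibility for existing subclasses) while letting new call sites pass a lambda directly.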
[jira] [Created] (FLINK-29403) Streamline SimpleCondition usage
Chesnay Schepler created FLINK-29403: Summary: Streamline SimpleCondition usage Key: FLINK-29403 URL: https://issues.apache.org/jira/browse/FLINK-29403 Project: Flink Issue Type: Improvement Components: Library / CEP Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 A CEP SimpleCondition is essentially a filter function, but since it's an abstract class, using it ends up being incredibly verbose. Additionally, the class should not be annotated with {{@Internal}} given how much it is advertised in the docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)