[jira] [Commented] (FLINK-31820) Support data source sub-database and sub-table
[ https://issues.apache.org/jira/browse/FLINK-31820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719641#comment-17719641 ] xingyuan cheng commented on FLINK-31820: [~martijnvisser] Hello, sorry for the late reply after the May Day holiday. I did some simple research. At present, when a single database grows very large, some domestic (Chinese) companies adopt sub-database and sub-table (sharding) solutions, while companies abroad tend to use distributed databases instead. The reason sharding is chosen over migration is that the storage medium of the existing data has already been fixed, and the cost of migrating to a distributed database is too high for the business to accept, so we need to extend the connector to support sub-database and sub-table. The reference given by [~Thesharing] is a good explanation of MySQL sub-database and sub-table. I will update the documentation in the near future. > Support data source sub-database and sub-table > -- > > Key: FLINK-31820 > URL: https://issues.apache.org/jira/browse/FLINK-31820 > Project: Flink > Issue Type: Improvement > Components: Connectors / JDBC >Reporter: xingyuan cheng >Priority: Major > Labels: pull-request-available > > At present, apache/flink-connector-jdbc does not support sub-database and > sub-table. 
Now three commonly used databases Mysql, Postgres and > Oracle support sub-database and sub-table > > Taking oracle as an example, users only need to configure the following > format to use > > {code:java} > create table oracle_source ( > EMPLOYEE_ID BIGINT, > START_DATE TIMESTAMP, > END_DATE TIMESTAMP, > JOB_ID VARCHAR, > DEPARTMENT_ID VARCHAR > ) with ( > type = 'oracle', > url = > 'jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,}),jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,})', > userName = 'userName', > password = 'password', > dbName = 'hr', > table-name = 'order_([0-9]{1,})', > timeField = 'START_DATE', > startTime = '2007-1-1 00:00:00' > ); {code} > In the above code, the dbName attribute corresponds to the schema-name > attribute in oracle or postgres, and the mysql database needs to manually > specify the dbName > > At the same time, I am also developing the CDAS whole database > synchronization syntax for the company, and the data source supports > sub-database and table as part of it. Add unit tests. For now, please keep > this PR in draft status. -- This message was sent by Atlassian Jira (v8.20.10#820010)
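The proposal above keys sharding on regex patterns such as `order_([0-9]{1,})` for both the JDBC URLs and the `table-name` option. As a minimal, hypothetical sketch (illustrative names only, not the connector's actual implementation), resolving the concrete shard tables against such a pattern could look like this:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ShardTableResolver {

    // Returns the physical tables whose names fully match the configured
    // table-name pattern, e.g. "order_([0-9]{1,})" matches order_0, order_12, ...
    public static List<String> resolveShards(List<String> physicalTables, String tableNamePattern) {
        Pattern pattern = Pattern.compile(tableNamePattern);
        return physicalTables.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tables = List.of("order_0", "order_1", "order_12", "customers");
        System.out.println(resolveShards(tables, "order_([0-9]{1,})"));
        // prints [order_0, order_1, order_12]
    }
}
```

Note the use of `matches()` (full-string match) rather than `find()`, so a pattern like `order_([0-9]{1,})` does not accidentally pick up a table such as `order_1_backup_old` unless the pattern allows it.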
[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #584: [FLINK-31867] Enforce a minimum number of observations within a metric window
gyfora commented on code in PR #584: URL: https://github.com/apache/flink-kubernetes-operator/pull/584#discussion_r1185716270 ## flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java: ## @@ -52,6 +52,13 @@ private static ConfigOptions.OptionBuilder autoScalerConfig(String key) { .defaultValue(Duration.ofMinutes(5)) .withDescription("Scaling metrics aggregation window size."); +public static final ConfigOption METRICS_WINDOW_MIN_OBSERVATIONS = +autoScalerConfig("metrics.window.min-observations") +.intType() +.defaultValue(10) Review Comment: I think the default value here is inconsistent with the default 5 minute metric window. I suggest setting the default here to 5 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
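For context, the option under discussion gates scaling decisions on how many metric observations fall inside the aggregation window; with a 5 minute window, a minimum of 10 observations may never be reached, hence the suggested default of 5. A hypothetical sketch of such a guard (illustrative names, not the operator's actual code):

```java
import java.util.List;

public class MetricWindowGuard {

    // A scaling decision should only be made once the window holds enough
    // observations; otherwise the aggregated metrics are too noisy to act on.
    public static boolean hasEnoughObservations(List<Double> observations, int minObservations) {
        return observations.size() >= minObservations;
    }

    public static void main(String[] args) {
        // Roughly one observation per minute in a 5 minute window:
        List<Double> window = List.of(1.0, 2.0, 3.0, 4.0, 5.0);
        System.out.println(hasEnoughObservations(window, 10)); // false: would block scaling forever
        System.out.println(hasEnoughObservations(window, 5));  // true
    }
}
```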
[jira] [Updated] (FLINK-31820) Support data source sub-database and sub-table
[ https://issues.apache.org/jira/browse/FLINK-31820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xingyuan cheng updated FLINK-31820: --- Description: At present, apache/flink-connector-jdbc does not support sub-database and table sub-database. Now three commonly used databases Mysql, Postgres and Oracle support sub-database and sub-table Taking oracle as an example, users only need to configure the following format to use {code:java} create table oracle_source ( EMPLOYEE_ID BIGINT, START_DATE TIMESTAMP, END_DATE TIMESTAMP, JOB_ID VARCHAR, DEPARTMENT_ID VARCHAR ) with ( type = 'oracle', url = 'jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,}),jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,})', userName = 'userName', password = 'password', dbName = 'hr', table-name = 'order_([0-9]{1,})', timeField = 'START_DATE', startTime = '2007-1-1 00:00:00' ); {code} In the above code, the dbName attribute corresponds to the schema-name attribute in oracle or postgres, and the mysql database needs to manually specify the dbName At the same time, I am also developing the CDAS whole database synchronization syntax for the company, and the data source supports sub-database and table as part of it. Add unit tests. For now, please keep this PR in draft status. was: At present, apache/flink-connector-jdbc does not support sub-database and table sub-database. 
Now three commonly used databases Mysql, Postgres and Oracle support sub-database and sub-table Taking oracle as an example, users only need to configure the following format to use {code:java} create table oracle_source ( EMPLOYEE_ID BIGINT, START_DATE TIMESTAMP, END_DATE TIMESTAMP, JOB_ID VARCHAR, DEPARTMENT_ID VARCHAR ) with ( type = 'oracle', url = 'jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,}),jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,})', userName = 'userName', password = 'password', dbName = 'hr', tableName = 'job_history', timeField = 'START_DATE', startTime = '2007-1-1 00:00:00' ); {code} In the above code, the dbName attribute corresponds to the schema-name attribute in oracle or postgres, and the mysql database needs to manually specify the dbName At the same time, I am also developing the CDAS whole database synchronization syntax for the company, and the data source supports sub-database and table as part of it. Add unit tests. For now, please keep this PR in draft status. > Support data source sub-database and sub-table > -- > > Key: FLINK-31820 > URL: https://issues.apache.org/jira/browse/FLINK-31820 > Project: Flink > Issue Type: Improvement > Components: Connectors / JDBC >Reporter: xingyuan cheng >Priority: Major > Labels: pull-request-available > > At present, apache/flink-connector-jdbc does not support sub-database and > table sub-database. 
Now three commonly used databases Mysql, Postgres and > Oracle support sub-database and sub-table > > Taking oracle as an example, users only need to configure the following > format to use > > {code:java} > create table oracle_source ( > EMPLOYEE_ID BIGINT, > START_DATE TIMESTAMP, > END_DATE TIMESTAMP, > JOB_ID VARCHAR, > DEPARTMENT_ID VARCHAR > ) with ( > type = 'oracle', > url = > 'jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,}),jdbc:oracle:thin:@//localhost:3306/order_([0-9]{1,})', > userName = 'userName', > password = 'password', > dbName = 'hr', > table-name = 'order_([0-9]{1,})', > timeField = 'START_DATE', > startTime = '2007-1-1 00:00:00' > ); {code} > In the above code, the dbName attribute corresponds to the schema-name > attribute in oracle or postgres, and the mysql database needs to manually > specify the dbName > > At the same time, I am also developing the CDAS whole database > synchronization syntax for the company, and the data source supports > sub-database and table as part of it. Add unit tests. For now, please keep > this PR in draft status. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] cy2008 commented on pull request #22498: [FLINK-31928][build] Upgrade okhttp3 to 4.11.0
cy2008 commented on PR #22498: URL: https://github.com/apache/flink/pull/22498#issuecomment-1535684672 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185674060 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java: ## @@ -778,7 +778,7 @@ private static PartitionInfo createFinishedPartitionInfo( */ @Override public void fail(Throwable t) { -processFail(t, true); +processFail(t, true, Collections.emptyMap()); Review Comment: Again this is covered with the next PR: https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR781-R782 But there are still corner cases part of the Deployment/Execution that are not easily covered, e.g., [deploy](https://github.com/pgaref/flink/blob/41728f1bae85ce19f843fc18d0d80ddfe75af6b9/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L608), [triggerTaskFailover](https://github.com/apache/flink/pull/22511/files#diff-c7f8fc6bbf59b935414529c76720dcafce6105e16783b918af6b06a10e247ab2R103-R104), [UpdatePartitionInfo](https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR1378) The intuition here is that all failures are internal and could skip labeling -- but happy to discuss other thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185674060 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java: ## @@ -778,7 +778,7 @@ private static PartitionInfo createFinishedPartitionInfo( */ @Override public void fail(Throwable t) { -processFail(t, true); +processFail(t, true, Collections.emptyMap()); Review Comment: Again this is covered with the next PR: https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR781-R782 But there are still corner cases part of the Deployment/Execution that are not easily covered, e.g., [deploy](https://github.com/pgaref/flink/blob/41728f1bae85ce19f843fc18d0d80ddfe75af6b9/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L608), [triggerTaskFailover](https://github.com/apache/flink/pull/22511/files#diff-c7f8fc6bbf59b935414529c76720dcafce6105e16783b918af6b06a10e247ab2R103-R104), [UpdatePartitionInfo](https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR1378) The intuition here is that all failures are internal and could skip labeling -- but happy to discuss alternatives -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185674060 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java: ## @@ -778,7 +778,7 @@ private static PartitionInfo createFinishedPartitionInfo( */ @Override public void fail(Throwable t) { -processFail(t, true); +processFail(t, true, Collections.emptyMap()); Review Comment: Again this is covered with the next PR: https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR781-R782 But there are still corner cases part of the Deployment/Execution that are not easily covered, e.g., [deploy](https://github.com/pgaref/flink/blob/41728f1bae85ce19f843fc18d0d80ddfe75af6b9/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L608), [triggerTaskFailover](https://github.com/apache/flink/pull/22511/files#diff-c7f8fc6bbf59b935414529c76720dcafce6105e16783b918af6b06a10e247ab2R103-R104), [UpdatePartitionInfo](https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR1378) Thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-connector-jdbc] reswqa commented on a diff in pull request #45: [hotfix] Fix incorrect sql_url in jdbc.yml.
reswqa commented on code in PR #45: URL: https://github.com/apache/flink-connector-jdbc/pull/45#discussion_r1185677895 ## docs/data/jdbc.yml: ## @@ -18,4 +18,4 @@ variants: - maven: flink-connector-jdbc -sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-jdbc/$full_version/flink-sql-connector-jdbc-$full_version.jar +sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/$full_version/flink-connector-jdbc-$full_version.jar Review Comment: If `3.1.0` is the only version that supports `flink-1.17`, there are two possibilities when releasing: - Create a `v3.1` branch: let `setup_docs.sh` track this branch. - Do not create a `v3.1` branch: modify the doc in the `jdbc-connector` repository so that `sql_connector_download_table "jdbc"` points to `3.1.0`. Either of these will automatically change the download URL linked to `3.1.0-1.17`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185674060 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java: ## @@ -778,7 +778,7 @@ private static PartitionInfo createFinishedPartitionInfo( */ @Override public void fail(Throwable t) { -processFail(t, true); +processFail(t, true, Collections.emptyMap()); Review Comment: Again this is covered with the next PR: https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR781-R782 But there are still corner cases that are not easily covered, e.g., [triggerTaskFailover](https://github.com/apache/flink/pull/22511/files#diff-c7f8fc6bbf59b935414529c76720dcafce6105e16783b918af6b06a10e247ab2R103-R104) [UpdatePartitionInfo](https://github.com/apache/flink/pull/22511/files#diff-f9e390431932be9a4505e581632815b81c1f0520e65cec7614ea4b90c8a52afcR1378) Thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185672001 ## flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java: ## @@ -473,26 +500,50 @@ public CompletableFuture cancel(Time timeout) { @Override public CompletableFuture updateTaskExecutionState( final TaskExecutionState taskExecutionState) { -FlinkException taskExecutionException; +checkNotNull(taskExecutionState, "taskExecutionState"); +// Use the main/caller thread for all updates to make sure they are processed in order. +// (MainThreadExecutor i.e., the akka thread pool does not guarantee that) +// Only detach for a FAILED state update that is terminal and may perform io heavy labeling. +if (ExecutionState.FAILED.equals(taskExecutionState.getExecutionState())) { +return labelFailure(taskExecutionState) +.thenApplyAsync( +taskStateWithLabels -> { +try { +return doUpdateTaskExecutionState(taskStateWithLabels); +} catch (FlinkException e) { +throw new CompletionException(e); +} +}, +getMainThreadExecutor()); +} try { -checkNotNull(taskExecutionState, "taskExecutionState"); +return CompletableFuture.completedFuture( +doUpdateTaskExecutionState(taskExecutionState)); +} catch (FlinkException e) { +return FutureUtils.completedExceptionally(e); +} +} +private Acknowledge doUpdateTaskExecutionState(final TaskExecutionState taskExecutionState) +throws FlinkException { +@Nullable FlinkException taskExecutionException; +try { if (schedulerNG.updateTaskExecutionState(taskExecutionState)) { Review Comment: Btw, InternalFailuresListener#notifyTaskFailure is already covered as part of FLINK-31891 -- when a TM disconnection happens we need to Release Payload Slot and since the error is not fromSchedulerNg we use the internalTaskFailuresListener: https://github.com/apache/flink/pull/22511/files#diff-d535f910a10f835962b0637e12014068a9727b2152a84223fd9b1bf9c6c074d6R1665 This is not however covering the `onMissingDeploymentsOf` case David mentioned 
above -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
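The pattern discussed in this thread -- run potentially I/O-heavy failure labeling off the main thread, then re-enter a single-threaded executor so state updates stay ordered -- can be sketched roughly as follows (hypothetical names and a stand-in labeler, not Flink's actual JobMaster code):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FailureLabelingSketch {

    // Stand-in labeler: in the real system this may do I/O, so it must not
    // block the main thread; it runs on the common fork-join pool here.
    static CompletableFuture<Map<String, String>> labelFailure(Throwable t) {
        return CompletableFuture.supplyAsync(() -> Map.of("type", t.getClass().getSimpleName()));
    }

    public static void main(String[] args) {
        // Single-threaded executor standing in for the JobMaster main thread,
        // which guarantees state updates are applied in order.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        Throwable failure = new IllegalStateException("boom");

        String result = labelFailure(failure)
                // Re-enter the "main thread" to apply the ordered state update.
                .thenApplyAsync(labels -> "FAILED with labels " + labels, mainThread)
                .join();
        System.out.println(result);
        mainThread.shutdown();
    }
}
```

The open question in the review is which failure paths (e.g., internal deployment failures) can safely bypass the labeling step entirely and pass an empty label map, as `processFail(t, true, Collections.emptyMap())` does.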
[GitHub] [flink] pgaref commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
pgaref commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185667103 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ErrorInfo.java: ## @@ -98,4 +112,13 @@ public String getExceptionAsString() { public long getTimestamp() { return timestamp; } + +/** + * Returns the labels associated with the exception. + * + * @return Map of exception labels + */ +public Map getLabels() { +return labels; Review Comment: Great idea, added -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] reswqa commented on pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
reswqa commented on PR #22447: URL: https://github.com/apache/flink/pull/22447#issuecomment-1535639126 Rebased on master to avoid the CI problem caused by [FLINK-30972](https://issues.apache.org/jira/browse/FLINK-30972). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-32001) SupportsRowLevelUpdate does not support returning only a part of the columns.
[ https://issues.apache.org/jira/browse/FLINK-32001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719601#comment-17719601 ] Jingsong Lee commented on FLINK-32001: -- CC [~yuxia] > SupportsRowLevelUpdate does not support returning only a part of the columns. > - > > Key: FLINK-32001 > URL: https://issues.apache.org/jira/browse/FLINK-32001 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.17.0 >Reporter: Ming Li >Priority: Major > > [FLIP-282|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061] > introduces the new Delete and Update API in Flink SQL. Although it is > described in the documentation that in case of {{partial-update}} we only > need to return the primary key columns and the updated columns. > But in fact, the topology of the job is {{{}source -> cal -> > constraintEnforcer -> sink{}}}, and the constraint check will be performed in > the operator of {{{}constraintEnforcer{}}}, which is done according to index, > not according to column. If only some columns are returned, the constraint > check is wrong, and it is easy to generate > {{{}ArrayIndexOutOfBoundsException{}}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
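The index-vs-column mismatch described above can be illustrated with a small, hypothetical sketch (not the actual ConstraintEnforcer code): a NOT NULL check wired to a physical column index works on full rows, but overruns a partial-update row that carries only the primary key and the updated column:

```java
public class PartialUpdateConstraintSketch {

    // Constraint check working by physical index, as described in the issue:
    // it verifies that the column at notNullIndex is non-null.
    static void enforceNotNull(Object[] row, int notNullIndex) {
        if (row[notNullIndex] == null) {
            throw new IllegalStateException("Column " + notNullIndex + " must not be null");
        }
    }

    public static void main(String[] args) {
        // Full schema: [id, name, age] with NOT NULL on index 2 (age).
        Object[] fullRow = {1L, "alice", 30};
        enforceNotNull(fullRow, 2); // passes

        // Partial update returning only [id, age]: physical index 2 no longer exists.
        Object[] partialRow = {1L, 31};
        try {
            enforceNotNull(partialRow, 2);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index-based check broke on partial row: " + e.getMessage());
        }
    }
}
```

A name-based (or remapped-index) check would avoid this, which is essentially what the issue asks for when only a subset of columns is returned.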
[jira] [Comment Edited] (FLINK-31968) [PyFlink] 1.17.0 version on M1 processor
[ https://issues.apache.org/jira/browse/FLINK-31968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719599#comment-17719599 ] Dian Fu edited comment on FLINK-31968 at 5/5/23 2:27 AM: - [~gradysnik] Thanks for the confirmation. I'm closing this ticket since it should have been addressed in FLINK-28786. Feel free to reopen it if this issue still happens after 1.17.1. was (Author: dianfu): [~gradysnik] Thanks for the confirmation. I'm closing this ticket since it has been addressed in FLINK-28786. Feel free to reopen it if this issue still happens after 1.17.1. > [PyFlink] 1.17.0 version on M1 processor > > > Key: FLINK-31968 > URL: https://issues.apache.org/jira/browse/FLINK-31968 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.17.0 >Reporter: Yury Smirnov >Priority: Major > Attachments: 1170_output.log > > > PyFlink version 1.17.0 > NumPy version 1.21.4 and 1.21.6 > Python versions 3.8, 3.9, 3.9 > Slack thread: > [https://apache-flink.slack.com/archives/C03G7LJTS2G/p1682508650487979] > While running any of [PyFlink > Examples|https://github.com/apache/flink/blob/release-1.17/flink-python/pyflink/examples] > , getting error: > {code:java} > /Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/bin/python > /Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py > Traceback (most recent call last): > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 76, in > basic_operations() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 53, in basic_operations > show(ds.map(update_tel), env) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 28, in show > env.execute() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/datastream/stream_execution_environment.py", > line 764, in execute > return > 
JobExecutionResult(self._j_stream_execution_environment.execute(j_stream_graph)) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/java_gateway.py", > line 1322, in __call__ > return_value = get_return_value( > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/util/exceptions.py", > line 146, in deco > return f(*a, **kw) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/protocol.py", > line 326, in get_return_value > raise Py4JJavaError( > py4j.protocol.Py4JJavaError: An error occurred while calling o10.execute. > : org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) > at > org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:267) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1277) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) > at > 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at >
[jira] [Commented] (FLINK-31968) [PyFlink] 1.17.0 version on M1 processor
[ https://issues.apache.org/jira/browse/FLINK-31968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719599#comment-17719599 ] Dian Fu commented on FLINK-31968: - [~gradysnik] Thanks for the confirmation. I'm closing this ticket since it has been addressed in FLINK-28786. Feel free to reopen it if this issue still happens after 1.17.1. > [PyFlink] 1.17.0 version on M1 processor > > > Key: FLINK-31968 > URL: https://issues.apache.org/jira/browse/FLINK-31968 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.17.0 >Reporter: Yury Smirnov >Priority: Major > Attachments: 1170_output.log > > > PyFlink version 1.17.0 > NumPy version 1.21.4 and 1.21.6 > Python versions 3.8, 3.9, 3.9 > Slack thread: > [https://apache-flink.slack.com/archives/C03G7LJTS2G/p1682508650487979] > While running any of [PyFlink > Examples|https://github.com/apache/flink/blob/release-1.17/flink-python/pyflink/examples] > , getting error: > {code:java} > /Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/bin/python > /Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py > Traceback (most recent call last): > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 76, in > basic_operations() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 53, in basic_operations > show(ds.map(update_tel), env) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 28, in show > env.execute() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/datastream/stream_execution_environment.py", > line 764, in execute > return > JobExecutionResult(self._j_stream_execution_environment.execute(j_stream_graph)) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/java_gateway.py", > line 1322, in __call__ > return_value = get_return_value( 
> File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/util/exceptions.py", > line 146, in deco > return f(*a, **kw) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/protocol.py", > line 326, in get_return_value > raise Py4JJavaError( > py4j.protocol.Py4JJavaError: An error occurred while calling o10.execute. > : org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) > at > org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:267) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1277) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) > at > 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47) > at akka.dispatch.OnComplete.internal(Future.scala:300) > at akka.dispatch.OnComplete.internal(Future.scala:297) > at akka.dispatch.japi$CallbackBridge.apply(Future.scala:224) > at
[jira] [Closed] (FLINK-31968) [PyFlink] 1.17.0 version on M1 processor
[ https://issues.apache.org/jira/browse/FLINK-31968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-31968. --- Resolution: Duplicate > [PyFlink] 1.17.0 version on M1 processor > > > Key: FLINK-31968 > URL: https://issues.apache.org/jira/browse/FLINK-31968 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.17.0 >Reporter: Yury Smirnov >Priority: Major > Attachments: 1170_output.log > > > PyFlink version 1.17.0 > NumPy version 1.21.4 and 1.21.6 > Python versions 3.8, 3.9, 3.9 > Slack thread: > [https://apache-flink.slack.com/archives/C03G7LJTS2G/p1682508650487979] > While running any of [PyFlink > Examples|https://github.com/apache/flink/blob/release-1.17/flink-python/pyflink/examples] > , getting error: > {code:java} > /Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/bin/python > /Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py > Traceback (most recent call last): > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 76, in > basic_operations() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 53, in basic_operations > show(ds.map(update_tel), env) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/word_count/word_count.py", > line 28, in show > env.execute() > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/datastream/stream_execution_environment.py", > line 764, in execute > return > JobExecutionResult(self._j_stream_execution_environment.execute(j_stream_graph)) > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/java_gateway.py", > line 1322, in __call__ > return_value = get_return_value( > File > "/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/pyflink/util/exceptions.py", > line 146, in deco > return f(*a, **kw) > File > 
"/Users/ysmirnov/Documents/Repos/flink-python-jobs/venv/lib/python3.9/site-packages/py4j/protocol.py", > line 326, in get_return_value > raise Py4JJavaError( > py4j.protocol.Py4JJavaError: An error occurred while calling o10.execute. > : org.apache.flink.runtime.client.JobExecutionException: Job execution failed. > at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) > at > org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:267) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1277) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > at > org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47) > at akka.dispatch.OnComplete.internal(Future.scala:300) > at akka.dispatch.OnComplete.internal(Future.scala:297) > at akka.dispatch.japi$CallbackBridge.apply(Future.scala:224) > at akka.dispatch.japi$CallbackBridge.apply(Future.scala:221) > at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64) > at >
[jira] [Updated] (FLINK-31988) Implement Python wrapper for new KDS source
[ https://issues.apache.org/jira/browse/FLINK-31988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-31988: Component/s: API / Python Connectors / Kinesis > Implement Python wrapper for new KDS source > --- > > Key: FLINK-31988 > URL: https://issues.apache.org/jira/browse/FLINK-31988 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Kinesis >Reporter: Hong Liang Teoh >Priority: Major > > *What?* > - Implement Python wrapper for KDS source > - Write tests for this KDS source -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-connector-jdbc] leonardBang commented on a diff in pull request #45: [hotfix] Fix incorrect sql_url in jdbc.yml.
leonardBang commented on code in PR #45: URL: https://github.com/apache/flink-connector-jdbc/pull/45#discussion_r1185651232 ## docs/data/jdbc.yml: ## @@ -18,4 +18,4 @@ variants: - maven: flink-connector-jdbc -sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-jdbc/$full_version/flink-sql-connector-jdbc-$full_version.jar +sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/$full_version/flink-connector-jdbc-$full_version.jar Review Comment: > At present, the `release-1.17` branch of flink tracks the `v3.0` branch of the `flink-connector-jdbc` repository. IIUC, `3.0.0-1.17` was never released and it looks like we don't have a plan to release this version: https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
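The hotfix above only swaps the Maven artifact name inside the templated `sql_url`; the `$full_version` placeholder is filled in by the docs build. How that expansion behaves can be illustrated with Python's `string.Template` (an illustration only; the actual Flink docs site uses its own templating, and the version string below is just an example):

```python
from string import Template

# Corrected jdbc.yml entry: plain flink-connector-jdbc artifact instead of
# the never-released flink-sql-connector-jdbc one.
SQL_URL = Template(
    "https://repo.maven.apache.org/maven2/org/apache/flink/"
    "flink-connector-jdbc/$full_version/flink-connector-jdbc-$full_version.jar"
)

def render_sql_url(full_version: str) -> str:
    """Expand the $full_version placeholder into a downloadable jar URL."""
    return SQL_URL.substitute(full_version=full_version)

print(render_sql_url("3.0.0-1.16"))
# -> https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/3.0.0-1.16/flink-connector-jdbc-3.0.0-1.16.jar
```

Note that `Template` stops the identifier at the `-` and `.` characters, so `flink-connector-jdbc-$full_version.jar` expands cleanly without braces.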
[GitHub] [flink] cy2008 commented on pull request #22498: [FLINK-31928][build] Upgrade okhttp3 to 4.11.0
cy2008 commented on PR #22498: URL: https://github.com/apache/flink/pull/22498#issuecomment-1535613125 > @flinkbot run azure
[GitHub] [flink] libenchao commented on a diff in pull request #10594: [FLINK-15266] Fix NPE for case operator code gen in blink planner
libenchao commented on code in PR #10594: URL: https://github.com/apache/flink/pull/10594#discussion_r1185649952 ## flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/codegen/calls/ScalarOperatorGens.scala: ## @@ -1287,12 +1287,16 @@ object ScalarOperatorGens { |boolean $nullTerm; |if (${condition.resultTerm}) { | ${trueAction.code} - | $resultTerm = ${trueAction.resultTerm}; + | if (!${trueAction.nullTerm}) { Review Comment: @kurnoolsaketh We didn't solve it, it just disappeared.
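The diff discussed above moves the assignment of `$resultTerm` behind a check of the branch's null flag, so the generated Java never copies (and, for primitive result types, unboxes) a null CASE-branch result. The before/after shape of that fix can be simulated in plain Python (a stand-in for the generated code, not the code itself; `None` plays the role of Java's null, and `+ 0` forces the "unboxing"):

```python
def eval_case_unsafe(condition, true_value, false_value):
    # Pre-fix shape: the branch result is assigned before its null flag is
    # consulted; in the generated Java this unboxes null -> NPE
    # (TypeError in this Python stand-in).
    result = true_value if condition else false_value
    return result + 0

def eval_case_safe(condition, true_value, false_value):
    # Post-fix shape: the result is only copied when the branch's null
    # flag is false, so a null branch propagates as null instead of crashing.
    result, is_null = 0, True
    branch = true_value if condition else false_value
    if branch is not None:
        result, is_null = branch, False
    return None if is_null else result + 0

print(eval_case_safe(True, None, 5))   # None instead of a crash
print(eval_case_safe(False, None, 5))  # 5
```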
[GitHub] [flink] cy2008 commented on pull request #22498: [FLINK-31928][build] Upgrade okhttp3 to 4.11.0
cy2008 commented on PR #22498: URL: https://github.com/apache/flink/pull/22498#issuecomment-1535612435 @flinkbot run azure
[GitHub] [flink-ml] zhipeng93 opened a new pull request, #237: [Flink-27826] Support machine learning training for very high dimensional models
zhipeng93 opened a new pull request, #237: URL: https://github.com/apache/flink-ml/pull/237 ## What is the purpose of the change *(For example: This pull request adds the implementation of Naive Bayes algorithm.)* ## Brief change log *(for example:)* - *Adds Transformer and Estimator implementation of Naive Bayes in Java and Python* - *Adds examples and documentations of Naive Bayes* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[GitHub] [flink] wgtmac commented on pull request #22481: [FLINK-27805][Connectors/ORC] bump orc version to 1.7.8
wgtmac commented on PR #22481: URL: https://github.com/apache/flink/pull/22481#issuecomment-1535588804 Thanks @dongjoon-hyun and @pgaref. I will keep an eye on it and prepare RC2 if required. BTW, it would be better to test 1.7.9-RC1 instead of 1.7.9-SNAPSHOT. But I think they have the same content except the version.
[jira] [Commented] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
[ https://issues.apache.org/jira/browse/FLINK-31974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719580#comment-17719580 ] Xintong Song commented on FLINK-31974: -- cc [~wangyang0918] > JobManager crashes after KubernetesClientException exception with > FatalExitExceptionHandler > --- > > Key: FLINK-31974 > URL: https://issues.apache.org/jira/browse/FLINK-31974 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.17.0 >Reporter: Sergio Sainz >Assignee: Weijie Guo >Priority: Major > > When resource quota limit is reached JobManager will throw > > org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: > Failure executing: POST at: > https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. > > In {*}1.16.1 , this is handled gracefully{*}: > {code} > 2023-04-28 22:07:24,631 WARN > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Failed requesting worker with resource spec WorkerResourceSpec > \{cpuCores=1.0, taskHeapSize=25.600mb (26843542 bytes), taskOffHeapSize=0 > bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=230.400mb > (241591914 bytes), numSlots=4}, current pending count: 0 > java.util.concurrent.CompletionException: > io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: > POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. 
pods "my-namespace-flink-cluster-taskmanager-1-138" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. > at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown > Source) ~[?:?] > at java.util.concurrent.CompletableFuture.completeThrowable(Unknown > Source) ~[?:?] > at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown > Source) ~[?:?] > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > ~[?:?] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > ~[?:?] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure > executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. > Message: Forbidden!Configured service account doesn't have access. Service > account may have been revoked. pods > "my-namespace-flink-cluster-taskmanager-1-138" is forbidden: exceeded quota: > my-namespace-resource-quota, requested: limits.cpu=3, used: > limits.cpu=12100m, limited: limits.cpu=13. 
> at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:613) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) > ~[flink-dist-1.16.1.jar:1.16.1] > at > org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$1(Fabric8FlinkKubeClient.java:163) > ~[flink-dist-1.16.1.jar:1.16.1] > ... 4 more > {code} > But , {*}in Flink 1.17.0 , Job Manager crashes{*}: > {code} > 2023-04-28 20:50:50,534 ERROR org.apache.flink.util.FatalExitExceptionHandler > [] - FATAL: Thread
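The contrast reported in FLINK-31974 above is that 1.16.1 logs the failed pod request and keeps running, while 1.17.0 lets the `KubernetesClientException` escape to `FatalExitExceptionHandler` and kills the JobManager. The graceful shape can be sketched as attaching a completion handler to the asynchronous pod-creation call so the exception is captured and logged rather than re-thrown into an uncaught-exception handler (a simplified Python analogue of the pattern, not Flink's actual resource-manager code; all names below are illustrative):

```python
import concurrent.futures
import logging

log = logging.getLogger("resource-manager")

class QuotaExceededError(RuntimeError):
    """Stand-in for the fabric8 KubernetesClientException (quota exceeded)."""

def create_pod(name: str) -> str:
    # Simulates the POST .../pods call that the resource quota rejects.
    raise QuotaExceededError(f"pods {name!r} forbidden: exceeded quota")

def request_worker(executor: concurrent.futures.Executor, name: str) -> concurrent.futures.Future:
    """Submit pod creation; a failure is logged and absorbed, so the
    declarative resource manager can re-request the worker later."""
    future = executor.submit(create_pod, name)

    def on_done(f: concurrent.futures.Future) -> None:
        if f.exception() is not None:
            # Graceful 1.16-style path: warn, do not crash the process.
            log.warning("Failed requesting worker %s: %s", name, f.exception())

    future.add_done_callback(on_done)
    return future

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    f = request_worker(pool, "taskmanager-1-2")
    f.exception()  # exception is captured on the future; the process lives on
```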
[GitHub] [flink] snuyanzin commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization
snuyanzin commented on PR #17830: URL: https://github.com/apache/flink/pull/17830#issuecomment-1535509074 Since I rebased to the latest sql-gateway and sql-client versions @fsk119 may I ask you to have another look?
[GitHub] [flink] flinkbot commented on pull request #22523: [FLINK-32006] Harden and speed up AsyncWaitOperatorTest#testProcessingTimeAlwaysTimeoutFunctionWithRetry
flinkbot commented on PR #22523: URL: https://github.com/apache/flink/pull/22523#issuecomment-1535508362 ## CI report: * 991f0249b767b51481e3ad04add3cb8604d9d38f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-32006) AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry times out on Azure
[ https://issues.apache.org/jira/browse/FLINK-32006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32006: --- Labels: pull-request-available test-stability (was: test-stability) > AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > times out on Azure > -- > > Key: FLINK-32006 > URL: https://issues.apache.org/jira/browse/FLINK-32006 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.18.0 >Reporter: David Morávek >Assignee: David Morávek >Priority: Major > Labels: pull-request-available, test-stability > > {code:java} > May 04 13:52:18 [ERROR] > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > Time elapsed: 100.009 s <<< ERROR! > May 04 13:52:18 org.junit.runners.model.TestTimedOutException: test timed out > after 100 seconds > May 04 13:52:18 at java.lang.Thread.sleep(Native Method) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeAlwaysTimeoutFunctionWithRetry(AsyncWaitOperatorTest.java:1313) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry(AsyncWaitOperatorTest.java:1277) > May 04 13:52:18 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > May 04 13:52:18 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > May 04 13:52:18 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > May 04 13:52:18 at java.lang.reflect.Method.invoke(Method.java:498) > May 04 13:52:18 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > May 04 13:52:18 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > May 04 13:52:18 at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > May 04 13:52:18 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > May 04 13:52:18 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > May 04 13:52:18 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > May 04 13:52:18 at java.lang.Thread.run(Thread.java:748) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48671=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9288
[jira] [Assigned] (FLINK-32006) AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry times out on Azure
[ https://issues.apache.org/jira/browse/FLINK-32006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Morávek reassigned FLINK-32006: - Assignee: David Morávek > AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > times out on Azure > -- > > Key: FLINK-32006 > URL: https://issues.apache.org/jira/browse/FLINK-32006 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.18.0 >Reporter: David Morávek >Assignee: David Morávek >Priority: Major > Labels: test-stability > > {code:java} > May 04 13:52:18 [ERROR] > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > Time elapsed: 100.009 s <<< ERROR! > May 04 13:52:18 org.junit.runners.model.TestTimedOutException: test timed out > after 100 seconds > May 04 13:52:18 at java.lang.Thread.sleep(Native Method) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeAlwaysTimeoutFunctionWithRetry(AsyncWaitOperatorTest.java:1313) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry(AsyncWaitOperatorTest.java:1277) > May 04 13:52:18 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > May 04 13:52:18 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > May 04 13:52:18 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > May 04 13:52:18 at java.lang.reflect.Method.invoke(Method.java:498) > May 04 13:52:18 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > May 04 13:52:18 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > May 04 13:52:18 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > May 04 13:52:18 at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > May 04 13:52:18 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > May 04 13:52:18 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > May 04 13:52:18 at java.lang.Thread.run(Thread.java:748) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48671=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9288
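The FLINK-32006 stack trace above shows the test stuck in `java.lang.Thread.sleep` at `AsyncWaitOperatorTest.java:1313` until JUnit's 100-second timeout fires. The usual way to harden and speed up such a test is to replace a long fixed sleep with a bounded poll (or an explicit completion signal), so the test returns the moment the condition holds. A generic sketch of that idea (Python analogue, not the actual JUnit change in #22523):

```python
import threading
import time

def wait_until(predicate, timeout=5.0, interval=0.01) -> bool:
    """Poll a condition with a deadline instead of one long sleep."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # final check at the deadline

done = threading.Event()
threading.Timer(0.05, done.set).start()  # the "async result" arrives at ~50 ms
# Returns almost immediately once the event fires, instead of sleeping
# for a fixed interval and risking a harness-level timeout.
assert wait_until(done.is_set, timeout=2.0)
```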
[GitHub] [flink] dmvk opened a new pull request, #22523: [FLINK-32006] Harden and speed up AsyncWaitOperatorTest#testProcessingTimeAlwaysTimeoutFunctionWithRetry
dmvk opened a new pull request, #22523: URL: https://github.com/apache/flink/pull/22523 https://issues.apache.org/jira/browse/FLINK-32006
[GitHub] [flink] kurnoolsaketh commented on a diff in pull request #10594: [FLINK-15266] Fix NPE for case operator code gen in blink planner
kurnoolsaketh commented on code in PR #10594: URL: https://github.com/apache/flink/pull/10594#discussion_r1185567460 ## flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/codegen/calls/ScalarOperatorGens.scala: ## @@ -1287,12 +1287,16 @@ object ScalarOperatorGens { |boolean $nullTerm; |if (${condition.resultTerm}) { | ${trueAction.code} - | $resultTerm = ${trueAction.resultTerm}; + | if (!${trueAction.nullTerm}) { Review Comment: @libenchao @JingsongLi Hello! This might be a bit unrelated, but I am developing on flink and I am also running into a very similar `VerifyError` pointing to a Janino code generated method (specifically in the Flink protobuf row serialization codepath). How did you workaround/resolve this issue?
[GitHub] [flink] liuml07 commented on pull request #22522: [BP-1.17][FLINK-31984][fs][checkpoint] Savepoint should be relocatable if entropy injection is not effective
liuml07 commented on PR #22522: URL: https://github.com/apache/flink/pull/22522#issuecomment-1535413955 Tried locally and it can be cherry-picked to `release-1.16` branch cleanly.
[GitHub] [flink] flinkbot commented on pull request #22522: [BP-1.17][FLINK-31984][fs][checkpoint] Savepoint should be relocatable if entropy injection is not effective
flinkbot commented on PR #22522: URL: https://github.com/apache/flink/pull/22522#issuecomment-1535413802 ## CI report: * ed6a6d56af6b23282a45a717b98f126017ab259a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] liuml07 opened a new pull request, #22522: [FLINK-31984][fs][checkpoint] Savepoint should be relocatable if entropy injection is not effective
liuml07 opened a new pull request, #22522: URL: https://github.com/apache/flink/pull/22522 This backports FLINK-31984 (aka #22510) to the 1.17 branch. The original backport 5c0a53534ed was reverted by #22518 because the API change caused [build failures](https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48655=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb=1316). As discussed in [FLINK-31984](https://issues.apache.org/jira/browse/FLINK-31984), we preserve the old public evolving API and introduce a new one. The old API is unlikely to be used elsewhere and is now `@Deprecated`.
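The compatibility move described in the PR above (keep the old public-evolving method, mark it `@Deprecated`, and delegate to a new method) is a standard way to backport an API change without breaking downstream builds. A generic sketch of that pattern (Python illustration with hypothetical names; the actual Flink change is in the Java checkpoint-storage code, and `_entropy_` substitution here is only a simplified model of entropy injection):

```python
import warnings
from typing import Optional

def resolve_savepoint_path(base_path: str, entropy_key: Optional[str] = None) -> str:
    """New API: the caller states whether entropy injection is in effect."""
    if entropy_key is None:
        return base_path  # no entropy injected -> savepoint stays relocatable
    return base_path.replace("_entropy_", entropy_key)

def get_savepoint_path(base_path: str) -> str:
    """Old API, preserved so existing callers keep working, but deprecated
    in favor of the entropy-aware method."""
    warnings.warn(
        "get_savepoint_path is deprecated; use resolve_savepoint_path",
        DeprecationWarning,
        stacklevel=2,
    )
    return resolve_savepoint_path(base_path)
```

Delegating rather than duplicating keeps one source of truth for the behavior, so the deprecated entry point cannot drift out of sync with the new one.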
[jira] [Commented] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
[ https://issues.apache.org/jira/browse/FLINK-31974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719525#comment-17719525 ] Sergio Sainz commented on FLINK-31974: -- Hi [~mapohl] - let me setup a new cluster later on to get the full logs. Below please find the thread dump from the Flink 1.17.0 crash: {code:java} 2023-04-28 20:50:50,305 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Received resource requirements from job 0a97c80a173b7ebb619c5b030b607520: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}] ... 2023-04-28 20:50:50,534 ERROR org.apache.flink.util.FatalExitExceptionHandler [] - FATAL: Thread 'flink-akka.actor.default-dispatcher-15' produced an uncaught exception. Stopping the process... java.util.concurrent.CompletionException: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/env-my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota: my-namespace-realtime-server-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?] at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?] at java.lang.Thread.run(Unknown Source) ~[?:?] Caused by: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/env-my-namespace/pods. 
Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota: my-namespace-realtime-server-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:613) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[flink-dist-1.17.0.jar:1.17.0] at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$1(Fabric8FlinkKubeClient.java:163) ~[flink-dist-1.17.0.jar:1.17.0] ... 
4 more 2023-04-28 20:50:50,602 ERROR org.apache.flink.util.FatalExitExceptionHandler [] - Thread dump: "main" prio=5 Id=1 WAITING on java.util.concurrent.CompletableFuture$Signaller@2897b146 at java.base@11.0.19/jdk.internal.misc.Unsafe.park(Native Method) - waiting on java.util.concurrent.CompletableFuture$Signaller@2897b146 at java.base@11.0.19/java.util.concurrent.locks.LockSupport.park(Unknown Source) at java.base@11.0.19/java.util.concurrent.CompletableFuture$Signaller.block(Unknown Source) at java.base@11.0.19/java.util.concurrent.ForkJoinPool.managedBlock(Unknown Source) at java.base@11.0.19/java.util.concurrent.CompletableFuture.waitingGet(Unknown Source) at
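The root cause in the log above is plain resource-quota arithmetic: the namespace quota rejects the new TaskManager pod because the CPU already in use plus the requested CPU would exceed the limit. A minimal sketch of that admission check (the helper name and structure are hypothetical; only the numbers come from the logged message, all in millicores):

```python
# Hypothetical sketch of the quota check Kubernetes applies on pod creation;
# only the numeric values are taken from the error message above.
def fits_quota(used_m: int, requested_m: int, limit_m: int) -> bool:
    # A pod is admitted only if used + requested stays within the limit.
    return used_m + requested_m <= limit_m

# From the error: requested limits.cpu=3 (3000m), used=12100m, limited=13 (13000m)
print(fits_quota(12100, 3000, 13000))  # False: 15100m exceeds 13000m
```

The separate question under discussion in this ticket is why the quota rejection escalates to a fatal uncaught exception instead of being handled as a recoverable resource-allocation failure.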
[GitHub] [flink] nateab commented on pull request #22359: [FLINK-31660][table] add jayway json path dependency to table-planner
nateab commented on PR #22359: URL: https://github.com/apache/flink/pull/22359#issuecomment-1535370028 @zentol I believe I've addressed all your feedback, and I got CI to pass again, so hopefully you could take a final look. And thanks for all your help reviewing! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-32007) Implement Python Wrappers for DynamoDB Connector
Ahmed Hamdy created FLINK-32007: --- Summary: Implement Python Wrappers for DynamoDB Connector Key: FLINK-32007 URL: https://issues.apache.org/jira/browse/FLINK-32007 Project: Flink Issue Type: New Feature Components: API / Python, Connectors / DynamoDB Reporter: Ahmed Hamdy Fix For: 1.18.0 Implement Python API Wrappers for DynamoDB Sink -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] architgyl commented on a diff in pull request #22509: [FLINK-31983] Add yarn Acls capability to Flink containers
architgyl commented on code in PR #22509: URL: https://github.com/apache/flink/pull/22509#discussion_r1185469043 ## flink-yarn-tests/src/test/java/org/apache/flink/yarn/YARNSessionFIFOITCase.java: ## @@ -61,6 +61,9 @@ class YARNSessionFIFOITCase extends YarnTestBase { private static final Logger log = LoggerFactory.getLogger(YARNSessionFIFOITCase.class); +protected static final String VIEW_ACLS = "user groupUser"; +protected static final String MODIFY_ACLS = "admin groupAdmin"; Review Comment: sure, changed.
[GitHub] [flink] pgaref commented on pull request #22481: [FLINK-27805][Connectors/ORC] bump orc version to 1.7.8
pgaref commented on PR #22481: URL: https://github.com/apache/flink/pull/22481#issuecomment-1535349716 hey @dongjoon-hyun -- thanks for keeping an eye! Just triggered a run on [1.7.9-SNAPSHOT](https://github.com/apache/flink/pull/22481/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R184)
[GitHub] [flink-connector-aws] vahmed-hamdy commented on a diff in pull request #70: [FLINK-31772] Adjusting Kinesis Ratelimiting strategy to fix performance regression
vahmed-hamdy commented on code in PR #70: URL: https://github.com/apache/flink-connector-aws/pull/70#discussion_r1185454980 ## flink-connector-aws-kinesis-streams/src/test/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSinkWriterTest.java: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.connector.kinesis.sink; + +import org.apache.flink.api.common.serialization.SimpleStringSchema; +import org.apache.flink.connector.aws.testutils.AWSServicesTestUtils; +import org.apache.flink.connector.base.sink.writer.ElementConverter; +import org.apache.flink.connector.base.sink.writer.TestSinkInitContext; +import org.apache.flink.connector.base.sink.writer.strategy.AIMDScalingStrategy; + +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; Review Comment: Thanks @hlteoh37 ! I agree as per coding guidelines.
[jira] [Updated] (FLINK-32006) AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry times out on Azure
[ https://issues.apache.org/jira/browse/FLINK-32006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Morávek updated FLINK-32006: -- Labels: test-stability (was: ) > AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > times out on Azure > -- > > Key: FLINK-32006 > URL: https://issues.apache.org/jira/browse/FLINK-32006 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.18.0 >Reporter: David Morávek >Priority: Major > Labels: test-stability > > {code:java} > May 04 13:52:18 [ERROR] > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry > Time elapsed: 100.009 s <<< ERROR! > May 04 13:52:18 org.junit.runners.model.TestTimedOutException: test timed out > after 100 seconds > May 04 13:52:18 at java.lang.Thread.sleep(Native Method) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeAlwaysTimeoutFunctionWithRetry(AsyncWaitOperatorTest.java:1313) > May 04 13:52:18 at > org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry(AsyncWaitOperatorTest.java:1277) > May 04 13:52:18 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > May 04 13:52:18 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > May 04 13:52:18 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > May 04 13:52:18 at java.lang.reflect.Method.invoke(Method.java:498) > May 04 13:52:18 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > May 04 13:52:18 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > May 04 13:52:18 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > May 04 13:52:18 at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > May 04 13:52:18 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > May 04 13:52:18 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > May 04 13:52:18 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > May 04 13:52:18 at java.lang.Thread.run(Thread.java:748) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48671=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9288
[jira] [Created] (FLINK-32006) AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry times out on Azure
David Morávek created FLINK-32006: - Summary: AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry times out on Azure Key: FLINK-32006 URL: https://issues.apache.org/jira/browse/FLINK-32006 Project: Flink Issue Type: Bug Components: API / DataStream Affects Versions: 1.18.0 Reporter: David Morávek {code:java} May 04 13:52:18 [ERROR] org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry Time elapsed: 100.009 s <<< ERROR! May 04 13:52:18 org.junit.runners.model.TestTimedOutException: test timed out after 100 seconds May 04 13:52:18 at java.lang.Thread.sleep(Native Method) May 04 13:52:18 at org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeAlwaysTimeoutFunctionWithRetry(AsyncWaitOperatorTest.java:1313) May 04 13:52:18 at org.apache.flink.streaming.api.operators.async.AsyncWaitOperatorTest.testProcessingTimeWithTimeoutFunctionOrderedWithRetry(AsyncWaitOperatorTest.java:1277) May 04 13:52:18 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) May 04 13:52:18 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) May 04 13:52:18 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) May 04 13:52:18 at java.lang.reflect.Method.invoke(Method.java:498) May 04 13:52:18 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) May 04 13:52:18 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) May 04 13:52:18 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) May 04 13:52:18 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) May 04 13:52:18 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) May 04 13:52:18 at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) May 04 13:52:18 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) May 04 13:52:18 at java.util.concurrent.FutureTask.run(FutureTask.java:266) May 04 13:52:18 at java.lang.Thread.run(Thread.java:748) {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48671=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9288
[GitHub] [flink] dmvk commented on pull request #22521: [FLINK-32000][webui] Expose max parallelism in the vertex detail drawer.
dmvk commented on PR #22521: URL: https://github.com/apache/flink/pull/22521#issuecomment-1535282266 failed with https://issues.apache.org/jira/browse/FLINK-32006
[GitHub] [flink] dmvk commented on pull request #22521: [FLINK-32000][webui] Expose max parallelism in the vertex detail drawer.
dmvk commented on PR #22521: URL: https://github.com/apache/flink/pull/22521#issuecomment-1535281949 @flinkbot run azure
[GitHub] [flink] mridulm commented on a diff in pull request #22509: [FLINK-31983] Add yarn Acls capability to Flink containers
mridulm commented on code in PR #22509: URL: https://github.com/apache/flink/pull/22509#discussion_r1185420037 ## flink-yarn-tests/src/test/java/org/apache/flink/yarn/YARNSessionFIFOITCase.java: ## @@ -61,6 +61,9 @@ class YARNSessionFIFOITCase extends YarnTestBase { private static final Logger log = LoggerFactory.getLogger(YARNSessionFIFOITCase.class); +protected static final String VIEW_ACLS = "user groupUser"; +protected static final String MODIFY_ACLS = "admin groupAdmin"; Review Comment: `groupUser` threw me off - `group` would be better :)
[GitHub] [flink] liuml07 commented on pull request #20935: [FLINK-29961][doc] Make referencing custom image clearer for Docker
liuml07 commented on PR #20935: URL: https://github.com/apache/flink/pull/20935#issuecomment-1535210299 Hi @MartijnVisser could you take another look? Thanks
[jira] [Updated] (FLINK-32005) Add a per-deployment error metric to signal about potential issues
[ https://issues.apache.org/jira/browse/FLINK-32005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32005: --- Labels: pull-request-available (was: ) > Add a per-deployment error metric to signal about potential issues > -- > > Key: FLINK-32005 > URL: https://issues.apache.org/jira/browse/FLINK-32005 > Project: Flink > Issue Type: Improvement > Components: Autoscaler, Kubernetes Operator >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.5.0 > > > If any of the autoscaled deployments produce errors, they are only visible in > the logs or in the k8s events. Additionally, it would be good to have metrics > to detect any potential issues.
[GitHub] [flink-kubernetes-operator] mxm opened a new pull request, #585: [FLINK-32005] Add a per-deployment error metric to signal about potential issues
mxm opened a new pull request, #585: URL: https://github.com/apache/flink-kubernetes-operator/pull/585 If any of the autoscaled deployments produce errors, they are only visible in the logs or in the k8s events. Additionally, it would be good to have metrics to detect any potential issues.
[jira] [Created] (FLINK-32005) Add a per-deployment error metric to signal about potential issues
Maximilian Michels created FLINK-32005: -- Summary: Add a per-deployment error metric to signal about potential issues Key: FLINK-32005 URL: https://issues.apache.org/jira/browse/FLINK-32005 Project: Flink Issue Type: Improvement Components: Autoscaler, Kubernetes Operator Reporter: Maximilian Michels Assignee: Maximilian Michels Fix For: kubernetes-operator-1.5.0 If any of the autoscaled deployments produce errors, they are only visible in the logs or in the k8s events. Additionally, it would be good to have metrics to detect any potential issues.
[jira] [Updated] (FLINK-31867) Enforce a minimum number of observations within a metric window
[ https://issues.apache.org/jira/browse/FLINK-31867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-31867: --- Labels: pull-request-available (was: ) > Enforce a minimum number of observations within a metric window > --- > > Key: FLINK-31867 > URL: https://issues.apache.org/jira/browse/FLINK-31867 > Project: Flink > Issue Type: Improvement > Components: Autoscaler, Kubernetes Operator >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.5.0 > > > The metric window is currently only time-based. We should make sure we see a > minimum number of observations to ensure we don't decide based on too few > observations.
[GitHub] [flink-kubernetes-operator] mxm opened a new pull request, #584: [FLINK-31867] Enforce a minimum number of observations within a metric window
mxm opened a new pull request, #584: URL: https://github.com/apache/flink-kubernetes-operator/pull/584 This makes sure that a minimum number of observations have been taken within the metric window. Otherwise the metrics are not returned, to prevent making scaling decisions based on too few observations.
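The guard this PR describes can be sketched as follows (hypothetical Python, not the operator's actual Java implementation): a windowed metric is withheld until the window holds enough samples.

```python
# Illustrative sketch of a minimum-observation guard on a metric window;
# the function name and threshold are hypothetical, not the operator's code.
def windowed_average(observations, min_observations=3):
    if len(observations) < min_observations:
        return None  # withhold the metric rather than decide on too few samples
    return sum(observations) / len(observations)

print(windowed_average([100.0, 110.0]))         # None: window not yet filled
print(windowed_average([100.0, 110.0, 120.0]))  # 110.0
```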
[jira] [Updated] (FLINK-32004) Intermittent ingress creation when JobManager is restarted via autoscaler
[ https://issues.apache.org/jira/browse/FLINK-32004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tan Kim updated FLINK-32004: Description: I set up access to the flink Web UI via the ingress settings in flinkDeployment. Since this is an AWS environment, the ingress class uses ALB, not NGINX. When using aws alb with flinkDeployment ingress settings when JobManager is restarted via autoscaler, intermittently ingress creation fails. I say intermittent because ingress is generated irregularly, regardless of whether scaling is successful or not. This issue occurs when the initial deployment always succeeds in creating INGRESS and accessing the FLINK WEB UI, but when it is scaled up/down via the autoscaler, INGRESS is not created sometimes. was: I set up access to the flink Web UI via the ingress settings in flinkDeployment. Since this is an AWS environment, the ingress class uses ALB, not NGINX. When using aws alb with flinkDeployment ingress settings when JobManager is restarted via autoscaler, intermittently ingress creation fails. I say intermittent because ingress is generated irregularly, regardless of whether scaling is successful or not. This issue occurs when the initial deployment always succeeds in creating INGRESS and accessing the FLINK WEB UI, but when it is scaled up/down via the autoscaler, INGRESS is not created. > Intermittent ingress creation when JobManager is restarted via autoscaler > - > > Key: FLINK-32004 > URL: https://issues.apache.org/jira/browse/FLINK-32004 > Project: Flink > Issue Type: Bug > Components: Autoscaler, Kubernetes Operator >Reporter: Tan Kim >Priority: Major > > I set up access to the flink Web UI via the ingress settings in > flinkDeployment. > Since this is an AWS environment, the ingress class uses ALB, not NGINX. > When using aws alb with flinkDeployment ingress settings when JobManager is > restarted via autoscaler, intermittently ingress creation fails. 
> I say intermittent because ingress is generated irregularly, regardless of > whether scaling is successful or not. > > This issue occurs when the initial deployment always succeeds in creating > INGRESS and accessing the FLINK WEB UI, but when it is scaled up/down via the > autoscaler, INGRESS is not created sometimes.
[jira] [Updated] (FLINK-32004) Intermittent ingress creation when JobManager is restarted via autoscaler
[ https://issues.apache.org/jira/browse/FLINK-32004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tan Kim updated FLINK-32004: Description: I set up access to the flink Web UI via the ingress settings in flinkDeployment. Since this is an AWS environment, the ingress class uses ALB, not NGINX. When using aws alb with flinkDeployment ingress settings when JobManager is restarted via autoscaler, intermittently ingress creation fails. I say intermittent because ingress is generated irregularly, regardless of whether scaling is successful or not. This issue occurs when the initial deployment always succeeds in creating INGRESS and accessing the FLINK WEB UI, but when it is scaled up/down via the autoscaler, INGRESS is not created. was: I set up access to the flink Web UI via the ingress settings in flinkDeployment. Since this is an AWS environment, the ingress class uses ALB, not NGINX. When using aws alb with flinkDeployment ingress settings when JobManager is restarted via autoscaler, intermittently ingress creation fails. I say intermittent because ingress is generated irregularly, regardless of whether scaling is successful or not. > Intermittent ingress creation when JobManager is restarted via autoscaler > - > > Key: FLINK-32004 > URL: https://issues.apache.org/jira/browse/FLINK-32004 > Project: Flink > Issue Type: Bug > Components: Autoscaler, Kubernetes Operator >Reporter: Tan Kim >Priority: Major > > I set up access to the flink Web UI via the ingress settings in > flinkDeployment. > Since this is an AWS environment, the ingress class uses ALB, not NGINX. > When using aws alb with flinkDeployment ingress settings when JobManager is > restarted via autoscaler, intermittently ingress creation fails. > I say intermittent because ingress is generated irregularly, regardless of > whether scaling is successful or not. 
> > This issue occurs when the initial deployment always succeeds in creating > INGRESS and accessing the FLINK WEB UI, but when it is scaled up/down via the > autoscaler, INGRESS is not created.
[jira] [Created] (FLINK-32004) Intermittent ingress creation when JobManager is restarted via autoscaler
Tan Kim created FLINK-32004: --- Summary: Intermittent ingress creation when JobManager is restarted via autoscaler Key: FLINK-32004 URL: https://issues.apache.org/jira/browse/FLINK-32004 Project: Flink Issue Type: Bug Components: Autoscaler, Kubernetes Operator Reporter: Tan Kim I set up access to the flink Web UI via the ingress settings in flinkDeployment. Since this is an AWS environment, the ingress class uses ALB, not NGINX. When using aws alb with flinkDeployment ingress settings when JobManager is restarted via autoscaler, intermittently ingress creation fails. I say intermittent because ingress is generated irregularly, regardless of whether scaling is successful or not.
[jira] [Commented] (FLINK-31977) If scaling.effectiveness.detection.enabled is false, the call to the detectIneffectiveScaleUp() function is unnecessary
[ https://issues.apache.org/jira/browse/FLINK-31977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719423#comment-17719423 ] Tan Kim commented on FLINK-31977: - I understand what you're saying. However, there's a part of the code that's a bit hard to understand because it has multiple returns based on conditional statements. I think I can change it to something more intuitive like this, what do you think? This is unrelated to the original suggestion in the jira ticket to improve inefficient function calls per SCALING_EFFECTIVENESS_DETECTION_ENABLED, but it does make the code a little easier to understand. {code:java} private boolean detectIneffectiveScaleUp( AbstractFlinkResource resource, JobVertexID vertex, Configuration conf, Map evaluatedMetrics, ScalingSummary lastSummary) { double lastProcRate = lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage(); double lastExpectedProcRate = lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent(); var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage(); // To judge the effectiveness of the scale up operation we compute how much of the expected // increase actually happened. For example if we expect a 100 increase in proc rate and only // got an increase of 10 we only accomplished 10% of the desired increase. If this number is // below the threshold, we mark the scaling ineffective. 
double expectedIncrease = lastExpectedProcRate - lastProcRate; double actualIncrease = currentProcRate - lastProcRate; boolean isInEffectiveScaleUp = (actualIncrease / expectedIncrease) < conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD); if (isInEffectiveScaleUp) { var message = String.format(INNEFFECTIVE_MESSAGE_FORMAT, vertex); eventRecorder.triggerEvent( resource, EventRecorder.Type.Normal, EventRecorder.Reason.IneffectiveScaling, EventRecorder.Component.Operator, message); if (conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_DETECTION_ENABLED)) { LOG.info( "Ineffective scaling detected for {}, expected increase {}, actual {}", vertex, expectedIncrease, actualIncrease); return true; } } return false; } {code} > If scaling.effectiveness.detection.enabled is false, the call to the > detectIneffectiveScaleUp() function is unnecessary > --- > > Key: FLINK-31977 > URL: https://issues.apache.org/jira/browse/FLINK-31977 > Project: Flink > Issue Type: Improvement > Components: Autoscaler >Affects Versions: 1.17.0 >Reporter: Tan Kim >Priority: Minor > > The code below is a function to detect inefficient scaleups. > It returns a result if the value of SCALING_EFFECTIVENESS_DETECTION_ENABLED > (scaling.effectiveness.detection.enabled) is true after all the necessary > computations for detection, but this is an unnecessary computation. 
> {code:java} > JobVertexScaler.java #175 > private boolean detectIneffectiveScaleUp( > AbstractFlinkResource resource, > JobVertexID vertex, > Configuration conf, > Map evaluatedMetrics, > ScalingSummary lastSummary) { > double lastProcRate = > lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage(); // > 22569.315633422066 > double lastExpectedProcRate = > > lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent(); // > 37340.0 > var currentProcRate = > evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage(); > // To judge the effectiveness of the scale up operation we compute how > much of the expected > // increase actually happened. For example if we expect a 100 increase in > proc rate and only > // got an increase of 10 we only accomplished 10% of the desired > increase. If this number is > // below the threshold, we mark the scaling ineffective. > double expectedIncrease = lastExpectedProcRate - lastProcRate; > double actualIncrease = currentProcRate - lastProcRate; > boolean withinEffectiveThreshold = > (actualIncrease / expectedIncrease) > >= > conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD); > if (withinEffectiveThreshold) { > return false; > } > var message = String.format(INNEFFECTIVE_MESSAGE_FORMAT, vertex); > eventRecorder.triggerEvent( > resource, > EventRecorder.Type.Normal, > EventRecorder.Reason.IneffectiveScaling, >
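The effectiveness test discussed above reduces to a single ratio comparison: how much of the expected processing-rate increase actually materialized. A hedged restatement of that arithmetic in Python (the Java code quoted in the comment is authoritative; the function name and threshold default here are illustrative):

```python
# Restates the scale-up effectiveness arithmetic from the quoted JobVertexScaler
# code; function name and the 0.1 default threshold are illustrative only.
def is_ineffective_scale_up(last_proc_rate, last_expected_proc_rate,
                            current_proc_rate, threshold=0.1):
    expected_increase = last_expected_proc_rate - last_proc_rate
    actual_increase = current_proc_rate - last_proc_rate
    # Achieving less than `threshold` of the desired increase marks the
    # scale-up as ineffective.
    return (actual_increase / expected_increase) < threshold

# With the rates quoted in the comment (~22569 before, ~37340 expected), a
# post-scaling rate of 24000 achieves under 10% of the expected increase:
print(is_ineffective_scale_up(22569.3, 37340.0, 24000.0))  # True
```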
[GitHub] [flink] dongjoon-hyun commented on pull request #22481: [FLINK-27805][Connectors/ORC] bump orc version to 1.7.8
dongjoon-hyun commented on PR #22481: URL: https://github.com/apache/flink/pull/22481#issuecomment-1535071204 If we need some patches in order to help this PR, Apache ORC community can cut RC2 accordingly to help Apache Flink community.
[GitHub] [flink] dongjoon-hyun commented on pull request #22481: [FLINK-27805][Connectors/ORC] bump orc version to 1.7.8
dongjoon-hyun commented on PR #22481: URL: https://github.com/apache/flink/pull/22481#issuecomment-1535069827 BTW, @pgaref . As you know, @wgtmac started Apache ORC 1.7.9 RC1. Can we test `RC1` in this PR? - https://github.com/apache/orc/issues/1437
[GitHub] [flink] pgaref commented on pull request #22481: [FLINK-27805][Connectors/ORC] bump orc version to 1.7.8
pgaref commented on PR #22481: URL: https://github.com/apache/flink/pull/22481#issuecomment-1535056878 @flinkbot run azure
[GitHub] [flink] reswqa commented on pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
reswqa commented on PR #22447: URL: https://github.com/apache/flink/pull/22447#issuecomment-1535049629 @flinkbot run azure
[jira] [Created] (FLINK-32003) Release 3.0.0-1.16 and 1.16.1 doesn't work with OAuth2
Neng Lu created FLINK-32003: --- Summary: Release 3.0.0-1.16 and 1.16.1 doesn't work with OAuth2 Key: FLINK-32003 URL: https://issues.apache.org/jira/browse/FLINK-32003 Project: Flink Issue Type: Bug Components: Connectors / Pulsar Reporter: Neng Lu The release for 3.0.0-1.16 and 1.16.1 depends on Pulsar client version 2.10.1. There is an issue using OAuth2 with this client version which results in the following error. {code:java} Exception in thread "main" java.lang.RuntimeException: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: https://func-test-31a67160-533f-4a5f-81a8-30b6221f34a9.gcp-shared-gcp-usce1-martin.streamnative.g.snio.cloud:443 at me.nlu.pulsar.PulsarAdminTester.main(PulsarAdminTester.java:56) Caused by: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. 
Failed reason: https://func-test-31a67160-533f-4a5f-81a8-30b6221f34a9.gcp-shared-gcp-usce1-martin.streamnative.g.snio.cloud:443 at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:251) at org.apache.pulsar.client.admin.internal.TopicsImpl$1.failed(TopicsImpl.java:187) at org.apache.pulsar.shade.org.glassfish.jersey.client.JerseyInvocation$1.failed(JerseyInvocation.java:882) at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:247) at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:242) at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime.access$100(ClientRuntime.java:62) at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime$2.lambda$failure$1(ClientRuntime.java:178) at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:292) at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:274) at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:244) at org.apache.pulsar.shade.org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:288) at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime$2.failure(ClientRuntime.java:178) at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.lambda$apply$1(AsyncHttpConnector.java:218) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.lambda$retryOperation$4(AsyncHttpConnector.java:277) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) at org.apache.pulsar.shade.org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:273) at org.apache.pulsar.shade.org.asynchttpclient.netty.channel.NettyConnectListener.onFailure(NettyConnectListener.java:181) at org.apache.pulsar.shade.org.asynchttpclient.netty.channel.NettyConnectListener$1.onFailure(NettyConnectListener.java:151) at org.apache.pulsar.shade.org.asynchttpclient.netty.SimpleFutureListener.operationComplete(SimpleFutureListener.java:26) at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) at
[jira] [Commented] (FLINK-31984) Savepoint on S3 should be relocatable if entropy injection is not effective
[ https://issues.apache.org/jira/browse/FLINK-31984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719402#comment-17719402 ] Mingliang Liu commented on FLINK-31984: --- Thanks for catching this [~mapohl]. Sorry, it was my fault for breaking a public API without a closer look into EntropyInjector. I completely agree with [~xtsong] on preserving (and deprecating?) the old public method while introducing the new one. I will provide a patch shortly for 1.16 and 1.17. > Savepoint on S3 should be relocatable if entropy injection is not effective > --- > > Key: FLINK-31984 > URL: https://issues.apache.org/jira/browse/FLINK-31984 > Project: Flink > Issue Type: Improvement > Components: FileSystems, Runtime / Checkpointing >Affects Versions: 1.16.1 >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.2, 1.18.0, 1.17.1 > > > We have a limitation that savepoints created with injected entropy > are not relocatable > (https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/savepoints/#triggering-savepoints). > FLINK-25952 improves the check by inspecting both the FileSystem extending > {{EntropyInjectingFileSystem}} and > {{FlinkS3FileSystem#getEntropyInjectionKey}} not returning null. We can > improve this further by checking that the checkpoint path is indeed using the > entropy injection key. Without that, the savepoint is not relocatable even if > the {{state.savepoints.dir}} does not contain the entropy. > In our setting, we enable entropy injection by setting {{s3.entropy.key}} to > {{\__ENTROPY_KEY\__}} and use the entropy key in the checkpoint path (e.g. > {{s3://mybucket/checkpoints/__ENTROPY_KEY__/myapp}}). However, in the > savepoint path, we don't use the entropy key (e.g. > {{s3://mybucket/savepoints/myapp}}) because we want the savepoint to be > relocatable. 
But the current logic still generates a non-relocatable savepoint > path just because the entropy injection key is non-null. -- This message was sent by Atlassian Jira (v8.20.10#820010)
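The improvement proposed above — treating a savepoint as non-relocatable only when the configured path actually contains the entropy injection key — can be sketched roughly as follows. This is a hypothetical standalone helper for illustration, not Flink's actual `EntropyInjector` API; the method name and signature are assumptions:

```java
// Hypothetical sketch: a savepoint path should only be considered
// non-relocatable when the configured entropy key actually appears in it.
final class EntropyCheck {

    static boolean isRelocatable(String savepointPath, String entropyKey) {
        // No entropy key configured at all -> savepoint is always relocatable.
        if (entropyKey == null || entropyKey.isEmpty()) {
            return true;
        }
        // Relocatable as long as the path does not embed the entropy key,
        // even when the FileSystem itself supports entropy injection.
        return !savepointPath.contains(entropyKey);
    }

    public static void main(String[] args) {
        // Checkpoint path embeds the key -> not relocatable.
        System.out.println(
                isRelocatable("s3://mybucket/checkpoints/__ENTROPY_KEY__/myapp", "__ENTROPY_KEY__"));
        // Savepoint path without the key -> relocatable.
        System.out.println(
                isRelocatable("s3://mybucket/savepoints/myapp", "__ENTROPY_KEY__"));
    }
}
```

This matches the scenario in the report: the entropy key is non-null cluster-wide, but the savepoint directory never uses it, so checking only for a non-null key is too coarse.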
[GitHub] [flink] XComp commented on pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into DefaultLea
XComp commented on PR #22422: URL: https://github.com/apache/flink/pull/22422#issuecomment-1535006196 Did a final rebase after PR #22517 was merged to `master` to run the test suite with the proper `DirectExecutorService` implementation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (FLINK-31995) DirectExecutorService doesn't follow the ExecutorService contract throwing a RejectedExecutionException in case it's already shut down
[ https://issues.apache.org/jira/browse/FLINK-31995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl resolved FLINK-31995. --- Fix Version/s: 1.18.0 Assignee: Matthias Pohl Resolution: Fixed master: 8d430f51d77cf4f0d3291da6a7333f1aa9a87d22 > DirectExecutorService doesn't follow the ExecutorService contract throwing a > RejectedExecutionException in case it's already shut down > -- > > Key: FLINK-31995 > URL: https://issues.apache.org/jira/browse/FLINK-31995 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.17.0, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > We experienced an issue where we tested behavior using the > {{DirectExecutorService}} with the {{ExecutorService}} being shut down > already. The tests succeeded. The production code failed, though, because we > used a singleThreadExecutor for which task execution failed after the > executor was stopped. We should add this behavior to the > {{DirectExecutorService}} as well to bring the unit tests closer to the > production code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
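The behavioral gap described above can be illustrated with a minimal direct executor sketch (a simplified stand-in, not Flink's actual `DirectExecutorService`): after `shutdown()`, `execute` should throw `RejectedExecutionException`, matching the contract honored by production executors such as `Executors.newSingleThreadExecutor()`:

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Minimal sketch of a "direct" (caller-thread) executor that honors the
// ExecutorService contract: tasks submitted after shutdown are rejected
// instead of silently run. This is the check FLINK-31995 adds; without it,
// unit tests pass while production thread pools reject the task.
final class DirectExecutor implements Executor {

    private volatile boolean shutdown = false;

    public void shutdown() {
        shutdown = true;
    }

    @Override
    public void execute(Runnable task) {
        if (shutdown) {
            throw new RejectedExecutionException("Executor is shut down");
        }
        task.run(); // run synchronously on the caller's thread
    }
}
```

With this check, a test that submits work to an already-stopped executor fails the same way in both the unit-test and production code paths.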
[GitHub] [flink] XComp merged pull request #22517: [FLINK-31995][tests] Adds shutdown check to DirectExecutorService
XComp merged PR #22517: URL: https://github.com/apache/flink/pull/22517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zoltar9264 commented on pull request #22458: [FLINK-31743][statebackend/rocksdb] disable rocksdb log relocating wh…
zoltar9264 commented on PR #22458: URL: https://github.com/apache/flink/pull/22458#issuecomment-1534969569 Thanks for the reminder @fredia, but I think this PR is just a hotfix. I will submit a PR to the frocksdb repository to fix this issue for good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] echauchot commented on pull request #643: Add blog article: Howto test a batch source with the new Source framework
echauchot commented on PR #643: URL: https://github.com/apache/flink-web/pull/643#issuecomment-1534918324 @MartijnVisser @zentol you did almost all of my last reviews. Would you have time to review this small article, since you know the source framework? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] echauchot commented on pull request #631: Add blog article: Howto migrate a real-life batch pipeline from the DataSet API to the DataStream API
echauchot commented on PR #631: URL: https://github.com/apache/flink-web/pull/631#issuecomment-1534879847 @zentol how does this article look after the last modifications? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
[ https://issues.apache.org/jira/browse/FLINK-31974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719355#comment-17719355 ] Thomas Weise commented on FLINK-31974: -- There are many cases where errors are transient. This specific case is actually quite obvious: resource availability on a large cluster changes constantly, and a pod that cannot be scheduled now may be schedulable a few seconds later. Other k8s-related issues can also be transient; for example, a request that failed due to rate limiting will likely succeed soon after, and we would actually make things worse by letting the job fail instead of following a backoff/retry strategy. I'm also leaning towards a retry-by-default strategy, identifying the specific cases that should be treated as fatal errors. > JobManager crashes after KubernetesClientException exception with > FatalExitExceptionHandler > --- > > Key: FLINK-31974 > URL: https://issues.apache.org/jira/browse/FLINK-31974 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.17.0 >Reporter: Sergio Sainz >Assignee: Weijie Guo >Priority: Major > > When resource quota limit is reached JobManager will throw > > org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: > Failure executing: POST at: > https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. 
> > In {*}1.16.1 , this is handled gracefully{*}: > {code} > 2023-04-28 22:07:24,631 WARN > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Failed requesting worker with resource spec WorkerResourceSpec > \{cpuCores=1.0, taskHeapSize=25.600mb (26843542 bytes), taskOffHeapSize=0 > bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=230.400mb > (241591914 bytes), numSlots=4}, current pending count: 0 > java.util.concurrent.CompletionException: > io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: > POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-138" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. > at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown > Source) ~[?:?] > at java.util.concurrent.CompletableFuture.completeThrowable(Unknown > Source) ~[?:?] > at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown > Source) ~[?:?] > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > ~[?:?] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > ~[?:?] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure > executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. > Message: Forbidden!Configured service account doesn't have access. Service > account may have been revoked. pods > "my-namespace-flink-cluster-taskmanager-1-138" is forbidden: exceeded quota: > my-namespace-resource-quota, requested: limits.cpu=3, used: > limits.cpu=12100m, limited: limits.cpu=13. 
> at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:613) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) > ~[flink-dist-1.16.1.jar:1.16.1] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) > ~[flink-dist-1.16.1.jar:1.16.1] >
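The retry-by-default strategy discussed in this thread — backing off on transient Kubernetes errors such as temporary quota exhaustion instead of failing fatally — could look roughly like the sketch below. This is a hypothetical illustration of the general technique, not the actual Flink resource-manager code; the class and method names are assumptions:

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of retry-with-exponential-backoff for transient
// errors (e.g. quota temporarily exceeded, API-server rate limiting).
// A real implementation would also classify errors as transient vs. fatal.
final class Retry {

    static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // assume transient; retry after a growing delay
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff between attempts
                }
            }
        }
        throw last; // give up only after exhausting all retries
    }
}
```

For a pod-creation request rejected by a resource quota, a few backed-off retries often succeed once other pods release CPU, which is the behavior the commenters argue should be the default rather than a fatal exit.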
[GitHub] [flink] dmvk commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
dmvk commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185085549 ## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java: ## @@ -778,7 +778,7 @@ private static PartitionInfo createFinishedPartitionInfo( */ @Override public void fail(Throwable t) { -processFail(t, true); +processFail(t, true, Collections.emptyMap()); Review Comment: Would this be enriched once the https://github.com/apache/flink/pull/22506#discussion_r1185081996 is addressed, since `fromSchedulerNg` will be set to `false`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] LadyForest commented on a diff in pull request #22515: [FLINK-31952][table] Support 'EXPLAIN' statement for CompiledPlan.
LadyForest commented on code in PR #22515: URL: https://github.com/apache/flink/pull/22515#discussion_r1184990962 ## flink-table/flink-sql-parser/src/test/java/org/apache/flink/sql/parser/FlinkSqlParserImplTest.java: ## @@ -2072,6 +2072,13 @@ void testExplainPlanFor() { this.sql(sql).ok(expected); } +@Test +void testExplainForJsonFile() { +String sql = "explain plan for 'file://path'"; +String expected = "EXPLAIN 'file://path'"; Review Comment: Please add more cases to cover `ExplainDetail`s, e.g. ```java this.sql("explain changelog_mode '/mydir/plan.json'").ok("EXPLAIN CHANGELOG_MODE '/mydir/plan.json'"); ``` ## flink-table/flink-sql-parser/src/test/java/org/apache/flink/sql/parser/FlinkSqlParserImplTest.java: ## @@ -2072,6 +2072,13 @@ void testExplainPlanFor() { this.sql(sql).ok(expected); } +@Test +void testExplainForJsonFile() { Review Comment: nit: `testExplainCompiledPlan` ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java: ## @@ -915,6 +916,16 @@ public TableResultInternal executeInternal(Operation operation) { return executeInternal(((StatementSetOperation) operation).getOperations()); } else if (operation instanceof ExplainOperation) { ExplainOperation explainOperation = (ExplainOperation) operation; +if (explainOperation instanceof ExplainFileOperation) { +ExplainFileOperation explainfileOperation = (ExplainFileOperation) explainOperation; +CompiledPlan cp = + loadPlan(PlanReference.fromFile(explainfileOperation.getFilePath())); Review Comment: Why just load the plan and return it? I think we should call `compiledPlan#explain(...)` ## flink-table/flink-sql-parser/src/main/java/org/apache/flink/sql/parser/dql/SqlRichExplain.java: ## @@ -58,6 +60,10 @@ public SqlNode getStatement() { return statement; } +public String getPlanFile() { Review Comment: It might be weird to add this method here. 
`SqlRichExplain` is a generalized representation of the syntax, while `getPlanFile` is only suitable for `EXPLAIN 'plan.json'`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dmvk commented on a diff in pull request #22506: [FLINK-31890][runtime] Introduce JobMaster per-task failure enrichment/labeling
dmvk commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1185081996 ## flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java: ## @@ -473,26 +500,50 @@ public CompletableFuture cancel(Time timeout) { @Override public CompletableFuture updateTaskExecutionState( final TaskExecutionState taskExecutionState) { -FlinkException taskExecutionException; +checkNotNull(taskExecutionState, "taskExecutionState"); +// Use the main/caller thread for all updates to make sure they are processed in order. +// (MainThreadExecutor i.e., the akka thread pool does not guarantee that) +// Only detach for a FAILED state update that is terminal and may perform io heavy labeling. +if (ExecutionState.FAILED.equals(taskExecutionState.getExecutionState())) { +return labelFailure(taskExecutionState) +.thenApplyAsync( +taskStateWithLabels -> { +try { +return doUpdateTaskExecutionState(taskStateWithLabels); +} catch (FlinkException e) { +throw new CompletionException(e); +} +}, +getMainThreadExecutor()); +} try { -checkNotNull(taskExecutionState, "taskExecutionState"); +return CompletableFuture.completedFuture( +doUpdateTaskExecutionState(taskExecutionState)); +} catch (FlinkException e) { +return FutureUtils.completedExceptionally(e); +} +} +private Acknowledge doUpdateTaskExecutionState(final TaskExecutionState taskExecutionState) +throws FlinkException { +@Nullable FlinkException taskExecutionException; +try { if (schedulerNG.updateTaskExecutionState(taskExecutionState)) { Review Comment: That's a good catch; the most straightforward way to address this would be to change the listener to go through JobMaster#updateTaskExecutionState instead so we have a shared (semi-asynchronous) code path for all execution state transitions. `ExecutionDeploymentReconciliationHandler#onMissingDeploymentsOf` seems to have the same problem (and the fix could most likely be along the same lines). WDYT? 
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-31970) "Key group 0 is not in KeyGroupRange" when using CheckpointedFunction
[ https://issues.apache.org/jira/browse/FLINK-31970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719336#comment-17719336 ] Piotr Nowojski edited comment on FLINK-31970 at 5/4/23 1:53 PM: [~YordanPavlov] I was not able to reproduce your error; however, you are working with the state incorrectly in your {{TestWindow}} function. Please take a look at the [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state], especially the following paragraph: {quote} It is important to keep in mind that these state objects are only used for interfacing with state. The state is not necessarily stored inside but might reside on disk or somewhere else. The second thing to keep in mind is that the value you get from the state depends on the key of the input element. So the value you get in one invocation of your user function can differ from the value in another invocation if the keys involved are different. {quote} The first problem is that your {{TestWindow#snapshotState}} method is trying to access the {{state}} field, in a keyed operator/function, inside a method that doesn't have key context. That's the error I was getting when I tried to run your code. Secondly, your {{TestWindow#apply}} method is invoked within some key context, but you are storing the actual {{count}} as a regular Java field. So if you ever had more than one key (which is not the case in your example, as you are using the {{.keyBy(_ => 1)}} key selector), all of the keys processed by a single parallel instance of the window operator would be collected on the same {{count}} field, which doesn't make any sense. 
The corrected window function should look like this: {code:scala} class CorrectTestWindow() extends WindowFunction[(Long, Int), Long, Int, TimeWindow] with CheckpointedFunction { var state: ValueState[Long] = _ override def snapshotState(functionSnapshotContext: FunctionSnapshotContext): Unit = { } override def initializeState(context: FunctionInitializationContext): Unit = { val storeDescriptor = new ValueStateDescriptor[Long]("state-xrp-dex-pricer", createTypeInformation[Long]) state = context.getKeyedStateStore.getState(storeDescriptor) } override def apply(key: Int, window: TimeWindow, input: lang.Iterable[(Long, Int)], out: Collector[Long]): Unit = { val count = state.value() + input.size state.update(count) out.collect(count) } } {code} However, even that is kind of strange. You are creating a tumbling window every 1ms, only to aggregate the {{count}} across all past tumbling windows for the given key? If you want to aggregate the count only across the given tumbling window, your apply could be as simple as: {code:scala} override def apply(key: Int, window: TimeWindow, input: lang.Iterable[(Long, Int)], out: Collector[Long]): Unit = { out.collect(input.size) } {code} And you don't need any additional state for that. On the other hand, if you want to aggregate results globally, you can use {code:scala} .keyBy(_ => 1) .window(GlobalWindows.create()) .trigger(/* maybe a custom trigger here */) .apply(new TestWindow()) {code}
[jira] [Commented] (FLINK-31970) "Key group 0 is not in KeyGroupRange" when using CheckpointedFunction
[ https://issues.apache.org/jira/browse/FLINK-31970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719336#comment-17719336 ] Piotr Nowojski commented on FLINK-31970: [~YordanPavlov] I was not able to reproduce your error; however, you are working with the state incorrectly in your {{TestWindow}} function. Please take a look at the [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state], especially the following paragraph: {quote} It is important to keep in mind that these state objects are only used for interfacing with state. The state is not necessarily stored inside but might reside on disk or somewhere else. The second thing to keep in mind is that the value you get from the state depends on the key of the input element. So the value you get in one invocation of your user function can differ from the value in another invocation if the keys involved are different. {quote} The first problem is that your {{TestWindow#snapshotState}} method is trying to access the {{state}} field, in a keyed operator/function, inside a method that doesn't have key context. That's the error I was getting when I tried to run your code. Secondly, your {{TestWindow#apply}} method is invoked within some key context, but you are storing the actual {{count}} as a regular Java field. So if you ever had more than one key (which is not the case in your example, as you are using the {{.keyBy(_ => 1)}} key selector), all of the keys processed by a single parallel instance of the window operator would be collected on the same {{count}} field, which doesn't make any sense. 
The corrected window function should look like this: {code:scala} class CorrectTestWindow() extends WindowFunction[(Long, Int), Long, Int, TimeWindow] with CheckpointedFunction { var state: ValueState[Long] = _ override def snapshotState(functionSnapshotContext: FunctionSnapshotContext): Unit = { } override def initializeState(context: FunctionInitializationContext): Unit = { val storeDescriptor = new ValueStateDescriptor[Long]("state-xrp-dex-pricer", createTypeInformation[Long]) state = context.getKeyedStateStore.getState(storeDescriptor) } override def apply(key: Int, window: TimeWindow, input: lang.Iterable[(Long, Int)], out: Collector[Long]): Unit = { val count = state.value() + input.size state.update(count) out.collect(count) } } {code} However, even that is kind of strange. You are creating a tumbling window every 1ms, only to aggregate the {{count}} across all past tumbling windows for the given key? If you want to aggregate the count only across the given tumbling window, your apply could be as simple as: {code:scala} override def apply(key: Int, window: TimeWindow, input: lang.Iterable[(Long, Int)], out: Collector[Long]): Unit = { out.collect(input.size) } {code} And you don't need any additional state for that. 
On the other hand, if you want to aggregate results globally, you can use {code:scala} .keyBy(_ => 1) .window(GlobalWindows.create()) .trigger(/* maybe a custom trigger here */) .apply(new TestWindow()) {code} > "Key group 0 is not in KeyGroupRange" when using CheckpointedFunction > - > > Key: FLINK-31970 > URL: https://issues.apache.org/jira/browse/FLINK-31970 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.17.0 >Reporter: Yordan Pavlov >Priority: Major > Attachments: fill-topic.sh, main.scala > > > I am experiencing a problem where the following exception would be thrown on > Flink stop (stop with savepoint): > > {code:java} > org.apache.flink.util.SerializedThrowable: > java.lang.IllegalArgumentException: Key group 0 is not in > KeyGroupRange{startKeyGroup=86, endKeyGroup=127}.{code} > > I do not have a non deterministic keyBy() operator in fact, I use > {code:java} > .keyBy(_ => 1){code} > I believe the problem is related to using RocksDB state along with a > {code:java} > CheckpointedFunction{code} > In my test program I have commented out a reduction of the parallelism which > would make the problem go away. I am attaching a standalone program which > presents the problem and also a script which generates the input data. For > clarity I would paste here the essence of the job: > > > {code:scala} > env.fromSource(kafkaSource, watermarkStrategy, "KafkaSource") > .setParallelism(3) > .keyBy(_ => 1) > .window(TumblingEventTimeWindows.of(Time.of(1, TimeUnit.MILLISECONDS))) > .apply(new TestWindow()) > /* .setParallelism(1) this would prevent the problem */ > .uid("window tester") > .name("window tester") > .print() > class TestWindow() extends WindowFunction[(Long, Int), Long, Int,
[jira] [Assigned] (FLINK-31867) Enforce a minimum number of observations within a metric window
[ https://issues.apache.org/jira/browse/FLINK-31867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels reassigned FLINK-31867: -- Assignee: Maximilian Michels > Enforce a minimum number of observations within a metric window > --- > > Key: FLINK-31867 > URL: https://issues.apache.org/jira/browse/FLINK-31867 > Project: Flink > Issue Type: Improvement > Components: Autoscaler, Kubernetes Operator >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: kubernetes-operator-1.5.0 > > > The metric window is currently only time-based. We should make sure we see a > minimum number of observations to ensure we don't decide based on too few > observations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32002) Revisit autoscaler defaults for next release
Maximilian Michels created FLINK-32002: -- Summary: Revisit autoscaler defaults for next release Key: FLINK-32002 URL: https://issues.apache.org/jira/browse/FLINK-32002 Project: Flink Issue Type: Improvement Components: Autoscaler, Kubernetes Operator Reporter: Maximilian Michels Assignee: Maximilian Michels Fix For: kubernetes-operator-1.5.0 We haven't put much thought into the defaults. We should revisit and adjust the defaults to fit most use cases without being overly aggressive. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31867) Enforce a minimum number of observations within a metric window
[ https://issues.apache.org/jira/browse/FLINK-31867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated FLINK-31867: --- Fix Version/s: kubernetes-operator-1.5.0 > Enforce a minimum number of observations within a metric window > --- > > Key: FLINK-31867 > URL: https://issues.apache.org/jira/browse/FLINK-31867 > Project: Flink > Issue Type: Improvement > Components: Autoscaler, Kubernetes Operator >Reporter: Maximilian Michels >Priority: Major > Fix For: kubernetes-operator-1.5.0 > > > The metric window is currently only time-based. We should make sure we see a > minimum number of observations to ensure we don't decide based on too few > observations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
[ https://issues.apache.org/jira/browse/FLINK-31974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719331#comment-17719331 ] Xintong Song commented on FLINK-31974: -- [~gyfora], bq. Flink treats only very few errors fatal. IO errors, connector (source/sink ) errors etc all cause job restarts and in many cases "Flink cannot recover from by itself". You actually expect the error to be temporary and hopefully not get it after the restart. So I think it would be generally inconsistent with the current error handling behaviour if resource manager errors would simply let the job die fatally and not retry in the same way. I think the difference here is that IO errors and connector errors affect the job but not the Flink cluster / deployment. Thinking of a session cluster, we should not fail the cluster for an error from a single job. But for the resource manager interacting with the Kubernetes API server, this is cluster behavior, and conceptually we don't distinguish resources for individual jobs until the slots are allocated. Moreover, it's possible that multiple jobs share the same resource (pod). One could argue that in application mode the cluster / deployment is equivalent to the job. However, the cluster mode (session / application) is transparent to the resource manager. bq. Flink jobs/clusters should be resilient and keep retrying in case of errors and should not give up especially for streaming workloads. This is different from the feedback that I get from our production. But I can understand if that's what some of the users want. So I guess it's worth a configuration option, as you suggested. [~mbalassi], +1 to what you said about the specific case. I think there's consensus that reaching the quota limit should not be treated as a fatal error. 
> JobManager crashes after KubernetesClientException exception with > FatalExitExceptionHandler > --- > > Key: FLINK-31974 > URL: https://issues.apache.org/jira/browse/FLINK-31974 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.17.0 >Reporter: Sergio Sainz >Assignee: Weijie Guo >Priority: Major > > When resource quota limit is reached JobManager will throw > > org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: > Failure executing: POST at: > https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. > > In {*}1.16.1 , this is handled gracefully{*}: > {code} > 2023-04-28 22:07:24,631 WARN > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Failed requesting worker with resource spec WorkerResourceSpec > \{cpuCores=1.0, taskHeapSize=25.600mb (26843542 bytes), taskOffHeapSize=0 > bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=230.400mb > (241591914 bytes), numSlots=4}, current pending count: 0 > java.util.concurrent.CompletionException: > io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: > POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: > Forbidden!Configured service account doesn't have access. Service account may > have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-138" is > forbidden: exceeded quota: my-namespace-resource-quota, requested: > limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13. > at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown > Source) ~[?:?] > at java.util.concurrent.CompletableFuture.completeThrowable(Unknown > Source) ~[?:?] 
> at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown > Source) ~[?:?] > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > ~[?:?] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > ~[?:?] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure > executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. > Message: Forbidden!Configured service account doesn't have access. Service > account may have been revoked. pods > "my-namespace-flink-cluster-taskmanager-1-138" is forbidden: exceeded quota: > my-namespace-resource-quota, requested: limits.cpu=3, used: > limits.cpu=12100m, limited: limits.cpu=13. > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) >
[jira] [Closed] (FLINK-31135) ConfigMap DataSize went > 1 MB and cluster stopped working
[ https://issues.apache.org/jira/browse/FLINK-31135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Ganesh closed FLINK-31135. - Resolution: Not A Bug It was an intermittent issue. > ConfigMap DataSize went > 1 MB and cluster stopped working > -- > > Key: FLINK-31135 > URL: https://issues.apache.org/jira/browse/FLINK-31135 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.2.0 >Reporter: Sriram Ganesh >Priority: Major > Attachments: dump_cm.yaml, > flink--kubernetes-application-0-parked-logs-ingestion-644b80-b4bc58747-lc865.log.zip, > image-2023-04-19-09-48-19-089.png, image-2023-05-03-13-47-51-440.png, > image-2023-05-03-13-50-54-783.png, image-2023-05-03-13-51-21-685.png, > jobmanager_log.txt, > parked-logs-ingestion-644b80-3494e4c01b82eb7a75a76080974b41cd-config-map.yaml > > > I am using the Flink Operator to manage clusters. Flink version: 1.15.2. Flink jobs > failed with the below error. It seems the config map size went beyond 1 MB > (the default limit). > Since it is managed by the operator and config maps are not updated with any > manual intervention, I suspect it could be an operator issue. > > {code:java} > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure > executing: PUT at: > https:///api/v1/namespaces//configmaps/-config-map. Message: > ConfigMap "-config-map" is invalid: []: Too long: must have at most > 1048576 bytes. 
Received status: Status(apiVersion=v1, code=422, > details=StatusDetails(causes=[StatusCause(field=[], message=Too long: must > have at most 1048576 bytes, reason=FieldValueTooLong, > additionalProperties={})], group=null, kind=ConfigMap, name=-config-map, > retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, > message=ConfigMap "-config-map" is invalid: []: Too long: must have at > most 1048576 bytes, metadata=ListMeta(_continue=null, > remainingItemCount=null, resourceVersion=null, selfLink=null, > additionalProperties={}), reason=Invalid, status=Failure, > additionalProperties={}). > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:673) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:560) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleUpdate(OperationSupport.java:347) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleUpdate(OperationSupport.java:327) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleUpdate(BaseOperation.java:781) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:183) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:188) > ~[flink-dist-1.15.2.jar:1.15.2] > at > io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:130) > ~[flink-dist-1.15.2.jar:1.15.2] > 
at > io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:41) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$attemptCheckAndUpdateConfigMap$11(Fabric8FlinkKubeClient.java:325) > ~[flink-dist-1.15.2.jar:1.15.2] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) > ~[?:?] > ... 3 more {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
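The 1048576-byte cap in the error above is the Kubernetes API server's hard limit on object size. A minimal sketch of checking a ConfigMap's data section against that limit — a hypothetical helper with invented names, not part of Flink, fabric8, or the operator:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper: estimates whether a ConfigMap's data section would
 *  exceed the API server's 1 MiB object size limit (the 1048576-byte cap
 *  from the error above). Illustration only. */
public class ConfigMapSizeCheck {
    static final int MAX_CONFIG_MAP_BYTES = 1048576; // 1 MiB, enforced server-side

    /** Sums the UTF-8 byte length of all keys and values in the data map. */
    static long approximateDataSize(Map<String, String> data) {
        long total = 0;
        for (Map.Entry<String, String> e : data.entrySet()) {
            total += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    static boolean wouldBeRejected(Map<String, String> data) {
        return approximateDataSize(data) > MAX_CONFIG_MAP_BYTES;
    }

    public static void main(String[] args) {
        Map<String, String> small = Map.of("checkpointID-1", "s3://bucket/chk-1");
        System.out.println("rejected? " + wouldBeRejected(small)); // false
    }
}
```

Monitoring HA ConfigMap sizes with a check like this can surface the problem before the PUT is rejected with the 422 above.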
[jira] [Commented] (FLINK-31135) ConfigMap DataSize went > 1 MB and cluster stopped working
[ https://issues.apache.org/jira/browse/FLINK-31135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719329#comment-17719329 ] Sriram Ganesh commented on FLINK-31135: --- In my case, it is not a permission issue. I couldn't reproduce the issue. My hunch is that there could have been a network issue at that time. Thanks, [~Swathi Chandrashekar] [~zhihaochen]. I am closing it. > ConfigMap DataSize went > 1 MB and cluster stopped working > -- > > Key: FLINK-31135 > URL: https://issues.apache.org/jira/browse/FLINK-31135 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.2.0 >Reporter: Sriram Ganesh >Priority: Major > Attachments: dump_cm.yaml, > flink--kubernetes-application-0-parked-logs-ingestion-644b80-b4bc58747-lc865.log.zip, > image-2023-04-19-09-48-19-089.png, image-2023-05-03-13-47-51-440.png, > image-2023-05-03-13-50-54-783.png, image-2023-05-03-13-51-21-685.png, > jobmanager_log.txt, > parked-logs-ingestion-644b80-3494e4c01b82eb7a75a76080974b41cd-config-map.yaml > > > I am using the Flink Operator to manage clusters. Flink version: 1.15.2. Flink jobs > failed with the below error. It seems the config map size went beyond 1 MB > (the default limit). > Since it is managed by the operator and config maps are not updated with any > manual intervention, I suspect it could be an operator issue. > > {code:java} > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure > executing: PUT at: > https:///api/v1/namespaces//configmaps/-config-map. Message: > ConfigMap "-config-map" is invalid: []: Too long: must have at most > 1048576 bytes. 
> [status payload and stack trace identical to those quoted in the closing message above] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] reswqa commented on pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
reswqa commented on PR #22447: URL: https://github.com/apache/flink/pull/22447#issuecomment-1534797190 Squashed the fix-up commit, let's wait for the CI to turn green. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] reswqa commented on a diff in pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
reswqa commented on code in PR #22447: URL: https://github.com/apache/flink/pull/22447#discussion_r1185020735 ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java: ## @@ -460,17 +454,25 @@ private boolean requestMemorySegmentFromGlobal() { private MemorySegment requestOverdraftMemorySegmentFromGlobal() { assert Thread.holdsLock(availableMemorySegments); -if (numberOfRequestedOverdraftMemorySegments >= maxOverdraftBuffersPerGate) { +// if overdraft buffers(i.e. buffers exceeding poolSize) is greater than or equal to +// maxOverdraftBuffersPerGate, no new buffer can be requested. +if (numberOfRequestedMemorySegments - currentPoolSize >= maxOverdraftBuffersPerGate) { return null; } checkState( !isDestroyed, "Destroyed buffer pools should never acquire segments - this will lead to buffer leaks."); Review Comment: Ah, good catch! I somehow missed this PR updated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] akalash commented on a diff in pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
akalash commented on code in PR #22447: URL: https://github.com/apache/flink/pull/22447#discussion_r118500 ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java: ## @@ -460,17 +454,25 @@ private boolean requestMemorySegmentFromGlobal() { private MemorySegment requestOverdraftMemorySegmentFromGlobal() { assert Thread.holdsLock(availableMemorySegments); -if (numberOfRequestedOverdraftMemorySegments >= maxOverdraftBuffersPerGate) { +// if overdraft buffers(i.e. buffers exceeding poolSize) is greater than or equal to +// maxOverdraftBuffersPerGate, no new buffer can be requested. +if (numberOfRequestedMemorySegments - currentPoolSize >= maxOverdraftBuffersPerGate) { return null; } checkState( !isDestroyed, "Destroyed buffer pools should never acquire segments - this will lead to buffer leaks."); Review Comment: I think you can move `checkState` to your newly created method `requestPooledMemorySegment` as well -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
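The condition being reviewed in this diff can be illustrated in isolation. A minimal sketch of the arithmetic of the new check, assuming simplified integer state — this is a stand-in, not the real `LocalBufferPool`:

```java
/** Sketch of the overdraft check from the diff above: with the separate
 *  numberOfRequestedOverdraftMemorySegments counter removed, a buffer
 *  counts as "overdraft" exactly when the total number of requested
 *  segments exceeds the current pool size. Simplified illustration only. */
public class OverdraftCheck {
    /** Returns true if one more overdraft buffer may be requested. */
    static boolean canRequestOverdraftBuffer(
            int numberOfRequestedMemorySegments,
            int currentPoolSize,
            int maxOverdraftBuffersPerGate) {
        // buffers exceeding poolSize are the overdraft buffers
        int overdraftBuffers = numberOfRequestedMemorySegments - currentPoolSize;
        return overdraftBuffers < maxOverdraftBuffersPerGate;
    }

    public static void main(String[] args) {
        // pool size 8, 10 segments out: both overdraft slots (max 2) are used
        System.out.println(canRequestOverdraftBuffer(10, 8, 2)); // false
        // pool size 8, 9 segments out: one overdraft slot still free
        System.out.println(canRequestOverdraftBuffer(9, 8, 2)); // true
    }
}
```

Deriving the overdraft count from `numberOfRequestedMemorySegments - currentPoolSize` avoids keeping a second counter in sync, which is the point of the PR.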
[jira] [Created] (FLINK-32001) SupportsRowLevelUpdate does not support returning only a part of the columns.
Ming Li created FLINK-32001: --- Summary: SupportsRowLevelUpdate does not support returning only a part of the columns. Key: FLINK-32001 URL: https://issues.apache.org/jira/browse/FLINK-32001 Project: Flink Issue Type: Improvement Components: Table SQL / Runtime Affects Versions: 1.17.0 Reporter: Ming Li [FLIP-282|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061] introduces the new Delete and Update API in Flink SQL. The documentation states that in the case of {{partial-update}} we only need to return the primary key columns and the updated columns. In fact, however, the topology of the job is {{{}source -> cal -> constraintEnforcer -> sink{}}}, and the constraint check is performed in the {{{}constraintEnforcer{}}} operator according to index, not according to column. If only some columns are returned, the constraint check is wrong, and it easily produces an {{{}ArrayIndexOutOfBoundsException{}}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
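The index-vs-column mismatch described in this issue can be sketched in isolation. A hypothetical simplification with invented names — not the actual `ConstraintEnforcer` code — where a NOT NULL check addresses fields by position in the full schema, so a partial row breaks:

```java
/** Sketch of the FLINK-32001 failure mode: an index-based constraint check
 *  written against the full schema either checks the wrong column or walks
 *  off the end of a partial row. Hypothetical illustration only. */
public class PartialUpdateIndexMismatch {
    /** Index-based check: notNullFieldIndexes refer to positions in the
     *  complete row, not to column names. */
    static void enforceNotNull(Object[] row, int[] notNullFieldIndexes) {
        for (int idx : notNullFieldIndexes) {
            // throws ArrayIndexOutOfBoundsException if row is a partial projection
            if (row[idx] == null) {
                throw new IllegalStateException("Column at index " + idx + " is null");
            }
        }
    }

    public static void main(String[] args) {
        // full schema: (id, name, age, addr); NOT NULL constraint on index 3 (addr)
        int[] notNull = {3};
        Object[] fullRow = {1, "alice", 20, "some-addr"};
        enforceNotNull(fullRow, notNull); // passes

        // partial update returns only (id, addr): 2 fields, but idx 3 is checked
        Object[] partialRow = {1, "some-addr"};
        try {
            enforceNotNull(partialRow, notNull);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index-based check broke on partial row");
        }
    }
}
```

A column-name-based check (or remapping the indexes to the projected schema) would avoid both the wrong-column check and the exception.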
[GitHub] [flink] reswqa commented on pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
reswqa commented on PR #22447: URL: https://github.com/apache/flink/pull/22447#issuecomment-1534754347 Thanks @akalash for carefully confirming this change! In the fix-up commit, I have extracted the logic for requesting a buffer from the global pool and increasing `numberOfRequestedMemorySegments` into a common method. PTAL~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-31560) Savepoint failing to complete with ExternallyInducedSources
[ https://issues.apache.org/jira/browse/FLINK-31560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719314#comment-17719314 ] Brian Zhou commented on FLINK-31560: [~Yanfei Lei] I think it's fair for the community not to fix an API that is planned for deprecation. This issue only comes from the legacy source, and having discussed with the team on the Pravega side, since we are also transitioning to the FLIP-27 source, we will have a way out as well. > Savepoint failing to complete with ExternallyInducedSources > --- > > Key: FLINK-31560 > URL: https://issues.apache.org/jira/browse/FLINK-31560 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.0 >Reporter: Fan Yang >Assignee: Yanfei Lei >Priority: Major > Attachments: image-2023-03-23-18-03-05-943.png, > image-2023-03-23-18-19-24-482.png, jobmanager_log.txt, > taskmanager_172.28.17.19_6123-f2dbff_log, > tmp_tm_172.28.17.19_6123-f2dbff_tmp_job_83ad4f408d0e7bf30f940ddfa5fe00e3_op_WindowOperator_137df028a798f504a6900a4081c9990c__1_1__uuid_edc681f0-3825-45ce-a123-9ff69ce6d8f1_db_LOG > > > Flink version: 1.16.0 > > We are using Flink to run some streaming applications with Pravega as source > and use window and reduce transformations. We use the RocksDB state backend with > incremental checkpointing enabled. We don't enable the latest changelog state > backend. > When we try to stop the job, we encounter issues with the savepoint failing > to complete for the job. This happens most of the time. On rare occasions, > the job gets canceled suddenly with its savepoint completed successfully. > Savepointing shows the error below: > > {code:java} > 2023-03-22 08:55:57,521 [jobmanager-io-thread-1] WARN > org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to > trigger or complete checkpoint 189 for job 7354442cd6f7c121249360680c04284d. > (0 consecutive failed attempts so > far)org.apache.flink.runtime.checkpoint.CheckpointException: Failure to > finalize checkpoint. 
at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.finalizeCheckpoint(CheckpointCoordinator.java:1375) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.completePendingCheckpoint(CheckpointCoordinator.java:1265) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveAcknowledgeMessage(CheckpointCoordinator.java:1157) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$acknowledgeCheckpoint$1(ExecutionGraphHandler.java:89) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119) > ~[flink-dist-1.16.0.jar:1.16.0] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [?:?] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 
> Caused by: java.io.IOException: Unknown implementation of StreamStateHandle: > class org.apache.flink.runtime.state.PlaceholderStreamStateHandle at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeStreamStateHandle(MetadataV2V3SerializerBase.java:699) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeStreamStateHandleMap(MetadataV2V3SerializerBase.java:813) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeKeyedStateHandle(MetadataV2V3SerializerBase.java:344) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeKeyedStateCol(MetadataV2V3SerializerBase.java:269) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeSubtaskState(MetadataV2V3SerializerBase.java:262) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV3Serializer.serializeSubtaskState(MetadataV3Serializer.java:142) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV3Serializer.serializeOperatorState(MetadataV3Serializer.java:122) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV2V3SerializerBase.serializeMetadata(MetadataV2V3SerializerBase.java:146) > ~[flink-dist-1.16.0.jar:1.16.0] at > org.apache.flink.runtime.checkpoint.metadata.MetadataV3Serializer.serialize(MetadataV3Serializer.java:83) > ~[flink-dist-1.16.0.jar:1.16.0] at >
[GitHub] [flink-connector-jdbc] reswqa commented on a diff in pull request #45: [hotfix] Fix incorrect sql_url in jdbc.yml.
reswqa commented on code in PR #45: URL: https://github.com/apache/flink-connector-jdbc/pull/45#discussion_r1184966872 ## docs/data/jdbc.yml: ## @@ -18,4 +18,4 @@ variants: - maven: flink-connector-jdbc -sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-jdbc/$full_version/flink-sql-connector-jdbc-$full_version.jar +sql_url: https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/$full_version/flink-connector-jdbc-$full_version.jar Review Comment: IIRC, `v3.1.0` should not have been released yet. So, I guess you mean that the version of the jdbc connector that supports `flink-1.17` will be `v3.1.0`. At present, the `release-1.17` branch of flink tracks the `v3.0` branch of the `flink-connector-jdbc` repository. https://github.com/apache/flink-connector-jdbc/blob/b07bddfc8d741e203d3d9b6eeefe5113ecfb4acc/docs/content/docs/connectors/table/jdbc.md?plain=1#L41 This line in the `v3.0` branch indicates that the connector version is `v3.0.0`, which will cause `$full_version` to be replaced with `3.0.0-1.17`. If we want to release `v3.1.0`, modifying the version number here should be a necessary part of the release process. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
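The `$full_version` substitution under discussion can be sketched as follows. This assumes the docs build expands `$full_version` to `<connector version>-<flink minor version>` (e.g. `3.0.0-1.17`); it is a simplified re-implementation for illustration, not the actual docs templating:

```java
/** Sketch of the sql_url templating discussed above: the docs build fills
 *  $full_version with "<connector version>-<flink minor version>".
 *  Hypothetical re-implementation, for illustration only. */
public class SqlUrlTemplate {
    static String expand(String template, String connectorVersion, String flinkMinor) {
        String fullVersion = connectorVersion + "-" + flinkMinor;
        return template.replace("$full_version", fullVersion);
    }

    public static void main(String[] args) {
        // the corrected sql_url from the diff, with the v3.0-branch version number
        String url = expand(
            "https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/"
                + "$full_version/flink-connector-jdbc-$full_version.jar",
            "3.0.0", "1.17");
        System.out.println(url); // ends with flink-connector-jdbc-3.0.0-1.17.jar
    }
}
```

This is why a `v3.1.0` release must also bump the version number in the docs branch: otherwise the substituted URL keeps pointing at the `3.0.0-1.17` artifact.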
[jira] [Commented] (FLINK-31984) Savepoint on S3 should be relocatable if entropy injection is not effective
[ https://issues.apache.org/jira/browse/FLINK-31984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719309#comment-17719309 ] Xintong Song commented on FLINK-31984: -- Thanks for fixing it, [~mapohl]. And sorry for causing this trouble. I'm a bit surprised that {{EntropyInjector}} is an API class. Now I see two options for moving forward: 1) we merge this only for 1.18, or 2) we try to fix this for 1.16 & 1.17 in a compatible way (preserving the old and likely unused public method). Given that this is an improvement, I'm leaning towards 1). [~liuml07] WDYT? > Savepoint on S3 should be relocatable if entropy injection is not effective > --- > > Key: FLINK-31984 > URL: https://issues.apache.org/jira/browse/FLINK-31984 > Project: Flink > Issue Type: Improvement > Components: FileSystems, Runtime / Checkpointing >Affects Versions: 1.16.1 >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.2, 1.18.0, 1.17.1 > > > We have a limitation that if we create savepoints with an injected entropy, > they are not relocatable > (https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/savepoints/#triggering-savepoints). > FLINK-25952 improves the check by inspecting both the FileSystem extending > {{EntropyInjectingFileSystem}} and > {{FlinkS3FileSystem#getEntropyInjectionKey}} not returning null. We can > improve this further by checking that the checkpoint path is indeed using the > entropy injection key. Without that, the savepoint is not relocatable even if > the {{state.savepoints.dir}} does not contain the entropy. > In our setting, we enable entropy injection by setting {{s3.entropy.key}} to > {{\__ENTROPY_KEY\__}} and use the entropy key in the checkpoint path (for > e.g. {{s3://mybuket/checkpoints/__ENTROPY_KEY__/myapp}}). However, in the > savepoint path, we don't use the entropy key (for e.g. > {{s3://mybuket/savepoints/myapp}}) because we want the savepoint to be > relocatable. 
But the current logic still generates non-relocatable savepoint > path just because the entropy injection key is non-null. -- This message was sent by Atlassian Jira (v8.20.10#820010)
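The check proposed in this issue can be sketched as follows — a simplified, hypothetical stand-in for the `EntropyInjector` logic, not the real implementation: entropy should only make a path non-relocatable when the configured key actually appears in it.

```java
/** Sketch of the FLINK-31984 proposal: even with an entropy key configured,
 *  a savepoint path that does not contain the key should be treated as
 *  relocatable. Simplified illustration with invented method names. */
public class EntropyRelocatability {
    /** Entropy is effective only if a key is configured AND appears in the path. */
    static boolean isEntropyEffective(String path, String entropyKey) {
        return entropyKey != null && path.contains(entropyKey);
    }

    public static void main(String[] args) {
        String key = "__ENTROPY_KEY__";
        // checkpoint path uses the key -> entropy effective, not relocatable
        System.out.println(
            isEntropyEffective("s3://mybuket/checkpoints/__ENTROPY_KEY__/myapp", key));
        // savepoint path omits the key -> relocatable even with s3.entropy.key set
        System.out.println(
            isEntropyEffective("s3://mybuket/savepoints/myapp", key));
    }
}
```

Gating the non-relocatable code path on this per-path check, rather than on the key merely being configured, is exactly the improvement the issue describes.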
[GitHub] [flink] fredia commented on pull request #22458: [FLINK-31743][statebackend/rocksdb] disable rocksdb log relocating wh…
fredia commented on PR #22458: URL: https://github.com/apache/flink/pull/22458#issuecomment-1534698282 @zoltar9264 Thanks for the PR, should the description of [`state.backend.rocksdb.log.dir`](https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBConfigurableOptions.java#L90) be changed accordingly? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #22521: [FLINK-32000][webui] Expose max parallelism in the vertex detail drawer.
flinkbot commented on PR #22521: URL: https://github.com/apache/flink/pull/22521#issuecomment-1534678585 ## CI report: * 019520d0ff3f7c88559fe35ecde57aa01bfeada9 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-32000) Expose vertex max parallelism in the WebUI
[ https://issues.apache.org/jira/browse/FLINK-32000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Morávek reassigned FLINK-32000: - Assignee: David Morávek > Expose vertex max parallelism in the WebUI > -- > > Key: FLINK-32000 > URL: https://issues.apache.org/jira/browse/FLINK-32000 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Reporter: David Morávek >Assignee: David Morávek >Priority: Minor > Labels: pull-request-available > Attachments: Screenshot 2023-05-04 at 14.15.34.png > > > It would be great to expose max parallelism in the vertex detail drawer for > debug purposes !Screenshot 2023-05-04 at 14.15.34.png|width=533,height=195! . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32000) Expose vertex max parallelism in the WebUI
[ https://issues.apache.org/jira/browse/FLINK-32000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32000: --- Labels: pull-request-available (was: ) > Expose vertex max parallelism in the WebUI > -- > > Key: FLINK-32000 > URL: https://issues.apache.org/jira/browse/FLINK-32000 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Reporter: David Morávek >Priority: Minor > Labels: pull-request-available > Attachments: Screenshot 2023-05-04 at 14.15.34.png > > > It would be great to expose max parallelism in the vertex detail drawer for > debug purposes !Screenshot 2023-05-04 at 14.15.34.png|width=533,height=195! . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] dmvk opened a new pull request, #22521: [FLINK-32000][webui] Expose max parallelism in the vertex detail drawer.
dmvk opened a new pull request, #22521: URL: https://github.com/apache/flink/pull/22521 https://issues.apache.org/jira/browse/FLINK-32000 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-32000) Expose vertex max parallelism in the WebUI
[ https://issues.apache.org/jira/browse/FLINK-32000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Morávek updated FLINK-32000: -- Description: It would be great to expose max parallelism in the vertex detail drawer for debug purposes !Screenshot 2023-05-04 at 14.15.34.png|width=533,height=195! . (was: It would be great to expose max parallelism in the vertex detail drawer for debug purposes !Screenshot 2023-05-04 at 14.15.34.png! .) > Expose vertex max parallelism in the WebUI > -- > > Key: FLINK-32000 > URL: https://issues.apache.org/jira/browse/FLINK-32000 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Reporter: David Morávek >Priority: Minor > Attachments: Screenshot 2023-05-04 at 14.15.34.png > > > It would be great to expose max parallelism in the vertex detail drawer for > debug purposes !Screenshot 2023-05-04 at 14.15.34.png|width=533,height=195! . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32000) Expose vertex max parallelism in the WebUI
David Morávek created FLINK-32000: - Summary: Expose vertex max parallelism in the WebUI Key: FLINK-32000 URL: https://issues.apache.org/jira/browse/FLINK-32000 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Reporter: David Morávek Attachments: Screenshot 2023-05-04 at 14.15.34.png It would be great to expose max parallelism in the vertex detail drawer for debug purposes !Screenshot 2023-05-04 at 14.15.34.png! . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] XComp commented on pull request #22380: [FLINK-31773][runtime] Separate DefaultLeaderElectionService.start(LeaderContender) into two separate methods for starting the driver and regis
XComp commented on PR #22380:
URL: https://github.com/apache/flink/pull/22380#issuecomment-1534661805

Rebased to most-recent version of FLINK-31838 (PR #22422)
[GitHub] [flink] flinkbot commented on pull request #22520: [FLINK-31996] Chaining operators with different max parallelism prevents rescaling
flinkbot commented on PR #22520:
URL: https://github.com/apache/flink/pull/22520#issuecomment-1534654730

## CI report:

* 8c07ee4ee759fe351480649f43a359bf0881e4c9 UNKNOWN

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] WencongLiu commented on a diff in pull request #22316: [FLINK-31638][network] Downstream supports reading buffers from tiered store
WencongLiu commented on code in PR #22316:
URL: https://github.com/apache/flink/pull/22316#discussion_r1184921505

## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGateBufferReaderAdapter.java: ##

@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.consumer;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.util.Optional;
+
+/**
+ * {@link SingleInputGateBufferReaderAdapter} includes the logic of reading buffer in the {@link
+ * SingleInputGate}.
+ */
+public interface SingleInputGateBufferReaderAdapter extends Closeable {

Review Comment:
There is no SingleInputGateBufferReaderAdapter now.
[GitHub] [flink] WencongLiu commented on a diff in pull request #22316: [FLINK-31638][network] Downstream supports reading buffers from tiered store
WencongLiu commented on code in PR #22316:
URL: https://github.com/apache/flink/pull/22316#discussion_r1184920929

## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/tier/TierConsumerAgent.java: ##

@@ -18,5 +18,21 @@

 package org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier;

+import org.apache.flink.runtime.io.network.partition.consumer.InputChannel;
+
+import java.io.Closeable;
+import java.util.Optional;
+
 /** The consumer-side agent of a Tier. */
-public interface TierConsumerAgent {}
+public interface TierConsumerAgent extends Closeable {
+
+    /**
+     * Get buffer from the client according to specific subpartition and segment id.
+     *
+     * @param inputChannel indicates the subpartition to read.
+     * @param segmentId indicates the id of the segment.
+     * @return the next buffer.
+     */
+    Optional<BufferAndAvailability> getNextBuffer(InputChannel inputChannel, int segmentId);

Review Comment:
InputChannel has been removed. BufferAndAvailability is actually the combination of essential information required by SingleInputGate. Maybe the public class BufferAndAvailability can be moved outside from InputChannel?

Review Comment (on the same hunk):
Done.
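The API question raised in the review above (returning a standalone BufferAndAvailability keyed by subpartition and segment id, instead of depending on InputChannel) could look roughly like the sketch below. This is a hypothetical illustration only, not the actual Flink code: the simplified Buffer class, the int subpartitionId parameter, and the singleSegmentAgent factory are assumptions made for the example.

```java
import java.io.Closeable;
import java.util.Optional;

public class TierConsumerAgentSketch {

    // Simplified stand-in for Flink's network buffer (assumption, not the real class).
    static final class Buffer {
        private final byte[] data;
        Buffer(byte[] data) { this.data = data; }
        int size() { return data.length; }
    }

    // BufferAndAvailability promoted to a top-level class, as the review suggests:
    // the buffer plus a flag telling the gate whether more data is immediately available.
    static final class BufferAndAvailability {
        final Buffer buffer;
        final boolean moreAvailable;
        BufferAndAvailability(Buffer buffer, boolean moreAvailable) {
            this.buffer = buffer;
            this.moreAvailable = moreAvailable;
        }
    }

    // Consumer-side agent keyed by subpartition and segment id rather than
    // by an InputChannel reference.
    interface TierConsumerAgent extends Closeable {
        Optional<BufferAndAvailability> getNextBuffer(int subpartitionId, int segmentId);
    }

    // Dummy agent serving a single one-byte buffer for segment 0 only.
    static TierConsumerAgent singleSegmentAgent() {
        return new TierConsumerAgent() {
            @Override
            public Optional<BufferAndAvailability> getNextBuffer(int subpartitionId, int segmentId) {
                return segmentId == 0
                        ? Optional.of(new BufferAndAvailability(new Buffer(new byte[] {1}), false))
                        : Optional.empty();
            }

            @Override
            public void close() {}
        };
    }

    public static void main(String[] args) throws Exception {
        TierConsumerAgent agent = singleSegmentAgent();
        System.out.println(agent.getNextBuffer(0, 0).isPresent()); // prints true
        System.out.println(agent.getNextBuffer(0, 1).isPresent()); // prints false
        agent.close();
    }
}
```

The design point of the sketch: once BufferAndAvailability is a standalone type, the tier agent's interface no longer needs to import anything from the InputChannel side, which is what the reviewer's "moved outside from InputChannel" suggestion would buy.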