[jira] [Comment Edited] (FLINK-32953) [State TTL]resolve data correctness problem after ttl was changed
[ https://issues.apache.org/jira/browse/FLINK-32953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764016#comment-17764016 ] Jinzhong Li edited comment on FLINK-32953 at 9/12/23 5:39 AM: -- [~masteryhx] Currently, the stateBackend layer (HeapStateBackend/RocksDBStateBackend) doesn't contain any information about state-ttl. To solve this problem, I think we have to pass a ttl-state restore transformer from TtlStateFactory to the StateBackend, similar to [StateSnapshotTransformer|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/StateSnapshotTransformer.java]. This requires changing the interface KeyedStateFactory#createOrUpdateInternalState (see my [draft commit|https://github.com/ljz2051/flink/commit/7a058fd0f9d34a8fb42254a7ca1d6f32419c69e1]). I think that fixing this corner case at the cost of changing the general interface of the stateBackend layer may not be worthwhile. But at least, I think we should alert users to this situation in the documentation. WDYT? [~masteryhx] was (Author: lijinzhong): [~masteryhx] Currently, the stateBackend layer (HeapStateBackend/RocksDBStateBackend) doesn't contain any information about state-ttl. To solve this problem, I think we have to pass a ttl-state restore transformer from TtlStateFactory to the StateBackend, similar to [StateSnapshotTransformer|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/StateSnapshotTransformer.java]. This requires changing the interface KeyedStateFactory#createOrUpdateInternalState (see my [draft commit|https://github.com/ljz2051/flink/commit/7a058fd0f9d34a8fb42254a7ca1d6f32419c69e1]). I think that fixing this corner case at the cost of changing the general interface of the stateBackend layer may not be worthwhile. But at least, I think we should alert users to this situation in the documentation. WDYT?
[~masteryhx] > [State TTL]resolve data correctness problem after ttl was changed > -- > > Key: FLINK-32953 > URL: https://issues.apache.org/jira/browse/FLINK-32953 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Jinzhong Li >Assignee: Jinzhong Li >Priority: Major > > Because expired data is cleaned up in the background on a best-effort basis > (hashmap uses the INCREMENTAL_CLEANUP strategy, rocksdb uses the > ROCKSDB_COMPACTION_FILTER strategy), some expired state is often persisted > into snapshots. > > In some scenarios, the user changes the state ttl of the job and then restores the job > from the old state. If the user adjusts the state ttl from a short value to a > long value (e.g., from 12 hours to 24 hours), some expired data that was not > cleaned up will be alive again after the restore. Obviously this is unreasonable, and > may break data regulatory requirements. > > In particular, the rocksdb stateBackend may cause data correctness problems due to > level compaction in this case (e.g., one key has two versions, at level-1 and > level-2, both of which are ttl-expired; the level-1 version is cleaned up by > compaction, while the level-2 version isn't; if we adjust the state ttl and restart the > job, the stale level-2 data will become valid again after the restore). > > To solve this problem, I think we can: > 1) persist the old state ttl into the snapshot meta info (e.g. in > RegisteredKeyValueStateBackendMetaInfo or elsewhere); > 2) during state restore, compare the current ttl with the old ttl; > 3) if the current ttl is longer than the old ttl, iterate over all data, > filter out data that is expired under the old ttl, and write the valid data into the stateBackend. -- This message was sent by Atlassian Jira (v8.20.10#820010)
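The restore-time filtering proposed in step 3 above can be sketched in plain Java. This is a minimal illustration only, not Flink's actual state-backend API: the names TtlValue and restoreFilter are hypothetical, and real TTL state stores the last-access timestamp alongside the user value in a comparable way.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: given entries restored from a snapshot and the OLD ttl
// recorded in the snapshot meta info, drop entries that had already expired
// under the old ttl before the new (longer) ttl takes effect.
class TtlRestoreFilter {

    // A value wrapped with its last-access timestamp, as TTL state keeps it.
    static final class TtlValue<T> {
        final T value;
        final long lastAccessTimestamp;

        TtlValue(T value, long lastAccessTimestamp) {
            this.value = value;
            this.lastAccessTimestamp = lastAccessTimestamp;
        }
    }

    // Keep only entries still alive under the old ttl at restore time.
    static <K, T> Map<K, TtlValue<T>> restoreFilter(
            Map<K, TtlValue<T>> restored, long oldTtlMillis, long nowMillis) {
        Map<K, TtlValue<T>> alive = new HashMap<>();
        for (Map.Entry<K, TtlValue<T>> e : restored.entrySet()) {
            if (e.getValue().lastAccessTimestamp + oldTtlMillis > nowMillis) {
                alive.put(e.getKey(), e.getValue());
            }
        }
        return alive;
    }
}
```

An entry last touched at t=9000 with a 5000 ms old ttl survives a restore at t=10000, while one last touched at t=1000 is filtered out, which is exactly the behavior the proposal wants instead of letting the stale entry come back under the longer new ttl.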
[jira] [Commented] (FLINK-32953) [State TTL]resolve data correctness problem after ttl was changed
[ https://issues.apache.org/jira/browse/FLINK-32953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764016#comment-17764016 ] Jinzhong Li commented on FLINK-32953: - [~masteryhx] Currently, the stateBackend layer (HeapStateBackend/RocksDBStateBackend) doesn't contain any information about state-ttl. To solve this problem, I think we have to pass a ttl-state restore transformer from TtlStateFactory to the StateBackend, similar to [StateSnapshotTransformer|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/StateSnapshotTransformer.java]. This requires changing the interface KeyedStateFactory#createOrUpdateInternalState (see my [draft commit|https://github.com/ljz2051/flink/commit/7a058fd0f9d34a8fb42254a7ca1d6f32419c69e1]). I think that fixing this corner case at the cost of changing the general interface of the stateBackend layer may not be worthwhile. But at least, I think we should alert users to this situation in the documentation. WDYT? [~masteryhx] > [State TTL]resolve data correctness problem after ttl was changed > -- > > Key: FLINK-32953 > URL: https://issues.apache.org/jira/browse/FLINK-32953 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Jinzhong Li >Assignee: Jinzhong Li >Priority: Major > > Because expired data is cleaned up in the background on a best-effort basis > (hashmap uses the INCREMENTAL_CLEANUP strategy, rocksdb uses the > ROCKSDB_COMPACTION_FILTER strategy), some expired state is often persisted > into snapshots. > > In some scenarios, the user changes the state ttl of the job and then restores the job > from the old state. If the user adjusts the state ttl from a short value to a > long value (e.g., from 12 hours to 24 hours), some expired data that was not > cleaned up will be alive again after the restore. Obviously this is unreasonable, and > may break data regulatory requirements.
> > In particular, the rocksdb stateBackend may cause data correctness problems due to > level compaction in this case (e.g., one key has two versions, at level-1 and > level-2, both of which are ttl-expired; the level-1 version is cleaned up by > compaction, while the level-2 version isn't; if we adjust the state ttl and restart the > job, the stale level-2 data will become valid again after the restore). > > To solve this problem, I think we can: > 1) persist the old state ttl into the snapshot meta info (e.g. in > RegisteredKeyValueStateBackendMetaInfo or elsewhere); > 2) during state restore, compare the current ttl with the old ttl; > 3) if the current ttl is longer than the old ttl, iterate over all data, > filter out data that is expired under the old ttl, and write the valid data into the stateBackend.
[GitHub] [flink] flinkbot commented on pull request #23399: [FLINK-33061][docs] Translate failure-enricher documentation to Chinese
flinkbot commented on PR #23399: URL: https://github.com/apache/flink/pull/23399#issuecomment-1715000130 ## CI report: * f9a5b318b86856634929075d2b3c37820a858e45 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangzzu commented on pull request #23399: [FLINK-33061][docs] Translate failure-enricher documentation to Chinese
wangzzu commented on PR #23399: URL: https://github.com/apache/flink/pull/23399#issuecomment-1714996308 cc @pgaref @huwh
[jira] [Updated] (FLINK-33061) Translate failure-enricher documentation to Chinese
[ https://issues.apache.org/jira/browse/FLINK-33061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33061: --- Labels: pull-request-available (was: ) > Translate failure-enricher documentation to Chinese > --- > > Key: FLINK-33061 > URL: https://issues.apache.org/jira/browse/FLINK-33061 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Matt Wang >Priority: Not a Priority > Labels: pull-request-available >
[GitHub] [flink] wangzzu opened a new pull request, #23399: [FLINK-33061][docs] Translate failure-enricher documentation to Chinese
wangzzu opened a new pull request, #23399: URL: https://github.com/apache/flink/pull/23399 ## What is the purpose of the change - Translate failure-enricher documentation to Chinese ## Brief change log - Translate failure-enricher documentation to Chinese ## Verifying this change Verified this locally ![image](https://github.com/apache/flink/assets/7333163/460ee6b0-59b3-45c7-884b-fd1ade007590) ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? docs
[GitHub] [flink] 1996fanrui commented on a diff in pull request #23391: [FLINK-33071][metrics,checkpointing] Log a json dump of checkpoint statistics when checkpoint completes or fails
1996fanrui commented on code in PR #23391: URL: https://github.com/apache/flink/pull/23391#discussion_r1322399618 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsTracker.java: ## @@ -92,10 +113,18 @@ public class CheckpointStatsTracker { * progress ones. * @param metricGroup Metric group for exposed metrics */ +public CheckpointStatsTracker(int numRememberedCheckpoints, JobManagerJobMetricGroup metricGroup) { +this(numRememberedCheckpoints, metricGroup, metricGroup.jobId(), metricGroup.jobName()); +} public CheckpointStatsTracker(int numRememberedCheckpoints, MetricGroup metricGroup) { Review Comment: Checkstyle fails https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53115=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb=5836
[jira] [Updated] (FLINK-32988) HiveITCase failed due to TestContainer not coming up
[ https://issues.apache.org/jira/browse/FLINK-32988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shengkai Fang updated FLINK-32988: -- Affects Version/s: (was: 1.16.2) (was: 1.17.1) > HiveITCase failed due to TestContainer not coming up > > > Key: FLINK-32988 > URL: https://issues.apache.org/jira/browse/FLINK-32988 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.18.0, 1.19.0 >Reporter: Matthias Pohl >Assignee: Shengkai Fang >Priority: Critical > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52740=logs=87489130-75dc-54e4-1f45-80c30aa367a3=efbee0b1-38ac-597d-6466-1ea8fc908c50=15866 > {code} > Aug 29 02:47:56 org.testcontainers.containers.ContainerLaunchException: > Container startup failed for image prestodb/hive3.1-hive:10 > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:349) > Aug 29 02:47:56 at > org.apache.flink.tests.hive.containers.HiveContainer.doStart(HiveContainer.java:81) > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322) > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1131) > Aug 29 02:47:56 at > org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:28) > Aug 29 02:47:56 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 29 02:47:56 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Aug 29 02:47:56 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 29 02:47:56 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Aug 29 02:47:56 at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > Aug 29 02:47:56 at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > Aug 29 02:47:56 at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > Aug 
29 02:47:56 at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > Aug 29 02:47:56 at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:188) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:154) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:128) > {code}
[jira] [Commented] (FLINK-32877) Support for HTTP connect and timeout options while writes in GCS connector
[ https://issues.apache.org/jira/browse/FLINK-32877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764005#comment-17764005 ] Jayadeep Jayaraman commented on FLINK-32877: [~martijnvisser] - can you please take a look. > Support for HTTP connect and timeout options while writes in GCS connector > -- > > Key: FLINK-32877 > URL: https://issues.apache.org/jira/browse/FLINK-32877 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.15.3, 1.16.2, 1.17.1 >Reporter: Jayadeep Jayaraman >Priority: Major > Labels: pull-request-available > > The current GCS connector uses the gcs java storage library and bypasses the > hadoop gcs connector which supports multiple http options. There are > situations where GCS takes longer to provide a response for a PUT operation > than the default value. > This change will allow users to customize their connect time and read timeout > based on their application
[GitHub] [flink] ljz2051 commented on a diff in pull request #23376: [FLINK-14032][state]Make the cache size of RocksDBPriorityQueueSetFactory configurable
ljz2051 commented on code in PR #23376: URL: https://github.com/apache/flink/pull/23376#discussion_r1322334957 ## flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java: ## @@ -62,6 +63,16 @@ public class RocksDBOptions { .withDescription( "This determines the factory for timer service state implementation."); +@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB) +public static final ConfigOption ROCKSDB_TIMER_SERVICE_FACTORY_CACHE_SIZE = +ConfigOptions.key("state.backend.rocksdb.timer-service.cache-size") +.intType() +.defaultValue(DEFAULT_CACHES_SIZE) +.withDescription( +String.format( +"The cache size for rocksdb timer service factory. This option only has an effect when '%s' is configured to '%s'", Review Comment: Good idea. I have added a simple description about this.
[GitHub] [flink] ljz2051 commented on a diff in pull request #23376: [FLINK-14032][state]Make the cache size of RocksDBPriorityQueueSetFactory configurable
ljz2051 commented on code in PR #23376: URL: https://github.com/apache/flink/pull/23376#discussion_r1322334666 ## flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackendBuilder.java: ## @@ -126,6 +129,7 @@ public class RocksDBKeyedStateBackendBuilder extends AbstractKeyedStateBacken private RocksDBStateUploader injectRocksDBStateUploader; // for testing public RocksDBKeyedStateBackendBuilder( +ReadableConfig configuration, Review Comment: In the latest commit, I introduced a separate config class RocksDBPriorityQueueConfig which contains the PriorityQueueStateType and the cache-size. ## flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java: ## @@ -62,6 +63,16 @@ public class RocksDBOptions { .withDescription( "This determines the factory for timer service state implementation."); +@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB) Review Comment: done.
[GitHub] [flink] ljz2051 commented on a diff in pull request #23376: [FLINK-14032][state]Make the cache size of RocksDBPriorityQueueSetFactory configurable
ljz2051 commented on code in PR #23376: URL: https://github.com/apache/flink/pull/23376#discussion_r1322333661 ## flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBPriorityQueueSetFactory.java: ## @@ -52,7 +52,9 @@ public class RocksDBPriorityQueueSetFactory implements PriorityQueueSetFactory { /** Default cache size per key-group. */ -@VisibleForTesting static final int DEFAULT_CACHES_SIZE = 128; // TODO make this configurable +@VisibleForTesting static final int DEFAULT_CACHES_SIZE = 128; + +private final int cacheSize; Review Comment: done
[jira] [Updated] (FLINK-14032) Make the cache size of RocksDBPriorityQueueSetFactory configurable
[ https://issues.apache.org/jira/browse/FLINK-14032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14032: --- Labels: auto-unassigned pull-request-available usability (was: auto-unassigned usability) > Make the cache size of RocksDBPriorityQueueSetFactory configurable > -- > > Key: FLINK-14032 > URL: https://issues.apache.org/jira/browse/FLINK-14032 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Yun Tang >Assignee: Jinzhong Li >Priority: Major > Labels: auto-unassigned, pull-request-available, usability > Fix For: 1.18.0 > > > Currently, the cache size of {{RocksDBPriorityQueueSetFactory}} is hard-coded > to 128, and there is no way to configure it to another value (we could > increase it to obtain better performance if necessary). This has also been > a TODO for quite a long time.
[GitHub] [flink] ljz2051 commented on pull request #23376: [FLINK-14032][state]Make the cache size of RocksDBPriorityQueueSetFactory configurable
ljz2051 commented on PR #23376: URL: https://github.com/apache/flink/pull/23376#issuecomment-1714932710 > Thanks for the pr. PTAL my comments. And you should also provide a UT Case to verify reading the correct config. @masteryhx Thanks for the comments. In the latest commit, I added the UT RocksDBStateBackendConfigTest#testConfigureRocksDBPriorityQueueFactoryCacheSize to verify the cacheSize config.
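The behavior under test boils down to "use the configured `state.backend.rocksdb.timer-service.cache-size` value when present, otherwise fall back to the 128 default". A minimal plain-Java sketch of that resolution logic (this is an illustration, not Flink's ConfigOption machinery; only the key name and default are taken from the review thread):

```java
import java.util.Map;

// Hypothetical sketch of resolving the priority-queue cache size from a
// flat key/value configuration, mirroring ConfigOption's default fallback.
class PriorityQueueCacheSizeResolver {
    static final String KEY = "state.backend.rocksdb.timer-service.cache-size";
    static final int DEFAULT_CACHES_SIZE = 128; // default per key-group

    // Return the configured value if set, otherwise the default.
    static int resolveCacheSize(Map<String, String> config) {
        String raw = config.get(KEY);
        return raw == null ? DEFAULT_CACHES_SIZE : Integer.parseInt(raw);
    }
}
```

With no entry for the key the resolver yields 128; setting the key to "256" yields 256, which is the case the new unit test is meant to cover.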
[jira] [Closed] (FLINK-32978) Deprecate RichFunction#open(Configuration parameters)
[ https://issues.apache.org/jira/browse/FLINK-32978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-32978. Resolution: Done > Deprecate RichFunction#open(Configuration parameters) > - > > Key: FLINK-32978 > URL: https://issues.apache.org/jira/browse/FLINK-32978 > Project: Flink > Issue Type: Technical Debt > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > The > [FLIP-344|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425231] > has decided that the parameter in RichFunction#open will be removed in the > next major version. We should deprecate it now and remove it in Flink 2.0. > The removal will be tracked in > [FLINK-6912|https://issues.apache.org/jira/browse/FLINK-6912].
[jira] [Commented] (FLINK-32978) Deprecate RichFunction#open(Configuration parameters)
[ https://issues.apache.org/jira/browse/FLINK-32978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763999#comment-17763999 ] Xintong Song commented on FLINK-32978: -- master (1.19): e9353319ad625baa5b2c20fa709ab5b23f83c0f4 > Deprecate RichFunction#open(Configuration parameters) > - > > Key: FLINK-32978 > URL: https://issues.apache.org/jira/browse/FLINK-32978 > Project: Flink > Issue Type: Technical Debt > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Wencong Liu >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > The > [FLIP-344|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425231] > has decided that the parameter in RichFunction#open will be removed in the > next major version. We should deprecate it now and remove it in Flink 2.0. > The removal will be tracked in > [FLINK-6912|https://issues.apache.org/jira/browse/FLINK-6912].
[jira] [Assigned] (FLINK-32978) Deprecate RichFunction#open(Configuration parameters)
[ https://issues.apache.org/jira/browse/FLINK-32978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-32978: Assignee: Wencong Liu > Deprecate RichFunction#open(Configuration parameters) > - > > Key: FLINK-32978 > URL: https://issues.apache.org/jira/browse/FLINK-32978 > Project: Flink > Issue Type: Technical Debt > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Wencong Liu >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > The > [FLIP-344|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425231] > has decided that the parameter in RichFunction#open will be removed in the > next major version. We should deprecate it now and remove it in Flink 2.0. > The removal will be tracked in > [FLINK-6912|https://issues.apache.org/jira/browse/FLINK-6912].
[GitHub] [flink] xintongsong closed pull request #23058: [FLINK-32978] Deprecate RichFunction#open(Configuration parameters)
xintongsong closed pull request #23058: [FLINK-32978] Deprecate RichFunction#open(Configuration parameters) URL: https://github.com/apache/flink/pull/23058
[jira] [Closed] (FLINK-32870) Reading multiple small buffers by reading and slicing one large buffer for tiered storage
[ https://issues.apache.org/jira/browse/FLINK-32870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-32870. Fix Version/s: 1.18.0 Resolution: Fixed - master(1.19): ade583cf80478ac60e9b83fc0a97f36bc2b26f1c - release-1.18: d100ab65367fe0b3d74a9901bcaaa26049c996b0 > Reading multiple small buffers by reading and slicing one large buffer for > tiered storage > - > > Key: FLINK-32870 > URL: https://issues.apache.org/jira/browse/FLINK-32870 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.18.0 >Reporter: Yuxin Tan >Assignee: Yuxin Tan >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Currently, when the file reader of tiered storage loads data from the disk > file, it reads data at buffer granularity. Before compression, each buffer is > 32 KB by default. After compression, the size becomes smaller (possibly less > than 5 KB), which is pretty small for a network buffer and for file IO. > We should read multiple small buffers by reading and slicing one large buffer > to decrease buffer competition and file IO, leading to better > performance.
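The read-one-large-buffer-then-slice idea in the issue above can be illustrated with plain java.nio. This is a sketch only (Flink's tiered storage uses its own MemorySegment/Buffer abstractions, and the buffer lengths here are made up): one large read replaces many small ones, and each logical buffer becomes a zero-copy slice of the big buffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: slice one large buffer (read with a single file IO)
// into consecutive views with the given per-buffer lengths.
class BufferSlicer {
    static List<ByteBuffer> slice(ByteBuffer large, int[] lengths) {
        List<ByteBuffer> slices = new ArrayList<>();
        int offset = 0;
        for (int len : lengths) {
            ByteBuffer view = large.duplicate();
            view.position(offset);
            view.limit(offset + len);
            slices.add(view.slice()); // zero-copy view, no extra allocation
            offset += len;
        }
        return slices;
    }
}
```

Each slice shares the large buffer's backing memory, so the per-buffer cost is just bookkeeping rather than an additional read and allocation, which is the competition/IO reduction the issue describes.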
[GitHub] [flink] xintongsong closed pull request #23255: [FLINK-32870][network] Tiered storage supports reading multiple small buffers by reading and slicing one large buffer
xintongsong closed pull request #23255: [FLINK-32870][network] Tiered storage supports reading multiple small buffers by reading and slicing one large buffer URL: https://github.com/apache/flink/pull/23255
[jira] [Assigned] (FLINK-33011) Operator deletes HA data unexpectedly
[ https://issues.apache.org/jira/browse/FLINK-33011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-33011: -- Assignee: Gyula Fora > Operator deletes HA data unexpectedly > - > > Key: FLINK-33011 > URL: https://issues.apache.org/jira/browse/FLINK-33011 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: 1.17.1, kubernetes-operator-1.6.0 > Environment: Flink: 1.17.1 > Flink Kubernetes Operator: 1.6.0 >Reporter: Ruibin Xing >Assignee: Gyula Fora >Priority: Critical > Attachments: flink_operator_logs_0831.csv > > > We encountered a problem where the operator unexpectedly deleted HA data. > The timeline is as follows: > 12:08 We submitted the first spec, which suspended the job with savepoint > upgrade mode. > 12:08 The job was suspended, while the HA data was preserved, and the log > showed the observed job deployment status was MISSING. > 12:10 We submitted the second spec, which deployed the job with the last > state upgrade mode. > 12:10 Logs showed the operator deleted both the Flink deployment and the HA > data again. > 12:10 The job failed to start because the HA data was missing. > According to the log, the deletion was triggered by > https://github.com/apache/flink-kubernetes-operator/blob/a728ba768e20236184e2b9e9e45163304b8b196c/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java#L168 > I think this would only be triggered if the job deployment status wasn't > MISSING. But the log before the deletion showed the observed job status was > MISSING at that moment. > Related logs: > > {code:java} > 2023-08-30 12:08:48.190 + o.a.f.k.o.s.AbstractFlinkService [INFO > ][default/pipeline-pipeline-se-3] Cluster shutdown completed. > 2023-08-30 12:10:27.010 + o.a.f.k.o.o.d.ApplicationObserver [INFO > ][default/pipeline-pipeline-se-3] Observing JobManager deployment. 
Previous > status: MISSING > 2023-08-30 12:10:27.533 + o.a.f.k.o.l.AuditUtils [INFO > ][default/pipeline-pipeline-se-3] >>> Event | Info | SPECCHANGED | > UPGRADE change(s) detected (Diff: FlinkDeploymentSpec[image : > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:0835137c-362 > -> > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:23db7ae8-365, > podTemplate.metadata.labels.app.kubernetes.io~1version : > 0835137cd803b7258695eb53a6ec520cb62a48a7 -> > 23db7ae84bdab8d91fa527fe2f8f2fce292d0abc, job.state : suspended -> running, > job.upgradeMode : last-state -> savepoint, restartNonce : 1545 -> 1547]), > starting reconciliation. > 2023-08-30 12:10:27.679 + o.a.f.k.o.s.NativeFlinkService [INFO > ][default/pipeline-pipeline-se-3] Deleting JobManager deployment and HA > metadata. > {code} > A more complete log file is attached. Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33011) Operator deletes HA data unexpectedly
[ https://issues.apache.org/jira/browse/FLINK-33011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-33011: --- Priority: Blocker (was: Critical) > Operator deletes HA data unexpectedly > - > > Key: FLINK-33011 > URL: https://issues.apache.org/jira/browse/FLINK-33011 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: 1.17.1, kubernetes-operator-1.6.0 > Environment: Flink: 1.17.1 > Flink Kubernetes Operator: 1.6.0 >Reporter: Ruibin Xing >Assignee: Gyula Fora >Priority: Blocker > Attachments: flink_operator_logs_0831.csv > > > We encountered a problem where the operator unexpectedly deleted HA data. > The timeline is as follows: > 12:08 We submitted the first spec, which suspended the job with savepoint > upgrade mode. > 12:08 The job was suspended, while the HA data was preserved, and the log > showed the observed job deployment status was MISSING. > 12:10 We submitted the second spec, which deployed the job with the last > state upgrade mode. > 12:10 Logs showed the operator deleted both the Flink deployment and the HA > data again. > 12:10 The job failed to start because the HA data was missing. > According to the log, the deletion was triggered by > https://github.com/apache/flink-kubernetes-operator/blob/a728ba768e20236184e2b9e9e45163304b8b196c/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java#L168 > I think this would only be triggered if the job deployment status wasn't > MISSING. But the log before the deletion showed the observed job status was > MISSING at that moment. > Related logs: > > {code:java} > 2023-08-30 12:08:48.190 + o.a.f.k.o.s.AbstractFlinkService [INFO > ][default/pipeline-pipeline-se-3] Cluster shutdown completed. > 2023-08-30 12:10:27.010 + o.a.f.k.o.o.d.ApplicationObserver [INFO > ][default/pipeline-pipeline-se-3] Observing JobManager deployment. 
Previous > status: MISSING > 2023-08-30 12:10:27.533 + o.a.f.k.o.l.AuditUtils [INFO > ][default/pipeline-pipeline-se-3] >>> Event | Info | SPECCHANGED | > UPGRADE change(s) detected (Diff: FlinkDeploymentSpec[image : > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:0835137c-362 > -> > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:23db7ae8-365, > podTemplate.metadata.labels.app.kubernetes.io~1version : > 0835137cd803b7258695eb53a6ec520cb62a48a7 -> > 23db7ae84bdab8d91fa527fe2f8f2fce292d0abc, job.state : suspended -> running, > job.upgradeMode : last-state -> savepoint, restartNonce : 1545 -> 1547]), > starting reconciliation. > 2023-08-30 12:10:27.679 + o.a.f.k.o.s.NativeFlinkService [INFO > ][default/pipeline-pipeline-se-3] Deleting JobManager deployment and HA > metadata. > {code} > A more complete log file is attached. Thanks.
[jira] [Updated] (FLINK-33011) Operator deletes HA data unexpectedly
[ https://issues.apache.org/jira/browse/FLINK-33011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-33011: --- Priority: Critical (was: Major) > Operator deletes HA data unexpectedly > - > > Key: FLINK-33011 > URL: https://issues.apache.org/jira/browse/FLINK-33011 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: 1.17.1, kubernetes-operator-1.6.0 > Environment: Flink: 1.17.1 > Flink Kubernetes Operator: 1.6.0 >Reporter: Ruibin Xing >Priority: Critical > Attachments: flink_operator_logs_0831.csv > > > We encountered a problem where the operator unexpectedly deleted HA data. > The timeline is as follows: > 12:08 We submitted the first spec, which suspended the job with savepoint > upgrade mode. > 12:08 The job was suspended, while the HA data was preserved, and the log > showed the observed job deployment status was MISSING. > 12:10 We submitted the second spec, which deployed the job with the last > state upgrade mode. > 12:10 Logs showed the operator deleted both the Flink deployment and the HA > data again. > 12:10 The job failed to start because the HA data was missing. > According to the log, the deletion was triggered by > https://github.com/apache/flink-kubernetes-operator/blob/a728ba768e20236184e2b9e9e45163304b8b196c/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java#L168 > I think this would only be triggered if the job deployment status wasn't > MISSING. But the log before the deletion showed the observed job status was > MISSING at that moment. > Related logs: > > {code:java} > 2023-08-30 12:08:48.190 + o.a.f.k.o.s.AbstractFlinkService [INFO > ][default/pipeline-pipeline-se-3] Cluster shutdown completed. > 2023-08-30 12:10:27.010 + o.a.f.k.o.o.d.ApplicationObserver [INFO > ][default/pipeline-pipeline-se-3] Observing JobManager deployment. 
Previous > status: MISSING > 2023-08-30 12:10:27.533 + o.a.f.k.o.l.AuditUtils [INFO > ][default/pipeline-pipeline-se-3] >>> Event | Info | SPECCHANGED | > UPGRADE change(s) detected (Diff: FlinkDeploymentSpec[image : > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:0835137c-362 > -> > docker-registry.randomcompany.com/octopus/pipeline-pipeline-online:23db7ae8-365, > podTemplate.metadata.labels.app.kubernetes.io~1version : > 0835137cd803b7258695eb53a6ec520cb62a48a7 -> > 23db7ae84bdab8d91fa527fe2f8f2fce292d0abc, job.state : suspended -> running, > job.upgradeMode : last-state -> savepoint, restartNonce : 1545 -> 1547]), > starting reconciliation. > 2023-08-30 12:10:27.679 + o.a.f.k.o.s.NativeFlinkService [INFO > ][default/pipeline-pipeline-se-3] Deleting JobManager deployment and HA > metadata. > {code} > A more complete log file is attached. Thanks.
[GitHub] [flink-connector-elasticsearch] liyubin117 commented on pull request #39: [FLINK-29042][Connectors/ElasticSearch] Support lookup join for es connector
liyubin117 commented on PR #39: URL: https://github.com/apache/flink-connector-elasticsearch/pull/39#issuecomment-1714909304 @snuyanzin @MartijnVisser Thanks a lot for your patient work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] WencongLiu closed pull request #23375: Dev 0908 1101
WencongLiu closed pull request #23375: Dev 0908 1101 URL: https://github.com/apache/flink/pull/23375
[GitHub] [flink] flinkbot commented on pull request #23398: [hotfix][network] Fix some bugs of Hybrid Shuffle
flinkbot commented on PR #23398: URL: https://github.com/apache/flink/pull/23398#issuecomment-1714902279 ## CI report: * 2c46fa10143f17ad1a2946536046858556f14771 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on pull request #23397: [hotfix][network] Fix some bugs of Hybrid Shuffle
flinkbot commented on PR #23397: URL: https://github.com/apache/flink/pull/23397#issuecomment-1714902192 ## CI report: * edbcd0e920025a70f0aa770daeb8e0caa12d9c72 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] WencongLiu opened a new pull request, #23398: [hotfix][network] Fix some bugs of Hybrid Shuffle
WencongLiu opened a new pull request, #23398: URL: https://github.com/apache/flink/pull/23398 ## What is the purpose of the change [hotfix][network] Fix the memory usage threshold that triggers disk writing in Hybrid Shuffle [hotfix][network] Optimize the backlog calculation logic in Hybrid Shuffle
[GitHub] [flink] WencongLiu opened a new pull request, #23397: [hotfix][network] Fix some bugs of Hybrid Shuffle
WencongLiu opened a new pull request, #23397: URL: https://github.com/apache/flink/pull/23397 ## What is the purpose of the change [hotfix][network] Fix the memory usage threshold that triggers disk writing in Hybrid Shuffle [hotfix][network] Optimize the backlog calculation logic in Hybrid Shuffle
[jira] [Closed] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
[ https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shengkai Fang closed FLINK-33063. - Resolution: Fixed > udaf with user defined pojo object throw error while generate record equaliser > -- > > Key: FLINK-33063 > URL: https://issues.apache.org/jira/browse/FLINK-33063 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > A udaf with a user-defined pojo object throws an error while generating the record > equaliser: > When a user creates a udaf whose record contains a user-defined complex pojo > object (like List or Map), the codegen > throws an error while generating the record equaliser; the error is: > {code:java} > A method named "compareTo" is not declared in any enclosing class nor any > subtype, nor through a static import.{code}
[jira] [Commented] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
[ https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763994#comment-17763994 ] Shengkai Fang commented on FLINK-33063: --- Merged into release-1.18: f2584a1df364a14ff50b5a52fe7cf5e38d4cdc9a > udaf with user defined pojo object throw error while generate record equaliser > -- > > Key: FLINK-33063 > URL: https://issues.apache.org/jira/browse/FLINK-33063 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > A udaf with a user-defined pojo object throws an error while generating the record > equaliser: > When a user creates a udaf whose record contains a user-defined complex pojo > object (like List or Map), the codegen > throws an error while generating the record equaliser; the error is: > {code:java} > A method named "compareTo" is not declared in any enclosing class nor any > subtype, nor through a static import.{code}
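Editor's note on the failure mode described above: the generated equaliser emitted `compareTo` calls for record fields whose types (e.g. `List`, `Map`) are not `Comparable`, which is why codegen failed. A minimal sketch of the intended behavior, falling back to `equals()` for such fields. The class and field names here are hypothetical illustrations, not Flink's actual generated code:

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical accumulator POJO resembling the failing case: fields are
// complex types (List / Map) that do not implement Comparable.
class Acc {
    List<String> names;
    Map<String, Integer> counts;

    Acc(List<String> names, Map<String, Integer> counts) {
        this.names = names;
        this.counts = counts;
    }
}

public class EqualiserSketch {
    // What a correct equaliser effectively does for non-Comparable fields:
    // compare with equals() (via null-safe Objects.equals) instead of
    // emitting a compareTo() call that cannot compile.
    static boolean recordsEqual(Acc a, Acc b) {
        return Objects.equals(a.names, b.names)
                && Objects.equals(a.counts, b.counts);
    }

    public static void main(String[] args) {
        Acc a = new Acc(List.of("x"), Map.of("x", 1));
        Acc b = new Acc(List.of("x"), Map.of("x", 1));
        System.out.println(recordsEqual(a, b)); // prints: true
    }
}
```

List and Map implementations define deep element-wise `equals()`, so no ordering relation is needed for equality checks.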
[jira] [Closed] (FLINK-32952) Scan reuse with readable metadata and watermark push down will get wrong watermark
[ https://issues.apache.org/jira/browse/FLINK-32952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shengkai Fang closed FLINK-32952. - Resolution: Fixed > Scan reuse with readable metadata and watermark push down will get wrong > watermark > --- > > Key: FLINK-32952 > URL: https://issues.apache.org/jira/browse/FLINK-32952 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > Scan reuse with readable metadata and watermark push down will get a wrong > result. In class ScanReuser, we re-build the watermark spec after projection > push down. However, we get a wrong index while trying to find the index in the new > source type.
[GitHub] [flink] fsk119 merged pull request #23388: [FLINK-33063][table-runtime] Fix udaf with complex user defined pojo object throw error while generate record equaliser
fsk119 merged PR #23388: URL: https://github.com/apache/flink/pull/23388
[jira] [Commented] (FLINK-32952) Scan reuse with readable metadata and watermark push down will get wrong watermark
[ https://issues.apache.org/jira/browse/FLINK-32952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763993#comment-17763993 ] Shengkai Fang commented on FLINK-32952: --- Merged into release-1.18: f66679bfe5f5b344eec71a7579504762cc3c04ae > Scan reuse with readable metadata and watermark push down will get wrong > watermark > --- > > Key: FLINK-32952 > URL: https://issues.apache.org/jira/browse/FLINK-32952 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > Scan reuse with readable metadata and watermark push down will get a wrong > result. In class ScanReuser, we re-build the watermark spec after projection > push down. However, we get a wrong index while trying to find the index in the new > source type.
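Editor's note on the bug class above: after projection push-down, field positions change, so looking up a watermark field by its index in the old schema against the new (projected) schema yields the wrong field. A sketch of the required remapping, with illustrative names only (this is not Flink planner code):

```java
import java.util.List;

public class ProjectionIndexRemap {
    // projectedOldIndices lists, in new-schema order, the old-schema index
    // each surviving field came from. The correct new index of an old field
    // is its position within that projection, not the old index itself.
    static int remap(List<Integer> projectedOldIndices, int oldIndex) {
        int newIndex = projectedOldIndices.indexOf(oldIndex);
        if (newIndex < 0) {
            throw new IllegalArgumentException(
                    "field " + oldIndex + " was projected away");
        }
        return newIndex;
    }

    public static void main(String[] args) {
        // Old schema has fields [0, 1, 2, 3]; projection keeps [2, 0].
        List<Integer> projection = List.of(2, 0);
        System.out.println(remap(projection, 2)); // prints: 0
        System.out.println(remap(projection, 0)); // prints: 1
    }
}
```

Using the old index directly (e.g. 2) against the projected row would read a nonexistent or wrong field, which matches the "wrong watermark" symptom reported.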
[GitHub] [flink] fsk119 merged pull request #23338: [FLINK-32952][table-planner] Fix scan reuse with readable metadata and watermark push down get wrong watermark error
fsk119 merged PR #23338: URL: https://github.com/apache/flink/pull/23338
[jira] [Commented] (FLINK-32488) Introduce configuration to control ExecutionGraph cache in REST API
[ https://issues.apache.org/jira/browse/FLINK-32488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763992#comment-17763992 ] Yangze Guo commented on FLINK-32488: Thanks for the PR [~hejufang001] and [~xiangyu0xf]. I'm now considering not introducing a configuration for it. As we all know, Flink's configuration surface has become massive, which harms usability. How about making the refresh interval adaptive to the total number of cached ExecutionGraphs in the JobManager? > Introduce configuration to control ExecutionGraph cache in REST API > --- > > Key: FLINK-32488 > URL: https://issues.apache.org/jira/browse/FLINK-32488 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.16.2, 1.17.1 >Reporter: Hong Liang Teoh >Assignee: Jufang He >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > *What* > Currently, REST handlers that inherit from AbstractExecutionGraphHandler > serve information derived from a cached ExecutionGraph. > This ExecutionGraph cache currently derives its timeout from > {*}web.refresh-interval{*}. The *web.refresh-interval* controls both the > refresh rate of the Flink dashboard and the ExecutionGraph cache timeout. > We should introduce a new configuration to control the ExecutionGraph cache, > namely {*}rest.cache.execution-graph.expiry{*}. > *Why* > Sharing configuration between REST handler and Flink dashboard is a sign that > we are coupling the two. > Ideally, we want our REST API behaviour to be independent of the Flink dashboard > (e.g. supports programmatic access). > > Mailing list discussion: > https://lists.apache.org/thread/7o330hfyoqqkkrfhtvz3kp448jcspjrm > >
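Editor's note: one way to read the "adaptive" suggestion above is to derive the cache expiry from the number of currently cached ExecutionGraphs instead of adding a new user-facing option. A hypothetical heuristic sketch; the bounds and scaling factor below are invented for illustration and are not part of any Flink API:

```java
import java.time.Duration;

public class AdaptiveCacheExpiry {
    // Hypothetical heuristic: grow the ExecutionGraph cache expiry linearly
    // with the number of cached graphs, clamped between a 1s floor (fresh
    // data for small clusters) and a 30s ceiling (bounded staleness).
    static Duration expiryFor(int cachedGraphs) {
        long millis = Math.min(30_000L, Math.max(1_000L, cachedGraphs * 500L));
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        System.out.println(expiryFor(1));   // prints: PT1S
        System.out.println(expiryFor(10));  // prints: PT5S
        System.out.println(expiryFor(100)); // prints: PT30S
    }
}
```

The design trade-off under discussion: a fixed `rest.cache.execution-graph.expiry` option is explicit but adds configuration surface, while an adaptive rule like the one sketched keeps the surface small at the cost of less predictable cache behavior.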
[jira] [Comment Edited] (FLINK-32488) Introduce configuration to control ExecutionGraph cache in REST API
[ https://issues.apache.org/jira/browse/FLINK-32488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763989#comment-17763989 ] xiangyu feng edited comment on FLINK-32488 at 9/12/23 2:51 AM: --- Hi [~guoyangze], [~hejufang001] has created a PR ([https://github.com/apache/flink/pull/23387]) for this issue. Would you kindly assign this issue to him? was (Author: JIRAUSER301129): Hi [~guoyangze] , [~hejufang001] has created a [PR|[https://github.com/apache/flink/pull/23387|https://github.com/apache/flink/pull/23387,]] for this issue would you kindly assign this issue to him? > Introduce configuration to control ExecutionGraph cache in REST API > --- > > Key: FLINK-32488 > URL: https://issues.apache.org/jira/browse/FLINK-32488 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.16.2, 1.17.1 >Reporter: Hong Liang Teoh >Assignee: Jufang He >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > *What* > Currently, REST handlers that inherit from AbstractExecutionGraphHandler > serve information derived from a cached ExecutionGraph. > This ExecutionGraph cache currently derives its timeout from > {*}web.refresh-interval{*}. The *web.refresh-interval* controls both the > refresh rate of the Flink dashboard and the ExecutionGraph cache timeout. > We should introduce a new configuration to control the ExecutionGraph cache, > namely {*}rest.cache.execution-graph.expiry{*}. > *Why* > Sharing configuration between REST handler and Flink dashboard is a sign that > we are coupling the two. > Ideally, we want our REST API behaviour to be independent of the Flink dashboard > (e.g. supports programmatic access). > > Mailing list discussion: > https://lists.apache.org/thread/7o330hfyoqqkkrfhtvz3kp448jcspjrm > >
[jira] [Assigned] (FLINK-32488) Introduce configuration to control ExecutionGraph cache in REST API
[ https://issues.apache.org/jira/browse/FLINK-32488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo reassigned FLINK-32488: -- Assignee: Jufang He (was: xiangyu feng) > Introduce configuration to control ExecutionGraph cache in REST API > --- > > Key: FLINK-32488 > URL: https://issues.apache.org/jira/browse/FLINK-32488 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.16.2, 1.17.1 >Reporter: Hong Liang Teoh >Assignee: Jufang He >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > *What* > Currently, REST handlers that inherit from AbstractExecutionGraphHandler > serve information derived from a cached ExecutionGraph. > This ExecutionGraph cache currently derives its timeout from > {*}web.refresh-interval{*}. The *web.refresh-interval* controls both the > refresh rate of the Flink dashboard and the ExecutionGraph cache timeout. > We should introduce a new configuration to control the ExecutionGraph cache, > namely {*}rest.cache.execution-graph.expiry{*}. > *Why* > Sharing configuration between REST handler and Flink dashboard is a sign that > we are coupling the two. > Ideally, we want our REST API behaviour to be independent of the Flink dashboard > (e.g. supports programmatic access). > > Mailing list discussion: > https://lists.apache.org/thread/7o330hfyoqqkkrfhtvz3kp448jcspjrm > >
[jira] [Commented] (FLINK-32488) Introduce configuration to control ExecutionGraph cache in REST API
[ https://issues.apache.org/jira/browse/FLINK-32488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763989#comment-17763989 ] xiangyu feng commented on FLINK-32488: -- Hi [~guoyangze], [~hejufang001] has created a PR ([https://github.com/apache/flink/pull/23387]) for this issue. Would you kindly assign this issue to him? > Introduce configuration to control ExecutionGraph cache in REST API > --- > > Key: FLINK-32488 > URL: https://issues.apache.org/jira/browse/FLINK-32488 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.16.2, 1.17.1 >Reporter: Hong Liang Teoh >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > *What* > Currently, REST handlers that inherit from AbstractExecutionGraphHandler > serve information derived from a cached ExecutionGraph. > This ExecutionGraph cache currently derives its timeout from > {*}web.refresh-interval{*}. The *web.refresh-interval* controls both the > refresh rate of the Flink dashboard and the ExecutionGraph cache timeout. > We should introduce a new configuration to control the ExecutionGraph cache, > namely {*}rest.cache.execution-graph.expiry{*}. > *Why* > Sharing configuration between REST handler and Flink dashboard is a sign that > we are coupling the two. > Ideally, we want our REST API behaviour to be independent of the Flink dashboard > (e.g. supports programmatic access). > > Mailing list discussion: > https://lists.apache.org/thread/7o330hfyoqqkkrfhtvz3kp448jcspjrm > >
[GitHub] [flink] flinkbot commented on pull request #23396: Cherry pick to release-1.18
flinkbot commented on PR #23396: URL: https://github.com/apache/flink/pull/23396#issuecomment-1714856384 ## CI report: * 7b43c3eacd85d79286bc47629e738f7e674a6374 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] fsk119 opened a new pull request, #23396: Cherry pick to release-1.18
fsk119 opened a new pull request, #23396: URL: https://github.com/apache/flink/pull/23396 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[jira] [Commented] (FLINK-32988) HiveITCase failed due to TestContainer not coming up
[ https://issues.apache.org/jira/browse/FLINK-32988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763984#comment-17763984 ] Shengkai Fang commented on FLINK-32988: --- Merged into master: 649b7fe197c8b03cce9595adcfea33c8d708a8b4 > HiveITCase failed due to TestContainer not coming up > > > Key: FLINK-32988 > URL: https://issues.apache.org/jira/browse/FLINK-32988 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0 >Reporter: Matthias Pohl >Assignee: Shengkai Fang >Priority: Critical > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52740=logs=87489130-75dc-54e4-1f45-80c30aa367a3=efbee0b1-38ac-597d-6466-1ea8fc908c50=15866 > {code} > Aug 29 02:47:56 org.testcontainers.containers.ContainerLaunchException: > Container startup failed for image prestodb/hive3.1-hive:10 > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:349) > Aug 29 02:47:56 at > org.apache.flink.tests.hive.containers.HiveContainer.doStart(HiveContainer.java:81) > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322) > Aug 29 02:47:56 at > org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1131) > Aug 29 02:47:56 at > org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:28) > Aug 29 02:47:56 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 29 02:47:56 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Aug 29 02:47:56 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 29 02:47:56 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Aug 29 02:47:56 at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > Aug 29 02:47:56 at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > Aug 29 02:47:56 at > 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > Aug 29 02:47:56 at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > Aug 29 02:47:56 at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86) > Aug 29 02:47:56 at > org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:188) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:154) > Aug 29 02:47:56 at > org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:128) > {code}
[GitHub] [flink] fsk119 closed pull request #23370: [FLINK-32731][e2e] Fix NameNode uses port 9870 in hadoop3
fsk119 closed pull request #23370: [FLINK-32731][e2e] Fix NameNode uses port 9870 in hadoop3 URL: https://github.com/apache/flink/pull/23370
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r132058 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param <R> the type of the request body + * @param <P> the type of the response body + * @param <M> the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator<R, P, M> { Review Comment: Thanks, updated correspondingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-33073) Implement end-to-end tests for the Kinesis Streams Sink
[ https://issues.apache.org/jira/browse/FLINK-33073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Liang Teoh reassigned FLINK-33073: --- Assignee: Hong Liang Teoh > Implement end-to-end tests for the Kinesis Streams Sink > --- > > Key: FLINK-33073 > URL: https://issues.apache.org/jira/browse/FLINK-33073 > Project: Flink > Issue Type: Sub-task > Components: Connectors / AWS >Reporter: Hong Liang Teoh >Assignee: Hong Liang Teoh >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > *What* > Implement end-to-end tests for KinesisStreamsSink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33072) Implement end-to-end tests for AWS Kinesis Connectors
[ https://issues.apache.org/jira/browse/FLINK-33072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Liang Teoh updated FLINK-33072: Description: *What* We want to implement end-to-end tests that target real Kinesis Data Streams. *Why* This solidifies our testing to ensure we pick up any integration issues with Kinesis Data Streams API. We especially want to test happy cases and failure cases to ensure those cases are handled as expected by the KDS connector. Reference: https://issues.apache.org/jira/browse/INFRA-24474 was: *What* We want to implement end-to-end tests that target real Kinesis Data Streams. *Why* This solidifies our testing to ensure we pick up any integration issues with Kinesis Data Streams API. We especially want to test happy cases and failure cases to ensure those cases are handled as expected by the KDS connector. > Implement end-to-end tests for AWS Kinesis Connectors > - > > Key: FLINK-33072 > URL: https://issues.apache.org/jira/browse/FLINK-33072 > Project: Flink > Issue Type: Improvement > Components: Connectors / AWS >Reporter: Hong Liang Teoh >Priority: Major > Fix For: 2.0.0 > > > *What* > We want to implement end-to-end tests that target real Kinesis Data Streams. > *Why* > This solidifies our testing to ensure we pick up any integration issues with > Kinesis Data Streams API. > We especially want to test happy cases and failure cases to ensure those > cases are handled as expected by the KDS connector. > > Reference: https://issues.apache.org/jira/browse/INFRA-24474 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33073) Implement end-to-end tests for the Kinesis Streams Sink
[ https://issues.apache.org/jira/browse/FLINK-33073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33073: --- Labels: pull-request-available (was: ) > Implement end-to-end tests for the Kinesis Streams Sink > --- > > Key: FLINK-33073 > URL: https://issues.apache.org/jira/browse/FLINK-33073 > Project: Flink > Issue Type: Sub-task > Components: Connectors / AWS >Reporter: Hong Liang Teoh >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > *What* > Implement end-to-end tests for KinesisStreamsSink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-connector-aws] hlteoh37 opened a new pull request, #94: [FLINK-33073][Connectors/AWS] Implement end-to-end tests for KinesisS…
hlteoh37 opened a new pull request, #94: URL: https://github.com/apache/flink-connector-aws/pull/94 …treamsSink ## Purpose of the change Implement end-to-end tests for KinesisStreamsSink. ## Verifying this change This change added tests and can be verified as follows: - Added end-to-end tests ## Significant changes *(Please check any boxes [x] if the answer is "yes". You can first publish the PR and check them afterwards, for convenience.)* - [ ] Dependencies have been added or upgraded - [ ] Public API has been changed (Public API is any class annotated with `@Public(Evolving)`) - [ ] Serializers have been changed - [ ] New feature has been introduced - If yes, how is this documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33073) Implement end-to-end tests for the Kinesis Streams Sink
Hong Liang Teoh created FLINK-33073: --- Summary: Implement end-to-end tests for the Kinesis Streams Sink Key: FLINK-33073 URL: https://issues.apache.org/jira/browse/FLINK-33073 Project: Flink Issue Type: Sub-task Components: Connectors / AWS Reporter: Hong Liang Teoh Fix For: 2.0.0 *What* Implement end-to-end tests for KinesisStreamsSink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33072) Implement end-to-end tests for AWS Kinesis Connectors
Hong Liang Teoh created FLINK-33072: --- Summary: Implement end-to-end tests for AWS Kinesis Connectors Key: FLINK-33072 URL: https://issues.apache.org/jira/browse/FLINK-33072 Project: Flink Issue Type: Improvement Components: Connectors / AWS Reporter: Hong Liang Teoh Fix For: 2.0.0 *What* We want to implement end-to-end tests that target real Kinesis Data Streams. *Why* This solidifies our testing to ensure we pick up any integration issues with Kinesis Data Streams API. We especially want to test happy cases and failure cases to ensure those cases are handled as expected by the KDS connector. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] tweise commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
tweise commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1322170417 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param <R> the type of the request body + * @param <P> the type of the response body + * @param <M> the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator<R, P, M> { Review Comment: +1, that would probably make it a bit easier to understand
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1322163343 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param <R> the type of the request body + * @param <P> the type of the response body + * @param <M> the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator<R, P, M> { Review Comment: @tweise @mxm, I can get rid of `MonitoringApiMessageHeaders` and make `getSupportedAPIVersions` a public method of `UrlPrefixDecorator`. Please let me know if you want me to do so. Thanks.
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1322162928 ## flink-clients/src/main/java/org/apache/flink/client/program/rest/UrlPrefixDecorator.java: ## @@ -41,7 +40,7 @@ */ public class UrlPrefixDecorator< R extends RequestBody, P extends ResponseBody, M extends MessageParameters> -implements SqlGatewayMessageHeaders<R, P, M> { Review Comment: `SQLGatewayUrlPrefixDecorator`, the subclass of `UrlPrefixDecorator`, overrides `getSupportedAPIVersions` and does what `SqlGatewayMessageHeaders::getSupportedAPIVersions` was doing.
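The decorator arrangement discussed in this thread can be illustrated with a stdlib-only sketch. The types below are deliberately reduced stand-ins for Flink's `MessageHeaders`/`UrlPrefixDecorator` (the real interfaces carry request/response/parameter type arguments and REST version enums, which are omitted here); only the URL prefixing and the overridden `getSupportedAPIVersions` behaviour are modelled.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Reduced stand-in for Flink's MessageHeaders: one URL accessor plus a
// default set of supported API versions.
interface MessageHeaders {
    String getTargetRestEndpointURL();

    default Collection<String> getSupportedAPIVersions() {
        return List.of("v1");
    }
}

// Decorator: prepends a prefix to the wrapped headers' URL while delegating
// everything else, preserving the original behaviour.
class UrlPrefixDecorator implements MessageHeaders {
    private final MessageHeaders decorated;
    private final String prefix;

    UrlPrefixDecorator(MessageHeaders decorated, String prefix) {
        this.decorated = decorated;
        this.prefix = prefix;
    }

    @Override
    public String getTargetRestEndpointURL() {
        return prefix + decorated.getTargetRestEndpointURL();
    }

    @Override
    public Collection<String> getSupportedAPIVersions() {
        return decorated.getSupportedAPIVersions();
    }
}

// Subclass overriding getSupportedAPIVersions, mirroring how
// SQLGatewayUrlPrefixDecorator replaces what SqlGatewayMessageHeaders
// used to provide. The version strings here are made up.
class SqlGatewayUrlPrefixDecorator extends UrlPrefixDecorator {
    SqlGatewayUrlPrefixDecorator(MessageHeaders decorated, String prefix) {
        super(decorated, prefix);
    }

    @Override
    public Collection<String> getSupportedAPIVersions() {
        return Arrays.asList("sql-gateway-v1", "sql-gateway-v2");
    }
}

public class UrlPrefixDemo {
    public static void main(String[] args) {
        MessageHeaders headers = () -> "/jobs/overview";
        MessageHeaders decorated = new SqlGatewayUrlPrefixDecorator(headers, "/gateway");
        System.out.println(decorated.getTargetRestEndpointURL());
        // prints /gateway/jobs/overview
        System.out.println(decorated.getSupportedAPIVersions());
    }
}
```

The design point under review is visible here: once the supported-versions hook lives on the decorator itself, the intermediate marker interface is no longer needed.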
[GitHub] [flink-connector-elasticsearch] mtfelisb commented on a diff in pull request #53: [FLINK-26088][Connectors/ElasticSearch] Add Elasticsearch 8.0 support
mtfelisb commented on code in PR #53: URL: https://github.com/apache/flink-connector-elasticsearch/pull/53#discussion_r1322154473 ## flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/INetworkConfigFactory.java: ## @@ -0,0 +1,29 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + * + */ +package org.apache.flink.connector.elasticsearch.sink; + +import co.elastic.clients.elasticsearch.ElasticsearchAsyncClient; + +import java.io.Serializable; + +public interface INetworkConfigFactory extends Serializable { Review Comment: You're right. Just removed it! Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-24241) test_table_environment_api.py fail with NPE
[ https://issues.apache.org/jira/browse/FLINK-24241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-24241: --- Labels: stale-major test-stability (was: test-stability) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Major but is unassigned and neither itself nor its Sub-Tasks have been updated for 60 days. I have gone ahead and added a "stale-major" label to the issue. If this ticket is a Major, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized. > test_table_environment_api.py fail with NPE > --- > > Key: FLINK-24241 > URL: https://issues.apache.org/jira/browse/FLINK-24241 > Project: Flink > Issue Type: Bug > Components: API / Python, Table SQL / Planner >Affects Versions: 1.14.0, 1.15.0 >Reporter: Xintong Song >Priority: Major > Labels: stale-major, test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23876=logs=821b528f-1eed-5598-a3b4-7f748b13f261=6bb545dd-772d-5d8c-f258-f5085fba3295=23263 > {code} > Sep 10 03:03:39 E py4j.protocol.Py4JJavaError: An error > occurred while calling o16211.execute. 
> Sep 10 03:03:39 E : java.lang.NullPointerException > Sep 10 03:03:39 E at > java.util.Objects.requireNonNull(Objects.java:203) > Sep 10 03:03:39 E at > org.apache.calcite.rel.metadata.RelMetadataQuery.(RelMetadataQuery.java:144) > Sep 10 03:03:39 E at > org.apache.calcite.rel.metadata.RelMetadataQuery.(RelMetadataQuery.java:108) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.metadata.FlinkRelMetadataQuery.(FlinkRelMetadataQuery.java:78) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.metadata.FlinkRelMetadataQuery.instance(FlinkRelMetadataQuery.java:59) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.calcite.FlinkRelOptClusterFactory$$anon$1.get(FlinkRelOptClusterFactory.scala:39) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.calcite.FlinkRelOptClusterFactory$$anon$1.get(FlinkRelOptClusterFactory.scala:38) > Sep 10 03:03:39 E at > org.apache.calcite.plan.RelOptCluster.getMetadataQuery(RelOptCluster.java:178) > Sep 10 03:03:39 E at > org.apache.calcite.rel.metadata.RelMdUtil.clearCache(RelMdUtil.java:965) > Sep 10 03:03:39 E at > org.apache.calcite.plan.hep.HepPlanner.buildFinalPlan(HepPlanner.java:942) > Sep 10 03:03:39 E at > org.apache.calcite.plan.hep.HepPlanner.buildFinalPlan(HepPlanner.java:939) > Sep 10 03:03:39 E at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.optimize.program.FlinkGroupProgram$$anonfun$optimize$1$$anonfun$apply$1.apply(FlinkGroupProgram.scala:63) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.optimize.program.FlinkGroupProgram$$anonfun$optimize$1$$anonfun$apply$1.apply(FlinkGroupProgram.scala:60) > Sep 10 03:03:39 E at > 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157) > Sep 10 03:03:39 E at > scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157) > Sep 10 03:03:39 E at > scala.collection.Iterator$class.foreach(Iterator.scala:891) > Sep 10 03:03:39 E at > scala.collection.AbstractIterator.foreach(Iterator.scala:1334) > Sep 10 03:03:39 E at > scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > Sep 10 03:03:39 E at > scala.collection.AbstractIterable.foreach(Iterable.scala:54) > Sep 10 03:03:39 E at > scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157) > Sep 10 03:03:39 E at > scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104) > Sep 10 03:03:39 E at > org.apache.flink.table.planner.plan.optimize.program.FlinkGroupProgram$$anonfun$optimize$1.apply(FlinkGroupProgram.scala:60) > Sep 10 03:03:39 E at >
[jira] [Updated] (FLINK-31857) Support pluginable observers mechanism
[ https://issues.apache.org/jira/browse/FLINK-31857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-31857: --- Labels: auto-deprioritized-major pull-request-available (was: pull-request-available stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Support pluginable observers mechanism > -- > > Key: FLINK-31857 > URL: https://issues.apache.org/jira/browse/FLINK-31857 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.5.0 >Reporter: Daren Wong >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > > Currently, Kubernetes Operator uses AbstractFlinkResourceObserver to observe > Flink Job state, cluster state, JM/TM state, etc and update Custom Resource: > FlinkDeployment, FlinkSessionJob with the gathered info. > This Jira is to introduce a Flink plugin to allow user to implement custom > observer logic after the default observer logic is run. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32938) flink-connector-pulsar should remove all `PulsarAdmin` calls
[ https://issues.apache.org/jira/browse/FLINK-32938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32938: --- Labels: pull-request-available (was: ) > flink-connector-pulsar should remove all `PulsarAdmin` calls > > > Key: FLINK-32938 > URL: https://issues.apache.org/jira/browse/FLINK-32938 > Project: Flink > Issue Type: Improvement > Components: Connectors / Pulsar >Reporter: Neng Lu >Priority: Major > Labels: pull-request-available > > The flink-connector-pulsar should not access and interact with the admin > endpoint. This could introduce potential security issues. > In a production environment, a Pulsar cluster admin will not grant the > permissions for the flink application to conduct any admin operations. > Currently, the connector does various admin calls: > ``` > PulsarAdmin.topics().getPartitionedTopicMetadata(topic) > PulsarAdmin.namespaces().getTopics(namespace) > PulsarAdmin.topics().getLastMessageId(topic) > PulsarAdmin.topics().getMessageIdByTimestamp(topic, timestamp) > PulsarAdmin.topics().getSubscriptions(topic) > PulsarAdmin.topics().createSubscription(topic, subscription, > MessageId.earliest) > PulsarAdmin.topics().resetCursor(topic, subscription, initial, !include) > ``` > We need to replace these calls with consumer or client calls.
[GitHub] [flink-connector-pulsar] nlu90 commented on pull request #59: [FLINK-32938] Replace pulsar admin calls
nlu90 commented on PR #59: URL: https://github.com/apache/flink-connector-pulsar/pull/59#issuecomment-1714662290 Can someone help rerun the CI? On my local laptop, everything runs fine:
```
-> mvn clean verify
...
[INFO]
[INFO] Reactor Summary for Flink : Connectors : Pulsar : Parent 4.0-SNAPSHOT:
[INFO]
[INFO] Flink : Connectors : Pulsar : Parent ... SUCCESS [ 1.819 s]
[INFO] Flink : Connectors : Pulsar SUCCESS [27:41 min]
[INFO] Flink : Connectors : Pulsar : SQL .. SUCCESS [ 5.184 s]
[INFO] Flink : Connectors : Pulsar : E2E Tests SUCCESS [ 1.157 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 27:50 min
[INFO] Finished at: 2023-09-11T15:01:51-07:00
[INFO]
```
[GitHub] [flink] flinkbot commented on pull request #23395: [FLINK-33058][formats] Add encoding option to Avro format
flinkbot commented on PR #23395: URL: https://github.com/apache/flink/pull/23395#issuecomment-1714582332 ## CI report: * 858c3eebae767ea71e1d43fc5e8f4af88f924c4a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-33058) Support for JSON-encoded Avro
[ https://issues.apache.org/jira/browse/FLINK-33058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33058: --- Labels: avro flink flink-formats pull-request-available (was: avro flink flink-formats) > Support for JSON-encoded Avro > - > > Key: FLINK-33058 > URL: https://issues.apache.org/jira/browse/FLINK-33058 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Dale Lane >Priority: Minor > Labels: avro, flink, flink-formats, pull-request-available > > Avro supports two serialization encoding methods: binary and JSON > cf. [https://avro.apache.org/docs/1.11.1/specification/#encodings] > flink-avro currently has a hard-coded assumption that Avro data is > binary-encoded (and cannot process Avro data that has been JSON-encoded). > I propose adding a new optional format option to flink-avro: *avro.encoding* > It will support two options: 'binary' and 'json'. > If unset, it will default to 'binary' to maintain compatibility/consistency > with current behaviour.
[GitHub] [flink] dalelane opened a new pull request, #23395: [FLINK-33058][formats] Add encoding option to Avro format
dalelane opened a new pull request, #23395: URL: https://github.com/apache/flink/pull/23395 ## What is the purpose of the change Initially proposed in https://issues.apache.org/jira/browse/FLINK-33058 Avro supports two serialization encoding methods: binary and JSON (cf. [Avro docs](https://avro.apache.org/docs/1.11.1/specification/#encodings)) flink-avro currently has a hard-coded assumption that Avro data is binary-encoded (and cannot process Avro data that has been JSON-encoded). This pull request introduces a new optional format option to flink-avro: `avro.encoding` It supports two options: 'binary' and 'json'. If unset, it will default to 'binary' to maintain compatibility/consistency with current behaviour. ## Brief change log Flink uses Avro `[Decoder](https://avro.apache.org/docs/1.8.2/api/java/org/apache/avro/io/Decoder.html)` and `[Encoder](https://avro.apache.org/docs/1.8.2/api/java/org/apache/avro/io/Encoder.html)` classes for deserializing/serializing Avro data. However it was hard-coding the use of factory classes to only use the binary-encoding implementations of these abstract classes. (`DecoderFactory.get().binaryDecoder` and `EncoderFactory.get().directBinaryEncoder`) In this pull request, I'm using the value of the new `avro.encoding` option to create the JSON Decoder/Encoder classes where appropriate. ## Verifying this change This change modified existing tests by re-running all of the tests that perform Avro serialization/deserialization, repeating each test with both binary and JSON encoding. This verifies that the existing binary behaviour is unaffected by the new option, as well as the new JSON support. I've also manually verified the new support using Flink SQL such as: ```sql CREATE TABLE JSONAVRORAW ( ... my columns ... 
) WITH ( 'connector' = 'kafka', 'topic' = 'MY.TOPIC', 'properties.bootstrap.servers' = 'localhost:9092', 'properties.group.id' = 'my-group', 'scan.startup.mode' = 'earliest-offset', 'format' = 'avro', 'avro.encoding' = 'json' ); ``` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes - The serializers: no - The runtime per-record code paths (performance sensitive): yes - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? update to the Avro format page to explain the new option -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
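The encoder/decoder selection described in the change log above can be sketched outside Flink with plain Avro. This is a hedged illustration using Avro's public `EncoderFactory`/`DecoderFactory` API (it assumes the `org.apache.avro:avro` library on the classpath); it is not Flink's actual wiring, and the schema and record here are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

public class AvroEncodingDemo {
    public static void main(String[] args) throws Exception {
        // A made-up single-field record schema for illustration.
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
                        + "{\"name\":\"name\",\"type\":\"string\"}]}");
        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "dale");

        GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);

        // Binary encoding: the factory call flink-avro previously hard-coded.
        ByteArrayOutputStream bin = new ByteArrayOutputStream();
        Encoder binaryEncoder = EncoderFactory.get().directBinaryEncoder(bin, null);
        writer.write(user, binaryEncoder);
        binaryEncoder.flush();

        // JSON encoding: what 'avro.encoding' = 'json' selects instead.
        ByteArrayOutputStream json = new ByteArrayOutputStream();
        Encoder jsonEncoder = EncoderFactory.get().jsonEncoder(schema, json);
        writer.write(user, jsonEncoder);
        jsonEncoder.flush();
        System.out.println(json.toString("UTF-8"));

        // Reading back requires the matching decoder from DecoderFactory.
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        Decoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, json.toString("UTF-8"));
        System.out.println(reader.read(null, jsonDecoder).get("name"));
    }
}
```

The same record produces very different bytes under the two encoders, which is why the decoder choice must match the producer's encoding — the point of making it configurable rather than assumed.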
[jira] [Commented] (FLINK-29042) Support lookup join for es connector
[ https://issues.apache.org/jira/browse/FLINK-29042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763915#comment-17763915 ] Sergey Nuyanzin commented on FLINK-29042: - Merged as [fef5c964f4a2fc8b61c482b0bc8504757581adfb|https://github.com/apache/flink-connector-elasticsearch/commit/fef5c964f4a2fc8b61c482b0bc8504757581adfb] > Support lookup join for es connector > > > Key: FLINK-29042 > URL: https://issues.apache.org/jira/browse/FLINK-29042 > Project: Flink > Issue Type: New Feature > Components: Connectors / ElasticSearch >Affects Versions: 1.16.0 >Reporter: Yubin Li >Assignee: Yubin Li >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: elasticsearch-4.0.0 > > > Now the es connector can only be used as a sink, but in many business > scenarios we treat es as an index database, so we should support making it > lookupable in Flink.
[GitHub] [flink] flinkbot commented on pull request #23394: [BP-1.16][FLINK-33010][table] GREATEST/LEAST should work with other functions as input
flinkbot commented on PR #23394: URL: https://github.com/apache/flink/pull/23394#issuecomment-1714538178 ## CI report: * fa40b4f52627a9b2fc32b48b138310311d211a6e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Closed] (FLINK-29042) Support lookup join for es connector
[ https://issues.apache.org/jira/browse/FLINK-29042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin closed FLINK-29042. --- Fix Version/s: elasticsearch-4.0.0 Resolution: Fixed > Support lookup join for es connector > > > Key: FLINK-29042 > URL: https://issues.apache.org/jira/browse/FLINK-29042 > Project: Flink > Issue Type: New Feature > Components: Connectors / ElasticSearch >Affects Versions: 1.16.0 >Reporter: Yubin Li >Assignee: Yubin Li >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: elasticsearch-4.0.0 > > > Now the es connector can only be used as a sink, but in many business > scenarios we treat es as an index database, so we should support making it > lookupable in Flink.
[GitHub] [flink-connector-elasticsearch] boring-cyborg[bot] commented on pull request #39: [FLINK-29042][Connectors/ElasticSearch] Support lookup join for es connector
boring-cyborg[bot] commented on PR #39: URL: https://github.com/apache/flink-connector-elasticsearch/pull/39#issuecomment-1714536181 Awesome work, congrats on your first merged pull request!
[GitHub] [flink-connector-elasticsearch] snuyanzin merged pull request #39: [FLINK-29042][Connectors/ElasticSearch] Support lookup join for es connector
snuyanzin merged PR #39: URL: https://github.com/apache/flink-connector-elasticsearch/pull/39
[GitHub] [flink] flinkbot commented on pull request #23393: [BP-1.18][FLINK-33010][table] GREATEST/LEAST should work with other functions as input
flinkbot commented on PR #23393: URL: https://github.com/apache/flink/pull/23393#issuecomment-1714524505 ## CI report: * e296671f364fdeda71054bc090ad35e3f77ad07b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on pull request #23392: [BP-1.17][FLINK-33010][table] GREATEST/LEAST should work with other functions as input
flinkbot commented on PR #23392: URL: https://github.com/apache/flink/pull/23392#issuecomment-1714518365 ## CI report: * d7e7c703c9a23d18f5c44a0c8b42db7d6afe46a8 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Resolved] (FLINK-32999) Remove HBase connector from master branch
[ https://issues.apache.org/jira/browse/FLINK-32999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin resolved FLINK-32999. - Fix Version/s: 1.18.0 1.19.0 Resolution: Fixed > Remove HBase connector from master branch > - > > Key: FLINK-32999 > URL: https://issues.apache.org/jira/browse/FLINK-32999 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase >Reporter: Sergey Nuyanzin >Assignee: Ferenc Csaky >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > The connector was externalized at FLINK-30061 > Once it is released it would make sense to remove it from master branch
[jira] [Commented] (FLINK-32999) Remove HBase connector from master branch
[ https://issues.apache.org/jira/browse/FLINK-32999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763905#comment-17763905 ] Sergey Nuyanzin commented on FLINK-32999: - Merged to 1.18 as [53981d81d378770f8a1f87717c5d80b35e4010d7|https://github.com/apache/flink/commit/53981d81d378770f8a1f87717c5d80b35e4010d7] master: [62e41acdf56cccde53cc6226b2cc7c0bc3cda3e6|https://github.com/apache/flink/commit/62e41acdf56cccde53cc6226b2cc7c0bc3cda3e6] > Remove HBase connector from master branch > - > > Key: FLINK-32999 > URL: https://issues.apache.org/jira/browse/FLINK-32999 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase >Reporter: Sergey Nuyanzin >Assignee: Ferenc Csaky >Priority: Major > Labels: pull-request-available > > The connector was externalized at FLINK-30061 > Once it is released it would make sense to remove it from master branch
[GitHub] [flink] snuyanzin merged pull request #23343: [BP-1.18][FLINK-32999][connectors/hbase] Remove HBase connector code from main repo
snuyanzin merged PR #23343: URL: https://github.com/apache/flink/pull/23343
[GitHub] [flink] rkhachatryan commented on a diff in pull request #23391: [FLINK-33071][metrics,checkpointing] Log a JSON dump of checkpoint statistics when a checkpoint completes or fails
rkhachatryan commented on code in PR #23391: URL: https://github.com/apache/flink/pull/23391#discussion_r1321980461 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsTracker.java: ## @@ -227,11 +257,23 @@ void reportFailedCheckpoint(FailedCheckpointStats failed) { history.replacePendingCheckpointById(failed); dirty = true; +logCheckpointStatistics(failed); } finally { statsReadWriteLock.unlock(); } } +private void logCheckpointStatistics(AbstractCheckpointStats checkpointStats) { +try { +StringWriter sw = new StringWriter(); +MAPPER.writeValue(sw, CheckpointStatistics.generateCheckpointStatistics(checkpointStats, true)); +String jsonDump = sw.toString(); +LOG.debug("CheckpointStatistics (for jobID={}, jobName={}) dump = {} ", jobID, jobName, jsonDump); Review Comment: How about explicitly logging checkpoint ID along with job ID? So that we don't need to parse/know json to get stats for a specific checkpoint. ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsTracker.java: ## @@ -227,11 +257,23 @@ void reportFailedCheckpoint(FailedCheckpointStats failed) { history.replacePendingCheckpointById(failed); dirty = true; +logCheckpointStatistics(failed); } finally { statsReadWriteLock.unlock(); } } +private void logCheckpointStatistics(AbstractCheckpointStats checkpointStats) { +try { +StringWriter sw = new StringWriter(); +MAPPER.writeValue(sw, CheckpointStatistics.generateCheckpointStatistics(checkpointStats, true)); +String jsonDump = sw.toString(); Review Comment: Should we save some resources by guarding this with isDebugEnabled? 
## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsTracker.java: ## @@ -227,11 +257,23 @@ void reportFailedCheckpoint(FailedCheckpointStats failed) { history.replacePendingCheckpointById(failed); dirty = true; +logCheckpointStatistics(failed); } finally { statsReadWriteLock.unlock(); } } +private void logCheckpointStatistics(AbstractCheckpointStats checkpointStats) { +try { +StringWriter sw = new StringWriter(); +MAPPER.writeValue(sw, CheckpointStatistics.generateCheckpointStatistics(checkpointStats, true)); +String jsonDump = sw.toString(); +LOG.debug("CheckpointStatistics (for jobID={}, jobName={}) dump = {} ", jobID, jobName, jsonDump); +} catch (Exception ex) { +LOG.error("Fail to log CheckpointStatistics", ex); Review Comment: nit: I'd log it as a warning
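Taken together, the reviewer's three suggestions (skip serialization unless debug logging is enabled, log the checkpoint ID alongside the job ID, and downgrade the failure log to a warning) could look roughly like the sketch below. This is illustrative only: it uses `java.util.logging` and stand-in types (`CheckpointStats`, `serialize`) rather than Flink's actual `AbstractCheckpointStats` and Jackson machinery.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CheckpointStatsLogging {
    private static final Logger LOG = Logger.getLogger(CheckpointStatsLogging.class.getName());

    // Hypothetical stand-in for AbstractCheckpointStats; fields are illustrative only.
    static class CheckpointStats {
        final long checkpointId;
        CheckpointStats(long checkpointId) { this.checkpointId = checkpointId; }
    }

    // Placeholder for the Jackson-based JSON dump in the PR.
    static String serialize(CheckpointStats stats) {
        return "{\"id\":" + stats.checkpointId + "}";
    }

    static void logCheckpointStatistics(CheckpointStats stats, String jobId) {
        // Suggestion 1: skip the (relatively expensive) serialization entirely
        // unless debug-level logging is actually enabled.
        if (!LOG.isLoggable(Level.FINE)) {
            return;
        }
        try {
            String jsonDump = serialize(stats);
            // Suggestion 2: include the checkpoint ID so the entry for a specific
            // checkpoint can be found without parsing the JSON payload.
            LOG.fine("CheckpointStatistics (jobID=" + jobId
                    + ", checkpointID=" + stats.checkpointId + ") dump = " + jsonDump);
        } catch (Exception ex) {
            // Suggestion 3: a failure to log stats is a warning, not an error.
            LOG.log(Level.WARNING, "Failed to log CheckpointStatistics", ex);
        }
    }

    public static void main(String[] args) {
        logCheckpointStatistics(new CheckpointStats(42L), "cf70776f");
        System.out.println(serialize(new CheckpointStats(42L))); // prints {"id":42}
    }
}
```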
[jira] [Assigned] (FLINK-33066) Enable to inject environment variable from secret/configmap to operatorPod
[ https://issues.apache.org/jira/browse/FLINK-33066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-33066: -- Assignee: dongwoo.kim > Enable to inject environment variable from secret/configmap to operatorPod > -- > > Key: FLINK-33066 > URL: https://issues.apache.org/jira/browse/FLINK-33066 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: dongwoo.kim >Assignee: dongwoo.kim >Priority: Minor > > Hello, I've been working with the Flink Kubernetes operator and noticed that > the {{operatorPod.env}} only allows for simple key-value pairs and doesn't > support Kubernetes {{valueFrom}} syntax. > How about changing the template to support more varied k8s syntax? > *Current template* > {code:java} > {{- range $k, $v := .Values.operatorPod.env }} > - name: {{ $v.name | quote }} > value: {{ $v.value | quote }} > {{- end }}{code} > > *Proposed template* > 1) Modify the template like below > {code:java} > {{- with .Values.operatorPod.env }} > {{- toYaml . | nindent 12 }} > {{- end }} > {code} > 2) Create an extra config, *Values.operatorPod.envFrom*, and utilize it > > I'd be happy to implement this update if it's approved. > Thanks in advance.
[jira] [Commented] (FLINK-33066) Enable to inject environment variable from secret/configmap to operatorPod
[ https://issues.apache.org/jira/browse/FLINK-33066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763887#comment-17763887 ] Gyula Fora commented on FLINK-33066: I think this would be a nice improvement :) > Enable to inject environment variable from secret/configmap to operatorPod > -- > > Key: FLINK-33066 > URL: https://issues.apache.org/jira/browse/FLINK-33066 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: dongwoo.kim >Priority: Minor > > Hello, I've been working with the Flink Kubernetes operator and noticed that > the {{operatorPod.env}} only allows for simple key-value pairs and doesn't > support Kubernetes {{valueFrom}} syntax. > How about changing the template to support more varied k8s syntax? > *Current template* > {code:java} > {{- range $k, $v := .Values.operatorPod.env }} > - name: {{ $v.name | quote }} > value: {{ $v.value | quote }} > {{- end }}{code} > > *Proposed template* > 1) Modify the template like below > {code:java} > {{- with .Values.operatorPod.env }} > {{- toYaml . | nindent 12 }} > {{- end }} > {code} > 2) Create an extra config, *Values.operatorPod.envFrom*, and utilize it > > I'd be happy to implement this update if it's approved. > Thanks in advance.
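For illustration, with the proposed `toYaml`-based template a values file like the following (all names hypothetical) would render correctly, since the entries are passed through verbatim instead of being forced into the plain name/value shape:

```yaml
operatorPod:
  env:
    # plain key-value pair (works with the current template too)
    - name: FOO
      value: bar
    # valueFrom syntax (only possible with the proposed toYaml pass-through)
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: my-secret        # hypothetical secret name
          key: password
  # proposal 2: a separate envFrom config for bulk injection
  envFrom:
    - configMapRef:
        name: my-configmap       # hypothetical configmap name
```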
[GitHub] [flink] pgaref commented on a diff in pull request #23386: [FLINK-33022] Log an error when enrichers defined as part of the configuration can not be found/loaded
pgaref commented on code in PR #23386: URL: https://github.com/apache/flink/pull/23386#discussion_r1321976643 ## flink-runtime/src/main/java/org/apache/flink/runtime/failure/FailureEnricherUtils.java: ## @@ -106,23 +108,34 @@ static Collection getFailureEnrichers( } } +if (!includedEnrichers.values().stream().allMatch(Boolean::booleanValue)) { Review Comment: I agree, I went back and forth between changing the method signature but this simplifies things a lot. Updated, and thanks ;)
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321874986 ## flink-clients/src/main/java/org/apache/flink/client/program/rest/UrlPrefixDecorator.java: ## @@ -41,7 +40,7 @@ */ public class UrlPrefixDecorator< R extends RequestBody, P extends ResponseBody, M extends MessageParameters> -implements SqlGatewayMessageHeaders { Review Comment: It is used in other places not related to this PR. The reason `UrlPrefixDecorator` implemented `SqlGatewayMessageHeaders` was to fetch `SqlGatewayRestAPIVersion::isStableVersion`, as `UrlPrefixDecorator` was initially meant to be used only with FlinkSQLGateway. Now I intend to move `UrlPrefixDecorator` into the `flink-clients` module, so that `UrlPrefixDecorator` can also be used by RestClusterClient for PyFlink remote execution. `flink-sql-gateway` has a dependency on the `flink-clients` module, and thus will be able to extend the `UrlPrefixDecorator` class and use it as well.
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321869568 ## flink-clients/src/main/java/org/apache/flink/client/program/rest/MonitoringApiMessageHeaders.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.client.program.rest; + +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.runtime.rest.versioning.RuntimeRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class links {@link RequestBody}s to {@link ResponseBody}s types and contains meta-data + * required for their http headers in runtime module. + * + * Implementations must be state-less. 
+ * + * @param request message type + * @param response message type + * @param message parameters type + */ +public interface MonitoringApiMessageHeaders< Review Comment: This interface provides a method `getSupportedAPIVersions` which enables fetching required stable REST versions. For example for Flink SQL Gateway it can be `SqlGatewayRestAPIVersion::isStableVersion`, but for PyFlink or other cases it will be `RuntimeRestAPIVersion::isStableVersion`
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321868038 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param the type of the request body + * @param the type of the response body + * @param the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator { Review Comment: The overridden method `getSupportedAPIVersions()` of SQLGatewayUrlPrefixDecorator will return SQL Gateway's stable REST API versions. UrlPrefixDecorator will be generic and can be used on the PyFlink side, which will need to fetch RuntimeRestAPIVersion stable versions.
[GitHub] [flink] elkhand commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
elkhand commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321865428 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param the type of the request body + * @param the type of the response body + * @param the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator { Review Comment: The overridden method `getSupportedAPIVersions()` of SQLGatewayUrlPrefixDecorator will return SQL Gateway's stable REST API versions. UrlPrefixDecorator will be generic and can be used on the PyFlink side, which will need to fetch different stable rest API versions.
[GitHub] [flink] flinkbot commented on pull request #23391: [FLINK-33071][metrics,checkpointing] Log a JSON dump of checkpoint statistics when a checkpoint completes or fails
flinkbot commented on PR #23391: URL: https://github.com/apache/flink/pull/23391#issuecomment-1714278117 ## CI report: * 7571dc3f4381b14c4764a19de346fd4e44bb59dc UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] pnowojski opened a new pull request, #23391: [FLINK-33071][metrics,checkpointing] Log a JSON dump of checkpoint statistics when a checkpoint completes or fails
pnowojski opened a new pull request, #23391: URL: https://github.com/apache/flink/pull/23391 ## What is the purpose of the change This adds a DEBUG level log message that contains all checkpoint stats on every completed/failed checkpoint. A stop-gap solution until we come up with a proper way to expose those stats as metrics somewhere (FLINK-23411). ## Verifying this change I manually checked log output in some of the ITCases: ``` 8299 [jobmanager-io-thread-10] DEBUG org.apache.flink.runtime.checkpoint.CheckpointStatsTracker [] - CheckpointStatistics (for jobID=cf70776f3762dd273961bd57a2dbdc62, jobName=Flink Streaming Job) dump = {"className":"completed","id":2,"status":"COMPLETED","is_savepoint":false,"savepointFormat":null,"trigger_timestamp":1694452019812,"latest_ack_timestamp":1694452019823,"checkpointed_size":0,"state_size":0,"end_to_end_duration":11,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":10,"num_acknowledged_subtasks":10,"checkpoint_type":"CHECKPOINT","tasks":{"feca28aff5a3958840bee985ee7de4d3":{"id":2,"status":"COMPLETED","latest_ack_timestamp":1694452019823,"checkpointed_size":0,"state_size":0,"end_to_end_duration":11,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":1,"num_acknowledged_subtasks":1},"8b9f3e513101bcc5d198aee685286d98":{"id":2,"status":"COMPLETED","latest_ack_timestamp":1694452019823,"checkpointed_size":0,"state_size":0,"e 
nd_to_end_duration":11,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":4,"num_acknowledged_subtasks":4},"51397532e2d9c7a21097a30d590b3114":{"id":2,"status":"COMPLETED","latest_ack_timestamp":1694452019823,"checkpointed_size":0,"state_size":0,"end_to_end_duration":11,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":4,"num_acknowledged_subtasks":4},"bc764cd8ddf7a0cff126f51c16239658":{"id":2,"status":"COMPLETED","latest_ack_timestamp":1694452019823,"checkpointed_size":0,"state_size":0,"end_to_end_duration":11,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":1,"num_acknowledged_subtasks":1}},"external_path":"","discarded":false} 8357 [Checkpoint Timer] DEBUG org.apache.flink.runtime.checkpoint.CheckpointStatsTracker [] - CheckpointStatistics (for jobID=cf70776f3762dd273961bd57a2dbdc62, jobName=Flink Streaming Job) dump = {"className":"failed","id":3,"status":"FAILED","is_savepoint":false,"savepointFormat":null,"trigger_timestamp":1694452019832,"latest_ack_timestamp":-1,"checkpointed_size":0,"state_size":0,"end_to_end_duration":48,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":10,"num_acknowledged_subtasks":0,"checkpoint_type":"CHECKPOINT","tasks":{"feca28aff5a3958840bee985ee7de4d3":{"id":3,"status":"FAILED","latest_ack_timestamp":-1,"checkpointed_size":0,"state_size":0,"end_to_end_duration":-1,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":1,"num_acknowledged_subtasks":0},"8b9f3e513101bcc5d198aee685286d98":{"id":3,"status":"FAILED","latest_ack_timestamp":-1,"checkpointed_size":0,"state_size":0,"end_to_end_duration":-1,"alignment_buffered":0,"proces 
sed_data":0,"persisted_data":0,"num_subtasks":4,"num_acknowledged_subtasks":0},"51397532e2d9c7a21097a30d590b3114":{"id":3,"status":"FAILED","latest_ack_timestamp":-1,"checkpointed_size":0,"state_size":0,"end_to_end_duration":-1,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":4,"num_acknowledged_subtasks":0},"bc764cd8ddf7a0cff126f51c16239658":{"id":3,"status":"FAILED","latest_ack_timestamp":-1,"checkpointed_size":0,"state_size":0,"end_to_end_duration":-1,"alignment_buffered":0,"processed_data":0,"persisted_data":0,"num_subtasks":1,"num_acknowledged_subtasks":0}},"failure_timestamp":1694452019880,"failure_message":"TaskManager received a checkpoint request for unknown task 4d61cc52dbd40f4ac96e1ebf5f561cba_bc764cd8ddf7a0cff126f51c16239658_0_0. Failure reason: Task local checkpoint failure."} ``` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
[jira] [Updated] (FLINK-33071) Log checkpoint statistics
[ https://issues.apache.org/jira/browse/FLINK-33071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33071: --- Labels: pull-request-available (was: ) > Log checkpoint statistics > -- > > Key: FLINK-33071 > URL: https://issues.apache.org/jira/browse/FLINK-33071 > Project: Flink > Issue Type: New Feature > Components: Runtime / Checkpointing, Runtime / Metrics >Affects Versions: 1.18.0 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Major > Labels: pull-request-available > > This is a stop-gap solution until we have a proper way of solving FLINK-23411. > The plan is to dump JSON-serialised checkpoint statistics into Flink JM's > log, with a {{DEBUG}} level. This could be used to analyse what has happened > with a certain checkpoint in the past.
[GitHub] [flink] ferenc-csaky commented on pull request #23343: [BP-1.18][FLINK-32999][connectors/hbase] Remove HBase connector code from main repo
ferenc-csaky commented on PR #23343: URL: https://github.com/apache/flink/pull/23343#issuecomment-1714173307 @snuyanzin can you merge this one as well?
[jira] [Updated] (FLINK-33018) GCP Pubsub PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream failed
[ https://issues.apache.org/jira/browse/FLINK-33018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33018: --- Labels: pull-request-available (was: ) > GCP Pubsub > PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream > failed > > > Key: FLINK-33018 > URL: https://issues.apache.org/jira/browse/FLINK-33018 > Project: Flink > Issue Type: Bug > Components: Connectors / Google Cloud PubSub >Affects Versions: gcp-pubsub-3.0.2 >Reporter: Martijn Visser >Priority: Blocker > Labels: pull-request-available > > https://github.com/apache/flink-connector-gcp-pubsub/actions/runs/6061318336/job/16446392844#step:13:507 > {code:java} > [INFO] > [INFO] Results: > [INFO] > Error: Failures: > Error: > PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream:119 > > expected: ["1", "2", "3"] > but was: ["1", "2"] > [INFO] > Error: Tests run: 30, Failures: 1, Errors: 0, Skipped: 0 > [INFO] > [INFO] > > {code}
[GitHub] [flink-connector-gcp-pubsub] RyanSkraba opened a new pull request, #19: [FLINK-33018][Tests] Fix flaky test when cancelling source
RyanSkraba opened a new pull request, #19: URL: https://github.com/apache/flink-connector-gcp-pubsub/pull/19 We sometimes observe the test failure described in FLINK-33018 ``` Expected :["1", "2", "3"] Actual :["1", "2"] ``` This occurs because the thread running the source is explicitly cancelled in the **finally** clause. Unlike the other tests in this class, the source thread should run to completion and finish when the last message ID (`"3"`) arrives without external intervention. In rare cases, it was possible for the explicit `cancel()` to happen after the two records arrived, but before the third record causes the source to cleanly shut itself down and finish. This is fixed by no longer waiting for any records to arrive, just letting the thread run to completion, and _then_ testing the results. I checked this by running this particular test as a `@RepeatedTest(100_000)`. I have never observed the behaviour described in the issue after this change, but I've noticed that *with or without* the change, this test will consistently fail after the 16,248-16,250th execution, during the `thread.start()`.
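The run-to-completion pattern the fix relies on can be sketched in isolation. This is a hypothetical, self-contained analogue of the test, not the connector's actual API: `consumeUntilEndOfStream` and the terminal marker `"3"` are stand-ins for the real deserialization-schema end-of-stream logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class RunToCompletionSketch {

    /**
     * Stand-in for the source loop: consumes messages until the terminal
     * marker "3" arrives, then shuts down cleanly on its own.
     */
    static List<String> consumeUntilEndOfStream(BlockingQueue<String> queue)
            throws InterruptedException {
        List<String> results = new ArrayList<>();
        while (true) {
            String msg = queue.take();
            results.add(msg);
            if ("3".equals(msg)) {
                return results; // end-of-stream: no external cancel() needed
            }
        }
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.put("1");
        queue.put("2");
        queue.put("3");

        List<String> results = new ArrayList<>();
        Thread source = new Thread(() -> {
            try {
                results.addAll(consumeUntilEndOfStream(queue));
            } catch (InterruptedException ignored) {
                // a racy external cancel() would land here and truncate the results
            }
        });
        source.start();

        // The fix: join (run to completion) instead of cancelling in a finally
        // block, so the assertion never races against the source's own shutdown.
        source.join();
        System.out.println(results); // prints [1, 2, 3]
    }
}
```

Cancelling from `finally` interrupts the consumer at an arbitrary point, which is exactly the race the PR removes; joining makes the shutdown deterministic.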
[jira] [Comment Edited] (FLINK-23411) Expose Flink checkpoint details metrics
[ https://issues.apache.org/jira/browse/FLINK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763789#comment-17763789 ] Piotr Nowojski edited comment on FLINK-23411 at 9/11/23 3:51 PM: - Hey, I've just stumbled upon this ticket. Should we really expose point-wise metrics as continuous ones? It's following the pattern of {{lastCheckpointDuration}}, but I don't think this is a good pattern: * This is wasteful if {{checkpoint duration >> metrics collecting/reporting interval}}. We will report the same values many times over. * This is losing data if {{checkpoint duration < metric collecting/reporting interval}}. Only a fraction of a checkpoint's stats will be visible in the metric system. * I don't know how this approach will scale with larger jobs (parallelism > 20). How could users/dev-ops actually visualize and analyze hundreds/thousands of this kind of metric for a single checkpoint {{parallelism * number_of_tasks * 8 (number of metrics) == a lot}} (800 for parallelism 20 and 5 tasks)? From my experience, if there is an issue with a checkpoint, I need to look down to subtask-level checkpoint stats to track down the problem and then often correlate numbers between different subtasks. I just don't see how this can be done with the current metric reporting scheme. I think in the long term, this should be exposed as something like [distributed tracing|https://newrelic.com/blog/how-to-relic/distributed-tracing-anomaly-detection] on the user side, and reported by Flink as a bunch of traces via OTEL. Maybe we should expand the current {{MetricReporter}} plugins to support that? As a stop-gap solution, I would propose FLINK-33071. > Expose Flink checkpoint details metrics > --- > > Key: FLINK-23411 > URL: https://issues.apache.org/jira/browse/FLINK-23411 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Affects Versions: 1.13.1, 1.12.4 >Reporter: Jun Qin >Assignee: Hangxiang Yu >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: 1.18.0 > > > The checkpoint metrics as shown in the Flink Web UI like the > sync/async/alignment/start delay are not exposed to the metrics system. This > makes problem investigation harder when the Web UI is not enabled: those numbers > cannot be obtained from the DEBUG logs. I think we should see how we can expose > these metrics.
[jira] [Commented] (FLINK-23411) Expose Flink checkpoint details metrics
[ https://issues.apache.org/jira/browse/FLINK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763789#comment-17763789 ] Piotr Nowojski commented on FLINK-23411: Hey, I've just stumbled upon this ticket as we are also looking into the lack of those metrics. Should we really expose point-wise metrics as continuous ones? It's following the pattern of {{lastCheckpointDuration}}, but I don't think this is a good pattern: * this is wasteful if {{checkpoint duration >> metrics collecting/reporting interval}} * this is losing data if {{checkpoint duration < metric collecting/reporting interval}} * I don't know how this approach will scale with larger jobs (parallelism > 20). How could users/dev-ops actually visualise and analyse hundreds/thousands of this kind of metric for a single checkpoint {{parallelism * number_of_tasks * 8 (number of metrics) == a lot}} (800 for parallelism 20 and 5 tasks)? I think in the long term, this should be exposed as something like [distributed tracing|https://newrelic.com/blog/how-to-relic/distributed-tracing-anomaly-detection] on the user side, and reported by Flink as a bunch of traces via OTEL. Maybe we should expand the current {{MetricReporter}} plugins to support that? As a stop-gap solution, I would propose FLINK-33071. > Expose Flink checkpoint details metrics > --- > > Key: FLINK-23411 > URL: https://issues.apache.org/jira/browse/FLINK-23411 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Affects Versions: 1.13.1, 1.12.4 >Reporter: Jun Qin >Assignee: Hangxiang Yu >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: 1.18.0 > > > The checkpoint metrics as shown in the Flink Web UI like the > sync/async/alignment/start delay are not exposed to the metrics system. This > makes problem investigation harder when the Web UI is not enabled: those numbers > cannot be obtained from the DEBUG logs. I think we should see how we can expose > these metrics.
[jira] [Created] (FLINK-33071) Log checkpoint statistics
Piotr Nowojski created FLINK-33071: -- Summary: Log checkpoint statistics Key: FLINK-33071 URL: https://issues.apache.org/jira/browse/FLINK-33071 Project: Flink Issue Type: New Feature Components: Runtime / Checkpointing, Runtime / Metrics Affects Versions: 1.18.0 Reporter: Piotr Nowojski Assignee: Piotr Nowojski This is a stop-gap solution until we have a proper way of solving FLINK-23411. The plan is to dump JSON-serialised checkpoint statistics into the Flink JM's log, with a {{DEBUG}} level. This could be used to analyse what has happened with a certain checkpoint in the past.
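A minimal sketch of the proposed idea. The hand-rolled `toJson` and `java.util.logging` are stand-ins for whatever serializer and SLF4J logger the actual PR uses; the point is the level guard, which ensures the serialisation cost is only paid when debug-level logging is enabled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

public class CheckpointStatsLogger {
    private static final Logger LOG = Logger.getLogger("checkpoint-stats");

    /** Serialises a flat map of checkpoint stats into a JSON object string. */
    static String toJson(Map<String, Object> stats) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : stats.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            // numbers are emitted unquoted, everything else as a string
            sb.append(v instanceof Number ? v : "\"" + v + "\"");
        }
        return sb.append("}").toString();
    }

    static void logCheckpointStats(long checkpointId, Map<String, Object> stats) {
        // Guard: only build the JSON when the debug-level log is actually on.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Checkpoint " + checkpointId + " statistics: " + toJson(stats));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> stats = new LinkedHashMap<>();
        stats.put("checkpointId", 42);
        stats.put("state", "COMPLETED");
        stats.put("durationMillis", 1234);
        // prints {"checkpointId":42,"state":"COMPLETED","durationMillis":1234}
        System.out.println(toJson(stats));
    }
}
```

Logging one JSON line per checkpoint keeps the full subtask-level detail greppable in the JM log without flooding the metrics system, which is exactly the trade-off argued for in FLINK-23411.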
[GitHub] [flink] tweise commented on pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for PyFlink re
tweise commented on PR #23383: URL: https://github.com/apache/flink/pull/23383#issuecomment-1714118207 @afedulov PTAL
[GitHub] [flink] tweise commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
tweise commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321705462 ## flink-clients/src/main/java/org/apache/flink/client/program/rest/UrlPrefixDecorator.java: ## @@ -41,7 +40,7 @@ */ public class UrlPrefixDecorator< R extends RequestBody, P extends ResponseBody, M extends MessageParameters> -implements SqlGatewayMessageHeaders { Review Comment: Where/how is `SqlGatewayMessageHeaders` used after this change?
[jira] [Commented] (FLINK-32854) [JUnit5 Migration] The state package of flink-runtime module
[ https://issues.apache.org/jira/browse/FLINK-32854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763765#comment-17763765 ] Rui Fan commented on FLINK-32854: - Merged via 554810abbbd44a656a446d051b3dd199b0df9055 > [JUnit5 Migration] The state package of flink-runtime module > > > Key: FLINK-32854 > URL: https://issues.apache.org/jira/browse/FLINK-32854 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: Rui Fan >Assignee: Jiabao Sun >Priority: Minor > Labels: pull-request-available >
[jira] [Resolved] (FLINK-32854) [JUnit5 Migration] The state package of flink-runtime module
[ https://issues.apache.org/jira/browse/FLINK-32854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan resolved FLINK-32854. - Fix Version/s: 1.19.0 Resolution: Fixed > [JUnit5 Migration] The state package of flink-runtime module > > > Key: FLINK-32854 > URL: https://issues.apache.org/jira/browse/FLINK-32854 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: Rui Fan >Assignee: Jiabao Sun >Priority: Minor > Labels: pull-request-available > Fix For: 1.19.0 > >
[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #23359: [FLINK-33029][python] Drop python 3.7 support
gaborgsomogyi commented on code in PR #23359: URL: https://github.com/apache/flink/pull/23359#discussion_r1321570938 ## tools/releasing/create_binary_release.sh: ## @@ -129,8 +129,8 @@ make_python_release() { cp ${pyflink_actual_name} "${PYTHON_RELEASE_DIR}/${pyflink_release_name}" wheel_packages_num=0 - # py37,py38,py39,py310 for mac and linux (11 wheel packages) - EXPECTED_WHEEL_PACKAGES_NUM=11 + # py38,py39,py310 for mac 10.9, 11.0 and linux (9 wheel packages) + EXPECTED_WHEEL_PACKAGES_NUM=9 Review Comment: https://pypi.org/project/apache-flink/#files Here is what we originally built: * (py38,py39,py310) X (mac 10.9, mac 11.0, linux) = 9 * (py37) X (mac 10.9, linux) = 2 Now that the latter is dropped, we should have 9, but I was unable to test it on Azure.
[GitHub] [flink] tweise commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for
tweise commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321682712 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param <R> the type of the request body + * @param <P> the type of the response body + * @param <M> the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator<R, P, M> { Review Comment: Based on what I see here, the class filters a SQL-gateway-specific header.
[GitHub] [flink] mxm commented on a diff in pull request #23383: [FLINK-32885 ] [flink-clients] Refactoring: Moving UrlPrefixDecorator into flink-clients so it can be used by RestClusterClient for PyF
mxm commented on code in PR #23383: URL: https://github.com/apache/flink/pull/23383#discussion_r1321673233 ## flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/header/util/SQLGatewayUrlPrefixDecorator.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.gateway.rest.header.util; + +import org.apache.flink.client.program.rest.UrlPrefixDecorator; +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.table.gateway.rest.util.SqlGatewayRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class decorates the URL of the original message headers by adding a prefix. The purpose of + * this decorator is to provide a consistent URL prefix for all the REST API endpoints while + * preserving the behavior of the original message headers. 
+ * + * @param <R> the type of the request body + * @param <P> the type of the response body + * @param <M> the type of the message parameters + */ +public class SQLGatewayUrlPrefixDecorator< +R extends RequestBody, P extends ResponseBody, M extends MessageParameters> +extends UrlPrefixDecorator<R, P, M> { Review Comment: Why is `SQLGatewayUrlPrefixDecorator` required? There is already `UrlPrefixDecorator`, which this class uses as a superclass. I don't understand why we need a new class. We can simply use `UrlPrefixDecorator`. ## flink-clients/src/main/java/org/apache/flink/client/program/rest/MonitoringApiMessageHeaders.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ + +package org.apache.flink.client.program.rest; + +import org.apache.flink.runtime.rest.messages.MessageHeaders; +import org.apache.flink.runtime.rest.messages.MessageParameters; +import org.apache.flink.runtime.rest.messages.RequestBody; +import org.apache.flink.runtime.rest.messages.ResponseBody; +import org.apache.flink.runtime.rest.versioning.RestAPIVersion; +import org.apache.flink.runtime.rest.versioning.RuntimeRestAPIVersion; + +import java.util.Arrays; +import java.util.Collection; +import java.util.stream.Collectors; + +/** + * This class links {@link RequestBody}s to {@link ResponseBody}s types and contains meta-data + * required for their http headers in runtime module. + * + * Implementations must be state-less. + * + * @param <R> request message type + * @param <P> response message type + * @param <M> message parameters type + */ +public interface MonitoringApiMessageHeaders< Review Comment: Why is this class required?
[GitHub] [flink] 1996fanrui commented on a diff in pull request #22985: [FLINK-21883][scheduler] Implement cooldown period for adaptive scheduler
1996fanrui commented on code in PR #22985: URL: https://github.com/apache/flink/pull/22985#discussion_r1321654007 ## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/Executing.java: ## @@ -124,17 +154,33 @@ private void handleDeploymentFailure(ExecutionVertex executionVertex, JobExcepti @Override public void onNewResourcesAvailable() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } @Override public void onNewResourceRequirements() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } private void maybeRescale() { -if (context.shouldRescale(getExecutionGraph())) { -getLogger().info("Can change the parallelism of job. Restarting job."); +final Duration timeSinceLastRescale = timeSinceLastRescale(); +rescaleScheduled = false; +final boolean shouldForceRescale = +(scalingIntervalMax != null) +&& (timeSinceLastRescale.compareTo(scalingIntervalMax) > 0) +&& (lastRescale != Instant.EPOCH); // initial rescale is not forced +if (shouldForceRescale || context.shouldRescale(getExecutionGraph())) { +if (shouldForceRescale) { +getLogger() +.info( +"Time since last rescale ({}) > {} ({}). Force-changing the parallelism of the job. Restarting the job.", +timeSinceLastRescale, + JobManagerOptions.SCHEDULER_SCALING_INTERVAL_MAX.key(), +scalingIntervalMax); +} else { +getLogger().info("Can change the parallelism of the job. Restarting the job."); +} +lastRescale = Instant.now(); context.goToRestarting( getExecutionGraph(), Review Comment: >> Do you mean we force-trigger a rescale when new resources come and the current time > lastRescaleTime + scalingIntervalMax? > > yes, that is what the community agreed on in the FLIP. I think this strategy only solves one case: the last resource comes after scalingIntervalMax, so the job can rescale directly. However, it cannot solve the case where the last resource comes before scalingIntervalMax.
Assuming the scalingIntervalMax is 5 minutes, `jobmanager.adaptive-scheduler.min-parallelism-increase` is 2, and the expected parallelism is 100: - The job has 99 TMs at 09:00:00, so it runs with parallelism 99. - At 09:03:00, the last TM comes. - The `Executing` state will ignore the rescale because the parallelism diff is 1, which is less than `min-parallelism-increase`, and `current time < lastRescaleTime + scalingIntervalMax`. After that, the job doesn't need new resources and the expected parallelism doesn't change, so the job won't rescale anymore.
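The force-rescale condition under discussion can be restated as a pure predicate, which makes the corner cases in this thread (the initial rescale, a missing `scalingIntervalMax`) easy to table-test. This is a sketch mirroring the boolean expression in the diff above, not the actual Flink class:

```java
import java.time.Duration;
import java.time.Instant;

public class RescalePolicySketch {

    /**
     * Mirrors the PR's condition: force a rescale only when a maximum scaling
     * interval is configured, that interval has elapsed since the last rescale,
     * and at least one rescale has already happened (the EPOCH sentinel marks
     * "no rescale yet", so the initial rescale is never forced).
     */
    static boolean shouldForceRescale(
            Duration timeSinceLastRescale, Duration scalingIntervalMax, Instant lastRescale) {
        return scalingIntervalMax != null
                && timeSinceLastRescale.compareTo(scalingIntervalMax) > 0
                && !Instant.EPOCH.equals(lastRescale);
    }

    public static void main(String[] args) {
        Instant someRescale = Instant.parse("2023-09-11T09:00:00Z");
        // Elapsed > max and not the initial rescale: forced.
        System.out.println(shouldForceRescale(Duration.ofMinutes(6), Duration.ofMinutes(5), someRescale)); // true
        // Initial rescale (EPOCH sentinel): never forced.
        System.out.println(shouldForceRescale(Duration.ofMinutes(6), Duration.ofMinutes(5), Instant.EPOCH)); // false
        // Interval not yet elapsed: not forced.
        System.out.println(shouldForceRescale(Duration.ofMinutes(3), Duration.ofMinutes(5), someRescale)); // false
    }
}
```

This also illustrates the scenario in the comment above: at 09:03:00 the elapsed time (3 minutes) is below the 5-minute `scalingIntervalMax`, so no force-rescale fires, and nothing triggers one later.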
[GitHub] [flink] 1996fanrui commented on a diff in pull request #22985: [FLINK-21883][scheduler] Implement cooldown period for adaptive scheduler
1996fanrui commented on code in PR #22985: URL: https://github.com/apache/flink/pull/22985#discussion_r1321641317 ## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/Executing.java: ## @@ -124,17 +154,33 @@ private void handleDeploymentFailure(ExecutionVertex executionVertex, JobExcepti @Override public void onNewResourcesAvailable() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } @Override public void onNewResourceRequirements() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } private void maybeRescale() { -if (context.shouldRescale(getExecutionGraph())) { -getLogger().info("Can change the parallelism of job. Restarting job."); +final Duration timeSinceLastRescale = timeSinceLastRescale(); +rescaleScheduled = false; +final boolean shouldForceRescale = +(scalingIntervalMax != null) +&& (timeSinceLastRescale.compareTo(scalingIntervalMax) > 0) +&& (lastRescale != Instant.EPOCH); // initial rescale is not forced +if (shouldForceRescale || context.shouldRescale(getExecutionGraph())) { +if (shouldForceRescale) { +getLogger() +.info( +"Time since last rescale ({}) > {} ({}). Force-changing the parallelism of the job. Restarting the job.", +timeSinceLastRescale, + JobManagerOptions.SCHEDULER_SCALING_INTERVAL_MAX.key(), +scalingIntervalMax); +} else { +getLogger().info("Can change the parallelism of the job. Restarting the job."); +} +lastRescale = Instant.now(); context.goToRestarting( getExecutionGraph(), Review Comment: > > If so, I think the logic related to scalingIntervalMax should be moved into maybeRescale. > > This logic is already in maybeRescale() method Sorry, I had a typo there. I mean the logic related to scalingIntervalMax should be moved out of maybeRescale. `onNewResourceRequirements` also calls `rescaleWhenCooldownPeriodIsOver` when the user changes the parallelism; maybeRescale shouldn't force-trigger a rescale in that case even if the `current time > lastRescaleTime + scalingIntervalMax`.
Because after the requirement is adjusted, resources cannot be ready immediately. A rescale is meaningless at that point even if the `current time > lastRescaleTime + scalingIntervalMax`.
[GitHub] [flink] echauchot commented on a diff in pull request #22985: [FLINK-21883][scheduler] Implement cooldown period for adaptive scheduler
echauchot commented on code in PR #22985: URL: https://github.com/apache/flink/pull/22985#discussion_r1321602343 ## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/Executing.java: ## @@ -124,17 +154,33 @@ private void handleDeploymentFailure(ExecutionVertex executionVertex, JobExcepti @Override public void onNewResourcesAvailable() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } @Override public void onNewResourceRequirements() { -maybeRescale(); +rescaleWhenCooldownPeriodIsOver(); } private void maybeRescale() { -if (context.shouldRescale(getExecutionGraph())) { -getLogger().info("Can change the parallelism of job. Restarting job."); +final Duration timeSinceLastRescale = timeSinceLastRescale(); +rescaleScheduled = false; +final boolean shouldForceRescale = +(scalingIntervalMax != null) +&& (timeSinceLastRescale.compareTo(scalingIntervalMax) > 0) +&& (lastRescale != Instant.EPOCH); // initial rescale is not forced +if (shouldForceRescale || context.shouldRescale(getExecutionGraph())) { +if (shouldForceRescale) { +getLogger() +.info( +"Time since last rescale ({}) > {} ({}). Force-changing the parallelism of the job. Restarting the job.", +timeSinceLastRescale, + JobManagerOptions.SCHEDULER_SCALING_INTERVAL_MAX.key(), +scalingIntervalMax); +} else { +getLogger().info("Can change the parallelism of the job. Restarting the job."); +} +lastRescale = Instant.now(); context.goToRestarting( getExecutionGraph(), Review Comment: > Do you mean we force-trigger a rescale when new resources come and the current time > lastRescaleTime + scalingIntervalMax? yes, that is what the community agreed on in the FLIP. > If so, I think the logic related to scalingIntervalMax should be moved into maybeRescale. This logic is already in the maybeRescale() method > onNewResourceRequirements also calls rescaleWhenCooldownPeriodIsOver when the user changes the parallelism; maybeRescale will force-trigger a rescale as well, right?
yes, `onNewResourceRequirements` is ultimately called by a REST endpoint to react to user action in the web UI, so it will call maybeRescale in the end, but that does not always mean a rescale: if timeSinceLastRescale < scalingIntervalMax and minimal resources are not met, there will be no rescale.