[jira] [Commented] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762965#comment-17762965 ] Zili Chen commented on FLINK-33053: --- The log seems trimmed. I saw: 2023-09-08 11:09:03,738 DEBUG org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.WatcherRemovalManager [] - Removing watcher for path: /flink/flink-native-test-117/leader/7db5c7316828f598234677e2169e7b0f/connection_info So the TM has issued a watcher removal request. > Watcher leak in Zookeeper HA mode > - > > Key: FLINK-33053 > URL: https://issues.apache.org/jira/browse/FLINK-33053 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.17.0, 1.17.1 >Reporter: Yangze Guo >Priority: Critical > Attachments: 26.dump.zip, 26.log > > > We observe a watcher leak in our OLAP stress test when enabling Zookeeper HA > mode. TM's watches on the leader of the JobMaster have not been stopped after the job > finished. > Here is how we reproduce this issue: > - Start a session cluster and enable Zookeeper HA mode. > - Continuously and concurrently submit short queries, e.g. WordCount, to the > cluster. > - echo -n wchp | nc \{zk host} \{zk port} to get current watches. > We can see a lot of watches on > /flink/\{cluster_name}/leader/\{job_id}/connection_info. -- This message was sent by Atlassian Jira (v8.20.10#820010)
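The `wchp` output used in the reproduction steps lists each watched znode path followed by indented session IDs. A hypothetical, self-contained helper (not part of Flink; the sample path is made up) can tally watches per path so leaked `connection_info` watches stand out:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: count watches per znode path from the raw output of
// `echo -n wchp | nc <zk host> <zk port>` (path line, then one indented
// session-id line per watcher on that path).
public class WatchCounter {
    static Map<String, Integer> countWatches(String wchpOutput) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String currentPath = null;
        for (String line : wchpOutput.split("\n")) {
            if (line.isEmpty()) continue;
            if (!line.startsWith("\t") && !line.startsWith(" ")) {
                currentPath = line.trim();           // a znode path line
                counts.putIfAbsent(currentPath, 0);
            } else if (currentPath != null) {
                counts.put(currentPath, counts.get(currentPath) + 1); // a session id
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample =
              "/flink/cluster/leader/7db5c7316828f598234677e2169e7b0f/connection_info\n"
            + "\t0x6400013b748a4420\n"
            + "\t0x6400013b748a4421\n";
        System.out.println(countWatches(sample)); // one path, two watchers
    }
}
```

A steadily growing count for finished jobs' `connection_info` paths across repeated runs would confirm the leak described above.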
[jira] [Commented] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762962#comment-17762962 ] Matthias Pohl commented on FLINK-33053: --- Thanks for sharing this. I guess the thread dump (which you can generate by sending {{kill -3 }} to the TaskManager process) would be more helpful than the heap dump. Or am I missing something? The thread dump might tell us more about which threads still exist even after the JobMaster is closed.
[jira] [Comment Edited] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762962#comment-17762962 ] Matthias Pohl edited comment on FLINK-33053 at 9/8/23 5:51 AM: --- Thanks for sharing this. I guess the thread dump (which you can generate by sending {{kill -3 }} to the TaskManager process) would be more helpful than the heap dump. Or am I missing something? The thread dump might tell us more about which threads still exist even after the JobMaster is closed.
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711096171 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-32620) Migrate DiscardingSink to sinkv2
[ https://issues.apache.org/jira/browse/FLINK-32620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo closed FLINK-32620. -- Resolution: Fixed master (1.19) via c4a76e2f006d834771092a8fac96bfc63361e20b. > Migrate DiscardingSink to sinkv2 > > > Key: FLINK-32620 > URL: https://issues.apache.org/jira/browse/FLINK-32620 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Common >Affects Versions: 1.19.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: 1.19.0
[jira] [Updated] (FLINK-32620) Migrate DiscardingSink to sinkv2
[ https://issues.apache.org/jira/browse/FLINK-32620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-32620: --- Fix Version/s: 1.19.0
[GitHub] [flink] flinkbot commented on pull request #23378: [FLINK-33055][doc]Correct the error value about 'state.backend.type' in the document
flinkbot commented on PR #23378: URL: https://github.com/apache/flink/pull/23378#issuecomment-1711059626 ## CI report: * 144ab76379d0058918c30cc5e0830f65f7f53d09 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] ljz2051 opened a new pull request, #23378: [FLINK-33055][doc]Correct the error value about 'state.backend.type' in the document
ljz2051 opened a new pull request, #23378: URL: https://github.com/apache/flink/pull/23378 ## What is the purpose of the change This pull request corrects the error value about 'state.backend.type' in the document (config.md). ## Brief change log - Update the description of 'state.backend.type' in the document (config.md). ## Verifying this change This change is trivial documentation work without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no
[GitHub] [flink] flinkbot commented on pull request #23377: [FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source
flinkbot commented on PR #23377: URL: https://github.com/apache/flink/pull/23377#issuecomment-1711053515 ## CI report: * acfd24c22b8b16fd8d2f3d12b89ec3759efac90f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Resolved] (FLINK-30217) Use ListState#update() to replace clear + add mode.
[ https://issues.apache.org/jira/browse/FLINK-30217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hangxiang Yu resolved FLINK-30217. -- Fix Version/s: 1.19.0 Resolution: Fixed merged 20c0acf60d137cd613914503f1b40e7b2adb86c1 into master > Use ListState#update() to replace clear + add mode. > --- > > Key: FLINK-30217 > URL: https://issues.apache.org/jira/browse/FLINK-30217 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: xljtswf >Assignee: Zakelly Lan >Priority: Major > Fix For: 1.19.0 > > > When using ListState, I found that many times we need to clear the current state and then > add new values. This is especially common in > CheckpointedFunction#snapshotState, and it is slower than just using > ListState#update(). > Suppose we want to update the ListState to contain value1, value2, value3. > With the current implementation, we first call ListState#clear(), which updates the state once. > Then we add value1, value2, value3 to the state. > If we use heap state, we need to search the stateTable 3 times and add 3 > values to the list. > This happens in memory and is not too bad. > If we use RocksDB, then we will call backend.db.merge() 3 times. > In total, we will update the state 4 times. > The more values to be added, the more times we will update the state. > With ListState#update we only need to update the state once. I think this can save a lot of time.
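The write amplification described in the issue can be sketched with a toy model. This is not Flink's ListState implementation, just a self-contained counter that makes the "4 writes vs. 1 write" arithmetic above concrete:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for a list state that counts backend write operations:
// clear() is one write, each add() is one write (one RocksDB merge per
// element), update() replaces the whole list in a single write.
class CountingListState<T> {
    private List<T> values = new ArrayList<>();
    int writes = 0;

    void clear() { values = new ArrayList<>(); writes++; }
    void add(T v) { values.add(v); writes++; }
    void update(List<T> vs) { values = new ArrayList<>(vs); writes++; }
    List<T> get() { return values; }
}

public class ListStateUpdateDemo {
    public static void main(String[] args) {
        List<Integer> snapshot = Arrays.asList(1, 2, 3);

        CountingListState<Integer> clearAdd = new CountingListState<>();
        clearAdd.clear();
        for (int v : snapshot) clearAdd.add(v);   // 1 clear + 3 merges = 4 writes

        CountingListState<Integer> updated = new CountingListState<>();
        updated.update(snapshot);                 // 1 write, same final contents

        System.out.println(clearAdd.writes + " vs " + updated.writes); // 4 vs 1
    }
}
```

For N elements, clear+add costs N+1 state updates while update() stays constant, which matches the issue's argument that the gap grows with the number of values.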
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711052496 @flinkbot run azure
[jira] [Commented] (FLINK-33055) Correct the error value about 'state.backend.type' in the document
[ https://issues.apache.org/jira/browse/FLINK-33055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762950#comment-17762950 ] Hangxiang Yu commented on FLINK-33055: -- [~lijinzhong] Thanks for volunteering, already assigned to you, please go ahead. > Correct the error value about 'state.backend.type' in the document > -- > > Key: FLINK-33055 > URL: https://issues.apache.org/jira/browse/FLINK-33055 > Project: Flink > Issue Type: Bug > Components: Documentation, Runtime / State Backends >Reporter: Hangxiang Yu >Assignee: Jinzhong Li >Priority: Minor > > > {code:java} > state.backend.type: The state backend to use. This defines the data structure > mechanism for taking snapshots. Common values are filesystem or rocksdb{code} > filesystem should be replaced with hashmap after FLINK-16444.
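The corrected configuration per this issue would read as the following flink-conf.yaml fragment (the checkpoint directory is a made-up example path, not from the issue):

```yaml
# After FLINK-16444 the heap-based backend is named "hashmap";
# "filesystem" is the outdated value the docs should no longer suggest.
state.backend.type: hashmap          # or: rocksdb
state.checkpoints.dir: hdfs:///flink/checkpoints   # example path, adjust for your cluster
```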
[jira] [Assigned] (FLINK-33055) Correct the error value about 'state.backend.type' in the document
[ https://issues.apache.org/jira/browse/FLINK-33055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hangxiang Yu reassigned FLINK-33055: Assignee: Jinzhong Li
[GitHub] [flink] masteryhx closed pull request #23344: [FLINK-30217] Replace ListState#clear along with following #add/addAll with a ListState#update
masteryhx closed pull request #23344: [FLINK-30217] Replace ListState#clear along with following #add/addAll with a ListState#update URL: https://github.com/apache/flink/pull/23344
[GitHub] [flink] swuferhong opened a new pull request, #23377: [FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source
swuferhong opened a new pull request, #23377: URL: https://github.com/apache/flink/pull/23377 ## What is the purpose of the change Improve the error message when the lookup source is used as the scan source. Currently, if we use a source which only implements `LookupTableSource` but not `ScanTableSource` as a scan source, the planner cannot produce a proper plan and reports `Cannot generate a valid execution plan for the given query`, which can be improved. ## Brief change log - Optimize the error message. - Add a test to cover it. ## Verifying this change - Add a test to cover it. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no docs
[jira] [Commented] (FLINK-33055) Correct the error value about 'state.backend.type' in the document
[ https://issues.apache.org/jira/browse/FLINK-33055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762948#comment-17762948 ] Jinzhong Li commented on FLINK-33055: - Hi [~masteryhx], I would like to take this trivial work.
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711047440 @flinkbot run azure
[jira] [Created] (FLINK-33064) Improve the error message when the lookup source is used as the scan source
Yunhong Zheng created FLINK-33064: - Summary: Improve the error message when the lookup source is used as the scan source Key: FLINK-33064 URL: https://issues.apache.org/jira/browse/FLINK-33064 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Affects Versions: 1.18.0 Reporter: Yunhong Zheng Fix For: 1.19.0 Improve the error message when the lookup source is used as the scan source. Currently, if we use a source which only implements LookupTableSource but not ScanTableSource as a scan source, the planner cannot produce a proper plan and reports 'Cannot generate a valid execution plan for the given query', which can be improved.
[jira] [Resolved] (FLINK-15922) Show "Warn - received late message for checkpoint" only when checkpoint actually expired
[ https://issues.apache.org/jira/browse/FLINK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hangxiang Yu resolved FLINK-15922. -- Fix Version/s: 1.19.0 Resolution: Fixed merged 80a8250176fb2b1d9809fcd47df6097fd53999f6 into master > Show "Warn - received late message for checkpoint" only when checkpoint > actually expired > > > Key: FLINK-15922 > URL: https://issues.apache.org/jira/browse/FLINK-15922 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.10.0 >Reporter: Stephan Ewen >Assignee: Zakelly Lan >Priority: Not a Priority > Labels: auto-deprioritized-major, beginner, beginner-friendly, > usability > Fix For: 1.19.0 > > > The message "Warn - received late message for checkpoint" is shown frequently > in the logs, also when a checkpoint was purposefully canceled. > In those case, this message is unhelpful and misleading. > We should log this only when the checkpoint is actually expired. > Meaning that when receiving the message, we check if we have an expired > checkpoint for that ID. If yes, we log that message, if not, we simply drop > the message.
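The logging policy described in the issue ("remember expired checkpoints, warn only for their late messages") can be sketched as a toy filter. This is a hypothetical stand-in, not the actual CheckpointCoordinator code:

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch of the fix: record IDs of checkpoints that actually expired,
// and emit the late-message warning only when the late ack belongs to one
// of them; late acks for purposefully cancelled checkpoints are dropped.
public class LateMessageFilter {
    private final Set<Long> expiredCheckpoints = new HashSet<>();

    void onCheckpointExpired(long checkpointId) {
        expiredCheckpoints.add(checkpointId);
    }

    /** True if a warning should be logged for this late acknowledgement. */
    boolean shouldWarnOnLateAck(long checkpointId) {
        return expiredCheckpoints.contains(checkpointId);
    }

    public static void main(String[] args) {
        LateMessageFilter filter = new LateMessageFilter();
        filter.onCheckpointExpired(7L);
        System.out.println(filter.shouldWarnOnLateAck(7L)); // true: checkpoint 7 expired
        System.out.println(filter.shouldWarnOnLateAck(8L)); // false: e.g. cancelled, drop silently
    }
}
```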
[GitHub] [flink] masteryhx closed pull request #23364: [FLINK-15922] Only remember expired checkpoints and log received late messages of those in CheckpointCoordinator
masteryhx closed pull request #23364: [FLINK-15922] Only remember expired checkpoints and log received late messages of those in CheckpointCoordinator URL: https://github.com/apache/flink/pull/23364
[GitHub] [flink] xintongsong commented on a diff in pull request #23255: [FLINK-32870][network] Tiered storage supports reading multiple small buffers by reading and slicing one large buffer
xintongsong commented on code in PR #23255: URL: https://github.com/apache/flink/pull/23255#discussion_r1319297747 ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/file/PartitionFileReader.java: ## @@ -78,4 +87,79 @@ long getPriority( /** Release the {@link PartitionFileReader}. */ void release(); + +/** A {@link PartialBuffer} is a part slice of a larger buffer. */ +class PartialBuffer extends CompositeBuffer { + +public PartialBuffer(BufferHeader bufferHeader) { +super(bufferHeader); +} +} + +/** + * A wrapper class of the reading buffer result, including the read buffers, the hint of + * continue reading, and the read progress, etc. + */ +class ReadBufferResult { + +/** The read buffers. */ +private final List readBuffers; + +/** + * A hint to determine whether the caller should continue reading the following buffers. + * Note that this hint is merely a recommendation and not obligatory. Following the hint + * while reading buffers may improve performance. + */ +private final boolean shouldContinueReadHint; Review Comment: ```suggestion private final boolean continuousReadSuggested; ``` ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/file/PartitionFileReader.java: ## @@ -78,4 +87,79 @@ long getPriority( /** Release the {@link PartitionFileReader}. */ void release(); + +/** A {@link PartialBuffer} is a part slice of a larger buffer. */ +class PartialBuffer extends CompositeBuffer { + +public PartialBuffer(BufferHeader bufferHeader) { +super(bufferHeader); +} +} Review Comment: Why not use `CompositeBuffer` directly? ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/tier/disk/DiskIOScheduler.java: ## @@ -396,13 +417,46 @@ private void prepareForScheduling() { if (nextSegmentId < 0) { updateSegmentId(); } +if (nextSegmentId < 0) { +priority = Long.MAX_VALUE; +return; +} priority = -nextSegmentId < 0 -? Long.MAX_VALUE +readProgress != null +? 
readProgress.getCurrentReadOffset() Review Comment: This implies the io scheduler understands that priority is file offset. ## flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/file/PartitionFileReader.java: ## @@ -78,4 +87,79 @@ long getPriority( /** Release the {@link PartitionFileReader}. */ void release(); + +/** A {@link PartialBuffer} is a part slice of a larger buffer. */ +class PartialBuffer extends CompositeBuffer { + +public PartialBuffer(BufferHeader bufferHeader) { +super(bufferHeader); +} +} + +/** + * A wrapper class of the reading buffer result, including the read buffers, the hint of + * continue reading, and the read progress, etc. + */ +class ReadBufferResult { + +/** The read buffers. */ +private final List readBuffers; + +/** + * A hint to determine whether the caller should continue reading the following buffers. + * Note that this hint is merely a recommendation and not obligatory. Following the hint + * while reading buffers may improve performance. + */ +private final boolean shouldContinueReadHint; + +/** The read progress state. */ +private final ReadProgress readProgress; + +public ReadBufferResult( +List readBuffers, +boolean shouldContinueReadHint, +ReadProgress readProgress) { +this.readBuffers = readBuffers; +this.shouldContinueReadHint = shouldContinueReadHint; +this.readProgress = readProgress; +} + +public List getReadBuffers() { +return readBuffers; +} + +public boolean shouldContinueReadHint() { +return shouldContinueReadHint; +} + +public ReadProgress getReadProgress() { +return readProgress; +} +} + +/** The {@link ReadProgress} mainly includes current reading offset, end of read offset, etc. */ +class ReadProgress { + +/** + * The current read file offset. Note the offset does not contain the length of the partial + * buffer, because the partial buffer may be dropped at anytime. 
+ */ +private final long currentReadOffset; Review Comment: ```suggestion private final long currentBufferOffset; ``` ##
[GitHub] [flink] flinkbot commented on pull request #23376: [FLINK-14032][state]Make the cache size of RocksDBPriorityQueueSetFactory configurable
flinkbot commented on PR #23376: URL: https://github.com/apache/flink/pull/23376#issuecomment-1711040369 ## CI report: * df6392d251e75003327ada3ba5d728ec98674ac6 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Assigned] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
[ https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lincoln lee reassigned FLINK-33063: --- Assignee: Yunhong Zheng > udaf with user defined pojo object throw error while generate record equaliser > -- > > Key: FLINK-33063 > URL: https://issues.apache.org/jira/browse/FLINK-33063 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Fix For: 1.18.0 > > > A UDAF with a user-defined POJO object throws an error while generating the record equaliser: > When a user creates a UDAF whose record contains a user-defined complex POJO object (like List or Map), the codegen > will throw an error while generating the record equaliser: > {code:java} > A method named "compareTo" is not declared in any enclosing class nor any > subtype, nor through a static import.{code}
[jira] [Commented] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
[ https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762946#comment-17762946 ] lincoln lee commented on FLINK-33063: - [~337361...@qq.com] assigned to you :)
[GitHub] [flink] ljz2051 opened a new pull request, #23376: [FLINK-14032]Make the cache size of RocksDBPriorityQueueSetFactory configurable
ljz2051 opened a new pull request, #23376: URL: https://github.com/apache/flink/pull/23376 ## What is the purpose of the change This pull request makes the cache size of RocksDBPriorityQueueSetFactory configurable. ## Brief change log - Pass the configurableOptions of EmbeddedRocksDBStateBackend into RocksDBKeyedStateBackendBuilder, then create RocksDBPriorityQueueSetFactory using the configured cache size ## Verifying this change This change is trivial work without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711032791 @flinkbot run azure
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711029547 @flinkbot run azure
[GitHub] [flink] JunRuiLee commented on pull request #23181: W.I.P only used to run ci.
JunRuiLee commented on PR #23181: URL: https://github.com/apache/flink/pull/23181#issuecomment-1711027731 @flinkbot run azure
[jira] [Commented] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762936#comment-17762936 ] Yangze Guo commented on FLINK-33053: I reproduced the issue and filtered out some logs. [^26.dump.zip] [^26.log] This TM still watches the leader of job 7db5c7316828f598234677e2169e7b0f. {{/flink/flink-native-test-117/leader/7db5c7316828f598234677e2169e7b0f/connection_info}} {{ 0x6400013b748a4420}} [~mapohl] [~tison]
[jira] [Updated] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-33053: --- Attachment: 26.dump.zip
[jira] [Updated] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo updated FLINK-33053: --- Attachment: 26.log > Watcher leak in Zookeeper HA mode > - -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #23375: Dev 0908 1101
flinkbot commented on PR #23375: URL: https://github.com/apache/flink/pull/23375#issuecomment-1711016903 ## CI report: * ac37fad2d9e939bd2dd3397de07f677b0931eaad UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] WencongLiu opened a new pull request, #23375: Dev 0908 1101
WencongLiu opened a new pull request, #23375: URL: https://github.com/apache/flink/pull/23375 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. 
*(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] mayuehappy commented on pull request #23169: [FLINK-31238] [WIP] Use IngestDB to speed up Rocksdb rescaling recovery
mayuehappy commented on PR #23169: URL: https://github.com/apache/flink/pull/23169#issuecomment-1711009736 > @mayuehappy I noticed that one required PR against RocksDB ([facebook/rocksdb#11646](https://github.com/facebook/rocksdb/pull/11646)) is still not merged and the conversation is stalling. Do you think it's ok to ping Adam to take another look? @StefanRRichter Thanks for the reminder, I've pinged him and cbi42 under the PR. BTW, is it possible that we merge this part of the JNI code into frocksdb before they merge into rocksdb? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] mayuehappy commented on pull request #23169: [FLINK-31238] [WIP] Use IngestDB to speed up Rocksdb rescaling recovery
mayuehappy commented on PR #23169: URL: https://github.com/apache/flink/pull/23169#issuecomment-1711007885 > It looks like we will need to bump RocksDB from 6.20.x to 8.5.x > > > 8.5.0-ververica-1.0-db-SNASHOT > > @curcur or @Myasuka , do you know if there are any known hurdles that would prevent us from upgrading RocksDB atm? > @mayuehappy thanks for the draft PR. General idea of the PR looks good to me, we should still polish it and avoid code duplication. I currently cannot test the PR because the frocksDB dependency is missing. How do you want to proceed with this? @StefanRRichter I’m not sure how the Flink community tested Frocksdb-jni before it was released. Is there a maven repository for testing where I can deploy my test-frocksdb-jar ? @Myasuka -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
[ https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762932#comment-17762932 ] Yunhong Zheng commented on FLINK-33063: --- Hi [~lincoln.86xy], can you assign this issue to me? Thanks! > udaf with user defined pojo object throw error while generate record equaliser > -- -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #23374: [FLINK-33063][table-runtime] Fix udaf with complex user defined pojo object throw error while generate record equaliser
flinkbot commented on PR #23374: URL: https://github.com/apache/flink/pull/23374#issuecomment-1711004891 ## CI report: * 52afe17fa1e4ac66fa3eb72bb9fc0d0159e2a631 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Zakelly commented on pull request #23364: [FLINK-15922] Only remember expired checkpoints and log received late messages of those in CheckpointCoordinator
Zakelly commented on PR #23364: URL: https://github.com/apache/flink/pull/23364#issuecomment-1711002566 @masteryhx Thanks for your comments. I changed my PR accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] swuferhong opened a new pull request, #23374: [FLINK-33063][table-planner] Fix udaf with complex user defined pojo object throw error while generate record equaliser
swuferhong opened a new pull request, #23374: URL: https://github.com/apache/flink/pull/23374 ## What is the purpose of the change When a user creates a UDAF whose record contains a user-defined complex POJO object (such as a List or Map), codegen throws an error while generating the record equaliser: `A method named "compareTo" is not declared in any enclosing class nor any subtype, nor through a static import.` ## Brief change log - Add a test to cover the user-defined POJO object. ## Verifying this change ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no docs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser
Yunhong Zheng created FLINK-33063: - Summary: udaf with user defined pojo object throw error while generate record equaliser Key: FLINK-33063 URL: https://issues.apache.org/jira/browse/FLINK-33063 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.18.0 Reporter: Yunhong Zheng Fix For: 1.18.0 A UDAF with a user-defined POJO object throws an error while generating the record equaliser: when a user creates a UDAF whose record contains a user-defined complex POJO object (such as a List or Map), codegen throws the following error while generating the record equaliser: {code:java} A method named "compareTo" is not declared in any enclosing class nor any subtype, nor through a static import.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
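To make the failure mode concrete, here is a minimal, hypothetical sketch (not Flink's actual generated code; all class and method names below are made up) of the record shape that has no compareTo to call: a field holding a List of nested POJOs can only be compared with equals(), since java.util.List does not implement Comparable.

```java
import java.util.List;
import java.util.Objects;

public class PojoEqualiserSketch {
    // Hypothetical nested POJO, standing in for a user-defined record field type.
    static class Item {
        final String name;
        Item(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Item && Objects.equals(((Item) o).name, name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    // Accumulator-like record with a complex field: List has no compareTo.
    static class Acc {
        List<Item> items;
    }

    // A correct equaliser must compare such fields with equals(), not compareTo().
    static boolean fieldsEqual(Acc a, Acc b) {
        return Objects.equals(a.items, b.items);
    }

    public static void main(String[] args) {
        Acc a = new Acc(); a.items = List.of(new Item("x"));
        Acc b = new Acc(); b.items = List.of(new Item("x"));
        System.out.println(fieldsEqual(a, b)); // prints "true"
    }
}
```

Generated code that emits a compareTo call for such a field cannot compile, which matches the reported codegen error.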
[GitHub] [flink] WencongLiu commented on pull request #23362: [FLINK-33041][docs] Add an article to guide users to migrate their DataSet jobs to DataStream
WencongLiu commented on PR #23362: URL: https://github.com/apache/flink/pull/23362#issuecomment-1710989624 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-32962) Failure to install python dependencies from requirements file
[ https://issues.apache.org/jira/browse/FLINK-32962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762927#comment-17762927 ] Dian Fu commented on FLINK-32962: - +1 to backport to Flink 1.17 and 1.18, especially Flink 1.18. > Failure to install python dependencies from requirements file > - > > Key: FLINK-32962 > URL: https://issues.apache.org/jira/browse/FLINK-32962 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.15.4, 1.16.2, 1.18.0, 1.17.1 >Reporter: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > > We have encountered an issue when Flink fails to install python dependencies > from requirements file if python environment contains setuptools dependency > version 67.5.0 or above. > Flink job fails with following error: > {code:java} > py4j.protocol.Py4JJavaError: An error occurred while calling o118.await. > : java.util.concurrent.ExecutionException: > org.apache.flink.table.api.TableException: Failed to wait job finish > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.table.api.internal.TableResultImpl.awaitInternal(TableResultImpl.java:118) > at > org.apache.flink.table.api.internal.TableResultImpl.await(TableResultImpl.java:81) > ... > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.client.program.ProgramInvocationException: Job failed > (JobID: 2ca4026944022ac4537c503464d4c47f) > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:83) > ... 
6 more > Caused by: org.apache.flink.client.program.ProgramInvocationException: Job > failed (JobID: 2ca4026944022ac4537c503464d4c47f) > at > org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:130) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) > ... > Caused by: java.io.IOException: java.io.IOException: Failed to execute the > command: /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install > --ignore-installed -r > /var/folders/rb/q_3h54_94b57gz_qkbp593vwgn/T/tm_localhost:63904-4178d3/blobStorage/job_2ca4026944022ac4537c503464d4c47f/blob_p-feb04bed919e628d98b1ef085111482f05bf43a1-1f5c3fda7cbdd6842e950e04d9d807c5 > --install-option > --prefix=/var/folders/rb/q_3h54_94b57gz_qkbp593vwgn/T/python-dist-c17912db-5aba-439f-9e39-69cb8c957bec/python-requirements > output: > Usage: > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] > [package-index-options] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] -r > [package-index-options] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] > [-e] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] > [-e] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] > ... 
> no such option: --install-option > at > org.apache.flink.python.util.PythonEnvironmentManagerUtils.pipInstallRequirements(PythonEnvironmentManagerUtils.java:144) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.installRequirements(AbstractPythonEnvironmentManager.java:215) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.lambda$open$0(AbstractPythonEnvironmentManager.java:126) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager$PythonEnvResources.createResource(AbstractPythonEnvironmentManager.java:435) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager$PythonEnvResources.getOrAllocateSharedResource(AbstractPythonEnvironmentManager.java:402) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.open(AbstractPythonEnvironmentManager.java:114) > at > org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:238) > at > org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57) > at >
[jira] [Assigned] (FLINK-32962) Failure to install python dependencies from requirements file
[ https://issues.apache.org/jira/browse/FLINK-32962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-32962: --- Assignee: Aleksandr Pilipenko > Failure to install python dependencies from requirements file > - -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] 1996fanrui commented on pull request #23372: [FLINK-33042][flamegraph] Allow trigger flamegraph when task is initializing
1996fanrui commented on PR #23372: URL: https://github.com/apache/flink/pull/23372#issuecomment-1710983769 cc @RocMarshal -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-33061) Translate failure-enricher documentation to Chinese
[ https://issues.apache.org/jira/browse/FLINK-33061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weihua Hu reassigned FLINK-33061: - Assignee: Matt Wang > Translate failure-enricher documentation to Chinese > --- > > Key: FLINK-33061 > URL: https://issues.apache.org/jira/browse/FLINK-33061 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Matt Wang >Priority: Not a Priority > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-31895) End-to-end integration tests for failure labels
[ https://issues.apache.org/jira/browse/FLINK-31895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weihua Hu reassigned FLINK-31895: - Assignee: Matt Wang > End-to-end integration tests for failure labels > --- > > Key: FLINK-31895 > URL: https://issues.apache.org/jira/browse/FLINK-31895 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Matt Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33062) Deserialization creates multiple instances of case objects in Scala 2.13
SmedbergM created FLINK-33062: - Summary: Deserialization creates multiple instances of case objects in Scala 2.13 Key: FLINK-33062 URL: https://issues.apache.org/jira/browse/FLINK-33062 Project: Flink Issue Type: Bug Components: API / Core Affects Versions: 1.17.1, 1.15.4 Environment: Scala 2.13.12 Flink 1.17.1 and 1.15.4 running inside IntelliJ Flink 1.15.4 running in AWS Managed Flink Reporter: SmedbergM See [https://github.com/SmedbergM/mwe-flink-2-13-deserialization] for a minimal working example. When running a Flink job with Scala 2.13, deserialized objects whose fields are case objects have those case objects re-instantiated. Thus any code that relies on reference equality (such as methods of `scala.Option`) will break. I suspect that this is due to Kryo deserialization not being singleton-aware for case objects, but I haven't been able to drill in and catch this in the act. Here are relevant lines of my application log: {code:java} 17:37:13.224 [jobmanager-io-thread-1] INFO o.a.f.r.c.CheckpointCoordinator -- No checkpoint found during restore. 17:37:13.531 [parse-book -> Sink: log-book (2/2)#0] WARN smedbergm.mwe.BookSink$ -- Book.isbn: None with identityHashCode 1043314405 reports isEmpty false; true None is 2019204827 17:37:13.531 [parse-book -> Sink: log-book (1/2)#0] INFO smedbergm.mwe.BookSink$ -- Winkler, Scott: Terraform In Action (ISBN 978-1-61729-689-5) $49.99 17:37:13.534 [parse-book -> Sink: log-book (2/2)#0] WARN smedbergm.mwe.BookSink$ -- Book.isbn: None with identityHashCode 465138693 reports isEmpty false; true None is 2019204827 17:37:13.534 [parse-book -> Sink: log-book (1/2)#0] INFO smedbergm.mwe.BookSink$ -- Čukić, Ivan: Functional Programming in C++ (ISBN 978-1-61729-381-8) $49.99 17:37:13.538 [flink-akka.actor.default-dispatcher-8] INFO o.a.f.r.c.CheckpointCoordinator -- Stopping checkpoint coordinator for job eed1c049790ac5f38664ddfd6b049282. 
{code} I know that https://issues.apache.org/jira/browse/FLINK-13414 (support for Scala 2.13) is still listed as in-progress, but there is no warning in the docs that using 2.13 might not be stable. (This particular error does not occur on Scala 2.12, in this case because Option.isEmpty was re-implemented in 2.13; however, I suspect that multiple deserialization may be occurring already in 2.12.) -- This message was sent by Atlassian Jira (v8.20.10#820010)
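As an illustration of the suspected mechanism: plain Java serialization shows the same non-singleton-aware behavior that Kryo exhibits by default. The class below is a hypothetical stand-in for a Scala case object (the name is made up), serializable but without a readResolve() hook, so deserialization produces a fresh instance and reference equality breaks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SingletonDemo {
    // Hypothetical stand-in for a Scala case object: a serializable singleton
    // without readResolve(), so deserialization creates a fresh instance.
    static class None implements Serializable {
        static final None INSTANCE = new None();
        private None() {}
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(None.INSTANCE);
        oos.flush();
        Object copy = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
        // Reference equality breaks, analogous to Option.isEmpty in the report.
        System.out.println(copy == None.INSTANCE); // prints "false"
    }
}
```

Scala's own case objects avoid this for Java serialization by generating readResolve; a Kryo registration that resolves back to the module instance would play the same role.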
[jira] [Updated] (FLINK-20628) Port RabbitMQ Sources to FLIP-27 API
[ https://issues.apache.org/jira/browse/FLINK-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-20628: --- Labels: auto-deprioritized-major pull-request-available stale-assigned (was: auto-deprioritized-major pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > Port RabbitMQ Sources to FLIP-27 API > > > Key: FLINK-20628 > URL: https://issues.apache.org/jira/browse/FLINK-20628 > Project: Flink > Issue Type: Improvement > Components: Connectors/ RabbitMQ >Reporter: Jan Westphal >Assignee: RocMarshal >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available, > stale-assigned > > *Structure* > The new RabbitMQ Source will have three components: > * RabbitMQ enumerator that receives one RabbitMQ Channel Config. > * RabbitMQ splits contain the RabbitMQ Channel Config > * RabbitMQ Readers which subscribe to the same RabbitMQ channel and receive > the messages (automatically load balanced by RabbitMQ). > *Checkpointing Enumerators* > The enumerator only needs to checkpoint the RabbitMQ channel config since the > continuous discovery of new unread/unhandled messages is taken care of by the > subscribed RabbitMQ readers and RabbitMQ itself. > *Checkpointing Readers* > The new RabbitMQ Source needs to ensure that every reader can be checkpointed. 
> Since RabbitMQ is non-persistent and cannot be read by offset, a combined > usage of checkpoints and message acknowledgments is necessary. Until a > received message is checkpointed by a reader, it will stay in an > unacknowledged state. As soon as the checkpoint is created, the messages from > the last checkpoint can be acknowledged as handled against RabbitMQ and thus > will be deleted only then. Messages need to be acknowledged one by one as > messages are handled by each SourceReader individually. > When deserializing the messages we will make use of the implementation in the > existing RabbitMQ Source. > *Message Delivery Guarantees* > Unacknowledged messages of a reader will be redelivered by RabbitMQ > automatically to other consumers of the same channel if the reader goes down. > > This source will support only at-least-once, as this is the default > RabbitMQ behavior; everything else would require changes to RabbitMQ > itself or would impair the idea of parallelizing SourceReaders. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32509) avoid using skip in InputStreamFSInputWrapper.seek
[ https://issues.apache.org/jira/browse/FLINK-32509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-32509: --- Labels: auto-deprioritized-major pull-request-available (was: pull-request-available stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > avoid using skip in InputStreamFSInputWrapper.seek > -- > > Key: FLINK-32509 > URL: https://issues.apache.org/jira/browse/FLINK-32509 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.18.0 >Reporter: Libin Qin >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > > InputStream.skip does not return -1 at EOF. > The Javadoc of InputStream says "The skip method may, for a variety of > reasons, end up skipping over some smaller number of bytes, possibly 0." > FileInputStream even allows skipping any number of bytes past the end of > the file. > So the "seek" method of InputStreamFSInputWrapper will loop forever if > the desired position exceeds the end of the file. > > I reproduced this with the following case: > > {code:java} > byte[] bytes = "flink".getBytes(); > try (InputStream inputStream = new ByteArrayInputStream(bytes)){ > InputStreamFSInputWrapper wrapper = new > InputStreamFSInputWrapper(inputStream); > wrapper.seek(20); > } {code} > I found a commons-io issue that discusses this problem with skip: > https://issues.apache.org/jira/browse/IO-203 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
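A minimal demonstration of the reported behavior, without any Flink code (the class name is made up): skip returns 0 at EOF rather than -1, so an unguarded "skip until remaining is zero" loop never terminates. The loop below is bounded and bails out on a zero return, which is the kind of guard a fixed seek needs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class SkipPastEofDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream("flink".getBytes());
        long remaining = 20;          // seek target beyond the 5-byte stream
        int rounds = 0;
        while (remaining > 0 && rounds < 10) {  // bounded here; a naive loop is not
            long skipped = in.skip(remaining);
            if (skipped == 0) break;  // the guard: skip() signals EOF with 0, not -1
            remaining -= skipped;
            rounds++;
        }
        System.out.println(remaining); // prints "15": only 5 bytes could be skipped
    }
}
```

With the zero-return guard removed, the loop spins forever once the stream is exhausted, which matches the infinite loop described in the ticket.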
[jira] [Updated] (FLINK-12869) Add yarn acls capability to flink containers
[ https://issues.apache.org/jira/browse/FLINK-12869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-12869: --- Labels: pull-request-available stale-assigned (was: pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > Add yarn acls capability to flink containers > > > Key: FLINK-12869 > URL: https://issues.apache.org/jira/browse/FLINK-12869 > Project: Flink > Issue Type: New Feature > Components: Deployment / YARN >Reporter: Nicolas Fraison >Assignee: Archit Goyal >Priority: Minor > Labels: pull-request-available, stale-assigned > Time Spent: 20m > Remaining Estimate: 0h > > Yarn provide application acls mechanism to be able to provide specific rights > to other users than the one running the job (view logs through the > resourcemanager/job history, kill the application) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31238) Use IngestDB to speed up Rocksdb rescaling recovery
[ https://issues.apache.org/jira/browse/FLINK-31238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-31238: --- Labels: pull-request-available stale-assigned (was: pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > Use IngestDB to speed up Rocksdb rescaling recovery > > > Key: FLINK-31238 > URL: https://issues.apache.org/jira/browse/FLINK-31238 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.16.1 >Reporter: Yue Ma >Assignee: Yue Ma >Priority: Major > Labels: pull-request-available, stale-assigned > Attachments: image-2023-02-27-16-41-18-552.png, > image-2023-02-27-16-57-18-435.png, image-2023-03-07-14-27-10-260.png, > image-2023-03-09-15-23-30-581.png, image-2023-03-09-15-26-12-314.png, > image-2023-03-09-15-28-32-363.png, image-2023-03-09-15-41-03-074.png, > image-2023-03-09-15-41-08-379.png, image-2023-03-09-15-45-56-081.png, > image-2023-03-09-15-46-01-176.png, image-2023-03-09-15-50-04-281.png, > image-2023-03-29-15-25-21-868.png, image-2023-07-17-14-37-38-864.png, > image-2023-07-17-14-38-56-946.png, image-2023-07-22-14-16-31-856.png, > image-2023-07-22-14-19-01-390.png, image-2023-08-08-21-32-43-783.png, > image-2023-08-08-21-34-39-008.png, image-2023-08-08-21-39-39-135.png, > screenshot-1.png > > > (The detailed design is in this document > 
[https://docs.google.com/document/d/10MNVytTsyiDLZQSR89kDkVdmK_YjbM6jh0teerfDFfI|https://docs.google.com/document/d/10MNVytTsyiDLZQSR89kDkVdmK_YjbM6jh0teerfDFfI]) > There have been many discussions and optimizations in the community about > optimizing rocksdb scaling and recovery. > https://issues.apache.org/jira/browse/FLINK-17971 > https://issues.apache.org/jira/browse/FLINK-8845 > https://issues.apache.org/jira/browse/FLINK-21321 > We hope to discuss some of our explorations under this ticket. > The process of scaling and recovering in rocksdb simply requires two steps > # Insert the valid keyGroup data of the new task. > # Delete the invalid data in the old stateHandle. > The current method for data writing is to specify the main DB first and then > insert data using writeBatch. In addition, the method of deleteRange is > currently used to speed up the ClipDB. But in our production environment, we > found that the speed of rescaling is still very slow, especially when the > state of a single Task is large. > > We hope that the previous sst file can be reused directly when restoring > state, instead of re-traversing the data. So we made some attempts to optimize > it in our internal version of flink and frocksdb. > > We added two APIs *ClipDb* and *IngestDb* in frocksdb. > * ClipDB is used to clip the data of a DB. Different from db.DeleteRange and > db.Delete, DeleteValue and RangeTombstone will not be generated for parts > beyond the key range. We will iterate over the FileMetaData of the db and > process each sst file. There are three situations here. > If all the keys of a file are required, we will keep the sst file and do > nothing. > If all the keys of the sst file exceed the specified range, we will delete > the file directly. 
> If we only need some part of the sst file, we will rewrite the required keys > to generate a new sst file. > All sst file changes will be placed in a VersionEdit, and the current > version will LogAndApply this edit to ensure that these changes can take > effect. > * IngestDb is used to directly ingest all sst files of one DB into another > DB. But it is necessary to strictly ensure that the keys of the two DBs do > not overlap, which is easy to do in the Flink scenario. The hard link method > will be used in the process of ingesting files, so it will be very fast. At > the same time, the file number of the main DB will be incremented > sequentially, and the SequenceNumber of the main DB will be updated to the > larger SequenceNumber of the two DBs. > When IngestDb and ClipDb are supported, the state restoration logic is as > follows > * Open the first StateHandle as the main DB and pause the compaction. > * Clip the main DB according to the KeyGroup range of the Task with ClipDB
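The three per-file cases of ClipDB described above can be sketched as a small decision function. This is purely an illustration with integer keys and hypothetical names — the real implementation inspects the smallest/largest byte-string keys recorded in each file's FileMetaData, not integers:

```java
public class ClipSketch {
    enum Action { KEEP, DELETE, REWRITE }

    // Decide what ClipDB does with one SST file whose keys span
    // [smallest, largest], given the target key range [rangeStart, rangeEnd).
    static Action clipAction(int smallest, int largest, int rangeStart, int rangeEnd) {
        if (smallest >= rangeStart && largest < rangeEnd) {
            return Action.KEEP;    // all keys of the file are required
        }
        if (largest < rangeStart || smallest >= rangeEnd) {
            return Action.DELETE;  // all keys fall outside the specified range
        }
        return Action.REWRITE;     // only part of the file is needed: rewrite it
    }

    public static void main(String[] args) {
        System.out.println(clipAction(10, 19, 10, 20)); // fully inside the range
        System.out.println(clipAction(30, 40, 10, 20)); // fully outside the range
        System.out.println(clipAction(15, 25, 10, 20)); // straddles the boundary
    }
}
```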
[jira] [Updated] (FLINK-27402) Unexpected boolean expression simplification for AND expression
[ https://issues.apache.org/jira/browse/FLINK-27402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-27402: --- Labels: pull-request-available stale-assigned (was: pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > Unexpected boolean expression simplification for AND expression > - > > Key: FLINK-27402 > URL: https://issues.apache.org/jira/browse/FLINK-27402 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: luoyuxia >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available, stale-assigned > > Flink supports comparisons between string and boolean, so the following SQL > works fine > > {code:java} > create table (c1 int, c2 string); > select * from c2 = true; > {code} > But the following SQL will throw an exception > > {code:java} > select * from c1 = 1 and c2 = true; {code} > The reason is that Flink will try to simplify BOOLEAN expressions if possible > in > "c1 = 1 and c2 = true". 
> So "c2 = true" will be simplified to "c2" by the following code in Flink: > > {code:java} > RexSimplify#simplifyAnd2ForUnknownAsFalse > // Simplify BOOLEAN expressions if possible > while (term.getKind() == SqlKind.EQUALS) { > RexCall call = (RexCall) term; > if (call.getOperands().get(0).isAlwaysTrue()) { > term = call.getOperands().get(1); > terms.set(i, term); > continue; > } else if (call.getOperands().get(1).isAlwaysTrue()) { > term = call.getOperands().get(0); > terms.set(i, term); > continue; > } > break; > } {code} > So the expression will be reduced to "c1 = 1 and c2". But AND requires both > sides to be boolean expressions, and c2 is not a boolean expression because it > actually is a string. > Then the exception "Boolean expression type expected" is thrown. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
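One way to see the needed guard: reducing `expr = true` to `expr` is only sound when the operand is statically BOOLEAN. A toy model of that guard, independent of Calcite — `simplifyEqualsTrue` is an illustrative name, not the RexSimplify API:

```java
public class SimplifyGuard {
    enum SqlType { BOOLEAN, STRING, INT }

    // Reduce "operand = true" to "operand" only when the operand is a
    // BOOLEAN expression; otherwise keep the comparison so that runtime
    // coercion (e.g. string-to-boolean) still happens.
    static String simplifyEqualsTrue(String operand, SqlType type) {
        if (type == SqlType.BOOLEAN) {
            return operand;             // b1 = true  ->  b1
        }
        return operand + " = true";     // keep the comparison for non-boolean operands
    }

    public static void main(String[] args) {
        System.out.println(simplifyEqualsTrue("b1", SqlType.BOOLEAN)); // safe to strip
        System.out.println(simplifyEqualsTrue("c2", SqlType.STRING));  // must be kept
    }
}
```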
[GitHub] [flink] flinkbot commented on pull request #23373: [FLINK-30025][doc] improve the option description with more precise content
flinkbot commented on PR #23373: URL: https://github.com/apache/flink/pull/23373#issuecomment-1710829165 ## CI report: * 6251199a537a74a4f122b1d6139380b7da89de3c UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe opened a new pull request, #23373: [FLINK-30025][doc] improve the option description with more precise content
JingGe opened a new pull request, #23373: URL: https://github.com/apache/flink/pull/23373 ## What is the purpose of the change improve the option description with more precise content ## Brief change log - SqlClientOptions - TableConfigOptions ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-32906) Release Testing: Verify FLIP-279 Unified the max display column width for SqlClient and Table APi in both Streaming and Batch execMode
[ https://issues.apache.org/jira/browse/FLINK-32906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762879#comment-17762879 ] Jing Ge commented on FLINK-32906: - [~liyubin117] e.g. means "for example", it does not need to contain all cases. Nevertheless, I will improve the description with more precise content. > Release Testing: Verify FLIP-279 Unified the max display column width for > SqlClient and Table APi in both Streaming and Batch execMode > -- > > Key: FLINK-32906 > URL: https://issues.apache.org/jira/browse/FLINK-32906 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: Jing Ge >Assignee: Yubin Li >Priority: Major > Fix For: 1.18.0 > > Attachments: image-2023-08-22-18-54-45-122.png, > image-2023-08-22-18-54-57-699.png, image-2023-08-22-18-58-11-241.png, > image-2023-08-22-18-58-20-509.png, image-2023-08-22-19-04-49-344.png, > image-2023-08-22-19-06-49-335.png, image-2023-08-23-10-39-10-225.png, > image-2023-08-23-10-39-26-602.png, image-2023-08-23-10-41-30-913.png, > image-2023-08-23-10-41-42-513.png, image-2023-08-30-17-09-32-728.png, > screenshot-1.png > > > more info could be found at FLIP-279 > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-279+Unified+the+max+display+column+width+for+SqlClient+and+Table+APi+in+both+Streaming+and+Batch+execMode] > > Tests: Both configs can be set to a new value, and the following behaviours are expected:
> | | |sql-client.display.max-column-width (default 30)|table.display.max-column-width (default 30)|
> |sqlclient|Streaming|Text longer than the value will be truncated and replaced with “...”|Text longer than the value will be truncated and replaced with “...”|
> |sqlclient|Batch|Text longer than the value will be truncated and replaced with “...”|Text longer than the value will be truncated and replaced with “...”|
> |Table API|Streaming|No effect. table.display.max-column-width with the default value 30 will be used|Text longer than the value will be truncated and replaced with “...”|
> |Table API|Batch|No effect. table.display.max-column-width with the default value 30 will be used|Text longer than the value will be truncated and replaced with “...”|
> 
> Please pay attention that this task offers a backward compatible solution and > deprecates sql-client.display.max-column-width, which means once > sql-client.display.max-column-width is used, it falls back into the old scenario > where table.display.max-column-width didn't exist, and any changes of > table.display.max-column-width won't take effect. You should test it either > by only using table.display.max-column-width or by using > sql-client.display.max-column-width, but not both of them back and forth. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
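The truncation rule from the table above is simple enough to sketch in a few lines. The `truncate` helper is hypothetical; Flink's actual display formatting code is not shown here:

```java
public class ColumnTruncation {
    // Cut a cell value down to maxColumnWidth characters, replacing the
    // overflow with "..." as described for both display options above.
    static String truncate(String value, int maxColumnWidth) {
        if (value.length() <= maxColumnWidth) {
            return value;
        }
        return value.substring(0, maxColumnWidth - 3) + "...";
    }

    public static void main(String[] args) {
        System.out.println(truncate("short", 30));
        System.out.println(truncate("a-very-long-cell-value-exceeding-width", 10));
    }
}
```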
[GitHub] [flink] pgaref commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
pgaref commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318975346 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's REST API (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise values associated with them may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +The JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded all class names should be defined as part of [jobmanager.failure-enrichers configuration]({{< ref "docs/deployment/config#jobmanager-failure-enrichers" >}}). + If this configuration is empty, NO enrichers will be started. Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: So what you are implying here is by filtering on the Factory level we could avoid instantiating unwanted enrichers that could be potentially setting up resources? 
Happy to add that as a followup ticket and follow an approach similar to MetricsReporter setup -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
pgaref commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318968366 ## docs/content.zh/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. Review Comment: like a setup/close methods? Main functionality is enriching/labeling exceptions but that would be a useful addition indeed. Lets see what users do with it first :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
pgaref commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318966093 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's REST API (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise values associated with them may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +The JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded all class names should be defined as part of [jobmanager.failure-enrichers configuration]({{< ref "docs/deployment/config#jobmanager-failure-enrichers" >}}). + If this configuration is empty, NO enrichers will be started. 
Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: Yes, this is the first version of the feature -- if many enrichers are loaded we could easily change that to what metrics reporters do but looked like an overkill for now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
pgaref commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318964767 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded: +* All class names should be defined as part of {{< javadoc file="org/apache/flink/configuration/JobManagerOptions.html#FAILURE_ENRICHERS_LIST" name="jobmanager.failure-enrichers">}} configuration. + If this configuration is empty, NO enrichers will be started. Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: `org.apache.flink.test.plugin.jar.failure` is the package? Not sure what you are referring to here -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pgaref commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
pgaref commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318963910 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. Review Comment: When enrichers output keys overlap, only the first one is loaded and the rest are dropped with an Error message -- that was the meaning of `This set of keys has to be unique across enrichers otherwise may be ignored.` Rephrasing -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] venkata91 commented on pull request #23313: [FLINK-20767][table planner] Support filter push down on nested fields
venkata91 commented on PR #23313: URL: https://github.com/apache/flink/pull/23313#issuecomment-171041 Also tagging @slinkydeveloper -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] venkata91 commented on pull request #23313: [FLINK-20767][table planner] Support filter push down on nested fields
venkata91 commented on PR #23313: URL: https://github.com/apache/flink/pull/23313#issuecomment-1710555055 @wuchong @swuferhong @JingsongLi @becketqin please review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318921748 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. Review Comment: oh wow all enrichers that clash in any way are just completely dropped. That doesn't seem to match the documented behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
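The packaging steps quoted above (service entry file, plugin directory) can be sketched as shell commands; every name below — the factory class `com.example.MyFailureEnricherFactory`, the jar, the `failure-enrichment` directory — is illustrative, not taken from the PR:

```shell
# ServiceLoader entry that must be bundled into the jar. The file name is
# fixed by the factory interface; the content is your factory's
# fully-qualified class name (hypothetical here).
mkdir -p META-INF/services
echo "com.example.MyFailureEnricherFactory" \
  > META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory

# Plugin directory inside the Flink distribution; its name is arbitrary.
mkdir -p plugins/failure-enrichment
# cp my-failure-enricher.jar plugins/failure-enrichment/
```

The jar placed in `plugins/failure-enrichment/` must contain the `META-INF/services/` entry above alongside the enricher classes.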
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318909983 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's REST API (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise values associated with them may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +The JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded all class names should be defined as part of [jobmanager.failure-enrichers configuration]({{< ref "docs/deployment/config#jobmanager-failure-enrichers" >}}). + If this configuration is empty, NO enrichers will be started. Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: Oh my it does work like that :( Better hope the enrichers don't setup any resources in their constructor :/
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318907314 ## docs/content.zh/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. Review Comment: side-note: bit surprised that something that is advertised as being able to talk to external systems has no lifecycle hooks for managing resources (like a http client). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318902851 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's REST API (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise values associated with them may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +The JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded all class names should be defined as part of [jobmanager.failure-enrichers configuration]({{< ref "docs/deployment/config#jobmanager-failure-enrichers" >}}). + If this configuration is empty, NO enrichers will be started. Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: side-note: If multiple enrichers are used this option can become a bit cumbersome for users, and a structure like we use for metrics reporters could've solved that (and paved a way to make them configurable!) 
``` jobmanager.failure-enrichers.enricher1.class = org.apache.flink.test.plugin.jar.failure.CustomEnricher ``` ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,106 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). +This enables users to implement their own failure enrichment plugins to categorize job failures, expose custom metrics, or make calls to external notification systems. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318898982 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. +{{< /hint >}} + +FailureEnricherFactory example: + +``` java +public class TestFailureEnricherFactory implements FailureEnricherFactory { + + @Override + public FailureEnricher createFailureEnricher(Configuration conf) { +return new CustomEnricher(); + } +} +``` + +FailureEnricher example: + +``` java +public class CustomEnricher implements FailureEnricher { +private final Set<String> outputKeys; + +public CustomEnricher() { +this.outputKeys = Collections.singleton("labelKey"); +} + +@Override +public Set<String> getOutputKeys() { +return outputKeys; +} + +@Override +public CompletableFuture<Map<String, String>> processFailure( +Throwable cause, Context context) { +return CompletableFuture.completedFuture(Collections.singletonMap("labelKey", "labelValue")); +} +} +``` + +### Configuration + +JobManager loads FailureEnricher plugins at startup. To make sure your FailureEnrichers are loaded: +* All class names should be defined as part of {{< javadoc file="org/apache/flink/configuration/JobManagerOptions.html#FAILURE_ENRICHERS_LIST" name="jobmanager.failure-enrichers">}} configuration. + If this configuration is empty, NO enrichers will be started. Example: +``` +jobmanager.failure-enrichers = org.apache.flink.test.plugin.jar.failure.CustomEnricher Review Comment: Did you miss this one?
[GitHub] [flink] zentol commented on a diff in pull request #22468: [FLINK-31889] Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318898608 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. Review Comment: I don't understand what the implication of "ignoring values" is. In practice all that happens is that one enricher would override the label of another one, right? Can we just write that?
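zentol's reading — that a clashing key gets silently overwritten rather than "ignored" — can be illustrated with a plain map merge. This is a sketch of the concern only, not Flink's actual label-merging code; the enricher names and keys are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class LabelClashDemo {
    public static void main(String[] args) {
        // Two hypothetical enrichers both declare the output key "type".
        Map<String, String> fromEnricherA = Map.of("type", "System");
        Map<String, String> fromEnricherB = Map.of("type", "User");

        // Merging the label maps naively: whichever enricher is merged
        // last wins, and the other enricher's value is silently dropped.
        Map<String, String> merged = new HashMap<>(fromEnricherA);
        merged.putAll(fromEnricherB);

        System.out.println(merged); // {type=User}
    }
}
```

This is why the documentation requires the output-key sets to be unique across enrichers.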
[jira] [Commented] (FLINK-32962) Failure to install python dependencies from requirements file
[ https://issues.apache.org/jira/browse/FLINK-32962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762807#comment-17762807 ] Hong Liang Teoh commented on FLINK-32962: - [~a.pilipenko] should we also backport this to Flink 1.17 and 1.18? > Failure to install python dependencies from requirements file > - > > Key: FLINK-32962 > URL: https://issues.apache.org/jira/browse/FLINK-32962 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.15.4, 1.16.2, 1.18.0, 1.17.1 >Reporter: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > > We have encountered an issue when Flink fails to install python dependencies > from requirements file if python environment contains setuptools dependency > version 67.5.0 or above. > Flink job fails with following error: > {code:java} > py4j.protocol.Py4JJavaError: An error occurred while calling o118.await. > : java.util.concurrent.ExecutionException: > org.apache.flink.table.api.TableException: Failed to wait job finish > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.table.api.internal.TableResultImpl.awaitInternal(TableResultImpl.java:118) > at > org.apache.flink.table.api.internal.TableResultImpl.await(TableResultImpl.java:81) > ... > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.client.program.ProgramInvocationException: Job failed > (JobID: 2ca4026944022ac4537c503464d4c47f) > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:83) > ... 
6 more > Caused by: org.apache.flink.client.program.ProgramInvocationException: Job > failed (JobID: 2ca4026944022ac4537c503464d4c47f) > at > org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:130) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) > ... > Caused by: java.io.IOException: java.io.IOException: Failed to execute the > command: /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install > --ignore-installed -r > /var/folders/rb/q_3h54_94b57gz_qkbp593vwgn/T/tm_localhost:63904-4178d3/blobStorage/job_2ca4026944022ac4537c503464d4c47f/blob_p-feb04bed919e628d98b1ef085111482f05bf43a1-1f5c3fda7cbdd6842e950e04d9d807c5 > --install-option > --prefix=/var/folders/rb/q_3h54_94b57gz_qkbp593vwgn/T/python-dist-c17912db-5aba-439f-9e39-69cb8c957bec/python-requirements > output: > Usage: > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] <requirement specifier> [package-index-options] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] -r <requirements file> [package-index-options] ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] [-e] <vcs project url> ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] [-e] <local project path> ... > /Users/z3d1k/.pyenv/versions/3.9.17/bin/python -m pip install [options] <archive url/path> ... 
> no such option: --install-option > at > org.apache.flink.python.util.PythonEnvironmentManagerUtils.pipInstallRequirements(PythonEnvironmentManagerUtils.java:144) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.installRequirements(AbstractPythonEnvironmentManager.java:215) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.lambda$open$0(AbstractPythonEnvironmentManager.java:126) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager$PythonEnvResources.createResource(AbstractPythonEnvironmentManager.java:435) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager$PythonEnvResources.getOrAllocateSharedResource(AbstractPythonEnvironmentManager.java:402) > at > org.apache.flink.python.env.AbstractPythonEnvironmentManager.open(AbstractPythonEnvironmentManager.java:114) > at > org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:238) > at > org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57) > at >
[jira] [Commented] (FLINK-32962) Failure to install python dependencies from requirements file
[ https://issues.apache.org/jira/browse/FLINK-32962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762805#comment-17762805 ] Hong Liang Teoh commented on FLINK-32962: - merged commit [{{1193e17}}|https://github.com/apache/flink/commit/1193e178f6c478bf493231fc53c74f03158c0ca9] into apache:master
[GitHub] [flink] hlteoh37 merged pull request #23297: [FLINK-32962][python] Remove pip version check on installing dependencies.
hlteoh37 merged PR #23297: URL: https://github.com/apache/flink/pull/23297
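For context on the merged fix: the stack trace in FLINK-32962 shows pip rejecting `--install-option`, an option recent pip releases removed; `--prefix` is an ordinary `pip install` option that can be passed directly. A sketch of the distinction (paths illustrative, not from the report):

```shell
# Old invocation, fails on recent pip with "no such option: --install-option":
#   python -m pip install --ignore-installed -r requirements.txt \
#       --install-option --prefix=/tmp/python-requirements
#
# Supported form: pass --prefix directly to `pip install`:
#   python -m pip install --ignore-installed -r requirements.txt \
#       --prefix=/tmp/python-requirements
#
# Confirm the option exists on the local pip:
python3 -m pip help install | grep -- --prefix
```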
[GitHub] [flink] pgaref commented on pull request #22468: FLINK-31889: Add documentation for implementing/loading enrichers
pgaref commented on PR #22468: URL: https://github.com/apache/flink/pull/22468#issuecomment-1710421376 > This needs a copy in the chinese version of the docs. (that may be in english; translation is a follow-up) Thanks for the review @zentol! Addressed your comments, and also created a copy in the Chinese version of the docs that will be completed as part of https://issues.apache.org/jira/browse/FLINK-33061 by @wangzzu
[GitHub] [flink] flinkbot commented on pull request #23372: [FLINK-33042][flamegraph] Allow trigger flamegraph when task is initializing
flinkbot commented on PR #23372: URL: https://github.com/apache/flink/pull/23372#issuecomment-1710406803 ## CI report: * 5a37a649a983c2f23d4de33ccac525bd9ead16bb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] hlteoh37 commented on pull request #23297: [FLINK-32962][python] Remove pip version check on installing dependencies.
hlteoh37 commented on PR #23297: URL: https://github.com/apache/flink/pull/23297#issuecomment-1710397554 @JingGe @HuangXingBo Any thoughts on this?
[GitHub] [flink] 1996fanrui opened a new pull request, #23372: [FLINK-33042][flamegraph] Allow trigger flamegraph when task is initializing
1996fanrui opened a new pull request, #23372: URL: https://github.com/apache/flink/pull/23372 ## What is the purpose of the change Currently, the flamegraph can only be triggered while a task is running. After [FLINK-17012](https://issues.apache.org/jira/browse/FLINK-17012) and [FLINK-22215](https://issues.apache.org/jira/browse/FLINK-22215), Flink split the running phase into running and initializing. We should allow triggering the flamegraph while a task is initializing, for example when initialization is very slow and needs troubleshooting. ## Brief change log [FLINK-33042][flamegraph] Allow trigger flamegraph when task is initializing ## Verifying this change Added `@Parameter public ExecutionState executionState;` to `VertexThreadInfoTrackerTest`, so all tests run with both the running and initializing states. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no
[jira] [Commented] (FLINK-33018) GCP Pubsub PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream failed
[ https://issues.apache.org/jira/browse/FLINK-33018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762793#comment-17762793 ] Ryan Skraba commented on FLINK-33018: - I think I have a fix -- it looks like occasionally (every 10K runs or so) we hit the explicit cancel() written in this specific test, before we hit the implicit cancel() due to the endOfStream message being discovered. I'm letting this run for a while before submitting a PR. This test also slows down in consecutive runs, which might indicate a leak, but I'm hoping this is not the case. > GCP Pubsub > PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream > failed > > > Key: FLINK-33018 > URL: https://issues.apache.org/jira/browse/FLINK-33018 > Project: Flink > Issue Type: Bug > Components: Connectors / Google Cloud PubSub >Affects Versions: gcp-pubsub-3.0.2 >Reporter: Martijn Visser >Priority: Blocker > > https://github.com/apache/flink-connector-gcp-pubsub/actions/runs/6061318336/job/16446392844#step:13:507 > {code:java} > [INFO] > [INFO] Results: > [INFO] > Error: Failures: > Error: > PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream:119 > > expected: ["1", "2", "3"] > but was: ["1", "2"] > [INFO] > Error: Tests run: 30, Failures: 1, Errors: 0, Skipped: 0 > [INFO] > [INFO] > > {code}
[jira] [Updated] (FLINK-15736) Support Java 17 (LTS)
[ https://issues.apache.org/jira/browse/FLINK-15736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-15736: -- Release Note: Apache Flink was made ready to compile and run with Java 17 (LTS). This feature is still in beta mode. Issues should be reported in Flink's bug tracker. (was: Apache Flink was made ready to compile and run with Java 17 (LTS).) > Support Java 17 (LTS) > - > > Key: FLINK-15736 > URL: https://issues.apache.org/jira/browse/FLINK-15736 > Project: Flink > Issue Type: New Feature > Components: Build System >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: auto-deprioritized-major, pull-request-available, > stale-assigned > Fix For: 1.18.0 > > > Long-term issue for preparing Flink for Java 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33061) Translate failure-enricher documentation to Chinese
[ https://issues.apache.org/jira/browse/FLINK-33061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762782#comment-17762782 ] Panagiotis Garefalakis commented on FLINK-33061: [~wangm92] feel free to take over > Translate failure-enricher documentation to Chinese > --- > > Key: FLINK-33061 > URL: https://issues.apache.org/jira/browse/FLINK-33061 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Priority: Not a Priority > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33061) Translate failure-enricher documentation to Chinese
[ https://issues.apache.org/jira/browse/FLINK-33061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated FLINK-33061: --- Summary: Translate failure-enricher documentation to Chinese (was: Translate failure-enricher documentation) > Translate failure-enricher documentation to Chinese > --- > > Key: FLINK-33061 > URL: https://issues.apache.org/jira/browse/FLINK-33061 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Priority: Not a Priority > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33061) Translate failure-enricher documentation
Panagiotis Garefalakis created FLINK-33061: -- Summary: Translate failure-enricher documentation Key: FLINK-33061 URL: https://issues.apache.org/jira/browse/FLINK-33061 Project: Flink Issue Type: Sub-task Reporter: Panagiotis Garefalakis -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33018) GCP Pubsub PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream failed
[ https://issues.apache.org/jira/browse/FLINK-33018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762765#comment-17762765 ] Jayadeep Jayaraman commented on FLINK-33018: I ran it 23K times yesterday and didn't face any issue, hence I stopped the test :). Good that you were able to reproduce the error!
[jira] [Assigned] (FLINK-30649) Shutting down MiniCluster times out
[ https://issues.apache.org/jira/browse/FLINK-30649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl reassigned FLINK-30649: - Assignee: (was: Matthias Pohl) > Shutting down MiniCluster times out > --- > > Key: FLINK-30649 > URL: https://issues.apache.org/jira/browse/FLINK-30649 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes, Test Infrastructure >Affects Versions: 1.17.0 >Reporter: Matthias Pohl >Priority: Critical > Labels: stale-assigned, starter, test-stability > > {{Run kubernetes session test (default input)}} failed with a timeout. > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44748=logs=bea52777-eaf8-5663-8482-18fbc3630e81=b2642e3a-5b86-574d-4c8a-f7e2842bfb14=6317] > It appears that there was some issue with shutting down the pods of the > MiniCluster: > {code:java} > 2023-01-12T08:22:13.1388597Z timed out waiting for the condition on > pods/flink-native-k8s-session-1-7dc9976688-gq788 > 2023-01-12T08:22:13.1390040Z timed out waiting for the condition on > pods/flink-native-k8s-session-1-taskmanager-1-1 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31889) Add documentation for implementing/loading enrichers
[ https://issues.apache.org/jira/browse/FLINK-31889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762746#comment-17762746 ] Matt Wang commented on FLINK-31889: --- [~pgaref] if you create a ticket for the Chinese documentation, you can assign it to me; I will translate it. > Add documentation for implementing/loading enrichers > > > Key: FLINK-31889 > URL: https://issues.apache.org/jira/browse/FLINK-31889 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > > Describe how enrichers can be implemented and loaded to Flink as part of > documentation
[jira] [Updated] (FLINK-24943) SequenceNumber class is not POJO type
[ https://issues.apache.org/jira/browse/FLINK-24943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-24943: -- Fix Version/s: (was: 1.15.5) (was: 1.16.3) > SequenceNumber class is not POJO type > - > > Key: FLINK-24943 > URL: https://issues.apache.org/jira/browse/FLINK-24943 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.12.5, 1.13.3, 1.14.4, 1.15.0 >Reporter: Alexander Egorov >Assignee: Alexander Egorov >Priority: Minor > Labels: pull-request-available, stale-assigned > Fix For: aws-connector-4.2.0 > > > The SequenceNumber class is currently part of the "Kinesis-Stream-Shard-State", > but it does not follow the requirements of a POJO-compatible type. Because of that > we are getting a warning like this: > {{TypeExtractor - class > software.amazon.kinesis.connectors.flink.model.SequenceNumber does not > contain a setter for field sequenceNumber}} > While the warning itself, or the inability to use an optimal serializer for such a > small state, is not the problem, this warning prevents us from disabling Generic > Types via {{disableGenericTypes()}}. > So, the problem is similar to > https://issues.apache.org/jira/browse/FLINK-15904
[jira] [Commented] (FLINK-24943) SequenceNumber class is not POJO type
[ https://issues.apache.org/jira/browse/FLINK-24943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762736#comment-17762736 ] Danny Cranmer commented on FLINK-24943: --- [~Cyberness] can you please recreate this PR against https://github.com/apache/flink-connector-aws?
[jira] [Commented] (FLINK-24943) SequenceNumber class is not POJO type
[ https://issues.apache.org/jira/browse/FLINK-24943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762731#comment-17762731 ] Danny Cranmer commented on FLINK-24943: --- [~martijnvisser] we will pick this up. Thanks for the ping.
[jira] [Commented] (FLINK-33018) GCP Pubsub PubSubConsumingTest.testStoppingConnectorWhenDeserializationSchemaIndicatesEndOfStream failed
[ https://issues.apache.org/jira/browse/FLINK-33018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762728#comment-17762728 ] Ryan Skraba commented on FLINK-33018: - Hey, I can reproduce this -- all you need to do is run this test about 12,000 times ;) (using IntelliJ repeat until fail). {code} Connected to the target VM, address: '127.0.0.1:44879', transport: 'socket' org.opentest4j.AssertionFailedError: expected: ["1", "2", "3"] but was: ["1", "2"] Expected :["1", "2", "3"] Actual :["1", "2"] {code} I'm taking a look.
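The "repeat until fail" reproduction technique above can be sketched outside the IDE as a plain loop that reruns the test body and stops at the first failure. This is an illustrative harness only (not IntelliJ's mechanism and not Flink code); the "flaky" check here is a hypothetical stand-in that deterministically fails on a fixed iteration.

```java
public class RepeatUntilFail {

    /**
     * Runs the check up to maxRuns times; returns the 1-based iteration of
     * the first failure, or -1 if every run passed.
     */
    public static int firstFailure(Runnable check, int maxRuns) {
        for (int i = 1; i <= maxRuns; i++) {
            try {
                check.run();
            } catch (AssertionError e) {
                return i;  // stop at the first failing run
            }
        }
        return -1;  // never failed within the budget
    }

    public static void main(String[] args) {
        // Deterministic stand-in for a flaky test: fails on iteration 42.
        final int[] count = {0};
        Runnable flaky = () -> {
            count[0]++;
            if (count[0] == 42) {
                throw new AssertionError("flaky failure");
            }
        };
        int iteration = firstFailure(flaky, 12_000);
        System.out.println("first failure at iteration " + iteration); // prints 42
        if (iteration != 42) {
            throw new AssertionError("unexpected iteration: " + iteration);
        }
    }
}
```

A real flaky test would replace the stand-in `Runnable` with the actual test body; the loop budget (12,000 here) mirrors the roughly 12K runs reported above.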
[jira] [Commented] (FLINK-31895) End-to-end integration tests for failure labels
[ https://issues.apache.org/jira/browse/FLINK-31895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762718#comment-17762718 ] Panagiotis Garefalakis commented on FLINK-31895: Fine by me [~wangm92] > End-to-end integration tests for failure labels > --- > > Key: FLINK-31895 > URL: https://issues.apache.org/jira/browse/FLINK-31895 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31895) End-to-end integration tests for failure labels
[ https://issues.apache.org/jira/browse/FLINK-31895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762715#comment-17762715 ] Matt Wang commented on FLINK-31895: --- [~pgaref] I would like to take this ticket if possible > End-to-end integration tests for failure labels > --- > > Key: FLINK-31895 > URL: https://issues.apache.org/jira/browse/FLINK-31895 > Project: Flink > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] wangyang0918 commented on a diff in pull request #23164: [FLINK- 32775]Add parent dir of files to classpath using yarn.provided.lib.dirs
wangyang0918 commented on code in PR #23164: URL: https://github.com/apache/flink/pull/23164#discussion_r1318475489 ## flink-yarn/src/main/java/org/apache/flink/yarn/YarnApplicationFileUploader.java: ## @@ -360,11 +362,27 @@ List registerProvidedLocalResources() { envShipResourceList.add(descriptor); if (!isFlinkDistJar(filePath.getName()) && !isPlugin(filePath)) { -classPaths.add(fileName); +URI parentDirectoryUri = new Path(fileName).getParent().toUri(); +String relativeParentDirectory = Review Comment: ``` -Dyarn.provided.lib.dirs="hdfs://myhdfs/tmp/flink-1.18-SNAPSHOT/lib;hdfs://myhdfs/tmp/flink-1.18-SNAPSHOT/plugins;hdfs://myhdfs/tmp/flink-1.18-SNAPSHOT/bin" ``` It seems that we already do the relativization in `getAllFilesInProvidedLibDirs`, so the input `providedLibDirs` is already a relative path (e.g. `bin/config.sh`, `bin/find-flink-home.sh`). In addition, adding a non-jar file to the classpath does not bring much benefit. I suggest only adding the parent directory to the classpath; the Yarn ship files have the same behavior. Then we could have simpler code here: ``` if (fileName.endsWith("jar")) { resourcesJar.add(fileName); } else { resourcesDir.add(new Path(fileName).getParent().toString()); } ```
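The partitioning suggested in the review (jar files go on the classpath individually, while non-jar files are represented only by their parent directory, deduplicated) can be sketched in plain Java. The class and method names here are hypothetical, and Flink's actual code uses Hadoop's `Path` rather than raw string handling.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ProvidedLibClasspath {

    /**
     * Sketch of the suggested logic: jars are added individually; for
     * non-jar files only the (deduplicated) parent directory is added.
     */
    public static List<String> classpathEntries(List<String> relativePaths) {
        Set<String> jars = new LinkedHashSet<>();
        Set<String> dirs = new LinkedHashSet<>();
        for (String fileName : relativePaths) {
            if (fileName.endsWith(".jar")) {
                jars.add(fileName);
            } else {
                int slash = fileName.lastIndexOf('/');
                dirs.add(slash < 0 ? "." : fileName.substring(0, slash));
            }
        }
        List<String> result = new ArrayList<>(jars);
        result.addAll(dirs);
        return result;
    }

    public static void main(String[] args) {
        List<String> entries = classpathEntries(List.of(
                "lib/flink-table.jar", "bin/config.sh", "bin/find-flink-home.sh"));
        // Both shell scripts collapse into a single "bin" directory entry.
        System.out.println(entries); // [lib/flink-table.jar, bin]
    }
}
```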
[jira] [Commented] (FLINK-33000) SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory
[ https://issues.apache.org/jira/browse/FLINK-33000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762707#comment-17762707 ] Matthias Pohl commented on FLINK-33000: --- Hi [~ade_hddu], thanks for offering your help. What are things that are still unclear to you? > SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using > a ThreadFactory > - > > Key: FLINK-33000 > URL: https://issues.apache.org/jira/browse/FLINK-33000 > Project: Flink > Issue Type: Bug > Components: Table SQL / Gateway, Tests >Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0 >Reporter: Matthias Pohl >Priority: Major > Labels: starter > > {{SqlGatewayServiceITCase}} uses a {{ExecutorThreadFactory}} for its > asynchronous operations. Instead, one should use {{TestExecutorExtension}} to > ensure proper cleanup of threads. > We might also want to remove the {{AbstractTestBase}} parent class because > that uses JUnit4 whereas {{SqlGatewayServiceITCase}} is already based on > JUnit5 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33060) Fix the javadoc of ListState.update/addAll about not allowing null value
Zakelly Lan created FLINK-33060: --- Summary: Fix the javadoc of ListState.update/addAll about not allowing null value Key: FLINK-33060 URL: https://issues.apache.org/jira/browse/FLINK-33060 Project: Flink Issue Type: Bug Reporter: Zakelly Lan After FLINK-8411, the ListState.update/add/addAll do not allow a null value passed in, while the javadoc says "If null is passed in, the state value will remain unchanged". This should be fixed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
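The corrected contract described in FLINK-33060 can be illustrated with a minimal stub, assuming the post-FLINK-8411 behavior: `update`/`add`/`addAll` reject null arguments (and null elements) with an exception rather than "leaving the state unchanged". This is an illustration only, not Flink's actual ListState implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stub mirroring the ListState method names for illustration. */
public class ListStateStub<T> {
    private final List<T> state = new ArrayList<>();

    /** Replaces the state; null is rejected, never a silent no-op. */
    public void update(List<T> values) {
        checkNotNull(values);
        state.clear();
        addAll(values);
    }

    /** Appends all values; null list or null elements are rejected. */
    public void addAll(List<T> values) {
        checkNotNull(values);
        for (T value : values) {
            add(value);
        }
    }

    public void add(T value) {
        state.add(checkNotNull(value));
    }

    public List<T> get() {
        return new ArrayList<>(state);
    }

    private static <X> X checkNotNull(X ref) {
        if (ref == null) {
            throw new NullPointerException("null is not allowed in ListState");
        }
        return ref;
    }

    public static void main(String[] args) {
        ListStateStub<String> state = new ListStateStub<>();
        state.update(List.of("a", "b"));
        System.out.println(state.get()); // [a, b]
        try {
            state.update(null); // contract: throws instead of keeping old state
        } catch (NullPointerException expected) {
            System.out.println("update(null) rejected");
        }
    }
}
```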
[jira] [Assigned] (FLINK-33060) Fix the javadoc of ListState.update/addAll about not allowing null value
[ https://issues.apache.org/jira/browse/FLINK-33060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zakelly Lan reassigned FLINK-33060: --- Assignee: Zakelly Lan
[jira] [Commented] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762700#comment-17762700 ] Yangze Guo commented on FLINK-33053: Thanks for the pointer [~mapohl]. I'll try to get a debug log and heap dump. > Watcher leak in Zookeeper HA mode > - > > Key: FLINK-33053 > URL: https://issues.apache.org/jira/browse/FLINK-33053 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.17.0, 1.17.1 >Reporter: Yangze Guo >Priority: Critical > > We observe a watcher leak in our OLAP stress test when enabling Zookeeper HA > mode. TM's watches on the leader of JobMaster has not been stopped after job > finished. > Here is how we re-produce this issue: > - Start a session cluster and enable Zookeeper HA mode. > - Continuously and concurrently submit short queries, e.g. WordCount to the > cluster. > - echo -n wchp | nc \{zk host} \{zk port} to get current watches. > We can see a lot of watches on > /flink/\{cluster_name}/leader/\{job_id}/connection_info. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #23371: [FLINK-5865][state] Unwrapping exceptions and throw the original ones in state interfaces
flinkbot commented on PR #23371: URL: https://github.com/apache/flink/pull/23371#issuecomment-1709932341 ## CI report: * 4198ccb50d2dc4f18ed2ce1b3d57b2b04e96de32 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] Zakelly opened a new pull request, #23371: [FLINK-5865][state] Unwrapping exceptions and throw the original ones in state interfaces
Zakelly opened a new pull request, #23371: URL: https://github.com/apache/flink/pull/23371 ## What is the purpose of the change Some implementations of the state interfaces wrap the original exception and throw it as a FlinkRuntimeException. Since all the interfaces declare that they throw Exception, it is unnecessary to wrap exceptions in FlinkRuntimeException, which is not intuitive to users reading the log. ## Brief change log - Go through the implementations of the state interfaces and unwrap the try...catch blocks on a best-effort basis. ## Verifying this change This change is a code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (**yes** / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
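The cleanup pattern described in the PR can be illustrated with a small, self-contained sketch (hypothetical names, not the actual Flink code): since the state interfaces already declare `throws Exception`, the original exception can simply propagate instead of being re-thrown inside a runtime-exception wrapper.

```java
public class UnwrapDemo {

    /** Stand-in for a wrapper like FlinkRuntimeException. */
    static class WrapperException extends RuntimeException {
        WrapperException(Throwable cause) {
            super(cause);
        }
    }

    /** Before: the original exception is hidden inside a wrapper. */
    public static void wrappedCall() {
        try {
            faultyOperation();
        } catch (Exception e) {
            throw new WrapperException(e);
        }
    }

    /** After: the interface already declares `throws Exception`, so the
     *  original exception propagates unchanged and shows up in the log. */
    public static void unwrappedCall() throws Exception {
        faultyOperation();
    }

    static void faultyOperation() throws Exception {
        throw new java.io.IOException("backend failure");
    }

    public static void main(String[] args) {
        try {
            wrappedCall();
        } catch (WrapperException e) {
            System.out.println("wrapped cause: " + e.getCause().getMessage());
        }
        try {
            unwrappedCall();
        } catch (Exception e) {
            System.out.println("unwrapped: " + e.getClass().getSimpleName());
        }
    }
}
```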
[GitHub] [flink] zentol commented on a diff in pull request #22468: FLINK-31889: Add documentation for implementing/loading enrichers
zentol commented on code in PR #22468: URL: https://github.com/apache/flink/pull/22468#discussion_r1318386858 ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. +Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then, create a jar which includes your `FailureEnricher`, `FailureEnricherFactory`, `META-INF/services/` and all the external dependencies. 
+Make a directory in `plugins/` of your Flink distribution with an arbitrary name, e.g. "failure-enrichment", and put the jar into this directory. +See [Flink Plugin]({% link deployment/filesystems/plugins.md %}) for more details. + +{{< hint warning >}} +Note that every FailureEnricher should have defined a set of {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="output keys" >}} that may be associated with values. This set of keys has to be unique across enrichers otherwise may be ignored. Review Comment: > otherwise may be ignored Otherwise _what_ may be ignored? ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). Review Comment: ```suggestion Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string key-value pairs). ``` ## docs/content/docs/deployment/advanced/failure_enrichers.md: ## @@ -0,0 +1,112 @@ +--- +title: "Failure Enrichers" +nav-title: failure-enrichers +nav-parent_id: advanced +nav-pos: 3 +--- + + +## Custom failure enrichers +Flink provides a pluggable interface for users to register their custom logic and enrich failures with extra metadata labels (string kv pairs). +The goal is to enable developers implement their own failure enrichment plugins to categorize job failures, expose custom metrics, make calls to external notification systems, and more. + +FailureEnrichers are triggered every time an exception is reported at runtime by the JobManager. 
+Every FailureEnricher may asynchronously return labels associated with the failure that are then exposed via the JobManager's Rest interface (e.g., a 'type:System' label implying the failure is categorized as a system error). + + +### Implement a plugin for your custom enricher + +To implement a custom FailureEnricher plugin, you need to: + +- Add your own FailureEnricher by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricher.java" name="FailureEnricher" >}} interface. + +- Add your own FailureEnricherFactory by implementing the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/core/failure/FailureEnricherFactory.java" name="FailureEnricherFactory" >}} interface. + +- Add a service entry. Create a file `META-INF/services/org.apache.flink.core.failure.FailureEnricherFactory` which contains the class name of your failure enricher factory class (see [Java Service Loader](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/ServiceLoader.html) docs for more details). + + +Then,
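The plugin described in the documentation under review can be sketched with a simplified, self-contained interface. Note this is an illustration only: the real FailureEnricher interface additionally takes a Context argument, and real plugins are discovered through a FailureEnricherFactory registered via a `META-INF/services` entry, both of which are omitted here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class TypeEnricherDemo {

    /** Simplified stand-in for Flink's FailureEnricher interface. */
    public interface Enricher {
        /** The fixed set of label keys this enricher may emit. */
        Set<String> getOutputKeys();

        /** Asynchronously maps a failure to label key-value pairs. */
        CompletableFuture<Map<String, String>> processFailure(Throwable cause);
    }

    /** Labels a failure 'type:System' or 'type:User' by exception type. */
    public static class TypeEnricher implements Enricher {
        @Override
        public Set<String> getOutputKeys() {
            return Set.of("type");
        }

        @Override
        public CompletableFuture<Map<String, String>> processFailure(Throwable cause) {
            String type = (cause instanceof OutOfMemoryError) ? "System" : "User";
            return CompletableFuture.completedFuture(Map.of("type", type));
        }
    }

    public static void main(String[] args) {
        Enricher enricher = new TypeEnricher();
        Map<String, String> labels =
                enricher.processFailure(new OutOfMemoryError()).join();
        System.out.println(labels); // {type=System}
    }
}
```

In the real setup, the returned labels would be exposed through the JobManager's REST interface, and the output-key set must be unique across enrichers as the docs above describe.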
[jira] [Commented] (FLINK-30314) Unable to read all records from compressed delimited format file
[ https://issues.apache.org/jira/browse/FLINK-30314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762689#comment-17762689 ] Etienne Chauchot commented on FLINK-30314: -- Ticket will be fixed with [this|https://issues.apache.org/jira/browse/FLINK-33059] feature > Unable to read all records from compressed delimited format file > > > Key: FLINK-30314 > URL: https://issues.apache.org/jira/browse/FLINK-30314 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.16.0, 1.15.2, 1.17.1 >Reporter: Dmitry Yaraev >Assignee: Etienne Chauchot >Priority: Major > Attachments: input.json, input.json.gz, input.json.zip > > > I am reading gzipped JSON line-delimited files in the batch mode using > [FileSystem > Connector|https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/filesystem/]. > For reading the files a new table is created with the following > configuration: > {code:sql} > CREATE TEMPORARY TABLE `my_database`.`my_table` ( > `my_field1` BIGINT, > `my_field2` INT, > `my_field3` VARCHAR(2147483647) > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'path-to-input-dir', > 'format' = 'json', > 'json.ignore-parse-errors' = 'false', > 'json.fail-on-missing-field' = 'true' > ) {code} > In the input directory I have two files: input-0.json.gz and > input-1.json.gz. As it comes from the filenames, the files are compressed > with GZIP. Each of the files contains 10 records. The issue is that only 2 > records from each file are read (4 in total). If decompressed versions of the > same data files are used, all 20 records are read. > As far as I understand, that problem may be related to the fact that split > length, which is used when the files are read, is in fact the length of a > compressed file. So files are closed before all records are read from them > because read position of the decompressed file stream exceeds split length. 
> Probably, it makes sense to add a flag to {{{}FSDataInputStream{}}}, so we > could identify whether the file is compressed or not. The flag can be set to true in > {{InputStreamFSInputWrapper}} because it is used for wrapping compressed file > streams. With such a flag it could be possible to differentiate > non-splittable compressed files and only rely on the end of the stream.
[jira] [Created] (FLINK-33059) Support transparent compression for file-connector for all file input formats
Etienne Chauchot created FLINK-33059: Summary: Support transparent compression for file-connector for all file input formats Key: FLINK-33059 URL: https://issues.apache.org/jira/browse/FLINK-33059 Project: Flink Issue Type: Technical Debt Components: Connectors / FileSystem Reporter: Etienne Chauchot Assignee: Etienne Chauchot Delimited file input formats (contrary to binary input formats, etc.) do not support compression via the existing decorator, because the split length is determined by the compressed file length, leading to [this|https://issues.apache.org/jira/browse/FLINK-30314] bug. We should force reading the whole file split (as is done for binary input formats) on compressed files. Parallelism is still done at the file level (as now).
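The failure mode behind FLINK-30314 can be demonstrated with plain `java.util.zip`: if a reader treats the compressed file length as the split length and stops once that many decompressed bytes have been consumed, it drops trailing records, while reading the stream to its end recovers all of them. This is an illustrative sketch, not Flink's connector code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedSplitDemo {

    /**
     * Counts newline-delimited records, stopping once {@code limit}
     * decompressed bytes have been consumed (limit = -1 reads to the end).
     */
    public static int countRecords(byte[] gzipped, long limit) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            int records = 0;
            long consumed = 0;
            int b;
            while ((b = in.read()) != -1) {
                consumed++;
                if (b == '\n') {
                    records++;
                }
                if (limit >= 0 && consumed >= limit) {
                    break; // buggy behavior: stop at the compressed "split length"
                }
            }
            return records;
        }
    }

    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            sb.append("{\"my_field1\": ").append(i)
              .append(", \"my_field3\": \"same text in every record\"}\n");
        }
        byte[] gzipped = gzip(sb.toString());
        // The compressed size is smaller than the decompressed data, so a
        // reader bounded by it loses the records at the end of the file.
        int truncated = countRecords(gzipped, gzipped.length);
        int all = countRecords(gzipped, -1);
        System.out.println(truncated + " of " + all + " records read");
        if (truncated >= all) {
            throw new AssertionError("expected truncation");
        }
    }
}
```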