Re: How to enable RocksDB native metrics?

2024-04-07 Thread Zakelly Lan
Hi Lei, You can enable it by some configurations listed in: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#rocksdb-native-metrics (RocksDB Native Metrics) Best, Zakelly On Sun, Apr 7, 2024 at 4:59 PM Zakelly Lan wrote: > Hi Lei, > > You c

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Zakelly Lan
Congratulations! Best, Zakelly On Thu, Mar 21, 2024 at 12:05 PM weijie guo wrote: > Congratulations! Well done. > > > Best regards, > > Weijie > > > Feng Jin 于2024年3月21日周四 11:40写道: > >> Congratulations! >> >> >> Best, >> Feng >> >> >> On Thu, Mar 21, 2024 at 11:37 AM Ron liu wrote: >> >> >

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Zakelly Lan
Congratulations! Best, Zakelly On Thu, Mar 21, 2024 at 12:05 PM weijie guo wrote: > Congratulations! Well done. > > > Best regards, > > Weijie > > > Feng Jin 于2024年3月21日周四 11:40写道: > >> Congratulations! >> >> >> Best, >> Feng >> >> >> On Thu, Mar 21, 2024 at 11:37 AM Ron liu wrote: >> >> >

Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Zakelly Lan
Congratulations! Thanks Lincoln, Yun, Martijn and Jing for driving this release. Thanks everyone involved. Best, Zakelly On Mon, Mar 18, 2024 at 5:05 PM weijie guo wrote: > Congratulations! > > Thanks release managers and all the contributors involved. > > Best regards, > > Weijie > > >

Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Zakelly Lan
Congratulations! Thanks Lincoln, Yun, Martijn and Jing for driving this release. Thanks everyone involved. Best, Zakelly On Mon, Mar 18, 2024 at 5:05 PM weijie guo wrote: > Congratulations! > > Thanks release managers and all the contributors involved. > > Best regards, > > Weijie > > >

Re: Unaligned checkpoint blocked by long Async operation

2024-03-17 Thread Zakelly Lan
> > Gyula > > On Fri, Mar 15, 2024 at 3:49 AM Zakelly Lan wrote: > >> Hi Gyula, >> >> Processing checkpoint halfway through `processElement` is problematic. >> The current element will not be included in the input in-flight data, and >> we cannot assum

Re: Unaligned checkpoint blocked by long Async operation

2024-03-14 Thread Zakelly Lan
> On Thu, Mar 14, 2024 at 1:22 PM Zakelly Lan wrote: > >> Hi Gyula, >> >> Well I tried your example in local mini-cluster, and it seems the source >> can take checkpoints but it will block in the following AsyncWaitOperator. >> IIUC, the unaligned checkpoint barr

Re: Unaligned checkpoint blocked by long Async operation

2024-03-14 Thread Zakelly Lan
Hi Gyula, Well I tried your example in local mini-cluster, and it seems the source can take checkpoints but it will block in the following AsyncWaitOperator. IIUC, the unaligned checkpoint barrier should wait until the current `processElement` finishes its execution. In your example, the element

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zakelly Lan
Hi Gabriele, Quick answer: You can use the built-in window operators which have been integrated with state backends including RocksDB. Thanks, Zakelly On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen wrote: > Hi Gabriele, > > I'd recommend extending the existing window function whenever

Re: Preparing keyed state before snapshot

2024-02-20 Thread Zakelly Lan
pecifically for > AbstractTopNFuction in StreamExecRank. > How can I do something similar without modifying the Flink runtime? > > Lorenzo > > > On Sun, 18 Feb 2024 at 03:42, Zakelly Lan wrote: > >> Hi Lorenzo, >> >> It is not recommended to do this with

Re: Impact of RocksDB backend on the Java heap

2024-02-18 Thread Zakelly Lan
of each key is "large"? Again assuming the > number of distinct partition keys is large. > > Regards, > Alexis. > > On Sun, 18 Feb 2024, 05:02 Zakelly Lan, wrote: > >> Hi Alexis, >> >> Flink does need some heap memory to bridge requests to roc

Re: Impact of RocksDB backend on the Java heap

2024-02-17 Thread Zakelly Lan
Hi Alexis, Flink does need some heap memory to bridge requests to rocksdb and gather the results. In most cases, the memory is discarded immediately (eventually collected by GC). In case of timers, flink do cache a limited subset of key-values in heap to improve performance. In general you don't

Re: Preparing keyed state before snapshot

2024-02-17 Thread Zakelly Lan
Hi Lorenzo, It is not recommended to do this with the keyed state. However there is an example in flink code (FastTop1Function#snapshotState) [1] of setting keys when snapshotState(). Hope this helps. [1]

Re: Redis as a State Backend

2024-01-30 Thread Zakelly Lan
And I found some previous discussion, FYI: 1. https://issues.apache.org/jira/browse/FLINK-3035 2. https://www.mail-archive.com/dev@flink.apache.org/msg10666.html Hope this helps. Best, Zakelly On Tue, Jan 30, 2024 at 4:08 PM Zakelly Lan wrote: > Hi Chirag > > That's an interesting i

Re: Redis as a State Backend

2024-01-30 Thread Zakelly Lan
Hi Chirag That's an interesting idea. IIUC, storing key-values can be simply implemented for Redis, but supporting checkpoint and recovery is relatively challenging. Flink's checkpoint should be consistent among all stateful operators at the same time. For an *embedded* and *file-based* key value

Re: Why calling ListBucket for each file in a checkpoint

2024-01-21 Thread Zakelly Lan
Are you accessing the s3 API with presto implementation? If so, you may read the code of `com.facebook.presto.hive.s3.PrestoS3FileSystem#create` and find it check the existence of the target path first, in which the `getFileStatus` and `listPrefix` are called. There is no option for this. Best,

Re: java.lang.UnsupportedOperationException: A serializer has already been registered for the state; re-registration is not allowed.

2024-01-18 Thread Zakelly Lan
by the company that I co-operate with. >> But, yes you are right that inside the code, I can see that the state >> initialization happens inside the AbstractProcessFunction#processElement >> method. >> >> Thank you very much, >> Kostas >> >> On Thu

Re: Re:Re: RocksDB增量模式checkpoint大小持续增长的问题

2024-01-17 Thread Zakelly Lan
ize不断增大,一天能增加个600~800M,持续不断的增大。以下图为例:ID为313的cp比ID为304的大了将近10M,一直运行,会一直这么增加下去。cp文件和rocksdb文件正在看~ > > 在 2024-01-18 10:56:51,"Zakelly Lan" 写道: > > >你好,能提供一些详细的信息吗,比如:是datastream作业吧?是否设置了State > >TTL?观测到逐渐变大是通过checkpoint监控吗,总量是什么级别。cp文件或者本地rocksdb目录下哪些文件最大 > > >

Re: java.lang.UnsupportedOperationException: A serializer has already been registered for the state; re-registration is not allowed.

2024-01-17 Thread Zakelly Lan
Hi, Could you please share the code of state initialization (getting state from a state descriptor)? It seems you are creating a state in #processElement? Best, Zakelly On Thu, Jan 18, 2024 at 2:25 PM Zakelly Lan wrote: > Hi, > > Could you please share the code of state initi

Re: RocksDB增量模式checkpoint大小持续增长的问题

2024-01-17 Thread Zakelly Lan
你好,能提供一些详细的信息吗,比如:是datastream作业吧?是否设置了State TTL?观测到逐渐变大是通过checkpoint监控吗,总量是什么级别。cp文件或者本地rocksdb目录下哪些文件最大 On Wed, Jan 17, 2024 at 4:09 PM fufu wrote: > >

Re: flink-checkpoint 问题

2024-01-11 Thread Zakelly Lan
> > > > > 任务人为从25548恢复时失败,抛出异常找不到_metadate文件 > > > | | > 吴先生 > | > | > 15951914...@163.com > | > 回复的原邮件 ---- > | 发件人 | Xuyang | > | 发送日期 | 2024年1月11日 14:55 | > | 收件人 | | > | 主题 | Re:回复: flink-checkpoint 问题 | > Hi, 你的图挂了,可以用图床处理一下,或者直接贴log。

Re: flink-checkpoint 问题

2024-01-10 Thread Zakelly Lan
你好, 方便的话贴一下jobmanager的log吧,应该有一些线索 On Wed, Jan 10, 2024 at 5:55 PM 吴先生 <15951914...@163.com> wrote: > Flink版本: 1.12 > checkpoint配置:hdfs > > 现象:作业由于一些因素第N个checkpoint失败,导致任务重试,任务重试失败,hdfs中不存在第N个chk路径,但是为什么会出现一个第N+1的chk路径,且这个路径下是空的 > >

Re: keyby mapState use question

2023-12-10 Thread Zakelly Lan
Hi, This should not happen. I guess the `onTimer` and `processElement` you are testing are triggered under different keyby keys. Note that the keyed states are partitioned by the keyby key first, so if querying or setting the state, you are only manipulating the specific partition which does not

Re: Unsubscribe from user list.

2023-10-18 Thread Zakelly Lan
Hi Lijuan, Please send email to user-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user@flink.apache.org. Best, Zakelly On Thu, Oct 19, 2023 at 6:23 AM Hou, Lijuan via user wrote: > > Hi team, > > > > Could you please remove this email from the subscription list? I

Re: Problems with the state.backend.fs.memory-threshold parameter

2023-10-13 Thread Zakelly Lan
Hi rui, The 'state.backend.fs.memory-threshold' configures the threshold below which state is stored as part of the metadata, rather than in separate files. So as a result the JM will use its memory to merge small checkpoint files and write them into one file. Currently the FLIP-306[1][2] is

Re: Problems with the state.backend.fs.memory-threshold parameter

2023-10-13 Thread Zakelly Lan
Hi rui, The 'state.backend.fs.memory-threshold' configures the threshold below which state is stored as part of the metadata, rather than in separate files. So as a result the JM will use its memory to merge small checkpoint files and write them into one file. Currently the FLIP-306[1][2] is

Re: updating keyed state in open method.

2023-09-07 Thread Zakelly Lan
Hi, You cannot access the keyed state within #open(). It can only be accessed under a keyed context ( a key is selected while processing an element, e.g. #processElement). Best, Zakelly On Thu, Sep 7, 2023 at 4:55 PM Krzysztof Chmielewski wrote: > > Hi, > I'm having a problem with my toy flink

Re: Broadcast state and job restarts

2022-10-28 Thread Zakelly Lan
Hi Alexis, Broadcast state is one type of the Operator State, which is included in savepoints and checkpoints and won't be lost. Please refer to https://stackoverflow.com/questions/62509773/flink-broadcast-state-rocksdb-state-backend/62510423#62510423 Best, Zakelly On Fri, Oct 28, 2022 at 4:41

Re: Limiting backpressure during checkpoints

2022-10-25 Thread Zakelly Lan
Hi Robin, You said that during the checkpoint async phase the CPU is stable at 100%, which is pretty strange to me. Normally the cpu usage of the taskmanager process could exceed 100%, depending on what all the threads are doing. I'm wondering if there is any scheduling mechanism controlling the

Re: Flink RocksDB Performance

2021-07-16 Thread Zakelly Lan
Hi Li Jim, Filesystem performs much better than rocksdb (by multiple times), but it is only suitable for small states. Rocksdb will consume more CPU on background tasks, cache management, serialization/deserialization and compression/decompression. In most cases, performance of the Rocksdb will