Re: Help with monitoring metrics of StateFun runtime with prometheus

2024-05-28 Thread Oliver Schmied
ds, Oliver     Sent: Monday, 27 May 2024 at 04:21 From: "Biao Geng" To: "Oliver Schmied" Cc: user@flink.apache.org Subject: Re: Help with monitoring metrics of StateFun runtime with prometheus Hi Oliver,   I am not experienced in StateFun but its doc says 'Along w

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-28 Thread Alexis Sarda-Espinosa
hen it’ll always > show a diff. > > > > *From: *Márton Balassi > *Date: *Thursday, May 9, 2024 at 3:50 PM > *To: *Prasad, Neil > *Cc: *user@flink.apache.org > *Subject: *Re: Flink Kubernetes Operator 1.8.0 CRDs > > Hi, What do you mean exactly by cannot be applied or

Re: Slack Community Invite?

2024-05-27 Thread Alexandre Lemaire
Thank you! > On May 27, 2024, at 9:35 PM, Junrui Lee wrote: > > Hi Alexandre, > > You can try this link: > https://join.slack.com/t/apache-flink/shared_invite/zt-2jn2dlgoi-P4oQBWRJT4I_3HY8ZbLxdg > > Best, > Junrui > > Alexandre Lemaire mailto:alema...@circlical.com>> > wrote on Tue, May 28, 2024

Re: Slack Community Invite?

2024-05-27 Thread Junrui Lee
Hi Alexandre, You can try this link: https://join.slack.com/t/apache-flink/shared_invite/zt-2jn2dlgoi-P4oQBWRJT4I_3HY8ZbLxdg Best, Junrui Alexandre Lemaire wrote on Tue, May 28, 2024 at 01:18: > Hello! > > Does the Slack community still exist? The link on the site is expired. > > Thank you! > Alex > > >

Re: Pulsar connector resets existing subscription

2024-05-27 Thread Igor Basov
Ok, I believe the breaking changes were introduced in this commit. Here

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread Hang Ruan
4 May 2024 at 23:06 > *To: *Giannis Polyzos > *Cc: *Carlos Sanabria Miranda , Oscar > Perez via user , Péter Váry < > peter.vary.apa...@gmail.com>, mbala...@apache.org > *Subject: *Re: "Self-service ingestion pipelines with evolving schema via > Flink and Iceberg" prese

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread gongzhongqiang
Flink CDC 3.0 focuses on data integration scenarios, so you don't need to pay attention to the framework implementation; you just need to use the YAML format to describe the data source and target to quickly build a data synchronization task with schema evolution. And it supports rich source and

Re: Help with monitoring metrics of StateFun runtime with prometheus

2024-05-26 Thread Biao Geng
Hi Oliver, I am not experienced in StateFun but its doc says 'Along with the standard metric scopes, Stateful Functions supports Function Scope, which is one level below operator scope.' So, as long as you can collect flink's metrics via Prometheus, ideally, there should be no difference between using

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread Xiqian YU
-docs-stable/ [2] https://issues.apache.org/jira/browse/FLINK-34840 Regards, Xiqian From: Andrew Otto Date: Friday, 24 May 2024 at 23:06 To: Giannis Polyzos Cc: Carlos Sanabria Miranda , Oscar Perez via user , Péter Váry , mbala...@apache.org Subject: Re: "Self-service ingestion pipe

Re: How to start a flink job on a long running yarn cluster from a checkpoint (with arguments)

2024-05-25 Thread Junrui Lee
Hi Sachin, Yes, that's correct. To resume from a savepoint, use the command bin/flink run -s . You can find more details in the Flink documentation [1]. Additionally, information on how to trigger a savepoint can be found in the section for triggering savepoints [2]. [1]
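The restore command above, expanded into a hedged sketch; the job id, savepoint directory, and jar name are placeholders, not values from the original thread:

```shell
# Trigger a savepoint for a running job (job id and target path are placeholders)
bin/flink savepoint a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4 hdfs:///flink/savepoints

# Resume a job from that savepoint; -s takes the savepoint path
bin/flink run -s hdfs:///flink/savepoints/savepoint-a1b2c3-0123456789ab my-job.jar
```

On a long-running YARN session the same -s flag applies; only the submission target and application id options differ, so check the deployment section of the docs for the exact flags of your Flink version.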

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-25 Thread Péter Váry
> Could the table/database sync with schema evolution (without Flink job restarts!) potentially work with the Iceberg sink? Making this work would be a good addition to the Iceberg-Flink connector. It is definitely doable, but not a single-PR-sized task. If you want to try your hand at it, I

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
> What is not is the automatic syncing of entire databases, with schema evolution and detection of new (and dropped?) tables. :) Wait. Is it? > Flink CDC supports synchronizing all tables of source database instance to downstream in one job by configuring the captured database list and table

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
Indeed, using Flink-CDC to write to Flink Sink Tables, including Iceberg, is supported. What is not is the automatic syncing of entire databases, with schema evolution and detection of new (and dropped?) tables. :) On Fri, May 24, 2024 at 8:58 AM Giannis Polyzos wrote: >

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Giannis Polyzos
https://nightlies.apache.org/flink/flink-cdc-docs-stable/ All these features come from Flink CDC itself. Because Paimon and Flink CDC are projects native to Flink, there is a strong integration between them. (I believe it's on the roadmap to support Iceberg as well) On Fri, 24 May 2024 at 3:52 PM,

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
> I’m curious if there is any reason for choosing Iceberg instead of Paimon No technical reason that I'm aware of. We are using it mostly because of momentum. We looked at Flink Table Store (before it was Paimon), but decided it was too early and the docs were too sparse at the time to really

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Giannis Polyzos
I’m curious if there is any reason for choosing Iceberg instead of Paimon (other than Iceberg being more popular). Especially for a use case like CDC that Iceberg struggles to support. On Fri, 24 May 2024 at 3:22 PM, Andrew Otto wrote: > Interesting thank you! > > I asked this in the Paimon

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
Interesting thank you! I asked this in the Paimon users group: How coupled to Paimon catalogs and tables is the cdc part of Paimon? RichCdcMultiplexRecord

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
If I understand correctly, Paimon is sending `CdcRecord`-s [1] on the wire which contain not only the data, but the schema as well. With Iceberg we currently only send the row data, and expect to receive the schema on job start - this is more performant than sending the schema all the time, but

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Ah I see, so just auto-restarting to pick up new stuff. I'd love to understand how Paimon does this. They have a database sync action which will sync entire databases, handle schema evolution, and I'm

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
I will ask Marton about the slides. The solution was something like this in a nutshell: - Make sure that on job start the latest Iceberg schema is read from the Iceberg table - Throw a SuppressRestartsException when data arrives with the wrong schema - Use Flink Kubernetes Operator to restart

Re: issues with Flink kinesis connector

2024-05-23 Thread Nick Hecht
Thank you for your help! On Thu, May 23, 2024 at 1:40 PM Aleksandr Pilipenko wrote: > Hi Nick, > > You need to use another method to add the sink to your job - sinkTo. > KinesisStreamsSink implements the newer Sink interface, while addSink expects > the old SinkFunction. You can see this by looking at

Re: issues with Flink kinesis connector

2024-05-23 Thread Aleksandr Pilipenko
Hi Nick, You need to use another method to add the sink to your job: sinkTo. KinesisStreamsSink implements the newer Sink interface, while addSink expects the old SinkFunction. You can see this by looking at the method signatures[1] and at the usage examples in the documentation[2] [1]
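A minimal sketch of the fix described above, assuming the AWS connector's `KinesisStreamsSink` builder API as documented; the stream name, partition key logic, and client properties are placeholders, so verify the builder methods against your connector version:

```java
// addSink(...) takes the legacy SinkFunction interface; KinesisStreamsSink
// implements the newer Sink interface, so it must be attached with sinkTo(...).
KinesisStreamsSink<String> sink =
        KinesisStreamsSink.<String>builder()
                .setStreamName("my-output-stream")                  // placeholder
                .setSerializationSchema(new SimpleStringSchema())
                .setPartitionKeyGenerator(e -> String.valueOf(e.hashCode()))
                .setKinesisClientProperties(clientProperties)       // region, credentials
                .build();

stream.sinkTo(sink);   // stream.addSink(sink) would not compile here
```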

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread Zhanghao Chen
, Zhanghao Chen From: John Smith Sent: Thursday, May 23, 2024 22:40 To: Zhanghao Chen Cc: Biao Geng ; user Subject: Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory? Based on these two settings... taskmanager.memory.flink.size: 16384m

Re: Task Manager memory usage

2024-05-23 Thread Zhanghao Chen
Hi Sigalit, For states stored in memory, they would most probably stay alive for several rounds of GC and end up in the old gen of the heap, and won't get recycled until a Full GC. As for the TM pod memory usage, most probably it will stop increasing at some point. You could try setting a

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Wow, I would LOVE to see this talk. If there is no recording, perhaps there are slides somewhere? On Thu, May 23, 2024 at 11:00 AM Carlos Sanabria Miranda < sanabria.miranda.car...@gmail.com> wrote: > Hi everyone! > > I have found in the Flink Forward website the following presentation: >

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread John Smith
ult >> setting, regardless of the version of JVM. >> >> Best, >> Zhanghao Chen >> -- >> *From:* John Smith >> *Sent:* Wednesday, May 22, 2024 22:56 >> *To:* Biao Geng >> *Cc:* user >> *Subject:* Re: Wo

Re: Task Manager memory usage

2024-05-23 Thread Sigalit Eliazov
Hi, thanks for your reply. We are storing the data in memory since it is short-term; we thought that adding RocksDB would add overhead. On Thu, May 23, 2024 at 4:38 PM Sachin Mittal wrote: > Hi > Where are you storing the state. > Try rocksdb. > > Thanks > Sachin > > > On Thu, 23 May 2024 at

Re: Task Manager memory usage

2024-05-23 Thread Sachin Mittal
Hi, where are you storing the state? Try RocksDB. Thanks Sachin On Thu, 23 May 2024 at 6:19 PM, Sigalit Eliazov wrote: > Hi, > > I am trying to understand the following behavior in our Flink application > cluster. Any assistance would be appreciated. > > We are running a Flink application
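A sketch of the RocksDB suggestion as flink-conf.yaml entries; the key names follow recent Flink docs as I recall them (older releases use `state.backend` instead of `state.backend.type`), and the checkpoint location is a placeholder:

```yaml
# Keep state on local disk via RocksDB instead of the JVM heap
state.backend.type: rocksdb
# Upload only changed SST files at each checkpoint
state.backend.incremental: true
# Durable checkpoint storage (placeholder path)
state.checkpoints.dir: s3://my-bucket/flink/checkpoints
```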

Re: Question about the MongoDB splitVector permission

2024-05-23 Thread Jiabao Sun
Hi, splitVector is the internal MongoDB command for computing splits; in a replica-set deployment this command can also be used to compute chunk ranges. Without the splitVector permission, it automatically falls back to the sample split strategy. Best, Jiabao evio12...@gmail.com wrote on Thu, May 23, 2024: > > hello~ > > > I am using the flink-cdc mongodb connector 2.3.0 >

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-23 Thread Vararu, Vadim
That’s great news. Thanks. From: Leonard Xu Date: Thursday, 23 May 2024 at 04:42 To: Vararu, Vadim Cc: user , Danny Cranmer Subject: Re: Flink kinesis connector 4.3.0 release estimated date Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-22 Thread Leonard Xu
Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize the release work this week if everything goes well. Best, Leonard > On May 22, 2024 at 11:51 PM, Vararu, Vadim wrote: > > Hi guys, > > Any idea when the 4.3.0 kinesis connector is estimated to be released?

Re: Get access to unmatching events in Apache Flink Cep

2024-05-22 Thread Anton Sidorov
In his answer Biao said "currently there is no such API to access the middle NFA state". Maybe that API exists in the plans? Or could I create an issue or pull request to add such an API? On Fri, May 17, 2024 at 12:04, Anton Sidorov: > Ok, thanks for the reply. > > On Fri, May 17, 2024 at 09:22, Biao Geng: > >> Hi

Re: Question about the iterate operation in the Flink 1.19 documentation

2024-05-20 Thread Xuyang
Hi, the Iterate API was deprecated in 1.19 and is no longer supported; see [1][2] for details. The FLIP [1] provides an alternative approach [3]. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-357%3A+Deprecate+Iteration+API+of+DataStream [2] https://issues.apache.org/jira/browse/FLINK-33144 [3]

Re: Flink autoscaler with AWS ASG: checkpoint access issue

2024-05-20 Thread Chetas Joshi
Hello, After digging into the 403 issue a bit, I figured out that after the scale-up event, the flink-s3-fs-presto uses the node-profile instead of IRSA (Iam Role for Service Account) on some of the newly created TM pods. 1. Anyone else experienced this as well? 2. Verified that this is an issue

Re: flinksql: group by fields missing after optimization

2024-05-20 Thread Lincoln Lee
This problem still exists in Flink. > > > > > -- Original Email -- > From: > "user-zh" > < > libenc...@apache.org; > Sent: Monday, May 20, 2024 at 12:51 PM > To: "user

Re: What is the best way to aggregate data over a long window

2024-05-20 Thread gongzhongqiang
Hi Sachin, `performing incremental aggregation using stateful processing` is the same as `windows with agg`, but the former is more flexible. If the Flink window cannot satisfy your performance needs, and your business logic has features that can be customized for optimization, you can choose the

Re: Email submission

2024-05-20 Thread Hang Ruan
Hi, Michas. Please subscribe to the mailing list by sending an email to user-subscr...@flink.apache.org . Best, Hang Michas Szacillo (BLOOMBERG/ 919 3RD A) wrote on Sun, May 19, 2024 at 04:34: > Sending my email to join the apache user mailing list. > > Email: mszaci...@bloomberg.net >

Re: Restore from checkpoint

2024-05-20 Thread archzi lu
looks like the way it is done > >>> is something like below: > >>> docker-compose exec jobmanager flink run -s > >>> :/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc -py > >>> /opt/app/flink_job.py > >>> But I am getting error: > >>> Caused by: java.io.IOException: Checkpoint/savepoint path > >>> ':/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' is not a valid > >>> file URI. Either the pointer path is invalid, or the checkpoint was > >>> created by a different state backend. > >>> What is wrong with the way the job is re-submitted to the cluster? > >>> Kind regards > >>> Phil > >

Re: Restore from checkpoint

2024-05-20 Thread Jiadong Lu
_job.py But I am getting error: Caused by: java.io.IOException: Checkpoint/savepoint path ':/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' is not a valid file URI. Either the pointer path is invalid, or the checkpoint was created by a different state backend. What is wrong with the way the job
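The error in the quoted command comes from a missing URI scheme: ':/opt/flink/checkpoints/…' starts with ':', so there is no scheme and Flink cannot resolve a filesystem for the path. A small plain-Java sketch (no Flink dependencies; the checkpoint directory suffix is a placeholder) showing the parse failure and what a valid file URI looks like:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CheckpointUriCheck {
    public static void main(String[] args) {
        // The shape of the failing path from the thread (placeholder suffix)
        String broken = ":/opt/flink/checkpoints/chk-42";
        // Prefixing the scheme makes it a proper file URI
        String fixed = "file:///opt/flink/checkpoints/chk-42";

        try {
            new URI(broken);
            System.out.println("unexpectedly parsed");
        } catch (URISyntaxException e) {
            // An empty scheme before ':' is rejected by the URI grammar
            System.out.println("broken: " + e.getReason());
        }

        URI ok = URI.create(fixed);
        System.out.println("scheme=" + ok.getScheme() + " path=" + ok.getPath());
    }
}
```

The same reasoning applies to distributed filesystems: an explicit scheme such as hdfs:// or s3:// tells Flink which filesystem plugin to load the checkpoint with.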

Re: flinksql: group by fields missing after optimization

2024-05-19 Thread Benchao Li
> Sent: Monday, May 20, 2024 at 10:32 AM > To: "user-zh" > Subject: Re: flinksql: group by fields missing after optimization > > > > It looks like "dt = cast(CURRENT_DATE as string)" led dt to be inferred as a constant, which was then optimized away. > > Folding CURRENT_DATE into a constant should only happen in batch mode; is this SQL running in batch mode? > > ℡

Re: Restore from checkpoint

2024-05-19 Thread Jinzhong Li
5588e19b1d8709ee62be1cdcc' is not a valid file > URI. Either the pointer path is invalid, or the checkpoint was created by a > different state backend. > >> What is wrong with the way the job is re-submitted to the cluster? > >> Kind regards > >> Phil > >

Re: flinksql: group by fields missing after optimization

2024-05-19 Thread Benchao Li
It looks like "dt = cast(CURRENT_DATE as string)" led dt to be inferred as a constant, which was then optimized away. Folding CURRENT_DATE into a constant should only happen in batch mode; is this SQL running in batch mode? ℡小新的蜡笔不见嘞、 <1515827...@qq.com.invalid> wrote on Sun, May 19, 2024: > > create view tmp_view as > SELECT > dt, -- 2 > uid, -- 0 > uname, -- 1 > uage

Re: Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-19 Thread Jingsong Li
> > > > > > > > > > Original Email > > > > > > > > Sender:"gongzhongqiang"< gongzhongqi...@apache.org ; > > > > Sent Time:2024/5/17 23:10 > > > > To:"Qingsheng Ren"< re...@apache.org ; > > &

Re: Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-19 Thread Jingsong Li
Amazing, congrats! Best, Jingsong On Sat, May 18, 2024 at 3:10 PM 大卫415 <2446566...@qq.com.invalid> wrote: > > Unsubscribe > > > > > > > > Original Email > > > > Sender:"gongzhongqiang"< gongzhongqi...@apache.org ; > > Sent Time:2024/5/1

Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Oliver Schmied
;Biao Geng" To: "Oliver Schmied" Cc: user@flink.apache.org Subject: Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes Hi Oliver,   I believe you are almost there. One thing I found that could be improved is that in your job ya

Re: Restore from checkpoint

2024-05-19 Thread Phil Stavridis
/1875588e19b1d8709ee62be1cdcc -py >> /opt/app/flink_job.py >> But I am getting error: >> Caused by: java.io.IOException: Checkpoint/savepoint path >> ':/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' is not a valid file >> URI. Either the pointer path is invalid, or the

Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Biao Geng
Hi Oliver, I believe you are almost there. One thing I found that could be improved is that in your job yaml, instead of using: kubernetes.operator.metrics.reporter.prommetrics.reporters: prom kubernetes.operator.metrics.reporter.prommetrics.reporter.prom.factory.class:
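For reference, the standard reporter wiring in a Flink job's own configuration looks like the following sketch; the factory class name is the one I recall from the Flink Prometheus reporter docs, and the port range is an example, so verify both against your Flink version:

```yaml
# Expose job metrics to Prometheus (factory-style reporter configuration)
metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
# Each JM/TM picks the first free port from this range (example range)
metrics.reporter.prom.port: 9249-9259
```

The `kubernetes.operator.metrics.*` keys quoted in the message configure the operator's own metrics, which is a separate concern from the job's reporter shown here.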

Re: problem with the heartbeat interval feature

2024-05-18 Thread Hongshun Wang
Hi Thomas, I have reviewed the code and just noticed that heartbeat.action.query is not mandatory. Debezium will generate Heartbeat Events at regular intervals. Flink CDC will then receive these Heartbeat Events and advance the offset[1]. Finally, the source reader will commit the offset during

Re: Restore from checkpoint

2024-05-17 Thread jiadong.lu
/savepoint path ':/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' is not a valid file URI. Either the pointer path is invalid, or the checkpoint was created by a different state backend. What is wrong with the way the job is re-submitted to the cluster? Kind regards Phil

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Muhammet Orazov via user
Amazing, congrats! Thanks for your efforts! Best, Muhammet On 2024-05-17 09:32, Qingsheng Ren wrote: The Apache Flink community is very happy to announce the release of Apache Flink CDC 3.1.0. Apache Flink CDC is a distributed data integration tool for real time data and batch data, bringing

Re: problem with the heartbeat interval feature

2024-05-17 Thread Thomas Peyric
Thanks Hongshun for your response! On Fri, May 17, 2024 at 07:51, Hongshun Wang wrote: > Hi Thomas, > > The Debezium docs say: For the connector to detect and process events from > a heartbeat table, you must add the table to the PostgreSQL publication > specified by the publication.name >

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread gongzhongqiang
Congratulations! Thanks to all contributors. Best, Zhongqiang Gong Qingsheng Ren wrote on Fri, May 17, 2024 at 17:33: > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration tool for real time > data and

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Hang Ruan
Congratulations! Thanks for the great work. Best, Hang Qingsheng Ren wrote on Fri, May 17, 2024 at 17:33: > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration tool for real time > data and batch data, bringing

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Leonard Xu
Congratulations! Thanks Qingsheng for the great work and all contributors involved!! Best, Leonard > On May 17, 2024 at 5:32 PM, Qingsheng Ren wrote: > > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration

Re: Get access to unmatching events in Apache Flink Cep

2024-05-17 Thread Anton Sidorov
Ok, thanks for the reply. On Fri, May 17, 2024 at 09:22, Biao Geng: > Hi Anton, > > I am afraid that currently there is no such API to access the middle NFA > state in your case. For patterns that contain a 'within()' condition, the > timeout events could be retrieved via the TimedOutPartialMatchHandler

Re: SSL Kafka PyFlink

2024-05-17 Thread Evgeniy Lyutikov via user
@flink.apache.org Subject: Re: SSL Kafka PyFlink Hi Phil, The Kafka SSL configuration keys may not be correct. You can refer to the Kafka documentation[1] for the client SSL configuration. [1] https://kafka.apache.org/documentation/#security_configclients Best, Zhongqiang Gong Phil Stavridis mailto:phi

Re: What is the best way to aggregate data over a long window

2024-05-17 Thread Sachin Mittal
Hi, I am doing the following: 1. Use the reduce function where the data type of the output after windowing is the same as the input. 2. Where the data type of the output after windowing is different from that of the input, I use the aggregate function. For example: SingleOutputStreamOperator data =
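The distinction in point 2 (output type differs from input type, so reduce is not enough) can be illustrated with a plain-Java accumulator shaped like Flink's AggregateFunction contract: createAccumulator/add/getResult/merge. This sketch has no Flink dependencies, and the average-of-longs use case is an illustrative assumption, not taken from the thread:

```java
public class AverageAggregate {
    // Accumulator holds (sum, count); input type is long, output type is
    // double, so the output differs from the input, which reduce() cannot express.
    static long[] createAccumulator() { return new long[] {0L, 0L}; }

    static long[] add(long value, long[] acc) {
        return new long[] {acc[0] + value, acc[1] + 1};
    }

    static double getResult(long[] acc) {
        return acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1];
    }

    // merge combines partial accumulators, e.g. when windows are merged
    static long[] merge(long[] a, long[] b) {
        return new long[] {a[0] + b[0], a[1] + b[1]};
    }

    public static void main(String[] args) {
        long[] acc = createAccumulator();
        for (long v : new long[] {1, 2, 3, 4}) {
            acc = add(v, acc);
        }
        System.out.println(getResult(acc));   // average of 1..4 is 2.5
    }
}
```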

Re: Re: Re: Flink kafka connector for v 1.19.0

2024-05-17 Thread Niklas Wilcke
Hi Hang, thanks for pointing me to the mail thread. That is indeed interesting. Can we maybe ping someone to get this done? Can I do something about it? Becoming a PMC member might be difficult. :) Are there still three PMC votes outstanding? I'm not entirely sure how to properly check who is part

Re: What is the best way to aggregate data over a long window

2024-05-17 Thread gongzhongqiang
Hi Sachin, We can optimize this problem in the following ways: - use org.apache.flink.streaming.api.datastream.WindowedStream#aggregate(org.apache.flink.api.common.functions.AggregateFunction) to reduce the amount of data - use TTL to clean up data that is no longer needed - enable incremental checkpoints -

Re: Get access to unmatching events in Apache Flink Cep

2024-05-17 Thread Biao Geng
Hi Anton, I am afraid that currently there is no such API to access the middle NFA state in your case. For patterns that contain a 'within()' condition, the timeout events could be retrieved via the TimedOutPartialMatchHandler interface, but other unmatching events would be pruned immediately once they

Re: problem with the heartbeat interval feature

2024-05-16 Thread Hongshun Wang
Hi Thomas, The Debezium docs say: For the connector to detect and process events from a heartbeat table, you must add the table to the PostgreSQL publication specified by the publication.name

Re: SSL Kafka PyFlink

2024-05-16 Thread gongzhongqiang
Hi Phil, The Kafka SSL configuration keys may not be correct. You can refer to the Kafka documentation[1] for the client SSL configuration. [1] https://kafka.apache.org/documentation/#security_configclients Best, Zhongqiang Gong Phil Stavridis wrote on Fri, May 17, 2024 at 01:44: > Hi, > > I have a
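For reference, the usual Kafka client SSL keys look like the sketch below; all paths and passwords are placeholders. In Flink's Kafka connector these keys are passed through with a `properties.` prefix, e.g. `'properties.security.protocol' = 'SSL'` in a SQL table definition:

```properties
# Kafka client SSL settings (all values are placeholders)
security.protocol=SSL
ssl.truststore.location=/etc/kafka/client.truststore.jks
ssl.truststore.password=changeit
# Only needed when the broker requires client (mutual) authentication
ssl.keystore.location=/etc/kafka/client.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
```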

Re: [EXTERNAL]Re: Flink kafka connector for v 1.19.0

2024-05-16 Thread Hang Ruan
Hi, Niklas. The kafka connector version 3.2.0[1] is for Flink 1.19 and it already has a vote thread[2]. But there are not enough votes yet. Best, Hang [1] https://issues.apache.org/jira/browse/FLINK-35138 [2] https://lists.apache.org/thread/7shs2wzb0jkfdyst3mh6d9pn3z1bo93c Niklas Wilcke

Re: monitoring message latency for flink sql app

2024-05-16 Thread Hang Ruan
Hi, mete. As Feng Jin said, I think you could make use of the metric `currentEmitEventTimeLag`. Besides that, if you develop your job with the DataStream API, you could add a new operator to handle it yourself. Best, Hang Feng Jin wrote on Fri, May 17, 2024 at 02:44: > Hi Mete > > You can refer to the

Re: Flink 1.18.1, state restore after restart

2024-05-16 Thread Yanfei Lei
This looks like the same problem as FLINK-34063 / FLINK-33863; you could try upgrading to 1.18.2. [1] https://issues.apache.org/jira/browse/FLINK-33863 [2] https://issues.apache.org/jira/browse/FLINK-34063 陈叶超 wrote on Thu, May 16, 2024 at 16:38: > > After upgrading to flink 1.18.1, restoring state on job restart hits the following error: > 2024-04-09 13:03:48 > java.lang.Exception: Exception while

Re: monitoring message latency for flink sql app

2024-05-16 Thread Feng Jin
Hi Mete You can refer to the metrics provided by the Kafka source connector. https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/datastream/kafka//#monitoring Best, Feng On Thu, May 16, 2024 at 7:55 PM mete wrote: > Hello, > > For an sql application using kafka as

RE: monitoring message latency for flink sql app

2024-05-16 Thread Anton Sidorov
Hello mete. I found this SO article: https://stackoverflow.com/questions/54293808/measuring-event-time-latency-with-flink-cep If I'm not mistaken, you can use the Flink metrics system for operators and get the processing time of an event in an operator. On 2024/05/16 11:54:44 mete wrote: > Hello, > > For an

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-16 Thread John Smith
Hi. No I have not changed the protocol. On Thu, May 16, 2024, 3:20 AM Biao Geng wrote: > Hi John, > > Just want to check, have you ever changed the kafka protocol in your job > after using the new cluster? The error message shows that it is caused by > the kafka client and there is a similar

Re: [EXTERNAL]Re: Flink kafka connector for v 1.19.0

2024-05-16 Thread Niklas Wilcke
Hi Ahmed, are you aware of a blocker? I'm also a bit confused that, with Flink 1.19 having been available for a month now, the connectors still aren't. It would be great to get some insights or maybe a reference to an issue. From looking at the GitHub repos and the Jira I wasn't able to spot

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-16 Thread Biao Geng
Hi John, Just want to check, have you ever changed the kafka protocol in your job after using the new cluster? The error message shows that it is caused by the kafka client and there is a similar error in this issue

Re: use flink 1.19 JDBC Driver can find jdbc connector

2024-05-15 Thread abc15606
We can use Chinese now? Just edit the Factory file in gateway.jar in the opt directory to register the connector. > On May 15, 2024 at 15:36, Xuyang wrote: > > Hi, it looks like your earlier problem was that the jdbc driver could not be found. Could you briefly describe your solution? "Registering the number of connections" is a bit hard to understand. > > > > > If there really is a similar problem that was solved this way, you could open an improvement jira issue[1] to help the community track and improve it, thanks! > > > > > [1]

Re: How can I contribute a Flink Hologres connector?

2024-05-15 Thread Xuyang
Hi, purely from a contribution point of view, I think supporting a flink hologres connector is fine; Hologres is a fairly popular database now, so there is certainly plenty of demand, and the aliyun GitHub org already provides an open-source flink hologres connector[1] based on it. But for the commercial ververica-connector-hologres package from aliyun and similar companies, if you want to open-source it directly, from my point of view it is best to confirm the points below first, otherwise some legal risks may be hidden 1.

Re: Job is Failing for every 2hrs - Out of Memory Exception

2024-05-15 Thread Biao Geng
Hi Madan, The error shows that it cannot create new threads. One common reason is that the physical machine does not have a large enough thread limit configured (check this SO

Re: how to reduce read times when many jobs read the same kafka topic?

2024-05-14 Thread longfeng Xu
Thanks for your explanation. I'll give it a try. :) Sachin Mittal wrote on Wed, May 15, 2024 at 10:39: > Each separate job would have its own consumer group hence they will read > independently from the same topic and when checkpointing they will commit > their own offsets. > So if any job fails, it will not

Re: Checkpointing while loading causing issues

2024-05-14 Thread gongzhongqiang
Hi Lars, Currently, there is no configuration available to trigger a checkpoint immediately after the job starts in Flink. But we can address this issue from multiple perspectives using the insights provided in this document [1]. [1]

Re: how to reduce read times when many jobs read the same kafka topic?

2024-05-14 Thread Sachin Mittal
Each separate job would have its own consumer group hence they will read independently from the same topic and when checkpointing they will commit their own offsets. So if any job fails, it will not affect the progress of other jobs when reading from Kafka. I am not sure of the impact of network

Re: Proper way to modify log4j config file for kubernetes-session

2024-05-14 Thread Vararu, Vadim
Yes, the dynamic log level modification worked great for me. Thanks a lot, Vadim From: Biao Geng Date: Tuesday, 14 May 2024 at 10:07 To: Vararu, Vadim Cc: user@flink.apache.org Subject: Re: Proper way to modify log4j config file for kubernetes-session Hi Vararu, Does this document meet your

Re: Proper way to modify log4j config file for kubernetes-session

2024-05-14 Thread Biao Geng
Hi Vararu, Does this document meet your requirements? https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#logging Best, Biao Geng Vararu, Vadim wrote on Tue, May 14, 2024 at 01:39: > Hi, > > > > Trying to configure loggers in the

Re: How can we exclude operator level metrics from getting reported

2024-05-14 Thread Biao Geng
Hi Sachin, Your setting looks fine to me. If you want to verify it, one way is to set the log level to 'trace' and check whether logs like 'Ignoring metric {}.{} for reporter #{} due to filter rules.' are printed. Best, Biao Geng Sachin Mittal wrote on Sun, May 12, 2024 at 02:51: > Hi > I have a following

Re: Question about the timestamps carried by Kafka records

2024-05-13 Thread Biao Geng
Hi, >>> So is this timestamp generated automatically when Kafka receives the data? Or does something need to be set when sending the message to Kafka to attach the business time? This timestamp comes from the timestamp in the Kafka record; see the code

Re: use flink 1.19 JDBC Driver can find jdbc connector

2024-05-13 Thread kellygeorg...@163.com
Unsubscribe Replied Message | From | abc15...@163.com | | Date | 05/10/2024 12:26 | | To | user-zh@flink.apache.org | | Cc | | | Subject | Re: use flink 1.19 JDBC Driver can find jdbc connector | I've solved it. You need to register the number of connections in the jar of gateway

Re: Flink Kubernetes Operator - How can I use a jar that is hosted on a private maven repo for a FlinkSessionJob?

2024-05-12 Thread Nathan T. A. Lewis
Hi Mate, That option might be exactly what I need. Thanks! Best regards, Nathan T. A. Lewis On Sun, 12 May 2024 05:27:10 -0600 czmat...@gmail.com wrote Hi Nathan, Job submissions for FlinkSessionJob resources will always be done by first uploading the JAR file itself from the

Re: Flink Kubernetes Operator - How can I use a jar that is hosted on a private maven repo for a FlinkSessionJob?

2024-05-12 Thread Mate Czagany
Hi Nathan, Job submissions for FlinkSessionJob resources will always be done by first uploading the JAR file itself from the Operator pod using the JobManager's REST API, then starting a new job using the uploaded JAR. This means that downloading the JAR file with an initContainer to the

Re: Unsubscribe

2024-05-11 Thread Hang Ruan
Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] for managing your mailing list subscriptions. Best, Hang [1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8 [2] https://flink.apache.org/community.html#mailing-lists 爱看书不识字 wrote on Sat, May 11, 2024 at 10:06:

Re: Flink kafka connector for v 1.19.0

2024-05-10 Thread Ahmed Hamdy
Hi Aniket, The community is currently working on releasing new versions of all the connectors compatible with 1.19. Please follow the announcements on the Flink website[1] to get notified when they are available. 1-https://flink.apache.org/posts/ Best Regards Ahmed Hamdy On Fri, 10 May 2024

Re: use flink 1.19 JDBC Driver can't find jdbc connector

2024-05-09 Thread abc15606
I've solved it. You need to register the connector in the gateway's JAR. But this is inconvenient, and I still hope it can be improved. Sent from my iPhone > On May 10, 2024, at 11:56, Xuyang wrote: > > Hi, can you print the classloader and verify if the jdbc connector exists in > it? > > > > > --

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-09 Thread Prasad, Neil
are the same as those bundled in a release, then it'll always show a diff. From: Márton Balassi Date: Thursday, May 9, 2024 at 3:50 PM To: Prasad, Neil Cc: user@flink.apache.org Subject: Re: Flink Kubernetes Operator 1.8.0 CRDs Hi, What do you mean exactly by cannot be applied or replaced? What

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-09 Thread Gyula Fóra
Hey! We have not observed any issue so far. Can you please share some error information / logs? Opening a Jira ticket would be best. Thanks, Gyula On Thu, 9 May 2024 at 21:18, Prasad, Neil wrote: > I am writing to report an issue with the Flink Kubernetes Operator version > 1.8.0. The CRD is

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-09 Thread Márton Balassi
Hi, What do you mean exactly by cannot be applied or replaced? What exactly is the issue? Are you installing fresh or trying to upgrade from a previous version? If the latter please follow this:

Re: Apache Flink-Redis Connector Depreciated In New Version | Adsolut Media

2024-05-09 Thread Ahmed Hamdy
Hi Kush, Unfortunately there is currently no real Redis connector maintained by the Flink community. I am aware that Bahir's version might be outdated, but we are currently working on a community-supported connector[1] 1-https://github.com/apache/flink-connector-redis-streams Best Regards Ahmed

Re: Unsubscribe

2024-05-09 Thread Yunfeng Zhou
Hi, to unsubscribe, please send an email with any content to user-zh-unsubscr...@flink.apache.org . Best, yunfeng On Thu, May 9, 2024 at 5:58 PM xpfei0811 wrote: > > Unsubscribe > > Original message > | From | wangfengyang | > | Date | Apr 23, 2024 18:10 | > | To | user-zh | > | Subject | Unsubscribe | > Unsubscribe

Re: Re: Looking for help with Job Initialisation issue

2024-05-09 Thread Keith Lee
Hi Abhi, > We see that even when all the Taskslots of that particular operator are stuck in an INITIALISING state Can you include the stack traces of these threads so that we can understand why the operators are stuck in INITIALISING? Regards Keith On Thu, May 9, 2024 at 6:58 AM Abhi Sagar

RE: Re: Looking for help with Job Initialisation issue

2024-05-09 Thread Abhi Sagar Khatri via user
Hi Biao, Thank you for your response. We have tried looking into thread dumps of the Task Managers before, but that's not helping our case. We see that even when all the task slots of that particular operator are stuck in an INITIALISING state, many of them have already started processing new data. Is

Re: Flink scheduler keeps trying to schedule the pods indefinitely

2024-05-08 Thread Chetas Joshi
Hey Gyula, Thanks for getting back. 1) Yes, some more testing revealed the job was able to start with a lower parallelism, i.e. lower than the upper bound set by the adaptive scheduler. 2) I am limiting the parallelism of any job vertex by setting pipeline.max-parallelism to a value that
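A sketch of the configuration combination discussed here — the adaptive scheduler plus a per-vertex parallelism cap. Values are illustrative; verify option names against the configuration reference for your Flink version:

```yaml
# flink-conf.yaml (sketch; values are placeholders)
jobmanager.scheduler: adaptive
parallelism.default: 4
# Upper bound on the parallelism the scheduler may pick for any job vertex:
pipeline.max-parallelism: 32
```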

Re: Checkpointing

2024-05-08 Thread Muhammet Orazov via user
Hey Jacob, If you understand how the Kafka offset is managed in the checkpoint, then you can map this notion to other Flink sources. I would suggest reading the Data Sources[1] document and FLIP-27[5]. Each source should define a `Split`; then it is the `SourceReaderBase`[2] class' responsibility to
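The split-state idea described here can be shown with a tiny self-contained sketch (plain Python, not the Flink API; the class and method names are made up for illustration): the reader's position within its split is what gets captured at checkpoint time and handed back on recovery.

```python
class FileSplitReader:
    """Toy analogue of a Flink source split reader: the reader tracks
    its position inside the split, snapshot() captures that position
    for a checkpoint, and a new reader resumes from the saved state
    after a failure."""

    def __init__(self, records, position=0):
        self.records = records
        self.position = position

    def poll(self):
        # Return the next record, or None when the split is exhausted.
        if self.position >= len(self.records):
            return None
        rec = self.records[self.position]
        self.position += 1
        return rec

    def snapshot(self):
        # The per-split state that would be persisted in a checkpoint.
        return {"position": self.position}

reader = FileSplitReader(["a", "b", "c"])
reader.poll()              # -> "a"
state = reader.snapshot()  # -> {"position": 1}
# Recovery: rebuild the reader from checkpointed state and continue.
restored = FileSplitReader(["a", "b", "c"], **state)
```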

Re: Flink Kubernetes Operator Application mode multiple jobs

2024-05-08 Thread Raihan Sunny
Got it. Thanks for the clarification guys. From: Guozhen Yang Sent: Wednesday, May 8, 2024 9:38 AM To: user@flink.apache.org Subject: RE: Flink Kubernetes Operator Application mode multiple jobs Hi Raihan, We have encountered the same issue though we are using
