TTL issue with large RocksDB keyed state

2024-06-02 Thread Cliff Resnick
Hi everyone, We have a Flink application that has a very large and perhaps unusual state. The basic shape of it is a very large and somewhat random keyed-stream partition space, each with a continuously growing map-state keyed by microsecond time Long values. There are never any overwrites in

Statefun runtime crashes when experiencing high msg/second rate on kubernetes

2024-06-01 Thread Oliver Schmied
Hi everyone, I set up the Flink StateFun runtime on minikube (cpus=4, memory=10240) following the tutorial in statefun-playground https://github.com/apache/flink-statefun-playground/tree/main/deployments/k8s . I developed my own StateFun functions in Java and deployed them the same way as

Re: Pulsar connector resets existing subscription

2024-05-31 Thread Igor Basov
For the record here, I created an issue FLINK-35477 regarding this. On Mon, May 27, 2024 at 1:21 PM Igor Basov wrote: > Ok, I believe the breaking changes were introduced in this commit >

Re: DynamoDB Table API Connector Failing On Row Deletion - "The provided key element does not match the schema"

2024-05-31 Thread Ahmed Hamdy
Hi Rob, I agree with the issue here as well as the proposed solution. Thanks a lot for the deep dive and the reproduction steps. I have created a ticket on your behalf: https://issues.apache.org/jira/browse/FLINK-35500 you can comment on it if you intend to work on it and then submit a PR against

Re: Flink 1.18.2 release date

2024-05-30 Thread weijie guo
Hi Yang, IIRC 1.18.2 has not been kicked off yet. Best regards, Weijie. Yang LI wrote on Thu, May 30, 2024 at 22:33: > Dear Flink Community, > > Anyone know about the release date for 1.18.2? > > Thanks very much, > Yang >

DynamoDB Table API Connector Failing On Row Deletion - "The provided key element does not match the schema"

2024-05-30 Thread Rob Goretsky
Hello! I am looking to use the DynamoDB Table API connector to write rows to AWS DynamoDB. I have found what appears to be a bug in the implementation for Row Delete operations. I have an idea on what needs to be fixed as well, and given that I am new to this community, I am looking for

Flink 1.18.2 release date

2024-05-30 Thread Yang LI
Dear Flink Community, Anyone know about the release date for 1.18.2? Thanks very much, Yang

Re: Java 17 incompatibilities with Flink 1.18.1 version

2024-05-30 Thread weijie guo
I believe we run cron tests on JDK 17 and JDK 21 as well. Best regards, Weijie. Zhanghao Chen wrote on Thu, May 30, 2024 at 19:35: > Hi Rajat, > > There's no need to get Flink libraries' compiled version on jdk17. The > only things you need to do are: > > >1. Configure the JAVA_HOME env to use JDK 17 on both

Re: Slack Invite

2024-05-30 Thread gongzhongqiang
Hi, the invite link: https://join.slack.com/t/apache-flink/shared_invite/zt-2jtsd06wy-31q_aELVkdc4dHsx0GMhOQ Best, Zhongqiang Gong. Nelson de Menezes Neto wrote on Thu, May 30, 2024 at 15:01: > Hey guys! > > I want to join the slack community but the invite has expired.. > Can u send me a new one? >

Re: Java 17 incompatibilities with Flink 1.18.1 version

2024-05-30 Thread Zhanghao Chen
Hi Rajat, there's no need for a version of the Flink libraries compiled on JDK 17. The only things you need to do are: 1. Configure the JAVA_HOME env to use JDK 17 on both the client and server side 2. Configure the Flink JVM options properly to cooperate with JDK modularization. The
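A minimal sketch of those two steps. The JDK path is a placeholder, and the module flags shown are only an illustrative subset of the `env.java.opts.all` list that recent Flink distributions ship by default; check your version's default flink-conf.yaml for the full set.

```shell
# 1. Point both client and server at JDK 17 (path is a placeholder)
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk

# 2. In flink-conf.yaml, pass the JVM module options Flink needs under JDK 17.
#    An illustrative subset of the default env.java.opts.all list:
#    env.java.opts.all: --add-opens=java.base/java.lang=ALL-UNNAMED
#                       --add-opens=java.base/java.util=ALL-UNNAMED ...
```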

Unsubscribe

2024-05-29 Thread jszhouch...@163.com
Unsubscribe

Re: Re: Is there a way to rate-limit Flink SQL consumption of a Kafka topic?

2024-05-28 Thread Zhanghao Chen
It should be possible. Also, rate-limiting logic was once implemented in the old Kafka connector [1], which you can use as a reference. I think this requirement is fairly common; feel free to file a JIRA. [1] https://issues.apache.org/jira/browse/FLINK-11501 Best, Zhanghao Chen From: casel.chen Sent: Tuesday, May 28, 2024 22:00 To: user-zh@flink.apache.org Subject: Re: Is there a way to rate-limit Flink SQL consumption of a Kafka

Re: Java 17 incompatibilities with Flink 1.18.1 version

2024-05-28 Thread Zhanghao Chen
Hi Rajat, Flink releases are compiled with JDK 8 but can run on JDK 8-17. As long as your Flink runs on JDK 17 on both the server and client side, you are free to write your Flink jobs with Java 17. Best, Zhanghao Chen From: Rajat Pratap Sent: Tuesday,

Re: Is there a way to rate-limit Flink SQL consumption of a Kafka topic?

2024-05-28 Thread casel.chen
I checked the Flink source code: the current DataGeneratorSource takes a RateLimiterStrategy parameter, but KafkaSource does not. Can rate limiting be implemented the way DataGeneratorSource does it? public DataGeneratorSource( GeneratorFunction generatorFunction, long count, RateLimiterStrategy rateLimiterStrategy, TypeInformation typeInfo) {...}
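For reference, the constructor quoted above is used like this — a hedged sketch against the 1.18-era flink-connector-datagen API (KafkaSource indeed exposes no such parameter, which is the gap the thread is asking about):

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.connector.source.util.ratelimit.RateLimiterStrategy;
import org.apache.flink.connector.datagen.source.DataGeneratorSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ThrottledDataGenSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The quoted constructor: generator function, record count, rate limit, type info.
        // Here: at most 100 records/second across all parallel subtasks.
        DataGeneratorSource<String> source = new DataGeneratorSource<>(
                index -> "record-" + index,
                Long.MAX_VALUE,
                RateLimiterStrategy.perSecond(100),
                Types.STRING);

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "throttled-data-gen")
           .print();
        env.execute("rate-limited source sketch");
    }
}
```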

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-28 Thread Andrew Otto
> Flink CDC [1] now provides full-DB sync and schema evolution ability as a pipeline job. Ah! That is very cool. > Iceberg sink support was suggested before, and we’re trying to implement this in the next few releases. Does it cover the use-cases you mentioned? Yes! That would be fantastic.

Re: Re: Help with monitoring metrics of StateFun runtime with prometheus

2024-05-28 Thread Oliver Schmied
Dear Biao Geng,   thank you for your reply. You are right, the statefun metrics are tracked along with the "normal" Flink metrics, I just could not find them. If anyone is interested, flink_taskmanager_job_task_operator_functions___ is the way to get them. Thanks again. Best regards, Oliver

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-28 Thread Alexis Sarda-Espinosa
Hello, I've also noticed this in our Argo CD setup. Since priority=0 is the default, Kubernetes accepts it but doesn't store it in the actual resource; I'm guessing it's like a mutating admission hook that comes out of the box. The "priority" property can be safely removed from the CRDs.

How to set the group account when submitting SQL jobs via flink sql-gateway

2024-05-28 Thread 阿华田
When submitting SQL jobs via flink sql-gateway, I found that after the sql-gateway service starts, it submits jobs to the YARN cluster under the tenant identity of the current machine by default. Since our company's Hadoop cluster enforces tenant permissions, the submitting user's information needs to be set. How can the group account be set when submitting SQL jobs via flink sql-gateway? | 阿华田 | a15733178...@163.com |

Java 17 incompatibilities with Flink 1.18.1 version

2024-05-28 Thread Rajat Pratap
Hi Team, I am writing Flink jobs with the latest Flink release (1.18.1). The jobmanager is also deployed with the same version build. However, we faced issues when we deployed the jobs. On further investigation, I noticed all Flink libraries were built with JDK 1.8. I have the following

Re: Slack Community Invite?

2024-05-27 Thread Alexandre Lemaire
Thank you! > On May 27, 2024, at 9:35 PM, Junrui Lee wrote: > > Hi Alexandre, > > You can try this link: > https://join.slack.com/t/apache-flink/shared_invite/zt-2jn2dlgoi-P4oQBWRJT4I_3HY8ZbLxdg > > Best, > Junrui > > Alexandre Lemaire <alema...@circlical.com> > wrote on Tue, May 28, 2024

Re: Slack Community Invite?

2024-05-27 Thread Junrui Lee
Hi Alexandre, You can try this link: https://join.slack.com/t/apache-flink/shared_invite/zt-2jn2dlgoi-P4oQBWRJT4I_3HY8ZbLxdg Best, Junrui. Alexandre Lemaire wrote on Tue, May 28, 2024 at 01:18: > Hello! > > Does the Slack community still exist? The link on the site is expired. > > Thank you! > Alex > > >

Re: Pulsar connector resets existing subscription

2024-05-27 Thread Igor Basov
Ok, I believe the breaking changes were introduced in this commit. Here

Slack Community Invite?

2024-05-27 Thread Alexandre Lemaire
Hello! Does the Slack community still exist? The link on the site is expired. Thank you! Alex

Is there a way to rate-limit Flink SQL consumption of a Kafka topic?

2024-05-27 Thread casel.chen
Is there a way to rate-limit Flink SQL consumption of a Kafka topic? The scenario: we consume a Kafka topic and write the data to a downstream MongoDB. During business peak hours the write pressure on MongoDB is high, so we would like to throttle Kafka consumption. How can this be implemented?

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread Hang Ruan
Hi, all. Flink CDC provides the schema evolution ability to sync the entire database. I think it could satisfy your needs. Flink CDC pipeline sources and sinks are listed in [1]. An Iceberg pipeline connector is not provided for now. > What is not is the automatic syncing of entire databases, with

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread gongzhongqiang
Flink CDC 3.0 focuses on data integration scenarios, so you don't need to pay attention to the framework implementation; you just need to use the YAML format to describe the data source and target to quickly build a data synchronization task with schema evolution. And it supports rich source and

Unsubscribe

2024-05-26 Thread 95chenjz
Unsubscribe

Re: Help with monitoring metrics of StateFun runtime with prometheus

2024-05-26 Thread Biao Geng
Hi Oliver, I am not experienced in StateFun but its doc says 'Along with the standard metric scopes, Stateful Functions supports Function Scope, which is one level below operator scope.' So, as long as you can collect Flink's metrics via Prometheus, ideally, there should be no difference between using

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-26 Thread Xiqian YU
Hi Otto, Flink CDC [1] now provides full-DB sync and schema evolution ability as a pipeline job. Iceberg sink support was suggested before, and we’re trying to implement this in the next few releases. Does it cover the use-cases you mentioned? [1]

Re: How to start a flink job on a long running yarn cluster from a checkpoint (with arguments)

2024-05-25 Thread Junrui Lee
Hi Sachin, Yes, that's correct. To resume from a savepoint, use the command bin/flink run -s . You can find more details in the Flink documentation on [1]. Additionally, information on how to trigger a savepoint can be found in the section for triggering savepoints [2]. [1]
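Put together with the YARN submission from the original question, the stop/resume cycle sketched above would look roughly like this (the savepoint directory and the resulting savepoint-xxxx path are placeholders; the job ID comes from `flink list`):

```shell
# Stop the job, taking a savepoint first
flink stop --savepointPath s3://my-bucket/savepoints <jobId> \
    -yid application_1473169569237_0001

# Resume from that savepoint, passing program arguments as before
flink run -s s3://my-bucket/savepoints/savepoint-xxxx \
    -m yarn-cluster -yid application_1473169569237_0001 \
    /usr/lib/flink/examples/streaming/WordCount.jar \
    --input file:///input.txt --output file:///output/
```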

How to start a flink job on a long running yarn cluster from a checkpoint (with arguments)

2024-05-25 Thread Sachin Mittal
Hi, I have a long running yarn cluster and I submit my streaming job using the following command: flink run -m yarn-cluster -yid application_1473169569237_0001 /usr/lib/flink/examples/streaming/WordCount.jar --input file:///input.txt --output file:///output/ Let's say I want to stop this job,

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-25 Thread Péter Váry
> Could the table/database sync with schema evolution (without Flink job restarts!) potentially work with the Iceberg sink? Making this work would be a good addition to the Iceberg-Flink connector. It is definitely doable, but not a single PR sized task. If you want to try your hands on it, I

Unsubscribe

2024-05-24 Thread 蒋少东
Unsubscribe

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
> What is not is the automatic syncing of entire databases, with schema evolution and detection of new (and dropped?) tables. :) Wait. Is it? > Flink CDC supports synchronizing all tables of source database instance to downstream in one job by configuring the captured database list and table

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
Indeed, using Flink-CDC to write to Flink Sink Tables, including Iceberg, is supported. What is not is the automatic syncing of entire databases, with schema evolution and detection of new (and dropped?) tables. :) On Fri, May 24, 2024 at 8:58 AM Giannis Polyzos wrote: >

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Giannis Polyzos
https://nightlies.apache.org/flink/flink-cdc-docs-stable/ All these features come from Flink cdc itself. Because Paimon and Flink cdc are projects native to Flink there is a strong integration between them. (I believe it’s on the roadmap to support iceberg as well) On Fri, 24 May 2024 at 3:52 PM,

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
> I’m curious if there is any reason for choosing Iceberg instead of Paimon No technical reason that I'm aware of. We are using it mostly because of momentum. We looked at Flink Table Store (before it was Paimon), but decided it was too early and the docs were too sparse at the time to really

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Giannis Polyzos
I’m curious if there is any reason for choosing Iceberg instead of Paimon (other than - iceberg is more popular). Especially for a use case like CDC that iceberg struggles to support. On Fri, 24 May 2024 at 3:22 PM, Andrew Otto wrote: > Interesting thank you! > > I asked this in the Paimon

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-24 Thread Andrew Otto
Interesting thank you! I asked this in the Paimon users group: How coupled to Paimon catalogs and tables is the cdc part of Paimon? RichCdcMultiplexRecord

Ways to detect a scaling event within a flink operator at runtime

2024-05-23 Thread Chetas Joshi
Hello, On a k8s cluster, I have the flink-k8s-operator running 1.8 with autoscaler = enabled (in-place) and a flinkDeployment (application mode) running 1.18.1. The flinkDeployment i.e. the flink streaming application has a mock data producer as the source. The source generates data points

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
If I understand correctly, Paimon is sending `CdcRecord`-s [1] on the wire which contain not only the data, but the schema as well. With Iceberg we currently only send the row data, and expect to receive the schema on job start - this is more performant than sending the schema all the time, but

Pulsar connector resets existing subscription

2024-05-23 Thread Igor Basov
Hi everyone, I have a problem with how Flink deals with the existing subscription in a Pulsar topic. - Subscription has some accumulated backlog - Flink job is deployed from a clear state (no checkpoints) - Flink job uses the same subscription name as the existing one; the start

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Ah I see, so just auto-restarting to pick up new stuff. I'd love to understand how Paimon does this. They have a database sync action which will sync entire databases, handle schema evolution, and I'm

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
I will ask Marton about the slides. The solution was something like this in a nutshell: - Make sure that on job start the latest Iceberg schema is read from the Iceberg table - Throw a SuppressRestartsException when data arrives with the wrong schema - Use Flink Kubernetes Operator to restart

Re: issues with Flink kinesis connector

2024-05-23 Thread Nick Hecht
Thank you for your help! On Thu, May 23, 2024 at 1:40 PM Aleksandr Pilipenko wrote: > Hi Nick, > > You need to use another method to add sink to your job - sinkTo. > KinesisStreamsSink implements newer Sink interface, while addSink expect > old SinkFunction. You can see this by looking at

Re: issues with Flink kinesis connector

2024-05-23 Thread Aleksandr Pilipenko
Hi Nick, You need to use another method to add a sink to your job - sinkTo. KinesisStreamsSink implements the newer Sink interface, while addSink expects the old SinkFunction. You can see this by looking at the method signatures[1] and the usage examples in the documentation[2] [1]
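A sketch of the fix in Java. The builder calls follow the documented pattern for the Kinesis Streams sink, but the region, stream name, and partition-key choice are placeholders; the original thread uses PyFlink, where the equivalent DataStream call is `sink_to`.

```java
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kinesis.sink.KinesisStreamsSink;
import org.apache.flink.streaming.api.datastream.DataStream;

public class KinesisSinkSketch {
    // Given some DataStream<String> stream:
    static void attachSink(DataStream<String> stream) {
        Properties sinkProperties = new Properties();
        sinkProperties.put("aws.region", "us-east-1"); // placeholder region

        KinesisStreamsSink<String> kdsSink = KinesisStreamsSink.<String>builder()
                .setKinesisClientProperties(sinkProperties)
                .setSerializationSchema(new SimpleStringSchema())
                .setPartitionKeyGenerator(element -> String.valueOf(element.hashCode()))
                .setStreamName("my-stream") // placeholder stream name
                .build();

        // KinesisStreamsSink implements the new Sink interface, so use sinkTo:
        stream.sinkTo(kdsSink);
        // stream.addSink(kdsSink) would not compile: addSink expects a legacy SinkFunction.
    }
}
```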

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread Zhanghao Chen
Hi John, Based on the Memory config screenshot provided before, each of your TM should have MaxDirectMemory=1GB (network mem) + 128 MB (framework off-heap) = 1152 MB. Nor will taskmanager.memory.flink.size and the total including MaxDirectMemory exceed pod physical mem, you may check the
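The arithmetic above, spelled out. The 1 GiB network and 128 MiB framework off-heap figures are taken from the thread, not from inspecting the actual pod configuration:

```java
public class DirectMemoryCheck {
    // MaxDirectMemory for a TM = network memory + framework off-heap, per the thread.
    static long maxDirectMemoryMb(long networkMemMb, long frameworkOffHeapMb) {
        return networkMemMb + frameworkOffHeapMb;
    }

    public static void main(String[] args) {
        // 1 GiB network + 128 MiB framework off-heap
        System.out.println(maxDirectMemoryMb(1024, 128) + " MB"); // 1152 MB
    }
}
```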

Re: Task Manager memory usage

2024-05-23 Thread Zhanghao Chen
Hi Sigalit, For states stored in memory, they would most probably stay alive for several rounds of GC and end up in the old gen of the heap, and won't get recycled until a Full GC. As for the TM pod memory usage, most probably it will stop increasing at some point. You could try setting a

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Wow, I would LOVE to see this talk. If there is no recording, perhaps there are slides somewhere? On Thu, May 23, 2024 at 11:00 AM Carlos Sanabria Miranda < sanabria.miranda.car...@gmail.com> wrote: > Hi everyone! > > I have found in the Flink Forward website the following presentation: >

issues with Flink kinesis connector

2024-05-23 Thread Nick Hecht
Hello, I am currently having issues trying to use the Python Flink 1.18 DataStream API with the Amazon Kinesis Data Streams Connector. From the documentation https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kinesis/ I have downloaded the

"Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Carlos Sanabria Miranda
Hi everyone! I have found in the Flink Forward website the following presentation: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" by

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread John Smith
Based on these two settings... taskmanager.memory.flink.size: 16384m taskmanager.memory.jvm-metaspace.size: 3072m Reading the docs here I'm not sure how to calculate the formula. My suspicion is that I may have allocated too much of taskmanager.memory.flink.size and the total including

kirankumarkathe...@gmail.com-unsubscribe

2024-05-23 Thread Kiran Kumar Kathe
Kindly unsubscribe this Gmail account: kirankumarkathe...@gmail.com

Re: Task Manager memory usage

2024-05-23 Thread Sigalit Eliazov
Hi, thanks for your reply. We are storing the data in memory since it is short-term; we thought that adding RocksDB would add overhead. On Thu, May 23, 2024 at 4:38 PM Sachin Mittal wrote: > Hi > Where are you storing the state. > Try rocksdb. > > Thanks > Sachin > > > On Thu, 23 May 2024 at

Re: Task Manager memory usage

2024-05-23 Thread Sachin Mittal
Hi Where are you storing the state. Try rocksdb. Thanks Sachin On Thu, 23 May 2024 at 6:19 PM, Sigalit Eliazov wrote: > Hi, > > I am trying to understand the following behavior in our Flink application > cluster. Any assistance would be appreciated. > > We are running a Flink application

Help with monitoring metrics of StateFun runtime with prometheus

2024-05-23 Thread Oliver Schmied
Dear Apache Flink community, I am setting up an Apache Flink StateFun runtime on Kubernetes, following the flink-playground example: https://github.com/apache/flink-statefun-playground/tree/main/deployments/k8s. This is the manifest I used for creating the StateFun environment: ```---

Task Manager memory usage

2024-05-23 Thread Sigalit Eliazov
Hi, I am trying to understand the following behavior in our Flink application cluster. Any assistance would be appreciated. We are running a Flink application cluster with 5 task managers, each with the following configuration: - jobManagerMemory: 12g - taskManagerMemory: 20g -

Re: About the MongoDB splitVector permission issue

2024-05-23 Thread Jiabao Sun
Hi, splitVector is an internal MongoDB command for computing shard splits; it can also be used in replica-set deployments to compute chunk ranges. Without the splitVector permission, it automatically falls back to the sample split strategy. Best, Jiabao evio12...@gmail.com wrote on Thu, May 23, 2024 at 16:57: > > hello~ > > > I'm using the flink-cdc mongodb connector 2.3.0 >

About the MongoDB splitVector permission issue

2024-05-23 Thread evio12...@gmail.com
hello~ I'm using the flink-cdc mongodb connector 2.3.0 (https://github.com/apache/flink-cdc/blob/release-2.3/docs/content/connectors/mongodb-cdc.md). The documentation states the mongo account needs these permissions: 'splitVector', 'listDatabases', 'listCollections', 'collStats', 'find', and 'changeStream'. The MongoDB I'm using is a replica-set, but I learned that

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-23 Thread Vararu, Vadim
That’s great news. Thanks. From: Leonard Xu Date: Thursday, 23 May 2024 at 04:42 To: Vararu, Vadim Cc: user , Danny Cranmer Subject: Re: Flink kinesis connector 4.3.0 release estimated date Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize the

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-22 Thread Leonard Xu
Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize the release work this week if everything goes well. Best, Leonard > On May 22, 2024 at 11:51 PM, Vararu, Vadim wrote: > > Hi guys, > > Any idea when the 4.3.0 kinesis connector is estimated to be released?

Flink Kubernetes Operator Pod Disruption Budget

2024-05-22 Thread Jeremy Alvis via user
Hello, In order to maintain at least one pod for both the Flink Kubernetes Operator and JobManagers through Kubernetes processes that use the Eviction API such as when draining a node, we have deployed Pod

Flink kinesis connector 4.3.0 release estimated date

2024-05-22 Thread Vararu, Vadim
Hi guys, Any idea when the 4.3.0 kinesis connector is estimated to be released? Cheers, Vadim.

[ANNOUNCE] Apache Celeborn 0.4.1 available

2024-05-22 Thread Nicholas Jiang
Hi all, Apache Celeborn community is glad to announce the new release of Apache Celeborn 0.4.1. Celeborn is dedicated to improving the efficiency and elasticity of different map-reduce engines and provides an elastic, high-efficient service for intermediate data including shuffle data, spilled

StateMigrationException while using stateTTL

2024-05-22 Thread irakli.keshel...@sony.com
Hello, I'm using Flink 1.17.1 and I have stateTTL enabled in one of my Flink jobs where I'm using RocksDB for checkpointing. I have a value state of a Pojo class (which is generated from an Avro schema). I added a new field to my schema along with a default value to make sure it is backwards

Re: Get access to unmatching events in Apache Flink Cep

2024-05-22 Thread Anton Sidorov
In his answer, Biao said "currently there is no such API to access the middle NFA state". Maybe that API is planned? Or could I create an issue or pull request that adds the API? On Fri, May 17, 2024 at 12:04, Anton Sidorov: > Ok, thanks for the reply. > > On Fri, May 17, 2024 at 09:22, Biao Geng: > >> Hi

IllegalStateException: invalid BLOB

2024-05-21 Thread Lars Skjærven
Hello, We're facing the bug reported in https://issues.apache.org/jira/browse/FLINK-32212 More specifically, when Kubernetes decides to drain a node and the job manager restarts (but not the task manager), the job fails with: java.lang.IllegalStateException: The library registration references a

Re: Question about the iterate operation in the Flink 1.19 documentation

2024-05-20 Thread Xuyang
Hi, the Iterate API is deprecated as of 1.19 and no longer supported; see [1][2] for details. The FLIP [1] provides an alternative approach [3]. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-357%3A+Deprecate+Iteration+API+of+DataStream [2] https://issues.apache.org/jira/browse/FLINK-33144 [3]

Re: Flink autoscaler with AWS ASG: checkpoint access issue

2024-05-20 Thread Chetas Joshi
Hello, After digging into the 403 issue a bit, I figured out that after the scale-up event, the flink-s3-fs-presto uses the node profile instead of IRSA (IAM Role for Service Account) on some of the newly created TM pods. 1. Has anyone else experienced this as well? 2. Verified that this is an issue

Question about the iterate operation in the Flink 1.19 documentation

2024-05-20 Thread www
Dear Flink development team: Hello! I am currently learning how to implement iterative algorithms, such as single-source shortest paths on a graph, using Apache Flink's DataStream API. In the Flink 1.18 documentation I noticed an introduction to the iterate operation; see: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/overview/#iterations However, I found that Flink

Re: Flink SQL group by fields reduced after optimization

2024-05-20 Thread Lincoln Lee
Hi, you can try 1.17 or a newer version; this issue was fixed in Flink 1.17.0 [1]. Doing this removal optimization in batch mode is consistent with the semantics, but in streaming it cannot be pruned directly; the documentation on the related time functions [2] has also been updated. [1] https://issues.apache.org/jira/browse/FLINK-30006 [2]

Re: What is the best way to aggregate data over a long window

2024-05-20 Thread gongzhongqiang
Hi Sachin, `performing incremental aggregation using stateful processing` is the same as `windows with agg`, but the former is more flexible. If the Flink window cannot satisfy your performance needs, and your business logic has some features that can be customized for optimization, you can choose the

Re: Email submission

2024-05-20 Thread Hang Ruan
Hi, Michas. Please subscribe to the mailing list by sending an email to user-subscr...@flink.apache.org . Best, Hang Michas Szacillo (BLOOMBERG/ 919 3RD A) wrote on Sun, May 19, 2024 at 04:34: > Sending my email to join the apache user mailing list. > > Email: mszaci...@bloomberg.net >

Re: Restore from checkpoint

2024-05-20 Thread archzi lu
Hi Phil, correction: But the error you have is a familiar error if you have written some code to handle directory path. --> But the error you have is a familiar error if you have written some code to handle directory path with Java. No offence. Best regards. Jiadong. Lu Jiadong Lu

Re: Restore from checkpoint

2024-05-20 Thread Jiadong Lu
Hi, Phil. I don't have much expertise with the flink-python module. But the error you have is a familiar one if you have written some code to handle directory paths. The correct forms of the Path/URI would be: 1. "/home/foo" 2. "file:///home/foo/boo" 3. "hdfs:///home/foo/boo" 4. or Win32

Re: Flink SQL group by fields reduced after optimization

2024-05-19 Thread Benchao Li
The Calcite issue [1] you referenced was fixed in calcite-1.22.0, and Flink should have been using that Calcite version since 1.11. So which version of Flink are you using? This may be a different problem. If you can reproduce it on the current latest version, 1.19, you can open an issue to report a bug. PS: I double-checked the behavior I mentioned above: the distinction is only made at the code-generation stage, so it cannot be recognized during optimization; even in batch mode, the optimized plan should still contain the dt field. [1]

Re: Restore from checkpoint

2024-05-19 Thread Jinzhong Li
Hi Phil, I think you can use the "-s :checkpointMetaDataPath" arg to resume the job from a retained checkpoint[1]. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/checkpoints/#resuming-from-a-retained-checkpoint Best, Jinzhong Li On Mon, May 20, 2024 at 2:29 AM Phil
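Using the checkpoint directory mentioned elsewhere in the thread, the resume command would look roughly like this (the job-id and chk-42 path components are placeholders; point -s at the directory containing the checkpoint's _metadata file):

```shell
# Resume from a retained checkpoint: pass its metadata directory to -s
flink run -s /opt/flink/checkpoints/<job-id>/chk-42/ path/to/job.jar
```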

Re: Flink SQL group by fields reduced after optimization

2024-05-19 Thread Benchao Li
It looks like "dt = cast(CURRENT_DATE as string)" causes dt to be inferred as a constant, which then gets optimized away. Optimizing CURRENT_DATE into a constant should only happen in batch mode; is this SQL running in batch mode? ℡小新的蜡笔不见嘞、 <1515827...@qq.com.invalid> wrote on Sun, May 19, 2024 at 01:01: > > create view tmp_view as > SELECT > dt, -- 2 > uid, -- 0 > uname, -- 1 > uage

Re: Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-19 Thread Jingsong Li
CC to the Paimon community. Best, Jingsong On Mon, May 20, 2024 at 9:55 AM Jingsong Li wrote: > > Amazing, congrats! > > Best, > Jingsong > > On Sat, May 18, 2024 at 3:10 PM 大卫415 <2446566...@qq.com.invalid> wrote: > > > > Unsubscribe > > > > > > > > Original Email > > > > > > > >

Re: Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-19 Thread Jingsong Li
Amazing, congrats! Best, Jingsong On Sat, May 18, 2024 at 3:10 PM 大卫415 <2446566...@qq.com.invalid> wrote: > > Unsubscribe > > > > Original Email > > Sender:"gongzhongqiang"< gongzhongqi...@apache.org ; > > Sent Time:2024/5/17 23:10 > > To:"Qingsheng Ren"< re...@apache.org ; > > Cc

Re: Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Oliver Schmied
Dear Biao Geng, thank you very much. With the help of your demo and the YAML configuration, I was able to successfully set up monitoring for my Apache Flink jobs. Thanks again for your time and help. Best regards, Oliver Sent: Sunday, May 19, 2024 at 17:42 From: "Biao

Re: Restore from checkpoint

2024-05-19 Thread Phil Stavridis
Hi Lu, Thanks for your reply. How should the path be passed to the job that needs to use the checkpoint? Is the standard way using -s :/ or passing the path in the module as a Python arg? Kind regards Phil > On 18 May 2024, at 03:19, jiadong.lu wrote: > > Hi Phil, > > AFAIK,

Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Biao Geng
Hi Oliver, I believe you are almost there. One thing I found that could be improved: in your job YAML, instead of using: kubernetes.operator.metrics.reporter.prommetrics.reporters: prom kubernetes.operator.metrics.reporter.prommetrics.reporter.prom.factory.class:

Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-18 Thread Oliver Schmied
Dear Apache Flink Community, I am currently trying to monitor an Apache Flink cluster deployed on Kubernetes using Prometheus and Grafana. Despite following the official guide (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/metrics-logging/)  on how

Email submission

2024-05-18 Thread Michas Szacillo (BLOOMBERG/ 919 3RD A)
Sending my email to join the apache user mailing list. Email: mszaci...@bloomberg.net

Re: problem with the heartbeat interval feature

2024-05-18 Thread Hongshun Wang
Hi Thomas, I have reviewed the code and just noticed that heartbeat.action.query is not mandatory. Debezium will generate Heartbeat Events at regular intervals. Flink CDC will then receive these Heartbeat Events and advance the offset[1]. Finally, the source reader will commit the offset during

Re: Restore from checkpoint

2024-05-17 Thread jiadong.lu
Hi Phil, AFAIK, the error indicates your path was incorrect. You should use '/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' or 'file:///opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' instead. Best. Jiadong.Lu On 5/18/24 2:37 AM, Phil Stavridis wrote: Hi, I am trying to test how

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Muhammet Orazov via user
Amazing, congrats! Thanks for your efforts! Best, Muhammet On 2024-05-17 09:32, Qingsheng Ren wrote: The Apache Flink community is very happy to announce the release of Apache Flink CDC 3.1.0. Apache Flink CDC is a distributed data integration tool for real time data and batch data, bringing

Restore from checkpoint

2024-05-17 Thread Phil Stavridis
Hi, I am trying to test how the checkpoints work for restoring state, but not sure how to run a new instance of a flink job, after I have cancelled it, using the checkpoints which I store in the filesystem of the job manager, e.g. /opt/flink/checkpoints. I have tried passing the checkpoint as

Re: problem with the heartbeat interval feature

2024-05-17 Thread Thomas Peyric
Thanks Hongshun for your response! On Fri, May 17, 2024 at 07:51, Hongshun Wang wrote: > Hi Thomas, > > The Debezium docs say: For the connector to detect and process events from > a heartbeat table, you must add the table to the PostgreSQL publication > specified by the publication.name >

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread gongzhongqiang
Congratulations! Thanks to all contributors. Best, Zhongqiang Gong Qingsheng Ren wrote on Fri, May 17, 2024 at 17:33: > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration tool for real time > data and

RocksDB - Failed to clip DB after initialization - end key comes before start key

2024-05-17 Thread Francesco Leone
Hi, We are facing a new issue related to RocksDB when deploying a new version of our job, which adds 3 more operators. We are using Flink 1.17.1 with RocksDB on Java 11. We get an exception from another pre-existing operator during its initialization. That operator and the new ones have

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Hang Ruan
Congratulations! Thanks for the great work. Best, Hang Qingsheng Ren wrote on Fri, May 17, 2024 at 17:33: > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration tool for real time > data and batch data, bringing

Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Leonard Xu
Congratulations! Thanks Qingsheng for the great work and all contributors involved!! Best, Leonard > On May 17, 2024 at 5:32 PM, Qingsheng Ren wrote: > > The Apache Flink community is very happy to announce the release of > Apache Flink CDC 3.1.0. > > Apache Flink CDC is a distributed data integration

[ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Qingsheng Ren
The Apache Flink community is very happy to announce the release of Apache Flink CDC 3.1.0. Apache Flink CDC is a distributed data integration tool for real time data and batch data, bringing the simplicity and elegance of data integration via YAML to describe the data movement and transformation
