Re: Zookeeper HA with Kubernetes: Possible to use the same Zookeeper cluster w/multiple Flink Operators?

2023-09-20 Thread Gyula Fóra
Hi! The cluster-id for each FlinkDeployment is simply the name of the deployment, so they are all different within a given namespace. (In other words, they are not fixed, as your question suggests, but set automatically.) So there should be no problem sharing the ZK cluster. Cheers Gyula On Thu, 21
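To make the sharing concrete, here is a hedged sketch of a FlinkDeployment whose HA settings point at a shared ZooKeeper quorum. The deployment name, quorum hostnames, and storage path are placeholders, not values from the thread. Because the operator derives the cluster-id from the deployment name, two such deployments in one namespace get distinct ZK subtrees under the same shared root.

```yaml
# Sketch only: placeholder names throughout.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: job-a            # the HA cluster-id becomes "job-a" automatically
spec:
  flinkConfiguration:
    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-0.zk:2181,zk-1.zk:2181
    high-availability.zookeeper.path.root: /flink   # safe to share across deployments
    high-availability.storageDir: s3://my-bucket/ha # placeholder storage path
```

A second deployment named `job-b` with the identical `flinkConfiguration` block would store its metadata under `/flink/job-b`, never colliding with `/flink/job-a`.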

Flink Gzip Sink with Error

2023-09-20 Thread Yunhui Han
Hi all, I want to write JSON strings with gzip compression from Flink, following the demo on StackOverflow. I encountered a problem: there is an ill-formatted string at the
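One common cause of corrupt or ill-formatted gzip output is a `GZIPOutputStream` that is never closed, so the gzip trailer is never written. Below is a minimal stdlib sketch of the round-trip a gzip bulk writer has to get right; `GzipJsonLines`, `gzip`, and `gunzip` are illustrative names, not Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative helper showing the gzip round-trip a bulk writer wraps.
public class GzipJsonLines {

    // Compress one JSON line (newline-terminated) into gzip bytes.
    static byte[] gzip(String jsonLine) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write((jsonLine + "\n").getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } // close() writes the gzip trailer; skipping it yields a corrupt file
        return bos.toByteArray();
    }

    // Decompress gzip bytes back to the original string.
    static String gunzip(byte[] data) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the round-trip succeeds here but the files written by the sink are still unreadable, the problem is likely in when the wrapping writer flushes and closes the stream, not in the compression itself.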

RE: Re: Re: How to read flinkSQL job state

2023-09-20 Thread Yifan He via user
Hi Hangxiang, I still have one question about this problem: when using the DataStream API I know the key and value types I use in state because I defined the ValueStateDescriptor, but how can I get the ValueStateDescriptor in Flink SQL? Thanks, Yifan On 2023/09/07 06:16:41 Hangxiang Yu wrote: > Hi,

RE: About Flink parquet format

2023-09-20 Thread Kamal Mittal via user
Yes. Due to the below error, the Flink bulk writer never closes the part file and keeps creating new part files continuously. Does Flink not handle exceptions like the one below? From: Feng Jin Sent: 20 September 2023 05:54 PM To: Kamal Mittal Cc: user@flink.apache.org Subject: Re: About Flink parquet

Re: Flink CDC 2.0: how to solve the log backlog caused by very large historical data

2023-09-20 Thread jinzhuguang
Hi, besides these operational measures, does flink cdc itself offer any solution? For example, not having to read the binlog from the beginning in the incremental phase, since much of that data has in fact already been read. > On Sep 20, 2023, 21:00, Jiabao Sun wrote: > > Hi, > For production it is recommended to retain binlogs for at least 7 days, to improve tolerance of recovery time. > Also, you can try increasing the snapshot parallelism and resources to speed up the snapshot; after the snapshot completes you can restore from a savepoint and scale resources back down. > Best, > Jiabao >

Re: Does Flink 1.17 no longer support Hive 2.1?

2023-09-20 Thread yuxia
Revert this PR https://github.com/apache/flink/pull/19352, then rebuild the flink hive connector package. Best regards, Yuxia - Original Message - From: "迎风浪子" <576637...@qq.com.INVALID> To: "user-zh" Sent: Tuesday, September 19, 2023 5:20:58 PM Subject: Re: Does Flink 1.17 no longer support Hive 2.1? We are still using hive1.1.0; what should we do? --- Original Message --- From:

Zookeeper HA with Kubernetes: Possible to use the same Zookeeper cluster w/multiple Flink Operators?

2023-09-20 Thread Brian King
Hello Flink Users! We're attempting to deploy a Flink application cluster on Kubernetes, using the Flink Operator and Zookeeper for HA. We're using Flink 1.16 and I have a question about some of the Zookeeper configuration[0]: "high-availability.zookeeper.path.root" is described as "The root

Re: Using Flink k8s operator on OKD

2023-09-20 Thread Krzysztof Chmielewski
Thank you Zach, our flink-operator and flink deployments are in the same namespace, called "flink". We had executed what is described in [1] before my initial message. We are using OKD 4.6.0, which according to the docs uses k8s 1.19. The very same config works fine on "vanilla" k8s, but for

Test message

2023-09-20 Thread Krzysztof Chmielewski
Community, please forgive me for this message. This is a test, because all day my replies to my other user thread have been rejected by the email server. Sincere apologies, Krzysztof

Extract response stream out of a AsyncSinkBase operator

2023-09-20 Thread Bhupendra Yadav
Hey Everyone, We have a use case where we want to extract a response out of an AsyncSink operator (HTTP in our case) and perform more transformations on top of it. We implemented an HttpSink by following this blog https://flink.apache.org/2022/03/16/the-generic-asynchronous-base-sink/ . Since By
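A sink terminates the dataflow, so its responses are not available downstream; the usual alternative is an async operator stage (in Flink, the `AsyncDataStream` pattern) whose output is the response itself. Here is a hedged stdlib-only sketch of that shape using `CompletableFuture`; `AsyncResponsePipeline` and `fakeHttpCall` are hypothetical stand-ins, not the thread's actual HttpSink code.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Illustrative sketch: model each element's async HTTP call as a future whose
// result flows onward for further transformation, instead of ending in a sink.
public class AsyncResponsePipeline {

    // Stand-in for a real async HTTP client call.
    static CompletableFuture<String> fakeHttpCall(String request) {
        return CompletableFuture.supplyAsync(() -> "response:" + request);
    }

    // Operator-style stage: requests in, transformed responses out.
    static List<String> process(List<String> requests) {
        List<CompletableFuture<String>> futures = requests.stream()
                .map(AsyncResponsePipeline::fakeHttpCall)   // fire all calls
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)               // collect each response
                .map(String::toUpperCase)                   // downstream transformation
                .collect(Collectors.toList());
    }
}
```

In a Flink job the equivalent stage would be an async operator in the middle of the pipeline, with the real sink attached after the post-response transformations.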

Flink CDC 2.0: how to solve the log backlog caused by very large historical data

2023-09-20 Thread jinzhuguang
Take mysql cdc as an example: the current overall flow is to first sync the full data, then start incremental sync. From the code, the initial incremental offset chosen is the smallest high watermark across all full-snapshot splits. If the full dataset is very large (TB scale), the full sync may take a long time, yet the binlog cannot be deleted in the meantime, so the backlog takes up a lot of space. Is there a common solution to this problem?

Re: About Flink parquet format

2023-09-20 Thread Feng Jin
Hi, I tested it on my side and also got the same error. This should be a limitation of Parquet. ``` java.lang.IllegalArgumentException: maxCapacityHint can't be less than initialSlabSize 64 1 at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:57)

Re: Flink CDC 2.0: how to solve the log backlog caused by very large historical data

2023-09-20 Thread Jiabao Sun
Hi, for production it is recommended to retain binlogs for at least 7 days, to improve tolerance of recovery time. Also, you can try increasing the snapshot parallelism and resources to speed up the snapshot; after the snapshot completes you can restore from a savepoint and scale resources back down. Best, Jiabao -- From:jinzhuguang Send Time:Wednesday, September 20, 2023 20:56 To:user-zh Subject:Flink CDC 2.0: how to solve the log backlog caused by very large historical data
