Re: CleanUpInRocksDbCompactFilter

2023-06-15 Thread Hangxiang Yu
Hi, Patricia. In my opinion, This parameter balances the trade-off between the read/write performance and storage space utilization (of course, smaller state also means better performance for the future). I think the right value of longtimeNumberOfQueries depends on several factors, such as the

Re: Store a state at a RDBMS before TTL passes by

2023-06-15 Thread Shammon FY
Hi Anastasios, What you want sounds like a session window [1], maybe you can refer to the doc for more details. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#session-windows Best, Shammon FY On Thu, Jun 15, 2023 at 10:03 PM Anastasios Makris <

CleanUpInRocksDbCompactFilter

2023-06-15 Thread patricia lee
Hi, I am currently migrating our flink project from 1.8 to 1.17. The cleanUpInRocksDbCompactFilter() now accepts longtimeNumberOfQueries() as parameter. The question is how would we know the right value. We set to 1000 temporarily, is there a default value to set. Regards, Patricia

Re: Watermark idleness and alignment - are they exclusive?

2023-06-15 Thread Ken Krugler
I think you’re hitting this issue: https://issues.apache.org/jira/browse/FLINK-31632 Fixed in 1.16.2, 1.171. — Ken > On Jun 15, 2023, at 1:39 PM, Piotr Domagalski wrote: > > Hi all! > > We've been experimenting with watermark alignment

Watermark idleness and alignment - are they exclusive?

2023-06-15 Thread Piotr Domagalski
Hi all! We've been experimenting with watermark alignment in Flink 1.15 and observed an odd behaviour that I couldn't find any mention of in the documentation. With the following strategy: WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(60)) .withTimestampAssigner((e, t) ->

回复:(无主题)

2023-06-15 Thread 海风
多谢多谢 回复的原邮件 | 发件人 | Weihua Hu | | 日期 | 2023年06月14日 12:32 | | 收件人 | user-zh@flink.apache.org | | 抄送至 | | | 主题 | Re: (无主题) | > > 这个状态变量是否需要用transient来修饰 ValueState 再 Rich fuction 的 open 方法中被初始化,不应该被序列化和反序列化,建议使用 transient 来修饰。 但实际上自定义函数的序列化、反序列化只在任务部署阶段执行,而且初始状态下 ValueState 的值是

Store a state at a RDBMS before TTL passes by

2023-06-15 Thread Anastasios Makris
Hi Flink users, I created a KeyedStream that tracks for each user of my website some metrics. It's time a user produces an event the metrics are recomputed and change. I would like to keep the outcome of a user's session at an RDBMS, which will be a single row. The first and obvious solution

Re: AsyncFunction vs Async Sink

2023-06-15 Thread Teoh, Hong
Hi Lu, > 1. Is there any problem if we use Async Function for such a user case? We can > simply drop the output and use Unordered mode. As far as I can tell, it is similar, other than the retry strategy available for AsyncFunctions and batching for Async Sink. Both should work on Flink. >

Re: Interaction between idling sources and watermark alignment

2023-06-15 Thread Teoh, Hong
Hi Alexis, below is my understanding: > I see that, in Flink 1.17.1, watermark alignment will be supported (as beta) > within a single source's splits and across different sources. I don't see > this explicitly mentioned in the documentation, but I assume that the concept > of "maximal drift"

Re: [DISCUSS] Status of Statefun Project

2023-06-15 Thread Martijn Visser
Let me know if you have a PR for a Flink update :) On Thu, Jun 8, 2023 at 5:52 PM Galen Warren via user wrote: > Thanks Martijn. > > Personally, I'm already using a local fork of Statefun that is compatible > with Flink 1.16.x, so I wouldn't have any need for a released version > compatible

Re: flink sql作业指标名称过长把prometheus内存打爆问题

2023-06-15 Thread daniel sun
退订 On Thu, Jun 15, 2023 at 7:23 PM im huzi wrote: > 退订 > > On Tue, Jun 13, 2023 at 08:51 casel.chen wrote: > > > 线上跑了200多个flink > > > sql作业,接了prometheus指标(prometheus定期来获取作业指标)监控后没跑一会儿就将prometheus内存打爆(开了64GB内存),查了一下是因为指标名称过长导致的。 > > flink > > >

Re: flink sql作业指标名称过长把prometheus内存打爆问题

2023-06-15 Thread im huzi
退订 On Tue, Jun 13, 2023 at 08:51 casel.chen wrote: > 线上跑了200多个flink > sql作业,接了prometheus指标(prometheus定期来获取作业指标)监控后没跑一会儿就将prometheus内存打爆(开了64GB内存),查了一下是因为指标名称过长导致的。 > flink > sql作业的指标名称一般是作业名称+算子名称组成的,而算子名称是由sql内容拼出来的,在select字段比较多或sql较复杂的情况下容易生成过长的名称, > 请问这个问题有什么好的办法解决吗?

Re:Re: flink写kafka 事务问题

2023-06-15 Thread xuguang
是这个原因,学习了,感谢! 在 2023-06-15 16:25:30,"yuanfeng hu" 写道: >消费者要设置事务隔离级别 > >> 2023年6月15日 16:23,163 写道: >> >> 据我了解,kafka支持事务,开启checkpoint及exactly-once后仅当checkpoint执行完毕后才能将数据写入kafka中。测试:flink读取kafka的topic >> a写入topic b,开启checkpoint及exactly-once,flink未执行完新一次的checkpoint,但topic >>

Re: flink写kafka 事务问题

2023-06-15 Thread yuanfeng hu
消费者要设置事务隔离级别 > 2023年6月15日 16:23,163 写道: > > 据我了解,kafka支持事务,开启checkpoint及exactly-once后仅当checkpoint执行完毕后才能将数据写入kafka中。测试:flink读取kafka的topic > a写入topic b,开启checkpoint及exactly-once,flink未执行完新一次的checkpoint,但topic > b已经可以消费到新数据,这是什么原因?请大家指教!

flink写kafka 事务问题

2023-06-15 Thread 163
据我了解,kafka支持事务,开启checkpoint及exactly-once后仅当checkpoint执行完毕后才能将数据写入kafka中。测试:flink读取kafka的topic a写入topic b,开启checkpoint及exactly-once,flink未执行完新一次的checkpoint,但topic b已经可以消费到新数据,这是什么原因?请大家指教!