Re: Shaded zookeeper - curator mismatch?

2022-03-04 Thread Zhanghao Chen
Hi Filip, Could you share the version of the ZK server you are connecting to? Best, Zhanghao Chen From: Filip Karnicki Sent: Friday, March 4, 2022 23:12 To: user Subject: Shaded zookeeper - curator mismatch? Hi, I believe there's a mismatch in shaded

Re: Question about Flink counters

2022-03-04 Thread Zhanghao Chen
Hi Shane, Flink provides a generic counter interface with a few implementations. The default implementation SimpleCounter, which is not thread-safe, is used when you calling counter(String name) on a MetricGroup. Therefore, you'll need to use your own thread-safe implementation, check out the

flink1.14.0 temporal join hive

2022-03-04 Thread guanyq
kafka实时流关联hive的最新分区表数据时,关于缓存刷新的问题 'streaming-source.monitor-interval'='12 h' 这个参数我理解是:按照启动开始时间算起,每12小时读取一下最新分区的数据是吧? 还有个问题是读取最新分区的时间间隔之间,实时流里面进入了预关联新分区的数据,那么是不是就相当于关联的还是上一次的最新分区数据吧?

Re: Controlling group partitioning with DataStream

2022-03-04 Thread Dario Heinisch
Hi, I think you are looking for this answer from David: https://stackoverflow.com/questions/69799181/flink-streaming-do-the-events-get-distributed-to-each-task-slots-separately-acc I think then you could technically create your partitioner - though little bit cubersome - by mapping your

Controlling group partitioning with DataStream

2022-03-04 Thread Ken Krugler
Hi all, I need to be able to control which slot a keyBy group goes to, in order to compensate for a badly skewed dataset. Any recommended approach to use here? Previously (with a DataSet) I used groupBy followed by a withPartitioner, and provided my own custom partitioner. I posted this same

Question about Flink counters

2022-03-04 Thread Shane Bishop
Hi all, For Flink counters [1], are increment operations guaranteed to be atomic across all parallel tasks? I.e., is there a guarantee that the counter values will not be higher than expected? Thanks, Shane --- [1]

Re: Source API question around idle (expensive) SplitFetchers being shutdown.

2022-03-04 Thread Roman Khachatryan
Hi, If I understand the code correctly, the only option is to implement a custom SplitFetcherManager. There, you can either: 1) override maybeShutdownFinishedFetchers(), or 2) override createSplitFetcher() to return a custom fetcher; that fetcher would override isIdle() and return true after some

Source API question around idle (expensive) SplitFetchers being shutdown.

2022-03-04 Thread Jonathan Weaver
I am working on developing a custom source with the new Source api. What I'm noticing is that during periods of low incoming data it repeatedly will shutdown and restart the fetchers when the split assignments are empty and periodically added. I get log message such as

Shaded zookeeper - curator mismatch?

2022-03-04 Thread Filip Karnicki
Hi, I believe there's a mismatch in shaded zookeeper/curator dependencies. I see that curator 4.2.0 needs zookeeper 3.5.4-beta, but it's still used in flink-shaded-zookeeper-34, which as far as I can tell is used by flink runtime 1.14.3

Re: Task Manager shutdown causing jobs to fail

2022-03-04 Thread Terry Wang
Hi, Puneet~ AFAIK, that should be expected behavior that jobs on crashed TaskManager restarts. HA means there is no single point risk but Flink job still need to through failover to ensure state and data consistency. You may refer

Re: Incremental checkpointing & RocksDB Serialization

2022-03-04 Thread Yun Tang
Hi Vidya, > Why is the incremental checkpointing taking more time for the snapshot at the > end of the window duration? I guess that this is because the job is under back pressure on end of window. You can expand the checkpoint details to see whether that the async duration of each task is

Re: Version Upgrade of FlinkSQL (1.10 to 1.12)

2022-03-04 Thread zihao chen
Hi, Martijn, Thanks for your information. It seems that the situation is similar to what I know, I will follow FLIP-190. Also congratulations on becoming a Flink committer! Best regards, Chen Zihao Martijn Visser 于2022年3月4日周五 16:18写道: > Hi, > > Per the documentation [1] stateful

[statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-04 Thread Filip Karnicki
Hi All! We're running a statefun uber jar on a shared cloudera flink cluster, the latter of which launches with some ancient protobuf dependencies because of reasons[1]. Setting the following flink-config settings on the entire cluster classloader.parent-first-patterns.additional:

Re: Version Upgrade of FlinkSQL (1.10 to 1.12)

2022-03-04 Thread Martijn Visser
Hi, Per the documentation [1] stateful upgrades for SQL are currently not supported when upgrading from one minor version to another. There's ongoing work to improve this (via FLIP-190 [2]) but that's currently not yet available. Best regards, Martijn Visser https://twitter.com/MartijnVisser82