Re:Flink Memory Usage

2018-05-09 Thread
Hi Pedro, since you are using RocksDB backend, RocksDB will consume some extra native memory, sometimes the amount of that could be very large, because the default setting of RocksDB will keep a `BloomFilter` for every opened sst in memory, and the number of the opened sst is not limited by

Re: Externalized checkpoints and metadata

2018-04-26 Thread
Hi Juan, I think you are right and there maybe more then 3 companies implementing different solutions for this...I created a ticket to address it here https://issues.apache.org/jira/browse/FLINK-9260. Hope this could help to reduce other's redundant efforts on this...(If it could be

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread
Hi 潘, could you please check the number of kafka's partitions, I think if the {{number of kafka partition}} < {{parallelism of source node}}) then there can be some idle parallel which won't recevice any data... Best Regards, Sihua Zhou On 04/26/2018

Re: keyBy and parallelism

2018-04-12 Thread
Hi Christophe, I think what you want to do is "stream join", and I'm a bit confuse that if you have know there are only 8 keys then why would you still like to use 16 parallelisms? 8 of them will be idle(a waste of CPU). In the KeyedStream, the tuples with the same key will be sent to the

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread
Hi Lasse, I met that before. I think maybe the non-heap memory trend of the graph you attached is the "expected" result ... Because rocksdb will keep the a "filter (bloom filter)" in memory for every opened sst file by default, and the num of the sst file will increase by time, so it looks

Re: checkpoint stuck with rocksdb statebackend and s3 filesystem

2018-03-05 Thread
tates growth in the future. Am I right? Best Regards, Tony Wei 2018-03-06 10:55 GMT+08:00 周思华 <summerle...@163.com>: Hi Tony, Sorry for jump into, one thing I want to remind is that from the log you provided it looks like you are using "full checkpoint", this means that the

Re: A "per operator instance" window all ?

2018-02-18 Thread
Hi Julien, If I am not misunderstand, I think you can key your stream on a `Random.nextInt() % parallesm`, this way you can "group" together alerts from different and benefit from multi parallems. 发自网易邮箱大师 On 02/19/2018 09:08,Xingcan Cui wrote: Hi Julien, sorry

Re:Re: Problem with Flink restoring from checkpoints

2017-07-19 Thread
Hi Fran, is the DataTimeBucketer acts like a memory buffer and does't managed by flink's state? If so, then i think the problem is not about Kafka, but about the DateTimeBucketer. Flink won't take snapshot for the DataTimeBucketer if it not in any state. Best, Sihua Zhou At 2017-07-20

Is there some metric info about RocksdbBackend?

2017-06-30 Thread
Hi, Is there some metric info about RocksdbBackend in flink, like sst compact times, memtable dump times, block cache size and so on. Currently when using Rocksdb as backend it behavior is black for us and it consumption a lot of memory, i want to figure out it behavior via metric.

Re:Re: Error when set RocksDBStateBackend option in Flink?

2017-06-29 Thread
docs/api/java/io/Serializable.html On Thu, Jun 29, 2017 at 2:16 AM, 周思华 <summerle...@163.com> wrote: I use the follow code to set RocksDBStateBackend and it option, it can run correctly locally, but can't be submitted to cluster. Main.class: public static void main() { final StreamEx

Error when set RocksDBStateBackend option in Flink?

2017-06-29 Thread
I use the follow code to set RocksDBStateBackend and it option, it can run correctly locally, but can't be submitted to cluster. Main.class: public static void main() { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();