Thanks for the reply. Maybe we need an external priority queue.

Happy Chinese New Year!

------------------------------------------------------------------
From: Aljoscha Krettek <aljos...@apache.org>
Sent: Wednesday, January 25, 2017, 18:38
To: dev <dev@beam.apache.org>; lzljs3620320 <lzljs3620...@aliyun.com>; Kenneth Knowles <k...@google.com>
Subject: Re: How to implement Timer in runner

Hi Jingsong,

you're right, it is indeed somewhat tricky to find a good data structure for out-of-core timers. That's why we have them in memory in Flink for now and that's also why I'm afraid I don't have any good advice for you right now. We're aware of the problem in Flink but we're not yet working on a concrete solution.

Cheers,
Aljoscha

On Tue, 24 Jan 2017 at 21:42 Dan Halperin <dhalp...@apache.org> wrote:

Hi Jingsong,
Sorry for the delayed response; this email ended up being misclassified by my mail server and I missed it.

Maybe Kenn or Aljoscha has suggestions on how runners can best implement timers?

Dan

On Thu, Jan 19, 2017 at 9:55 PM, lzljs3620320 <lzljs3620...@aliyun.com> wrote:

> Hi there,
>
> I'm working on the Beam integration for an internal system at Alibaba.
> Currently most runners, such as Flink and Apex, keep timers in memory (I do
> not know how Google Dataflow implements them). In our scenario, however, the
> unbounded data has a large number of keys, so in-memory timers lead to OOM.
> We therefore want to store timers in state (RocksDB on disk). The problem is
> how to extract the fired event-time timers when the input watermark is
> refreshed. Do we have to scan all keys and timers? (A timer currently
> consists of key, id, namespace, timestamp, and domain.) Is there a better
> implementation? I'm wondering if you could give me some advice on how to
> implement timers in state efficiently. Thank you!
>
> Best,
> Jingsong Lee
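One way to avoid the full scan asked about above is to order timers by timestamp first, so that every timer at or before the watermark forms a prefix of the index. Nobody in this thread proposes this concretely, so treat it as a rough sketch. It assumes the state backend can iterate keys in sorted order (RocksDB iterators do, under the default bytewise comparator) and uses an in-memory `TreeSet` as a stand-in for that store. The names `TimerIndex`, `Timer`, and `advanceWatermark` are hypothetical and not part of Beam, and the domain field is left out on the assumption of one index per time domain.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: timers sorted by timestamp, so due timers form a prefix. */
final class TimerIndex {

    /** Fields follow the timer described in the thread; domain is omitted (assumption). */
    static final class Timer {
        public final long timestamp;
        public final String key;
        public final String namespace;
        public final String id;

        Timer(long timestamp, String key, String namespace, String id) {
            this.timestamp = timestamp;
            this.key = key;
            this.namespace = namespace;
            this.id = id;
        }
    }

    // Timestamp comes first so that everything at or before the watermark sorts
    // to the front; the remaining fields only make entries unique.
    private static final Comparator<Timer> ORDER =
        Comparator.<Timer>comparingLong(t -> t.timestamp)
            .thenComparing(t -> t.key)
            .thenComparing(t -> t.namespace)
            .thenComparing(t -> t.id);

    // Stand-in for a sorted on-disk store; a real backend would encode the same
    // ordering into its byte keys.
    private final TreeSet<Timer> timers = new TreeSet<>(ORDER);

    void set(Timer t) {
        timers.add(t);
    }

    void delete(Timer t) {
        // Needs the exact timestamp the timer was set with, since it is part of the sort key.
        timers.remove(t);
    }

    /** Removes and returns all timers with timestamp <= watermark, earliest first. */
    List<Timer> advanceWatermark(long watermark) {
        List<Timer> fired = new ArrayList<>();
        Iterator<Timer> it = timers.iterator();
        while (it.hasNext()) {
            Timer t = it.next();
            if (t.timestamp > watermark) {
                break; // everything after this point is in the future
            }
            fired.add(t);
            it.remove();
        }
        return fired;
    }

    int size() {
        return timers.size();
    }
}
```

With this layout, advancing the watermark touches only the timers that fire plus one more, independent of the total number of keys. The cost moves to deletion and resetting: because the timestamp is part of the sort key, a real implementation would likely need a second lookup from (key, namespace, id) to the currently set timestamp.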