In 1.5.x streaming OLAP, Kylin uses a timestamp range to seek the start/end offsets in Kafka, which is done by binary search. It allows a margin window, but messages that arrive later than the margin will be lost.
Now we're working on a new implementation that strictly uses offsets to fetch the new messages each time, so no messages will be lost.

2016-09-13 15:53 GMT+08:00 Billy(Yiming) Liu <[email protected]>:

> The current design is still an experimental approach. Kafka cannot
> guarantee global ordering, so we have to find another solution. The new
> Streaming OLAP design will rely on the Kafka partition order instead of
> the app timestamp. The code is still under the KYLIN-1726 branch.
>
> 2016-09-13 15:46 GMT+08:00 Mario Copperfield <[email protected]>:
>
> > OK, thank you.
> >
> > On Tue, Sep 13, 2016 at 3:27 PM, Sarnath K <[email protected]> wrote:
> >
> > > Yes, that's true. If you are looking at an app timestamp (event origin
> > > time), then we can't binary search on it, though binary search may be
> > > a good approximation for the common case.
> > > Not sure what Kylin is designed for. Let's wait to hear from the
> > > experts!
> > >
> > > On Sep 13, 2016 12:49, "Mario Copperfield" <[email protected]> wrote:
> > >
> > > > It's true that data appears in order in Kafka, but that does not
> > > > guarantee that the timestamps of the data are ordered. In fact, in
> > > > real time the data always arrives out of order.
> > > >
> > > > On Tue, Sep 13, 2016 at 3:14 PM, Sarnath K <[email protected]> wrote:
> > > >
> > > > > I am not sure about what Kylin does. But I know that data appears
> > > > > in order in the Kafka broker. The consumer, however, can consume
> > > > > in any order it likes. So offsets are driven more by consumers,
> > > > > and Kafka does not have a say in it.
> > > > > Sharing this based on my preliminary understanding of how Kafka
> > > > > works.
> > > > > Best,
> > > > > Sarnath
> > > > >
> > > > > On Sep 13, 2016 12:41, "Mario Copperfield" <[email protected]> wrote:
> > > > >
> > > > > > Dear all,
> > > > > > I am using the Kylin streaming build, and when I read the code
> > > > > > for this module, I found that Kylin uses binary search to find
> > > > > > the offset closest to the start timestamp. Does that still work
> > > > > > if the data in Kafka is not ordered?
> > > > > > Thanks, and waiting for your reply.
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > > Amuro Copperfield
> > > >
> > > > --
> > > > Best regards,
> > > > Amuro Copperfield
> >
> > --
> > Best regards,
> > Amuro Copperfield
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)

--
Best regards,
Shaofeng Shi
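
For anyone following the thread, here is a minimal sketch of the issue discussed above. It is a toy in-memory model of one partition, not Kylin's code and not the Kafka client API; `Message`, `seek_offset`, `fetch_by_time`, and `fetch_by_offset` are hypothetical names made up for illustration, and the timestamps are invented sample data. It shows a binary search on app timestamps with a margin window missing a message that arrives later than the margin, while reading by offset range picks up everything.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    offset: int     # position in the partition, strictly increasing
    timestamp: int  # app (event origin) timestamp, may be out of order


def seek_offset(log: List[Message], ts: int) -> int:
    """Binary search for the first offset whose timestamp is >= ts.

    This is only correct when timestamps are non-decreasing in offset
    order, which is the assumption questioned in this thread.
    """
    lo, hi = 0, len(log)
    while lo < hi:
        mid = (lo + hi) // 2
        if log[mid].timestamp < ts:
            lo = mid + 1
        else:
            hi = mid
    return lo


def fetch_by_time(log: List[Message], start_ts: int, end_ts: int,
                  margin: int) -> List[Message]:
    """Seek a timestamp range widened by `margin`, then filter to the window."""
    lo = seek_offset(log, start_ts - margin)
    hi = seek_offset(log, end_ts + margin)
    return [m for m in log[lo:hi] if start_ts <= m.timestamp < end_ts]


def fetch_by_offset(log: List[Message], start: int, end: int) -> List[Message]:
    """Read a contiguous offset range; the timestamp plays no role."""
    return log[start:end]


# The message at offset 7 (timestamp 103) arrives well after its peers.
timestamps = [100, 102, 105, 108, 110, 112, 115, 103, 118, 120, 123]
log = [Message(i, ts) for i, ts in enumerate(timestamps)]

# Two consecutive time windows, [100, 110) and [110, 120), with margin 2.
by_time = (fetch_by_time(log, 100, 110, margin=2)
           + fetch_by_time(log, 110, 120, margin=2))
expected = [m for m in log if 100 <= m.timestamp < 120]
missed = [m.offset for m in expected if m not in by_time]
print(missed)  # [7]

# Two consecutive offset ranges cover every message exactly once.
by_offset = fetch_by_offset(log, 0, 6) + fetch_by_offset(log, 6, len(log))
print(by_offset == log)  # True
```

A larger margin would catch this particular message, but any fixed margin can be exceeded by a sufficiently late one, which is why relying on partition offsets avoids the loss.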
