Hello Andras,

The description in the "new-nrt-streaming" blog post is correct: late messages are built into the next segments, the segments' time ranges may overlap, and Kylin scans every segment whose time range matches the query's time range.
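To illustrate the segment selection described above, here is a minimal Python sketch. It is not Kylin's actual code: the `Segment` class, its `min_ts`/`max_ts` fields, and the `segments_to_scan` function are hypothetical names, and timestamps are plain integers for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical stand-in for a cube segment; not Kylin's real class.
    name: str
    min_ts: int  # earliest event time found in the segment (inclusive)
    max_ts: int  # latest event time found in the segment (inclusive)


def segments_to_scan(segments, query_start, query_end):
    """Return every segment whose [min_ts, max_ts] range intersects the query range."""
    return [
        s for s in segments
        if s.min_ts <= query_end and s.max_ts >= query_start
    ]


segments = [
    Segment("seg_1", 100, 200),
    Segment("seg_2", 190, 300),  # built later, but contains a late event stamped 190
    Segment("seg_3", 301, 400),
]

# A query over [180, 195] has to read both seg_1 and seg_2,
# because their time ranges overlap in that window.
print([s.name for s in segments_to_scan(segments, 180, 195)])
```

Because the late event lands in seg_2 rather than being dropped, a query over its event time still finds it, at the cost of scanning one more segment.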
I just closed KYLIN-1210, which had been overlooked. KYLIN-1744 is a prerequisite refactoring and a sub-task of KYLIN-1726; KYLIN-1726 was released in v1.6.0.

Thanks for your feedback!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

Andras Nagy <[email protected]> wrote on Wed, May 22, 2019 at 10:18 PM:

> Dear All,
>
> I have a question about the handling of events that arrive significantly
> later than the logical event timestamp, in streaming ingestion.
>
> In the blog post from 2016 at
> http://kylin.apache.org/blog/2016/10/18/new-nrt-streaming/ , I read this:
> "To let the late/early message can be queried, Cube segments allow overlap
> for the partition time dimension: each segment has a “min” date/time and a
> “max” date/time; Kylin will scan all segments which matched with the
> queried time scope. Figure 2 illurates this. ..."
>
> On the other hand, I found a ticket:
> https://issues.apache.org/jira/browse/KYLIN-1210 titled "Allowing segment
> overlap to solve streaming data completeness problem" which seems to be
> about the same issue, but its status is Open/unresolved.
>
> There is also another ticket:
> https://issues.apache.org/jira/browse/KYLIN-1744 titled "Separate
> concepts of source offset and date range on cube segments", which seems to
> be related again. This one is Closed/Fixed in 1.5.3.
>
> Can you please help to clarify this, what is the status of this
> capability?
> What is the best practice currently to handle late arrival of events with
> Kylin?
>
> Many thanks,
> Andras
>
