QiuMM edited a comment on issue #5869: Period load rules should include the future by default
URL: https://github.com/apache/incubator-druid/issues/5869#issuecomment-425671460

In conclusion, I think the current design of the load/drop rules is flexible and able to deal with almost all scenarios. But the current `PeriodDropRule` is an impractical rule, because it always drops recent data. So if people want to `retain 30 days` of data, they cannot use `PeriodDropRule` and instead have to do something like: load 30 days, drop forever. And because people have used the `drop forever` rule, the following things occurred:

> 2. The user loads some data from slightly in the future (maybe some clocks are running a bit fast or slow) using streaming ingestion. This creates a segment with an interval that is in the future.
> 3. The coordinator disables the segment immediately upon noticing it (since it is not within the last 30 days).
> 4. The Kafka tasks time out during handoff (because the segments are never loaded).
> 5. And after that timeout, the data that was slightly in the future is still not available!

I think there are two ways to solve this:

1. Period load rules include the future by default.
2. Add a new drop rule, or modify the current `PeriodDropRule`, to support `drop before a period`. Then if people want to `retain 30 days` of data, they can do it like this: drop 30 days before, load forever.

I prefer the second way, and I want to modify the current `PeriodDropRule` rather than add a new one, because the current one is very impractical. IMO nobody would want to use such a drop rule.
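
To make the scenario concrete, here is a rough Python sketch of how a first-match rule chain treats a segment whose interval lies slightly in the future. It is a simplified model written for this comment, not Druid's actual code: the function names (`load_by_period`, `drop_before_period`, `decide`, and so on) and the `include_future` flag are hypothetical, and the matching logic (interval overlap for period load, "ends before now minus period" for the proposed drop rule) is an assumption about the intended semantics.

```python
from datetime import datetime, timedelta

# Each rule is an (action, applies) pair; applies(start, end, now) -> bool.
# All names here are illustrative, not Druid's API.

def load_by_period(period, include_future=False):
    def applies(start, end, now):
        # Window is [now - period, now), or open-ended if include_future.
        window_end = datetime.max if include_future else now
        return start < window_end and end > now - period
    return ("load", applies)

def load_forever():
    return ("load", lambda start, end, now: True)

def drop_forever():
    return ("drop", lambda start, end, now: True)

def drop_before_period(period):
    # Proposed rule: drop only segments that end before now - period.
    return ("drop", lambda start, end, now: end <= now - period)

def decide(rules, start, end, now):
    # The first rule that applies to the segment wins.
    for action, applies in rules:
        if applies(start, end, now):
            return action
    return "none"

now = datetime(2018, 9, 29, 12, 0)
days30 = timedelta(days=30)
hour = timedelta(hours=1)

# A segment one hour in the future, a recent one, and an old one.
future = (now + hour, now + 2 * hour)
recent = (now - timedelta(days=9), now - timedelta(days=9) + hour)
old = (now - timedelta(days=59), now - timedelta(days=59) + hour)

current_rules = [load_by_period(days30), drop_forever()]
option_1 = [load_by_period(days30, include_future=True), drop_forever()]
option_2 = [drop_before_period(days30), load_forever()]

for name, rules in [("current", current_rules),
                    ("option 1", option_1),
                    ("option 2", option_2)]:
    print(name, [decide(rules, s, e, now) for s, e in (future, recent, old)])
```

Under these assumptions, only the current rule set drops the future segment; both proposed options keep it loaded while still dropping data older than 30 days.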
