QiuMM edited a comment on issue #5869: Period load rules should include the 
future by default
URL: 
https://github.com/apache/incubator-druid/issues/5869#issuecomment-425671460
 
 
   In conclusion, I think the current design of the load/drop rules is flexible 
and able to handle almost all scenarios. However, the current `PeriodDropRule` 
is impractical because it always drops the most recent data. So if people want 
to `retain 30 days` of data, they cannot use `PeriodDropRule` and instead have 
to configure: load 30 days, drop forever. And because people use the `drop 
forever` rule, the following happens:
   > 2. The user loads some data from slightly in the future (maybe some clocks 
are running a bit fast or slow) using streaming ingestion. This creates a 
segment with an interval that is in the future.
   > 3. The coordinator disables the segment immediately upon noticing it 
(since it is not within the last 30 days).
   > 4. The Kafka tasks time out during handoff (because the segments are never 
loaded).
   > 5. And after that timeout, the data that was slightly in the future is 
still not available!
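
   The sequence quoted above can be reproduced with a minimal sketch. This is 
a simplified model, not Druid's actual code: a segment is reduced to a 
(start, end) interval, rules are tried in order, and the first match wins. The 
function names and the reference time are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Simplified model (assumption): each rule returns "load", "drop", or None
# when it does not apply to the segment interval.

def load_by_period(period):
    """Match segments that overlap the window [now - period, now]."""
    def rule(start, end, now):
        return "load" if start < now and end > now - period else None
    return rule

def drop_forever(start, end, now):
    """Match every segment."""
    return "drop"

def evaluate(rules, start, end, now):
    """Return the action of the first rule that matches."""
    for rule in rules:
        action = rule(start, end, now)
        if action is not None:
            return action
    return "no rule matched"

now = datetime(2018, 1, 1, 12, 0)  # arbitrary reference time
rules = [load_by_period(timedelta(days=30)), drop_forever]

recent = (now - timedelta(hours=2), now - timedelta(hours=1))
# Segment created by a producer whose clock runs slightly fast.
future = (now + timedelta(minutes=5), now + timedelta(hours=1))

print(evaluate(rules, *recent, now))  # load
print(evaluate(rules, *future, now))  # drop
```

   In this model the future segment falls through the period load rule and is 
caught by `drop forever`, which corresponds to step 3 above.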
   
   I think there are two ways to solve this:
   1. Make period load rules include the future by default.
   2. Add a new drop rule, or modify the current `PeriodDropRule`, to support 
`drop before a period`. Then if people want to `retain 30 days` of data, they 
can configure: drop before 30 days, load forever.
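
   Both options can be compared in a small sketch. It is only an illustration 
under assumptions: `include_future` and `drop_before_period` are hypothetical 
names, segments are modelled as (start, end) intervals, and the first matching 
rule wins. It is not Druid's implementation.

```python
from datetime import datetime, timedelta

def load_by_period(period, include_future=False):
    """Option 1: period load rule, optionally matching future segments too."""
    def rule(start, end, now):
        reaches_window = end > now - period
        not_in_future = start < now
        if reaches_window and (include_future or not_in_future):
            return "load"
        return None
    return rule

def drop_before_period(period):
    """Option 2 (hypothetical): match segments that end before now - period."""
    def rule(start, end, now):
        return "drop" if end <= now - period else None
    return rule

def drop_forever(start, end, now):
    return "drop"

def load_forever(start, end, now):
    return "load"

def evaluate(rules, start, end, now):
    """Return the action of the first rule that matches."""
    for rule in rules:
        action = rule(start, end, now)
        if action is not None:
            return action
    return "no rule matched"

now = datetime(2018, 1, 1, 12, 0)  # arbitrary reference time
thirty_days = timedelta(days=30)

option_1 = [load_by_period(thirty_days, include_future=True), drop_forever]
option_2 = [drop_before_period(thirty_days), load_forever]

segments = {
    "old": (now - timedelta(days=40), now - timedelta(days=39)),
    "recent": (now - timedelta(hours=2), now - timedelta(hours=1)),
    "future": (now + timedelta(minutes=5), now + timedelta(hours=1)),
}

for name, (start, end) in segments.items():
    print(name, evaluate(option_1, start, end, now),
          evaluate(option_2, start, end, now))
```

   Under these assumptions both rule chains drop the old segment and keep the 
recent and future ones, so the slightly-in-the-future segment is no longer 
disabled.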
   
   I prefer the second way, and I would rather modify the current 
`PeriodDropRule` than add a new one, because the current one is very 
impractical. IMO nobody would want to use such a drop rule.
