glasser commented on issue #7046: index_parallel: support !appendToExisting 
with no explicit intervals
URL: https://github.com/apache/incubator-druid/pull/7046#issuecomment-464191157
 
 
   This does now work, but I'm realizing I would never actually want to use 
it in my use case: if the input data accidentally contains even one stray row 
from outside the interval you're trying to replace, the task will also replace 
(that is, delete) all existing data in that stray row's segment interval. 
 (And we just found a bug that could lead to rows derived from our Kafka 
backups being outside the interval you'd think they'd be in.)  Maybe I should 
add some scarier warnings to the docs?
   
   Part of me does feel like it would be easier to reason about if 
non-appending batch ingestion always required you to specify a target interval 
and always replaced *all* segments inside that interval on success (like I was 
thinking about in part 3 of 
https://github.com/apache/incubator-druid/issues/6989#issuecomment-461108169), 
but I'm sure there are use cases where that isn't desired.  The current 
semantics of "non-appending batch ingestion replaces any data that happens to 
be within segmentGranularity of specific rows produced by the firehose" seem 
harder for me to reason about.
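
To make the failure mode concrete, here is a rough Python sketch of the semantics described above. It is only a model, not Druid's code: it assumes a DAY segmentGranularity, and the names `day_bucket`, `intervals_replaced` and `check_rows_within` are hypothetical helpers invented for illustration. It shows how one stray row adds an unexpected segment interval to the set that gets replaced, and what a pre-flight check against an intended target interval could look like.

```python
from datetime import datetime, timedelta, timezone


def day_bucket(ts):
    """Return the [start, end) DAY bucket containing ts (DAY granularity is an assumption)."""
    start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return (start, start + timedelta(days=1))


def intervals_replaced(row_timestamps):
    """Model of the implicit-interval behaviour: every bucket touched by any row is replaced."""
    return sorted({day_bucket(ts) for ts in row_timestamps})


def check_rows_within(row_timestamps, target_start, target_end):
    """Hypothetical pre-flight guard: refuse to run if any row falls outside the target."""
    stray = [ts for ts in row_timestamps if not (target_start <= ts < target_end)]
    if stray:
        raise ValueError(
            f"{len(stray)} row(s) outside target interval, e.g. {stray[0].isoformat()}"
        )


utc = timezone.utc
target_start = datetime(2019, 2, 10, tzinfo=utc)
target_end = datetime(2019, 2, 11, tzinfo=utc)

rows = [
    datetime(2019, 2, 10, 8, 30, tzinfo=utc),
    datetime(2019, 2, 10, 17, 5, tzinfo=utc),
    datetime(2019, 1, 3, 12, 0, tzinfo=utc),  # the stray row
]

# The stray row pulls the 2019-01-03 bucket into the set of replaced intervals.
replaced = intervals_replaced(rows)
for start, end in replaced:
    print(f"{start.isoformat()}/{end.isoformat()}")
```

Calling `check_rows_within(rows, target_start, target_end)` on this input raises `ValueError`, which is roughly the protection that specifying an explicit target interval would give.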

