jihoonson commented on issue #8249: ability to let user configure segment 
version in indexing task
URL: 
https://github.com/apache/incubator-druid/issues/8249#issuecomment-518803775
 
 
   I think it could depend on the lock granularity (segment lock vs time chunk 
lock) and the rollup mode (perfect rollup vs best-effort rollup). 
   
   I understand that your use case could need this kind of feature. But before we 
talk about implementation details, I'm wondering whether this is really a good 
idea. Even though the Hadoop task already supports a custom segment version, I 
feel it's a hacky way to bypass Druid's segment versioning system, and it could 
be hard to use and even dangerous if something goes wrong (for example, users 
might unexpectedly see stale data). Also, it seems very odd to me that indexing 
tasks could generate segments that are already overshadowed by existing 
segments. That would just be a waste of time and resources, I guess.
   
   > We have an use case (for Parallel index task and Local Index task) where 
the overshadowing should happen based on when the data was generated by the ETL 
pipelines and not when Druid indexing is running for those which could many 
times run in different order for many reasons e.g. Druid tasks may fail and are 
resubmitted.
   
   I guess you're using some kind of workflow scheduler and, ideally, this 
issue should be addressed in that tool. Do we need this because it's too hard or 
complex to guarantee the proper job execution order in the tool? 
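
   To make the ordering concern concrete, here is a minimal sketch of version-based overshadowing. It is a simplified model, not Druid's actual timeline code: it assumes all segments cover exactly the same interval and that versions are timestamp strings compared lexicographically. The `Segment` class, the `visible` helper, and the sample timestamps and labels are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    interval: str  # time chunk the segment covers
    version: str   # timestamp string; a higher string wins (assumption)
    payload: str   # hypothetical label for the ETL output it holds


def visible(segments):
    """Return, per interval, the segment with the highest version."""
    winners = {}
    for seg in segments:
        current = winners.get(seg.interval)
        if current is None or seg.version > current.version:
            winners[seg.interval] = seg
    return winners


interval = "2019-08-01/2019-08-02"

# ETL run B produced newer data than run A, but the indexing task for A
# failed and was resubmitted, so it ran after the task for B.

# Version taken from when the indexing task ran: the retried task wins,
# so the older data becomes visible.
by_task_time = [
    Segment(interval, "2019-08-05T10:00:00.000Z", "etl-run-B"),
    Segment(interval, "2019-08-05T11:00:00.000Z", "etl-run-A (retried)"),
]

# Version taken from when the ETL pipeline generated the data: run B stays
# visible, and the retried task's segment is overshadowed as soon as it is
# published, which is the wasted work mentioned above.
by_etl_time = [
    Segment(interval, "2019-08-04T02:00:00.000Z", "etl-run-B"),
    Segment(interval, "2019-08-04T01:00:00.000Z", "etl-run-A (retried)"),
]

print(visible(by_task_time)[interval].payload)
print(visible(by_etl_time)[interval].payload)
```

   The second case is the behavior the use case asks for, but it also shows the cost: the retried task spends time and resources building a segment that is never served.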
