JulianJaffePinterest commented on issue #9463: Add namespaces to Druid segments 
within a data source
URL: https://github.com/apache/druid/issues/9463#issuecomment-598465621
 
 
   Let me explain our use case, which will hopefully clarify some of the 
questions here:
   
   We have multiple independent pipelines that produce data we serve as a 
single data source (to give an oversimplified example, you can imagine that one 
pipeline calculates clicks and impressions while another pipeline calculates 
conversions). The outputs of these pipelines are queried together, and they 
share some dimensions and have some unique dimensions, as well as unique 
metrics. We also produce intra-day computations of these data sets that are 
updated with daily true-ups. The output of an intra-day run needs to overshadow 
any existing output for the interval it covers, which Druid's normal 
version-based overshadowing can handle. However, the intra-day output of the 
conversion pipeline shouldn't affect the output of any other pipeline, so 
simply synchronizing on version and using a linear shard spec won't work. Union 
data sources, if they were performant, could likely handle the simplified 
example above without much trouble, since it involves only a few distinct input 
sources. In our actual case, though, most queries cover over a hundred 
namespaces. At that scale, I'm not sure how performant unioning would be 
(possibly fine). The main benefit namespacing provides is 
that clients don't need to know or care about these details, nor do they need 
to know _which_ data sources to union. I agree with the overall point that this 
all could be done in the query layer (and perhaps even should be done there).
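
To make the desired behavior concrete, here is a minimal sketch of 
namespace-scoped overshadowing. It is not Druid's timeline code: the `Segment` 
class, its `namespace` field, and `visible_segments` are hypothetical names 
used only for illustration, and the sketch assumes segments cover identical 
intervals (no partial overlaps) and that version strings compare 
lexicographically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical model of a segment; not Druid's actual segment descriptor.
    interval: str   # e.g. "2020-03-12/2020-03-13"
    namespace: str  # assumed per-pipeline namespace, e.g. "conversions"
    version: str    # assumed to be lexicographically comparable


def visible_segments(segments):
    """Keep only the newest version within each (interval, namespace) pair.

    A newer version overshadows older segments in its own namespace and
    leaves every other namespace for that interval untouched.
    """
    latest = {}
    for seg in segments:
        key = (seg.interval, seg.namespace)
        if key not in latest or seg.version > latest[key]:
            latest[key] = seg.version
    return [s for s in segments if latest[(s.interval, s.namespace)] == s.version]


day = "2020-03-12/2020-03-13"
segments = [
    Segment(day, "clicks", "2020-03-12T00:00:00.000Z"),       # daily true-up
    Segment(day, "conversions", "2020-03-12T00:00:00.000Z"),  # older output
    Segment(day, "conversions", "2020-03-12T06:00:00.000Z"),  # intra-day run
]
for seg in visible_segments(segments):
    print(seg.namespace, seg.version)
```

In this sketch the intra-day conversions segment replaces the older conversions 
segment, while the clicks segment stays visible even though its version is 
older. Comparing versions across the whole interval instead would hide it.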

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

