JulianJaffePinterest commented on issue #9463: Add namespaces to Druid segments within a data source
URL: https://github.com/apache/druid/issues/9463#issuecomment-598465621

Let me explain our use case, which will hopefully clarify some of the questions here:

We have multiple independent pipelines that produce data we serve as a single data source (to give an oversimplified example, imagine that one pipeline calculates clicks and impressions while another calculates conversions). The outputs of these pipelines are queried together; they share some dimensions, and each also has its own unique dimensions and metrics. We also produce intra-day computations of these data sets that are updated with daily true-ups. The output of an intra-day run needs to overshadow any existing output for the time period it covers, which ordinary overshadowing can handle, but the intra-day output of a conversion pipeline shouldn't affect the output of any other pipeline, so simply synchronizing on version and using a linear shard spec won't work.

Union data sources, if they were performant, could likely handle this without too much trouble if we were only discussing a few distinct input sources. In our case, however, most queries cover over a hundred namespaces. At that scale, I'm not sure how performant unioning would be (possibly fine). The main benefit namespacing provides is that clients don't need to know or care about these details, nor do they need to know _which_ data sources to union. I agree with the overall point that this all could be done in the query layer (and perhaps even should be done there).
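
To make the overshadowing requirement above concrete, here is a toy sketch of the semantics we are after. It is not Druid's timeline logic: `Segment`, its `namespace` field, and `visible_segments` are hypothetical names, and it assumes one segment per namespace, interval, and version, with versions that compare as strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    namespace: str  # hypothetical field, e.g. one value per pipeline
    interval: str   # simplified: a whole day such as "2020-03-12"
    version: str    # assumed to sort lexicographically, like a timestamp


def visible_segments(segments):
    """Return the segments that remain visible when overshadowing is
    applied per (namespace, interval) rather than per interval alone."""
    latest = {}
    for seg in segments:
        key = (seg.namespace, seg.interval)
        # A newer version only replaces older data from the same namespace.
        if key not in latest or seg.version > latest[key].version:
            latest[key] = seg
    return sorted(latest.values(), key=lambda s: (s.namespace, s.interval))


if __name__ == "__main__":
    segments = [
        Segment("clicks", "2020-03-12", "v1"),
        Segment("conversions", "2020-03-12", "v1"),
        Segment("conversions", "2020-03-12", "v2"),  # intra-day rerun
    ]
    for seg in visible_segments(segments):
        print(seg.namespace, seg.interval, seg.version)
```

In this model the `v2` conversions run replaces only the `v1` conversions output, and the `v1` clicks output stays visible. Keying on the interval alone would let the conversions rerun hide the clicks data, which is the problem described above.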
