a2l007 commented on issue #9791:
URL: https://github.com/apache/druid/issues/9791#issuecomment-656789106


   @jihoonson Thanks for taking a look.
   
   `MultiDataSource` is designed to let its implementations decide which 
datasources' segments should be queried for a given interval. This applies only 
to queries with multiple base tables in the `dataSource` part of the query 
(currently only `UnionDataSource`). By missing segments, I mean the case where 
`table1` has no segments for a specific interval. This is not to be confused 
with the missing segments that are handled by the `RetryQueryRunner`.
   Expanding further on this example, let's say I have a union-based query 
against `table1` and `table2`, but I need data from `table2` only if there are 
no segments in `table1` for the interval. Currently this isn't possible with 
`UnionDataSource`. With `MultiDataSource`, I can create an implementation that 
satisfies this use case.
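
   To make the fallback behaviour concrete, here is a minimal sketch of the 
selection rule described above. It is not the `MultiDataSource` interface from 
the PR: the class `FallbackSelection`, the method `select`, and the use of 
plain strings as segment ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; these names are not part of Druid or of PR #10030.
final class FallbackSelection {

    // tablesInPriorityOrder: e.g. ["table1", "table2"], preferred table first.
    // segmentsInInterval: table name -> ids of its segments in the query interval.
    // Returns a single-entry map for the first table that has any segments,
    // or an empty map when no table has segments for the interval.
    static Map<String, List<String>> select(
            List<String> tablesInPriorityOrder,
            Map<String, List<String>> segmentsInInterval) {
        Map<String, List<String>> chosen = new LinkedHashMap<>();
        for (String table : tablesInPriorityOrder) {
            List<String> segments =
                    segmentsInInterval.getOrDefault(table, Collections.emptyList());
            if (!segments.isEmpty()) {
                // The first table with data wins; later tables are ignored.
                chosen.put(table, new ArrayList<>(segments));
                return chosen;
            }
        }
        return chosen;
    }
}
```

   With `table1` empty for the interval, `select` returns only the `table2` 
segments; once `table1` has segments, `table2` is skipped entirely.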
   
   In order to support this, the Broker will now identify the segments to be 
queried for all the table datasources in the query in a single pass instead of 
one table datasource at a time. This makes it easier to push the segment 
selection logic down to the `MultiDataSource` implementation for such queries. 
This diff gives a clearer picture: 
https://github.com/apache/druid/pull/10030/files#diff-14e0f52ca2d35d282c1e92d1c14eb0d1R374-R395
   Regarding broker-historical interaction, the queries from the broker to a 
historical will now include `SegmentDescriptor`s from all the datasources 
requested in the union-based query. Thus only one round trip between broker and 
historical is required (as long as no missing segments are reported by 
`ReportTimelineMissingSegmentQueryRunner`).
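
   The single-pass idea can be illustrated roughly as follows. This is a 
sketch under stated assumptions, not the Broker code from the PR: 
`SinglePassGrouping`, `Descriptor`, and `groupByServer` are hypothetical names, 
and each segment is assumed to be served by exactly one historical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; not the actual Broker implementation.
final class SinglePassGrouping {

    // Simplified stand-in for a segment descriptor that remembers its table.
    static final class Descriptor {
        final String dataSource;
        final String segmentId;

        Descriptor(String dataSource, String segmentId) {
            this.dataSource = dataSource;
            this.segmentId = segmentId;
        }
    }

    // servedSegments: table name -> (segment id -> server holding that segment).
    // Walks every table once and groups descriptors by server, so each server
    // would receive one request covering all tables in the union.
    static Map<String, List<Descriptor>> groupByServer(
            Map<String, Map<String, String>> servedSegments) {
        Map<String, List<Descriptor>> perServer = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> table : servedSegments.entrySet()) {
            for (Map.Entry<String, String> segment : table.getValue().entrySet()) {
                perServer
                        .computeIfAbsent(segment.getValue(), server -> new ArrayList<>())
                        .add(new Descriptor(table.getKey(), segment.getKey()));
            }
        }
        return perServer;
    }
}
```

   If `table1` and `table2` each have a segment on the same historical, the 
result holds one entry for that server with two descriptors, which corresponds 
to the single round trip mentioned above.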




