gianm opened a new pull request #9234: Add join-related DataSource types and analysis functionality.
URL: https://github.com/apache/druid/pull/9234
 
 
   Builds on #9111 and implements the datasource analysis mentioned in #8728.
   Still can't handle join datasources, but we're a step closer.
   
   Join-related DataSource types:
   
   1) Add "join", "lookup", and "inline" datasources.
   2) Add "getChildren" and "withChildren" methods to DataSource, which will be
      used in the future for query rewriting (e.g. inlining of subqueries).
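To illustrate how a "getChildren" / "withChildren" pair can support query rewriting, here is a minimal sketch. Every type in it (`SketchDataSource`, `SketchTable`, `SketchSubquery`, `SketchInline`, `SketchJoin`, `SketchRewriter`) is a hypothetical stand-in invented for this example, not Druid's actual class, and the subquery is modeled as a leaf with no children to keep it short. No query is executed; the rewrite only swaps node types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for DataSource; not Druid's real interface.
interface SketchDataSource {
    List<SketchDataSource> getChildren();
    SketchDataSource withChildren(List<SketchDataSource> children);
}

// Shared behavior for datasources that have no children.
abstract class SketchLeaf implements SketchDataSource {
    final String label;
    SketchLeaf(String label) { this.label = label; }
    public List<SketchDataSource> getChildren() { return Collections.emptyList(); }
    public SketchDataSource withChildren(List<SketchDataSource> children) {
        if (!children.isEmpty()) {
            throw new IllegalArgumentException("leaf datasources take no children");
        }
        return this;
    }
}

final class SketchTable extends SketchLeaf {
    SketchTable(String name) { super(name); }
    @Override public String toString() { return "table(" + label + ")"; }
}

// Assumption: a subquery is treated as a leaf here purely for brevity.
final class SketchSubquery extends SketchLeaf {
    SketchSubquery(String name) { super(name); }
    @Override public String toString() { return "query(" + label + ")"; }
}

final class SketchInline extends SketchLeaf {
    SketchInline(String name) { super(name); }
    @Override public String toString() { return "inline(" + label + ")"; }
}

final class SketchJoin implements SketchDataSource {
    final SketchDataSource left;
    final SketchDataSource right;
    SketchJoin(SketchDataSource left, SketchDataSource right) {
        this.left = left;
        this.right = right;
    }
    public List<SketchDataSource> getChildren() { return Arrays.asList(left, right); }
    public SketchDataSource withChildren(List<SketchDataSource> children) {
        if (children.size() != 2) {
            throw new IllegalArgumentException("join needs exactly two children");
        }
        return new SketchJoin(children.get(0), children.get(1));
    }
    @Override public String toString() { return "join(" + left + ", " + right + ")"; }
}

final class SketchRewriter {
    // Rebuilds the tree bottom-up, replacing each subquery node with an inline node.
    static SketchDataSource inlineSubqueries(SketchDataSource ds) {
        if (ds instanceof SketchSubquery) {
            return new SketchInline(((SketchSubquery) ds).label);
        }
        List<SketchDataSource> rewritten = new ArrayList<>();
        for (SketchDataSource child : ds.getChildren()) {
            rewritten.add(inlineSubqueries(child));
        }
        return ds.withChildren(rewritten);
    }
}
```

The rewriter never needs to know which node types have children; it relies only on the two methods, which is presumably the appeal of putting them on the common interface.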
   
   DataSource analysis functionality:
   
   1) Add **DataSourceAnalysis** class, which breaks down datasources into
      three components: outer queries, a base datasource (left-most of the
      highest level left-leaning join tree), and other joined-in leaf
      datasources (the right-hand branches of the left-leaning join tree).
   2) Add "isConcrete", "isGlobal", and "isCacheable" methods to DataSource in
      order to support analysis.
   3) Use the DataSourceAnalysis methods throughout the query handling stack,
      replacing various ad-hoc approaches. Most of the interesting changes are
      in **ClientQuerySegmentWalker** (brokers), **ServerManager**
      (historicals), and **SinkQuerySegmentWalker** (indexing tasks).
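The three-part breakdown described in item 1 can be sketched roughly as follows. This is an illustrative model only: `Ds`, `TableDs`, `LookupDs`, `QueryDs`, `JoinDs`, and `AnalysisSketch` are hypothetical names made up for this example, and the real DataSourceAnalysis class may differ in structure and in details such as ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker interface; not Druid's real DataSource.
interface Ds {}

final class TableDs implements Ds {
    final String name;
    TableDs(String name) { this.name = name; }
    @Override public String toString() { return "table(" + name + ")"; }
}

final class LookupDs implements Ds {
    final String name;
    LookupDs(String name) { this.name = name; }
    @Override public String toString() { return "lookup(" + name + ")"; }
}

// A query used as a datasource; "inner" is the datasource that query reads from.
final class QueryDs implements Ds {
    final Ds inner;
    QueryDs(Ds inner) { this.inner = inner; }
    @Override public String toString() { return "query(" + inner + ")"; }
}

final class JoinDs implements Ds {
    final Ds left;
    final Ds right;
    JoinDs(Ds left, Ds right) { this.left = left; this.right = right; }
    @Override public String toString() { return "join(" + left + ", " + right + ")"; }
}

final class AnalysisSketch {
    final List<QueryDs> outerQueries; // outermost first
    final Ds base;                    // left-most node of the top-level join tree
    final List<Ds> joinedLeaves;      // right-hand branches, innermost join first

    private AnalysisSketch(List<QueryDs> outerQueries, Ds base, List<Ds> joinedLeaves) {
        this.outerQueries = outerQueries;
        this.base = base;
        this.joinedLeaves = joinedLeaves;
    }

    static AnalysisSketch forDataSource(Ds ds) {
        // Step 1: peel off outer queries until something that is not a query appears.
        List<QueryDs> outer = new ArrayList<>();
        Ds current = ds;
        while (current instanceof QueryDs) {
            outer.add((QueryDs) current);
            current = ((QueryDs) current).inner;
        }
        // Step 2: walk down the left side of the join tree, collecting right branches.
        List<Ds> rights = new ArrayList<>();
        while (current instanceof JoinDs) {
            rights.add(((JoinDs) current).right);
            current = ((JoinDs) current).left;
        }
        // The walk visits the outermost join first; reverse to get innermost first.
        Collections.reverse(rights);
        // Step 3: whatever remains on the far left is the base datasource.
        return new AnalysisSketch(outer, current, rights);
    }
}
```

For a tree shaped like `query(join(join(table(t), lookup(a)), lookup(b)))`, this sketch reports one outer query, `table(t)` as the base, and `lookup(a)` then `lookup(b)` as the joined-in leaves. Note that a subquery sitting on the left of a join stays as the base rather than being unwrapped, matching the "highest level" wording above.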
   
   Other notes:
   
   1) Changed **TimelineServerView** to return an Optional timeline, which I
      thought made the analysis changes cleaner to implement.
   2) Renamed DataSource#getNames to **DataSource#getTableNames**, which I
      think is clearer. Also, made it a Set, so implementations don't need to
      worry about duplicates.
   3) Added **QueryToolChest#canPerformSubquery**, which is now used by query
      entry points to determine whether it is safe to pass a subquery
      dataSource to the query toolchest. Fixes an issue introduced in #5471
      where subqueries under non-groupBy-typed queries were silently ignored,
      since neither the query entry point nor the toolchest did anything
      special with them.
   4) The addition of "isCacheable" should work around #8713, since
      UnionDataSource now returns false for cacheability.
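As a rough illustration of notes 2 and 4, the sketch below shows how a Set-valued table-name accessor absorbs duplicates and how a union can opt out of caching. `NamedSource`, `TableSource`, and `UnionSource` are hypothetical names for this example only; in particular, the choice that a plain table reports itself as cacheable is an assumption, not something stated above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; not Druid's real DataSource interface.
interface NamedSource {
    Set<String> getTableNames();
    boolean isCacheable();
}

final class TableSource implements NamedSource {
    private final String name;
    TableSource(String name) { this.name = name; }
    public Set<String> getTableNames() { return Collections.singleton(name); }
    // Assumption for this sketch: plain tables are cacheable.
    public boolean isCacheable() { return true; }
}

final class UnionSource implements NamedSource {
    private final List<TableSource> tables;
    UnionSource(TableSource... tables) { this.tables = Arrays.asList(tables); }
    public Set<String> getTableNames() {
        // A Set drops repeated names, so callers never see duplicates.
        Set<String> names = new LinkedHashSet<>();
        for (TableSource table : tables) {
            names.addAll(table.getTableNames());
        }
        return names;
    }
    // Mirrors the note above: the union reports that it is not cacheable.
    public boolean isCacheable() { return false; }
}
```

A caller that checks `isCacheable()` before consulting a cache would skip caching for the union without needing any union-specific logic.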
