gianm opened a new pull request #9234: Add join-related DataSource types and analysis functionality. URL: https://github.com/apache/druid/pull/9234 Builds on #9111 and implements the datasource analysis mentioned in #8728. Still can't handle join datasources, but we're a step closer. Join-related DataSource types: 1) Add "join", "lookup", and "inline" datasources. 2) Add "getChildren" and "withChildren" methods to DataSource, which will be used in the future for query rewriting (e.g. inlining of subqueries). DataSource analysis functionality: 1) Add **DataSourceAnalysis** class, which breaks down datasources into three components: outer queries, a base datasource (left-most of the highest level left-leaning join tree), and other joined-in leaf datasources (the right-hand branches of the left-leaning join tree). 2) Add "isConcrete", "isGlobal", and "isCacheable" methods to DataSource in order to support analysis. 3) Use the DataSourceAnalysis methods throughout the query handling stack, replacing various ad-hoc approaches. Most of the interesting changes are in **ClientQuerySegmentWalker** (brokers), **ServerManager** (historicals), and **SinkQuerySegmentWalker** (indexing tasks). Other notes: 1) Changed **TimelineServerView** to return an Optional timeline, which I thought made the analysis changes cleaner to implement. 2) Renamed DataSource#getNames to **DataSource#getTableNames**, which I think is clearer. Also, made it a Set, so implementations don't need to worry about duplicates. 3) Added **QueryToolChest#canPerformSubquery**, which is now used by query entry points to determine whether it is safe to pass a subquery dataSource to the query toolchest. Fixes an issue introduced in #5471 where subqueries under non-groupBy-typed queries were silently ignored, since neither the query entry point nor the toolchest did anything special with them. 4) The addition of "isCacheable" should work around #8713, since UnionDataSource now returns false for cacheability.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
