paul-rogers commented on issue #12262: URL: https://github.com/apache/druid/issues/12262#issuecomment-1190963186
A summary of the notes above might be that the multi-stage engine is a great addition. The existing "low-latency" queries are Druid's defining feature. We'd like to drag them both, kicking and screaming, into a standard optimizer/operator-DAG structure so we can exploit the abundant industry knowledge about how to optimize queries, and how to implement the Next Favorite Feature. The notes above are mostly about how we do this with low-latency queries.

The question we're discussing is whether the current batch multi-stage engine is a good foundation to expand into the low-latency query space. It is a great one, except that we have to replace the code that does exchanges (AKA shuffles: streaming rather than disk-based), task management (in-process rather than Overlord-based), failure strategy (fail & retry instead of fail & recover), and so on. There are similarities, of course, but the details differ. Low-latency queries are optimized for performance, batch queries for durability. Many of the code bits above the basic row operations will differ. (Spark differs from Presto, for example.) So we need two execution frameworks, but they should share a common foundation.

To evolve the multi-stage engine toward that common foundation, we'd want something like:

* Extend the "frame processor" idea to be a bit more like a "fragment" in many engines: `(scan | exchange) --> operator+ --> (scan | client)`.
* Express the query as a physical plan: a serialized form of a DAG of operator descriptions. Use this form rather than the complex set of ad-hoc native queries and knick-knack specs inherited from Druid's existing ingest engines.
* Pretty up the multi-stage query "planner" to emit a physical plan. Partition that plan to get the "work orders" sent to workers.
* Collapse the Calcite planner, Druid Query Maker, and multi-stage "planner" into a single planner.
Since the multi-stage engine emits (physical) operators, and Calcite manipulates (logical) operators, we can cut out (most of) the middlemen.

* Use Calcite rules, and Druid-specific transforms operating on the logical plan, to optimize the query, rather than the large amount of ad-hoc code currently used.

The key notion is that if we plan based on operators, we can leverage the vast industry knowledge (of things like relational algebra, how to apply cost estimations, etc.) to improve our planning process -- something which is very hard to do with the ad-hoc data structures Druid uses for native queries.

The above sketches paths to converge both the low-latency and multi-stage batch frameworks toward the industry-standard optimizer/operator-DAG architecture. As we do, we can look for points of synergy: for example, both should use frames, both should use the same operator implementations, and so on. Moreover, effort invested in the planner (metadata, statistics, etc.) benefits both low-latency and batch multi-stage queries.

In short, the answer to the question "how do we use multi-stage for low latency?" is that both frameworks evolve toward the desired end state of the optimizer/operator-DAG architecture. Most other engines are already there; we've just got us some catching up to do.
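To make the "physical plan as a DAG of operator descriptions" idea concrete, here is a minimal sketch in Java. All names (`OperatorDesc`, `Fragment`, `PhysicalPlan`) are illustrative inventions, not Druid's actual classes; the point is only the shape: each fragment is `(scan | exchange) --> operator+ --> (exchange | client)`, the DAG is cut into stages at exchange boundaries, and the "work order" sent to a worker would just be its fragment serialized.

```java
import java.util.List;

public class PhysicalPlanSketch {
    // An operator description: an id, a kind, and the ids of its inputs.
    // A real plan would carry per-operator config (filters, columns, etc.).
    public record OperatorDesc(int id, String kind, List<Integer> inputs) {}

    // A fragment groups the operators that run together in one stage.
    public record Fragment(int stage, List<OperatorDesc> operators) {}

    public record PhysicalPlan(List<Fragment> fragments) {
        // A hypothetical two-stage plan: a scan stage feeding a merge stage
        // across an exchange (shuffle) boundary.
        public static PhysicalPlan example() {
            Fragment scanStage = new Fragment(0, List.of(
                new OperatorDesc(1, "scan", List.of()),
                new OperatorDesc(2, "filter", List.of(1)),
                new OperatorDesc(3, "exchange-send", List.of(2))));
            Fragment mergeStage = new Fragment(1, List.of(
                new OperatorDesc(4, "exchange-receive", List.of(3)),
                new OperatorDesc(5, "sort-merge", List.of(4)),
                new OperatorDesc(6, "client-sink", List.of(5))));
            return new PhysicalPlan(List.of(scanStage, mergeStage));
        }
    }
}
```

Note that the two execution frameworks discussed above would differ only in how `exchange-send`/`exchange-receive` are implemented (streaming for low latency, disk-backed for durable batch); the plan description itself is shared.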
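And to illustrate why planning over operators enables rule-based optimization: once the query is a tree of operator descriptions, an optimization is just a pattern-matching rewrite applied over that tree. The sketch below is not Calcite's actual `RelOptRule` API (nor any Druid code); it is a toy single rule, `Filter(a, Filter(b, x)) => Filter(a AND b, x)`, applied bottom-up, to show the general mechanism.

```java
import java.util.List;
import java.util.stream.Collectors;

public class RuleSketch {
    // A toy logical operator: a kind, an argument (predicate, table name,
    // etc.), and child operators.
    public record Op(String kind, String arg, List<Op> inputs) {}

    // One rewrite rule, applied recursively bottom-up over the tree:
    // collapse two stacked filters into a single conjunctive filter.
    public static Op mergeFilters(Op op) {
        List<Op> newInputs = op.inputs().stream()
            .map(RuleSketch::mergeFilters)
            .collect(Collectors.toList());
        Op rewritten = new Op(op.kind(), op.arg(), newInputs);
        if (rewritten.kind().equals("filter")
                && rewritten.inputs().size() == 1
                && rewritten.inputs().get(0).kind().equals("filter")) {
            Op child = rewritten.inputs().get(0);
            return new Op("filter",
                rewritten.arg() + " AND " + child.arg(),
                child.inputs());
        }
        return rewritten;
    }
}
```

In a real planner there would be many such rules (filter pushdown, projection pruning, join reordering) driven by cost estimates; the point is that each rule is small and composable, whereas equivalent logic over ad-hoc native-query specs must be hand-written per query type.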
