paul-rogers commented on issue #12262: URL: https://github.com/apache/druid/issues/12262#issuecomment-1183512847
@clintropolis asked:

> What is the migration path to something like that? Build the new thing side-by-side I guess?

For the optimizer/operator prototype, the approach taken was to evolve the current code. First, replace sequences with operators in a way that the two interoperate. Then, we'd find that query runners do nothing other than call a "planner function" that produces an operator, so we can gather that code into a low-level "native query planner" similar to the controller work manager @gianm created.

Once we've converted the query runner/sequence stack to a native query planner plus operators, we can, if we're confident, just ship that. (Operators are designed to be easy to test, unlike query runners and sequences, so we can more easily ensure quality.) Or, if we are cautious, we offer both: the `QueryLifecycle` goes down either the current path or the new path, returning results the same way regardless of the engine used.

From there, we can extend the planner for more use cases: multi-tier distribution, distributed joins, etc. We can also combine planner levels. Currently we have Calcite, the native query maker, and the query runners; with MSQ, we also have the controller's work-allocation code. After the above step we'd have Calcite, the native query maker, and the native query planner in one path, and Calcite, the MSQ query maker, and the controller work allocator in the ingest path. We can then begin to merge the planners so that we plan the SQL query, not the native query. That removes barriers to what we can offer users: "classic" queries are still fast, but BI queries become possible without overloading the Broker. Calcite already knows about windowed aggregations; we need only add a windowing operator and we're (mostly) good to go.

Over time, we could, if there were reason, end up with the common solution: Calcite for query parsing, analysis, and query optimization, plus a Druid-specific layer to convert Calcite rels into the equivalent Druid operators.
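To make the interop idea concrete, here is a minimal sketch of what a sequence-to-operator "shim" could look like. The `Operator`, `IteratorShim`, and `LimitOperator` names are hypothetical illustrations, not Druid's actual API; an `Iterable` stands in for Druid's `Sequence` to keep the sketch self-contained.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical minimal operator protocol: open a row stream, close resources.
interface Operator<T> {
  Iterator<T> open();
  void close();
}

// Shim: adapts an existing iterator-style source (standing in for a Sequence)
// into an Operator, so old and new code can interoperate during migration.
class IteratorShim<T> implements Operator<T> {
  private final Iterable<T> source;
  IteratorShim(Iterable<T> source) { this.source = source; }
  @Override public Iterator<T> open() { return source.iterator(); }
  @Override public void close() { /* nothing to release in this sketch */ }
}

// A simple downstream operator that consumes rows from its child,
// showing how operators compose into a pipeline.
class LimitOperator<T> implements Operator<T> {
  private final Operator<T> child;
  private final int limit;
  LimitOperator(Operator<T> child, int limit) { this.child = child; this.limit = limit; }
  @Override public Iterator<T> open() {
    final Iterator<T> it = child.open();
    return new Iterator<T>() {
      int count = 0;
      @Override public boolean hasNext() { return count < limit && it.hasNext(); }
      @Override public T next() { count++; return it.next(); }
    };
  }
  @Override public void close() { child.close(); }
}
```

Because each operator only sees its child's `open()`/`close()` contract, a shim-wrapped legacy source and a native operator are indistinguishable to the rest of the pipeline, which is what makes the incremental migration possible.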
Finally, execution frameworks optimized for low latency (in-memory processing) or for ingest (disk-based processing). The operators are the same; only the exchanges differ on the data-pipeline path. (The execution management, resource management, error handling, and other aspects would differ, but the data pipeline is mostly blissfully ignorant of them.)

While rearranging the planner, we should be able to move items one at a time: they become operators to Calcite, with the remaining native query planner acting as the big, complex operator we have today in `DruidQueryRel`, where a native query is a Calcite relational operator.

We could also merge the MSQ efforts. Once sequences are replaced with operators, the operators can change to use frames (rather than Java arrays) as their storage format. Since operators are very flexible, during the transition some operators can use frames while others continue to use arrays. "Shim" operators handle the translation, the same way that, in the sequence-to-operator transition, shims handle the sequence-to-operator and operator-to-sequence conversions. We'd want to remove the shims in the end, but they do allow an incremental approach to development.

Similarly, if we want to change how we do shuffles, we can more easily do so with operators. In an operator DAG, an "exchange operator" physically comprises two parts: a sender and a receiver. The query stack doesn't really care how these work, as long as the planner inserts matched pairs. We can start with the existing HTTP scatter/gather. Then, we add a new exchange operator based on frames, or async frames, or a connection-based solution, or multiplexed connection-based frames; or, for batch MSQ, one based on external files. The "Lego" nature of operators means that the rest of the DAG doesn't care, as long as the sender consumes rows from "downstream" and the receiver provides them "upstream".
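The matched sender/receiver pairing can be sketched as follows. This is a hypothetical illustration, not Druid's exchange code: the "transport" here is an in-process queue, but HTTP scatter/gather, frames, or external files could sit behind the same two halves without the neighboring operators noticing.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical exchange pair: the planner inserts the sender on the producing
// side and the receiver on the consuming side; the rest of the DAG sees only
// ordinary row iterators on each side of the transport.
class Exchange<T> {
  private static final Object EOF = new Object();
  private final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1024);

  // Sender half: drains its upstream row stream into the transport.
  void send(Iterator<T> upstream) throws InterruptedException {
    while (upstream.hasNext()) {
      queue.put(upstream.next());
    }
    queue.put(EOF); // signal end-of-stream to the receiver
  }

  // Receiver half: exposes the transport as an ordinary row iterator.
  Iterator<T> receive() {
    return new Iterator<T>() {
      Object pending = take();
      Object take() {
        try { return queue.take(); }
        catch (InterruptedException e) { throw new RuntimeException(e); }
      }
      @Override public boolean hasNext() { return pending != EOF; }
      @SuppressWarnings("unchecked")
      @Override public T next() {
        if (pending == EOF) throw new NoSuchElementException();
        Object row = pending;
        pending = take();
        return (T) row;
      }
    };
  }
}
```

Swapping the `BlockingQueue` for a network channel or a spill file changes only this class; as long as the planner keeps the two halves paired, the surrounding operators are untouched.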
Experience with other tools has shown that, with the operator-based approach, each operator can be evolved and tested in isolation, allowing us to safely incorporate MSQ functionality step by step. Similarly, as long as two planner versions produce the same plans for the same queries, we can be sure that we've not broken anything. If version X is supposed to add new features, we can easily verify that only the target plans changed, and everything else remains the same. We don't need to actually execute the queries to verify the planner: we just compare "physical plans". There was a PR that started down this path: #12545, though it was closed for now to focus efforts elsewhere. That PR was based on a technique which both Drill and Impala used with great success. The point is, others have worked out solutions for safely evolving an optimizer/operator-based query engine. We can shamelessly borrow those ideas where they are useful.
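The plan-comparison idea reduces to a simple diff over serialized plans. A minimal sketch, assuming each planner version can dump a stable text form of every query's physical plan (the `PlanDiff` class and map-of-plan-text representation are illustrative, not the #12545 implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plan-regression check: run the same query corpus through two
// planner versions, serialize each physical plan to text, and report which
// queries' plans changed. No query is executed; only plan text is compared.
class PlanDiff {
  // Maps each changed query id to its {old plan, new plan} pair.
  static Map<String, String[]> changedPlans(
      Map<String, String> oldPlans, Map<String, String> newPlans) {
    Map<String, String[]> changed = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : oldPlans.entrySet()) {
      String updated = newPlans.get(e.getKey());
      if (!e.getValue().equals(updated)) {
        changed.put(e.getKey(), new String[]{e.getValue(), updated});
      }
    }
    return changed;
  }
}
```

A release that adds, say, window support would then assert that `changedPlans` contains exactly the windowed queries and nothing else, which is the "only the target plans changed" check described above.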
