[GitHub] [druid] paul-rogers opened a new pull request, #12641: Foundations for an operator-based approach for Druid queries

GitBox Sun, 12 Jun 2022 23:03:57 -0700


paul-rogers opened a new pull request, #12641:
URL: https://github.com/apache/druid/pull/12641

Issue #11933 proposed using the industry-standard operator DAG structure for
Druid queries in place of the existing Sequence-based approach. The issue has a
lengthy discussion of the reasons.

Separately, issue #12262 proposes a multi-stage query engine for Druid,
focused on long-running report-style queries and ingestion. Extending that idea
to the low-latency space would seem to demand that we start with what we already
have, which has proven itself to be rock-solid in many production shops.

Putting the two together, to create a multi-stage solution for Druid's
low-latency query path, we propose to evolve what we have, step-by-step, to the
industry-standard operator DAG approach, which will allow us to introduce
multi-stage queries within the existing framework.

This PR is a first step: it provides the foundation structure. The code here
has already been used to create a full operator-based solution for [scan
queries in the context of the historical
node](https://github.com/paul-rogers/druid/tree/op-step1) and to fully convert
the scan query path for the [test query
stack](https://github.com/paul-rogers/druid/tree/op-step2). That work will be
contributed, step-by-step, building on top of this PR.

See [the
README](https://github.com/paul-rogers/druid/blob/20942c83c23d7bae516bace80fdf07b3603067a5/processing/src/main/java/org/apache/druid/queryng/README.md)
for more details.

### Operators

An operator does one task in a data pipeline. The key operator abstractions
include:

`Operator`: an interface for a data pipeline component. An operator can be
opened to provide an iterator over results, then closed. An operator can have
zero inputs (a leaf operator), one input (a filter, limit or projection
operator) or multiple inputs (join, merge, union, etc.).
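
As a rough illustration of the open/iterate/close contract described above, here is a minimal sketch. The interface shape (a plain `java.util.Iterator` returned from `open()`, a no-argument `close()`) and the `ListOperator` leaf are assumptions made for this example; the actual signatures in this PR may differ.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified interface; not the PR's actual signatures.
interface Operator<T> {
  // Prepare the operator and return an iterator over its results.
  Iterator<T> open();

  // Release any resources held by the operator.
  void close();
}

// A leaf operator (zero inputs) that emits the rows of an in-memory list.
// The class name and the 'closed' flag are illustrative only.
class ListOperator<T> implements Operator<T> {
  private final List<T> rows;
  boolean closed;

  ListOperator(List<T> rows) {
    this.rows = rows;
  }

  @Override
  public Iterator<T> open() {
    return rows.iterator();
  }

  @Override
  public void close() {
    closed = true;
  }
}
```

A caller opens the operator, drains the iterator, then closes the operator. Operators with one or more inputs follow the same contract and simply hold references to their child operators.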

Multiple variations of operators are provided in this PR. All of these
operators are simple in the sense that they only refer to other operators, but
not to any of Druid's query infrastructure.

* `LimitOperator`: applies a limit to a result set.
* `NullOperator`: does nothing, like an empty list or empty iterator.
* `MappingOperator`: takes one input and applies some form of mapping as
defined by a derived class.
* `ConcatOperator`: performs a union of its inputs, emitting each one after
the other.
* `OrderedMergeOperator`: implements an ordered merge of multiple inputs.
* `WrappingOperator`: similar to "baggage" on sequences, an operator that
performs tasks at the start and end of a result set but imposes no per-row overhead.
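
To show how such operators compose while referring only to other operators, the sketch below implements simplified stand-ins for the limit and concat ideas. The names `LimitOp`, `ConcatOp`, `ListOp` and `Operators.toList`, and the iterator-based interface, are hypothetical and are not the classes in this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical, simplified operator interface (an assumption for this sketch).
interface Operator<T> {
  Iterator<T> open();
  void close();
}

// Leaf operator over an in-memory list.
class ListOp<T> implements Operator<T> {
  private final List<T> rows;
  ListOp(List<T> rows) { this.rows = rows; }
  @Override public Iterator<T> open() { return rows.iterator(); }
  @Override public void close() { }
}

// One input: passes through at most 'limit' rows.
class LimitOp<T> implements Operator<T> {
  private final Operator<T> input;
  private final long limit;

  LimitOp(Operator<T> input, long limit) {
    this.input = input;
    this.limit = limit;
  }

  @Override
  public Iterator<T> open() {
    Iterator<T> in = input.open();
    return new Iterator<T>() {
      private long count;
      @Override public boolean hasNext() { return count < limit && in.hasNext(); }
      @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        count++;
        return in.next();
      }
    };
  }

  @Override public void close() { input.close(); }
}

// Multiple inputs: emits each input in full, one after the other.
class ConcatOp<T> implements Operator<T> {
  private final List<Operator<T>> inputs;

  ConcatOp(List<Operator<T>> inputs) { this.inputs = inputs; }

  @Override
  public Iterator<T> open() {
    Iterator<Operator<T>> pending = inputs.iterator();
    return new Iterator<T>() {
      private Iterator<T> current = Collections.emptyIterator();
      @Override public boolean hasNext() {
        // Open the next input lazily once the current one is exhausted.
        while (!current.hasNext() && pending.hasNext()) {
          current = pending.next().open();
        }
        return current.hasNext();
      }
      @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
      }
    };
  }

  @Override public void close() {
    for (Operator<T> input : inputs) input.close();
  }
}

// Helper that opens a root operator, drains it into a list, and closes it.
final class Operators {
  static <T> List<T> toList(Operator<T> root) {
    List<T> out = new ArrayList<>();
    try {
      Iterator<T> it = root.open();
      while (it.hasNext()) out.add(it.next());
    } finally {
      root.close();
    }
    return out;
  }
}
```

Wrapping a `ConcatOp` of two list inputs in a `LimitOp` yields a three-operator pipeline in which no class depends on anything but the operator interface.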

### Fragments

Operators combine to form a data pipeline. Data pipelines are distributed,
as in Druid's scatter/gather architecture. A common terminology is to say that
the entire query forms a DAG. The DAG is "sliced" at node boundaries, with
exchanges between slices. At runtime, a *slice* is replicated across many
nodes. Each instance of a slice is a *fragment*.

This PR provides the basics of the fragment structure. In most engines, a
planner converts SQL into a logical plan, then into a physical plan that
describes the operator DAG. Slices of that plan are sent to nodes which then
execute the fragments. Druid, however, already has an existing
`QueryRunner`-based structure. `QueryRunner`s are actually "query planners": the
`QueryRunner.run()` method is better thought of as `QueryPlanner.plan()`: it
figures out what sequence is needed at that point in the pipeline and creates
that sequence.

Our first step in the path to adopt operators is to reuse the query runners.
Instead of creating sequences, we modify `QueryRunner`s to create operators.
The fragment-related abstractions in this PR support such an approach.

* `FragmentContext`: the state shared by all operators in a fragment. For
now, this state includes the `ResponseContext` and, internally, the collection
of all operators that form the fragment.
* `FragmentBuilder`: creates a fragment from a collection of operators, and
provides an API to run the resulting fragment.
* `FragmentRun`: runs the fragment, which means calling `open()` on the root
operator, returning the root operator's iterator, and closing all operators at
the completion of the run.
* `FragmentBuilderFactory`: a factory to create a fragment builder. This
class will be injected via Guice.
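
The relationship between the context and the run can be sketched as follows. This is a simplified illustration under stated assumptions: the `Sketch*` classes, the `register()` and `closeAll()` methods, and the reverse-order close are all hypothetical, and the `ResponseContext` and the builder/factory layers are omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified operator interface (an assumption for this sketch).
interface Operator<T> {
  Iterator<T> open();
  void close();
}

// Leaf operator that records whether it was closed, for demonstration.
class TrackedLeaf<T> implements Operator<T> {
  private final List<T> rows;
  boolean closed;
  TrackedLeaf(List<T> rows) { this.rows = rows; }
  @Override public Iterator<T> open() { return rows.iterator(); }
  @Override public void close() { closed = true; }
}

// Stand-in for the shared per-fragment state: tracks every operator created.
class SketchFragmentContext {
  private final List<Operator<?>> operators = new ArrayList<>();

  <T> Operator<T> register(Operator<T> op) {
    operators.add(op);
    return op;
  }

  // Closes operators in reverse registration order (an assumed policy),
  // then forgets them so that repeated calls are harmless.
  void closeAll() {
    for (int i = operators.size() - 1; i >= 0; i--) {
      operators.get(i).close();
    }
    operators.clear();
  }
}

// Stand-in for a fragment run: opens the root, exposes its iterator,
// and closes all registered operators when the run completes.
class SketchFragmentRun<T> implements AutoCloseable {
  private final SketchFragmentContext context;
  private final Iterator<T> results;

  SketchFragmentRun(SketchFragmentContext context, Operator<T> root) {
    this.context = context;
    this.results = root.open();
  }

  Iterator<T> iterator() {
    return results;
  }

  // Drains the root iterator, then closes every registered operator.
  List<T> toList() {
    List<T> out = new ArrayList<>();
    try {
      while (results.hasNext()) out.add(results.next());
    } finally {
      close();
    }
    return out;
  }

  @Override
  public void close() {
    context.closeAll();
  }
}
```

Because the context holds every operator, the run can guarantee cleanup even for operators that were never reached during iteration.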

We will need a way to pass the `FragmentContext` to `QueryRunner`s so that
they can create operators for a fragment. It turns out that `QueryPlus` is a
handy way to accomplish this, so this PR contains the required `QueryPlus`
code. That code isn't used yet: we're just setting things up.

### Configuration

This PR also provides a very basic configuration system, which reports that
the operator approach is enabled only for scan queries and only if
`-Ddruid.queryng=true` is set on the command line. This is a temporary
approach, good enough for testing.
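
A check of this kind might look like the sketch below. Only the `druid.queryng` property name comes from this description; the class name, the method names, and the use of the plain string `"scan"` to identify the query type are assumptions for illustration.

```java
// Hypothetical config holder; not the class provided by this PR.
final class QueryNGConfigSketch {
  static final String ENABLED_PROPERTY = "druid.queryng";

  private final boolean enabled;

  QueryNGConfigSketch(boolean enabled) {
    this.enabled = enabled;
  }

  // Boolean.getBoolean returns true only if the named system property
  // exists and equals "true" (ignoring case), e.g. -Ddruid.queryng=true.
  static QueryNGConfigSketch fromSystemProperties() {
    return new QueryNGConfigSketch(Boolean.getBoolean(ENABLED_PROPERTY));
  }

  // 'queryType' is a plain-string stand-in for the query's type;
  // only scan queries are eligible in this sketch.
  boolean isEnabledFor(String queryType) {
    return enabled && "scan".equals(queryType);
  }
}
```

With the flag unset, `isEnabledFor` returns false for every query type, so the existing sequence-based path remains the default.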

Nothing uses that config yet: it will be used in the next PR to allow
`QueryRunner`s to know when to use an operator implementation and when to
continue to use sequences.

### Tests

One of the very handy things about operators is that they are highly modular
and thus extremely easy to unit test. Tests exist for all the basic
abstractions defined above.
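
To illustrate why operators are easy to test in isolation, the sketch below checks a mapping operator against a list-backed input with no query infrastructure involved. All names here are hypothetical; in particular, `MapOp` takes a `Function` rather than being specialized by a derived class, and the check uses plain Java instead of a test framework to stay self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified operator interface (an assumption for this sketch).
interface Operator<T> {
  Iterator<T> open();
  void close();
}

// List-backed input that records whether it was closed.
class ListOp<T> implements Operator<T> {
  private final List<T> rows;
  boolean closed;
  ListOp(List<T> rows) { this.rows = rows; }
  @Override public Iterator<T> open() { return rows.iterator(); }
  @Override public void close() { closed = true; }
}

// One input: applies 'fn' to every row.
class MapOp<I, O> implements Operator<O> {
  private final Operator<I> input;
  private final Function<I, O> fn;

  MapOp(Operator<I> input, Function<I, O> fn) {
    this.input = input;
    this.fn = fn;
  }

  @Override
  public Iterator<O> open() {
    Iterator<I> in = input.open();
    return new Iterator<O>() {
      @Override public boolean hasNext() { return in.hasNext(); }
      @Override public O next() { return fn.apply(in.next()); }
    };
  }

  @Override public void close() { input.close(); }
}

final class MapOpCheck {
  // Throws AssertionError if the operator misbehaves.
  static void mapsEveryRowAndClosesInput() {
    ListOp<Integer> input = new ListOp<>(List.of(1, 2, 3));
    MapOp<Integer, String> op = new MapOp<>(input, i -> "row-" + i);

    List<String> actual = new ArrayList<>();
    Iterator<String> it = op.open();
    while (it.hasNext()) actual.add(it.next());
    op.close();

    if (!actual.equals(List.of("row-1", "row-2", "row-3"))) {
      throw new AssertionError("unexpected rows: " + actual);
    }
    if (!input.closed) {
      throw new AssertionError("input was not closed");
    }
  }
}
```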

### Next Steps

The goal of this PR is for reviewers to focus on the core abstractions. The
next PR will begin to create the parallel operator path for scan queries.
Subsequent PRs will provide operators converted from the existing sequences,
along with the "planner" code that query runners use to define the operator.
That whole path can be seen in [this
branch](https://github.com/paul-rogers/druid/tree/op-step2).

<hr>

This PR has:
- [X] been self-reviewed.
- [X] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
