gianm opened a new pull request, #16849:
URL: https://github.com/apache/druid/pull/16849
Currently, segments are always sorted by __time, followed by the sort order
provided by the user via dimensionsSpec or CLUSTERED BY. Sorting by __time
enables efficient execution of queries involving time-ordering or granularity.
Time-ordering is a simple matter of reading the rows in stored order, and
granular cursors can be generated in streaming fashion.
However, for various workloads, it's better for storage footprint and query
performance to sort by arbitrary orders that do not start with __time. With
this patch, users can sort segments by such orders.
For spec-based ingestion, users add "useExplicitSegmentSortOrder: true" to
dimensionsSpec. The "dimensions" list determines the sort order. To define a
sort order that includes "__time", users explicitly include a dimension named
"__time".
For SQL-based ingestion, users set the context parameter
"useExplicitSegmentSortOrder: true". The CLUSTERED BY clause is then used as
the explicit segment sort order.
In both cases, when the new "useExplicitSegmentSortOrder" parameter is false
(the default), __time is implicitly prepended to the sort order, as it always
was prior to this patch.
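
The rule above can be sketched as a small function. This is a minimal illustration, not code from the patch: the class and method names are hypothetical, and the handling of a "__time" entry when the flag is false (moving it to the front) is an assumption made only to keep the sketch total. Treating an absent "__time" as first follows the write-path description of DimensionsSpec in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: SegmentSortOrderSketch and effectiveSortOrder are
// hypothetical names, not classes or methods introduced by this patch.
final class SegmentSortOrderSketch {
    static final String TIME_COLUMN = "__time";

    // Returns the order in which a segment would be sorted, given the
    // user-provided dimension list (or CLUSTERED BY columns) and the flag.
    static List<String> effectiveSortOrder(List<String> dimensions,
                                           boolean useExplicitSegmentSortOrder) {
        List<String> order = new ArrayList<>();
        if (useExplicitSegmentSortOrder && dimensions.contains(TIME_COLUMN)) {
            // Explicit mode: __time stays wherever the user placed it.
            order.addAll(dimensions);
            return order;
        }
        // Default mode, or __time not listed: __time sorts first.
        order.add(TIME_COLUMN);
        for (String dimension : dimensions) {
            // Assumption: a stray __time entry is dropped rather than rejected.
            if (!TIME_COLUMN.equals(dimension)) {
                order.add(dimension);
            }
        }
        return order;
    }
}
```

For example, the dimensions "country", "__time", "page" with the flag set to true keep that exact order, while "country", "page" with the flag set to false yield "__time", "country", "page".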
The new parameter is experimental for two main reasons. First, such segments
can cause errors when loaded by older servers, due to violating their
expectations that timestamps are always monotonically increasing. Second, even
on newer servers, not all queries can run on non-time-sorted segments. Scan
queries involving time-ordering and any query involving granularity will not
run. (To partially mitigate this, a currently-undocumented SQL context parameter
"sqlUseGranularity" is provided. When set to false, the SQL planner avoids using
granularities other than ALL.)
Changes on the write path:
1) DimensionsSpec can now optionally contain a __time dimension, which
controls the placement of __time in the sort order. If not present,
__time is considered to be first in the sort order, as it has always
been.
2) IncrementalIndex and IndexMerger are updated to sort facts more
flexibly; not always by time first.
3) Metadata (stored in metadata.drd) gains a "sortOrder" field.
4) MSQ can generate range-based shard specs even when not all columns are
singly-valued strings. It merely stops accepting new clustering key
fields when it encounters the first one that isn't a singly-valued
string. This is useful because it enables range shard specs on
"someDim" to be created for clauses like "CLUSTERED BY someDim, __time".
Changes on the read path:
1) Add StorageAdapter#getSortOrder so query engines can tell how a
segment is sorted.
2) Update QueryableIndexStorageAdapter, IncrementalIndexStorageAdapter,
and VectorCursorGranularizer to throw errors when using granularities
on non-time-ordered segments.
3) Update ScanQueryEngine to throw an error when using the time-ordering
"order" parameter on non-time-ordered segments.
4) Update TimeBoundaryQueryRunnerFactory to perform a segment scan when
running on a non-time-ordered segment.
5) Add "sqlUseGranularity" context parameter. When set to false, it causes
the SQL planner to avoid using granularities other than ALL.
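
As a rough model of read-path items 2 and 4, the sketch below checks a segment's sort order before applying a granularity and falls back to a full scan for time boundaries. All names are hypothetical and do not mirror the actual StorageAdapter or TimeBoundaryQueryRunnerFactory code. It assumes that "time-ordered" means the sort order starts with "__time" and that granularity ALL remains usable on any segment.

```java
import java.util.List;

// Illustrative only: a simplified model of the read-path checks, using
// hypothetical names rather than Druid's real classes.
final class NonTimeOrderedReadSketch {
    // Assumption: a segment is time-ordered when __time leads its sort order.
    static boolean isTimeOrdered(List<String> sortOrder) {
        return !sortOrder.isEmpty() && "__time".equals(sortOrder.get(0));
    }

    // Rejects granularities other than ALL on non-time-ordered segments.
    static void checkGranularity(List<String> sortOrder, String granularity) {
        if (!"all".equalsIgnoreCase(granularity) && !isTimeOrdered(sortOrder)) {
            throw new UnsupportedOperationException(
                "Cannot use granularity [" + granularity + "] on a non-time-ordered segment");
        }
    }

    // Returns {minTime, maxTime} for timestamps given in stored row order.
    static long[] timeBoundary(List<String> sortOrder, long[] timestamps) {
        if (timestamps.length == 0) {
            throw new IllegalArgumentException("segment has no rows");
        }
        if (isTimeOrdered(sortOrder)) {
            // Time-ordered: the first and last rows hold the boundaries.
            return new long[] {timestamps[0], timestamps[timestamps.length - 1]};
        }
        // Not time-ordered: every row must be scanned.
        long min = timestamps[0];
        long max = timestamps[0];
        for (long t : timestamps) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        return new long[] {min, max};
    }
}
```

With a sort order of "country", "__time" and stored timestamps 30, 10, 20, timeBoundary scans all rows and returns 10 and 30, while checkGranularity with "hour" throws.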
Other changes:
1) Rename DimensionsSpec "hasCustomDimensions" to "hasFixedDimensions"
and change the meaning subtly: it now returns true if the DimensionsSpec
represents an unchanging list of dimensions, or false if there is
some discovery happening. This is what call sites had expected anyway.
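
Returning to write-path item 4: the prefix rule MSQ applies when choosing range shard spec dimensions can be sketched as below. This is a hypothetical illustration rather than the MSQ implementation. ClusterColumn and its singleValuedString flag are stand-ins for whatever type information MSQ actually has about each clustering column.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the MSQ classes from this patch.
final class RangeShardKeySketch {
    // Stand-in for a CLUSTERED BY column and whether it is a singly-valued string.
    static final class ClusterColumn {
        final String name;
        final boolean singleValuedString;

        ClusterColumn(String name, boolean singleValuedString) {
            this.name = name;
            this.singleValuedString = singleValuedString;
        }
    }

    // Keeps the leading run of singly-valued string columns and stops at the
    // first column of any other kind, instead of giving up on range sharding.
    static List<String> rangeShardDimensions(List<ClusterColumn> clusteredBy) {
        List<String> dimensions = new ArrayList<>();
        for (ClusterColumn column : clusteredBy) {
            if (!column.singleValuedString) {
                break;
            }
            dimensions.add(column.name);
        }
        return dimensions;
    }
}
```

For "CLUSTERED BY someDim, __time", where someDim is a singly-valued string and __time is not a string, the sketch yields a range shard key of just "someDim".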
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]