xiangmy21 opened a new pull request, #17065:
URL: https://github.com/apache/iotdb/pull/17065
# feat: show timeseries [order by timseries] clause
Support `ORDER BY TIMESERIES` in `SHOW TIMESERIES` and Optimize `OFFSET`
with Subtree Measurement Statistics
---
## Brief
This PR introduces ordering support for `SHOW TIMESERIES` by measurement
name (lexicographical) and optimizes the performance of the `OFFSET` clause
using subtree measurement statistics.
Currently, `SHOW TIMESERIES` supports the `ORDER BY TIMESERIES` clause.
When the query scope is within a single database, `OFFSET X` no longer
requires traversing the first X measurements one by one. Instead, it skips
subtrees in the metadata tree based on subtree measurement counts.
Example:
```sql
IoTDB> show timeseries root.dbtest.** order by timeseries desc limit 5
offset 5001
+--------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+
| Timeseries|Alias|
Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType|
+--------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+
|root.dbtest.plant9.device84.m188| null|root.dbtest| INT32| RLE|
SNAPPY|null| null| null| null| BASE|
|root.dbtest.plant9.device84.m187| null|root.dbtest| INT32| RLE|
SNAPPY|null| null| null| null| BASE|
|root.dbtest.plant9.device84.m186| null|root.dbtest| INT32| RLE|
SNAPPY|null| null| null| null| BASE|
|root.dbtest.plant9.device84.m185| null|root.dbtest| INT32| RLE|
SNAPPY|null| null| null| null| BASE|
|root.dbtest.plant9.device84.m184| null|root.dbtest| INT32| RLE|
SNAPPY|null| null| null| null| BASE|
+--------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+
```
* `DESC`: descending order
* `ASC`: ascending order (default if not specified)
---
## Design & Implementation Details
### 1. Syntax Extension
* Updated the grammar of `SHOW TIMESERIES` to support
`orderByTimeseriesClause`.
* Added ordering-related fields to `ShowTimeSeriesStatement` to record
sorting requirements.
### 2. Logical Plan Construction
* Introduced a `SortNode` at the top of the logical plan to enable global
ordering across partitions.
* For single-partition queries, `LIMIT` and `OFFSET` are pushed down to the
metadata tree traversal stage in `LogicalPlanVisitor`.
### 3. Limit/Offset Handling in Multi-Partition Queries
* For multi-partition queries, `OFFSET` cannot be pushed down.
Instead, `limit' = limit + offset` is used.
* Fixed a bug in the previous implementation:
when `limit = 0` (which is treated as unlimited in IoTDB), `limit'` was
incorrectly pushed down, causing unexpected truncation of results.
### 4. Ordered Traversal in Metadata Tree
* In `MTreeBelowSGMemoryImpl`, the iteration strategy of child nodes in
`schemaReader` is overridden under ordering mode.
* Child nodes are traversed in lexicographical order by name.
### 5. Subtree Measurement Count
* Implemented subtree measurement statistics (only for in-memory metadata
tree):
* Added `subtreeMeasurementCount` to `IMemMNode` and related classes.
* Maintained `subtreeMeasurementCount` along ancestor paths during
measurement insertion and deletion.
* This value is not persisted; it is initialized via a DFS during metadata
tree loading.
* During schemaReader construction, `acceptFullMatchedNode` is overridden to
skip subtree traversal when `subtreeMeasurementCount < offset`.
---
## Tests
### Functional Tests
* Added `IoTDBShowTimeseriesOrderByTimeseriesIT` to verify correctness of
ordering and offset behavior.
### Performance Evaluation
* Evaluated performance on a single-partition dataset with a typical
three-level hierarchy:
```
root.plantA.deviceB.measurementC
```
with approximately 900,000 measurements.
* Query pattern:
```sql
SHOW TIMESERIES root.dbtest.** order by timeseries LIMIT 100 OFFSET X;
```
where `X` ranges from 0 to 900,000 with a step of 5,000.
* Compared the original implementation and the optimized version.
* Results are shown in the figure below:
<img width="1200" height="800" alt="image"
src="https://github.com/user-attachments/assets/625a2825-7790-4c95-9cb4-854c3307e7fb"
/>
* Detailed benchmark methodology and results are available at:
[[link].](https://apache-iotdb-project.feishu.cn/wiki/P0RcwbUleipgDjkKwhqcOkkynJd?from=from_copylink)
---
## Impact
* Enables deterministic ordering for `SHOW TIMESERIES`.
* Significantly reduces traversal cost for large `OFFSET` values in
single-partition scenarios $O(limit+offset)\to O(limit+TreeHeight)$.
* Improves correctness of limit/offset handling in multi-partition queries.
* Introduces minimal overhead to metadata maintenance (in-memory only).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]