This is an automated email from the ASF dual-hosted git repository.

jackietien pushed a commit to branch TyShowTimeSeries
in repository https://gitbox.apache.org/repos/asf/incubator-iotdb.git
commit 75b3523ba2c2b4b3cbc748d4fdb835f3a88dd6a3
Author: JackieTien97 <[email protected]>
AuthorDate: Tue Jul 7 14:28:10 2020 +0800

    add english docs
---
 docs/SystemDesign/SchemaManager/SchemaManager.md   | 47 ++++++++++++++++++++++
 .../zh/SystemDesign/SchemaManager/SchemaManager.md | 32 +++++++++++++--
 .../db/query/dataset/ShowTimeseriesDataSet.java    |  9 ++++-
 3 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/docs/SystemDesign/SchemaManager/SchemaManager.md b/docs/SystemDesign/SchemaManager/SchemaManager.md
index 78812c1..b7125e2 100644
--- a/docs/SystemDesign/SchemaManager/SchemaManager.md
+++ b/docs/SystemDesign/SchemaManager/SchemaManager.md
@@ -282,3 +282,50 @@ All timeseries tag/attribute information will be saved in the tag file, which de
 > tagsSize (tag1=v1, tag2=v2) attributesSize (attr1=v1, attr2=v2)
+## Metadata Query
+
+### show timeseries without index
+
+The main query logic is in the `showTimeseries(ShowTimeSeriesPlan plan)` method of `MManager`.
+
+First, we check whether the result needs to be ordered by heat. If so, we call the `getAllMeasurementSchemaByHeatOrder` method of `MTree`; otherwise, we call the `getAllMeasurementSchema` method.
+
+#### getAllMeasurementSchemaByHeatOrder
+
+Heat is represented here by the `lastTimeStamp` of each time series, so we need to fetch all the matching time series, order them by `lastTimeStamp`, and then truncate the result by `offset` and `limit`.
+
+#### getAllMeasurementSchema
+
+In this case, we pass the limit (if no limit is given, the fetch size is used as the limit) and the offset down to `findPath` to reduce the memory footprint.
+
+#### findPath
+
+It is a recursive function that collects all the matching MNodes in the MTree, starting from the root, until the number of collected timeseries reaches the limit or the whole MTree has been traversed.
+
+### show timeseries with index
+
+The filter condition here can only be a tag attribute; otherwise an exception is thrown.
+
+We can quickly fetch all the matching `MeasurementMNode`s through the inverted tag index maintained in `MManager`, without traversing the whole tree.
+
+If the result needs to be ordered by heat, we sort the nodes by `lastTimeStamp`; otherwise we sort them by the natural order of the series names. Then we trim the result by `offset` and `limit`.
+
+### ShowTimeseries Dataset
+
+If there is too much metadata, processing a whole `show timeseries` in one pass may cause an OOM, so we add a `fetch size` parameter.
+
+When the client interacts with the server, it gets at most `fetch_size` records at a time.
+
+The intermediate state is saved in the `ShowTimeseriesDataSet`. The `queryId -> ShowTimeseriesDataSet` key-value pair is saved in `TSServiceImpl`.
+
+In `ShowTimeseriesDataSet`, we save the `ShowTimeSeriesPlan`, the current cursor `index`, and the cached result list `List<RowRecord> result`.
+
+* check whether the cursor `index` is equal to the size of `List<RowRecord> result`
+    * if so, call the corresponding method in MManager to fetch results and put them into the cache.
+        * if it is a query without index, call the method `showTimeseries`
+        * if it is a query with index, call the method `getAllTimeseriesSchema`
+        * each time we call a method in MManager to fetch results, we need to update the offset in the plan by adding `fetch size` to it.
+        * if `hasLimit` is `false`, then reset `index` to zero.
+    * if not
+        * if `index < result.size()`, return true
+        * if `index > result.size()`, return false
\ No newline at end of file
diff --git a/docs/zh/SystemDesign/SchemaManager/SchemaManager.md b/docs/zh/SystemDesign/SchemaManager/SchemaManager.md
index 832f831..17f944f 100644
--- a/docs/zh/SystemDesign/SchemaManager/SchemaManager.md
+++ b/docs/zh/SystemDesign/SchemaManager/SchemaManager.md
@@ -286,7 +286,7 @@ IoTDB metadata is managed as a directory tree; the second-to-last level is the device level
 
 The main query logic is encapsulated in the `showTimeseries(ShowTimeSeriesPlan plan)` method of `MManager`
 
-* First, check whether the result needs to be ordered by heat; if so, call the `getAllMeasurementSchemaByHeatOrder` method of `MTree`, otherwise call the `getAllMeasurementSchema` method
+First, check whether the result needs to be ordered by heat; if so, call the `getAllMeasurementSchemaByHeatOrder` method of `MTree`, otherwise call the `getAllMeasurementSchema` method
 
 #### getAllMeasurementSchemaByHeatOrder
 
@@ -294,6 +294,32 @@ IoTDB metadata is managed as a directory tree; the second-to-last level is the device level
 
 #### getAllMeasurementSchema
 
-Here we need
+Here we need to pass the limit (if there is no limit, the requested fetchSize is used as the limit) and offset parameters down when calling findPath, to reduce memory usage.
 
-### Metadata query with filter conditions
\ No newline at end of file
+#### findPath
+
+This method encapsulates the logic of traversing the MTree to collect the time series that satisfy the condition. It is a recursive method that searches downward from the root node until the number of collected time series reaches the limit or the whole MTree has been traversed.
+
+### Metadata query with filter conditions
+
+The filter condition here can only be a tag attribute; otherwise an exception is thrown.
+
+Through the inverted tag index maintained in MManager, obtain all `MeasurementMNode`s that satisfy the index condition.
+
+If ordering by heat is required, sort by `lastTimeStamp`; otherwise sort by the alphabetical order of the series names, and then truncate by `offset` and `limit`.
+
+### ShowTimeseries result set
+
+If there is too much metadata, the result of a single show timeseries may cause an OOM, so a fetch size parameter is added; when the client interacts with the server, the server fetches at most fetch size time series at a time.
+
+The state across multiple interactions is kept in `ShowTimeseriesDataSet`. `ShowTimeseriesDataSet` stores the `ShowTimeSeriesPlan` of this query, the current cursor `index`, and the cached list of result rows `List<RowRecord> result`.
+
+* Check whether the cursor `index` equals the size of the cached result rows `List<RowRecord> result`
+    * If they are equal, call the corresponding method in MManager to fetch results and put them into the cache
+        * For a metadata query with filter conditions, call the `getAllTimeseriesSchema` method
+        * For a metadata query without filter conditions, call the `showTimeseries` method
+        * The offset in the plan must be updated accordingly, moving it forward by fetch size
+        * If `hasLimit` is `false`, reset index to 0
+    * If they are not equal
+        * `index < result.size()`, return true
+        * `index > result.size()`, return false
diff --git a/server/src/main/java/org/apache/iotdb/db/query/dataset/ShowTimeseriesDataSet.java b/server/src/main/java/org/apache/iotdb/db/query/dataset/ShowTimeseriesDataSet.java
index 128f40b..1b763cb 100644
--- a/server/src/main/java/org/apache/iotdb/db/query/dataset/ShowTimeseriesDataSet.java
+++ b/server/src/main/java/org/apache/iotdb/db/query/dataset/ShowTimeseriesDataSet.java
@@ -48,8 +48,13 @@ public class ShowTimeseriesDataSet extends QueryDataSet {
     if (index == result.size()) {
       plan.setOffset(plan.getOffset() + plan.getLimit());
       try {
-        List<ShowTimeSeriesResult> showTimeSeriesResults = MManager.getInstance()
-            .showTimeseries(plan);
+        List<ShowTimeSeriesResult> showTimeSeriesResults;
+        // show timeseries with index
+        if (plan.getKey() != null && plan.getValue() != null) {
+          showTimeSeriesResults = MManager.getInstance().getAllTimeseriesSchema(plan);
+        } else {
+          showTimeSeriesResults = MManager.getInstance().showTimeseries(plan);
+        }
         result = transferShowTimeSeriesResultToRecordList(showTimeSeriesResults);
         if (!hasLimit) {
           index = 0;
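
Reviewer note: the fetch-size paging that the docs describe and that the hunk above touches can be sketched in isolation as follows. This is only a minimal illustration, not the IoTDB implementation. The class `PagedCursorSketch`, its `fetcher` callback (standing in for `MManager.showTimeseries` / `getAllTimeseriesSchema`), and the use of plain strings instead of `RowRecord` are all assumptions made for the example, and it covers only the case where no user-supplied limit is present, so the cursor is always reset after a refill.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical stand-in for ShowTimeseriesDataSet: pages through rows fetchSize at a time. */
class PagedCursorSketch {

  // (offset, limit) -> one page of rows; assumed stand-in for the MManager call.
  private final BiFunction<Integer, Integer, List<String>> fetcher;
  private final int fetchSize;
  private int offset = 0;      // offset of the currently cached page
  private int index = 0;       // cursor into the cached page
  private List<String> result; // cached page

  PagedCursorSketch(BiFunction<Integer, Integer, List<String>> fetcher, int fetchSize) {
    this.fetcher = fetcher;
    this.fetchSize = fetchSize;
    this.result = fetcher.apply(offset, fetchSize); // load the first page eagerly
  }

  boolean hasNext() {
    if (index == result.size()) {
      // Cached page is exhausted: advance the offset by the fetch size and refill the cache.
      offset += fetchSize;
      result = fetcher.apply(offset, fetchSize);
      index = 0; // no user limit in this sketch, so the cursor always restarts
    }
    return index < result.size();
  }

  String next() {
    return result.get(index++);
  }

  public static void main(String[] args) {
    // Seven made-up series names served three at a time.
    List<String> all = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      all.add("root.sg.d1.s" + i);
    }
    BiFunction<Integer, Integer, List<String>> fetcher = (off, lim) -> {
      int from = Math.min(off, all.size());
      int to = Math.min(off + lim, all.size());
      return new ArrayList<>(all.subList(from, to));
    };
    PagedCursorSketch cursor = new PagedCursorSketch(fetcher, 3);
    while (cursor.hasNext()) {
      System.out.println(cursor.next());
    }
  }
}
```

Only one page is held in memory at any time, which is the point of the `fetch size` parameter; the cost is one extra (empty) fetch at the end to detect exhaustion.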
