contrueCT opened a new issue, #3053: URL: https://github.com/apache/hugegraph/issues/3053
### Bug Type

logic (logic design problem)

### Before submit

- [x] I have confirmed and searched that there are no similar problems in the existing [Issues](https://github.com/apache/hugegraph/issues) and [FAQ](https://hugegraph.apache.org/docs/guides/faq/)

### Environment

- Server Version: master / PR #2994 based branch
- Backend: HStore
- OS: GitHub Actions hstore CI

### Expected & Actual behavior

While investigating the hstore CI failure in #2994, we found an existing latent issue in the HStore range-index scan path.

For range-index queries with `limit`, `offset`, or paging, HugeGraph's upper layer assumes that backend range scan results are returned in global range-index-key order. It also assumes that the returned `PageState.position()` can be reused as a HugeGraph range cursor.

However, HStore's multi-node/tablet scan path can return entries in backend iterator order instead of globally sorted key order. The page state is also an internal storage cursor, not necessarily a HugeGraph range-index key.

As a result, range-index queries can return unstable ordering or skip valid entries when paging is involved.

One concrete failure exposed by #2994 was:

```java
graph.traversal().V().hasLabel("person")
     .has("birth", P.between(date2013, date2016))
     .limit(2)
     .toList();
```

The expected range-index order is `2013 -> 2014 -> 2015`, but the HStore scan returned entries like `2014 -> 2013 -> 2015`, so `limit(2)` selected the wrong first two entries.

Another paging-related failure showed that after the first page, the page position was an HStore internal cursor. Reusing it as the range scan start could skip valid range-index entries.

In #2994 we added a narrow workaround in `GraphIndexTransaction`: for HStore range-index queries whose visible result depends on limit, offset, or paging, the index layer reads the matched range-index entries, sorts them by range-index value, and slices them at the HugeGraph layer.
Unbounded range-index scans still use the original streaming path to avoid disturbing count, joint-index, and cleanup paths.

This workaround fixes the immediate user-visible correctness issue, but the lower-level contract is still unclear.

### Expected behavior

HStore range scans should have a clear and reliable contract:

- If HugeGraph range-index scan semantics require ordered results, HStore should return globally sorted entries across node/tablet iterators.
- `PageState.position()` should have a well-defined meaning. It should be clear whether it is a backend-internal cursor or a HugeGraph key cursor.
- Range-index paging should not skip valid entries or depend on accidental backend iterator order.

### Possible fix direction

A more complete fix should probably be handled in the HStore store-client / scan iterator layer:

- define whether `IdRangeQuery` results must be globally ordered by key;
- merge multiple node/tablet iterators by key order when serving ordered range scans;
- separate backend-internal page cursor semantics from HugeGraph range-key cursor semantics;
- add HStore-specific regression tests for:
  - range index + limit;
  - range index + offset;
  - range index + paging across multiple pages;
  - cross-node/tablet range scans;
  - count / joint-index / left-index cleanup paths to avoid regressions.

### Related context

This was exposed during #2994, but it does not seem to be caused by the query-condition refactoring itself. The PR only made the latent HStore issue visible in CI.
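
To make the "merge multiple node/tablet iterators by key order" direction more concrete, here is a minimal, self-contained sketch of a k-way merge. It is not HStore or HugeGraph code: `OrderedRangeMerge`, `mergeByKey`, `firstKeys`, and `tablet` are hypothetical names, plain integers stand in for range-index keys, and the sketch assumes each per-tablet iterator is already sorted by key on its own.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these names exist in HugeGraph or HStore.
public class OrderedRangeMerge {

    // One pending entry plus the per-tablet iterator it was read from.
    private static final class Head<K, V> {
        final Map.Entry<K, V> entry;
        final Iterator<Map.Entry<K, V>> source;

        Head(Map.Entry<K, V> entry, Iterator<Map.Entry<K, V>> source) {
            this.entry = entry;
            this.source = source;
        }
    }

    // Merges iterators into one globally key-ordered iterator.
    // Assumption: every source iterator is individually sorted by key.
    public static <K extends Comparable<K>, V> Iterator<Map.Entry<K, V>> mergeByKey(
            List<Iterator<Map.Entry<K, V>>> sources) {
        PriorityQueue<Head<K, V>> heap = new PriorityQueue<>(
                (a, b) -> a.entry.getKey().compareTo(b.entry.getKey()));
        // Seed the heap with the first entry of each non-empty source.
        for (Iterator<Map.Entry<K, V>> source : sources) {
            if (source.hasNext()) {
                heap.add(new Head<>(source.next(), source));
            }
        }
        return new Iterator<Map.Entry<K, V>>() {
            @Override
            public boolean hasNext() {
                return !heap.isEmpty();
            }

            @Override
            public Map.Entry<K, V> next() {
                Head<K, V> head = heap.poll();
                if (head == null) {
                    throw new NoSuchElementException();
                }
                // Refill from the source that just produced the smallest key.
                if (head.source.hasNext()) {
                    heap.add(new Head<>(head.source.next(), head.source));
                }
                return head.entry;
            }
        };
    }

    // Applies the limit only after merging, as the upper layer expects.
    public static <K extends Comparable<K>, V> List<K> firstKeys(
            List<Iterator<Map.Entry<K, V>>> sources, int limit) {
        List<K> keys = new ArrayList<>();
        Iterator<Map.Entry<K, V>> merged = mergeByKey(sources);
        while (merged.hasNext() && keys.size() < limit) {
            keys.add(merged.next().getKey());
        }
        return keys;
    }

    // Builds a stand-in for one tablet's scan result (years as index keys).
    public static Iterator<Map.Entry<Integer, String>> tablet(int... years) {
        List<Map.Entry<Integer, String>> entries = new ArrayList<>();
        for (int year : years) {
            entries.add(new SimpleEntry<>(year, "person-" + year));
        }
        return entries.iterator();
    }

    public static void main(String[] args) {
        // Reading the tablets back to back would yield 2014, 2013, 2015.
        List<Iterator<Map.Entry<Integer, String>>> tablets = new ArrayList<>();
        tablets.add(tablet(2014));
        tablets.add(tablet(2013, 2015));
        System.out.println(firstKeys(tablets, 2)); // prints [2013, 2014]
    }
}
```

The merge holds only one pending entry per source, so it stays streaming and would not need the read-everything-then-sort step of the current workaround. It does not address the page cursor question: a position derived from the merged stream would still have to be defined separately from each backend iterator's internal cursor.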
