imbajin commented on code in PR #3034:
URL: https://github.com/apache/hugegraph/pull/3034#discussion_r3281434544
##########
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/traversal/optimize/TraversalUtil.java:
##########
@@ -166,38 +170,219 @@ public static void trySetGraph(Step<?, ?> step,
HugeGraph graph) {
public static void extractHasContainer(HugeGraphStep<?, ?> newStep,
Traversal.Admin<?, ?> traversal) {
- Step<?, ?> step = newStep;
- do {
- step = step.getNextStep();
+ Step<?, ?> step = newStep.getNextStep();
+ while (step instanceof HasStep || step instanceof NoOpBarrierStep) {
+ Step<?, ?> nextStep = step.getNextStep();
if (step instanceof HasStep) {
HasContainerHolder holder = (HasContainerHolder) step;
- for (HasContainer has : holder.getHasContainers()) {
- if (!GraphStep.processHasContainerIds(newStep, has)) {
- newStep.addHasContainer(has);
- }
+ if (extractHasContainers(newStep, holder)) {
+ TraversalHelper.copyLabels(step, step.getPreviousStep(),
false);
+ traversal.removeStep(step);
}
- TraversalHelper.copyLabels(step, step.getPreviousStep(),
false);
- traversal.removeStep(step);
}
- } while (step instanceof HasStep || step instanceof NoOpBarrierStep);
+ step = nextStep;
+ }
}
public static void extractHasContainer(HugeVertexStep<?> newStep,
Traversal.Admin<?, ?> traversal) {
Step<?, ?> step = newStep;
do {
+ Step<?, ?> nextStep = step.getNextStep();
if (step instanceof HasStep) {
HasContainerHolder holder = (HasContainerHolder) step;
- for (HasContainer has : holder.getHasContainers()) {
- newStep.addHasContainer(has);
+ if (extractHasContainers(newStep, holder)) {
+ TraversalHelper.copyLabels(step, step.getPreviousStep(),
false);
+ traversal.removeStep(step);
}
- TraversalHelper.copyLabels(step, step.getPreviousStep(),
false);
- traversal.removeStep(step);
}
- step = step.getNextStep();
+ step = nextStep;
} while (step instanceof HasStep || step instanceof NoOpBarrierStep);
}
+ private static boolean extractHasContainers(HugeGraphStep<?, ?> newStep,
+ HasContainerHolder holder) {
+ HugeGraph graph = TraversalUtil.tryGetGraph(newStep);
+ List<HasContainer> hasContainers = new
ArrayList<>(holder.getHasContainers());
+ HugeType type = newStep.returnsVertex() ? HugeType.VERTEX :
HugeType.EDGE;
+ Set<Id> labelIds = collectLabelIds(graph, type,
newStep.getHasContainers(),
+ hasContainers);
+ for (HasContainer has : hasContainers) {
+ if (!canExtractHasContainer(graph, type, labelIds, has)) {
+ continue;
+ }
+ if (!GraphStep.processHasContainerIds(newStep, has)) {
+ newStep.addHasContainer(has);
+ }
+ holder.removeHasContainer(has);
Review Comment:
‼️ I think we should scope this PR back down before merging. The original
bug is a focused correctness issue: text range predicates such as `has("vp4",
P.lt(""))` should not be pushed into the backend query path. The latest
implementation now goes further and splits a mixed `HasStep`, pushes selected
containers down, removes them from the original `HasStep`, and adds new
label/index-availability logic in `TraversalUtil`.
```text
Original bug boundary
Text range predicate
|
v
wrongly pushed to backend
|
v
exception / mismatch with match()
```
```text
Latest implementation boundary
HasStep(vp4 < "", age = 1)
|
+-- push age = 1 into HugeGraphStep / HugeVertexStep
|
+-- remove age = 1 from the original HasStep
|
+-- rely on new label/index checks to prove backend can enforce it
```
That is a much broader optimizer change than the original fix, and once a
container is removed from the original `HasStep`, correctness depends on
`canExtractHasContainer()` exactly matching what the backend planner can really
execute. The new tests cover several cases, but this still expands the review
surface into composite-index matching, label context, paging/sysprop
interactions, and future index-planner behavior.
For this PR, I would prefer one of these smaller and safer shapes:
```text
Option 1: minimal correctness fix
- keep unsafe Text + range predicates in the traversal layer
- do not split mixed HasSteps
- avoid adding backend index-availability logic here
Option 2: low-risk partial prefiltering
- optionally push safe containers down as redundant backend pre-filters
- keep the original HasStep unchanged unless all containers are extractable
- correctness remains guarded by the full traversal-layer filter
```
```text
Safer partial prefiltering shape
Backend prefilter:
age = 1
Traversal filter still intact:
vp4 < "" AND age = 1
Result:
backend may reduce candidates, but traversal still owns correctness
```
The current split-and-remove optimization can still be valuable, but I think
it should be a separate follow-up PR with dedicated tests for composite-index
prefix matching, label-less queries, `~page` / paging semantics when some user
filters remain in traversal, and parity with the backend index planner. Keeping
this PR minimal would make the #2935 fix easier to reason about and safer to
merge.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]