This is an automated email from the ASF dual-hosted git repository. spmallette pushed a commit to branch jsprocess in repository https://gitbox.apache.org/repos/asf/tinkerpop.git
commit c35e817ad63ccaf72f61784fabdd3ea2e6a0bf52 Author: Stephen Mallette <[email protected]> AuthorDate: Tue Jun 30 10:56:34 2026 -0400 Add repeat() to gremlin-javascript Tiny Gremlin Implements repeat() with times()/until()/emit() modulators in the local in-process executor, including position-sensitive while vs. do-while semantics, emit side-output, and nested repeat() to any depth. Refactors the executor around an ExecutionContext (a lightweight analog of Java's TraversalParent) so branching steps run their child pipelines through runBranch/runProject/runRooted rather than bespoke escapes; path, addE, and repeat become parent-step registry entries. The loop is driven breadth-first over the whole frontier, so barrier steps such as order().by() inside the body see every traverser at that loop position and order globally, matching full Gremlin at any nesting depth. The loop-label form, the predicate form of until()/emit(), and mutation steps within the loop are rejected with clear messages. Assisted-by: Claude Code:claude-opus-4-8 --- CHANGELOG.asciidoc | 1 - docs/src/dev/provider/tiny-gremlin.asciidoc | 53 +++- docs/src/reference/gremlin-variants.asciidoc | 3 + docs/src/reference/the-traversal.asciidoc | 1 + .../lib/language/executor/ExecuteVisitor.ts | 36 +++ .../lib/process/local/LocalExecutor.ts | 278 ++++++++++++++++----- .../gremlin-javascript/lib/process/local/passes.ts | 1 + .../gremlin-javascript/lib/process/local/steps.ts | 78 +++++- .../gremlin-javascript/lib/process/local/types.ts | 41 +++ .../test/unit/tiny-gremlin-repeat-test.js | 170 +++++++++++++ .../gremlin/test/features/branch/Repeat.feature | 22 +- 11 files changed, 606 insertions(+), 78 deletions(-) diff --git a/CHANGELOG.asciidoc b/CHANGELOG.asciidoc index b5c70c2d21..963ad54f5c 100644 --- a/CHANGELOG.asciidoc +++ b/CHANGELOG.asciidoc @@ -130,7 +130,6 @@ image::https://raw.githubusercontent.com/apache/tinkerpop/master/docs/static/ima * Added the `gql-gremlin` module containing the TinkerGQL engine, a `MATCH`-pattern executor implementing a deliberate minimal subset of ISO GQL, usable by any graph provider. * Added `Graph.countVerticesByLabel(String)`, `Graph.countEdgesByLabel(String)`, and the `Graph.Index` nested interface as performance-hint extension points for the TinkerGQL engine. * Added GQL support to TinkerGraph with `gql-gremlin`, -* Added Tiny Gremlin local execution to `gremlin-javascript`: `traversal().with_(graph)` now executes a defined subset of Gremlin steps (navigation, filtering, value extraction, path tracking, range, ordering, and local mutation) against an in-memory `Graph` without a remote server. The `Graph` adjacency index (`outEdges`, `inEdges`) enables O(1) navigation. New `forEach(fn)` and `reduce(fn, seed)` terminal methods on `Traversal` support result processing with native JavaScript functions [...] [[release-4-0-0-beta-2]] === TinkerPop 4.0.0-beta.2 (April 1, 2026) diff --git a/docs/src/dev/provider/tiny-gremlin.asciidoc b/docs/src/dev/provider/tiny-gremlin.asciidoc index bffd26056e..1bbb310827 100644 --- a/docs/src/dev/provider/tiny-gremlin.asciidoc +++ b/docs/src/dev/provider/tiny-gremlin.asciidoc @@ -37,12 +37,12 @@ then navigates and filters it locally using familiar Gremlin syntax. A secondary entirely in process for lightweight analysis or testing, using `addV()`, `addE()`, and `property()`. Because the graph is small and entirely in memory, features that are expensive or architecture-dependent in a full -implementation, such as `repeat()`, distributed OLAP steps, aggregation barriers, and strategy-based optimizations, -are deliberately excluded. The result is a small, self-contained, predictable execution model. +implementation, such as distributed OLAP steps, aggregation barriers, and strategy-based optimizations, are +deliberately excluded. The result is a small, self-contained, predictable execution model. === Supported Steps -Tiny Gremlin supports exactly the 33 steps listed below, grouped by category. No other step may be used in a Tiny +Tiny Gremlin supports exactly the 37 steps listed below, grouped by category. No other step may be used in a Tiny Gremlin traversal; doing so results in an error before any graph data is accessed. [width="100%",options="header"] @@ -55,8 +55,9 @@ Gremlin traversal; doing so results in an error before any graph data is accesse |Path |`path()` |Range |`limit()`, `range()`, `skip()`, `tail()` |Ordering |`order()` +|Branch |`repeat()` |Mutation |`addV()`, `addE()`, `property()` -|Modulators |`from()`, `to()` +|Modulators |`from()`, `to()`, `times()`, `until()`, `emit()` |=== === Terminal Methods @@ -347,6 +348,50 @@ Emits elements at positions `[low, high)` in the stream, zero-indexed. * The `Scope` form is not supported and will raise an error. +=== repeat() + +*Gremlin Semantics reference:* <<repeat-step,repeat()>> + +Repeatedly applies a child traversal to each traverser, forming a loop. The loop is bounded by a `times()` or +`until()` modulator, and `emit()` may side-output traversers as they pass through the loop. The position of a +modulator relative to `repeat()` determines its semantics, matching full Gremlin: + +* `repeat(traversal).times(n)` runs the loop body `n` times. Declared after `repeat()`, the body always executes at + least once, so `times(0)` still runs the body once. Declared before, as in `times(n).repeat(traversal)`, the count + is checked first, so `times(0)` emits the input unchanged without running the body. +* `repeat(traversal).until(condition)` is do-while: the body runs and then the condition is tested. Reversing the + order to `until(condition).repeat(traversal)` is while: the condition is tested before the body. +* `emit()` side-outputs every traverser, and `emit(condition)` side-outputs only those satisfying the condition. + Declared after `repeat()` it emits traversers leaving the body, and declared before it emits them entering the body. + +A `condition` for `until()` or `emit()` is an anonymous filter traversal, satisfied when it produces at least one +result. The loop body and these conditions may themselves contain any supported step, including `order().by()` or +`path()`. + +The loop is driven breadth-first over the whole frontier: at each loop iteration every surviving traverser advances +in lockstep through one shared evaluation of the body. A barrier step such as `order().by()` inside the body +therefore sees all traversers at that loop position and orders them globally, matching full Gremlin rather than +ordering within the descendants of a single input. + +Nested `repeat()` is supported. A `repeat()` may appear inside another `repeat()`, whether in the loop body or in +an `until()` or `emit()` condition, to any depth. For example, +`repeat(__.out().repeat(__.in()).times(2)).until(__.hasId(10))` is valid, and the global barrier behaviour above is +preserved through the nesting, so `order().by()` within an inner `repeat()` still orders the whole frontier. + +*Tiny Gremlin limitations:* + +* The loop-label form (`repeat("label", traversal)`) is not supported because `as()`-style step labels and the + `loops("label")` reference are excluded from Tiny Gremlin. +* The predicate form (`until(P)` and `emit(P)`) is not supported. It corresponds to Java's `Predicate<Traverser>` + lambda overload rather than a value predicate, so the anonymous-traversal form (for example + `until(__.has("name", "ripple"))`) must be used instead. +* Mutation steps (`addV()`, `addE()`, `property()`) are not permitted within a `repeat()` body or its `until()` and + `emit()` conditions, since mutating the graph while looping over it has no well-defined semantics. This is enforced + at any nesting depth, including inside a nested `repeat()`. +* A `repeat()` with no `times()` or `until()` bound terminates only when its body stops producing traversers, as on + an acyclic graph. As in full Gremlin, an unbounded loop over cyclic data does not terminate, so a bound is the + caller's responsibility. + === skip() *Gremlin Semantics reference:* <<skip-step,skip()>> diff --git a/docs/src/reference/gremlin-variants.asciidoc b/docs/src/reference/gremlin-variants.asciidoc index 4256fe0a32..6e96938a45 100644 --- a/docs/src/reference/gremlin-variants.asciidoc +++ b/docs/src/reference/gremlin-variants.asciidoc @@ -2243,6 +2243,9 @@ const totalAge = await g.V().hasLabel('person').values('age').reduce((acc, age) // path() projects each element through by() const paths = await g.V().has('name', 'marko').out().path().by('name').toList(); + +// repeat() loops a child traversal, here two hops out from marko +const reached = await g.V().has('name', 'marko').repeat(__.out()).times(2).values('name').toList(); ---- ==== Working within the subset diff --git a/docs/src/reference/the-traversal.asciidoc b/docs/src/reference/the-traversal.asciidoc index 844bdfa8bd..c5804d2c65 100644 --- a/docs/src/reference/the-traversal.asciidoc +++ b/docs/src/reference/the-traversal.asciidoc @@ -6889,6 +6889,7 @@ graph analysis tasks. |Value Extraction |<<id-step,`id()`>>, <<label-step,`label()`>>, <<values-step,`values()`>>, <<valuemap-step,`valueMap()`>>, <<elementmap-step,`elementMap()`>>, <<value-step,`value()`>>, <<key-step,`key()`>> |Path |<<path-step,`path()`>> |Range |<<limit-step,`limit()`>>, <<range-step,`range()`>>, <<skip-step,`skip()`>>, <<tail-step,`tail()`>> +|Branching | <<repeat-step,`repeat()`>>, <<emit-step, `emit()`>>, <<times-step, `times()`>>, <<until-step, `until()`>> |Ordering |<<order-step,`order()`>> |Mutation |<<addvertex-step,`addV()`>>, <<addedge-step,`addE()`>>, <<property-step,`property()`>> |Modulators |<<from-step,`from()`>>, <<to-step,`to()`>> - in conjunction with `addE()`, not `path()`. diff --git a/gremlin-js/gremlin-javascript/lib/language/executor/ExecuteVisitor.ts b/gremlin-js/gremlin-javascript/lib/language/executor/ExecuteVisitor.ts index 37c5808196..84b77dc3e4 100644 --- a/gremlin-js/gremlin-javascript/lib/language/executor/ExecuteVisitor.ts +++ b/gremlin-js/gremlin-javascript/lib/language/executor/ExecuteVisitor.ts @@ -480,6 +480,42 @@ export class ExecuteVisitor { } } + // ── Repeat / loop steps ─────────────────────────────────────────────────── + // repeat() and its times()/until()/emit() modulators are pushed as separate + // descriptors here; the local executor folds them into a single repeat spec, + // tracking whether each modulator was declared before or after the repeat(). + + visitTraversalMethod_repeat_Traversal(ctx: any): void { + this.push('repeat', [this.visitNestedTraversalAsSubPipeline(ctx.nestedTraversal())]); + } + + // repeat(String, Traversal) — the loop-label form. Tiny Gremlin has no step labels, + // so this is left without a body so the executor can reject it with a clear message. + + visitTraversalMethod_times(ctx: any): void { + this.push('times', [this.recoverIntegerLiteral(ctx.integerLiteral())]); + } + + visitTraversalMethod_until_Traversal(ctx: any): void { + this.push('until', [this.visitNestedTraversalAsSubPipeline(ctx.nestedTraversal())]); + } + + visitTraversalMethod_until_Predicate(ctx: any): void { + this.push('until', [this.recoverPredicate(ctx.traversalPredicate())]); + } + + visitTraversalMethod_emit_Empty(_ctx: any): void { + this.push('emit', []); + } + + visitTraversalMethod_emit_Traversal(ctx: any): void { + this.push('emit', [this.visitNestedTraversalAsSubPipeline(ctx.nestedTraversal())]); + } + + visitTraversalMethod_emit_Predicate(ctx: any): void { + this.push('emit', [this.recoverPredicate(ctx.traversalPredicate())]); + } + // ── Mutation steps ──────────────────────────────────────────────────────── visitTraversalMethod_addV_Empty(ctx: any): void { diff --git a/gremlin-js/gremlin-javascript/lib/process/local/LocalExecutor.ts b/gremlin-js/gremlin-javascript/lib/process/local/LocalExecutor.ts index 059b1ddc17..a37f25ebff 100644 --- a/gremlin-js/gremlin-javascript/lib/process/local/LocalExecutor.ts +++ b/gremlin-js/gremlin-javascript/lib/process/local/LocalExecutor.ts @@ -17,7 +17,7 @@ * under the License. */ -import type { Arg, Pipeline, StepDescriptor } from './types.js'; +import type { Arg, Pipeline, StepDescriptor, ExecutionContext, RepeatSpec } from './types.js'; import { defaultPasses, type OptimizationPass } from './passes.js'; import { LocalExecutionError } from './LocalExecutionError.js'; import { Graph, Vertex, Edge } from '../../structure/graph.js'; @@ -26,13 +26,21 @@ import { stepOut, stepIn, stepBoth, stepOutE, stepInE, stepBothE, stepOutV, stepInV, stepOtherV, stepHas, stepHasId, stepHasLabel, stepHasNot, stepId, stepLabel, stepValue, stepKey, stepValues, stepValueMap, stepElementMap, - stepPath, + stepPath, stepRepeat, stepLimit, stepRange, stepSkip, stepTail, stepOrder, stepAddV, stepAddEWithModulators, stepProperty, } from './steps.js'; type StepFn = (source: Iterable<StreamItem>, args: Arg[], graph: Graph, trackPaths: boolean) => Generator<StreamItem>; +/** + * A "parent" step that runs nested child pipelines (addE from/to, path projections, + * repeat bodies). It receives the ExecutionContext rather than reaching back into the + * executor through a bespoke escape, which is what lets new branching steps be added + * as ordinary registry entries instead of special cases in applyStep. + */ +type ParentStepFn = (source: Iterable<StreamItem>, args: Arg[], ctx: ExecutionContext) => Iterable<StreamItem>; + /** Steps allowed inside a by(Traversal) projection on path() — single value extraction only. */ const VALUE_EXTRACTION_STEPS = new Set<string>([ 'id', 'label', 'key', 'value', 'values', 'valueMap', 'elementMap', @@ -52,21 +60,45 @@ export class LocalExecutor { execute(pipeline: Pipeline, graph: Graph): AsyncGenerator<any> { const optimized = this.passes.reduce((p, pass) => pass(p), pipeline); const folded = this.foldModulators(optimized); - const trackPaths = folded.some(s => s.name === 'path' || s.name === 'otherV'); - const stream = this.buildChain(folded, graph, trackPaths); + const trackPaths = this.needsPaths(folded); + const ctx = this.makeContext(graph, trackPaths); + const stream = this.buildChain(folded, ctx); return this.toAsync(stream, trackPaths); } /** - * Folds 'from' and 'to' modulator steps that follow an 'addE' step into a - * composite descriptor so the addE executor has all modulator args available. + * Folds modulator steps into the steps they modify, then recurses into nested child + * pipelines (repeat bodies, until/emit conditions, addE from/to, path projections) so + * an inner order().by() or repeat() is folded too. */ private foldModulators(pipeline: Pipeline): Pipeline { + const folded = this.foldLevel(pipeline); + for (const step of folded) { + if (step.name === 'repeat') { + const spec = step.args[0] as unknown as RepeatSpec; + spec.body = this.foldModulators(spec.body); + if (Array.isArray(spec.until)) spec.until = this.foldModulators(spec.until as Pipeline); + if (Array.isArray(spec.emit)) spec.emit = this.foldModulators(spec.emit as Pipeline); + } else { + for (let a = 0; a < step.args.length; a++) { + const arg = step.args[a]; + if (isPipelineArg(arg)) { + step.args[a] = this.foldModulators(arg as unknown as Pipeline) as unknown as Arg; + } + } + } + } + return folded; + } + + /** Folds the modulators at a single pipeline level (no recursion into child pipelines). */ + private foldLevel(pipeline: Pipeline): Pipeline { const result: StepDescriptor[] = []; let i = 0; while (i < pipeline.length) { const step = pipeline[i]; if (step.name === 'addE') { + // Fold the following from()/to() modulators into the addE descriptor. let fromArgs: Arg[] | null = null; let toArgs: Arg[] | null = null; let j = i + 1; @@ -78,7 +110,7 @@ export class LocalExecutor { result.push({ name: 'addE', args: [step.args[0], fromArgs as unknown as Arg, toArgs as unknown as Arg] }); i = j; } else if (step.name === 'order') { - // Fold a single following by() modulator into the order step args + // Fold a single following by() modulator into the order step args. const next = pipeline[i + 1]; if (next?.name === 'by') { if (Array.isArray(next.args[0])) { @@ -112,6 +144,26 @@ export class LocalExecutor { } result.push({ name: 'path', args: projections }); i = j; + } else if (step.name === 'repeat') { + // repeat() with only trailing times()/until()/emit() modulators. + let m = i + 1; + while (m < pipeline.length && isRepeatModulator(pipeline[m].name)) m++; + result.push({ name: 'repeat', args: [buildRepeatSpec(step, [], pipeline.slice(i + 1, m)) as unknown as Arg] }); + i = m; + } else if (isRepeatModulator(step.name)) { + // A leading run of times()/until()/emit() — these declare modulators on the + // repeat() that must immediately follow (e.g. until(c).repeat(b), emit().repeat(b)). + let k = i; + while (k < pipeline.length && isRepeatModulator(pipeline[k].name)) k++; + if (k >= pipeline.length || pipeline[k].name !== 'repeat') { + throw new LocalExecutionError(`'${step.name}' is only supported as a repeat() modulator in Tiny Gremlin`); + } + const leading = pipeline.slice(i, k); + let m = k + 1; + while (m < pipeline.length && isRepeatModulator(pipeline[m].name)) m++; + const trailing = pipeline.slice(k + 1, m); + result.push({ name: 'repeat', args: [buildRepeatSpec(pipeline[k], leading, trailing) as unknown as Arg] }); + i = m; } else { result.push(step); i++; @@ -120,79 +172,67 @@ export class LocalExecutor { return result; } - private buildChain(pipeline: Pipeline, graph: Graph, trackPaths: boolean): Iterable<StreamItem> { - if (pipeline.length === 0) return []; + /** True when paths must be tracked: a path()/otherV() anywhere top-level or inside a repeat body. */ + private needsPaths(pipeline: Pipeline): boolean { + for (const step of pipeline) { + if (step.name === 'path' || step.name === 'otherV') return true; + if (step.name === 'repeat' && this.needsPaths((step.args[0] as unknown as RepeatSpec).body)) return true; + } + return false; + } - const [source, ...rest] = pipeline; - let stream: Iterable<StreamItem> = this.applySourceStep(source, graph, trackPaths); + /** Builds the ExecutionContext handed to parent steps for running their child pipelines. */ + private makeContext(graph: Graph, trackPaths: boolean): ExecutionContext { + const ctx: ExecutionContext = { + graph, + trackPaths, + runBranch: (child, src) => { + let stream: Iterable<StreamItem> = src; + for (const step of child) stream = this.applyStep(step, stream, ctx); + return stream; + }, + runProject: (child, object) => { + // child is a single value-extraction step (validated during folding). + const probe: ExecutionContext = { ...ctx, trackPaths: false }; + for (const out of this.applyStep(child[0], [wrap(object, [], false)], probe)) { + return getValue(out, false); + } + return NON_PRODUCTIVE; // non-productive by(Traversal) filters the path traverser + }, + runRooted: (child) => { + const probe: ExecutionContext = { ...ctx, trackPaths: false }; + for (const item of this.buildChain(child, probe)) return item; + return null; + }, + }; + return ctx; + } - for (const step of rest) { - stream = this.applyStep(step, stream, graph, trackPaths); - } + private buildChain(pipeline: Pipeline, ctx: ExecutionContext): Iterable<StreamItem> { + if (pipeline.length === 0) return []; + const [source, ...rest] = pipeline; + let stream: Iterable<StreamItem> = this.applySourceStep(source, ctx); + for (const step of rest) stream = this.applyStep(step, stream, ctx); return stream; } - private applySourceStep(step: StepDescriptor, graph: Graph, trackPaths: boolean): Iterable<StreamItem> { + private applySourceStep(step: StepDescriptor, ctx: ExecutionContext): Iterable<StreamItem> { switch (step.name) { - case 'V': return sourceV(step.args, graph, trackPaths); - case 'E': return sourceE(step.args, graph, trackPaths); - case 'addV': return stepAddV([null], step.args, graph, trackPaths); - case 'addE': return this.applyAddE([null], step, graph, trackPaths); + case 'V': return sourceV(step.args, ctx.graph, ctx.trackPaths); + case 'E': return sourceE(step.args, ctx.graph, ctx.trackPaths); + case 'addV': return stepAddV([null], step.args, ctx.graph, ctx.trackPaths); + case 'addE': return PARENT_STEPS.get('addE')!([null], step.args, ctx); default: throw new LocalExecutionError(`'${step.name}' cannot be used as a source step`); } } - private applyStep(step: StepDescriptor, source: Iterable<StreamItem>, graph: Graph, trackPaths: boolean): Iterable<StreamItem> { - if (step.name === 'addE') return this.applyAddE(source, step, graph, trackPaths); - if (step.name === 'path') return this.applyPath(source, step, graph, trackPaths); + private applyStep(step: StepDescriptor, source: Iterable<StreamItem>, ctx: ExecutionContext): Iterable<StreamItem> { + const parent = PARENT_STEPS.get(step.name); + if (parent) return parent(source, step.args, ctx); const fn = STEP_REGISTRY.get(step.name); if (!fn) throw new LocalExecutionError(`No implementation for step '${step.name}'`); - return fn(source, step.args, graph, trackPaths); - } - - /** - * Executes path() with its folded by() projections. Provides stepPath a callback that - * runs a single value-extraction sub-step against one path element, taking its first - * result (e.g. values('name') yields the first value). - */ - private applyPath( - source: Iterable<StreamItem>, - step: StepDescriptor, - graph: Graph, - trackPaths: boolean, - ): Iterable<StreamItem> { - const runTraversal = (sub: Pipeline, object: any): any => { - const sub0 = sub[0]; - const fn = STEP_REGISTRY.get(sub0.name); - if (!fn) throw new LocalExecutionError(`No implementation for step '${sub0.name}'`); - for (const out of fn([wrap(object, [], false)], sub0.args, graph, false)) { - return getValue(out, false); - } - return NON_PRODUCTIVE; // non-productive by(Traversal) filters the path traverser - }; - return stepPath(source, step.args, trackPaths, runTraversal); - } - - private applyAddE( - source: Iterable<StreamItem>, - step: StepDescriptor, - graph: Graph, - trackPaths: boolean, - ): Iterable<StreamItem> { - const label = step.args[0] as string; - const fromPipeline = step.args[1] as Arg[] | null; - const toPipeline = step.args[2] as Arg[] | null; - return stepAddEWithModulators( - source, label, fromPipeline, toPipeline, graph, trackPaths, - (subPipeline, g) => this.executeSubPipeline(subPipeline as unknown as Pipeline, g), - ); - } - - private executeSubPipeline(pipeline: Pipeline, graph: Graph): any { - const stream = this.buildChain(pipeline, graph, false); - for (const item of stream) return item; - return null; + return fn(source, step.args, ctx.graph, ctx.trackPaths); } private async *toAsync(stream: Iterable<StreamItem>, trackPaths: boolean): AsyncGenerator<any> { @@ -205,6 +245,108 @@ export class LocalExecutor { } } +// ── Modulator folding helpers ─────────────────────────────────────────────────── + +function isRepeatModulator(name: string): boolean { + return name === 'times' || name === 'until' || name === 'emit'; +} + +/** True when an argument is itself a pipeline (a non-empty array of step descriptors). */ +function isPipelineArg(arg: unknown): boolean { + return Array.isArray(arg) && arg.length > 0 && arg[0] !== null + && typeof arg[0] === 'object' && 'name' in (arg[0] as object); +} + +/** Mutation steps. Graph mutation inside a loop is undefined, so these are blocked within repeat(). */ +const MUTATION_STEPS = new Set<string>(['addV', 'addE', 'property']); + +/** + * Rejects mutation steps anywhere within a repeat() body or its until()/emit() conditions, + * recursing into nested child pipelines (including nested repeats), since mutating the graph + * while looping over it has no well-defined semantics. + */ +function assertNoMutationSteps(pipeline: Pipeline, context: string): void { + for (const step of pipeline) { + if (MUTATION_STEPS.has(step.name)) { + throw new LocalExecutionError( + `Mutation step '${step.name}()' is not supported inside ${context} in Tiny Gremlin`, + ); + } + for (const arg of step.args) { + if (isPipelineArg(arg)) assertNoMutationSteps(arg as unknown as Pipeline, context); + } + } +} + +/** Folds a repeat() step plus its leading/trailing modulators into a RepeatSpec. */ +function buildRepeatSpec(repeatStep: StepDescriptor, leading: StepDescriptor[], trailing: StepDescriptor[]): RepeatSpec { + const body = repeatStep.args[0]; + if (!isPipelineArg(body)) { + throw new LocalExecutionError('repeat() with a loop label is not supported in Tiny Gremlin'); + } + assertNoMutationSteps(body as unknown as Pipeline, 'a repeat() body'); + const spec: RepeatSpec = { + body: body as unknown as Pipeline, + until: null, untilFirst: false, + emit: null, emitPresent: false, emitFirst: false, + times: null, timesFirst: false, + }; + for (const mod of leading) applyRepeatModulator(spec, mod, true); + for (const mod of trailing) applyRepeatModulator(spec, mod, false); + return spec; +} + +function applyRepeatModulator(spec: RepeatSpec, mod: StepDescriptor, declaredFirst: boolean): void { + if (mod.name === 'times') { + spec.times = mod.args[0] as number; + spec.timesFirst = declaredFirst; + } else if (mod.name === 'until') { + spec.until = requireTraversalCondition('until', mod.args[0]); + spec.untilFirst = declaredFirst; + } else if (mod.name === 'emit') { + spec.emitPresent = true; + spec.emit = mod.args.length ? requireTraversalCondition('emit', mod.args[0]) : null; + spec.emitFirst = declaredFirst; + } +} + +/** + * Validates an until()/emit() condition. Only the anonymous-traversal form is supported. + * The predicate form (until(P)/emit(P)) maps to Java's Predicate<Traverser> lambda overload, + * which Tiny Gremlin cannot express faithfully, so it is rejected with a clear message. + */ +function requireTraversalCondition(stepName: string, arg: Arg | undefined): Pipeline | null { + if (arg == null) return null; + if (!isPipelineArg(arg)) { + throw new LocalExecutionError( + `${stepName}(Predicate) is not supported in Tiny Gremlin; use the traversal form, ` + + `e.g. ${stepName}(__.has(...))`, + ); + } + const pipeline = arg as unknown as Pipeline; + assertNoMutationSteps(pipeline, `an ${stepName}() condition`); + return pipeline; +} + +// ── Parent step registry ──────────────────────────────────────────────────────── +// Steps that run nested child pipelines. They receive the ExecutionContext and use +// its runBranch/runProject/runRooted helpers, so adding a new branching step is a +// registry entry rather than a special case in applyStep. + +const PARENT_STEPS = new Map<string, ParentStepFn>([ + ['path', (source, args, ctx) => stepPath(source, args, ctx.trackPaths, (sub, obj) => ctx.runProject(sub, obj))], + ['addE', (source, args, ctx) => stepAddEWithModulators( + source, + args[0] as string, + args[1] as unknown as Arg[] | null, + args[2] as unknown as Arg[] | null, + ctx.graph, + ctx.trackPaths, + (sub) => ctx.runRooted(sub as unknown as Pipeline), + )], + ['repeat', (source, args, ctx) => stepRepeat(source, args[0] as unknown as RepeatSpec, ctx)], +]); + // ── Result copying ──────────────────────────────────────────────────────────── /** Shallow-copies Vertex/Edge/Path so callers can mutate results without corrupting the graph. */ @@ -291,7 +433,7 @@ const STEP_REGISTRY = new Map<string, StepFn>([ ['values', stepValues], ['valueMap', stepValueMap], ['elementMap', stepElementMap], - // 'path' is dispatched via applyPath (needs the registry for by(Traversal) projections) + // 'path', 'addE' and 'repeat' are parent steps dispatched via PARENT_STEPS (they run child pipelines) ['limit', stepLimit], ['range', stepRange], ['skip', stepSkip], diff --git a/gremlin-js/gremlin-javascript/lib/process/local/passes.ts b/gremlin-js/gremlin-javascript/lib/process/local/passes.ts index ce7635263a..be4088837b 100644 --- a/gremlin-js/gremlin-javascript/lib/process/local/passes.ts +++ b/gremlin-js/gremlin-javascript/lib/process/local/passes.ts @@ -31,6 +31,7 @@ const SUPPORTED_STEPS = new Set<string>([ 'path', 'limit', 'range', 'skip', 'tail', 'order', 'by', + 'repeat', 'times', 'until', 'emit', 'addV', 'addE', 'property', 'from', 'to', ]); diff --git a/gremlin-js/gremlin-javascript/lib/process/local/steps.ts b/gremlin-js/gremlin-javascript/lib/process/local/steps.ts index ba2b04a5a0..164accf753 100644 --- a/gremlin-js/gremlin-javascript/lib/process/local/steps.ts +++ b/gremlin-js/gremlin-javascript/lib/process/local/steps.ts @@ -19,7 +19,7 @@ import { P, TextP, t, direction } from '../traversal.js'; import { Graph, Vertex, Edge, VertexProperty, Property, Path } from '../../structure/graph.js'; -import type { Arg, Pipeline } from './types.js'; +import type { Arg, Pipeline, ExecutionContext, RepeatSpec } from './types.js'; // ── Traverser helpers ───────────────────────────────────────────────────────── @@ -766,3 +766,79 @@ export function* stepProperty( yield item; } } + +// ── Repeat step ─────────────────────────────────────────────────────────────── + +/** + * Executes repeat() with its folded times()/until()/emit() modulators against a + * stream. The loop is driven breadth-first over the whole frontier: at each loop + * depth, every surviving traverser advances in lockstep through one shared run of + * the body. until() is the exit condition (checked before the body when declared + * first — while — or after it otherwise — do-while), times() is a loop-count exit, + * and emit() is a side output that does not stop a traverser from continuing to loop. + * + * Driving the body over the entire frontier at once (rather than per input + * traverser) is what gives barrier steps inside the body — order().by(), and the + * same for a nested repeat() — their global semantics: order() sees every traverser + * at that loop position, not just the descendants of a single start. Because the + * frontier only ever holds traversers that have survived exactly `loops` iterations, + * the loop count is uniform across the frontier and the per-traverser times()/until() + * exits remain exact. + * + * Child pipelines (the body and any until()/emit() traversal conditions) are run + * through the ExecutionContext, so a repeat() body may itself contain order().by(), + * path() projections, or a nested repeat(). + */ +export function* stepRepeat( + source: Iterable<StreamItem>, + spec: RepeatSpec, + ctx: ExecutionContext, +): Generator<StreamItem> { + // until()/emit() conditions are always filter traversals here; the predicate form + // is rejected during folding (see requireTraversalCondition in LocalExecutor). + const conditionHolds = (cond: Pipeline, item: StreamItem): boolean => { + for (const _out of ctx.runBranch(cond, [item])) return true; // filter traversal: any output => true + return false; + }; + + const exitBeforeBody = (item: StreamItem, loops: number): boolean => { + if (spec.times != null && spec.timesFirst && loops >= spec.times) return true; + if (spec.untilFirst && spec.until != null && conditionHolds(spec.until, item)) return true; + return false; + }; + const exitAfterBody = (item: StreamItem, loops: number): boolean => { + if (spec.times != null && !spec.timesFirst && loops >= spec.times) return true; + if (!spec.untilFirst && spec.until != null && conditionHolds(spec.until, item)) return true; + return false; + }; + const shouldEmit = (item: StreamItem): boolean => { + if (!spec.emitPresent) return false; + if (spec.emit == null) return true; // bare emit() — emit every traverser + return conditionHolds(spec.emit, item); + }; + + let frontier: StreamItem[] = [...source]; + let loops = 0; + while (frontier.length > 0) { + // While-phase: traversers that hit a before-body exit leave the loop now; + // the rest (optionally emitted ahead of the body) form the body's input. + const entering: StreamItem[] = []; + for (const item of frontier) { + if (exitBeforeBody(item, loops)) { yield item; continue; } + if (spec.emitFirst && shouldEmit(item)) yield item; + entering.push(item); + } + if (entering.length === 0) break; + + // Run the body over the entire frontier in one pass so barriers see it all. + const nextLoops = loops + 1; + const nextFrontier: StreamItem[] = []; + for (const out of ctx.runBranch(spec.body, entering)) { + if (exitAfterBody(out, nextLoops)) { yield out; continue; } + if (!spec.emitFirst && shouldEmit(out)) yield out; + nextFrontier.push(out); + } + frontier = nextFrontier; + loops = nextLoops; + } +} diff --git a/gremlin-js/gremlin-javascript/lib/process/local/types.ts b/gremlin-js/gremlin-javascript/lib/process/local/types.ts index a081f017d7..d1aee1bb83 100644 --- a/gremlin-js/gremlin-javascript/lib/process/local/types.ts +++ b/gremlin-js/gremlin-javascript/lib/process/local/types.ts @@ -18,6 +18,7 @@ */ import { P, TextP, EnumValue } from '../traversal.js'; +import type { Graph } from '../../structure/graph.js'; /** * A single element flowing through a Tiny Gremlin local pipeline. @@ -55,3 +56,43 @@ export interface StepDescriptor { * traversal or anonymous sub-traversal. */ export type Pipeline = StepDescriptor[]; + +/** + * The capabilities a "parent" step (one that runs nested child pipelines) needs + * from the executor. This is Tiny Gremlin's lightweight analog of Java's + * TraversalParent: rather than each parent step reaching back into the executor + * through bespoke escapes, the executor hands every parent step a context with + * three child-execution modes. + */ +export interface ExecutionContext { + graph: Graph; + trackPaths: boolean; + /** Apply a child pipeline as mid-traversal steps over an existing stream (flatMap per traverser). */ + runBranch(child: Pipeline, source: Iterable<any>): Iterable<any>; + /** Run a single value-extraction child against one object, returning its first result or NON_PRODUCTIVE. */ + runProject(child: Pipeline, object: any): any; + /** Run a child as a complete source-rooted pipeline (e.g. addE from/to), returning the first result. */ + runRooted(child: Pipeline): any; +} + +/** + * The folded form of a repeat() step and its times()/until()/emit() modulators. + * `untilFirst`/`emitFirst` record whether the modulator was declared before the + * repeat() (while/emit-before) or after it (do-while/emit-after). A null `emit` + * with `emitPresent` true is a bare emit() that always emits. + */ +export interface RepeatSpec { + body: Pipeline; + until: Pipeline | null; + untilFirst: boolean; + emit: Pipeline | null; + emitPresent: boolean; + emitFirst: boolean; + times: number | null; + /** + * Whether times() was declared before the repeat(). Like until(), position decides + * while vs. do-while: times(0).repeat(b) skips the body entirely (identity), while + * repeat(b).times(0) still runs the body once. + */ + timesFirst: boolean; +} diff --git a/gremlin-js/gremlin-javascript/test/unit/tiny-gremlin-repeat-test.js b/gremlin-js/gremlin-javascript/test/unit/tiny-gremlin-repeat-test.js new file mode 100644 index 0000000000..1708887b34 --- /dev/null +++ b/gremlin-js/gremlin-javascript/test/unit/tiny-gremlin-repeat-test.js @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import { assert } from 'chai'; +import anon from '../../lib/process/anonymous-traversal.js'; +import { statics as __ } from '../../lib/process/graph-traversal.js'; +import { t, P } from '../../lib/process/traversal.js'; +import { buildModernGraph } from '../cucumber/local-fixtures.js'; + +/** + * Unit coverage for repeat() and its times()/until()/emit() modulators in the + * Tiny Gremlin local executor, including modulator position (before vs. after the + * repeat), emit side-output semantics, and folding of modulators inside the loop body. + * + * Modern graph out-edges: marko(1)->{vadas(2),josh(4),lop(3)}, josh(4)->{ripple(5),lop(3)}, + * peter(6)->{lop(3)}; vadas/lop/ripple have no out-edges. + */ +describe('Tiny Gremlin repeat()', () => { + let g; + const markoId = 1; + + beforeEach(() => { + g = anon.traversal().with_(buildModernGraph().graph); + }); + + it('times(n) after repeat loops the body n times', async () => { + const names = await g.V(markoId).repeat(__.out()).times(2).values('name').toList(); + // marko -> {vadas,josh,lop} -> only josh has out -> {ripple,lop} + assert.sameMembers(names, ['lop', 'ripple']); + }); + + it('times(n) before repeat has the same loop-count semantics', async () => { + const names = await g.V(markoId).times(2).repeat(__.out()).values('name').toList(); + assert.sameMembers(names, ['lop', 'ripple']); + }); + + it('times(0) after repeat still runs the body once (do-while)', async () => { + const names = await g.V(markoId).repeat(__.out('created')).times(0).values('name').toList(); + assert.deepEqual(names, ['lop']); + }); + + it('times(0) before repeat skips the body entirely (identity)', async () => { + const names = await g.V(markoId).times(0).repeat(__.out('created')).values('name').toList(); + assert.deepEqual(names, ['marko']); + }); + + it('emit() after repeat side-outputs each post-body traverser', async () => { + const names = await g.V(markoId).repeat(__.out()).times(2).emit().values('name').toList(); + // 1-hop set emitted after body 1 (vadas,josh,lop) plus the times-bounded 2-hop exit (ripple,lop) + assert.sameMembers(names, ['vadas', 'josh', 'lop', 'ripple', 'lop']); + }); + + it('until(traversal) after repeat is do-while (exit when the body output matches)', async () => { + const names = await g.V(markoId).repeat(__.out()).until(__.has('name', 'ripple')).values('name').toList(); + // only the marko->josh->ripple branch reaches a vertex named ripple + assert.deepEqual(names, ['ripple']); + }); + + it('until(traversal) before repeat is while (condition tested before the body)', async () => { + // lop is already software, so while-semantics yields it without ever running out() + const lopId = 3; + const fromLop = await g.V(lopId).until(__.hasLabel('software')).repeat(__.out()).values('name').toList(); + assert.deepEqual(fromLop, ['lop']); + // from marko (not software) the loop advances to the first software vertex on each path + const fromMarko = await g.V(markoId).until(__.hasLabel('software')).repeat(__.out()).values('name').toList(); + assert.sameMembers(fromMarko, ['lop', 'ripple', 'lop']); + }); + + it('emit(traversal) before repeat side-outputs matching traversers ahead of the body', async () => { + const names = await g.V(markoId).emit(__.has(t.label, 'person')).repeat(__.out()).values('name').toList(); + // person vertices reached while looping out(): marko, then vadas and josh + assert.sameMembers(names, ['marko', 'vadas', 'josh']); + }); + + it('supports a nested repeat() inside the loop body', async () => { + // body is a step followed by a bounded inner repeat (out().repeat(out()).times(1)), + // i.e. two hops expressed as embedded loops + const names = await g.V(markoId).repeat(__.out().repeat(__.out()).times(1)).times(1).values('name').toList(); + assert.sameMembers(names, ['lop', 'ripple']); + }); + + it('folds modulators inside the repeat body (order().by())', async () => { + // order().by('name') inside the body proves the body sub-pipeline is folded too; + // an unfolded body would fail on the bare by() step. + const names = await g.V(markoId).repeat(__.out().order().by('name')).times(2).values('name').toList(); + assert.sameMembers(names, ['lop', 'ripple']); + }); + + it('order() inside repeat() barriers globally over the whole loop frontier', async () => { + // The body runs over the entire frontier at once, so order() sorts all of g.V().both() + // together (global), not within each start vertex's neighbours (which would be unsorted + // once concatenated). 12 both()-endpoints, globally ordered by name. + const names = await g.V().repeat(__.both().order().by('name')).times(1).values('name').toList(); + assert.deepEqual(names, [ + 'josh', 'josh', 'josh', 'lop', 'lop', 'lop', 'marko', 'marko', 'marko', 'peter', 'ripple', 'vadas', + ]); + }); + + it('preserves global order() semantics through a nested repeat()', async () => { + // Same globalness, with the order() buried one repeat() deeper: + // repeat(both().repeat(order().by(name)).times(1)).times(1). + const names = await g.V() + .repeat(__.both().repeat(__.order().by('name')).times(1)).times(1) + .values('name').toList(); + assert.deepEqual(names, [ + 'josh', 'josh', 'josh', 'lop', 'lop', 'lop', 'marko', 'marko', 'marko', 'peter', 'ripple', 'vadas', + ]); + }); + + it('rejects repeat() with a loop label (no step labels in Tiny Gremlin)', async () => { + try { + await g.V(markoId).repeat('a', __.out()).times(2).values('name').toList(); + assert.fail('expected a LocalExecutionError for a labeled repeat()'); + } catch (err) { + assert.match(err.message, /loop label is not supported/); + } + }); + + it('rejects the predicate form of until() (only the traversal form is supported)', async () => { + try { + await g.V(markoId).repeat(__.out()).until(P.gt(100)).values('name').toList(); + assert.fail('expected a LocalExecutionError for until(P)'); + } catch (err) { + assert.match(err.message, /until\(Predicate\) is not supported/); + } + }); + + it('rejects a mutation step inside a repeat() body', async () => { + try { + await g.V(markoId).repeat(__.addV('x')).times(1).toList(); + assert.fail('expected a LocalExecutionError for addV() inside repeat()'); + } catch (err) { + assert.match(err.message, /Mutation step 'addV\(\)' is not supported inside a repeat\(\) body/); + } + }); + + it('rejects a mutation step inside an until() condition', async () => { + try { + await g.V(markoId).repeat(__.out()).until(__.property('k', 'v')).toList(); + assert.fail('expected a LocalExecutionError for property() inside until()'); + } catch (err) { + assert.match(err.message, /Mutation step 'property\(\)' is not supported inside an until\(\) condition/); + } + }); + + it('rejects a mutation step nested inside an inner repeat() body', async () => { + try { + await g.V(markoId).repeat(__.repeat(__.addE('x')).times(1)).times(1).toList(); + assert.fail('expected a LocalExecutionError for addE() nested in repeat()'); + } catch (err) { + assert.match(err.message, /Mutation step 'addE\(\)' is not supported inside a repeat\(\) body/); + } + }); +}); diff --git a/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Repeat.feature b/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Repeat.feature index 60c17a6008..a5652a312b 100644 --- a/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Repeat.feature +++ b/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Repeat.feature @@ -18,6 +18,7 @@ @StepClassBranch @StepRepeat Feature: Step - repeat() + @TinyGremlin Scenario: g_V_repeatXoutX_timesX2X_emit_path Given the modern graph And the traversal of @@ -36,6 +37,7 @@ Feature: Step - repeat() | p[v[josh],v[lop]] | | p[v[peter],v[lop]] | + @TinyGremlin Scenario: g_V_repeatXoutX_timesX2X_repeatXinX_timesX2X_name Given the modern graph And the traversal of @@ -48,7 +50,7 @@ Feature: Step - repeat() | marko | | marko | - @GraphComputerVerificationReferenceOnly + @GraphComputerVerificationReferenceOnly @TinyGremlin Scenario: g_V_repeatXoutE_inVX_timesX2X_path_by_name_by_label Given the modern graph And the traversal of @@ -61,6 +63,7 @@ Feature: Step - repeat() | p[marko,knows,josh,created,lop] | | p[marko,knows,josh,created,ripple] | + @TinyGremlin Scenario: g_V_repeatXoutX_timesX2X Given the modern graph And the traversal of @@ -73,6 +76,7 @@ Feature: Step - repeat() | v[ripple] | | v[lop] | + @TinyGremlin Scenario: g_V_repeatXoutX_timesX2X_emit Given the modern graph And the traversal of @@ -88,6 +92,7 @@ Feature: Step - repeat() | v[vadas] | And the result should have a count of 8 + @TinyGremlin Scenario: g_VX1X_timesX2X_repeatXoutX_name Given the modern graph And using the parameter vid1 defined as "v[marko].id" @@ -101,6 +106,7 @@ Feature: Step - repeat() | ripple | | lop | + @TinyGremlin Scenario: g_V_emit_timesX2X_repeatXoutX_path Given the modern graph And the traversal of @@ -125,6 +131,7 @@ Feature: Step - repeat() | p[v[peter]] | | p[v[peter],v[lop]] | + @TinyGremlin Scenario: g_V_emit_repeatXoutX_timesX2X_path Given the modern graph And the traversal of @@ -149,6 +156,7 @@ Feature: Step - repeat() | p[v[peter]] | | p[v[peter],v[lop]] | + @TinyGremlin Scenario: g_VX1X_emitXhasXlabel_personXX_repeatXoutX_name Given the modern graph And using the parameter vid1 defined as "v[marko].id" @@ -224,7 +232,7 @@ Feature: Step - repeat() | result | | p[marko,knows,josh,created,ripple] | - @GraphComputerVerificationReferenceOnly + @GraphComputerVerificationReferenceOnly @TinyGremlin Scenario: g_V_hasXloop_name_loopX_repeatXinX_timesX5X_path_by_name Given the sink graph And the traversal of @@ -236,7 +244,7 @@ Feature: Step - repeat() | result | | p[loop,loop,loop,loop,loop,loop] | - @GraphComputerVerificationReferenceOnly + @GraphComputerVerificationReferenceOnly @TinyGremlin Scenario: g_V_repeatXout_repeatXout_order_byXname_descXX_timesX1XX_timesX1X_limitX1X_path_byXnameX Given the modern graph And the traversal of @@ -248,7 +256,7 @@ Feature: Step - repeat() | result | | p[marko,josh,ripple] | - @GraphComputerVerificationReferenceOnly + @GraphComputerVerificationReferenceOnly @TinyGremlin Scenario: g_V_repeatXoutXknowsXX_untilXrepeatXoutXcreatedXX_emitXhasXname_lopXXX_path_byXnameX Given the modern graph And the traversal of @@ -260,6 +268,7 @@ Feature: Step - repeat() | result | | p[marko,josh] | + @TinyGremlin Scenario: g_V_repeatXrepeatXout_createdXX_untilXhasXname_rippleXXXemit_lang Given the modern graph And the traversal of @@ -375,6 +384,7 @@ Feature: Step - repeat() When iterated to list Then the traversal will raise an error with message containing text of "The repeat()-traversal was not defined" + @TinyGremlin Scenario: g_V_hasXperson_name_markoX_repeatXoutXcreatedXX_timesX1X_name Given the modern graph And the traversal of @@ -386,6 +396,7 @@ Feature: Step - repeat() | result | | lop | + @TinyGremlin Scenario: g_V_hasXperson_name_markoX_repeatXoutXcreatedXX_timesX0X_name Given the modern graph And the traversal of @@ -397,6 +408,7 @@ Feature: Step - repeat() | result | | lop | + @TinyGremlin Scenario: g_V_hasXperson_name_markoX_timesX1X_repeatXoutXcreatedXX_name Given the modern graph And the traversal of @@ -408,6 +420,7 @@ Feature: Step - repeat() | result | | lop | + @TinyGremlin Scenario: g_V_hasXperson_name_markoX_timesX0X_repeatXoutXcreatedXX_name Given the modern graph And the traversal of @@ -508,6 +521,7 @@ Feature: Step - repeat() | p[v[marko],v[lop],v[josh]] | # Nested repeat should maintain globalness + @TinyGremlin Scenario: g_V_repeatXboth_repeatXorder_byXnameXX_timesX1XX_timesX1X Given the modern graph And the traversal of
