Hi all,
I'd like to propose typing the step argument signatures across the
gremlin-go public traversal API, replacing the dominant
`...interface{}` parameters with proper Go types. The target is the
4.x line. This grew out of an earlier informal conversation with
Cole, whose framing was that gremlin-go would be the first GLV to be
typed, with Python type hints and TypeScript typings for
gremlin-javascript to follow once Go validates the pattern. I am
posting here to widen the discussion before any JIRA or PR.
Problem
=======
Today gremlin-go's user-facing traversal API uses
`(args ...interface{})` for nearly every step (137 of 138 methods
in graphTraversal.go, with mirrors on AnonymousTraversal and
graphTraversalSource -- ~500 interface{} sites total).
Consequences:
* No compile-time arity or type checking. Calls like
`g.V().HasLabel(P.Within("a", 1, true))` build and ship; the
server rejects them at parse time.
* gopls (the Go LSP) cannot offer per-argument autocomplete or
type checking at call sites, because every parameter accepts
anything.
* Existing typed constructs (Direction, T, Order, Scope,
Cardinality named types; TraversalStrategy; *Vertex/*Edge) are
erased the instant they enter `(args ...interface{})`.
* `P` is currently a non-generic interface backed by
`[]interface{}`, so predicate construction itself loses type
safety.
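To make the erasure concrete, here is a minimal self-contained sketch. `Traversal` and the free function `Within` are hypothetical stand-ins I wrote for this mail, not the real gremlin-go types; they only copy the `(args ...interface{})` shape.

```go
package main

import "fmt"

// Traversal is a hypothetical stand-in for gremlin-go's GraphTraversal;
// it only records the arguments each step receives.
type Traversal struct {
	steps [][]interface{}
}

// HasLabel copies the current untyped shape: any number of values of
// any type is accepted, so the compiler has nothing to check.
func (t *Traversal) HasLabel(args ...interface{}) *Traversal {
	t.steps = append(t.steps, args)
	return t
}

// Within copies the shape of an untyped predicate factory.
func Within(args ...interface{}) []interface{} {
	return args
}

func main() {
	t := &Traversal{}
	// All three calls compile, although only the first is meaningful.
	t.HasLabel("person")
	t.HasLabel(42, true)
	t.HasLabel(Within("a", 1, true))
	fmt.Println(len(t.steps)) // prints 3
}
```

In this sketch nothing distinguishes the valid call from the two invalid ones until something downstream inspects the arguments.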
Gremlin.Net solved the equivalent problem in TINKERPOP-1752 (2017,
Fixed) using C# method overloads. Go has no method overloads, no
sum types, and no method-level type parameters independent of the
receiver -- so the .NET strategy doesn't transfer directly.
Constraints from Go that shape any solution
===========================================
* No method overloads.
* Method type parameters must come from the receiver. A generic
`(g *GraphTraversal[V]) Has(key string, value V)` would require
GraphTraversal itself to be generic, i.e. the Traversal<S,E>
pattern. That is explicitly out of scope for this proposal.
* No sum / union types for parameter positions.
So generics in this proposal apply to the value types we pass
around (`P`, `TextP`), not to GraphTraversal's step methods.
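A small sketch of what that constraint allows. The struct layout and the free function `Within` are assumptions for illustration only; the point is that a generic type plus a package-level generic function compiles, while a method that declares its own type parameter does not.

```go
package main

import "fmt"

// P is a hypothetical generic predicate. The field names are
// illustrative and are not gremlin-go's current layout.
type P[V any] struct {
	Operator string
	Values   []V
}

// Within is a package-level generic function, which Go permits.
// A method that declared its own type parameter, for example
//
//	func (f factory) Within[V any](vs ...V) P[V]
//
// is rejected by the compiler.
func Within[V any](vs ...V) P[V] {
	return P[V]{Operator: "within", Values: vs}
}

func main() {
	p := Within("a", "b", "c") // V is inferred as string
	fmt.Println(p.Operator, len(p.Values))

	// Within("a", 1, true) does not compile: the arguments do not
	// agree on a single type for V.
}
```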
Proposed Phase 1 (small vertical, ships first)
==============================================
A narrow slice that proves the approach:
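* Make `P` generic over its value type: `P[V any]`, with
factories `P.Eq[V](v V)`, `P.Within[V](vs ...V)`, etc.,
mirroring Java's `P<V>`. The spelling is illustrative: because
Go methods cannot declare their own type parameters (see the
constraints above), the factories would have to be package-level
generic functions rather than methods on a `P` value. Either
way, mixed-type predicates that today fail only on the server
would be caught at compile time.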
Before:
g.V().HasLabel(P.Within("a", 1, true)) // compiles, fails on server
After (illustrative):
g.V().HasLabel(P.Within[string]("a", "b", "c")) // ok
g.V().HasLabel(P.Within[string]("a", 1, true)) // does not compile
* `TextP[V]` follows the same shape (it is a P clone).
* Type the obvious-narrow steps:
string-variadic: As, HasLabel, HasKey, Cap, Aggregate, Select
int64: Limit, Range, Skip, Tail, Times
* Leave Has, By, Where, Property, AddV, AddE, From, To and the
other polymorphic-arg steps untouched in Phase 1.
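For the obvious-narrow steps, the signatures could look roughly like the sketch below. `Traversal` and `Step` are hypothetical stand-ins rather than the real gremlin-go types, and the recording logic exists only so the example runs.

```go
package main

import "fmt"

// Step is a hypothetical record of one step and its arguments.
type Step struct {
	Name string
	Args []any
}

// Traversal is a hypothetical stand-in for GraphTraversal.
type Traversal struct {
	steps []Step
}

// HasLabel takes a string variadic instead of ...interface{}.
func (t *Traversal) HasLabel(labels ...string) *Traversal {
	args := make([]any, len(labels))
	for i, l := range labels {
		args[i] = l
	}
	t.steps = append(t.steps, Step{Name: "hasLabel", Args: args})
	return t
}

// Limit takes a single int64 instead of ...interface{}.
func (t *Traversal) Limit(n int64) *Traversal {
	t.steps = append(t.steps, Step{Name: "limit", Args: []any{n}})
	return t
}

func main() {
	t := (&Traversal{}).HasLabel("person", "software").Limit(10)
	for _, s := range t.steps {
		fmt.Println(s.Name, len(s.Args))
	}

	// Neither of these compiles with the typed signatures:
	//   t.HasLabel(42)
	//   t.Limit("ten")
}
```

One consequence worth noting: with a plain `...string` signature, a predicate argument no longer fits `HasLabel`, so how the predicate form is spelled falls under the overload-collapse question below.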
Phase 2+ (subject to question 1 below)
======================================
For the polymorphic-arg steps, the realistic Go shapes are:
* Narrow variadics where overloads collapse cleanly
(e.g. `From(label string)` + `FromTraversal(*GraphTraversal)`).
* Named variants for irreducible Java overloads
(e.g. `Has(key string, value any)` +
`HasPredicate(key string, p Predicate)`).
* Keep `any` for genuinely user-supplied values (Property values,
Has values).
I considered and rejected a "one Go method per Java overload with
a unique name" model -- it would balloon the API from ~50 step
methods to 200+ with no precedent in any other GLV.
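As a rough sketch of the named-variant shape combined with `any` for user-supplied values: `Predicate`, `Eq`, and `Traversal` below are hypothetical stand-ins written for this mail, and the string rendering is only there to make the example runnable.

```go
package main

import "fmt"

// Predicate is a hypothetical minimal predicate interface.
type Predicate interface {
	Operator() string
}

type eq struct{ value any }

func (eq) Operator() string { return "eq" }

// Eq is a hypothetical predicate factory.
func Eq(v any) Predicate { return eq{value: v} }

// Traversal is a hypothetical stand-in for GraphTraversal.
type Traversal struct {
	steps []string
}

// Has keeps `any` for the genuinely user-supplied value.
func (t *Traversal) Has(key string, value any) *Traversal {
	t.steps = append(t.steps, fmt.Sprintf("has(%s, %v)", key, value))
	return t
}

// HasPredicate is the named variant for the predicate overload.
func (t *Traversal) HasPredicate(key string, p Predicate) *Traversal {
	t.steps = append(t.steps, fmt.Sprintf("has(%s, %s)", key, p.Operator()))
	return t
}

func main() {
	t := (&Traversal{}).
		Has("name", "marko").
		HasPredicate("age", Eq(30))
	fmt.Println(t.steps)

	// t.HasPredicate("age", 30) does not compile: int is not a Predicate.
	// t.Has("age", Eq(30)) still compiles, because `any` accepts a
	// Predicate too; the split only guards the named variant.
}
```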
Questions where I'd value the dev list's input
==============================================
1. Overload-collapse strategy for the polymorphic-arg steps.
Several Java methods have many semantically distinct overloads
(Has=9, By=10, AddV=4, Limit=4, Where=3). Because Go has no
overloads, no sum types, and no method-level type parameters
independent of the receiver, each Java overload group has to
collapse one of three ways:
a. Multiple named Go methods (HasPredicate, HasInLabel, ...).
Maximal compile-time safety, largest API surface.
b. Single signature with `any` + runtime validation. Smallest
API surface; preserves today's call shape with light
additional safety.
c. Sum-style accepted-type interfaces
(`type HasValue interface { hasValue() }` implemented by
the accepted concrete types). Compile-time safety without
API balloon, but users wrap primitives or rely on
conversion helpers.
gremlin-go would be setting the template here; Python (with
@overload + Union) and TS gremlin-javascript (with overload
signatures + unions) would express the equivalent pattern
later. The conceptual choice -- split vs collapse vs constrain
-- propagates across all three.
2. Timing: 4.x deprecate-alongside vs 5.0 hard-replace? Cole's
earlier guidance was "wanted for TP4 but not a release
blocker." Inside 4.x, project policy
(for-committers.asciidoc:237-256) requires deprecate-then-
remove. Because Go cannot have two methods with the same name
and different signatures, deprecate-alongside means coining
new method names (e.g. HasLabels(...string) alongside the
deprecated HasLabel(args ...interface{})) -- ergonomically
awkward, with the rename-back happening at 5.0. The alternative
is hard-replace targeted at 5.0 (cleanest but later). Which
path does the project prefer?
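To help compare options (b) and (c) from question 1, here is a minimal sketch of both. Everything in it is hypothetical: the accepted type set in `validateHasValue`, the wrapper types `StringValue` and `Int64Value`, and the free function `Has` were picked for illustration, not taken from gremlin-go.

```go
package main

import "fmt"

// Option (b): one signature with `any`, checked at run time.
// The accepted set below is an arbitrary illustration.
func validateHasValue(v any) error {
	switch v.(type) {
	case string, int64, float64, bool:
		return nil
	default:
		return fmt.Errorf("has: unsupported value type %T", v)
	}
}

// Option (c): a sum-style interface. The unexported marker method
// means only types declared in this package can implement it.
type HasValue interface {
	hasValue()
}

// StringValue and Int64Value are hypothetical wrappers for primitives.
type StringValue string
type Int64Value int64

func (StringValue) hasValue() {}
func (Int64Value) hasValue()  {}

// Has accepts only HasValue implementations, checked at compile time.
func Has(key string, v HasValue) string {
	return fmt.Sprintf("has(%s, %v)", key, v)
}

func main() {
	// Option (b): the mistake surfaces as an error value at run time.
	fmt.Println(validateHasValue("marko"))
	fmt.Println(validateHasValue([]int{1}))

	// Option (c): callers wrap primitives explicitly.
	fmt.Println(Has("age", Int64Value(30)))

	// Has("age", 30) does not compile: int does not implement HasValue.
}
```

The wrapping at the call site in option (c) is the ergonomic cost mentioned in the question; conversion helpers would shorten it but not remove it.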
Prior art: TINKERPOP-1752 (Gremlin.Net type-safe methods, Fixed
2017). I will open a TINKERPOP JIRA once the design has rough
consensus here.
Thoughts? I'm especially curious to hear from Cole and Ken, given
the earlier informal chats, and from anyone with strong opinions on
Go API ergonomics.
Thanks,
Rithin