This is an automated email from the ASF dual-hosted git repository. xiazcy pushed a commit to branch type-predicate-proposal in repository https://gitbox.apache.org/repos/asf/tinkerpop.git
commit 368a81aed6dd53c5dda2f2aade64d5a43c470c21
Author: xiazcy <xia...@gmail.com>
AuthorDate: Thu Sep 11 11:18:21 2025 -0700

    add draft proposal for type predicate and type enums
---
 docs/src/dev/future/index.asciidoc                 |   5 +-
 .../dev/future/proposal-type-predicate-8.asciidoc  | 264 +++++++++++++++++++++
 2 files changed, 267 insertions(+), 2 deletions(-)

diff --git a/docs/src/dev/future/index.asciidoc b/docs/src/dev/future/index.asciidoc
index f66a4ce9a4..4ab7ff7c1e 100644
--- a/docs/src/dev/future/index.asciidoc
+++ b/docs/src/dev/future/index.asciidoc
@@ -165,8 +165,9 @@ story.
 |link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-3-remove-closures.asciidoc[Proposal 3] |Removing the Need for Closures/Lambda in Gremlin |3.7.0 |Y
 |link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-transaction-4.asciidoc[Proposal 4] |TinkerGraph Transaction Support |3.7.0 |Y
 |link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-scoping-5.asciidoc[Proposal 5] |Lazy vs. Eager Evaluation|3.8.0 |N
-|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-asnumber-step-6.asciidoc[Proposal 6] |asNumber() Step|3.8.0 |N
-|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-asbool-step-7.asciidoc[Proposal 7] |asBool() Step|3.8.0 |N
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-asnumber-step-6.asciidoc[Proposal 6] |asNumber() Step|3.8.0 |Y
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-asbool-step-7.asciidoc[Proposal 7] |asBool() Step|3.8.0 |Y
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-type-predicate-8.asciidoc[Proposal 8] |Type Predicate|3.8.0 |N
 |=========================================================
 
 = Appendix
diff --git a/docs/src/dev/future/proposal-type-predicate-8.asciidoc b/docs/src/dev/future/proposal-type-predicate-8.asciidoc
new file mode 100644
index 0000000000..04b90aced4
--- /dev/null
+++ b/docs/src/dev/future/proposal-type-predicate-8.asciidoc
@@ -0,0 +1,264 @@
+////
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+////
+
+image::apache-tinkerpop-logo.png[width=500,link="https://tinkerpop.apache.org"]
+
+*x.y.z - Proposal 8*
+
+== Type Predicate
+
+=== Motivation
+
+Following up on link:https://issues.apache.org/jira/browse/TINKERPOP-2234[TINKERPOP-2234], filtering traversers by type would be a convenient addition to data manipulation in Gremlin. For example, when we perform a union of traversals and only want vertices or edges, there is currently no simple way to assert the type and pass it through.
+
+=== Type Token Definition & Implementation
+
+To facilitate type assertion across GLVs, a unified set of enum tokens is the most convenient approach, especially given that we do not want to leak Java types into other languages. The proposal here is to use a defined set of Gremlin type tokens (see Appendix), and to use an interface structure that allows providers to implement and register their own type tokens for use with the Gremlin language.
+
+==== Gremlin Type Enum
+
+We will define a `GType` enum consisting of the core types in Gremlin, to be used with the type predicate (for the full set of types, see Appendix), for type safety and maintainability across GLVs. This will also be the single set of type enums used in Gremlin steps (e.g. replacing `N` in `asNumber()`).
+
+This enum itself will not be extensible; however, it will contain a global cache (`GlobalTypeCache`), which providers can use to register their own types with the Grammar to be used as string arguments.
+
+[source,java]
+----
+import java.time.OffsetDateTime;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+// Abbreviated; see the Appendix for the full proposed set of tokens.
+public enum GType {
+    BYTE(Byte.class),
+    DATETIME(OffsetDateTime.class),
+    DOUBLE(Double.class),
+    FLOAT(Float.class),
+    INT(Integer.class),
+    LIST(List.class),
+    LONG(Long.class),
+    MAP(Map.class),
+    NULL(null); // assumed: the null token is not backed by a Java class
+
+    private final Class<?> javaType;
+
+    GType(Class<?> javaType) {
+        this.javaType = javaType;
+    }
+
+    public Class<?> getType() { return javaType; }
+
+    // static, so a token can be looked up by name without an existing instance
+    public static GType fromName(String name) { return GType.valueOf(name.toUpperCase()); }
+
+    static final class GlobalTypeCache {
+
+        // string token -> Java class, for provider-registered types
+        private static final Map<String, Class<?>> GLOBAL_TYPE_REGISTRY = new ConcurrentHashMap<>();
+
+        private GlobalTypeCache() {
+            throw new IllegalStateException("Utility class");
+        }
+    }
+}
+----
+
+=== Addition of `P.typeOf(token)`
+
+Given that `P` is widely used by filter steps and already has an evaluation structure, we would introduce a new predicate enum, `Type`, for type assertions.
+
+[source,java]
+----
+public enum Type implements PBiPredicate<Object, Object> {
+    typeOf {
+        @Override
+        public boolean test(final Object first, final Object second) {
+            // implementation omitted: tests the value (first) against the type token (second)
+        }
+    }
+}
+----
+
+There are several accepted overloads for the token.
+
+* `P.typeOf(GType)` - default case, accepts the core Gremlin type enum.
+* `P.typeOf(String)` - string for extensions:
+** Can be the string key used in the Global Cache (e.g. `"GType.NUMBER"`), for providers
+** Can be a canonical Java class name from which the class can be derived (e.g. `"java.lang.Number"`)
+* `P.typeOf(Class)` - Java-specific for convenience, can be used as syntactic sugar in Groovy
+
+Any other input, while accepted, will evaluate to `false`, so the result is filtered out (the alternative is to throw an exception).
+
+For negation, use `not(P.typeOf(token))`.
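
The overload handling described above can be sketched in plain Java. This is a minimal, self-contained illustration and not the proof-of-concept code: `TypeOfSketch`, `SketchType`, `REGISTRY`, and `resolve` are hypothetical names standing in for the predicate, `GType`, and the Global Cache, and the lookup order for string tokens (registry first, then class name) is an assumption. Unrecognized tokens evaluate to `false`, following the default suggested in this proposal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TypeOfSketch {

    // Hypothetical stand-in for the proposed GType enum (abbreviated).
    enum SketchType {
        INT(Integer.class),
        STRING(String.class),
        NUMBER(Number.class);

        private final Class<?> javaType;

        SketchType(final Class<?> javaType) {
            this.javaType = javaType;
        }

        Class<?> getType() {
            return javaType;
        }
    }

    // Hypothetical stand-in for the Global Cache: string key -> Java class.
    private static final Map<String, Class<?>> REGISTRY = new ConcurrentHashMap<>();

    static {
        REGISTRY.put("GType.NUMBER", Number.class);
    }

    private TypeOfSketch() {
    }

    // Resolves a token to a Java class, or returns null if it is not recognized.
    static Class<?> resolve(final Object token) {
        if (token instanceof SketchType) {
            return ((SketchType) token).getType();
        }
        if (token instanceof Class) {
            return (Class<?>) token;
        }
        if (token instanceof String) {
            // Assumption: registered keys take precedence over class names.
            final Class<?> registered = REGISTRY.get((String) token);
            if (registered != null) {
                return registered;
            }
            try {
                return Class.forName((String) token);
            } catch (ClassNotFoundException e) {
                return null;
            }
        }
        return null;
    }

    // Mirrors test(first, second): the value comes first, the type token second.
    // Null values never match here; handling of a NULL token is left out of this sketch.
    static boolean typeOf(final Object value, final Object token) {
        final Class<?> type = resolve(token);
        return type != null && type.isInstance(value);
    }

    public static void main(final String[] args) {
        System.out.println(typeOf(1.0, SketchType.NUMBER));      // true
        System.out.println(typeOf(2, "GType.NUMBER"));           // true
        System.out.println(typeOf("hello", "java.lang.Number")); // false
        System.out.println(typeOf(3, Integer.class));            // true
        System.out.println(typeOf(3, 42));                       // false, 42 is not a valid token
    }
}
```

Resolving every token to a `Class` first keeps the actual check down to a single `isInstance` call, which is one possible way to keep the three overloads consistent with each other.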
+
+==== Examples
+
+----
+// Use of GType
+gremlin> g.V().is(P.typeOf(GType.VERTEX)).fold()
+==>[v[1],v[2],v[3],v[4],v[5],v[6]]
+gremlin> g.inject(1.0,2,3,'hello',false).is(P.typeOf(GType.NUMBER)).fold()
+==>[1.0,2,3]
+// for negation, use not()
+gremlin> g.inject(1.0,2,3,'hello',false).is(not(P.typeOf(GType.NUMBER))).fold()
+==>[hello,false]
+
+// Use of String key in GlobalTypeCache
+gremlin> g.inject(1.0,2,3,'hello',false).is(P.typeOf("GType.NUMBER")).fold()
+==>[1.0,2,3]
+// Use of canonical Java name
+gremlin> g.inject(1.0,2,3,'hello',false).is(P.typeOf("java.lang.Number")).fold()
+==>[1.0,2,3]
+
+// Use of Java class as syntactic sugar
+gremlin> g.V().is(P.typeOf(Vertex.class)).fold()
+==>[v[1],v[2],v[3],v[4],v[5],v[6]]
+gremlin> g.inject(1.0,2,3,'hello',false).is(typeOf(Number)).fold()
+==>[1.0,2,3]
+gremlin> typeOf(Number).test(1)
+==>true
+gremlin> g.inject(1.0,2,3,'hello',false).or(is(typeOf(Integer)), is(typeOf(Boolean))).fold()
+==>[2,3,false]
+
+// Potentially used to filter a certain type of value in all properties:
+gremlin> g.V().hasLabel('person').values().is(typeOf(Integer)).sum()
+==>123
+
+// Potentially used to ensure a side effect is properly applied:
+gremlin> g.V().hasLabel('person').
+......1>   sideEffect(property('age', values('age').asNumber(GType.DOUBLE))).
+......2>   values('age').is(typeOf(Double))
+==>29.0
+==>27.0
+==>32.0
+==>35.0
+
+// Potentially used to filter out nulls for subsequent data processing:
+gremlin> g.V().values().is(not(typeOf(GType.NULL)))
+----
+
+=== Extending with Additional Type Tokens via Global Cache
+
+To reduce the complexity of serialization and registration, providers who implement their own custom types will use `String` tokens only. They will need to register their type in the cache for it to be recognized by the embedded traversal and the Grammar.
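
One way such a registration method might behave is sketched below. The method name `registerDataType` is the one used in this proposal, but everything else is an assumption made for illustration: the class name `TypeRegistrySketch`, the `lookup` helper, the null checks, and the choice to reject a token that is already bound to a different class are all hypothetical and not taken from the proof-of-concept.

```java
import java.net.URI;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

final class TypeRegistrySketch {

    // Hypothetical global registry: string token -> Java class.
    private static final Map<String, Class<?>> REGISTRY = new ConcurrentHashMap<>();

    private TypeRegistrySketch() {
    }

    // Registers a token. Re-registering the same pair is a no-op; binding an
    // existing token to a different class is rejected (an assumed policy).
    static void registerDataType(final String name, final Class<?> type) {
        Objects.requireNonNull(name, "name");
        Objects.requireNonNull(type, "type");
        final Class<?> existing = REGISTRY.putIfAbsent(name, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalArgumentException("Token already registered: " + name);
        }
    }

    // Returns the class bound to the token, or null if the token is unknown.
    static Class<?> lookup(final String name) {
        return REGISTRY.get(name);
    }

    public static void main(final String[] args) {
        // java.net.URI is used here only as an arbitrary example of a provider type.
        registerDataType("MyUri", URI.class);
        System.out.println(lookup("MyUri") == URI.class); // true
        System.out.println(lookup("Unknown") == null);    // true
    }
}
```

Using `putIfAbsent` on a `ConcurrentHashMap` makes the check-and-register step atomic, so two providers racing to claim the same token cannot silently overwrite each other.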
+
+For example, given a new custom `Point` class that's already integrated as a custom type in Gremlin:
+
+[source,java]
+----
+public class Point {
+    private Integer x;
+    private Integer y;
+
+    public Point(final Integer x, final Integer y) {
+        this.x = x;
+        this.y = y;
+    }
+
+    // getters and setters
+}
+----
+
+Register the desired string token representing the type in the Global Cache, and document the token for users:
+
+----
+GType.GlobalTypeCache.registerDataType("MyPoint", Point.class);
+
+g.inject(new Point(1, 2)).is(P.typeOf("MyPoint")).fold().next();
+----
+
+=== Appendix
+
+A proof-of-concept implementation is located at https://github.com/apache/tinkerpop/tree/type-predicate-poc.
+
+==== Proposed Range of Gremlin Type Tokens
+
+[cols="1,1"]
+|===
+|Token |Gremlin Type Reference
+
+|GType.INT
+|GraphBinary 4.0
+
+|GType.LONG
+|GraphBinary 4.0
+
+|GType.DOUBLE
+|GraphBinary 4.0
+
+|GType.FLOAT
+|GraphBinary 4.0
+
+|GType.BIGDECIMAL
+|GraphBinary 4.0
+
+|GType.BIGINTEGER
+|GraphBinary 4.0
+
+|GType.BYTE
+|GraphBinary 4.0
+
+|GType.SHORT
+|GraphBinary 4.0
+
+|GType.STRING
+|GraphBinary 4.0
+
+|GType.DATETIME
+|GraphBinary 4.0
+
+|GType.LIST
+|GraphBinary 4.0
+
+|GType.SET
+|GraphBinary 4.0
+
+|GType.MAP
+|GraphBinary 4.0
+
+|GType.NUMBER
+|Utility type
+
+|GType.UUID
+|GraphBinary 4.0
+
+|GType.EDGE
+|GraphBinary 4.0
+
+|GType.PATH
+|GraphBinary 4.0
+
+|GType.PROPERTY
+|GraphBinary 4.0
+
+|GType.GRAPH
+|GraphBinary 4.0
+
+|GType.VERTEX
+|GraphBinary 4.0
+
+|GType.VP
+|GraphBinary 4.0 (VertexProperty)
+
+|GType.BINARY
+|GraphBinary 4.0
+
+|GType.BOOL
+|GraphBinary 4.0
+
+|GType.TREE
+|GraphBinary 4.0
+
+|GType.CHAR
+|GraphBinary 4.0
+
+|GType.DURATION
+|GraphBinary 4.0
+
+|GType.NULL
+|GraphBinary 4.0 (Unspecified Null Object)
+|===