Hi Stephen,

Good questions. Let's elevate this discussion (about the specifics of
graphs and traversal results over Thrift) to the dev list. See inline.


On Wed, Jul 7, 2021 at 5:08 AM Stephen Mallette <[email protected]>
wrote:

> So, what happens if a returned Vertex contained a ByteBuffer or
> InetAddress as a property value? I assume the thrift definition has to be
> adjusted to include those types if you expect them in the results?
>


What you see in the diff, currently, captures the types specifically
mentioned in Graph.Features (see graph_features.yaml). In order to support
other types natively, we should update Graph.Features in parallel. Byte
arrays can be captured using Thrift's binary type. Domain-specific types
like InetAddress probably should not be built in, just as specific element
labels and property keys are not built in at this level. However, that is
not the only possible answer. Certain very common types like IP addresses,
dates and intervals, units of measurement, etc. *could* be built into the
type system, but IMO probably shouldn't. Instead, we should give users a
way of encoding and decoding domain-specific objects using a handful of
atomic types. InetAddress, in this case, would be encoded either as a string
or as a struct.
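
To make the two options concrete, here is a minimal sketch in Python (not
code from the branch). The function names and the "version"/"bytes" field
names are made up for illustration; the only assumption is that strings,
integers, binary and a map/struct of those are available as atomic types.

```python
import ipaddress

# Option 1: encode the address as a plain string.
def encode_ip_as_string(addr):
    return str(addr)

def decode_ip_from_string(value):
    return ipaddress.ip_address(value)

# Option 2: encode the address as a small struct of atomic values.
# The field names "version" and "bytes" are hypothetical.
def encode_ip_as_struct(addr):
    return {"version": addr.version, "bytes": addr.packed}

def decode_ip_from_struct(value):
    # ip_address() accepts the packed 4-byte or 16-byte form directly.
    return ipaddress.ip_address(value["bytes"])

addr = ipaddress.ip_address("192.168.0.1")
assert decode_ip_from_string(encode_ip_as_string(addr)) == addr
assert decode_ip_from_struct(encode_ip_as_struct(addr)) == addr
```

The string form is easier to read on the wire; the struct form avoids
parsing and keeps the raw bytes, which fits Thrift's binary type.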



> How would provider specific types (like a Point or special instances of P
> in JanusGraph) fit into something like this - how would providers (or
> users) extend on our thrift definitions?
>

Point is definitely a domain-specific type which you would not see at this
level of schema. Maybe I can illustrate encoding and decoding
domain-specific types in the branch; using the current simple type system,
you could turn the Point into a map with three keys, like "latitude",
"longitude" and "type". When receiving a map with "type" equal to "Point",
you turn it back into a native Point object. We could also use a strategy
similar to Protobuf's Any type, where we send a struct with two fields over
the wire: one field provides the data of the Point, and the other field
provides a URL which specifies the type, i.e. how the object should be
decoded. It is probably worthwhile to add a "record" type variant to
Graph.Features in any case.
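
A rough sketch of both strategies in Python, to show the shape of the idea
rather than anything that exists in the branch. The Point class, the
function names, the "type_url"/"data" field names and the example.org URL
are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    latitude: float
    longitude: float

# Strategy 1: a map with three keys, tagged by "type".
def encode_point(point):
    return {"type": "Point",
            "latitude": point.latitude,
            "longitude": point.longitude}

def decode_value(value):
    # Turn a tagged map back into a native Point; pass anything else through.
    if isinstance(value, dict) and value.get("type") == "Point":
        return Point(value["latitude"], value["longitude"])
    return value

# Strategy 2: an Any-style envelope with a type URL and a data field.
POINT_TYPE_URL = "https://example.org/types/Point"  # hypothetical URL

DECODERS = {
    POINT_TYPE_URL: lambda data: Point(data["latitude"], data["longitude"]),
}

def encode_any(point):
    return {"type_url": POINT_TYPE_URL,
            "data": {"latitude": point.latitude,
                     "longitude": point.longitude}}

def decode_any(envelope):
    # Unknown type URLs are returned undecoded so the caller can still inspect them.
    decoder = DECODERS.get(envelope["type_url"])
    return decoder(envelope["data"]) if decoder else envelope

p = Point(37.77, -122.42)
assert decode_value(encode_point(p)) == p
assert decode_any(encode_any(p)) == p
```

With the envelope, providers can register their own decoders without any
change to the shared type definitions, and a client which does not know a
given type URL still receives well-formed data.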



> I think that the idea of having a more strict definition on the types
> Gremlin supports is starting to materialize given the constraints on
> serializable types of GraphSON and then further restricted in GraphBinary.
> We actually have a list of types that haven't changed much in years at this
> point:
>
> https://tinkerpop.apache.org/docs/3.5.0/dev/io/
>


We might want to go through this list with a fine-toothed comb. For example,
we probably don't want both a Date atomic type and a Timestamp type unless
they have different precision/granularity, in which case I would make that
explicit in the name of the type, e.g. UnixTimeSeconds vs. UnixTimeMillis.


> I think we could actually even limit them further and then the dream would
> be to prevent them from being so JVM specific.
>


Yes, I would argue for limiting them to very domain-independent atomic
types, probably excluding the timestamp type(s) as well as UUID and Class.
However, as I said, it's possible to include a few specialized types if the
user demand is really high. It's just more stuff which needs to be
implemented in each Gremlin language variant.



> It would be nice to elevate the discussion of supported types out of
> serialization and into the Gremlin language layer itself, which would then
> in turn drive serialization discussions.
>


That's where I see this going. The specification of Gremlin traversal
structure in YAML (already illustrated in the branch) translates neatly
into traversals over the wire using Thrift. In addition to that and the basic
graph structure specification, we need a specification for other kinds of
objects which appear in traversal results, such as paths.


Josh


[original message clipped]
