Hi Stephen,

Good questions. Let's elevate this discussion (about the specifics of graphs and traversal results over Thrift) to the dev list. See inline.

On Wed, Jul 7, 2021 at 5:08 AM Stephen Mallette <[email protected]> wrote:

> So, what happens if a returned Vertex contained a ByteBuffer or
> InetAddress as a property value? I assume the thrift definition has to be
> adjusted to include those types if you expect them in the results?
>

What you see in the diff, currently, captures the types specifically mentioned in Graph.Features (see graph_features.yaml). In order to support other types natively, we should update Graph.Features in parallel. Byte arrays can be captured using Thrift's binary type. Domain-specific types like InetAddress probably should not be built in, just as specific element labels and property keys are not built in at this level.

However, that is not the only possible answer. Certain very common types like IP addresses, dates and intervals, units of measurement, etc. *could* be built into the type system, but IMO probably shouldn't. Instead, we should give users a way of encoding and decoding domain-specific objects using a handful of atomic types. InetAddress in this case is encoded either as a string or a struct.

> How would provider specific types (like a Point or special instances of P
> in JanusGraph) fit into something like this - how would providers (or
> users) extend on our thrift definitions?
>

Point is definitely a domain-specific type which you would not see at this level of schema. Maybe I can illustrate encoding and decoding domain-specific types in the branch; using the current simple type system, you could turn the Point into a map with three keys, like "latitude", "longitude" and "type". When receiving a map with "type" equal to "Point", you turn it back into a native Point object.

We could also use a strategy similar to Protobuf's Any type, where we send a struct with two fields over the wire: one field provides the data of the Point, and the other field provides a URL which specifies the type, i.e. how the object should be decoded.
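
To make the two options for Point concrete, here is a rough sketch in Python. Everything in it is hypothetical and only for illustration: the Point class, the helper names (encode_point, decode_value, wrap_point, unwrap), the TypedValue struct, and the example.org type URL are all made up, and none of it is taken from the branch or from the Thrift definitions.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Point:
    """Hypothetical domain-specific type; not part of the wire schema."""
    latitude: float
    longitude: float


# Option 1: a plain map with a "type" key, built only from atomic types.
def encode_point(p: Point) -> Dict[str, Any]:
    return {"type": "Point", "latitude": p.latitude, "longitude": p.longitude}


# Decoders keyed by the value of the "type" entry (assumed convention).
DECODERS = {
    "Point": lambda m: Point(m["latitude"], m["longitude"]),
}


def decode_value(value: Any) -> Any:
    """Turn a tagged map back into a native object; pass anything else through."""
    if isinstance(value, dict):
        tag = value.get("type")
        if isinstance(tag, str) and tag in DECODERS:
            return DECODERS[tag](value)
    return value


# Option 2: a two-field struct in the spirit of Protobuf's Any type.
POINT_TYPE_URL = "https://example.org/types/Point"  # made-up URL


@dataclass(frozen=True)
class TypedValue:
    type_url: str          # says how the data should be decoded
    data: Dict[str, Any]   # the payload, using only atomic types


def wrap_point(p: Point) -> TypedValue:
    return TypedValue(POINT_TYPE_URL, {"latitude": p.latitude, "longitude": p.longitude})


def unwrap(tv: TypedValue) -> Any:
    """Decode known type URLs; leave unknown ones wrapped."""
    if tv.type_url == POINT_TYPE_URL:
        return Point(tv.data["latitude"], tv.data["longitude"])
    return tv
```

In both variants, a receiver that does not know about Point still gets a value it can handle (a map, or the undecoded struct), which is the main reason to keep such types out of the core schema.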

It is probably worthwhile to add a "record" type variant to Graph.Features in any case.

> I think that the idea of having a more strict definition on the types
> Gremlin supports is starting to materialize given the constraints on
> serializable types of GraphSON and then further restricted in GraphBinary.
> We actually have a list of types that haven't changed much in years at this
> point:
>
> https://tinkerpop.apache.org/docs/3.5.0/dev/io/
>

We might want to go through this list with a fine-toothed comb (i.e. we probably don't want both a Date atomic type and a Timestamp type unless they have different precision/granularity, in which case I would make that explicit in the name of the type, e.g. UnixTimeSeconds vs. UnixTimeMillis).

> I think we could actually even limit them further and then the dream would
> be to prevent them from being so JVM specific.
>

Yes, I would argue for limiting them to very domain-independent atomic types, probably excluding the timestamp type(s) as well as UUID and Class. However, as I say it's possible to include a few specialized types if the user demand is really high. It's just more stuff which needs to be implemented in each Gremlin language variant.

> It would be nice to elevate the discussion of supported types out of
> serialization and into the Gremlin language layer itself, which would then
> in turn drive serialization discussions.
>

That's where I see this going. The specification of Gremlin traversal structure in YAML (already illustrated in the branch) translates neatly into traversals over the wire using Thrift. To that and the basic graph structure specification, we need a specification for other kinds of objects which appear in traversal results, such as paths.

Josh

[original message clipped]
