> There are also some cases where it logically makes some sense to safely have one but not the other. A GLV likely doesn't need to support a Bytecode deserializer because it doesn't receive bytecode from the server
Agree, I don't think it's necessary to support deserialization for types
that are never going to be sent from the server, like some of the types under
Graph Process <http://tinkerpop.apache.org/docs/current/dev/io/#_graph_process>.

I also agree that for GLVs that have a more limited type system (like
JavaScript or Python), we should do what is best for the user and solve it
case by case.

I wanted to stress the need for symmetry for GLVs where we have rich type
systems (in this case .NET/Java) for Core and Extended types, for which
supporting deserialization and not serialization can cause obscure errors,
like:

TinkerPop "Type A" is deserialized as "Type C1" in Gremlin-X GLV, but
"Type C1" instances can't be serialized to "Type A".

    TypeC1 value = g.V().has("name", "jorge").values("propA").next();
    // The following would fail
    g.V().has("name", "jorge").property("propA", value).next();

I think in this case, it's preferable to have a 1-to-1 mapping or no mapping
at all (implementors/vendors could support it, if interested).

2018-05-30 12:45 GMT+02:00 Stephen Mallette <spmalle...@gmail.com>:

> I think the original thread spread off in too many different directions.
> I'm going to leave that original one to talk about future binary format
> stuff, type deprecation, etc. and make this new one to focus on getting
> this PR to close:
>
> https://github.com/apache/tinkerpop/pull/842
>
> which is currently stuck on whether or not it is important for us to have
> symmetry in serialization (i.e. everything a GLV can serialize must also be
> deserialized). I'll paste up my last thoughts on that from my previous post
> below:
>
> > Regarding serialization and deserialization asymmetry on GLVs (for Core
> > and Extended types), I think we should avoid it as it could lead to
> > obscure error messages on the user side.
>
> In the past, I think TinkerPop (going back to 2.x) has been ok with it and
> I'm not so sure that I recall any specific problems that were ever voiced
> by users on the subject. As it stands, I think we already have some
> asymmetry in gremlin-python so there is some precedent for it. There are
> also some cases where it logically makes some sense to safely have one but
> not the other. A GLV likely doesn't need to support a Bytecode deserializer
> because it doesn't receive bytecode from the server. It only needs to send
> bytecode and thus only has a serializer - at least until we have GVMs
> instead of GLVs :) Does that change your thinking at all Jorge?
>
>
> On Tue, May 29, 2018 at 12:45 PM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > > Regarding serialization and deserialization asymmetry on GLVs (for
> > > Core and Extended types), I think we should avoid it as it could lead
> > > to obscure error messages on the user side.
> >
> > In the past, I think TinkerPop (going back to 2.x) has been ok with it
> > and I'm not so sure that I recall any specific problems that were ever
> > voiced by users on the subject. As it stands, I think we already have
> > some asymmetry in gremlin-python so there is some precedent for it.
> > There are also some cases where it logically makes some sense to safely
> > have one but not the other. A GLV likely doesn't need to support a
> > Bytecode deserializer because it doesn't receive bytecode from the
> > server. It only needs to send bytecode and thus only has a serializer -
> > at least until we have GVMs instead of GLVs :) Does that change your
> > thinking at all Jorge?
> >
> > > First would be: Gremlin should not concern itself with storage
> > > schemas.....
> >
> > I like all of Robert's first paragraph because it makes Jorge's binary
> > format proposal that much easier to get right.
> > JanusGraph, DSE Graph and others won't have any trouble with this
> > approach because the backend will simply know that the particular
> > property that this number is going into will be a float and will coerce
> > it as such on storage. I just wonder exactly how graphs that don't have
> > schemas like neo4j/tinkergraph will deal with someone sending a
> > "Number". What happens in that case?
> >
> > On Mon, May 28, 2018 at 4:20 AM, Florian Hockmann <f...@florian-hockmann.de>
> > wrote:
> >
> >> > these should be dropped: Class (unless this is used for something
> >> > important? Too many results on 'Class' in the codebase.
> >>
> >> 'Class' is for example used for 'withoutStrategies' but I agree that
> >> this would probably be better handled just as a string. 'Class' is
> >> Java-specific, which doesn't make much sense when graph providers want
> >> to implement TinkerPop in a language other than Java.
> >>
> >> Apart from that, I'm not sure I get your reasoning behind dropping types
> >> like Date, Int32, and float. It's really trivial in most languages to
> >> add serializers for more numerical types so I don't really see why we
> >> should drop them when they make the storage more efficient and reduce
> >> the need for type casts in user code.
> >> For Date, you say that it's just a long. Sure, but how does the receiver
> >> know that the long should be deserialized to a Date in this case? As a
> >> user I want to work with a Date object and not just with a long. Also,
> >> we nevertheless need a convention of what this long represents:
> >> Milliseconds since January 1, 1970 (POSIX)? Since January 1, 1 (.NET)?
> >> Since December 31, 1899 (C++ 7.0)? (There are a lot more epoch dates
> >> [1].) g:Date is basically just this convention which is why I would
> >> keep it.
> >>
> >> > There should be a boolean (which seems to be completely missing??).
> >>
> >> Yeah, boolean and string are both just serialized without type
> >> information right now.
> >> Maybe we want to change that if we ever introduce GraphSON 4.
> >>
> >>
> >> Jorge's suggestion to drop all extended types except for the five he
> >> listed sounds like a good idea to me. I would only add dropping either
> >> Timestamp or Date from Core and probably also Class, like Robert
> >> suggested.
> >>
> >> [1]
> >> https://en.wikipedia.org/wiki/Epoch_%28reference_date%29#Notable_epoch_dates_in_computing
> >>
> >> -----Original Message-----
> >> From: Robert Dale <robd...@gmail.com>
> >> Sent: Friday, May 25, 2018 15:43
> >> To: dev@tinkerpop.apache.org
> >> Subject: Re: [DISCUSS] Handling of problematic GraphSON types
> >>
> >> There should be a guiding principle on this to make these decisions
> >> clearer. First would be: Gremlin should not concern itself with storage
> >> schemas. As an extension of that, Gremlin should not concern itself with
> >> storage size. Next would be: Gremlin should not be Java-specific.
> >> Finally, it should be hard to add a new type, i.e. it's demonstrably
> >> difficult to do a real world traversal without this type, how GLVs
> >> would map it, what functions on that type should be a part of Gremlin,
> >> and n>1 people positively affirm this direction.
> >>
> >> Thus, there should be a minimal Core on which most else can be built.
> >> All extended types should be dropped. From Core, these should be
> >> dropped: Class (unless this is used for something important? Too many
> >> results on 'Class' in the codebase. Otherwise, it's just a string),
> >> Date (is a long), Timestamp (is a long, what's the diff to Date
> >> anyway?). There should be one floating point type which is 64-bit.
> >> There should be one integer type which is 64-bit. There should be a
> >> boolean (which seems to be completely missing??).
> >>
> >>
> >> Robert Dale
> >>
> >> On Fri, May 25, 2018 at 3:37 AM, Jorge Bay Gondra <jorgebaygon...@gmail.com>
> >> wrote:
> >>
> >> > Thanks Florian for starting the discussion on this topic!
> >> >
> >> > I think it's a good exercise to evaluate which types are necessary for
> >> > a GLV to support.
> >> >
> >> > I went through a similar exercise when designing the binary
> >> > serialization format. I'll go ahead and propose:
> >> > All types that are considered "Core", "Graph Structure" and "Graph
> >> > Process" in GraphSON3
> >> > <http://tinkerpop.apache.org/docs/current/dev/io/#_core_2>
> >> > plus the following from the "Extended" list:
> >> > - Short
> >> > - Byte
> >> > - ByteBuffer
> >> > - BigInteger
> >> > - BigDecimal
> >> >
> >> > The rationale is to select types that *can't be represented and
> >> > stored* using other types.
> >> > For example:
> >> > - Short can be stored using an int backing field, but it would take
> >> > twice the space.
> >> > - BigDecimal can be stored using a ByteBuffer but ordering on a buffer
> >> > doesn't align with decimal ordering.
> >> >
> >> > Regarding serialization and deserialization asymmetry on GLVs (for
> >> > Core and Extended types), I think we should avoid it as it could lead
> >> > to obscure error messages on the user side.
> >> >
> >> > I think we should provide a comprehensive type representation but it
> >> > doesn't have to contain any type imaginable. The Gremlin Server and
> >> > the GLVs provide extension mechanisms that vendors and users can use
> >> > to support other types.
> >> >
> >> > 2018-05-24 14:31 GMT+02:00 Florian Hockmann <f...@florian-hockmann.de>:
> >> >
> >> > > As part of the discussion for the pull request by Daniel C.
> >> > > Weber that adds support for more extended GraphSON types to
> >> > > Gremlin.Net [1] we identified several of those types to be
> >> > > problematic for non-Java languages (or at least for .NET in this
> >> > > case) as they don't really have counterparts in other languages and
> >> > > for some it was even difficult to say where they differ from each
> >> > > other.
> >> > >
> >> > > Now the question is basically what we want to do with those
> >> > > problematic types.
> >> > >
> >> > > My suggestion would be an approach like this:
> >> > >
> >> > > 1. Identify types that are problematic and that we therefore don't
> >> > >    want to support across all GLVs.
> >> > > 2. Communicate to users somehow which types are problematic
> >> > >    (something like a deprecation) as we won't support them in all
> >> > >    GLVs and maybe even stop supporting them at all at some point in
> >> > >    the future.
> >> > > 3. Support the remaining types in all GLVs.
> >> > >
> >> > > Does that sound like a good plan? Are there any good ideas for the
> >> > > deprecation of those problematic types? My first idea would be to
> >> > > put them in a different section in the I/O docs [2] that explains at
> >> > > the beginning that and why they are deprecated, but maybe someone
> >> > > here has a better idea.
> >> > >
> >> > > Another question that was brought up during the review of the
> >> > > mentioned PR by Jorge was whether types should only be supported
> >> > > symmetrically or whether GLVs should try to support types as well as
> >> > > they can. If someone has good arguments or a strong opinion for
> >> > > either side then it would of course also be good to hear them.
> >> > >
> >> > > To give a concrete example of what is meant by symmetric support:
> >> > >
> >> > > In its current form the PR deserializes both GraphSON types
> >> > > gx:Duration and gx:Period to the .NET type TimeSpan and it
> >> > > serializes TimeSpan back to gx:Duration. This means that gx:Duration
> >> > > is supported symmetrically, but gx:Period is not as there exists no
> >> > > .NET serializer that creates a gx:Period.
> >> > >
> >> > > [1] https://github.com/apache/tinkerpop/pull/842
> >> > >
> >> > > [2] http://tinkerpop.apache.org/docs/current/dev/io/#_extended_2
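
The gx:Duration / gx:Period case quoted above can be checked mechanically.
Here is a minimal sketch of such a check, assuming a hypothetical registry:
the class SymmetryCheck, its two maps and its method names are made up for
illustration and are not part of TinkerPop or Gremlin.Net. Only the type
names gx:Duration, gx:Period and TimeSpan come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical registry used only to illustrate the symmetry argument;
// it is not an API of any TinkerPop GLV.
class SymmetryCheck {
    // GraphSON type name -> local type the GLV deserializes it to.
    static final Map<String, String> DESERIALIZERS = new LinkedHashMap<>();
    // Local type -> GraphSON type name the GLV serializes it to.
    static final Map<String, String> SERIALIZERS = new LinkedHashMap<>();

    static {
        // Mappings as described for the PR in the quoted message.
        DESERIALIZERS.put("gx:Duration", "TimeSpan");
        DESERIALIZERS.put("gx:Period", "TimeSpan");
        SERIALIZERS.put("TimeSpan", "gx:Duration");
    }

    // GraphSON type a value has after being read and written back once,
    // or null if it cannot be read or cannot be written back.
    static String roundTrip(String graphsonType) {
        String local = DESERIALIZERS.get(graphsonType);
        return local == null ? null : SERIALIZERS.get(local);
    }

    // Readable types that do not come back as the same GraphSON type.
    static Set<String> asymmetricTypes() {
        Set<String> result = new TreeSet<>();
        for (String type : DESERIALIZERS.keySet()) {
            if (!type.equals(roundTrip(type))) {
                result.add(type);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(asymmetricTypes()); // prints [gx:Period]
    }
}
```

A check along these lines could run as a unit test in a GLV, so that a
deserializer without a matching serializer is flagged at build time instead
of surfacing as an obscure error in user code.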