Jorge, you sound like you have a pretty strong feeling about this issue, so I'm fine sticking with your direction. I really don't feel strongly about it either way, and since .NET isn't my strong suit I'll defer to you on this one.

On Wed, May 30, 2018 at 11:18 AM Jorge Bay Gondra <jorgebaygon...@gmail.com> wrote:

> > There are also some cases where it logically makes some sense to safely
> > have one but not the other. A GLV likely doesn't need to support a
> > Bytecode deserializer because it doesn't receive bytecode from the server
>
> Agree, I don't think it's necessary to support deserialization for
> types that are never going to be sent from the server, like some of the
> types under Graph Process
> <http://tinkerpop.apache.org/docs/current/dev/io/#_graph_process>.
>
> I also agree that for GLVs that have a more limited type system (like
> JavaScript or Python), we should do what is best for the user and solve it
> case by case.
>
> I wanted to stress the need for symmetry for GLVs where we have rich type
> systems (in this case .NET/Java) for Core and Extended types, for which
> supporting deserialization and not serialization can cause obscure errors,
> like:
> TinkerPop "Type A" is deserialized as "Type C1" in Gremlin-X GLV, but "Type
> C1" instances can't be serialized to "Type A".
>
> TypeC1 value = g.V().has("name", "jorge").value("propA").next();
> // The following would fail
> g.V().has("name", "jorge").property("propA", value).next();
>
> I think in this case, it's preferred to have a 1-to-1 mapping or no
> mapping at all (implementors/vendors could support it, if interested).
>
>
> 2018-05-30 12:45 GMT+02:00 Stephen Mallette <spmalle...@gmail.com>:
>
> > I think the original thread spread off in too many different directions.
> > I'm going to leave that original one to talk about future binary format
> > stuff, type deprecation, etc. and make this new one to focus on getting
> > this PR to close:
> >
> > https://github.com/apache/tinkerpop/pull/842
> >
> > which is currently stuck on whether or not it is important for us to have
> > symmetry in serialization (i.e. everything a GLV can serialize must also
> > be deserialized).
> > I'll paste up my last thoughts on that from my previous post below:
> >
> > > Regarding serialization and deserialization asymmetry on GLVs (for Core
> > > and Extended types), I think we should avoid it as it could lead to
> > > obscure error messages on the user side.
> >
> > In the past, I think TinkerPop (going back to 2.x) has been ok with it and
> > I'm not so sure that I recall any specific problems that were ever voiced
> > by users on the subject. As it stands, I think we already have some
> > asymmetry in gremlin-python so there is some precedent for it. There are
> > also some cases where it logically makes some sense to safely have one but
> > not the other. A GLV likely doesn't need to support a Bytecode deserializer
> > because it doesn't receive bytecode from the server. It only needs to send
> > bytecode and thus only has a serializer - at least until we have GVMs
> > instead of GLVs :) Does that change your thinking at all Jorge?
> >
> > On Tue, May 29, 2018 at 12:45 PM Stephen Mallette <spmalle...@gmail.com>
> > wrote:
> >
> > > > Regarding serialization and deserialization asymmetry on GLVs (for
> > > > Core and Extended types), I think we should avoid it as it could lead
> > > > to obscure error messages on the user side.
> > >
> > > In the past, I think TinkerPop (going back to 2.x) has been ok with it and
> > > I'm not so sure that I recall any specific problems that were ever voiced
> > > by users on the subject. As it stands, I think we already have some
> > > asymmetry in gremlin-python so there is some precedent for it. There are
> > > also some cases where it logically makes some sense to safely have one but
> > > not the other. A GLV likely doesn't need to support a Bytecode deserializer
> > > because it doesn't receive bytecode from the server.
> > > It only needs to send bytecode and thus only has a serializer - at least
> > > until we have GVMs instead of GLVs :) Does that change your thinking at
> > > all Jorge?
> > >
> > > > First would be: Gremlin should not concern itself with storage
> > > > schemas.....
> > >
> > > I like all of Robert's first paragraph because it makes Jorge's binary
> > > format proposal that much easier to get right. JanusGraph, DSE Graph and
> > > others won't have any trouble with this approach because the backend will
> > > simply know that the particular property that this number is going into
> > > will be a float and will coerce it as such on storage. I just wonder
> > > exactly how graphs that don't have schemas like neo4j/tinkergraph will
> > > deal with someone sending a "Number". What happens in that case?
> > >
> > > On Mon, May 28, 2018 at 4:20 AM, Florian Hockmann <f...@florian-hockmann.de>
> > > wrote:
> > >
> > >> > these should be dropped: Class (unless this is used for something
> > >> > important? Too many results on 'Class' in the codebase.
> > >>
> > >> 'Class' is for example used for 'withoutStrategies' but I agree that this
> > >> would probably be better handled just as a string. 'Class' is Java-specific
> > >> which doesn't make much sense when graph providers want to implement
> > >> TinkerPop in another language than Java.
> > >>
> > >> Apart from that, I'm not sure I get your reasoning behind dropping types
> > >> like Date, Int32, and float. It's really trivial in most languages to add
> > >> serializers for more numerical types so I don't really see why we should
> > >> drop them when they make the storage more efficient and reduce the need for
> > >> type castings in user code.
> > >> For Date, you say that it's just a long. Sure, but how does the receiver
> > >> know that the long should be deserialized to a Date in this case? As a user
> > >> I want to work with a Date object and not just with a long.
> > >> Also, we nevertheless need a convention of what this long represents:
> > >> Milliseconds since January 1, 1970 (POSIX)? Since January 1, 1 (.NET)?
> > >> Since December 31, 1899 (C++ 7.0)? (There are a lot more epoch dates [1].)
> > >> g:Date is basically just this convention which is why I would keep it.
> > >>
> > >> > There should be a boolean (which seems to be completely missing??).
> > >>
> > >> Yeah, boolean and string are both just serialized without type
> > >> information right now. Maybe we want to change that if we ever introduce
> > >> GraphSON 4.
> > >>
> > >>
> > >> Jorge's suggestion to drop all extended types except for the five he
> > >> listed sounds like a good idea to me. I would only add dropping of either
> > >> Timestamp or Date from Core and probably also Class, like Robert suggested.
> > >>
> > >> [1]
> > >> https://en.wikipedia.org/wiki/Epoch_%28reference_date%29#Notable_epoch_dates_in_computing
> > >>
> > >> -----Original Message-----
> > >> From: Robert Dale <robd...@gmail.com>
> > >> Sent: Friday, May 25, 2018 15:43
> > >> To: dev@tinkerpop.apache.org
> > >> Subject: Re: [DISCUSS] Handling of problematic GraphSON types
> > >>
> > >> There should be a guiding principle on this to make these decisions
> > >> clearer. First would be: Gremlin should not concern itself with storage
> > >> schemas. As an extension of that, Gremlin should not concern itself with
> > >> storage size. Next would be: Gremlin should not be Java-specific. Finally,
> > >> it should be hard to add a new type, i.e. it's demonstrably difficult to
> > >> do a real world traversal without this type, how GLVs would map it, what
> > >> functions on that type should be a part of Gremlin, and n>1 people
> > >> positively affirm this direction.
> > >>
> > >> Thus, there should be a minimal Core on which most else can be built.
> > >> All extended types should be dropped.
> > >> From Core, these should be dropped:
> > >> Class (unless this is used for something important? Too many results on
> > >> 'Class' in the codebase. Otherwise, it's just a string), Date (is a long),
> > >> Timestamp (is a long, what's the diff to Date anyway?). There should be
> > >> one floating point type which is 64-bit. There should be one integer type
> > >> which is 64-bit. There should be a boolean (which seems to be completely
> > >> missing??).
> > >>
> > >>
> > >> Robert Dale
> > >>
> > >> On Fri, May 25, 2018 at 3:37 AM, Jorge Bay Gondra <jorgebaygon...@gmail.com>
> > >> wrote:
> > >>
> > >> > Thanks Florian for starting the discussion on this topic!
> > >> >
> > >> > I think it's a good exercise to evaluate which types are necessary for
> > >> > a GLV to support.
> > >> >
> > >> > I went through a similar exercise when designing the binary
> > >> > serialization format. I'll go ahead and propose:
> > >> > All types that are considered "Core", "Graph Structure" and "Graph Process"
> > >> > in GraphSON3
> > >> > <http://tinkerpop.apache.org/docs/current/dev/io/#_core_2>
> > >> > plus the following from the "Extended" list:
> > >> > - Short
> > >> > - Byte
> > >> > - ByteBuffer
> > >> > - BigInteger
> > >> > - BigDecimal
> > >> >
> > >> > The rationale is to select types that *can't be represented and
> > >> > stored* using other types.
> > >> > For example:
> > >> > - Short can be stored using an int backing field, but it would take
> > >> >   twice the space.
> > >> > - BigDecimal can be stored using a ByteBuffer but ordering on a buffer
> > >> >   doesn't align with decimal ordering.
> > >> >
> > >> > Regarding serialization and deserialization asymmetry on GLVs (for
> > >> > Core and Extended types), I think we should avoid it as it could lead
> > >> > to obscure error messages on the user side.
> > >> >
> > >> > I think we should provide a comprehensive type representation but it
> > >> > doesn't have to contain any type imaginable. The Gremlin Server and
> > >> > the GLVs provide extension mechanisms that vendors and users can use
> > >> > to support other types.
> > >> >
> > >> > 2018-05-24 14:31 GMT+02:00 Florian Hockmann <f...@florian-hockmann.de>:
> > >> >
> > >> > > As part of the discussion for the pull request by Daniel C. Weber
> > >> > > that adds support for more extended GraphSON types to Gremlin.Net [1]
> > >> > > we identified several of those types to be problematic for non-Java
> > >> > > languages (or at least for .NET in this case) as they don't really
> > >> > > have counterparts in other languages and for some it was even
> > >> > > difficult to say where they differ from each other.
> > >> > >
> > >> > > Now the question is basically what we want to do with those
> > >> > > problematic types.
> > >> > >
> > >> > > My suggestion would be an approach like this:
> > >> > >
> > >> > > 1. Identify types that are problematic and that we therefore don't
> > >> > >    want to support across all GLVs.
> > >> > > 2. Communicate to users somehow which types are problematic (something
> > >> > >    like a deprecation) as we won't support them in all GLVs and maybe
> > >> > >    even stop supporting them at all at some point in the future.
> > >> > > 3. Support the remaining types in all GLVs.
> > >> > >
> > >> > > Does that sound like a good plan? Are there any good ideas for the
> > >> > > deprecation of those problematic types? My first idea would be to
> > >> > > put them in a different section in the I/O docs [2] that explains at
> > >> > > the beginning that and why they are deprecated, but maybe someone
> > >> > > here has a better idea.
> > >> > >
> > >> > > Another question that was brought up during the review of the
> > >> > > mentioned PR by Jorge was whether types should only be supported
> > >> > > symmetrically or whether GLVs should try to support types as well as
> > >> > > they can. If someone has good arguments or a strong opinion for
> > >> > > either side then it would of course also be good to hear them.
> > >> > >
> > >> > > To give a concrete example of what is meant by symmetric support:
> > >> > >
> > >> > > In its current form the PR deserializes both GraphSON types
> > >> > > gx:Duration and gx:Period to the .NET type TimeSpan and it serializes
> > >> > > TimeSpan back to gx:Duration. This means that gx:Duration is supported
> > >> > > symmetrically, but gx:Period is not as there exists no .NET
> > >> > > serializer that creates a gx:Period.
> > >> > >
> > >> > > [1] https://github.com/apache/tinkerpop/pull/842
> > >> > >
> > >> > > [2] http://tinkerpop.apache.org/docs/current/dev/io/#_extended_2
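
For reference, the gx:Duration / gx:Period example from the original post can be checked mechanically. What follows is only a rough sketch, assuming a toy GLV that keeps one lookup table per direction. The names DESERIALIZERS, SERIALIZERS and asymmetric_tags are hypothetical and not part of any TinkerPop API, and Python's timedelta stands in for the .NET TimeSpan type:

```python
from datetime import timedelta

# Hypothetical registries for a toy GLV (illustration only, not TinkerPop API).
# Deserializers: GraphSON type tag -> native type produced on the client.
DESERIALIZERS = {
    "gx:Duration": timedelta,  # stands in for .NET TimeSpan
    "gx:Period": timedelta,    # also mapped to the same native type
    "g:Int64": int,
}

# Serializers: native type -> GraphSON type tag written back to the server.
SERIALIZERS = {
    timedelta: "gx:Duration",
    int: "g:Int64",
}


def asymmetric_tags(deserializers, serializers):
    """Return the tags whose native type does not serialize back to the same tag."""
    return sorted(
        tag
        for tag, native_type in deserializers.items()
        if serializers.get(native_type) != tag
    )


print(asymmetric_tags(DESERIALIZERS, SERIALIZERS))  # ['gx:Period']
```

A check along these lines would flag gx:Period as the only tag that cannot round-trip, which matches the asymmetry described in the PR discussion. Whether such a tag should be dropped or given its own native type is the open question in this thread.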