> There are also some cases where it logically makes some sense to safely
have one but not the other. A GLV likely doesn't need to support a Bytecode
deserializer because it doesn't receive bytecode from the server

Agreed, I don't think it's necessary to support deserialization for
types that are never going to be sent from the server, like some of the
types under Graph Process
<http://tinkerpop.apache.org/docs/current/dev/io/#_graph_process>.

I also agree that for GLVs that have a more limited type system (like
JavaScript or Python), we should do what is best for the user and solve it
case by case.

I wanted to stress the need for symmetry for GLVs where we have rich type
systems (in this case .NET/Java) for Core and Extended types, for which
supporting deserialization and not serialization can cause obscure errors,
like:
TinkerPop "Type A" is deserialized as "Type C1" in Gremlin-X GLV, but "Type
C1" instances can't be serialized to "Type A".

TypeC1 value = g.V().has("name", "jorge").values("propA").next();
// The following would fail
g.V().has("name", "jorge").property("propA", value).next();

I think in this case, it's preferable to have 1-to-1 mappings or no
mapping at all (implementors/vendors could support it, if interested).
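
To make the failure mode concrete, here is a minimal Java sketch of a
client-side type registry. Everything in it is hypothetical and made up for
illustration (TypeC1, TypeRegistrySketch, the "TypeA" tag); it is not the API
of any existing GLV. It only shows how a reader without a matching writer lets
a value come in but not go back out, and how a simple symmetry check could
catch that early.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical client-side type; stands in for "Type C1" from the example above. */
final class TypeC1 {
    final String raw;
    TypeC1(String raw) { this.raw = raw; }
}

/** Hypothetical registry sketch; an assumption for illustration, not a real GLV API. */
final class TypeRegistrySketch {
    // Wire type tag -> function that builds a client-side object from the raw value.
    private final Map<String, Function<String, Object>> readers = new HashMap<>();
    // Client-side class -> wire type tag used when sending the value back.
    private final Map<Class<?>, String> writers = new HashMap<>();

    void registerReader(String tag, Function<String, Object> reader) {
        readers.put(tag, reader);
    }

    void registerWriter(Class<?> type, String tag) {
        writers.put(type, tag);
    }

    /** Deserializes a raw value; fails if no reader is registered for the tag. */
    Object read(String tag, String raw) {
        Function<String, Object> reader = readers.get(tag);
        if (reader == null) {
            throw new IllegalStateException("no deserializer for " + tag);
        }
        return reader.apply(raw);
    }

    /** Looks up the wire tag for a value; fails if its class has no writer. */
    String tagFor(Object value) {
        String tag = writers.get(value.getClass());
        if (tag == null) {
            throw new IllegalStateException(
                "no serializer for " + value.getClass().getSimpleName());
        }
        return tag;
    }

    /** True when every tag that has a reader is also the target of some writer. */
    boolean isSymmetric() {
        return writers.values().containsAll(readers.keySet());
    }
}
```

With only a reader registered for "TypeA", read(...) returns a TypeC1 but
tagFor(...) on that same instance throws, which is the obscure error I'm
worried about. Registering the matching writer makes isSymmetric() return true.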


2018-05-30 12:45 GMT+02:00 Stephen Mallette <spmalle...@gmail.com>:

> I think the original thread spread off in too many different directions.
> I'm going to leave that original one to talk about future binary format
> stuff, type deprecation, etc. and make this new one to focus on getting
> this PR to close:
>
> https://github.com/apache/tinkerpop/pull/842
>
> which is currently stuck on whether or not it is important for us to have
> symmetry in serialization (i.e. everything a GLV can serialize must also be
> deserialized). I'll paste up my last thoughts on that from my previous post
> below:
>
> > Regarding serialization and deserialization asymmetry on GLVs (for Core
> > and Extended types), I think we should avoid it as it could lead to
> > obscure error messages on the user side.
>
> In the past, I think TinkerPop (going back to 2.x) has been ok with it and
> I'm not so sure that I recall any specific problems that were ever voiced
> by users on the subject. As it stands, I think we already have some
> asymmetry in gremlin-python so there is some precedent for it. There are
> also some cases where it logically makes some sense to safely have one but
> not the other. A GLV likely doesn't need to support a Bytecode deserializer
> because it doesn't receive bytecode from the server. It only needs to send
> bytecode and thus only has a serializer - at least until we have GVMs
> instead of GLVs :)  Does that change your thinking at all Jorge?
>
>
>
>
>
> On Tue, May 29, 2018 at 12:45 PM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > >  Regarding serialization and deserialization asymmetry on GLVs (for
> > Core and Extended types), I think we should avoid it as it could lead to
> > obscure error messages on the user side.
> >
> > In the past, I think TinkerPop (going back to 2.x) has been ok with it
> > and I'm not so sure that I recall any specific problems that were ever
> > voiced by users on the subject. As it stands, I think we already have some
> > asymmetry in gremlin-python so there is some precedent for it. There are
> > also some cases where it logically makes some sense to safely have one
> > but not the other. A GLV likely doesn't need to support a Bytecode
> > deserializer because it doesn't receive bytecode from the server. It only
> > needs to send bytecode and thus only has a serializer - at least until we
> > have GVMs instead of GLVs :)  Does that change your thinking at all Jorge?
> >
> > > First would be: Gremlin should not concern itself with storage
> > > schemas.....
> >
> > I like all of Robert's first paragraph because it makes Jorge's binary
> > format proposal that much easier to get right. JanusGraph, DSE Graph and
> > others won't have any trouble with this approach because the backend will
> > simply know that the particular property that this number is going into
> > will be a float and will coerce it as such on storage. I just wonder
> > exactly how graphs that don't have schemas like neo4j/tinkergraph will
> > deal with someone sending a "Number". What happens in that case?
> >
> > On Mon, May 28, 2018 at 4:20 AM, Florian Hockmann
> > <f...@florian-hockmann.de> wrote:
> >
> >> > these should be dropped: Class (unless this is used for something
> >> > important? Too many results on 'Class' in the codebase.
> >>
> >> 'Class' is for example used for 'withoutStrategies' but I agree that
> >> this would probably be better handled just as a string. 'Class' is
> >> Java-specific, which doesn't make much sense when graph providers want to
> >> implement TinkerPop in a language other than Java.
> >>
> >> Apart from that, I'm not sure I get your reasoning behind dropping types
> >> like Date, Int32, and float. It's really trivial in most languages to
> >> add serializers for more numerical types so I don't really see why we
> >> should drop them when they make the storage more efficient and reduce the
> >> need for type casts in user code.
> >> For Date, you say that it's just a long. Sure, but how does the receiver
> >> know that the long should be deserialized to a Date in this case? As a
> >> user I want to work with a Date object and not just with a long. Also, we
> >> nevertheless need a convention of what this long represents: Milliseconds
> >> since January 1, 1970 (POSIX)? Since January 1, 1 (.NET)? Since December
> >> 31, 1899 (C++ 7.0)? (There are a lot more epoch dates [1].) g:Date is
> >> basically just this convention which is why I would keep it.
> >>
> >> > There should be a boolean (which seems to be completely missing??).
> >>
> >> Yeah, boolean and string are both just serialized without type
> >> information right now. Maybe we want to change that if we ever introduce
> >> GraphSON 4.
> >>
> >>
> >> Jorge's suggestion to drop all extended types except for the five he
> >> listed sounds like a good idea to me. I would only add dropping of
> >> either Timestamp or Date from Core and probably also Class, like Robert
> >> suggested.
> >>
> >> [1]
> >> https://en.wikipedia.org/wiki/Epoch_%28reference_date%29#Notable_epoch_dates_in_computing
> >>
> >> -----Original Message-----
> >> From: Robert Dale <robd...@gmail.com>
> >> Sent: Friday, May 25, 2018 15:43
> >> To: dev@tinkerpop.apache.org
> >> Subject: Re: [DISCUSS] Handling of problematic GraphSON types
> >>
> >> There should be a guiding principle on this to make these decisions
> >> clearer.  First would be: Gremlin should not concern itself with storage
> >> schemas. As an extension of that, Gremlin should not concern itself with
> >> storage size. Next would be: Gremlin should not be Java-specific.
> >> Finally, it should be hard to add a new type, i.e. it must be
> >> demonstrably difficult to do a real-world traversal without this type, it
> >> must be clear how GLVs would map it and what functions on that type
> >> should be a part of Gremlin, and n>1 people must positively affirm this
> >> direction.
> >>
> >> Thus, there should be a minimal Core on which most else can be built.
> >> All extended types should be dropped. From Core, these should be
> >> dropped: Class (unless this is used for something important? Too many
> >> results on 'Class' in the codebase. Otherwise, it's just a string), Date
> >> (is a long), Timestamp (is a long, what's the diff to Date anyway?).
> >> There should be one floating point type which is 64-bit. There should be
> >> one integer type which is 64-bit. There should be a boolean (which seems
> >> to be completely missing??).
> >>
> >>
> >> Robert Dale
> >>
> >> On Fri, May 25, 2018 at 3:37 AM, Jorge Bay Gondra
> >> <jorgebaygon...@gmail.com> wrote:
> >>
> >> > Thanks Florian for starting the discussion on this topic!
> >> >
> >> > I think it's a good exercise to evaluate which types are necessary for
> >> > a GLV to support.
> >> >
> >> > I went through a similar exercise when designing the binary
> >> > serialization format. I'll go ahead and propose:
> >> > All types that are considered "Core", "Graph Structure" and
> >> > "Graph Process" in GraphSON3
> >> > <http://tinkerpop.apache.org/docs/current/dev/io/#_core_2>
> >> > plus the following from the "Extended" list:
> >> > - Short
> >> > - Byte
> >> > - ByteBuffer
> >> > - BigInteger
> >> > - BigDecimal
> >> >
> >> > The rationale is to select types that *can't be represented and
> >> > stored* using other types.
> >> > For example:
> >> > - Short can be stored using an int backing field, but it would take
> >> > twice the space.
> >> > - BigDecimal can be stored using a ByteBuffer but ordering on a buffer
> >> > doesn't align with decimal ordering.
> >> >
> >> > Regarding serialization and deserialization asymmetry on GLVs (for
> >> > Core and Extended types), I think we should avoid it as it could lead
> >> > to obscure error messages on the user side.
> >> >
> >> > I think we should provide a comprehensive type representation but it
> >> > doesn't have to contain every type imaginable. The Gremlin Server and
> >> > the GLVs provide extension mechanisms that vendors and users can use
> >> > to support other types.
> >> >
> >> > 2018-05-24 14:31 GMT+02:00 Florian Hockmann <f...@florian-hockmann.de>:
> >> >
> >> > > As part of the discussion for the pull request by Daniel C. Weber
> >> > > that adds support for more extended GraphSON types to Gremlin.Net
> >> > > [1] we identified several of those types to be problematic for
> >> > > non-Java languages (or at least for .NET in this case) as they don't
> >> > > really have counterparts in other languages and for some it was even
> >> > > difficult to say where they differ from each other.
> >> > >
> >> > >
> >> > >
> >> > > Now the question is basically what we want to do with those
> >> > > problematic types.
> >> > >
> >> > >
> >> > >
> >> > > My suggestion would be an approach like this:
> >> > >
> >> > > 1.      Identify types that are problematic and that we therefore
> >> > > don't want to support across all GLVs.
> >> > > 2.      Communicate to users somehow which types are problematic
> >> > > (something like a deprecation) as we won't support them in all GLVs
> >> > > and maybe even stop supporting them at all at some point in the future.
> >> > > 3.      Support the remaining types in all GLVs.
> >> > >
> >> > >
> >> > >
> >> > > Does that sound like a good plan? Are there any good ideas for the
> >> > > deprecation of those problematic types? My first idea would be to
> >> > > put them in a different section in the I/O docs [2] that explains at
> >> > > the beginning that and why they are deprecated, but maybe someone
> >> > > here has a better idea.
> >> > >
> >> > >
> >> > >
> >> > > Another question that was brought up during the review of the
> >> > > mentioned PR by Jorge was whether types should only be supported
> >> > > symmetrically or whether GLVs should try to support types as well as
> >> > > they can. If someone has good arguments or a strong opinion for
> >> > > either side then it would of course also be good to hear them.
> >> > >
> >> > > To give a concrete example of what is meant by symmetric support:
> >> > >
> >> > > In its current form the PR deserializes both GraphSON types
> >> > > gx:Duration and gx:Period to the .NET type TimeSpan and it serializes
> >> > > TimeSpan back to gx:Duration. This means that gx:Duration is
> >> > > supported symmetrically, but gx:Period is not as there exists no
> >> > > .NET serializer that creates a gx:Period.
> >> > >
> >> > >
> >> > >
> >> > > [1] https://github.com/apache/tinkerpop/pull/842
> >> > >
> >> > > [2] http://tinkerpop.apache.org/docs/current/dev/io/#_extended_2
> >> > >
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >>
> >
>
