Hi everyone,

I just remembered that we already have a well-documented set of rules for the
comparability and equality of nulls and mismatched types, which is relevant to
this discussion. It is part of the provider docs:
https://tinkerpop.apache.org/docs/current/dev/provider/#gremlin-semantics-equality-comparability

We may still have some work to do to follow our own rules consistently, but the
consensus here seems to agree with the docs that null == null is true. I agree
that this would be worth revisiting if a type system is introduced to
TinkerPop.
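
As a rough illustration of what is at stake for the INTERSECT example from
earlier in the thread, here is a minimal Java sketch. The intersect helper and
its nullEqualsNull flag are hypothetical names made up for this illustration;
they are not part of Gremlin or TinkerPop, and the sketch makes no claim about
how the proposed list functions are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NullIntersect {
    // Hypothetical helper (not a Gremlin API): keeps each element of a that
    // has an equal element in b. The flag selects the null semantics:
    //   true  -> null == null is treated as true
    //   false -> null never matches anything
    static <T> List<T> intersect(List<T> a, List<T> b, boolean nullEqualsNull) {
        List<T> result = new ArrayList<>();
        for (T x : a) {
            for (T y : b) {
                boolean equal = (x == null || y == null)
                        ? (nullEqualsNull && x == null && y == null)
                        : x.equals(y);
                if (equal) {
                    result.add(x);
                    break; // stop scanning b once x has a match
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, null);
        List<Integer> b = Arrays.asList(1, null);
        System.out.println(intersect(a, b, true));  // prints [1, null]
        System.out.println(intersect(a, b, false)); // prints [1]
    }
}
```

The sketch does not deduplicate, so repeated elements in the first list would
be repeated in the result.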

Josh, I look forward to hearing more on your proposed type system once it is 
ready.

Regards,

Cole

From: Joshua Shinavier <j...@fortytwo.net>
Date: Tuesday, August 8, 2023 at 7:23 AM
To: dev@tinkerpop.apache.org <dev@tinkerpop.apache.org>
Subject: Re: [DISCUSS] Is null equal to null

Hi Dave,

I declined to add my name to that paper. I worked closely with most of the
authors for about 7 months in 2021 when I led a PGSWG subgroup on property
types. We met weekly (minutes
<https://docs.google.com/document/d/1-YcfzgCJ5zXzDq_lL0EfMzx-9M_DM5pGJ66hzfYLR_A/edit>)
to define a type system for properties and, by extension, vertices and
edges. This was a pretty interesting time, with vigorous debate about
nominative vs. structural types and schema on write vs. schema on read, but
there were still a lot of unresolved questions when we paused the working
group. The paper was written a year later in the space of a few weeks -- I
was invited to be involved, but didn't have time, and didn't approve of a
couple of aspects of the paper draft -- above all, that the proposal
ignored the still-open, still-relevant questions about the formalism in
favor of just "getting something out there". Some of my influences made it
in -- e.g. the graph data model feature matrix, which I introduced at the
2019 Dagstuhl seminar. So, it's a paper by my friends and colleagues which
captures a lot of the major concerns of the working group and of schemas
for property graphs -- I am just not satisfied with the actual formalism,
and don't see it as definitive. My preference these days is to map graph
schemas into some other, well-studied formalism like typed lambda calculi
(in the case of Lambda Graph) rather than creating a type system
specifically for property graphs as in that paper, or even as in APG. That
gives you a lot more freedom to introduce variants of the data model (e.g.
if an application needs constraints on property values like min/max, regex,
etc., it is a short step from a property graph model based on System F to
one with dependent types). I am also cautious of making any concessions to
SQL which would weaken the type system, e.g. by including nulls in
primitive types.
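
To make the contrast with nulls concrete, here is a minimal Java sketch,
assuming the two example lists from earlier in the thread are typed as
List<Optional<Integer>> (the list<optional<int>> reading). OptionalIntersect
and intersect are hypothetical names for this illustration only; they are not
Hydra or TinkerPop APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class OptionalIntersect {
    // Hypothetical helper: intersection of two lists of optionals.
    // Optional.equals treats two empty optionals as equal, and two present
    // optionals as equal when their values are equal, so no special null
    // handling is needed.
    static <T> List<Optional<T>> intersect(List<Optional<T>> a, List<Optional<T>> b) {
        return a.stream()
                .filter(b::contains)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Optional<Integer>> a =
                Arrays.asList(Optional.of(1), Optional.of(2), Optional.<Integer>empty());
        List<Optional<Integer>> b =
                Arrays.asList(Optional.of(1), Optional.<Integer>empty());
        System.out.println(intersect(a, b)); // prints [Optional[1], Optional.empty]
    }
}
```

Here the "missing" value is an ordinary member of the element type, so it is
comparable to the same extent that the base type is.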

Josh



On Mon, Aug 7, 2023 at 3:09 PM David Bechberger <d...@bechberger.com> wrote:

> Hello Ken,
>
> I don't know that I have a strong opinion on what NULL==NULL should
> evaluate to, but I agree we should come up with a set of rules here for
> consistency, both within Gremlin and with other database language
> standards (e.g. GQL and SQL) so that Gremlin best matches customer
> expectations. Gremlin's divergence from user expectations when it comes to
> null handling has been a constant headache for new users. While I agree
> with Josh that a type system would make this easier, we still need to be
> consistent until we cross that bridge.
>
> For example, if you have a list, A, which is [1,2,null] and a list, B, which
> is [1,null], should the result of an INTERSECT be [1,null] or [1]?
>
> In Postgres, this would be [1,null], so that is probably what I would
> recommend unless someone has a strong argument for doing something different.
>
> Josh, I am familiar with the work you did on Dragon, and I am curious how
> you see your work aligning with the recent SIGMOD paper [1] from the LDBC
> working group on PG Schema?
>
> Dave
>
> [1] https://arxiv.org/abs/2211.10962
>
>
> On Sat, Aug 5, 2023 at 7:02 AM Joshua Shinavier <j...@fortytwo.net> wrote:
>
> > Hi Ken,
> >
> > Yes indeed, there is that push. I am not saying that Gremlin shouldn't have
> > a type system -- just that certain questions will have better answers once
> > it does. While I am not drawing a lot of attention to it yet in connection
> > with TinkerPop, there is a type system I am going to propose for TinkerPop.
> > The formalism is called Lambda Graph, and it is closely related to the
> > Algebraic Property Graphs [1] model which was implemented by Dragon [2]. I
> > made a big deal about Dragon three years ago and then was unable to release
> > it, so I'm waiting until Hydra [3] is completely ready before promoting it
> > here. That said, it's not far from being ready. We are building property
> > graph (not yet TinkerPop) applications with it at LinkedIn. I recently gave
> > a presentation [4] on the data model which has excerpts from the Lambda
> > Graph paper draft. In terms of property types, probably the first thing I
> > will explore is integrating Hydra's "TinkerPop" model [5] with TinkerPop
> > proper. In that model, property types are parameterized and unspecified, as
> > are vertex and edge id types; different type systems for properties and ids
> > can be plugged in here. For Hydra's core type system, see hydra/core.Type
> > [6]. This type system behaves as I described above: there are no "nulls",
> > but there are optionals, which are comparable to the extent that the base
> > type is comparable.
> >
> > Josh
> >
> > [1] https://arxiv.org/abs/1909.04881
> > [2] https://www.uber.com/blog/dragon-schema-integration-at-uber-scale/
> > [3] https://github.com/CategoricalData/hydra
> > [4] https://docs.google.com/presentation/d/1PF0K3KtopV0tMVa0sGBW2hDA7nw-cSwQm6h1AED1VSA
> > [5] https://categoricaldata.github.io/hydra/hydra-java/javadoc/hydra/langs/tinkerpop/propertyGraph/package-summary.html
> > [6] https://categoricaldata.github.io/hydra/hydra-java/javadoc/hydra/core/Type.html
> >
> >
> > On Fri, Aug 4, 2023 at 6:23 PM Ken Hu <k...@bitquilltech.com.invalid>
> > wrote:
> >
> > > Hi Josh,
> > >
> > > Thanks for your input. There seems to be a push in the graph database
> > > world towards having a schema. It's likely something like this would be
> > > introduced in TinkerPop in the future. Assuming that TinkerPop does
> > > support schemas, and therefore has a type system, would this change your
> > > opinion on the matter?
> > >
> > > Thanks again,
> > > Ken
> > >
> > > On Wed, Aug 2, 2023 at 3:54 PM Joshua Shinavier <j...@fortytwo.net>
> > wrote:
> > >
> > > > For what it is worth, I think the question of whether null == null is
> > > > only meaningful in the context of a specific type system, which Gremlin
> > > > so far does not provide. My personal preference is to avoid SQL-style
> > > > nulls and achieve optionality through union types (e.g. Java's Optional
> > > > or Haskell's Maybe). In the case of two lists, if you can assume that
> > > > the type of the list is list<optional<int>>, then you can safely treat
> > > > null like Optional.empty(), and compare it with another null of the
> > > > same logical type (int). If that is the interpretation of your two
> > > > lists, then the intersection is [1, null].
> > > >
> > > > Josh
> > > >
> > > >
> > > >
> > > > On Tue, Aug 1, 2023 at 5:47 PM Ken Hu <k...@bitquilltech.com.invalid>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > As Gremlin evolves and gains more functionality, it is important that
> > > > > we establish some fundamental rules to provide consistency in results.
> > > > > One such question that we should come to agreement on is how null
> > > > > values are compared. Currently, Gremlin seems to mostly follow the
> > > > > comparison that is used in Java where NULL == NULL returns TRUE.
> > > > > However, in many other database systems, NULL == NULL would return
> > > > > FALSE (or NULL).
> > > > >
> > > > > This question comes about as I'm starting to look a little deeper into
> > > > > the proposed list functions. An example of where this is applicable is
> > > > > the INTERSECT list function. For example, if you have a list, A, which
> > > > > is [1,2,null] and a list, B, which is [1,null]. Should the result of an
> > > > > INTERSECT be [1,null] or [1]?
> > > > >
> > > > > I think it makes sense in Gremlin for us to follow the rule that most
> > > > > programming languages follow which is the former (NULL == NULL returns
> > > > > TRUE) because it feels more in line with how Gremlin was meant to be
> > > > > used (together with your code rather than as a string query). In this
> > > > > case the return value would be [1,null].
> > > > >
> > > > > What are your thoughts on this subject?
> > > > >
> > > > > Thanks,
> > > > > Ken
> > > > >
> > > >
> > >
> >
>
