Hello Ken,

I don't know that I have a strong opinion on what NULL==NULL should evaluate to, but I agree we should come up with a set of rules here for consistency, both within Gremlin and with other database language standards (e.g. GQL and SQL), so that Gremlin best matches customer expectations. Gremlin's divergence from user expectations when it comes to null handling has been a constant headache for new users. While I agree with Josh that a type system would make this easier, we still need to be consistent until we cross that bridge.

For example, suppose you have a list, A, which is [1,2,null] and a list, B, which is [1,null]. Should the result of an INTERSECT be [1,null] or [1]? In Postgres, this would be [1,null], so that is probably what I would recommend unless someone feels strongly that we should do something different.

Josh, I am familiar with the work you did on Dragon, and I am curious how you see your work aligning with the recent SIGMOD paper [1] from the LDBC working group on PG Schema?

Dave

[1] https://arxiv.org/abs/2211.10962

On Sat, Aug 5, 2023 at 7:02 AM Joshua Shinavier <j...@fortytwo.net> wrote:

> Hi Ken,
>
> Yes indeed, there is that push. I am not saying that Gremlin shouldn't have
> a type system -- just that certain questions will have better answers once
> it does. While I am not drawing a lot of attention to it yet in connection
> with TinkerPop, there is a type system I am going to propose for TinkerPop.
> The formalism is called Lambda Graph, and it is closely related to the
> Algebraic Property Graphs [1] model which was implemented by Dragon [2]. I
> made a big deal about Dragon three years ago and then was unable to release
> it, so I'm waiting until Hydra [3] is completely ready before promoting it
> here. That said, it's not far from being ready. We are building property
> graph (not yet TinkerPop) applications with it at LinkedIn. I recently gave
> a presentation [4] on the data model which has excerpts from the Lambda
> Graph paper draft. In terms of property types, probably the first thing I
> will explore is integrating Hydra's "TinkerPop" model [5] with TinkerPop
> proper. In that model, property types are parameterized and unspecified, as
> are vertex and edge id types; different type systems for properties and ids
> can be plugged in here. For Hydra's core type system, see hydra/core.Type
> [6]. This type system behaves as I described above: there are no "nulls",
> but there are optionals, which are comparable to the extent that the base
> type is comparable.
>
> Josh
>
> [1] https://arxiv.org/abs/1909.04881
> [2] https://www.uber.com/blog/dragon-schema-integration-at-uber-scale/
> [3] https://github.com/CategoricalData/hydra
> [4] https://docs.google.com/presentation/d/1PF0K3KtopV0tMVa0sGBW2hDA7nw-cSwQm6h1AED1VSA
> [5] https://categoricaldata.github.io/hydra/hydra-java/javadoc/hydra/langs/tinkerpop/propertyGraph/package-summary.html
> [6] https://categoricaldata.github.io/hydra/hydra-java/javadoc/hydra/core/Type.html
>
> On Fri, Aug 4, 2023 at 6:23 PM Ken Hu <k...@bitquilltech.com.invalid> wrote:
>
> > Hi Josh,
> >
> > Thanks for your input. There seems to be a push in the graph database world
> > towards having a schema. It's likely something like this would be
> > introduced in TinkerPop in the future. Let's assume that TinkerPop does
> > support schemas, and therefore would have a type system, would this change
> > your opinion on the matter?
> >
> > Thanks again,
> > Ken
> >
> > On Wed, Aug 2, 2023 at 3:54 PM Joshua Shinavier <j...@fortytwo.net> wrote:
> >
> > > For what it is worth, I think the question of whether null == null is only
> > > meaningful in the context of a specific type system, which Gremlin so far
> > > does not provide. My personal preference is to avoid SQL-style nulls and
> > > achieve optionality through union types (e.g. Java's Optional or Haskell's
> > > Maybe). In the case of two lists, if you can assume that the type of the
> > > list is list<optional<int>>, then you can safely treat null like
> > > Optional.empty(), and compare it with another null of the same logical type
> > > (int). If that is the interpretation of your two lists, then the
> > > intersection is [1, null].
> > >
> > > Josh
> > >
> > > On Tue, Aug 1, 2023 at 5:47 PM Ken Hu <k...@bitquilltech.com.invalid> wrote:
> > >
> > > > Hi All,
> > > >
> > > > As Gremlin evolves and gains more functionality, it is important that we
> > > > establish some fundamental rules to provide consistency in results. One
> > > > such question that we should come to agreement on is how null values are
> > > > compared. Currently, Gremlin seems to mostly follow the comparison that is
> > > > used in Java where NULL == NULL returns TRUE. However, in many other
> > > > database systems, NULL == NULL would return FALSE (or NULL).
> > > >
> > > > This question comes about as I'm starting to look a little deeper into the
> > > > proposed list functions. An example of where this is applicable is the
> > > > INTERSECT list function. For example, if you have a list, A, which is
> > > > [1,2,null] and a list, B, which is [1,null]. Should the result of an
> > > > INTERSECT be [1,null] or [1]?
> > > >
> > > > I think it makes sense in Gremlin for us to follow the rule that most
> > > > programming languages follow which is the former (NULL == NULL returns
> > > > TRUE) because it feels more in line with how Gremlin was meant to be used
> > > > (together with your code rather than as a string query). In this case the
> > > > return value would be [1,null].
> > > >
> > > > What are your thoughts on this subject?
> > > >
> > > > Thanks,
> > > > Ken
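
To make the two candidate answers in this thread concrete, here is a minimal Python sketch of an INTERSECT over lists under each null-comparison rule. It is only an illustration: the function names intersect_null_equal and intersect_null_unknown are hypothetical, Python's None stands in for null, and none of this is Gremlin's actual implementation or API.

```python
from typing import List, Optional

Item = Optional[int]  # None stands in for null in this sketch


def intersect_null_equal(a: List[Item], b: List[Item]) -> List[Item]:
    """Hypothetical intersection where null matches null (NULL == NULL is TRUE)."""
    result: List[Item] = []
    for x in a:
        # `in` treats None as equal to None, so a null in both lists is kept.
        if x in b and x not in result:
            result.append(x)
    return result


def intersect_null_unknown(a: List[Item], b: List[Item]) -> List[Item]:
    """Hypothetical intersection where NULL == NULL is never TRUE, so null never matches."""
    result: List[Item] = []
    for x in a:
        if x is None:
            continue  # a null on the left can never be matched
        if any(y is not None and x == y for y in b) and x not in result:
            result.append(x)
    return result


a = [1, 2, None]
b = [1, None]
print(intersect_null_equal(a, b))    # [1, None]
print(intersect_null_unknown(a, b))  # [1]
```

The first function corresponds to the [1,null] answer that Ken, Josh, and Dave each lean toward; the second shows what the result would look like if list functions instead used a comparison in which null is never equal to null.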