Based on the current discussion, I'll make all() and any() behave like
and() and or() where they are ternary-aware steps.

On Wed, Sep 20, 2023 at 4:53 PM Cole Greer <cole.gr...@improving.com.invalid>
wrote:

> Hi Mike,
>
> Thanks for weighing in here. I think this discussion needs to be split in
> two parts. On one side there is the implementation discussion which Pieter
> and Ken have led which focuses on how to make this work for providers,
> should exceptions be used in control flow, etc. I think the second
> discussion needs to be about ensuring the semantics make sense for gremlin.
> My focus here is on the semantic discussion.
>
> I agree with your points that ternary-boolean/three-valued logic well
> defined and pervasive in many database systems. After all the semantics
> outlined in our provider docs (
> https://tinkerpop.apache.org/docs/3.6.0/dev/provider/#_ternary_boolean_logics)
> are essentially the same as comparison and logic semantics in the SQL
> standard. Where I have some concerns with the semantics in gremlin has to
> do with the boundaries of the logic expression. In SQL, it is very clear
> what is, and is not a part of a single logic expression. If I have
> something such as:
>
> WHERE x > 5
> OR
> (NOT y < 10);
>
> It is very well understood that everything following the WHERE should be
> considered as a single logic expression which is recursively composed of
> other logic expressions or comparisons. All of the ternary logic is
> isolated and contained within that composite expression, and it is well
> understood that WHERE will remove any results in which the composite
> expression does not return true.
>
> Gremlin on the other hand does not have a notion of a logic expression. We
> have “atomic” predicates like P.gt, and we have composite predicates such
> as P.and, but there is also logical FilterSteps such as AndStep and
> NotStep. These steps are what creates the fuzzy boundaries around a logic
> expression. As of right now, the equivalent of the above in gremlin would
> be something such as:
>
> Ex1: g…().as(“x”)…().as(“y”).where( or( select(“x”).is(P.gt(5)),
> select(“y”).is(P.not(P.lt(10))) ) )
>
> The question I want to ask about this is what part of that traversal is
> the “logic expression”. It is tempting to define it as everything within
> the where() step. That definition makes sense in this example but is
> difficult to formally define for general cases. If define a logic
> expression as the contents of a where() step, then in the following example
> (Ex2), there is a hard break between the inner where() step and the larger
> expression.
>
> Ex2: g…().as(“x”)…().as(“y”).where( or( select(“x”).is(P.gt(5)),
> where(select(“y”).is(P.not(P.lt(10)))) ) )
>
> In this example, the whole expression in the outer where()-step feels like
> a single composite logic expression and having the internal where()-step
> create a hard break and early reduction leads to an unexpected difference
> in semantics between Ex1 and Ex2. I have seen a few cases in the past in
> which the different error reduction points between Ex1 and Ex2 has created
> a lot of confusion for users.
>
> An additional bit of weird semantics we currently have due to
> implementation limitations is that Error states cannot get passed through
> any parent step which is not a FilterStep. This is because the logic for
> handling errors is implemented within FilterStep and it is not clear what
> it would mean to reduce an error state to false from outside of the context
> of a FilterStep. However, the issue that this creates is that a traversal
> such as Ex3, will be treated as a single composite logic expression,
> whereas the union()-step in Ex4 forces early reduction and breaks this into
> 2 independent logic expressions.
>
> Ex3: …not( is(eq(0)) )
> Ex4: …not( union( is(eq(0)) ) )
>
> To add one more layer on top of this, we currently have certain strategies
> which may reorder, modify, replace, or remove certain steps as
> optimizations. Many of these optimizations have not properly considered the
> ternary boolean semantics and produce incorrect results in error cases.
>
> I have spent a fair bit of time thinking about this and I have not yet
> come up with a clear set of rules to determine the right boundary for a
> logic expression which feels intuitive, or at the very least is teachable.
> Any system I’ve come up with which encompasses steps within logic
> expression always leads to certain edge cases where the semantics no longer
> make sense. For this reason, I’m wondering if the best solution might be to
> set the boundary at the predicate level. In this case ternary boolean
> semantics would work within a composite predicate such as
> P.lt(5).and(P.not(P.eq(3))), however when returning a value out to the step
> level, it would have to be either true or false. Under this model we would
> no longer view a tree of And/Or/Not steps as a single logic expression but
> instead as a series of discrete steps. I think this may be the right path
> forward although it likely requires leveling up the capabilities of
> predicates in order to be more useful.
>
> I would be curious to know if others have any thoughts on what the right
> rules are for determining when to reduce from ternary boolean logic to true
> and false.
>
> Regards,
>
> Cole
>
>
>
> From: Mike Personick <m...@dayzero.io>
> Date: Wednesday, September 20, 2023 at 9:41 AM
> To: dev@tinkerpop.apache.org <dev@tinkerpop.apache.org>
> Subject: Re: [DISCUSS] Ternary Boolean Handling in New Steps
> Early reduction is not possible because of negation. Please see the
> discussion of ternary boolean semantics here:
>
>
> https://tinkerpop.apache.org/docs/3.6.0/dev/provider/#_ternary_boolean_logics
>
> On Tue, Sep 19, 2023 at 12:25 PM Ken Hu <k...@bitquilltech.com.invalid>
> wrote:
>
> > Thanks for your input Pieter.
> >
> > I agree with a lot of what you said and I think your suggestion is
> > reasonable. From what I can tell, the logic in the FilterStep is there
> > because reduction points are needed for the ternary boolean system. One
> of
> > the ways to move logic out of this area would be to get rid of ternary
> > boolean by immediately reducing any ERROR state to FALSE.
> >
> > On Sat, Sep 16, 2023 at 1:06 AM pieter <pieter.mar...@gmail.com> wrote:
> >
> > > Hi,
> > > I have not really applied my mind to issue of the semantics, but what I
> > > do recall is being irritated with `GremlinValueComparator` throwing
> > > `GremlinTypeErrorException`.
> > > Sometimes its propogated and sometimes swallowed. Code smell!!!
> > > Using exceptions as process logic right there in the heart of
> > > TinkerPop's iterator logic seemed to me as a bad idea and breaks
> > > providers ability to override classes.
> > > A good example is the logic in FilterStep.processNextStart() where the
> > > exception is being swalloed. This logic should not be here and
> > > exceptions should not be used for control flow.
> > > Providers expect the base filter step to filter, not conditionally
> > > swallow exceptions based on a long if statement.
> > > My suggestion is let the comparator do what comparators do and return a
> > > int. The type issue should be handled higher up the stack.
> > >
> > > Regards
> > > Pieter
> > > On Fri, 2023-09-15 at 14:14 -0700, Ken Hu wrote:
> > > >
> > > >
> > > > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Hi Cole, "However, it makes sense for this short-term decision
> > > > > > > to align with our long-term direction regarding comparability
> > > > > > > semantics. I
> > > > > > > wouldn’t be opposed to your proposed implementation if the
> > > > > > > long-term plan is to move all steps towards this immediate
> > > > > > > reduction behaviour." This is sort of my thinking as well. As
> > > > > > > you demonstrated in your post, there is already an
> > > > > > > inconsistency with the way ternary
> > > > > > > boolean is reduced which leads to different results for
> > > > > > > equivalent queries.
> > > > > > > This is why I would prefer to just move ahead with an
> > > > > > > implementation that I
> > > > > > > believe is the most consistent with the expectations of users.
> > > > > > > However, you
> > > > > > > have valid concerns about adding even more inconsistencies to
> > > > > > > the
> > > > > > > language so if others voice their concern as well then I'll
> > > > > > > make it
> > > > > > > behave more like AND and OR. Regards, Ken On Mon, Sep 11, 2023
> > > > > > > at 6:11 PM Cole Greer cole.gr...@improving.com.invalidwrote:
> > > > > >
> > > > >
> > > >
> > > >
> > > > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Hi Ken, Thanks for bringing this up, I believe topic
> > > > > > > > > > > warrants some
> > > > > > > > > > > further discussion. My understanding of the intent of
> > > > > > > > > > > the
> > > > > > > > > > > current system is that it aims to provide a consistent
> > > > > > > > > > > and predictable set of
> > > > > > > > > > > rules for comparisons between any datatypes. Prior to
> > > > > > > > > > > 3.6, in general
> > > > > > > > > > > comparisons between different types in gremlin produced
> > > > > > > > > > > undefined behaviour (in practice this usually meant an
> > > > > > > > > > > exception). The current system
> > > > > > > > > > > successfully resolved much of this issue although it
> > > > > > > > > > > has introduced certain
> > > > > > > > > > > semantic consistency issues (see
> > > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-2940).
> > > > > > > > > > > Further,
> > > > > > > > > > > while the docs (
> > > > > > > > > > >
> https://tinkerpop.apache.org/docs/3.7.0/dev/provider/#_
> > > > > > > > > > > ternary_boolean_logics)
> > > > > > > > > > > are quite clear regarding the propagation/reduction
> > > > > > > > > > > behaviour
> > > > > > > > > > > in many cases, as you probe the edges it becomes
> > > > > > > > > > > muddier. Considering the following example, the docs
> > > > > > > > > > > quite clearly
> > > > > > > > > > > define the expected behaviour of the first traversal,
> > > > > > > > > > > but the expected behaviour is not clear outside of
> > > > > > > > > > > basic combinations of AND,
> > > > > > > > > > > OR, and NOT: gremlin> g.inject(1).not(is(gt("one"))) //
> > > > > > > > > > > Produces no output
> > > > > > > > > > > gremlin> g.inject(1).not(union(is(gt("one")),
> > > > > > > > > > > is(eq("zero"))))
> > > > > > > > > > > ==>1 // Error is reduced to false prior to Union Step,
> > > > > > > > > > > and thus
> > > > > > > > > > > not propagated into the Not Step. This is a good
> > > > > > > > > > > example that we are currently in a bit of a
> > > > > > > > > > > weird place where some of the language semantics are
> > > > > > > > > > > formally defined
> > > > > > > > > > > in documentation, while the rest of the language
> > > > > > > > > > > semantics are
> > > > > > > > > > > defined by implementation. It currently cannot be
> > > > > > > > > > > determined if the above
> > > > > > > > > > > example is expected or a bug. I believe it is important
> > > > > > > > > > > that we find a resolution to
> > > > > > > > > > > this by expanding our formally defined semantics or
> > > > > > > > > > > changing the
> > > > > > > > > > > implementation (when a breaking change is permittable).
> > > > > > > > > > > As for the short-term question posed by ANY and ALL, my
> > > > > > > > > > > only concern with your suggestion is it would be
> > > > > > > > > > > subject to the
> > > > > > > > > > > following inconsistency although as shown above there
> > > > > > > > > > > is current
> > > > > > > > > > > precedent for this sort of thing. gremlin>
> > > > > > > > > > > g.inject(1).not(is(lt("one"))) // Produces no output
> > > > > > > > > > > gremlin> g.inject([1]).not(any(is(lt("one")))) ==>[1]
> > > > > > > > > > > In my opinion the most neutral direction would be for
> > > > > > > > > > > ANY to
> > > > > > > > > > > behave the same as a chain of OR’s and for ALL to act
> > > > > > > > > > > as a chain of
> > > > > > > > > > > ANDs. However, it makes sense for this short-term
> > > > > > > > > > > decision to align
> > > > > > > > > > > with our long-term direction regarding comparability
> > > > > > > > > > > semantics. I
> > > > > > > > > > > wouldn’t be opposed to your proposed implementation if
> > > > > > > > > > > the long-term plan is to
> > > > > > > > > > > move all steps towards this immediate reduction
> > > > > > > > > > > behaviour. Thanks, Cole Greer From: Ken Hu
> > > > > > > > > > > k...@bitquilltech.com.INVALIDDate: Monday,
> > > > > > > > > > > September 11, 2023 at 4:16 PM To:
> > > > > > > > > > > dev@tinkerpop.apache.org
> dev@tinkerpop.apache.orgSubjec
> > > > > > > > > > > t:
> > > > > > > > > > > [DISCUSS] Ternary Boolean Handling in New Steps Hi All,
> > > > > > > > > > > Starting in version 3.6, the ternary boolean system was
> > > > > > > > > > > introduced to handle comparison/equality tests within
> > > > > > > > > > > Gremlin. Recently,
> > > > > > > > > > > I've been implementing some list functions from
> > > > > > > > > > > Proposal 3 which
> > > > > > > > > > > make heavy use of the GremlinValueComparator to
> > > > > > > > > > > determine if values
> > > > > > > > > > > satisfy a specific condition. However, I'm finding it a
> > > > > > > > > > > bit tricky to
> > > > > > > > > > > understand how I should handle the
> > > > > > > > > > > GremlinTypeErrorException. For any() and
> > > > > > > > > > > all(), it seems like it would make sense to immediately
> > > > > > > > > > > reduce any ERROR state
> > > > > > > > > > > to false as it's a filter step. In the case of all(),
> > > > > > > > > > > if a
> > > > > > > > > > > GremlinTypeErrorException is caught, it would mean
> > > > > > > > > > > there was a comparison error so the
> > > > > > > > > > > traverser should be removed from the stream. However,
> > > > > > > > > > > doing this
> > > > > > > > > > > seemingly clashes with the original intention of
> > > > > > > > > > > ternary boolean which is to
> > > > > > > > > > > allow a provider-specific response on how to handle an
> > > > > > > > > > > ERROR state. My current thoughts are that we should
> > > > > > > > > > > rework the ternary
> > > > > > > > > > > boolean system in the future to make it easier to
> > > > > > > > > > > incorporate it into
> > > > > > > > > > > new steps. One of the trickiest parts is that it uses
> > > > > > > > > > > unchecked exceptions as
> > > > > > > > > > > a means to implement the ERROR state which can get
> > > > > > > > > > > easily
> > > > > > > > > > > missed or accidentally leaked to the user (which has
> > > > > > > > > > > happened before).
> > > > > > > > > > > For now, I'm planning to go ahead and immediately
> > > > > > > > > > > reduce ERROR states as I
> > > > > > > > > > > think that is what makes the most sense for list
> > > > > > > > > > > functions. Does anyone have any thoughts about this?
> > > > > > > > > > > Thanks, Ken
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > >
> > > >
> > >
> > >
> >
>

Reply via email to