Re: structure API for TP4

Stephen Mallette Fri, 03 Jan 2020 05:21:34 -0800

Sorry it took me a bit to get to this...

> Graph.Features will carry over into TP4

Having Graph.Features implies having Graph which is part of the Structure
API. Marko and I have questioned the necessity for the Graph and Structure
API in recent years. I don't think the major graph providers who use
TinkerPop even implement it; they just process Gremlin. This "secondary"
API (formerly a first class citizen) also creates confusion for users who
try to use it directly and have mixed results depending on the graph they
choose. Worse still, they write Structure API code in scripts embedded as
strings in their code (despite advice not to do so) and end up with
non-portable code. Furthermore, GLV users end up wondering why
they can't do graph.addVertex() and other similar Structure API calls.
Mixed advice in third-party blog posts compounds these issues.

So, when you talk about the Structure API, I wonder if you mean to keep all
of it or just the notion of Graph.Features (in some new revised form). The
latter is agreeable in my mind because we likely still need some way to
know how a graph behaves for purposes of our technology test suite. Without
the Structure API, I'm not yet sure what that would look like.

> I feel we should use Scala for the API. This opinion is informed by my
> experiences writing tools of this kind in both Java and Haskell at Uber.
> While I am a huge fan of Haskell, practical considerations rule it out as
> an option. We need the API to be JVM-compatible

Having followed along with your talks, writings, etc., and with my own
reading of Category Theory and such, I realized that Java would
probably not work. While I have interest in Haskell (more so than Scala),
Scala does seem like the best fit for this work on the JVM. That said,
there are two points I'd like us to consider that have been on my mind for
TP4:

1. The realization that TinkerPop, specifically Gremlin, would be available
natively in other language ecosystems besides the JVM came way too late in
TP3. As a result, we have an extraordinarily mixed set of messages with
Gremlin usage. Things work one way in Java, but another way in Python. And
while 3.4.x unified connection options across languages, there are still too
many ways to connect to a graph and too much discrepancy in behavior. We
need to think about how every single feature that we create for TP4 behaves
in each language and what parity of capability we can achieve there. And if
some reasonable level of parity can't be achieved for whatever reason, we
should either seriously consider not implementing the feature or make sure
the story for the language ecosystems that lack the functionality is
crystal clear and consistent with TinkerPop as a whole. We should very much
consider how Graph.Features (in whatever form it takes) is accessible via
Java, Python, JavaScript, etc. before going too far in any particular
development direction.
2. What is the general structure for this project with respect to the
different language environments that we have? Personally, I still like the
idea of a single repo, but without a single build system ruling it all. In
this way each language ecosystem can take advantage of the best parts of
its particular build tool chain without having to shoehorn into a different
system's approach. That said, I think each ecosystem should stick to a
single build tool chain, e.g. Maven for the JVM.

As a big picture point, I think the JVM ecosystem will be the model for all
other language ecosystems. I would think that we would want to take care
that we not turn TinkerPop into a Scala-only system - I assume this work
isn't laying the foundation for that, but figured I'd voice the concern. I
think we'd largely still rely on Java for development outside of this
feature, which has some specific demands that Java doesn't address well. I'd
further assume that we would have some nice clean interop back to Java for
this stuff so as to keep our core users well engaged.

> to keep TinkerPop aligned with upcoming standards like RDF* and GQL.
> Interoperability with mm-ADT should be straightforward

Thank you for keeping up with the developing standards. That's a nice
service to TinkerPop.

Ultimately my vision for TP4 seems to have less to do with specific major
new features (thus glad to see that you're thinking in that manner) and
more to do with creating consistent, coherent and easy graph usage patterns
across language ecosystems for users while making it even simpler for
providers to build their TinkerPop-enabled systems. Having seen so much
success with GLVs for TP3, despite their drawbacks, I can't help but sense
that focusing on this notion as a foundational element of design for TP4
will further expand TinkerPop's appeal and reach.

On Thu, Dec 26, 2019 at 11:00 AM Joshua Shinavier wrote:

> Hi everyone,
>
> I would like to reboot the conversation around TinkerPop 4, specifically as
> it concerns the structure API. You will have seen my posts, ever since my
> presentation [1] last January, about an algebraic approach to property
> graph schemas and transformations, which Ryan and I formalized in the APG
> paper [2]. I am now very close to releasing the Haskell implementation of
> this framework as open source software (to be accompanied by an Uber
> Engineering Blog post, in the next few weeks if all goes well).
>
> At various times and places, I have suggested that we develop a Scala-based
> structure API for TP4 which implements APG in an extensible way. I think it
> is time to proceed and start committing code, or discuss alternative plans
> for the structure API. There seems to be plenty of community interest, and
> I now have an official OK to put some engineering hours towards it at work.
> I would like to align with you -- the TP PMC and other TinkerPop committers
> and developers -- on how to proceed, who will contribute, and what the
> development timeline will look like.
>
> Some specifics from my side:
>
>    - Graph.Features will carry over into TP4; it will just be a bit more
>    sophisticated than the current TP3 Graph.Features. Btw. I also proposed
>    this idea of a graph feature vector at the recent Dagstuhl Seminar [3],
>    where it caught on and will be the basis of a "dragon data model" that
>    might help to keep TinkerPop aligned with upcoming standards like RDF*
>    and GQL.
>    - I feel we should use Scala for the API. This opinion is informed by my
>    experiences writing tools of this kind in both Java and Haskell at Uber.
>    While I am a huge fan of Haskell, practical considerations rule it out
>    as an option. We need the API to be JVM-compatible. The best Haskell-JVM
>    bridge is Eta [4], but IMO it is not ready to be put in the critical
>    path on a project such as TinkerPop; we used it at Uber for a while and
>    found it to be a time sink, despite the generated bytecode working
>    great. Likewise, I would strongly advise against continuing with a pure
>    Java-based API if we want to do intelligent things with graph schemas.
>    The language is just not appropriate as a basis for the type system in
>    question. Scala, on the other hand, has all of the advantages of Haskell
>    in terms of type safety and functional pattern matching, although it
>    requires some extra discipline to keep your code pure.
>    - Interoperability with Ryan's CQL (categorical query language [5]) is
>    of interest.
>    - Interoperability with mm-ADT should be straightforward now that mm-ADT
>    has support for union types. Hopefully, mm-ADT's type system will end
>    up as a proper superset of TP4's.
>
> Thoughts?
>
> Josh
>
>
> [1] https://www.slideshare.net/joshsh/a-graph-is-a-graph-is-a-graph-equivalence-transformation-and-composition-of-graph-data-models-129403012
> [2] https://arxiv.org/abs/1909.04881
> [3] https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=19491
> [4] https://eta-lang.org
> [5] https://www.categoricaldata.net
>
