I have worked with two different graph database vendors—Azure Cosmos DB and
Neo4j. During our migration to Neo4j, we discovered that using the Gremlin
language wasn’t possible; we were forced to rewrite all our queries into
Cypher, which is the native language for Neo4j and, in my experience, much
simpler for querying.

This situation highlights a key challenge for a common abstraction: the
underlying query languages and connection/authentication mechanisms vary
significantly. Gremlin not only differs from Cypher in syntax; TinkerPop's
own Neo4j integration (neo4j-gremlin) is also marked as deprecated (see
https://tinkerpop.apache.org/docs/3.7.3/reference/#neo4j-gremlin).

The question, then, is how a common approach can accommodate these
different query languages.
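
To make the question concrete, here is a rough sketch of the problem as I see
it. None of these class names exist in Airflow or in the PR; CommonGraphHook
and the two fake backends are hypothetical and only echo the query instead of
talking to a database. The point is that even with a shared interface, the
query text itself still has to be written once per language.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class CommonGraphHook(ABC):
    """Hypothetical shared interface; not an existing Airflow class."""

    # Each backend declares which query language it understands.
    query_language: str

    @abstractmethod
    def run(self, query: str) -> list[str]:
        """Send the query to the backend and return the results."""


class FakeGremlinHook(CommonGraphHook):
    """Stand-in for a Gremlin backend (e.g. Cosmos DB); echoes the query."""

    query_language = "gremlin"

    def run(self, query: str) -> list[str]:
        return [f"[gremlin] {query}"]


class FakeCypherHook(CommonGraphHook):
    """Stand-in for a Cypher backend (e.g. Neo4j); echoes the query."""

    query_language = "cypher"

    def run(self, query: str) -> list[str]:
        return [f"[cypher] {query}"]


# The same logical question, written once per query language.
FIND_PERSON_NAMES = {
    "gremlin": "g.V().hasLabel('person').values('name')",
    "cypher": "MATCH (p:person) RETURN p.name",
}


def find_person_names(hook: CommonGraphHook) -> list[str]:
    # The caller still needs a query per language; the hook only hides
    # the connection and execution details.
    return hook.run(FIND_PERSON_NAMES[hook.query_language])
```

With this shape, a DAG author could swap FakeGremlinHook for FakeCypherHook
without changing the calling code, but only because someone maintained both
query strings. That is the part I am unsure a common provider can abstract.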

On Fri, Feb 21, 2025 at 7:36 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Without looking deeply at the code, I love the idea. It's very similar to
> what we have for common.sql and common.io, and soon common.messaging. A
> long time ago I also suggested a common.dataframe that someone could submit
> using Apache Ibis:
> https://lists.apache.org/thread/qx3yh6h0l6jb0kh3fz9q95b3x5b4001l
> Similarly, I believe there was an idea about common.llm ...
>
> I think the "common" pattern is a great one for Airflow: build on top of
> "other giants" who provide those common abstractions, so that you can easily
> switch between different implementations of various data access layers.
>
> My suggestion and question, however (I don't feel strongly about it and
> would love to hear what others think; I know it was somewhat contentious
> when I started the ibis discussion), would be to name these "common.graph"
> and "common.dataframe" instead of "apache.gremlin" or "apache.ibis", just to
> stress that they are not implementations of a particular service but an
> opinionated choice of a particular technology for "common" operations. This
> is essentially what "common.io" is: it would be called the "fsspec" provider
> if we were to name it after the library that implements it.
>
> J.
>
>
> On Fri, Feb 21, 2025 at 8:22 PM Ahmad Farhan <ahmad.farhan9...@gmail.com>
> wrote:
>
> > Hi Everyone,
> >
> > I’ve created a draft PR (https://github.com/apache/airflow/pull/46977) to
> > introduce and discuss a new provider for using Gremlin, the graph
> > traversal language of Apache TinkerPop (more details here:
> > https://tinkerpop.apache.org/gremlin.html). Gremlin is supported by
> > various graph database vendors such as Azure Cosmos DB and Amazon Neptune.
> > Previously, I had to develop a custom hook to query data from Azure Cosmos
> > DB using Apache Gremlin.
> >
> > I managed to create a provider and run it locally on the main branch.
> > However, I ran into the BaseHook issue
> > (https://github.com/apache/airflow/issues/45233) on that branch, so I ended
> > up testing it fully on the v2-10-test branch. The PR should be complete,
> > but I’ve kept it as a draft for now while we discuss the provider.
> >
> > I’m a new contributor, so I’m especially eager to hear your feedback.
> > Comments on the PR are very welcome, and please feel free to reach out with
> > any questions via email or Slack.
> >
> > Thanks,
> > Ahmad Farhan
> >
>
