This may be interesting: AgensGraph implements Cypher (unfortunately they had to fork PostgreSQL in order to make Cypher a first-class query language alongside SQL).
https://github.com/bitnine-oss/agensgraph

On Mon, Aug 21, 2017 at 12:44 AM Chris Travers <chris.trav...@adjust.com> wrote:

> On Sun, Aug 20, 2017 at 4:10 AM, MauMau <maumau...@gmail.com> wrote:
>
>> From: Chris Travers
>> > Why cannot you do all this in a language handler and treat as a user
>> > defined function?
>> > ...
>> > If you have a language handler for cypher, why do you need in_region
>> > or cast_region?  Why not just have a graph_search() function which
>> > takes in a cypher query and returns a set of records?
>>
>> The language handler is for *stored* functions.  The user-defined
>> function (UDF) doesn't participate in the planning of the outer
>> (top-level) query.  And they both assume that they are executed in SQL
>> commands.
>
> Sure but stored functions can take arguments, such as a query string which
> gets handled by the language handler.  There's absolutely no reason you
> cannot declare a function in C that takes in a Cypher query and returns a
> set of tuples.  And you can do a whole lot with preloaded shared libraries
> if you need to.
>
> The planning bit is more difficult, but see below as to where I see major
> limits here.
>
>> I want the data models to meet these:
>>
>> 1) The query language can be used as a top-level session language.
>> For example, if an app specifies "region=cypher_graph" at database
>> connection, it can use the database as a graph database and submit
>> Cypher queries without embedding them in SQL.
>
> That sounds like a foot gun.  I would probably think of those cases as
> being ideal for a custom background worker, similar to Mongress.
> Expecting to be able to switch query languages on the fly strikes me as
> adding totally needless complexity everywhere to be honest.  Having
> different listeners on different ports simplifies this a lot and having,
> additionally, query languages for ad-hoc mixing via language handlers might
> be able to get most of what you want already.
>
>> 2) When a query contains multiple query fragments of different data
>> models, all those fragments are parsed and planned before execution.
>> The planner comes up with the best plan, crossing the data model
>> boundary.  Take the query example in my first mail, which joins a
>> relational table and the result of a graph query: the relational
>> planner considers how to scan the table, the graph planner considers
>> how to search the graph, and the relational planner considers how to
>> join the two fragments.
>
> It seems like all you really need is a planner hook for user defined
> languages (i.e. "how many rows does this function return with these
> parameters" right?).  Right now we allow hints but they are static.  I
> wonder how hard this would be using preloaded, shared libraries.
>
>> So in_region() and cast_region() are not functions to be executed
>> during execution phase, but are syntax constructs that are converted,
>> during analysis phase, into calls to another region's parser/analyzer
>> and an inter-model cast routine.
>
> So basically they work like immutable functions except that you cannot
> index the output?
>
>> 1. The relational parser finds in_region('cypher_graph', 'graph
>> query') and produces a parse node InRegion(region_name, query) in the
>> parse tree.
>>
>> 2. The relational analyzer looks up the system catalog to check if
>> the specified region exists, then calls its parser/analyzer to produce
>> the query tree for the graph query fragment.  The relational analyzer
>> attaches the graph query tree to the InRegion node.
>>
>> 3. When the relational planner finds the graph query tree, it passes
>> the graph query tree to the graph planner to produce the graph
>> execution plan.
>>
>> 4. The relational planner produces a join plan node, based on the
>> costs/statistics of the relational table scan and graph query.  The
>> graph execution plan is attached to the join plan node.
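
To make the four quoted steps concrete, here is a rough sketch in Python rather than PostgreSQL C. Everything in it is an assumption for illustration only: the Region and InRegion classes, the CATALOG dict standing in for the system catalog, and the made-up row estimate of 42. It only shows the dispatch idea (look up the region by name, hand the fragment to that region's analyzer and planner, attach the result to the relational node), not how any real planner works.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Region:
    """Hypothetical catalog entry: one analyzer and one planner per data model."""
    analyze: Callable[[str], dict]  # query text -> query tree
    plan: Callable[[dict], dict]    # query tree -> execution plan with a row estimate


@dataclass
class InRegion:
    """Parse node for in_region(region_name, query), as produced in step 1."""
    region_name: str
    query: str
    query_tree: Optional[dict] = None  # attached by the analyzer (step 2)
    plan: Optional[dict] = None        # attached by the planner (step 3)


# Stand-in for the system catalog of regions (an assumption, not a real catalog).
CATALOG: Dict[str, Region] = {}


def analyze(node: InRegion) -> InRegion:
    """Step 2: check that the region exists, then call its parser/analyzer."""
    region = CATALOG.get(node.region_name)
    if region is None:
        raise LookupError(f"region {node.region_name!r} does not exist")
    node.query_tree = region.analyze(node.query)
    return node


def plan_join(table_rows: int, node: InRegion) -> dict:
    """Steps 3 and 4: plan the fragment in its own region, then build a join node."""
    node.plan = CATALOG[node.region_name].plan(node.query_tree)
    return {
        "node": "Join",
        "outer_rows": table_rows,
        "inner_rows": node.plan["rows"],
        "inner": node.plan,  # the graph execution plan attached to the join node
    }


# A toy graph region; the estimate of 42 rows is invented for the example.
CATALOG["cypher_graph"] = Region(
    analyze=lambda q: {"region": "cypher_graph", "text": q},
    plan=lambda tree: {"node": "GraphScan", "region": tree["region"], "rows": 42},
)

node = analyze(InRegion("cypher_graph", "MATCH (n) RETURN n"))
join = plan_join(1000, node)
print(join["inner"]["node"], join["inner_rows"])
```

The region label carried on each node is what lets the relational side pick the right routines without knowing anything about the graph model itself.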
>>
>> The parse/query/plan nodes have a label to denote a region, so that
>> the appropriate region's routines can be called.
>
> It would be interesting to see how much of what you want you can get with
> what we currently have and what pieces are really missing.
>
> Am I right that if you wrote a function in C to take a Cypher query plan,
> and analyse it, and execute it, the only thing really missing would be
> feedback to the PostgreSQL planner regarding number of rows expected?
>
>> Regards
>> MauMau
>
> --
> Best Regards,
> Chris Travers
> Database Administrator
>
> Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
> Saarbrücker Straße 37a, 10405 Berlin
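
On the row-estimate question at the end of the quoted mail, a toy illustration of why a static hint can matter. This is a sketch under loud assumptions: the cost formulas are invented and are not PostgreSQL's, toy_estimator is a hypothetical per-query callback that only understands a trailing LIMIT, and STATIC_ROWS simply stands for a fixed per-function estimate that ignores the query argument.

```python
from typing import Callable, Optional

# Fixed per-function estimate, independent of the query string (assumed value).
STATIC_ROWS = 1000


def toy_estimator(cypher: str) -> int:
    """Hypothetical per-query estimate: a LIMIT clause bounds the result size."""
    words = cypher.upper().split()
    if "LIMIT" in words[:-1]:
        return int(words[words.index("LIMIT") + 1])
    return STATIC_ROWS


def choose_join(outer_rows: int, inner_rows: int) -> str:
    """Toy cost model (invented for this example): pick the cheaper strategy."""
    nested_loop = outer_rows * inner_rows
    hash_join = 3 * (outer_rows + inner_rows)
    return "nested loop" if nested_loop <= hash_join else "hash join"


def plan(outer_rows: int, cypher: str,
         estimator: Optional[Callable[[str], int]] = None) -> str:
    """Plan a join against a graph fragment, with or without estimate feedback."""
    inner_rows = estimator(cypher) if estimator else STATIC_ROWS
    return choose_join(outer_rows, inner_rows)


query = "MATCH (n) RETURN n LIMIT 1"
static_choice = plan(10_000, query)                    # uses the fixed estimate
dynamic_choice = plan(10_000, query, toy_estimator)    # uses the per-query estimate
print(static_choice, "vs", dynamic_choice)
```

With the fixed estimate the toy planner treats the fragment as 1000 rows regardless of the query text; with the callback it sees one row and picks a different strategy, which is the kind of feedback the quoted question is asking about.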