Re: A collection of examples that map a query language query to provider bytecode.

Stephen Mallette Fri, 10 May 2019 05:11:50 -0700

>  I don’t like that you would have to create a CypherCompiler class (even
if it's just a wrapper) for all popular programming languages. :(


Yeah, this is the trouble I saw with sparql-gremlin and with making GLVs
support the g.sparql() step properly. It seems like no matter what you do,
the language designer ends up having to do something in each programming
language they want to support. The bulk of the work seems to be in the
"compiler", so if that were moved to the server (which is what we did in
TP3), the language designer would only have to write it once per VM they
wanted to support and then provide a more lightweight client-side library
for each programming language. A programming language that had the full
compiler implementation would have the advantage that it could compile
client-side or rely on the server. I suppose the lightweight library would
then become the basis for a future full-blown compiler in that
language... hard one.
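
A rough sketch of that split, just to make the idea concrete. Every name
below (SketchBytecode, QueryCompiler, SketchRequest, LightweightClient,
SketchServer) is hypothetical and not part of TP3 or the tp4/ branch. The
assumption is that a request carries either compiled bytecode or the raw
query string plus a language name, and the server only compiles when no
bytecode was sent:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; none of these types exist in TinkerPop.

/** Stand-in for bytecode: just an ordered list of instruction strings. */
final class SketchBytecode {
    final List<String> instructions = new ArrayList<>();

    SketchBytecode add(String instruction) {
        instructions.add(instruction);
        return this;
    }
}

/** The heavy part: written once per VM, optionally also shipped in a client. */
interface QueryCompiler {
    SketchBytecode compile(String query);
}

/** What a client sends: bytecode, or a language name plus the raw query. */
final class SketchRequest {
    final SketchBytecode bytecode; // null when the server must compile
    final String language;
    final String query;

    SketchRequest(SketchBytecode bytecode, String language, String query) {
        this.bytecode = bytecode;
        this.language = language;
        this.query = query;
    }
}

/** The lightweight per-language library; the local compiler is optional. */
final class LightweightClient {
    private final String language;
    private final QueryCompiler localCompiler; // null in a "thin" client

    LightweightClient(String language, QueryCompiler localCompiler) {
        this.language = language;
        this.localCompiler = localCompiler;
    }

    SketchRequest prepare(String query) {
        SketchBytecode compiled =
            localCompiler == null ? null : localCompiler.compile(query);
        return new SketchRequest(compiled, language, query);
    }
}

/** Server side: pass bytecode through, or compile the query string first. */
final class SketchServer {
    private final Map<String, QueryCompiler> compilers = new HashMap<>();

    void register(String language, QueryCompiler compiler) {
        compilers.put(language, compiler);
    }

    SketchBytecode resolve(SketchRequest request) {
        if (request.bytecode != null) {
            return request.bytecode; // client already compiled
        }
        QueryCompiler compiler = compilers.get(request.language);
        if (compiler == null) {
            throw new IllegalArgumentException(
                "No server-side compiler for " + request.language);
        }
        return compiler.compile(request.query);
    }
}
```

A client constructed with a null compiler always defers to the server,
while one constructed with a QueryCompiler sends bytecode directly; both
go through the same resolve() call on the server.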



On Thu, May 9, 2019 at 6:09 PM Marko Rodriguez <[email protected]> wrote:

> Hello Dmitry,
>
> > In TP3 compilation to Bytecode can happen on Gremlin Client side or
> Gremlin Server side:
> >
> > 1. If compilation is simple, it is possible to implement it for all
> Gremlin Clients: Java, Python, JavaScript, .NET...
> > 2. If compilation is complex, it is possible to create a plugin for
> Gremlin Server. Clients send query string, and server does the compilation.
>
> Yes, but not for the reasons you state. Every TP3-compliant language must
> be able to compile to TP3 bytecode. That bytecode is then submitted,
> evaluated by the TP3 VM, and a traverser iterator is returned.
>
> However, TP3’s GremlinServer also supports JSR223 ScriptEngine which can
> compile query language Strings server side and then return a traverser
> iterator. This exists so people can submit complex Groovy/Python/JS scripts
> to GremlinServer. The problem with this access point is that arbitrary code
> can be submitted and thus while(true) { } can hang the system! dar.
>
> > For example, in Cypher for Gremlin it is possible to use compilation to
> Bytecode in JVM client, or on the server when using [other language
> clients][1].
>
> I’m not too familiar with GremlinServer plugin stuff, so I don’t know. I
> would say that all TP3-compliant query languages must be able to compile to
> TP3 bytecode.
>
> > My current understanding is that TP4 Server would serve only for I/O
> purposes.
>
> This is still up in the air, but I believe that we should:
>
>         1. Only support one data access point.
>                 TP4 bytecode in and traversers out.
>         2. The TP4 server should have two components.
>                 (1) One (or many) bytecode input locations (IP/port) that
> pass the bytecode to the TP4 VM.
>                 (2) Multiple traverser output locations where distributed
> processors can directly send halted traversers back to the client.
>
> For me, that’s it. However, I’m not a network server-guy so I don’t have a
> clear understanding of what is absolutely necessary.
>
> > Where do you see "Query language -> Universal Bytecode" part in TP4
> architecture? Will it be in the VM? Or in middleware? What will clients
> look like in TP4?
>
> TP4 will publish a binary serialization specification.
> It will be dead simple compared to TP3’s binary specification.
> The only types of objects are: Bytecode, Instruction, Traverser, Tuple,
> and Primitive.
>
> Every query language designer that wants to have their query language
> execute on the TP4 VM (and thus, against all supporting processing engines
> and data storage systems) will need to have a compiler from their language
> to TP4 bytecode.
>
> We will provide 2 tools in all the popular programming languages (Java,
> Python, JS, …).
>         1. A TP4 serializer and deserializer.
>         2. A lightweight network client to submit serialized bytecode and
> deserialize Iterator<Traverser> into objects in that language.
>
> Thus, if the Cypher-TP4 compiler is written in Scala, you would:
>         1. build up a org.apache.tinkerpop.machine.bytecode.Bytecode
> object during your compilation process.
>         2. use our org.apache.tinkerpop.machine.io.RemoteMachine object to
> send the Bytecode and get back Iterator<Traverser> objects.
>                 - RemoteMachine does the serialization and deserialization
> for you.
>
> I originally wrote out how it currently looks in the tp4/ branch, but
> realized that it asks you to write one too many classes. Thus, I think we
> will probably go with something like this:
>
> Machine machine = RemoteMachine.
>                     withStructure(NeptuneStructure.class, config1).
>                     withProcessor(AkkaProcessor.class, config2).
>                       open(config0);
>
> Iterator<Traverser> results =
>     machine.submit(CypherCompiler.compile("MATCH (x)-[knows]->(y)"));
>
> Thus, you would only have to provide a single CypherCompiler class.
>
> If you have any better ideas, please say so. I don’t like that you would
> have to create a CypherCompiler class (even if it's just a wrapper) for all
> popular programming languages. :(
>
> Perhaps TP4 has a Compiler interface and compilation happens server
> side….? But then that requires language designers to write their compiler
> in Java … hmm…..
>
> Hope I’m clear,
> Marko.
>
> http://rredux.com
