Re: A collection of examples that map a query language query to provider bytecode.

Stephen Mallette Fri, 10 May 2019 06:46:18 -0700

> If VM, server or compiler is implemented in another language, there is
> always a possibility to use something like gRPC or even REST to call
> microservice that will do query→Universal Bytecode conversion.


That's an interesting way to handle it, especially if it could be done in a
completely transparent way - a Remote Compiler of some sort. If we had such
a thing then the compilation could conceivably happen anywhere, on the client
or on the server, whatever the host programming language.
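
To make the Remote Compiler idea a bit more concrete, here is a minimal Java
sketch. It is not anything that exists in TinkerPop: the names QueryCompiler,
LocalCompiler, RemoteCompiler and Wire, the toy '|'-separated query language,
and the line-based wire format are all assumptions made up for illustration.
The transport is an in-memory function standing in for a gRPC or REST call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Demo entry point: the caller only ever sees the QueryCompiler interface.
class RemoteCompilerSketch {
    public static void main(String[] args) {
        QueryCompiler serverSide = new LocalCompiler();
        // In-memory "network": the server compiles, then encodes the result.
        QueryCompiler remote = new RemoteCompiler(q -> Wire.encode(serverSide.compile(q)));
        System.out.println(remote.compile("V | has name marko | count"));
    }
}

// Hypothetical contract: a query string in, a list of instructions out.
// Each instruction is a token list: the op name followed by its arguments.
interface QueryCompiler {
    List<List<String>> compile(String query);
}

// Compiles in-process. The query language is a toy: steps separated by '|'.
final class LocalCompiler implements QueryCompiler {
    @Override
    public List<List<String>> compile(String query) {
        List<List<String>> bytecode = new ArrayList<>();
        for (String step : query.split("\\|")) {
            String trimmed = step.trim();
            if (!trimmed.isEmpty()) {
                bytecode.add(Arrays.asList(trimmed.split("\\s+")));
            }
        }
        return bytecode;
    }
}

// Toy text wire format: one instruction per line, tokens separated by spaces.
final class Wire {
    static String encode(List<List<String>> bytecode) {
        List<String> lines = new ArrayList<>();
        for (List<String> instruction : bytecode) {
            lines.add(String.join(" ", instruction));
        }
        return String.join("\n", lines);
    }

    static List<List<String>> decode(String wire) {
        List<List<String>> bytecode = new ArrayList<>();
        for (String line : wire.split("\n")) {
            if (!line.isEmpty()) {
                bytecode.add(Arrays.asList(line.split(" ")));
            }
        }
        return bytecode;
    }
}

// Same interface, but compilation happens wherever the transport leads.
// The transport sends the query text out and gets wire-format text back.
final class RemoteCompiler implements QueryCompiler {
    private final Function<String, String> transport;

    RemoteCompiler(Function<String, String> transport) {
        this.transport = transport;
    }

    @Override
    public List<List<String>> compile(String query) {
        return Wire.decode(transport.apply(query));
    }
}
```

Because both implementations share one interface, a client in a language with
a full compiler could pick LocalCompiler, and any other client could pick
RemoteCompiler, without the calling code changing.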

On Fri, May 10, 2019 at 9:08 AM Dmitry Novikov <dmitry.novi...@neueda.com>
wrote:

> Hello,
>
> Marko, thank you for the clear explanation.
>
> > I don’t like that you would have to create a CypherCompiler class (even
> > if it’s just a wrapper) for all popular programming languages. :(
>
> Fully agree about this. For declarative languages like SQL, Cypher and
> SPARQL complex compilation will be needed, most probably requiring AST
> walk. Writing compilers for all popular languages could be possible in
> theory, but increases the amount of work n times (where n>language count)
> and complicates testing. Also, libraries necessary for the task might not
> be available for all languages.
>
> In my opinion, to avoid the situation when the number of supported query
> languages differs depending on client programming language, it is
> preferable to introduce a plugin system. The server might have multiple
> endpoints, one for Bytecode, one for SQL, Cypher, etc.
>
> If VM, server or compiler is implemented in another language, there is
> always a possibility to use something like gRPC or even REST to call
> microservice that will do query→Universal Bytecode conversion.
>
> Regards,
> Dmitry
>
> On 2019/05/10 12:03:30, Stephen Mallette <spmalle...@gmail.com> wrote:
> > > I don’t like that you would have to create a CypherCompiler class (even
> > > if it’s just a wrapper) for all popular programming languages. :(
> >
> > Yeah, this is the trouble I saw with sparql-gremlin and how to make it so
> > that GLVs can support the g.sparql() step properly. It seems like no matter
> > what you do, you end up with a situation where the language designer has to
> > do something in each programming language they want to support. The bulk of
> > the work seems to be in the "compiler" so if that were moved to the server
> > (what we did in TP3) then the language designer would only have to write
> > that once per VM they wanted to support and then provide a more lightweight
> > library for each programming language they supported on the client-side. A
> > programming language that had the full compiler implementation would have
> > the advantage that they could client-side compile or rely on the server. I
> > suppose that a lightweight library would then become the basis for a future
> > full blown compiler in that language........hard one.
> >
> >
> >
> > On Thu, May 9, 2019 at 6:09 PM Marko Rodriguez <okramma...@gmail.com>
> > wrote:
> >
> > > Hello Dmitry,
> > >
> > > > In TP3 compilation to Bytecode can happen on Gremlin Client side or
> > > > Gremlin Server side:
> > > >
> > > > 1. If compilation is simple, it is possible to implement it for all
> > > > Gremlin Clients: Java, Python, JavaScript, .NET...
> > > > 2. If compilation is complex, it is possible to create a plugin for
> > > > Gremlin Server. Clients send query string, and server does the compilation.
> > >
> > > Yes, but not for the reasons you state. Every TP3-compliant language must
> > > be able to compile to TP3 bytecode. That bytecode is then submitted,
> > > evaluated by the TP3 VM, and a traverser iterator is returned.
> > >
> > > However, TP3’s GremlinServer also supports JSR223 ScriptEngine which can
> > > compile query language Strings server side and then return a traverser
> > > iterator. This exists so people can submit complex Groovy/Python/JS scripts
> > > to GremlinServer. The problem with this access point is that arbitrary code
> > > can be submitted and thus while(true) { } can hang the system! dar.
> > >
> > > > For example, in Cypher for Gremlin it is possible to use compilation to
> > > > Bytecode in JVM client, or on the server when using [other language
> > > > clients][1].
> > >
> > > I’m not too familiar with GremlinServer plugin stuff, so I don’t know. I
> > > would say that all TP3-compliant query languages must be able to compile to
> > > TP3 bytecode.
> > >
> > > > My current understanding is that TP4 Server would serve only for I/O
> > > > purposes.
> > >
> > > This is still up in the air, but I believe that we should:
> > >
> > >         1. Only support one data access point.
> > >                 TP4 bytecode in and traversers out.
> > >         2. The TP4 server should have two components.
> > >                 (1) One (or many) bytecode input locations (IP/port) that
> > >                     pass the bytecode to the TP4 VM.
> > >                 (2) Multiple traverser output locations where distributed
> > >                     processors can directly send halted traversers back to
> > >                     the client.
> > >
> > > For me, that’s it. However, I’m not a network server-guy so I don’t have a
> > > clear understanding of what is absolutely necessary.
> > >
> > > > Where do you see "Query language -> Universal Bytecode" part in TP4
> > > > architecture? Will it be in the VM? Or in middleware? How will clients look
> > > > like in TP4?
> > >
> > > TP4 will publish a binary serialization specification.
> > > It will be dead simple compared to TP3’s binary specification.
> > > The only types of objects are: Bytecode, Instruction, Traverser, Tuple,
> > > and Primitive.
> > >
> > > Every query language designer that wants to have their query language
> > > execute on the TP4 VM (and thus, against all supporting processing engines
> > > and data storage systems) will need to have a compiler from their language
> > > to TP4 bytecode.
> > >
> > > We will provide 2 tools in all the popular programming languages (Java,
> > > Python, JS, …).
> > >         1. A TP4 serializer and deserializer.
> > >         2. A lightweight network client to submit serialized bytecode and
> > >            deserialize Iterator<Traverser> into objects in that language.
> > >
> > > Thus, if the Cypher-TP4 compiler is written in Scala, you would:
> > >         1. build up an org.apache.tinkerpop.machine.bytecode.Bytecode
> > >            object during your compilation process.
> > >         2. use our org.apache.tinkerpop.machine.io.RemoteMachine object to
> > >            send the Bytecode and get back Iterator<Traverser> objects.
> > >                 - RemoteMachine does the serialization and deserialization
> > >                   for you.
> > >
> > > I originally wrote out how it currently looks in the tp4/ branch, but
> > > realized that it asks you to write one too many classes. Thus, I think we
> > > will probably go with something like this:
> > >
> > > Machine machine = RemoteMachine.
> > >                     withStructure(NeptuneStructure.class, config1).
> > >                     withProcessor(AkkaProcessor.class, config2).
> > >                       open(config0);
> > >
> > > Iterator<Traverser> results =
> > >     machine.submit(CypherCompiler.compile("MATCH (x)-[knows]->(y)"));
> > >
> > > Thus, you would only have to provide a single CypherCompiler class.
> > >
> > > If you have any better ideas, please say so. I don’t like that you would
> > > have to create a CypherCompiler class (even if it’s just a wrapper) for all
> > > popular programming languages. :(
> > >
> > > Perhaps TP4 has a Compiler interface and compilation happens server
> > > side….? But then that requires language designers to write their compiler
> > > in Java … hmm…..
> > >
> > > Hope I’m clear,
> > > Marko.
> > >
> > > http://rredux.com
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
