Hi,

> Machine machine = RemoteMachine
>   .withStructure(NeptuneStructure.class, config1)
>   .withProcessor(AkkaProcessor.class, config2)
>   .withCompiler(CypherCompiler.class, config3)
>   .open(config0);

Yea, I think something like this would work well. I like it because it
exposes the three main components that TinkerPop is gluing together:

    Language
    Structure
    Process

Thus, I would have it:

    withStructure()
    withProcessor()
    withLanguage()

Marko.

http://rredux.com

> On May 10, 2019, at 8:27 AM, Dmitry Novikov <dmitry.novi...@neueda.com> wrote:
>
> Stephen, Remote Compiler - very interesting idea to explore. Just for
> brainstorming, let me imagine what this might look like:
>
> 1. If the client supports compilation, it compiles on the client side.
> 2. If the remote supports compilation, it compiles on the server side.
> 3. If neither the client nor the remote supports compilation, `config3`
>    could contain the path to a microservice. The microservice does the
>    compilation and either returns the bytecode, or sends the bytecode to
>    the remote and proxies the response to the client. The microservice
>    could be deployed on the remote as well.
>
> `config3` may look like this, respectively:
>
> 1. `{compilation: 'embedded'}`
> 2. `{compilation: 'remote'}`
> 3. `{compilation: 'external', uri: 'localhost:3000/cypher'}`
>
> On 2019/05/10 13:45:50, Stephen Mallette <spmalle...@gmail.com> wrote:
>>> If the VM, server or compiler is implemented in another language, there
>>> is always the possibility of using something like gRPC or even REST to
>>> call a microservice that will do the query→Universal Bytecode conversion.
>>
>> That's an interesting way to handle it especially if it could be done in a
>> completely transparent way - a Remote Compiler of some sort. If we had such
>> a thing then the compilation could conceivably happen anywhere, client or
>> server of the host programming language.
>>
>> On Fri, May 10, 2019 at 9:08 AM Dmitry Novikov <dmitry.novi...@neueda.com>
>> wrote:
>>
>>> Hello,
>>>
>>> Marko, thank you for the clear explanation.
>>>
>>>> I don’t like that you would have to create a CypherCompiler class (even
>>>> if it’s just a wrapper) for all popular programming languages. :(
>>>
>>> Fully agree about this.
>>> For declarative languages like SQL, Cypher and SPARQL, complex
>>> compilation will be needed, most probably requiring an AST walk. Writing
>>> compilers for all popular languages could be possible in theory, but it
>>> increases the amount of work n times (where n is the language count)
>>> and complicates testing. Also, libraries necessary for the task might not
>>> be available for all languages.
>>>
>>> In my opinion, to avoid the situation where the number of supported query
>>> languages differs depending on the client programming language, it is
>>> preferable to introduce a plugin system. The server might have multiple
>>> endpoints: one for Bytecode, one for SQL, one for Cypher, etc.
>>>
>>> If the VM, server or compiler is implemented in another language, there
>>> is always the possibility of using something like gRPC or even REST to
>>> call a microservice that will do the query→Universal Bytecode conversion.
>>>
>>> Regards,
>>> Dmitry
>>>
>>> On 2019/05/10 12:03:30, Stephen Mallette <spmalle...@gmail.com> wrote:
>>>>> I don’t like that you would have to create a CypherCompiler class (even
>>>>> if it’s just a wrapper) for all popular programming languages. :(
>>>>
>>>> Yeah, this is the trouble I saw with sparql-gremlin and how to make it so
>>>> that GLVs can support the g.sparql() step properly. It seems like no
>>>> matter what you do, you end up with a situation where the language
>>>> designer has to do something in each programming language they want to
>>>> support. The bulk of the work seems to be in the "compiler" so if that
>>>> were moved to the server (what we did in TP3) then the language designer
>>>> would only have to write that once per VM they wanted to support and then
>>>> provide a more lightweight library for each programming language they
>>>> supported on the client-side. A programming language that had the full
>>>> compiler implementation would have the advantage that they could
>>>> client-side compile or rely on the server.
>>>> I suppose that a lightweight library would then become the basis for a
>>>> future full-blown compiler in that language........hard one.
>>>>
>>>> On Thu, May 9, 2019 at 6:09 PM Marko Rodriguez <okramma...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Dmitry,
>>>>>
>>>>>> In TP3 compilation to Bytecode can happen on Gremlin Client side or
>>>>>> Gremlin Server side:
>>>>>>
>>>>>> 1. If compilation is simple, it is possible to implement it for all
>>>>>>    Gremlin Clients: Java, Python, JavaScript, .NET...
>>>>>> 2. If compilation is complex, it is possible to create a plugin for
>>>>>>    Gremlin Server. Clients send query string, and server does the
>>>>>>    compilation.
>>>>>
>>>>> Yes, but not for the reasons you state. Every TP3-compliant language
>>>>> must be able to compile to TP3 bytecode. That bytecode is then
>>>>> submitted, evaluated by the TP3 VM, and a traverser iterator is returned.
>>>>>
>>>>> However, TP3’s GremlinServer also supports JSR223 ScriptEngine which can
>>>>> compile query language Strings server side and then return a traverser
>>>>> iterator. This exists so people can submit complex Groovy/Python/JS
>>>>> scripts to GremlinServer. The problem with this access point is that
>>>>> arbitrary code can be submitted and thus while(true) { } can hang the
>>>>> system! dar.
>>>>>
>>>>>> For example, in Cypher for Gremlin it is possible to use compilation to
>>>>>> Bytecode in JVM client, or on the server when using [other language
>>>>>> clients][1].
>>>>>
>>>>> I’m not too familiar with GremlinServer plugin stuff, so I don’t know. I
>>>>> would say that all TP3-compliant query languages must be able to compile
>>>>> to TP3 bytecode.
>>>>>
>>>>>> My current understanding is that TP4 Server would serve only for I/O
>>>>>> purposes.
>>>>>
>>>>> This is still up in the air, but I believe that we should:
>>>>>
>>>>> 1. Only support one data access point.
>>>>>    TP4 bytecode in and traversers out.
>>>>> 2. The TP4 server should have two components.
>>>>>    (1) One (or many) bytecode input locations (IP/port) that
>>>>>        pass the bytecode to the TP4 VM.
>>>>>    (2) Multiple traverser output locations where distributed
>>>>>        processors can directly send halted traversers back to the
>>>>>        client.
>>>>>
>>>>> For me, that’s it. However, I’m not a network server-guy so I don’t have
>>>>> a clear understanding of what is absolutely necessary.
>>>>>
>>>>>> Where do you see "Query language -> Universal Bytecode" part in TP4
>>>>>> architecture? Will it be in the VM? Or in middleware? What will clients
>>>>>> look like in TP4?
>>>>>
>>>>> TP4 will publish a binary serialization specification.
>>>>> It will be dead simple compared to TP3’s binary specification.
>>>>> The only types of objects are: Bytecode, Instruction, Traverser, Tuple,
>>>>> and Primitive.
>>>>>
>>>>> Every query language designer that wants to have their query language
>>>>> execute on the TP4 VM (and thus, against all supporting processing
>>>>> engines and data storage systems) will need to have a compiler from
>>>>> their language to TP4 bytecode.
>>>>>
>>>>> We will provide 2 tools in all the popular programming languages (Java,
>>>>> Python, JS, …).
>>>>>    1. A TP4 serializer and deserializer.
>>>>>    2. A lightweight network client to submit serialized bytecode and
>>>>>       deserialize Iterator<Traverser> into objects in that language.
>>>>>
>>>>> Thus, if the Cypher-TP4 compiler is written in Scala, you would:
>>>>>    1. build up an org.apache.tinkerpop.machine.bytecode.Bytecode
>>>>>       object during your compilation process.
>>>>>    2. use our org.apache.tinkerpop.machine.io.RemoteMachine object to
>>>>>       send the Bytecode and get back Iterator<Traverser> objects.
>>>>>       - RemoteMachine does the serialization and deserialization
>>>>>         for you.
>>>>>
>>>>> I originally wrote out how it currently looks in the tp4/ branch, but
>>>>> realized that it asks you to write one too many classes. Thus, I think
>>>>> we will probably go with something like this:
>>>>>
>>>>> Machine machine = RemoteMachine.
>>>>>   withStructure(NeptuneStructure.class, config1).
>>>>>   withProcessor(AkkaProcessor.class, config2).
>>>>>   open(config0);
>>>>>
>>>>> Iterator<Traverser> results =
>>>>>   machine.submit(CypherCompiler.compile("MATCH (x)-[knows]->(y)"));
>>>>>
>>>>> Thus, you would only have to provide a single CypherCompiler class.
>>>>>
>>>>> If you have any better ideas, please say so. I don’t like that you would
>>>>> have to create a CypherCompiler class (even if it’s just a wrapper) for
>>>>> all popular programming languages. :(
>>>>>
>>>>> Perhaps TP4 has a Compiler interface and compilation happens server
>>>>> side….? But then that requires language designers to write their
>>>>> compiler in Java … hmm…..
>>>>>
>>>>> Hope I’m clear,
>>>>> Marko.
>>>>>
>>>>> http://rredux.com
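
For reference, here is a minimal sketch of how the three `config3` variants proposed in the thread (`embedded`, `remote`, `external` plus a `uri`) might be resolved into a compilation plan. Only the `compilation` and `uri` keys and their values come from the thread. Everything else (`CompilationSketch`, `Mode`, `Plan`, `resolve`, `describe`) is a hypothetical name made up for illustration and is not part of the tp4/ branch or any TinkerPop API.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types exist in TinkerPop.
final class CompilationSketch {

    /** Where the query string is turned into bytecode. */
    enum Mode { EMBEDDED, REMOTE, EXTERNAL }

    /** The resolved mode, plus the microservice URI (EXTERNAL only). */
    static final class Plan {
        final Mode mode;
        final String uri;

        Plan(Mode mode, String uri) {
            this.mode = mode;
            this.uri = uri;
        }
    }

    /** Reads the 'compilation' (and, if needed, 'uri') keys of config3. */
    static Plan resolve(Map<String, String> config) {
        String value = config.get("compilation");
        if (value == null) {
            throw new IllegalArgumentException("config is missing the 'compilation' key");
        }
        // Mode.valueOf throws IllegalArgumentException for unknown values.
        Mode mode = Mode.valueOf(value.toUpperCase(Locale.ROOT));
        if (mode != Mode.EXTERNAL) {
            return new Plan(mode, null);
        }
        String uri = config.get("uri");
        if (uri == null) {
            throw new IllegalArgumentException("'external' compilation needs a 'uri'");
        }
        return new Plan(mode, uri);
    }

    /** Human-readable summary of what the client would do for a plan. */
    static String describe(Plan plan) {
        switch (plan.mode) {
            case EMBEDDED:
                return "compile on the client, then submit bytecode";
            case REMOTE:
                return "submit the query string, the server compiles";
            default:
                return "compile via the microservice at " + plan.uri;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(resolve(Map.of("compilation", "embedded"))));
        System.out.println(describe(resolve(Map.of("compilation", "remote"))));
        System.out.println(describe(resolve(
                Map.of("compilation", "external", "uri", "localhost:3000/cypher"))));
    }
}
```

In this sketch, failing fast on a missing `uri` keeps the `external` case from silently falling back to another mode. Whether a real client should fall back instead is a design question the thread leaves open.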