Hello,

This is an update of what I’ve been up to on the tp4/ branch since the last 
report 2 weeks ago.

        1. Arguments
                TP4 brings the concept of an Argument to the front and center. 
An argument can either be a constant (e.g. 2) or a dynamically determined value 
(e.g. out().count()). This means that users will be able to do things such as:
                        * has('name',out('father').value('name')) // is he a jr?
                        * is(eq(out('manager'))) // is he his own boss?
                This flexibility is starting to make the steps bleed into each 
other.
                        is(eq(select('a'))) == where(eq('a'))
                One Gremlin-C# guy on Twitter was saying that Gremlin has too 
many ways to do things. It will be nice if we can reduce the number of steps we 
have with Arguments.
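
                To make the constant-versus-dynamic distinction concrete, here 
is a minimal Java sketch. Everything in it is an assumption made for 
illustration: the Argument interface, its constant()/dynamic() factories, and 
the map-based "person" are hypothetical stand-ins, not the classes on the tp4/ 
branch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: these names are illustrative, not the tp4/ API.
interface Argument<S, E> {
    // Resolve the argument against the object currently being processed.
    E resolve(S current);

    // A constant argument ignores the current object, e.g. the 2 in is(eq(2)).
    static <S, E> Argument<S, E> constant(E value) {
        return current -> value;
    }

    // A dynamic argument is computed from the current object; it stands in
    // for a nested traversal such as out('father').value('name').
    static <S, E> Argument<S, E> dynamic(Function<S, E> traversal) {
        return traversal::apply;
    }
}

final class ArgumentSketch {
    // A has(key, argument)-style filter over a simple map-based "person".
    static boolean has(Map<String, Object> person, String key,
                       Argument<Map<String, Object>, Object> argument) {
        Object expected = argument.resolve(person);
        return expected != null && expected.equals(person.get(key));
    }

    static Map<String, Object> person(String name, Map<String, Object> father) {
        Map<String, Object> p = new HashMap<>();
        p.put("name", name);
        p.put("father", father);
        return p;
    }

    // Stand-in for out('father').value('name'); null when there is no father.
    static Object fatherName(Map<String, Object> p) {
        Object father = p.get("father");
        return father == null ? null : ((Map<?, ?>) father).get("name");
    }

    public static void main(String[] args) {
        Map<String, Object> senior = person("marko", null);
        Map<String, Object> junior = person("marko", senior);

        // Constant argument, like has('name', 'marko').
        System.out.println(has(junior, "name", Argument.constant("marko")));
        // Dynamic argument, like has('name', out('father').value('name')).
        System.out.println(has(junior, "name", Argument.dynamic(ArgumentSketch::fatherName)));
        System.out.println(has(senior, "name", Argument.dynamic(ArgumentSketch::fatherName)));
    }
}
```

                The point of the sketch is that has() never needs to know which 
kind of argument it was given, which is what lets one step cover cases that 
previously needed separate steps.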

        2. Console
                Java9+ brings with it JShell. I posed the question on dev@ — do 
we need GremlinConsole?
                        
https://lists.apache.org/thread.html/b9083cf992b01bcfe4b82d14b9aa2d30c90707c4c134c6cfefade4ae@%3Cdev.tinkerpop.apache.org%3E
                It is possible to configure JShell to look (and feel?) like the 
GremlinConsole with a short startup script.
                I would like to shoot for TP4 being as small and compact as 
possible — less to build, less to document, less to maintain, …
                Gremlin-Java -> JShell, Gremlin-Groovy -> GroovySh, 
Gremlin-Python -> Python CLI, … why not reuse?
                The most beautiful code is the code that was never written. The 
greatest programmers are those that coded themselves out of a job. Let us be 
great and beautiful.

        3. Data Structures
                I’m still trying to figure out how to generalize Gremlin out of 
graph. Limited luck.
                Worked with Kuppitz a bit on how to represent all steps using 
just map, flatmap, reduce, filter, and branch only! (it’s a little too nutz for 
my tastes, but maybe…)
                        https://twitter.com/twarko/status/1109491874333515778
                Ryan Wisnesky was kind enough to provide a demo of his Category 
Query Language (CQL) on Monday. Cool stuff indeed.
                Ryan pointed me to this paper which I found worthwhile: 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49.3252&rep=rep1&type=pdf
                This is the big unknown for me and I want to solve it. If we 
can do this right, TinkerPop will permeate all things Apache…all things data.
                        https://twitter.com/twarko/status/1109540859442163712
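
                The idea from this section of expressing every step with only 
map, flatmap, reduce, filter, and branch can be sketched over plain Java 
streams. This is an illustration under assumptions, not the tp4/ 
implementation: the class and method names are hypothetical, and 
java.util.stream.Stream stands in for a stream of traversers.

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: five primitive function types, with familiar steps
// written as thin wrappers over them.
final class PrimitiveStepsSketch {
    // The five primitives.
    static <S, E> Stream<E> map(Stream<S> in, Function<S, E> f) {
        return in.map(f);
    }
    static <S, E> Stream<E> flatmap(Stream<S> in, Function<S, Stream<E>> f) {
        return in.flatMap(f);
    }
    static <S> Stream<S> filter(Stream<S> in, Predicate<S> p) {
        return in.filter(p);
    }
    // Emits a single-element stream so that it still composes with the others.
    static <S> Stream<S> reduce(Stream<S> in, S seed, BinaryOperator<S> op) {
        return Stream.of(in.reduce(seed, op));
    }
    static <S, E> Stream<E> branch(Stream<S> in, Predicate<S> test,
                                   Function<S, Stream<E>> ifTrue,
                                   Function<S, Stream<E>> ifFalse) {
        return in.flatMap(s -> test.test(s) ? ifTrue.apply(s) : ifFalse.apply(s));
    }

    // count() is a map to 1 followed by a summing reduce.
    static <S> Stream<Long> count(Stream<S> in) {
        Stream<Long> ones = map(in, x -> 1L);
        return reduce(ones, 0L, Long::sum);
    }

    // Roughly unfold().is(gt(n)).count(): flatmap, then filter, then reduce.
    static long countGreaterThan(List<List<Integer>> data, int n) {
        Stream<Integer> unfolded = flatmap(data.stream(), list -> list.stream());
        Stream<Integer> kept = filter(unfolded, x -> x > n);
        return count(kept).findFirst().get();
    }

    // A choose()-like step: even values pass through, odd values are scaled.
    static List<Integer> evensOrTimesTen(List<Integer> values) {
        Stream<Integer> out = branch(values.stream(), x -> x % 2 == 0,
                x -> Stream.of(x), x -> Stream.of(x * 10));
        return out.collect(Collectors.toList());
    }
}
```

                Note that branch is itself written with flatMap here, which 
hints at why it is tempting to shrink the set of primitives even further.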

        4. The Machine
                I introduced the Machine interface.
                        
https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/Machine.java
                This interface encompasses both TraversalSource and 
RemoteConnection functionality.
                The general use is g = 
Gremlin.traversal(machine).withProcess(...).withStrategy(...)
                This move turned Gremlin into basically “nothing”: Gremlin is 
just the “builder pattern” applied to Bytecode. Check out how small Gremlin 
is!
                        
https://github.com/apache/tinkerpop/tree/tp4/java/language/gremlin/src/main/java/org/apache/tinkerpop/language/gremlin
                        That’s it?! … Gremlin is trivial. Much less to 
consider for Gremlin-JS, Gremlin-C#, Gremlin-?? …
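
                To illustrate what "the builder pattern applied to Bytecode" 
can look like, here is a deliberately tiny Java sketch. All of it is 
hypothetical: the Instruction, Bytecode, Machine, Traversal, and LocalMachine 
types below are simplified stand-ins with made-up signatures, not the classes 
linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names and signatures are illustrative only.
final class Instruction {
    final String op;
    final List<Object> args;
    Instruction(String op, Object... args) {
        this.op = op;
        this.args = Collections.unmodifiableList(Arrays.asList(args));
    }
    @Override public String toString() { return op + args; }
}

// Bytecode is just an ordered list of instructions.
final class Bytecode {
    final List<Instruction> instructions = new ArrayList<>();
    @Override public String toString() { return instructions.toString(); }
}

// The machine is the only thing that knows how to run bytecode, whether it
// does so locally or by shipping the bytecode somewhere else.
interface Machine {
    List<Object> submit(Bytecode bytecode);
}

// The "language" is a fluent builder that only appends instructions.
final class Traversal {
    private final Machine machine;
    private final Bytecode bytecode = new Bytecode();
    Traversal(Machine machine) { this.machine = machine; }

    Traversal inject(Object... objects) { return add("inject", objects); }
    Traversal is(Object value) { return add("is", value); }
    Traversal count() { return add("count"); }

    private Traversal add(String op, Object... args) {
        bytecode.instructions.add(new Instruction(op, args));
        return this;
    }
    Bytecode bytecode() { return bytecode; }
    List<Object> toList() { return machine.submit(bytecode); }
}

// A toy in-process machine that interprets the three instructions above.
final class LocalMachine implements Machine {
    @Override public List<Object> submit(Bytecode bytecode) {
        List<Object> current = new ArrayList<>();
        for (Instruction instruction : bytecode.instructions) {
            if (instruction.op.equals("inject")) {
                current.addAll(instruction.args);
            } else if (instruction.op.equals("is")) {
                Object expected = instruction.args.get(0);
                current.removeIf(o -> !o.equals(expected));
            } else if (instruction.op.equals("count")) {
                List<Object> counted = new ArrayList<>();
                counted.add(Long.valueOf(current.size()));
                current = counted;
            } else {
                throw new IllegalArgumentException("Unknown instruction: " + instruction.op);
            }
        }
        return current;
    }
}
```

                With this sketch, new Traversal(new 
LocalMachine()).inject(1, 2, 2, 3).is(2).count() builds the bytecode 
[inject[1, 2, 2, 3], is[2], count[]], and toList() hands it to the machine. A 
language variant only has to reproduce the builder; all execution concerns live 
behind Machine.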

        5. RemoteMachine, TraverserServer, and MachineServer
                https://twitter.com/twarko/status/1110612168968265729
                “GremlinServer” is too serial in concept. Receive bytecode, 
execute bytecode, aggregate traversers, return traversers.
                        - This is bad. We need to start thinking distributed 
execution and aggregation from the start. We need to blur the concept of a 
“server.”
                
https://github.com/apache/tinkerpop/tree/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/species/remote
                        MachineServer: sits somewhere and accepts Bytecode. 
(multi-threaded server)
                        RemoteMachine: can talk to a MachineServer to submit 
Bytecode. (single socket client)
                        Processor: exists throughout the cluster and executes 
bytecode. (parallel/distributed execution engine)
                        TraverserServer: can sit somewhere and accept traverser 
results in parallel. (multi-threaded server)
                The thing which accepts bytecode, the thing which executes 
bytecode, and the thing which aggregates results are all different things and 
the entailments are worthy.
                Much like how the Machine interface killed the complexity of 
Gremlin, I believe this server architecture will kill the complexity of 
GremlinServer.
                        - The biggest part of our I/O will be the binary 
protocol (for now I’m just using Object[Input/Output]Stream).
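
                The separation of "accept", "execute", and "aggregate" 
described in this section can be sketched in a single JVM with threads 
standing in for cluster nodes. This is only an illustration under 
assumptions: the class names are hypothetical, there are no sockets, and a 
plain Function stands in for bytecode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical in-process sketch of the three roles; none of these classes
// come from the tp4/ branch.

// Plays the aggregating role: many processors push results concurrently.
final class ResultSinkSketch {
    private final ConcurrentLinkedQueue<Object> results = new ConcurrentLinkedQueue<>();
    void accept(Object result) { results.add(result); }
    List<Object> drain() { return new ArrayList<>(results); }
}

// Plays the executing role: runs the program over its partition and sends
// each result straight to the sink, not back through the acceptor.
final class ProcessorSketch implements Runnable {
    private final List<Integer> partition;
    private final Function<Integer, Object> program;
    private final ResultSinkSketch sink;

    ProcessorSketch(List<Integer> partition, Function<Integer, Object> program,
                    ResultSinkSketch sink) {
        this.partition = partition;
        this.program = program;
        this.sink = sink;
    }

    @Override public void run() {
        for (Integer input : partition) sink.accept(program.apply(input));
    }
}

// Plays the accepting role: receives a program and fans it out.
final class AcceptorSketch {
    private final int processors;
    AcceptorSketch(int processors) { this.processors = processors; }

    void submit(List<Integer> data, Function<Integer, Object> program,
                ResultSinkSketch sink) {
        ExecutorService pool = Executors.newFixedThreadPool(processors);
        for (int p = 0; p < processors; p++) {
            // Round-robin partitioning of the input across processors.
            List<Integer> partition = new ArrayList<>();
            for (int i = p; i < data.size(); i += processors) partition.add(data.get(i));
            pool.execute(new ProcessorSketch(partition, program, sink));
        }
        pool.shutdown();
        try {
            // Blocking here only keeps the demo deterministic.
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

                Results arrive at the sink in no particular order, and the 
acceptor never sees them. Each of the three roles could therefore be scaled or 
placed independently.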
        
        6. Implementing Instructions
                I’m trying not to implement the full language yet; for now I 
want to focus on implementing only one instruction from each “class” of 
instruction.
                This way, if an insight comes, large amounts of code don’t need 
to be rewritten.
                My latest achievement was the implementation of 
order().by().by(). [from the barrier class of instructions]
                        - Along with match() and repeat(), this is arguably one 
of the more difficult steps to implement.
                        - The TP4 implementation is 1/3 the size of the TP3 
implementation and it just worked right out of the box on Apache Beam.
                        - The abstract VM model we have in TP4 is simple and 
consistent. Complex operations are just working.
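
                For readers unfamiliar with why order().by().by() belongs to 
the barrier class, here is a small Java sketch of the idea. It is an 
assumption-laden illustration, not the TP4 implementation: OrderBarrierSketch 
and its methods are hypothetical, and plain objects stand in for traversers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of an order().by(...).by(...) barrier.
final class OrderBarrierSketch<S> {
    private Comparator<S> comparator = null;

    // Each by() adds a lower-priority sort key, used only to break ties
    // left by the keys added before it.
    <K extends Comparable<K>> OrderBarrierSketch<S> by(Function<S, K> key) {
        Comparator<S> next = Comparator.comparing(key);
        comparator = (comparator == null) ? next : comparator.thenComparing(next);
        return this;
    }

    // A barrier drains its whole input before emitting anything, because
    // the first output may well be the last input.
    List<S> process(Iterator<S> input) {
        List<S> buffer = new ArrayList<>();
        while (input.hasNext()) buffer.add(input.next());
        if (comparator != null) buffer.sort(comparator);
        return buffer;
    }
}
```

                For example, new 
OrderBarrierSketch<String>().by(String::length).by(s -> s) applied to "bb", 
"a", "ccc", "ab", "c" orders by length first and alphabetically within each 
length, giving a, c, ab, bb, ccc.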

There you have it. That is a review of the tp4/ branch over the last two weeks. 
Moving forward, I hope to make headway on the following:

        * AkkaProcessor
                - Unlike Pipes and Beam, where the Function is the thread of 
execution, in Akka the Traverser is the thread of execution.
                - Will the TP4 architecture be able to naturally support this 
conceptual tweak? TP3 couldn’t.
        * A data structure breakthrough.
                - Contrary to popular belief, everything is not a graph. 
                - The only time I think “graph” is when I talk to a graphdb. 
                - For the most part I think in lists, maps, sets, primitives — 
don’t you?
        * A better understanding of the TP4 instruction set.
                - What is truly needed? What is our core instruction set?
        * A documentation infrastructure stub.
                - With Gremlin-Groovy going away… how do we do documentation?
        * Traverser species
                - I’m currently copying the TP3 model. I didn’t like it before 
and I still don’t like it.
        * Strategies
                - I haven’t worked on this much, but I believe we might have 
“strategies” all wrong (these are our compiler optimizations).
                - The TP3 model worked well enough for TP3, but for TP4, I 
think we might need a major conceptual overhaul.
                - Just a feeling at this point…

Thanks for reading. As always, I’m more than happy to receive any questions or 
comments.

Take care,
Marko.

http://rredux.com



