Hello,

I think it would be important to have an integration test that benchmarks 
Gremlin (over TinkerGraph). 

What does this look like?

        1. We have a collection of traversals that span the various uses of 
Gremlin (writes, reads, paths, aggregates, etc.).
        2. We have a scale free graph (250k edges?) in TinkerGraph that we run 
this traversal set against.
        3. We save the results of this benchmark to a stats/-like directory that 
gets pushed to the repository.
                - ???/stats/marko-09-23-2015-macosx-3.0.1.txt
                - ???/stats/marko-09-23-2015-macosx-3.1.0-SNAPSHOT.txt
                - etc.
        4. We can then look at how queries become better or worse with each 
release (and in SNAPSHOTS).
                - A cross-file visualization of these benchmarks would be great 
so we can easily see which aspects of Gremlin are getting better or worse.
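The steps above could be sketched roughly as follows. Everything here is hypothetical: the `GremlinBench` class, the `statsFile` naming helper, and the placeholder workloads (which stand in for real Gremlin traversals over a TinkerGraph instance) are all invented for illustration, not proposed API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class GremlinBench {

    // Placeholder workloads standing in for the traversal set (step 1);
    // each would really run a Gremlin traversal, e.g. g.V().out().out().count().next().
    static final Map<String, Supplier<Long>> TRAVERSALS = new LinkedHashMap<>();
    static {
        TRAVERSALS.put("read-count", () -> {
            long sum = 0;
            // stands in for scanning a ~250k-edge scale-free graph (step 2)
            for (int i = 0; i < 250_000; i++) sum += i;
            return sum;
        });
        TRAVERSALS.put("path-walk", () -> {
            long hash = 1;
            for (int i = 1; i < 10_000; i++) hash = hash * 31 + i;
            return hash;
        });
    }

    // Time each traversal and emit one tab-separated line per entry.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Supplier<Long>> e : TRAVERSALS.entrySet()) {
            long start = System.nanoTime();
            e.getValue().get();
            long micros = (System.nanoTime() - start) / 1_000;
            sb.append(e.getKey()).append('\t').append(micros).append(" us\n");
        }
        return sb.toString();
    }

    // Build the stats/ file name in the user-date-os-version scheme (step 3).
    static Path statsFile(Path statsDir, String user, String date, String os, String version) {
        return statsDir.resolve(user + "-" + date + "-" + os + "-" + version + ".txt");
    }

    public static void main(String[] args) {
        System.out.print(report());
        System.out.println(statsFile(Paths.get("stats"),
                "marko", "09-23-2015", "macosx", "3.1.0-SNAPSHOT"));
    }
}
```

Comparing two such files release-over-release (step 4) is then just a per-traversal diff of the timing columns.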

Why only TinkerGraph?

        1. We don't want to benchmark a database/disk. This is about benchmarking 
Gremlin against itself over time.
        2. TinkerGraph doesn't evolve. It's probably the most stable piece of 
code in TP3 -- thus, it's a good baseline system.

Is anyone interested in working on something like this? Note that I'm not 
versed in the best practices for doing this, so if there is a better way to 
accomplish the goal of benchmarking Gremlin across releases, let's do that. 
However, let's start simple and grow so we can get something working and 
providing insights ASAP.

Finally, if this is generally a good idea, I can make a ticket in JIRA for this.

Thoughts?,
Marko.

http://markorodriguez.com
