Hello,
I think it would be important to have an integration test that benchmarks
Gremlin (over TinkerGraph).
What does this look like?
1. We have a collection of traversals that span the various uses of
Gremlin (writes, reads, paths, aggregates, etc.).
2. We have a scale free graph (250k edges?) in TinkerGraph that we run
this traversal set against.
3. We save the results of this benchmark to a stats/-like directory that
gets pushed to the repository.
- ???/stats/marko-09-23-2015-macosx-3.0.1.txt
- ???/stats/marko-09-23-2015-macosx-3.1.0-SNAPSHOT.txt
- etc.
4. We can then look at how queries become better or worse with each
release (and in SNAPSHOTS).
    - A cross-file visualization of these benchmarks would be great so
we can easily see which aspects of Gremlin are getting better or worse.
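To make the steps above concrete, here is a minimal sketch in plain Java: it builds a small scale-free edge list via preferential attachment (a stand-in for the TinkerGraph instance), times one "traversal" (here just a degree-count pass, since the actual traversal set isn't fixed yet), and emits the kind of tab-separated stats line a stats/ file might hold. All class, method, and metric names are hypothetical.

```java
import java.util.*;

public class GremlinBenchSketch {
    // Barabasi-Albert style preferential attachment: each new vertex
    // attaches to an existing vertex chosen proportionally to its degree.
    // (A stand-in for loading a scale-free graph into TinkerGraph.)
    static List<int[]> scaleFreeEdges(int numEdges, Random rnd) {
        List<int[]> edges = new ArrayList<>();
        // Every endpoint occurrence goes in this list, so sampling
        // uniformly from it is degree-proportional sampling.
        List<Integer> degreeList = new ArrayList<>(List.of(0, 1));
        edges.add(new int[]{0, 1}); // seed edge 0->1
        int nextVertex = 2;
        while (edges.size() < numEdges) {
            int target = degreeList.get(rnd.nextInt(degreeList.size()));
            edges.add(new int[]{nextVertex, target});
            degreeList.add(nextVertex);
            degreeList.add(target);
            nextVertex++;
        }
        return edges;
    }

    // Time one named "traversal" (a Runnable stand-in for a real Gremlin
    // traversal) and format a stats line for the proposed stats/ files.
    static String bench(String name, Runnable traversal) {
        long start = System.nanoTime();
        traversal.run();
        long micros = (System.nanoTime() - start) / 1_000;
        return name + "\t" + micros + "us";
    }

    public static void main(String[] args) {
        List<int[]> edges = scaleFreeEdges(250_000, new Random(42));
        String line = bench("out-degree-count", () -> {
            Map<Integer, Integer> outDeg = new HashMap<>();
            for (int[] e : edges) outDeg.merge(e[0], 1, Integer::sum);
        });
        System.out.println("edges=" + edges.size());
        System.out.println(line);
    }
}
```

A real version would of course run the actual traversal suite over TinkerGraph (with warm-up iterations and multiple samples), but the shape of the output file can stay this simple so cross-release diffs are trivial.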
Why only TinkerGraph?
1. We don't want to benchmark a database/disk. This is to benchmark
Gremlin with itself through time.
2. TinkerGraph doesn't evolve. It's probably the most stable piece of
code in TP3 -- thus, it's a good baseline system.
Is anyone interested in working on something like this? Note that I'm not
versed in the best practices for doing this, so if there is a better way to
accomplish the goal of benchmarking Gremlin over releases, let's do that.
However, let's start simple and grow, so we can get something working and
providing insights ASAP.
Finally, if this is generally a good idea, I can make a ticket in JIRA for this.
Thoughts?
Marko.
http://markorodriguez.com