Nice discussion thread, Stephen. I've tinkered around minimally with
writing a graph implementation, so hopefully we'll get more feedback from
others. From what I have done, +1 on killing @OptOut test annotations. They
seem out of place on the Graph impl class.
You mentioned "there is at least one method that could be called on
Features that is typically dynamically driven based on schema" -- which
method is that?
On Mon, Sep 19, 2016 at 4:33 PM, Stephen Mallette <spmalle...@gmail.com> wrote:
> I've spent the middle portion of the day reviewing our test infrastructure
> and related open tickets and have some ideas to make some things better. I
> titled this post for 3.3.0, but, in truth, I'm not sure what must be 3.3.0
> and what might yet be useful and good for 3.2.x. I'm also using this email
> as a way to organize my notes/ideas from the day, so apologies if I'm
> dumping a lot of stuff here to follow.
> (1) Of all the things I came up with, I think the biggest breaker is this
> one: have one uber test suite in gremlin-test. In other words, merge
> gremlin-groovy-test into gremlin-test and get rid of that altogether. Then
> stop publishing test artifacts out of hadoop-gremlin (and wherever else we
> might be doing that). We can make groovy and hadoop dependencies optional
> so that if providers aren't using them, they don't have to have them sucked
> in and can just depend on them as needed.
> (2) Next biggest breaker - how does everyone feel about killing OptOut and
> OptIn and getting those concepts out of gremlin-core and into features of
> gremlin-test. I've heard at least two Graph providers mention a problem
> where they want to "OptOut" more at the GraphProvider level as opposed to
> the Graph level as their configurations in the GraphProvider do more to
> drive that setting than the Graph does. I don't think we lose anything by
> moving "OptOut" except for the describeGraph() functionality:
> which I'm not sure is that big a deal to worry about. That was a bit of a
> nice idea that didn't really develop any further than where it is right now.
> (3) We currently tie the GraphProvider to a specific configuration of a
> Graph instance. So every time you want a slight permutation on that
> configuration, you need to create a new GraphProvider instance. I think
> that we can simplify that and cut down on the proliferation of those
> instances and in the same moment offer some added flexibility. I was
> digging through JUnit docs/code and I think there is a way for us to create
> a "GraphProviderSource" which would annotate a test (rather than
> @GraphProviderClass). The GraphProviderSource would produce a list of
> GraphProvider instances to run each test in a suite with. So, if the
> GraphProviderSource produced 4 different GraphProvider instances, it would
> execute each test in the suite 4 times (one for each GraphProvider).
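To check that I'm reading (3) correctly, here is a rough, self-contained sketch of the idea. Everything in it is hypothetical: ProviderConfig stands in for a configured GraphProvider, the GraphProviderSource interface and runSuite() are made up for illustration, and none of the JUnit wiring is shown.

```java
import java.util.ArrayList;
import java.util.List;

public class GraphProviderSourceSketch {

    /** Hypothetical stand-in for one provider configuration (not TinkerPop's GraphProvider). */
    static final class ProviderConfig {
        final String name;
        final boolean persistent;

        ProviderConfig(String name, boolean persistent) {
            this.name = name;
            this.persistent = persistent;
        }
    }

    /** Hypothetical: lists every provider configuration the suite should run against. */
    interface GraphProviderSource {
        List<ProviderConfig> providers();
    }

    /** Runs every test once per provider; returns a label for each execution, in order. */
    static List<String> runSuite(GraphProviderSource source, List<String> testNames) {
        List<String> executed = new ArrayList<>();
        for (ProviderConfig provider : source.providers()) {
            for (String test : testNames) {
                // A real runner would open a Graph from the provider and invoke the test here.
                executed.add(test + "[" + provider.name + "]");
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        GraphProviderSource source = () -> List.of(
                new ProviderConfig("in-memory", false),
                new ProviderConfig("on-disk", true));
        // Two providers x two tests = four executions.
        runSuite(source, List.of("shouldAddVertex", "shouldAddEdge"))
                .forEach(System.out::println);
    }
}
```

If that matches the intent, a permutation of a configuration becomes one more entry in the list rather than one more GraphProvider class.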
> (4) I think this approach is nice because it spreads into something else
> that I think is important to us: getting maximum value for time out of our
> tests. As we add GLVs and more tests (I think that without integration
> tests right now, we're over 12000 tests), the time it takes to do a basic
> mvn clean install is getting longer and longer. We want that as short
> as possible while maximizing code coverage. To that end, I'll make several
> + jacoco is now good with java 8 (i think it has been for a while, but i
> hadn't noticed). i worked with it a bit today and we should be able to get
> a good aggregate test coverage report with it (assuming we are ok with
> adding a new "gremlin-coverage" module to maven - stinks, but perhaps isn't
> so different than having added gremlin-benchmarks in some respects). If we
> have that we can find out what combinations of GraphProviders give us the
> best coverage for time and make that our standard testing profile.
> + We can build some fairly exotic GraphProviderSource implementations that
> can help us test all possible configuration options for TinkerGraph or
> cover ranges of settings in Neo4jGraph or randomize the returned
> GraphProviders - these could all be options we execute in docker during
> code freeze week (and perhaps periodically during our dev cycles) to ensure
> we're not breaking anything as a result of running the "maximized"
> configuration of just mvn clean install.
> + If that works, we can eliminate the use of Random in our test suite for a
> standard mvn clean install thus eliminating the chance of some
> non-deterministic behavior. Rather than be "random" we just test all the
> + Perhaps we could have different maven profiles that ran different
> GraphProviderSource implementations. I'm thinking that those might be
> triggered from different docker runs to help parallelize the tests and
> allow us to test more permutations more quickly???
> (5) Finally, I think we could speed up our test suite if we could figure
> out a way to cache Graph.Features in the test suite. A lot of tests get
> "ignored" because of test requirements, but the test suite requires a Graph
> instance to check those requirements against the Features. For some
> providers, creating the Graph instances introduces disk I/O even when the
> test will be ignored because of the feature. That setup/teardown is
> expensive and ends up slowing the tests. If we could cache those somehow
> and thus avoid the Graph instance creation, we'd save some processing - I
> suspect it would be helpful to us internally with Neo4j. The trick of
> course is that the Features implementation can't be dynamically driven and
> there is at least one method that could be called on Features that is
> typically dynamically driven based on schema. Very few tests use that
> however, so perhaps there is some way to work around that problem.
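For (5), a minimal sketch of the caching idea as I understand it, assuming features can be reduced to a static set of names per provider configuration. The class, the string keys, and the feature names are all hypothetical, and the schema-driven method mentioned above would have to bypass a cache like this.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class FeatureCacheSketch {

    // Static feature names per provider configuration, filled lazily.
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();

    /** Returns the features for a configuration, calling the costly loader only once per key. */
    Set<String> featuresFor(String configKey, Supplier<Set<String>> loader) {
        return cache.computeIfAbsent(configKey, key -> loader.get());
    }

    /** A test runs only if every feature it requires is present. */
    boolean shouldRun(String configKey, Set<String> required, Supplier<Set<String>> loader) {
        return featuresFor(configKey, loader).containsAll(required);
    }

    public static void main(String[] args) {
        AtomicInteger graphsOpened = new AtomicInteger();
        // Stand-in for opening a Graph just to read its features (the expensive part).
        Supplier<Set<String>> loader = () -> {
            graphsOpened.incrementAndGet();
            return Set.of("Transactions", "Persistence");
        };
        FeatureCacheSketch cache = new FeatureCacheSketch();
        boolean first = cache.shouldRun("neo4j-default", Set.of("Transactions"), loader);
        boolean second = cache.shouldRun("neo4j-default", Set.of("ThreadedTransactions"), loader);
        // prints "true false opened=1"
        System.out.println(first + " " + second + " opened=" + graphsOpened.get());
    }
}
```

The second check is answered from the cache, so an ignored test never pays for Graph setup/teardown.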
> Well, my brain has been dumped. Thoughts welcome.