yeah - i actually thought about "gremlin-tools" and almost wondered if we shouldn't do:
gremlin-tools
+-gremlin-benchmarks
+-gremlin-coverage

for 3.3.0. The added flexibility to have independent poms for these things
might turn out useful. I sorta like that idea.

On Sat, Oct 8, 2016 at 10:47 AM, Ted Wilmes <twil...@gmail.com> wrote:

> Adding coverage to gremlin-benchmark sounds good to me. If we come up with
> any other dev specific tooling that we'd like to add, maybe it would make
> sense to just rename gremlin-benchmark to something like gremlin-tools or
> gremlin-dev.
>
> --Ted
>
> On Fri, Oct 7, 2016 at 5:20 PM, Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > Thought I'd keep this thread warm a bit. If you've built the TinkerPop
> > repo recently, you would realize it's taking a really long time these
> > days to get a simple `mvn clean install` completed. We've produced tons
> > of tests that are adding to build times and I think that while we have
> > lots of tests, it does NOT mean:
> >
> > 1. we need to execute all of them all of the time or that
> > 2. we have good coverage
> >
> > I think that we will need to start optimizing our unit test runs (and
> > the build in general) to get us back to a point where we can get a
> > build in less than 15 minutes (10 would be better, but I'm not sure
> > we'll get there as an initial step). Just from the test perspective, I
> > think this will mean:
> >
> > 1. More unit tests that mock interfaces/classes over full stack
> > integration tests
> > 2. Move more unit tests to integration tests
> >
> > So with all that in mind, I've got a local branch that adds a
> > -Dcoverage option that does two things:
> >
> > 1. builds an aggregated report of unit test data to show us what's
> > running long
> > 2. builds an aggregated test coverage report that shows us what code is
> > tested and what is not
> >
> > So far this only is setup for unit tests (not integration tests) and
> > it's really just a model as I don't have hadoop/spark/giraph in the mix
> > yet for coverage.
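For context, wiring jacoco in behind a `-Dcoverage` property could take
roughly the shape of the profile below. This is only a sketch of the general
idea - the plugin version, goal choices, and placement are my assumptions,
not the contents of the branch discussed above:

```xml
<!-- Hypothetical "coverage" profile, activated by -Dcoverage.
     Version and configuration here are illustrative assumptions. -->
<profile>
  <id>coverage</id>
  <activation>
    <property><name>coverage</name></property>
  </activation>
  <build>
    <plugins>
      <plugin>
        <groupId>org.jacoco</groupId>
        <artifactId>jacoco-maven-plugin</artifactId>
        <version>0.7.7.201606060606</version>
        <executions>
          <!-- attach the jacoco agent to unit test runs -->
          <execution>
            <id>prepare-agent</id>
            <goals><goal>prepare-agent</goal></goals>
          </execution>
          <!-- roll per-module data up into one aggregate report -->
          <execution>
            <id>report-aggregate</id>
            <phase>verify</phase>
            <goals><goal>report-aggregate</goal></goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

The `report-aggregate` goal is what makes the multi-module story workable,
though as noted it wants a dedicated module (or a repurposed one like
gremlin-benchmark) to hold the aggregating configuration.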
> > As I alluded to earlier in this thread, I was going to use jacoco now
> > that it supports java 8 and does a better job with multi-module builds.
> > It works fine but I had to create a gremlin-coverage module to make it
> > all work. That kinda stinks, so I opted instead to repurpose
> > gremlin-benchmark to hold the coverage configuration. It just means
> > that gremlin-benchmark would now have some additional dependencies and
> > the new config for the "coverage" maven profile. If everyone thinks
> > that's ok, I will make this part of 3.2.2. I don't think that infringes
> > too much on gremlin-benchmark, but that's just my opinion - to me
> > that's better than yet another module.
> >
> > Given that there's been no objections in this thread, I will likely
> > create some JIRA tickets based on what i've suggested here so that I
> > can get to work on that first thing for 3.3.0 (and try to speed up the
> > 3.2.2 build where possible).
> >
> > On Tue, Sep 20, 2016 at 11:16 AM, Stephen Mallette
> > <spmalle...@gmail.com> wrote:
> >
> > > I think these are the ones that contain logic and are dynamically
> > > sorta driven:
> > >
> > > ElementFeatures.willAllowId(Object)
> > > VertexPropertyFeatures.willAllowId(Object)
> > > VertexFeatures.getCardinality(String)
> > >
> > > I was thinking that some graphs might return static values for these
> > > in which case caching would work. Obviously, schema driven graphs
> > > would have trouble with getCardinality(), though I don't remember the
> > > contexts in which any of these are used - my analysis didn't go that
> > > far.
> > >
> > > On Tue, Sep 20, 2016 at 10:54 AM, Jason Plurad <plur...@gmail.com>
> > > wrote:
> > >
> > > > Nice discussion thread, Stephen. I've tinkered around minimally
> > > > with writing a graph implementation, so hopefully we'll get more
> > > > feedback from others. From what I have done, +1 on killing @OptOut
> > > > test annotations.
> > > > They seem out of place on the Graph impl class.
> > > >
> > > > You mentioned "there is at least one method that could be called on
> > > > Features that is typically dynamically driven based on schema" --
> > > > which method is that?
> > > >
> > > > -- Jason
> > > >
> > > > On Mon, Sep 19, 2016 at 4:33 PM, Stephen Mallette
> > > > <spmalle...@gmail.com> wrote:
> > > >
> > > > > I've spent the middle portion of the day reviewing our test
> > > > > infrastructure and related open tickets and have some ideas to
> > > > > make some things better. I titled this post for 3.3.0, but, in
> > > > > truth, I'm not sure what must be 3.3.0 and what might yet be
> > > > > useful and good for 3.2.x. I'm also using this email as a way to
> > > > > organize my notes/ideas from the day, so apologies if I'm dumping
> > > > > a lot of stuff here to follow.
> > > > >
> > > > > (1) Of all the things I came up with, I think the biggest breaker
> > > > > is this one: have one uber test suite in gremlin-test. In other
> > > > > words, merge gremlin-groovy-test into gremlin-test and get rid of
> > > > > it altogether. Then, stop publishing test artifacts out of
> > > > > hadoop-gremlin (and wherever else we might be doing that). We can
> > > > > make groovy and hadoop dependencies optional so that if providers
> > > > > aren't using them, they don't have to have them sucked in and can
> > > > > just depend on them as needed.
> > > > >
> > > > > (2) Next biggest breaker - how does everyone feel about killing
> > > > > OptOut and OptIn and getting those concepts out of gremlin-core
> > > > > and into features of gremlin-test? I've heard at least two Graph
> > > > > providers mention a problem where they want to "OptOut" more at
> > > > > the GraphProvider level as opposed to the Graph level, as their
> > > > > configurations in the GraphProvider do more to drive that setting
> > > > > than the Graph does.
> > > > > I don't think we lose anything by moving "OptOut" except for the
> > > > > describeGraph() functionality:
> > > > >
> > > > > http://tinkerpop.apache.org/docs/current/reference/#describe-graph
> > > > >
> > > > > which I'm not sure is that big a deal to worry about. That was a
> > > > > bit of a nice idea that didn't really develop any further than
> > > > > where it is right now.
> > > > >
> > > > > (3) We currently tie the GraphProvider to a specific
> > > > > configuration of a Graph instance. So every time you want a
> > > > > slight permutation on that configuration, you need to create a
> > > > > new GraphProvider instance. I think that we can simplify that,
> > > > > cut down on the proliferation of those instances, and in the same
> > > > > moment offer some added flexibility. I was digging through JUnit
> > > > > docs/code and I think there is a way for us to create a
> > > > > "GraphProviderSource" which would annotate a test (rather than
> > > > > @GraphProviderClass). The GraphProviderSource would produce a
> > > > > list of GraphProvider instances to run each test in a suite with.
> > > > > So, if the GraphProviderSource produced 4 different GraphProvider
> > > > > instances, it would execute each test in the suite 4 times (one
> > > > > for each GraphProvider).
> > > > >
> > > > > (4) I think this approach is nice because it spreads into
> > > > > something else that I think is important to us: getting maximum
> > > > > value for time out of our tests. As we add GLVs and more tests (I
> > > > > think that without integration tests right now, we're over 12000
> > > > > tests), the time it takes to do a basic mvn clean install is
> > > > > getting longer and longer. We want that as short as possible
> > > > > while maximizing code coverage. To that end, I'll make several
> > > > > points:
> > > > >
> > > > > + jacoco is now good with java 8 (i think it has been for a
> > > > > while, but i hadn't noticed).
> > > > > i worked with it a bit today and we should be able to get a good
> > > > > aggregate test coverage report with it (assuming we are ok with
> > > > > adding a new "gremlin-coverage" module to maven - stinks, but
> > > > > perhaps isn't so different than having added gremlin-benchmarks
> > > > > in some respects). If we have that, we can find out what
> > > > > combinations of GraphProviders give us the best coverage for time
> > > > > and make that our standard testing profile.
> > > > > + We can build some fairly exotic GraphProviderSource
> > > > > implementations that can help us test all possible configuration
> > > > > options for TinkerGraph or cover ranges of settings in Neo4jGraph
> > > > > or randomize the returned GraphProviders - these could all be
> > > > > options we execute in docker during code freeze week (and perhaps
> > > > > periodically during our dev cycles) to ensure we're not breaking
> > > > > anything as a result of running the "maximized" configuration of
> > > > > just mvn clean install.
> > > > > + If that works, we can eliminate the use of Random in our test
> > > > > suite for a standard mvn clean install, thus eliminating the
> > > > > chance of some non-deterministic behavior. Rather than be
> > > > > "random" we just test all the cases.
> > > > > + Perhaps we could have different maven profiles that ran
> > > > > different GraphProviderSource implementations. I'm thinking that
> > > > > those might be triggered from different docker runs to help
> > > > > parallelize the tests and allow us to test more permutations more
> > > > > quickly???
> > > > >
> > > > > (5) Finally, I think we could speed up our test suite if we could
> > > > > figure out a way to cache Graph.Features in the test suite. A lot
> > > > > of tests get "ignored" because of test requirements, but the test
> > > > > suite requires a Graph instance to check those requirements
> > > > > against the Features.
> > > > > For some providers, creating the Graph instances introduces disk
> > > > > I/O even when the test will be ignored because of the feature.
> > > > > That setup/teardown is expensive and ends up slowing the tests.
> > > > > If we could cache those somehow and thus avoid the Graph instance
> > > > > creation, we'd save some processing - I suspect it would be
> > > > > helpful to us internally with Neo4j. The trick of course is that
> > > > > the Features implementation can't be dynamically driven, and
> > > > > there is at least one method that could be called on Features
> > > > > that is typically dynamically driven based on schema. Very few
> > > > > tests use that however, so perhaps there is some way to work
> > > > > around that problem.
> > > > >
> > > > > Well, my brain has been dumped. Thoughts welcome.
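To make the caching idea in (5) concrete, here is a minimal sketch in plain
Java. Everything here (the FeatureCache name, the supplier wiring) is
hypothetical illustration, not TinkerPop's actual test internals:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: memoize a provider's static feature answers so the
// test suite can decide whether to ignore a test without constructing a
// Graph instance (and paying its disk I/O) on every check.
class FeatureCache {
    // feature name -> supported?, for one provider configuration
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    // Looks up a feature, invoking the expensive check (the path that
    // would normally spin up a Graph and read its Features) only once.
    public boolean supports(String feature, Supplier<Boolean> expensiveCheck) {
        return cache.computeIfAbsent(feature, f -> expensiveCheck.get());
    }
}
```

In the suite, the supplier would wrap the existing Graph-construction path.
Dynamically driven methods like ElementFeatures.willAllowId(Object) or
VertexFeatures.getCardinality(String) would have to bypass such a cache,
which matches the caveat raised earlier in the thread.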