I've been working on a revised testing framework for IO - specifically
GraphSON and Gryo. The idea to do this came from this issue Marko created a
while back:

https://issues.apache.org/jira/browse/TINKERPOP-1130

but was pushed to the forefront by

https://issues.apache.org/jira/browse/TINKERPOP-1565

where I started running into strange situations where changes to IO-related
classes weren't producing test failures. To me that seemed to indicate that
we really didn't have IO testing well covered despite having hundreds of
tests in place validating IO. :/

So, I've started a TINKERPOP-1130 branch (based on master) to try to
improve testing. It involved adding a new sub-module to gremlin-tools
called gremlin-io-test.  I had to do this as a separate parent module
because while TINKERPOP-1130 talks about testing a "graph", IO is more than
that - it includes Gremlin Server/Driver stuff too. As a result, I needed a
test module that could sit at the top of the dependency chain.

So, a few things that are neat about this new module:

1. It generates the content for the dev/io docs automatically through
Maven. While you still have to cut/paste the content into the asciidoc, at
least we don't need to maintain a script that we copy/paste out of the
asciidoc itself.
2. Everything that we document for IO we validate, and the two are pretty
well tied together via a Model (
https://github.com/apache/tinkerpop/blob/TINKERPOP-1130/gremlin-tools/gremlin-io-test/src/main/java/org/apache/tinkerpop/gremlin/structure/io/Model.java)
which defines the classes we support for serialization and the IO
configuration/version that supports each of them.
3. It generates a CSV file that gives us an idea of what is being tested
and thus supported at a particular version:
https://gist.github.com/spmallette/a25c26f4f371846f11c55a7665a76999
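
To make points 2 and 3 a bit more concrete, here is a minimal sketch of the
idea: a single registry of sample objects, each tagged with the IO formats
expected to round-trip it, from which a true/false CSV can be derived. This is
not the actual Model.java. The class name IoModelSketch, its methods, the
format labels and the sample rows are all hypothetical and only illustrate the
shape of the approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from the real Model.java.
final class IoModelSketch {

    // One documented example: a title, a sample object, and the formats
    // that are expected to serialize/deserialize it.
    static final class Entry {
        final String title;
        final Object sample;
        final Set<String> supportedFormats;

        Entry(String title, Object sample, String... supportedFormats) {
            this.title = title;
            this.sample = sample;
            this.supportedFormats = new LinkedHashSet<>(Arrays.asList(supportedFormats));
        }
    }

    // Illustrative format labels only (assumption, not the real configuration names).
    static final List<String> FORMATS = Arrays.asList("graphson-v1", "graphson-v2", "gryo-v1");

    private final List<Entry> entries = new ArrayList<>();

    // Registers a sample once; docs, tests and the CSV would all read from this list.
    IoModelSketch add(String title, Object sample, String... supportedFormats) {
        entries.add(new Entry(title, sample, supportedFormats));
        return this;
    }

    // Builds a header row plus one row per entry, with true/false per format.
    String toCsv() {
        StringBuilder sb = new StringBuilder("resource");
        for (String format : FORMATS) {
            sb.append(',').append(format);
        }
        sb.append('\n');
        for (Entry entry : entries) {
            sb.append(entry.title);
            for (String format : FORMATS) {
                sb.append(',').append(entry.supportedFormats.contains(format));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

With something like this, adding a new supported class means adding one entry,
and both the documentation sample and the compatibility row come from that
same entry, so the two cannot drift apart. The rows you would pass to add()
above are made up for illustration and do not reflect the real gist.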

As you can tell from the CSV/gist, there's a fair bit of inconsistency
across IO (e.g. integer/double for GraphSON v2 has a weird bug when you call
deserialization methods a certain way, TinkerGraph isn't supported anywhere
because I realized multi-properties aren't deserializing properly, etc.).
Ideally, we should have more things flipped to "true" in that CSV than we
currently have.

A nice side effect of this work is that I think we can relieve graph
providers a bit and remove some of the IO tests from the test suite (I also
suspect we can retire some internal unit tests, as gremlin-io-test is more
complete in what it does).

I'd like to submit a PR pretty soon, so that work on TINKERPOP-1565 can
resume, but would like to clean up a few of the inconsistencies first.
It may be tricky too, because this work is for master, but the fixes
probably belong in tp32, so I may have to recreate failure conditions over
in that branch to get a fix for master.

If you have any thoughts or feedback please let me know.
