Analysing ODF Schema / Using the ODF Toolkit for Test document generation

Svante Schubert Sat, 21 Jan 2017 08:34:50 -0800

Hi guys,

I would like your feedback!
I would like to analyse the ODF XML schema a little further in order to
create test documents that are valid variations of a given ODF feature.


Think of a paragraph. It is represented by the <text:p> element.
If ODF applications want 'sufficient' test documents for the paragraph
feature, how would they proceed?
The straightforward approach would be to generate test documents with
none, one or some random number of generated paragraphs at root level.
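
As a rough illustration of that straightforward approach, here is a minimal
sketch that writes flat ODF text documents with a given number of root-level
paragraphs. The function name flat_text_document and the fixed counts 0, 1
and 5 (instead of a random number, to keep the output reproducible) are my
own assumptions. It uses only the Python standard library, not the ODF
Toolkit, and the result has not been validated against the schema here.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("office", OFFICE)
ET.register_namespace("text", TEXT)


def flat_text_document(paragraphs):
    """Build a flat ODF text document (hypothetical helper) as an XML string."""
    doc = ET.Element(
        f"{{{OFFICE}}}document",
        {
            f"{{{OFFICE}}}version": "1.2",
            f"{{{OFFICE}}}mimetype": "application/vnd.oasis.opendocument.text",
        },
    )
    body = ET.SubElement(doc, f"{{{OFFICE}}}body")
    text = ET.SubElement(body, f"{{{OFFICE}}}text")
    # One <text:p> per entry, directly below <office:text> (root level).
    for content in paragraphs:
        p = ET.SubElement(text, f"{{{TEXT}}}p")
        p.text = content
    return ET.tostring(doc, encoding="unicode")


# Variants with none, one and several paragraphs.
variants = {
    n: flat_text_document([f"Paragraph {i + 1}" for i in range(n)])
    for n in (0, 1, 5)
}
for n, xml in variants.items():
    print(n, len(xml))
```

Each string could be saved with a .fodt extension and opened in an ODF
application. This only covers the root-level case, not paragraphs inside
other features.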

What else can we think of? Paragraphs in headers & footers come to my
mind.
Perhaps a paragraph within a table cell? To generalize these cases:
paragraphs within different ODF features.

But how do we know that we have tests for all types of variations?
The only possible way, it seems to me, is to look into the ODF schema.

If you open the ODF 1.2 XML RelaxNG schema
<http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng> in
an editor or viewer, you will find a single definition of the <text:p>
element:
<define name="text-p">
  <element name="text:p">
    <ref name="paragraph-attrs"/>
    <zeroOrMore>
      <ref name="paragraph-content-or-hyperlink"/>
    </zeroOrMore>
  </element>
</define>
This definition is referenced 19 times throughout the schema.
The question is: Are there 19 distinct feature scenarios to be tested?
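
To make the idea of "where is it referenced" concrete, here is a small
sketch that lists the named patterns containing a reference to "text-p".
It is plain Python with the standard library, not the Toolkit's generator.
The embedded schema is a tiny made-up excerpt, and its pattern names are
only illustrative assumptions, so the result says nothing about the real
count in the ODF schema.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Tiny hypothetical excerpt standing in for the real ODF 1.2 schema file.
SAMPLE_SCHEMA = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="text-p">
    <element name="text:p"><empty/></element>
  </define>
  <define name="text-content">
    <choice><ref name="text-h"/><ref name="text-p"/></choice>
  </define>
  <define name="table-table-cell-content">
    <zeroOrMore><ref name="text-content"/></zeroOrMore>
  </define>
  <define name="style-header-content">
    <zeroOrMore><ref name="text-p"/></zeroOrMore>
  </define>
</grammar>
"""


def defines_referencing(root, target):
    """Return the names of all <define> patterns containing <ref name=target/>."""
    users = []
    for define in root.iter(RNG + "define"):
        if any(ref.get("name") == target for ref in define.iter(RNG + "ref")):
            users.append(define.get("name"))
    return users


root = ET.fromstring(SAMPLE_SCHEMA)
users = defines_referencing(root, "text-p")
print(len(users), users)
```

For the real schema, the root would come from ET.parse() on a downloaded
copy of the .rng file. Note that a pattern holding several references is
listed only once here, so the number can differ from a plain text search.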

When you try to look at some of those occurrences, you quickly realize
that it is very cumbersome to jump back and forth within the text file.
For this reason the Toolkit generates an HTML file in one of its tests.

After running 'mvn install' on all sources you will find the following file:
odftoolkit20170121/generator/schema2template/target/OdfReference.html
For ease of use I have shared OdfReference.html via public Dropbox
(1.8MB) <https://dl.dropboxusercontent.com/u/49473263/OdfReference.html>.

As you can see, the RelaxNG sequence and choice are still missing in the
HTML. The generator does not evaluate those yet.

If you are curious how this works:
The RelaxNG is parsed using Sun Microsystems' Multi-Schema Validator
(MSV), a tool that is able to load DTD, W3C XML Schema and RelaxNG into a
common internal model and use any of them for XML validation.
Our generator project uses MSV to create an internal model of the ODF 1.2
RelaxNG schema and creates the typed source classes for the ODFDOM
project by filling data into templates from the Apache Velocity project.
These templates are basically text files with Java source code, which
access data from variables provided to the template, such as Java maps;
in our case the MSV model.

Interesting in this context is the dump of the internal model that can be
found at
odftoolkit20170121/generator/schema2template/target/odf12-msvtree.dump
Again, I have shared odf12-msvtree.dump via public Dropbox (0.6MB)
<https://dl.dropboxusercontent.com/u/49473263/odf12-msvtree.dump>.

So I would like to get some feedback from you:
What are the possible ways to proceed on this task?
Is it feasible to load the model of the RelaxNG schema into a graph
database to do some analysis? For instance, can a certain element be
nested?
Also, I would add metadata to elements that belong to a user feature,
e.g. all elements that belong to a table, and mark elements that are
basically just boilerplate, such as <office:body>.
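
To sketch what such a graph analysis might look like before committing to
a graph database, here is a minimal in-memory version in Python. It builds
a directed graph from each named pattern to the patterns it references and
checks whether a pattern can reach itself again. The embedded schema is a
made-up excerpt, and the function names reference_graph and can_nest are
hypothetical. It works on named patterns, so it only approximates element
nesting where a pattern wraps exactly one element.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Made-up excerpt: a table holds cells, cells hold text content,
# and text content may again be a table.
SAMPLE = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="table-table">
    <element name="table:table">
      <oneOrMore><ref name="table-table-cell"/></oneOrMore>
    </element>
  </define>
  <define name="table-table-cell">
    <element name="table:table-cell">
      <zeroOrMore><ref name="text-content"/></zeroOrMore>
    </element>
  </define>
  <define name="text-content">
    <choice><ref name="text-p"/><ref name="table-table"/></choice>
  </define>
  <define name="text-p">
    <element name="text:p"><text/></element>
  </define>
</grammar>"""


def reference_graph(root):
    """Map each <define> name to the set of pattern names it references."""
    graph = {}
    for define in root.iter(RNG + "define"):
        # setdefault merges several defines that share one name.
        targets = graph.setdefault(define.get("name"), set())
        targets.update(ref.get("name") for ref in define.iter(RNG + "ref"))
    return graph


def can_nest(graph, name):
    """True if pattern `name` can reach itself by following references."""
    stack = list(graph.get(name, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == name:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return False


graph = reference_graph(ET.fromstring(SAMPLE))
for pattern in ("table-table", "text-p"):
    print(pattern, can_nest(graph, pattern))
```

Feature metadata could be attached in the same spirit, for example as a
separate dictionary from pattern name to a feature label such as "table".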

What do you think? Does anyone have experience in this area? Any thoughts
are welcome!

All the best,
Svante