Hi guys,

I need your feedback! I would like to analyse the ODF XML schema in more depth in order to create test documents that are valid variations of a given ODF feature.
Think of a paragraph. It is represented by the <text:p> element. Suppose ODF applications want 'sufficient' test documents for the paragraph feature: how would they proceed? The straightforward approach is to generate test documents with no paragraph, one paragraph, or some random number of generated paragraphs at root level. What else can we think of? Paragraphs in headers and footers come to mind. Perhaps a paragraph within a table cell? To generalize these cases: paragraphs within different ODF features. But how do we know that we have tests for all types of variation? The only way I can see is to look into the ODF schema.

If you open the ODF 1.2 XML RelaxNG schema <http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng> in an editor or viewer, you will find a single definition of the <text:p> element:

    <define name="text-p">
      <element name="text:p">
        <ref name="paragraph-attrs"/>
        <zeroOrMore>
          <ref name="paragraph-content-or-hyperlink"/>
        </zeroOrMore>
      </element>
    </define>

This definition is used 19 times throughout the schema. The question is: are there 19 distinct feature scenarios to be tested?

When you try to look at some of those occurrences, you quickly realize that it is very cumbersome to jump back and forth within the text file. For this reason the Toolkit generates an HTML file in one of its tests. After running 'mvn install' on all sources you will find the following file:

odftoolkit20170121/generator/schema2template/target/OdfReference.html

For ease of use I have shared OdfReference.html via public Dropbox (1.8MB) <https://dl.dropboxusercontent.com/u/49473263/OdfReference.html>. As you can see, the RelaxNG sequence and choice constructs are still missing in the HTML; the generator does not evaluate them yet.

If you are curious how this works: the RelaxNG schema is parsed using Sun Microsystems' Multi-Schema Validator (MSV), a tool that is able to load DTD, W3C XML Schema and RelaxNG into a common internal model and use any of them for XML validation.
Our generator project uses MSV to create an internal model of the ODF 1.2 RelaxNG schema and creates the typed source classes for the ODFDOM project by filling data into templates from the Apache Velocity project. The templates are basically text files with Java source code that access data from environment variables such as Java maps, in our case the MSV model.

Interesting in this context is the dump of the internal model, which can be found at

odftoolkit20170121/generator/schema2template/target/odf12-msvtree.dump

Again, I have shared odf12-msvtree.dump via public Dropbox (0.6MB) <https://dl.dropboxusercontent.com/u/49473263/odf12-msvtree.dump>.

So I would like to get some feedback from you. What are the possible ways to proceed on this task? Is it feasible to load the RelaxNG model into a graph database for analysis, for instance to find out whether a certain element can be nested? I would also add metadata to elements that belong to a user feature, such as all elements that belong to a table, or to elements that are basically just boilerplate, such as <office:body>.

What do you think? Does anyone have experience in this area? Any thoughts are welcome!

All the best,
Svante
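
P.S. To make the graph question a bit more concrete, here is a minimal sketch in Python, not part of the Toolkit and not using MSV. It reads the <define>/<ref> structure of a RelaxNG grammar into a plain dictionary, lists which definitions reference a given one (the "where is text-p used" question), and checks whether a definition can reach itself again (a rough stand-in for "can this element be nested"). The SAMPLE grammar and the function names build_ref_graph, referrers and can_nest are made up for illustration; SAMPLE is a tiny hypothetical grammar, not an excerpt of the real ODF schema. Reachability on the define level is only an approximation of real element nesting.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Hypothetical miniature grammar for illustration only,
# NOT an excerpt of the real ODF 1.2 schema.
SAMPLE = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="office-text">
    <element name="office:text">
      <zeroOrMore><ref name="text-content"/></zeroOrMore>
    </element>
  </define>
  <define name="text-content">
    <choice>
      <ref name="text-p"/>
      <ref name="table-table"/>
    </choice>
  </define>
  <define name="text-p">
    <element name="text:p"><text/></element>
  </define>
  <define name="table-table">
    <element name="table:table">
      <oneOrMore><ref name="table-table-cell"/></oneOrMore>
    </element>
  </define>
  <define name="table-table-cell">
    <element name="table:table-cell">
      <zeroOrMore><ref name="text-content"/></zeroOrMore>
    </element>
  </define>
</grammar>
"""


def build_ref_graph(rng_xml):
    """Map each <define> name to the set of define names it references."""
    root = ET.fromstring(rng_xml)
    graph = defaultdict(set)
    for define in root.iter(RNG + "define"):
        name = define.get("name")
        graph[name]  # register defines that contain no <ref> at all
        for ref in define.iter(RNG + "ref"):
            # Several defines with the same name are merged into one node.
            graph[name].add(ref.get("name"))
    return dict(graph)


def referrers(graph, target):
    """Names of all defines that reference `target` directly, sorted."""
    return sorted(name for name, refs in graph.items() if target in refs)


def can_nest(graph, name):
    """True if `name` is reachable from itself by following references."""
    stack = list(graph.get(name, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == name:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return False


if __name__ == "__main__":
    graph = build_ref_graph(SAMPLE)
    print(referrers(graph, "text-p"))     # direct users of text-p
    print(can_nest(graph, "table-table"))  # table -> cell -> content -> table
    print(can_nest(graph, "text-p"))
```

For the real schema one would pass the contents of OpenDocument-v1.2-os-schema.rng instead of SAMPLE. Note that referrers returns distinct definitions, so its length can be lower than the number of individual <ref> occurrences. The same dictionary could be exported as nodes and edges to a graph database if the queries become more involved.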
