Hi Svante, re your interest in GraphDB. My ODF Explorer generates dot files which are used by Graphviz to produce tree images.
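As a rough illustration of that idea (this is not ODF Explorer's actual code; the class `DomToDot` and its method names are invented for the sketch), an XML tree can be walked with the standard Java DOM API and written out in Graphviz DOT syntax, one node per element and one edge per parent/child pair:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: turns the element structure of an XML document into DOT text.
public class DomToDot {
    private int nextId = 0;
    private final StringBuilder out = new StringBuilder();

    // Parses the XML string and returns a DOT digraph describing its element tree.
    public static String toDot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        DomToDot writer = new DomToDot();
        writer.out.append("digraph odf {\n");
        writer.visit(doc.getDocumentElement());
        writer.out.append("}\n");
        return writer.out.toString();
    }

    // Emits a node for this element, recurses into child elements, and returns the node id.
    // Text nodes are skipped, so only the structure is captured.
    private int visit(Element element) {
        int id = nextId++;
        out.append("  n").append(id)
           .append(" [label=\"").append(element.getNodeName()).append("\"];\n");
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                int childId = visit((Element) child);
                out.append("  n").append(id).append(" -> n").append(childId).append(";\n");
            }
        }
        return id;
    }

    public static void main(String[] args) throws Exception {
        // A tiny stand-in for an ODF content tree, not a complete ODF document.
        String xml = "<office:document"
                + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
                + " xmlns:text=\"urn:oasis:names:tc:opendocument:xmlns:text:1.0\">"
                + "<office:body><office:text><text:p/></office:text></office:body>"
                + "</office:document>";
        System.out.print(toDot(xml));
    }
}
```

Saving the output as `tree.dot` and running `dot -Tpng tree.dot -o tree.png` lets Graphviz render the tree image.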
Further, one of the features you may not be aware of is that it can compare documents and show the differences found graphically, at least from a structural point of view. It does not look at the text of a document, but it could. An example output can be seen here: http://hammyau.github.io/ODFExplorer/XPathGraphCompare.html

I am more than happy for this code to be cleaned up/extended and added to the ODF Toolkit in any way, shape or form. If it helps developers to more easily understand the ODF document structures, that would be great. I do not understand the process or legalities involved though.

On Thu, Apr 13, 2017 at 8:22 PM, Svante Schubert <[email protected]> wrote:
> Rob Weir once told me that participating in open source is like scratching the part that itches most. Rob was one of the Jedis of the ISO Document Clone Wars, when Microsoft cloned the concept of ODF for OOXML, making it an ISO standard on office documents as well and reusing the idea of zipped XML files and some more [1].
>
> One thing is for sure: the ODF Toolkit is a wonderful place to play around with new concepts for file formats, as we already did:
>
> 1. We generate from the ODF grammar (RelaxNG) <http://incubator.apache.org/odftoolkit/0.6.2-incubating/schema2template/> a typed ODF DOM tree <http://incubator.apache.org/odftoolkit/0.6.2-incubating/odfdom/overview-summary.html#The_ODFDOM_Layers>, so developers do not have to know the details of the verbose ODF schema <http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng>, and XSL tooling works out-of-the-box as XSLT directly on ODF documents <http://incubator.apache.org/odftoolkit/xsltrunner/ODFXSLTRunner.html>.
> 2. We have the most advanced ODF validator <http://odf-validator.rhcloud.com> doing its job mostly unseen, for example for regression testing of LibreOffice and the ODF test server <https://gitlab.com/odfplugfest/odfserver/>.
>    It reports not only XML invalidity but also violated constraints of the package/ZIP format via the XML ErrorHandler interface <http://www.saxproject.org/apidoc/org/xml/sax/ErrorHandler.html>.
> 3. Many companies are using the ODF Toolkit to get automated access to information in a document and to adapt the document according to any kind of pattern.
>
> After the release, I am curious how you developers on the ODF dev list might extend the functionality that you are missing most.
> Remember, when nothing is given back, the project will never flourish into a top-level Apache project and might in the worst case even vanish.
>
> Therefore, I would like to give you my personal roadmap of improved functionality for the ODF Toolkit and would love to know about yours or ask for assistance!
>
> From what I learned, the complexity of file formats can be tamed best when applying as much automation as possible. Therefore, our focus should be on the generation part of ODFDOM.
> Currently, the sequence and choice of XML elements provided by the grammar are not generated into XML. In addition, if there are multiple children with an xml:id (or an equivalent attribute named in our configuration), a map should be generated on demand (see JIRA feature <https://issues.apache.org/jira/browse/ODFTOOLKIT-182>). If the above is done, one of the oldest written parts of ODFDOM, the style functionality, could be generated and work could be so much easier.
>
> User components (image, paragraph, table, each character, etc.) have to be identified directly in the RelaxNG (e.g. as a comment above the start element), so they can be generated as well for the high level, and an explicit hand-written Simple API would become more or less redundant. Best thing: the more we generate, the easier we fix on a large scale, and we might even generate not only an ODF model with Java DOM source code but also C++ binary source code for some Android/iOS model.
>
> Finally, a document shall be mapped to a sequence of changes of the user components mentioned above (similar to the edit sequence of a user creating the document), and new changes from ODF editors shall be merged into the document. The upcoming ODF change-tracking will be based on defined changes, as opposed to some prior-state XML parts saved aside without knowing what they mean and therefore with no chance of having such parts overlap each other.
> Changes are the next big step of document evolution. While documents will always have their meaning as a snapshot of overall state (like the document being signed), changes are a mandatory invention to get rid of our ping-pong document exchange via email/disc and to allow asynchronous simultaneous editing by multiple users across multiple applications, which ask: What did you change?
>
> But foremost, I would love to play around with some analysis of the ODF grammar in some GraphDB [2] <https://lists.apache.org/thread.html/810459b3eadbebe81ed2a9720a5467d1a5a3daa0da999baf02ce2305@%3Codf-dev.incubator.apache.org%3E> [3] <http://markmail.org/message/yjq7arijymspndiu>, to create some tooling to identify user components from the ODF RelaxNG.
>
> If you have some personal needs/roadmap you are capable of working on, I would love to hear about it.
> Otherwise, if you are without a personal need but like the basic idea described, I would love to have some assistance, for instance on the JIRA feature <https://issues.apache.org/jira/browse/ODFTOOLKIT-182> mentioned above. It is not an easy foe, though, likely more the BOSS at the end of the game level.. ;)
>
> Looking forward to receiving some feedback,
> Svante
>
>
> [1] Did you ever hear that the first draft version of the OpenDocument Format was called OpenOffice XML, but was changed as its naming was too close to one existing application?
> Based on this, Microsoft adapted "OpenOffice XML" to "Office Open XML". Some bright idea!
> :)
> [2] https://lists.apache.org/thread.html/810459b3eadbebe81ed2a9720a5467d1a5a3daa0da999baf02ce2305@%3Codf-dev.incubator.apache.org%3E
> [3] http://markmail.org/message/yjq7arijymspndiu

--
Cheers,
Ian C
