Re: Is it ok to use on production?

Svante Schubert Thu, 05 Jul 2018 04:32:53 -0700

 Hello Emerson,

2018-07-05 5:39 GMT+02:00 [email protected] <[email protected]>:


> Hello all,
>
> I am analyzing ODTToolkit in order to replace Aspose in a project to
> generate .odt files, and I have some doubts.
>

It is always valid to have doubts. Let me try to address those by guiding
you through the basic architecture and design decisions.

The core piece of the ODF Toolkit is the ODFDOM library.
All ODF documents can be loaded and saved without any loss of
information, as all XML is stored in a DOM tree. A DOM tree allows
the generic expansion of the ODF document model even if there are
custom XML nodes within the document.
In addition, it enables any XML technology, such as XPath queries on the tree or
running XSLT scripts directly on the ODF document (using the XSLT Runner
<http://incubator.apache.org/odftoolkit/xsltrunner/ODFXSLTRunner.html>).
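To make the "ZIP package plus DOM plus XPath" idea concrete, here is a minimal sketch that uses only JDK classes and deliberately not the ODFDOM API. The class name OdfXPathSketch and its methods are made up for illustration, and the in-memory package holds only a content.xml (a real .odt also carries mimetype, styles.xml, META-INF/manifest.xml and so on).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical illustration class; not part of the ODF Toolkit.
public class OdfXPathSketch {
    public static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    public static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Builds a tiny stand-in for an .odt package: a ZIP holding only content.xml. */
    public static byte[] buildSamplePackage() throws Exception {
        String xml = "<office:document-content xmlns:office='" + OFFICE_NS
            + "' xmlns:text='" + TEXT_NS + "'><office:body><office:text>"
            + "<text:p text:style-name='P1'>Hello</text:p>"
            + "<text:p text:style-name='P2'>World</text:p>"
            + "</office:text></office:body></office:document-content>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Finds content.xml inside the package and parses it into a namespace-aware DOM. */
    public static Document loadContentDom(byte[] odfPackage) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(odfPackage))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equals("content.xml")) {
                    // Copy the entry first so the parser cannot close the ZIP stream.
                    byte[] entry = zip.readAllBytes();
                    return factory.newDocumentBuilder().parse(new ByteArrayInputStream(entry));
                }
            }
        }
        throw new IllegalArgumentException("package has no content.xml");
    }

    /** Runs the XPath query //text:p and returns the text of every paragraph. */
    public static List<String> paragraphTexts(Document dom) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "text".equals(prefix) ? TEXT_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate("//text:p", dom, XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(paragraphTexts(loadContentDom(buildSamplePackage())));
    }
}
```

With ODFDOM you would not unzip by hand, since the package layer does that for you; the sketch only shows why a DOM-backed model makes plain XPath available.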

The source code of the DOM is generated directly from the grammar of
ODF, the ODF RelaxNG schema
<http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng>.
But it became apparent early on that customers needed an easy
user view: some higher-level API that hides the
underlying verbose XML syntax. We at Sun Microsystems called it the DOC
API, as we already had the underlying PKG API (package layer) dealing with
the ZIP storing the XML and images, the signatures and encryption, and
of course the generated classes of the DOM API.
There was some disagreement on the project between some IBM and Sun
developers, and IBM forked and created the Simple API. When Oracle (after
buying Sun) donated the ODF Toolkit sources to Apache, IBM rejoined and
the Simple API became a redundant piece. Therefore we deprecated the DOC API in
favor of the Simple API.

Nowadays, I believe it was not the perfect choice, as there are several source
code redundancies between the Simple API and ODFDOM, and from a strategic view I
would like to generate more than is currently possible from the ODF RelaxNG
grammar instead of writing it by hand.
For instance, the grammar states that a paragraph (<text:p> XML element)
might have a @text:style-name attribute, but it does not mention that it
is always related to some @style:name of a style element of the styles. In
addition, when there is an XML parent element having multiple XML child
elements with an ID, the source code generator should provide the creation
of a lazy Map. For styles, for example, the ID would be the style:name
& style:family, which are typed as String and not as IDs, but this
could be declared in some configuration file for generation, as we already
have one config file
<https://github.com/apache/odftoolkit/blob/trunk/generator/schema2template/src/main/resources/examples/odf/config.xml>
for all the "default values" and the "style:family<->stylable element
relation". The default values are extracted directly from the ODF
specification by XSLT
<https://github.com/apache/odftoolkit/blob/trunk/xslt-runner/sample_xslt/getAttributeDefaults.xsl>.
Therefore, I am working to generate as much as possible from the ODF XML
RelaxNG schema and additional information (e.g. styles linking, maps, etc.),
including which XML elements are the roots of the entities known to customers, such as
paragraph, image, table, etc.
For this reason, I have started to experiment with the Apache TinkerPop
graph database using the Gremlin graph scripting language. I got funding
for the upcoming winter from the German government to work further in this
area.
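
The lazy Map for styles mentioned above could look roughly like the following hand-written sketch. It is not generated code and not the ODFDOM API: the class LazyStyleMapSketch, its methods and the "family/name" key format are assumptions made for illustration, built on the plain JDK DOM.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the ODF Toolkit.
public class LazyStyleMapSketch {
    public static final String STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0";

    private final Document dom;
    private Map<String, Element> stylesByKey; // stays null until the first lookup

    public LazyStyleMapSketch(Document dom) {
        this.dom = dom;
    }

    /** The combined ID: style:family plus style:name (key format is an assumption). */
    static String key(String family, String name) {
        return family + "/" + name;
    }

    public boolean isIndexed() {
        return stylesByKey != null;
    }

    /** Looks up a style; the index over all style:style elements is built on first use. */
    public Element findStyle(String family, String name) {
        if (stylesByKey == null) {
            stylesByKey = new HashMap<>();
            NodeList styles = dom.getElementsByTagNameNS(STYLE_NS, "style");
            for (int i = 0; i < styles.getLength(); i++) {
                Element style = (Element) styles.item(i);
                stylesByKey.put(
                    key(style.getAttributeNS(STYLE_NS, "family"),
                        style.getAttributeNS(STYLE_NS, "name")),
                    style);
            }
        }
        return stylesByKey.get(key(family, name));
    }

    /** Parses an XML string into a namespace-aware DOM. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document dom = parse("<styles xmlns:style='" + STYLE_NS + "'>"
            + "<style:style style:name='Standard' style:family='paragraph'/>"
            + "<style:style style:name='Standard' style:family='text'/>"
            + "</styles>");
        LazyStyleMapSketch styles = new LazyStyleMapSketch(dom);
        Element found = styles.findStyle("text", "Standard");
        System.out.println(found.getAttributeNS(STYLE_NS, "family"));
    }
}
```

The same style name can exist in several families, which is why the key combines both attributes. A generated version would also need to keep the map in sync when styles are added or removed, which this sketch leaves out.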


> 1- Is there a final distribution that is ready to use on production
> projects?
> I downloaded all available... there are ones with incubating code, some
> incomplete, and some with all dependencies needed.
>

The latest sources in the repository are the latest production version.
Just as you do not download a new version of Chrome but have it
updated according to their sprints, I no longer believe in version
numbers and yearly software cycles.
There are regression tests for the major features and code reviews for every
patch.
There is a JAR with all dependencies included to ease use from the command line.
The "incubating" prefix comes from Apache, as there have not been enough developers in
the past years to let it become a top-level project. I personally would add
some features and change some designs before I would release a 1.0; still,
the product has been used for years by Sun/Oracle, IBM and Open-XChange on
the back-end to modify ODF documents. But too often they forked and did
not contribute back.

I am a freelancer from Berlin and, if you need it, I can provide support
and guidance on a professional level, addressing the problems that are
important to you.

>
> 2- On OpenOffice, we are able to write a text, open Character properties
> and apply a border around it and apply zero padding.
> And we can see the border and padding applied to the text-properties in
> the content.xml:
> <style:text-properties loext:shadow="none" loext:border="0.06pt solid
> #000000" loext:padding="0in" officeooo:paragraph-rsid="00129e44"
> officeooo:rsid="00129e44"/>
>
> But I could not find a way to apply border and padding using the Text
> related classes, but only the Paragraph ones, and even using margin and
> other stuffs it is hard to get the same result.
>
> Is there a way to have the border and padding applied to a text with
> ODFdom or SimpleODF APIs?
>

If it works with any ODF application and the information is stored in
the ODF XML - which is likely for OpenOffice - it is possible to address it
with ODFDOM, as it allows full access to the XML.
I am not aware of it in the Simple API; I suggest first taking a look at
the regression tests.
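
As a rough sketch of the DOM-level route, the following writes the same text-properties as in the quoted content.xml and attaches them to a text span. It uses only the plain JDK DOM interfaces rather than ODFDOM's own classes; the class TextBorderSketch, its methods and the style name T1 are made up for illustration, and the loext namespace URI is my assumption of what LibreOffice declares in the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the ODF Toolkit.
public class TextBorderSketch {
    public static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    public static final String STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0";
    public static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    // Assumption: the LibreOffice extension namespace behind the loext: prefix.
    public static final String LOEXT_NS =
        "urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0";

    /** A minimal content.xml stand-in with one paragraph and an empty automatic-styles. */
    public static Document sampleContent() throws Exception {
        String xml = "<office:document-content xmlns:office='" + OFFICE_NS
            + "' xmlns:style='" + STYLE_NS + "' xmlns:text='" + TEXT_NS
            + "' xmlns:loext='" + LOEXT_NS + "'>"
            + "<office:automatic-styles/>"
            + "<office:body><office:text><text:p>Plain </text:p></office:text></office:body>"
            + "</office:document-content>";
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Adds an automatic text style carrying the border and padding from the question. */
    public static Element addBorderedTextStyle(Document dom, String styleName) {
        Element autoStyles =
            (Element) dom.getElementsByTagNameNS(OFFICE_NS, "automatic-styles").item(0);
        Element style = dom.createElementNS(STYLE_NS, "style:style");
        style.setAttributeNS(STYLE_NS, "style:name", styleName);
        style.setAttributeNS(STYLE_NS, "style:family", "text");
        Element props = dom.createElementNS(STYLE_NS, "style:text-properties");
        props.setAttributeNS(LOEXT_NS, "loext:border", "0.06pt solid #000000");
        props.setAttributeNS(LOEXT_NS, "loext:padding", "0in");
        style.appendChild(props);
        autoStyles.appendChild(style);
        return style;
    }

    /** Appends a text:span that references the given text style to a paragraph. */
    public static Element appendStyledSpan(Element paragraph, String styleName, String text) {
        Element span = paragraph.getOwnerDocument().createElementNS(TEXT_NS, "text:span");
        span.setAttributeNS(TEXT_NS, "text:style-name", styleName);
        span.setTextContent(text);
        paragraph.appendChild(span);
        return span;
    }

    public static void main(String[] args) throws Exception {
        Document dom = sampleContent();
        addBorderedTextStyle(dom, "T1");
        Element p = (Element) dom.getElementsByTagNameNS(TEXT_NS, "p").item(0);
        appendStyledSpan(p, "T1", "boxed");
        System.out.println(p.getTextContent());
    }
}
```

Whether a given ODF application renders these loext attributes is up to that application; the sketch only mirrors the XML that OpenOffice wrote in the question.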



> 3- In case I have a left-aligned text, is it possible to get the position
> of first character of it, so that I can set the Indent (First Line or
> Before Text at OpenOffice).
>

Yes ;-)
What I usually do when I need to investigate a feature in ODF XML (same for
OOXML) is to save the document before and after the feature is enabled and
compare the two, to understand the XML involved.
Once you have located the XML nodes, you will find further information in the ODF
specification (I prefer to use the HTML version of the spec)
<http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html>.
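
The before/after comparison can also be automated. Here is a small sketch using only the JDK; the class OdfXmlDiffSketch and the "element/@attribute=value" fact format are made up for illustration. The two sample strings stand in for the same paragraph style saved before and after setting the indents. As far as I know, fo:margin-left ("Before Text") and fo:text-indent ("First Line") are the attributes involved, but please verify that with your own before/after documents.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the ODF Toolkit.
public class OdfXmlDiffSketch {
    static final String NS = " xmlns:style='urn:oasis:names:tc:opendocument:xmlns:style:1.0'"
        + " xmlns:fo='urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0'";

    // Stand-ins for the same style as saved before and after enabling the feature.
    public static final String BEFORE = "<style:style" + NS
        + " style:name='P1' style:family='paragraph'>"
        + "<style:paragraph-properties fo:text-align='start'/></style:style>";
    public static final String AFTER = "<style:style" + NS
        + " style:name='P1' style:family='paragraph'>"
        + "<style:paragraph-properties fo:text-align='start'"
        + " fo:margin-left='0.5in' fo:text-indent='0.25in'/></style:style>";

    /** Flattens a document into "element/@attribute=value" strings, skipping xmlns declarations. */
    public static Set<String> attributeFacts(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document dom = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        Set<String> facts = new TreeSet<>();
        NodeList elements = dom.getElementsByTagNameNS("*", "*");
        for (int i = 0; i < elements.getLength(); i++) {
            Element element = (Element) elements.item(i);
            NamedNodeMap attributes = element.getAttributes();
            for (int j = 0; j < attributes.getLength(); j++) {
                Node attribute = attributes.item(j);
                if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                    continue; // namespace declarations are not features
                }
                facts.add(element.getNodeName() + "/@" + attribute.getNodeName()
                    + "=" + attribute.getNodeValue());
            }
        }
        return facts;
    }

    /** Returns the attribute facts present in the "after" XML but not in the "before" XML. */
    public static Set<String> addedFacts(String before, String after) throws Exception {
        Set<String> added = attributeFacts(after);
        added.removeAll(attributeFacts(before));
        return added;
    }

    public static void main(String[] args) throws Exception {
        addedFacts(BEFORE, AFTER).forEach(System.out::println);
    }
}
```

This only compares attributes; new or removed elements and changed text would need a fuller diff, but for locating a single formatting feature the attribute view is usually a good first step.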

> Thanks for any help in advance.
> Emerson Roman
>

To summarize the answer to your question:
Yes, production usage is not only valid; the library has matured
over the years and has long been used in production by several
companies.

Certainly, there are still things to improve, there always will be, like
dropping the DOM with its XML support in favour of a lower memory
footprint. As this is generated code it should be even possible to offer
both options by providing two different generated libraries.
Finally, I am certain Aspose as a company is far beyond in ODF know-how.
Several OASIS ODF technical committee (TC) members are working on this
project (including me), and the upcoming collaboration approach of
dispatching user changes instead of documents is unique. Remember, sending
documents for collaboration is as clever as zipping your source code and
sending this. ;-)

If you are still not convinced, we might even have a short call, where you
can ask anything I might have missed, although the list is preferred so
others can get the answers as well.

Cheers,
Svante