Hi Xiuzhi,
I am very happy that you and your team from 'RedOffice' have decided to
join the OpenOffice.org community. I am even more excited that you wish
to contribute work in the area of XML-based filters.
As we discussed earlier, I am going to post all the information needed
to work in this area publicly on our xml-dev list, so that anyone in
the community can jump in.
In case you have not yet decided which exact field or problem you and
your team are going to work on, I will point out some areas that could
be improved and where your help would be most welcome, as I am running
out of time:
1. Filter detection:
Every time a document is imported into StarOffice / OpenOffice.org,
the filter detection chooses the most appropriate of the existing
filters.
Currently the correct XML-based filter is chosen by the DocType
string provided in the XML Filter Settings dialog: the detection
tries to find that string within the first 1000 characters of the
file.
A filter detection based on the XML root node and the XML namespace
would be much better. Other possible document loading scenarios
still have to be evaluated.
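To illustrate the idea, here is a minimal sketch in Python (not the
Office code itself) of a detection keyed on the root element and its
namespace. The FILTERS table and its filter names are made up for this
example; only the namespace URIs are real.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical detection table: (namespace URI, root local name) -> filter.
# The filter names are invented for this sketch.
FILTERS = {
    ("urn:oasis:names:tc:opendocument:xmlns:office:1.0", "document"):
        "OpenDocument (flat XML)",
    ("http://docbook.org/ns/docbook", "article"): "DocBook article",
}


def root_signature(xml_bytes):
    """Return (namespace URI, local name) of the root element."""
    # The first "start" event reported by iterparse is the root element,
    # so we return as soon as we have seen it.
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local_name = tag[1:].split("}", 1)
        else:
            namespace, local_name = "", tag
        return namespace, local_name
    return None


def detect_filter(xml_bytes):
    """Return the matching filter name, or None if nothing matches."""
    try:
        return FILTERS.get(root_signature(xml_bytes))
    except ET.ParseError:
        # Not well-formed XML: leave it to other (non-XML) filters.
        return None
```

Unlike a search for the DocType string, this does not depend on where
the declaration happens to sit within the first 1000 characters.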
2. Storage of embedded document content:
This concerns saving the embedded content of an Office document.
For example, graphics might be unpacked into a folder (similar to
browser behavior, e.g. as in Firefox).
3. Logging:
Currently logging is comparatively weak for the XML-based / XSLT
filters. The only way to enable logging is to set a Java system
property (e.g.
-DXSLTransformer.statsfile=/usr/local/offices/xslt_debug.txt) in
the Office options for Java.
A few new features are conceivable, such as:
* Customizing the filter logging via the GUI
* Usage of defined log levels (analogous to Java Logging)
* A GUI flag for a cumulative log file (instead of replacing the
log for every transformation). This is required for logging test
scenarios that transform multiple files.
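As a rough sketch of the last two points, the following Python snippet
shows a log level and a cumulative flag on a log file. The function
name, the logger name and the parameters are assumptions made up for
this example; the real implementation would sit on the Java side of
the filter.

```python
import logging


def make_filter_logger(path, level="INFO", cumulative=True):
    """Return a logger writing to `path` (hypothetical helper).

    cumulative=True appends to an existing log file, cumulative=False
    replaces it, which mirrors the proposed GUI flag.
    """
    logger = logging.getLogger("xslt_filter_sketch")
    logger.setLevel(getattr(logging, level))
    logger.propagate = False
    # Drop handlers left over from an earlier configuration.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(
        path, mode="a" if cumulative else "w", encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With cumulative=True, a test run that transforms several files ends up
with one log covering all of them.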
4. Validation:
Validation should be used during import/export.
Currently validation is only possible from a test dialog. In the
case of an export filter, the result is validated against a
user-provided DTD, and in the case of an import filter against the
already bundled OpenOffice.org XML DTD [1].
This existing test scenario might be dropped in favor of external
development tools. Instead, an (optional) validation at runtime
should be possible, to allow a variety of user scenarios:
* Turn on schema validation for larger customer/field tests,
such as for a StarOffice beta release
* Schema validation against a subset of an existing schema
(e.g. a more restricted OpenDocument format). This can be used
to control the validity of the filter's input document by
proving the existence of certain data or structure. For
example, a restriction on the styles being used, or a check
whether the document being processed satisfies a required
structure, as for a certain legal document.
To establish this, it is not sufficient to reuse the existing DTD
functionality; it has to be expanded at least to cover the schema
language of the OpenDocument default format (Relax NG). To be more
flexible for the market, arbitrary schema languages such as DTD,
XML Schema and Relax NG should be usable, by means of a conversion
tool that makes them compatible with one another.
The MSV (Multi Schema Validator) would be an option,
https://msv.dev.java.net/
As even the most powerful schema language (Relax NG) has its
limitations, it might be desirable to use the ISO standard
Schematron, too.
It is essentially based on assertions about certain document
content (selected by XPath).
This gives the user validation against arbitrary, even highly
complex business logic that no other schema language would be able
to handle.
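To make the assertion idea concrete, here is a small Python sketch.
It is not Schematron and uses no Schematron library; it only imitates
the principle with the limited XPath subset of ElementTree. The rules
and the element names (contract, party, clause) are invented for this
example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context element, path that must match inside it,
# message reported when the assertion fails).
RULES = [
    ("contract", "party", "a contract must name at least one party"),
    ("contract", "clause[@id]", "a contract needs a clause carrying an id"),
]


def check_assertions(xml_text, rules=RULES):
    """Return the messages of all assertions that fail for the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test_path, message in rules:
        # root.iter() also visits the root element itself.
        for element in root.iter(context):
            if element.find(test_path) is None:
                failures.append(message)
    return failures
```

A document passes when the returned list is empty; each entry
otherwise names a rule that the document violates, which is the kind
of report a structure check for a legal document would need.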
Are you interested in one of these areas, or have you perhaps found
another one you would like to work on?
Maybe you ran into constraints during your work on the XSLT filter
transforming the UOF format to OpenDocument and now want to solve them?
Or possibly you are seeking a real challenge? For example, the most
advanced task would be the redesign of the XML Filter Settings dialog,
as GUI implementations are involved.
I am looking forward to your answer.
Kind Regards,
Svante
[1] DTD was the schema language used earlier for the StarOffice 7 format
(OpenOffice.org XML); the new XML format for StarOffice 8 (OpenDocument
format) is instead based on the more powerful RELAX NG schema language.
--
Svante Schubert <[EMAIL PROTECTED]> Sun Microsystems
Software Engineer - StarOffice Nagelsweg 55
Phone: +49 40 23646 965 D-20097 Hamburg
Fax: +49 40 23646 550 http://www.sun.com/staroffice