Hi Xiuzhi,

I am very happy that you and your team from 'RedOffice' have decided to
join the OpenOffice.org community, and even more excited that you wish
to contribute to the XML-based filter area.
As we discussed earlier, I will post all the information needed to work
in this area publicly on our xml-dev list, so that anyone in the
community can jump in.

In case you have not yet decided which exact field or problem you and
your team are going to work on, here are some areas that could be
improved and where your help would be most welcome, as I am running out
of time:

 1. Filter detection:
    Every time a document is imported into StarOffice /
    OpenOffice.org, the filter detection chooses the most appropriate
    of the existing filters.
    Currently the correct XML-based filter is chosen by the DocType
    string provided in the XML Filter Settings dialog: the detection
    searches for that string in the first 1000 characters of the file.
    A filter detection based on the XML root node and XML namespace
    would be much better. Other possible document loading scenarios
    still have to be evaluated.

 2. Storage of embedded document content:
    When an Office document is saved, its embedded content could be
    stored separately. E.g. graphics might be unpacked into a folder
    (similar to browser behavior, e.g. as in Firefox).

 3. Logging:
    Currently logging is comparatively weak for the XML-based / XSLT
    filters. The only way to enable logging is to set a Java
    system property (e.g.
    -DXSLTransformer.statsfile=/usr/local/offices/xslt_debug.txt) in
    the Office options for Java.
    A few new features are imaginable, such as:
        * Customizing the filter logging via the GUI
        * Use of defined log levels (analogous to Java Logging)
        * A GUI flag for a cumulative log file (instead of replacing
          the log for every transformation). This is required for
          logging test scenarios that transform multiple files.

 4. Validation:
    Validation should be used during import/export.
    Currently validation is only possible from a test dialog. For an
    export filter the result is validated against a user-provided
    DTD, for an import filter against the already bundled
    OpenOffice.org XML DTD [1].
    This existing test scenario might be dropped in favor of external
    development tools. Instead, an (optional) validation at runtime
    should be possible to allow a variety of user scenarios:
        * Turning on schema validation for larger customer/field
          tests, e.g. for a StarOffice beta release
        * Schema validation against a subset of an existing schema
          (e.g. a more restricted OpenDocument format). This could be
          used to check the validity of the filter's input document
          by verifying that certain data or structures exist. For
          example, restricting the styles being used, or checking
          whether the document being processed satisfies a required
          structure, as for a certain legal document.

    To establish this, it is not sufficient to reuse the existing DTD
    functionality; it has to be extended at least to the schema of the
    OpenDocument default format (RELAX NG). To be more flexible for the
    market, arbitrary schemas such as DTD, XML Schema and RELAX NG
    should be usable by means of a conversion tool that makes them
    compatible with one another.
    The MSV (Multi-Schema Validator) would be an option:
    https://msv.dev.java.net/
    As even the most powerful schema language (RELAX NG) has its
    limitations, it might be desirable to use the ISO standard
    Schematron, too. It is basically built on assertions about
    certain document content (selected by XPath).
    This gives the user validation against arbitrary, even highly
    complex business logic that no other schema language would be
    able to express.
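
As a rough illustration of item 1 (filter detection), here is a minimal
sketch of detection by root element and namespace instead of searching
for a DocType string. It is written in Python only for brevity and is
not the Office implementation. The two namespace URIs are the office
namespaces of OpenDocument and OpenOffice.org XML; the filter names and
the function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical table: (namespace URI, root local name) -> filter name.
# The filter names are made up for this sketch.
FILTERS = {
    ("urn:oasis:names:tc:opendocument:xmlns:office:1.0", "document"):
        "OpenDocument (flat XML)",
    ("http://openoffice.org/2000/office", "document"):
        "OpenOffice.org XML (flat)",
}


def detect_root(stream):
    """Return (namespace URI, local name) of the root element.

    Returns None if no root element can be parsed from the stream.
    """
    try:
        # The first "start" event belongs to the root element, so we can
        # stop right there without building the whole tree.
        for _event, elem in ET.iterparse(stream, events=("start",)):
            tag = elem.tag
            if tag.startswith("{"):
                namespace, local = tag[1:].split("}", 1)
                return namespace, local
            return "", tag  # root element without a namespace
    except ET.ParseError:
        return None
    return None


def choose_filter(stream):
    """Return the filter name for the stream, or None if none matches."""
    return FILTERS.get(detect_root(stream))
```

For example, choose_filter(io.BytesIO(data)) returns None both for input
that is not XML and for XML with an unknown root element, so the caller
can fall back to the other detection mechanisms.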
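
To make the logging ideas of item 3 more concrete, the following sketch
shows a configurable log level and a cumulative log file that is
appended to instead of being replaced for every transformation. It uses
Python's standard logging module purely as a stand-in for Java Logging;
the logger name, the function names and the message texts are
assumptions for illustration.

```python
import logging


def make_filter_logger(log_path, level=logging.INFO, cumulative=False):
    """Configure the (hypothetical) filter logger.

    cumulative=True appends to an existing log file,
    cumulative=False replaces it.
    """
    logger = logging.getLogger("xsltfilter")  # hypothetical logger name
    # Drop handlers of an earlier configuration so they do not pile up.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(
        log_path, mode="a" if cumulative else "w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep the output in the filter log only
    return logger


def log_transformation(log_path, document_name,
                       level=logging.INFO, cumulative=False):
    """Log one (simulated) transformation and release the log file."""
    logger = make_filter_logger(log_path, level, cumulative)
    logger.info("transforming %s", document_name)
    logger.debug("details for %s", document_name)
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
```

A test run over several files would call log_transformation with
cumulative=True for each file and would find all entries in one log
afterwards; the level and the cumulative flag are the two values a GUI
option would have to provide.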
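
The assertion idea behind Schematron mentioned in item 4 can be
sketched as follows. This is not a Schematron implementation: it only
uses the limited XPath subset of Python's ElementTree, and the rules as
well as the element names (contract, party, signature) are hypothetical,
loosely following the legal document example above.

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, path that must match below each context
# element, message reported on failure). All rules are hypothetical.
RULES = [
    (".//contract", "party", "a contract needs at least one party"),
    (".//contract", "signature", "a contract needs a signature"),
]


def check_assertions(xml_text, rules=RULES):
    """Return the messages of all failed assertions.

    An empty list means that every assertion holds.
    """
    root = ET.fromstring(xml_text)
    failures = []
    for context_path, required_path, message in rules:
        # ".//name" selects descendants of the root, not the root itself.
        for element in root.findall(context_path):
            if element.find(required_path) is None:
                failures.append(message)
    return failures
```

A filter could run such a check on its input document and refuse the
transformation, or just write the messages to the log, when the
returned list is not empty.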

Are you interested in one of these areas, or have you perhaps found
another one you would like to work on?

Maybe you ran into limitations during your work on the XSLT filter
transforming the UOF format to OpenDocument and now want to solve them?

Or are you perhaps looking for a real challenge? The most advanced task
would be the redesign of the XML Filter Settings dialog, as GUI
implementations are involved.

I am looking forward to your answer.

Kind Regards,
Svante


[1] DTD was the schema language used earlier for the StarOffice 7
    format (OpenOffice.org XML). The new XML format for StarOffice 8
    (OpenDocument format) is instead based on the more powerful
    RELAX NG schema language.


--
Svante Schubert <[EMAIL PROTECTED]>             Sun Microsystems
Software Engineer - StarOffice                            Nagelsweg 55
Phone:  +49 40 23646 965                               D-20097 Hamburg
Fax:    +49 40 23646 550                 http://www.sun.com/staroffice

