Re: [orbeon-user] Best practise for writing a converter

Erik Bruchez Mon, 07 Feb 2005 15:33:31 -0800

Eric van der Vlist wrote:

For instance, one of the basic features of an OpenOffice converter would
be to accept an OpenOffice document as a model, together with the new XML
content that replaces the corresponding content in the model.

This can be done by passing the location of the model in a config input
(as I think is the case for the Excel converter), but it could also
be done by passing the model itself as an input.

The second solution would be more flexible (it gives the possibility to
chain transformations of OpenOffice documents without having to
explicitly use temporary files).

Now, I would question the efficiency of Base64 encoding and decoding OpenOffice documents, which are zip files containing XML documents and pictures.

In theory, Base64 encoding and decoding shouldn't be too slow. In practice, we haven't measured the performance of the implementation we use, which comes from Apache. The big question is whether the time of encoding / decoding is significant compared to the other tasks.
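One way to answer that question is to time the Base64 round trip next to another task the converter has to do anyway, such as unzipping the package. The sketch below does this in Python on a synthetic zip; the function names, entry names and sizes are made up for illustration, and it measures Python's standard library rather than the Apache implementation mentioned above, so it only indicates the shape of such a measurement.

```python
import base64
import io
import time
import zipfile


def make_sample_package(n_entries=20, entry_size=50_000):
    """Build an in-memory zip that stands in for an OpenOffice document.

    The entry names and sizes are arbitrary test data, not a real package layout.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(n_entries):
            body = ("<p>paragraph %d</p>" % i) * (entry_size // 20)
            zf.writestr("part%d.xml" % i, "<doc>" + body + "</doc>")
    return buf.getvalue()


def time_base64_round_trip(data):
    """Encode then decode the bytes; return the decoded bytes and elapsed seconds."""
    start = time.perf_counter()
    encoded = base64.b64encode(data)
    decoded = base64.b64decode(encoded)
    return decoded, time.perf_counter() - start


def time_unzip(data):
    """Read every entry of the zip; return total uncompressed size and elapsed seconds."""
    start = time.perf_counter()
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        total = sum(len(zf.read(name)) for name in zf.namelist())
    return total, time.perf_counter() - start


if __name__ == "__main__":
    package = make_sample_package()
    _, base64_seconds = time_base64_round_trip(package)
    _, unzip_seconds = time_unzip(package)
    print("base64 round trip: %.6f s" % base64_seconds)
    print("unzip all entries: %.6f s" % unzip_seconds)
```

Running the same comparison against real documents, and against the actual encoder used by the processors, would show whether the encoding cost matters relative to the rest of the pipeline.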

Between these two options, which one would you recommend?

I think both ways can work. But if your processor accepts the model as a binary stream on an input, and you only have a URL available, you can still use the URL generator to produce that binary stream. Passing the model as an input is therefore the more flexible option, and the one I would prefer.

Or you could go the extra mile (or kilometer) and use the strategy of the Email processor: attachments are specified by URI, and a URI can be something like "oxf:/foo.jpg", but also "input:foo".
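The URI strategy amounts to dispatching on the scheme: "input:foo" means "read the processor input named foo", and anything else is handed to a regular resource loader. Here is a minimal Python sketch of that dispatch. The function name, the `inputs` dictionary and the `read_resource` callback are hypothetical and only illustrate the idea; this is not how the Email processor is actually implemented.

```python
from urllib.parse import urlsplit


def resolve_attachment(uri, inputs, read_resource):
    """Return the bytes designated by a URI.

    uri           -- e.g. "input:foo" or "oxf:/foo.jpg"
    inputs        -- hypothetical mapping of input names to bytes
    read_resource -- hypothetical callback that loads any other URI
    """
    parts = urlsplit(uri)
    if parts.scheme == "input":
        # "input:foo" designates the input named "foo".
        name = parts.path
        if name not in inputs:
            raise KeyError("no input named %r" % name)
        return inputs[name]
    # Everything else (oxf:, file:, http:, ...) goes to the resource loader.
    return read_resource(uri)
```

With this shape, the caller decides per document whether the model comes from a location or from an upstream processor, without the converter needing two separate code paths.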

There is also a variant (possible with both options) which would be to
totally expose the content of OpenOffice documents.

A converter from OpenOffice to XML would have one input (the OpenOffice
document) and one output per XML document composing the package. Vice
versa, a converter from XML to OpenOffice would have as many inputs as
there are documents, and one output for the OpenOffice document.

The downside is more pipeline work to do to connect all the inputs and
outputs, but I find that the additional flexibility could be worth the
pain and that this would give the possibility to work on all the
components of OpenOffice documents (this is needed, for instance if you
want to add pictures or change master styles or metadata in a document).

Another downside is that you have to know the names of the documents before creating the pipeline, unless you plan to generate the pipeline dynamically (which IMO should be done only as a last resort). Right?
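To make the "exposed package" variant concrete, here is a small Python sketch of the two directions: one zip in and one named part out per entry, then named parts in and one zip out. The function names are made up, and a real OpenOffice package may impose constraints (entry order, compression settings) that this sketch ignores. It also illustrates the naming issue: the entry names only become known once the package has been opened.

```python
import io
import zipfile


def unpack_package(data):
    """Split a zip package into a dict mapping entry name to bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def pack_package(parts):
    """Build a zip package from a dict mapping entry name to bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in parts.items():
            zf.writestr(name, content)
    return buf.getvalue()
```

A pipeline wired statically would have to hard-code names such as content.xml or styles.xml as its outputs, whereas pictures and other variable entries are only discoverable from the dictionary keys at run time.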

-Erik


