Hi Arved,

> What are your recommendations for someone to come up to speed with RTF?

I'd recommend staying away from it unless you really have to ;-)
Seriously, to someone accustomed to clear and well-defined specs, RTF is 
somewhat messy. What it really is is a documented internal format, not a 
spec that has been agreed upon by a carefully selected committee.

The RTF spec that we use in jfor is (mostly) V1.5 from Microsoft, who have 
since moved on to 1.6 (at least), but 1.5 is apparently the most widely 
supported version. A Google search shows it at 
http://www.dubois.ws/software/RTF; it might be harder to find at Microsoft 
as it's not the latest.

The rtflib package of jfor (available at www.jfor.org) encapsulates our 
knowledge of RTF and is fairly simple and understandable, but it is still 
too element-oriented.
One important thing to realize (we realized it too late here) is that RTF is 
more flow-based or stack-based than element-based: not everything that is 
opened has to be closed; it's more like a flow with embedded attribute 
changes.
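
To make the flow vs. element point concrete, here is a minimal sketch in 
Java. The class and method names are made up for this mail and are not 
rtflib code; the RTF itself only uses \b, \b0 and groups, and a real 
document would normally also carry a font table and so on.

```java
// Illustration only: hypothetical names, not part of jfor's rtflib.
final class RtfFlowSketch {

    /** Flow style: bold is switched by control words; nothing gets "closed". */
    static String flowStyle() {
        StringBuilder rtf = new StringBuilder();
        rtf.append("{\\rtf1\\ansi ");    // opens the document group
        rtf.append("plain ");
        rtf.append("\\b bold ");         // \b turns bold on from here in the flow
        rtf.append("\\b0 plain again");  // \b0 turns it off again, no end tag involved
        rtf.append("}");                 // closes the document group
        return rtf.toString();
    }

    /** Stack style: braces save and restore the formatting state. */
    static String groupStyle() {
        // Bold stops at the inner closing brace because the previous state is restored.
        return "{\\rtf1\\ansi plain {\\b bold} plain again}";
    }

    public static void main(String[] args) {
        System.out.println(flowStyle());
        System.out.println(groupStyle());
    }
}
```

Both strings should display the same way in a wordprocessor, which is why 
an element-oriented model (every start has a matching end) only covers part 
of what RTF allows.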

> As I understand it, RTF is presented
> to a user-agent which does a fair amount of layout; higher-level structures
> are still present in the RTF. 

Right - but there are both structure and presentation codes, so an RTF 
document could be both. 
Jfor has a strong bent towards structure, as usually the user's goal is to 
get an editable RTF document, in which as much of the original document 
structure as possible must be preserved for convenience. 
Precise appearance usually comes second, as applying a new wordprocessor 
style sheet can change a lot of it.

RTF is both a presentation and a structure format, and also a moving target, 
as the "spec" gets expanded and rewritten with nearly every new version of 
winword. 
There are many grey areas in the spec, meaning the only reliable test is 
opening the generated RTF in the target wordprocessors (and often watching 
them crash...).

> <snip>
> This is not so different from MIF

Agreed. We are working with MIF for another project, and didn't choose FOP 
for that because it lacks precise control over the MIF output.

I tend to see these formats as follows:

- PDF for finished high-quality output ("presentation language"), layout 
  100% done by FOP.

- MIF for semi-finished high-quality output ("typography language"), layout 
  done by FrameMaker according to MIF instructions.

- RTF for editable structure + presentation output ("wordprocessing 
  language"), layout done by the wordprocessor.

So I fully agree that MIF and RTF "renderers" have a lot in common: 
they must be able to get as much information as possible about the original 
document structure, and in my view they do not need any layout computations.

> In a sense with RTF and MIF (and HTML for anyone who really desperately
> wants to see FO->HTML) we are talking about translators as opposed to
> formatters and renderers...

Yes - that's why I called jfor a "converter" instead of a "formatter".

Without knowing too much about FOP internals, I think a processing chain 
along these lines might help:

parsing if needed
-> SAX events
-> FO attributes processing (validation, inheritance) 
-> StructureRenderer

StructureRenderer is
EITHER Layout + PrintRenderer
OR StructureProcessor (RTF, MIF, etc.)

What we need to find out is how much the existing FOP and these "structure 
renderers" have in common.
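
To show what I mean by the two branches, here is a rough Java sketch. All 
type and method names are invented for this mail (I don't know the FOP 
internals, and this is not jfor code either): the events are assumed to 
arrive after attribute processing, and the layout branch is only a 
placeholder that counts blocks.

```java
// Hypothetical sketch: none of these types exist in FOP or jfor under these names.
final class ChainSketch {
    /** Stands in for "parsing -> SAX events -> FO attributes processing". */
    static void feed(StructureRenderer target) {
        target.startElement("fo:block");
        target.characters("Hello");
        target.endElement("fo:block");
    }

    public static void main(String[] args) {
        RtfStructureProcessor rtf = new RtfStructureProcessor();
        feed(rtf);
        System.out.println(rtf.result());
    }
}

/** Common entry point: receives the already-validated FO structure. */
interface StructureRenderer {
    void startElement(String foName);
    void characters(String text);
    void endElement(String foName);
}

/** EITHER branch: would run layout, then hand areas to a print renderer. */
final class LayoutAndPrintRenderer implements StructureRenderer {
    int blocksQueuedForLayout = 0; // placeholder for real layout work

    public void startElement(String foName) {
        if ("fo:block".equals(foName)) blocksQueuedForLayout++;
    }
    public void characters(String text) { }
    public void endElement(String foName) { }
}

/** OR branch: translates structure directly, with no layout computations. */
final class RtfStructureProcessor implements StructureRenderer {
    private final StringBuilder out = new StringBuilder("{\\rtf1\\ansi ");

    public void startElement(String foName) { }

    public void characters(String text) {
        // Escape the three characters that are special in RTF, backslash first.
        out.append(text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}"));
    }

    public void endElement(String foName) {
        if ("fo:block".equals(foName)) out.append("\\par "); // end of paragraph
    }

    String result() { return out + "}"; }
}
```

The point of the sketch is only that everything upstream of 
StructureRenderer could be shared, while the converter branch never needs to 
know about pages or areas.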

- Bertrand
