It would be good if we could migrate this thread to fop-dev. Please
subscribe there if you're not already. Thanks.

On 30.05.2006 08:13:50 b.ohnsorg wrote:
> 
> ----- Original Message --------
> 
> Subject: Re: RTF, nested tables, context enhancement - status
> Sent: Fri 26 May 2006 16:42:13 CEST
> From: "Jeremias Maerki"<[EMAIL PROTECTED]>
> 
> > 
> > On 23.05.2006 08:02:50 b.ohnsorg wrote:
> > > 
> > > ----- Original Message --------
> > > 
> > > Subject: Re: RTF, nested tables, context enhancement - status
> > > Sent: Mon 22 May 2006 22:30:07 CEST
> > > From: "Jeremias Maerki"<[EMAIL PROTECTED]>
> > > 
> > > > page-number-citation and page numbering: Both work to at least a
> > > > certain degree. Can you elaborate on the problems you're seeing?
> > > RTF does not know any «citation», so you need to write the character '2'
> > > if you want a link to display page information referring to page '2'.
> > > For example, an index at the end of the document does not know which
> > > page explains «Barcode», and all links will show a question mark.
> > 
> > That's not correct. RTF has support for page number citations. You
> > should take a look at the RTF specification and create an RTF file in MS
> > Word with references so you see how this should be implemented.
> If someone has a reliable resource, let me know. I don't own any
> Micros~1 component and used OpenOffice documents instead. They contain
> fields like MS-Office, but page numbers are hard coded. (tried cross
> referencing and the TOC mechanism)

You can get it from M$:
RTF 1.7 (MS Word 2002):
http://www.microsoft.com/downloads/details.aspx?FamilyID=e5b8ebc2-6ad6-49f0-8c90-e4f763e3f04f&DisplayLang=en
(that's what I currently use)

RTF 1.8 (MS Word 2003):
http://www.microsoft.com/downloads/details.aspx?familyid=AC57DE32-17F0-4B46-9E4E-467EF9BC5540&displaylang=en

Don't forget to have an MS Word (or Word Viewer) installed somewhere.
The specification alone doesn't really help. OpenOffice is useless for
this.
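
To give a rough idea of what to look for in a Word-generated file: as far
as I can tell, a page number citation is a bookmark around the target plus a
PAGEREF field that points at it. Below is a minimal sketch in Java. The
class and method names are made up for illustration and are not part of
FOP; the control words (\field, \fldinst, \fldrslt, \bkmkstart, \bkmkend)
are from the RTF specification. The "?" is only a placeholder result that
Word should replace once it updates the field.

```java
public class PageRefSketch {

    // Wraps the target text in a bookmark. The name is assumed to be a
    // valid Word bookmark name (starts with a letter, no spaces).
    static String bookmark(String name, String text) {
        return "{\\*\\bkmkstart " + name + "}" + text + "{\\*\\bkmkend " + name + "}";
    }

    // Emits a PAGEREF field pointing at the bookmark. The field result
    // ("?") is a placeholder until the word processor updates the field.
    static String pageRef(String name) {
        return "{\\field{\\*\\fldinst PAGEREF " + name + " }{\\fldrslt ?}}";
    }

    public static void main(String[] args) {
        StringBuilder rtf = new StringBuilder("{\\rtf1\\ansi ");
        rtf.append(bookmark("barcode", "Barcode")).append("\\par ");
        rtf.append("See page ").append(pageRef("barcode")).append("\\par}");
        System.out.println(rtf);
    }
}
```

Compare the output against what Word itself writes before relying on it.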

> > 
> > > I thought about that last night (while cruising through the
> > > PDF-renderer to compare my table width «guessing» with the already
> > > implemented one) and drew the following conclusion:
> > > 
> > > Fact 1: RTF knows page breaks
> > 
> > Wrong. RTF can contain hard page breaks but doesn't have to.
> That's what I say: «knows page breaks» (and it's obvious, that I mean 
> Insert->Page break->«Manual page break»)

So, we're speaking completely different languages. :-( Does someone want
to write a FOP glossary? ;-)

> > 
> > > Fact 2: RTF is page-oriented
> > 
> > Wrong. RTF is a flow-oriented format. Microsoft Word is responsible for
> > the page break decisions unless you add hard page breaks.
> Same intention, other words. Mkay, it's a «stream» of format attributes,
> but it's intended to render to pages, therefore you may insert a page
> break. If it were a drawing program, intended to render to large
> sheets of paper to stick to walls, it would not include page breaking.
> Depends on point of view. (You may set the page size to any value one
> may think of, mkay, but that's not the point. Another example would be
> an index or cross reference with the page number appended)
> 
> > 
> > > Fact 3: RTF aligns elements absolutely
> > 
> > Yes.
> > 
> > > Fact 4: RTF does not know anything about the document's structuring
> > 
> > Depends on what you mean by structure. RTF supports styles which allow
> > some kind of structuring.
> Telling a block of text to be Arial, 22pt, underlined and bold is not
> structure, but formatting. You may assume that this is heading level 1
> but it could also be a capitalized letter or dadaistic poem. What I
> mean by structure is something like: <h1>May there be light</h1>, so it's a
> heading, nothing else (I exclude mistakenly used tags by HTML-beginners).

Again the language problem maybe. My "styles" refers not only to the
group of formatting attributes but also to the attributes indicating
the heading level. However, it seems this is not written down in
the RTF specification. If you look at a Word-generated RTF file you'll
see the private attribute "soutlvl", for example. This may not be
"structure" by itself, but it is at least an indication of structure. But
the argument probably makes no sense as XSL-FO doesn't support "structure",
either. You can't distinguish a heading from a normal paragraph.

> > 
> > > Conclusion: RTF differs not that much from PDF-rendering (only written
> > > tokens are a bit different)
> > > -> Use the PDF-layout management system and add a RTF-writer. So RTF
> > > would look exactly like all PDFs. And there's only a slight difference
> > > from Office-generated RTFs: page breaks after every page.
> > 
> > That would defeat the purpose of the current RTF output support. The
> > point is that you can edit the generated file and then print it. If you
> > predefine the page breaks you make the editing a lot harder for the
> > end-user. If you don't need RTF editing, then don't generate RTF because
> > PDF offers much better quality.
> 
> And that's what I wanted to read: There's no need for all that TOC-ing,
> page number citation and whatsoever. Therefore also indenting makes no
> sense, esp. when nesting blocks, tables and lists. Mkay, there should
> be a small amount of space between a list item and its bullet, but for
> editing you don't need any formatting attributes - which makes it a lot
> easier.

I fear that we misunderstand each other again, but I disagree. If a user
manually post-processing a generated RTF file adds
content that changes the page breaks, you'd condemn the user to manually
update all references if you don't implement the dynamic elements.

> > 
> > > > 
> > > > What's exactly the problem with graphics? We've got pretty good support
> > > > for many cases, even SVG now.
> > > I was only talking about the current RTF export. AFAIK GIFs are not
> > > rendered (there was an annoying exception string inside the document).
> > > 
> > > > GIF: The LZW patents are expired but still it's a good idea to forget
> > > > GIF. The legal side is still less than clear.
> > > But PDF «understands» GIFs, so I can use them. Should be the same with
> > > RTF, and the only thing I'll do is reading them and transform into a
> > > RTF-compatible format...if it's not legal I won't do this...
> > 
> > PDF doesn't understand GIFs. Images are converted from GIF to a generic
> > bitmap format supported by PDF. In the case of RTF output this code
> > hasn't been written yet. Patches welcome.
> That's what I wanted to hear, too.
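
For anyone picking this up, here is a rough sketch of one possible
conversion path, not existing FOP code: decode the GIF with javax.imageio,
re-encode it as PNG and write the bytes as hex into a \pict group marked
\pngblip. The class and method names are hypothetical, and the goal size
assumes 96 dpi (15 twips per pixel), which a real implementation would
take from the FO properties instead.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class GifToRtfPict {

    // Assumption for this sketch: 96 dpi, i.e. 1440 / 96 = 15 twips per pixel.
    private static final int TWIPS_PER_PIXEL = 15;

    // Reads an image (e.g. a GIF) and returns an RTF \pict group holding
    // the same image re-encoded as PNG, with the bytes written as hex.
    static String toRtfPict(InputStream in) throws IOException {
        BufferedImage img = ImageIO.read(in);
        if (img == null) {
            throw new IOException("Unsupported or unreadable image");
        }
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        if (!ImageIO.write(img, "png", png)) {
            throw new IOException("No PNG writer available");
        }
        StringBuilder sb = new StringBuilder("{\\pict\\pngblip");
        sb.append("\\picw").append(img.getWidth());
        sb.append("\\pich").append(img.getHeight());
        sb.append("\\picwgoal").append(img.getWidth() * TWIPS_PER_PIXEL);
        sb.append("\\pichgoal").append(img.getHeight() * TWIPS_PER_PIXEL);
        sb.append('\n');
        for (byte b : png.toByteArray()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.append('}').toString();
    }
}
```

Older RTF readers may not understand \pngblip, so this would need checking
against the target applications.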
> 
> > 
> > > > RTF does support referenced images.
> > > So there ought to be a switch (render to the document, use references)
> > 
> > ...if someone implements that switch.
> 
> me got list - ugh!
> 
> > 
> > > > Overall, your post seems to address topics which, as far as I know,
> > > > are mostly solved, so I'm not really sure what you're after. Maybe
> > > > if you'd
> > > > show the problems you're trying to address with examples....
> > > I only wanted to check any progress, maybe I forgot some important fact
> > > or any discussion. But the referencing of page-number-citation is one
> > > problem (we already discussed this in February), nesting tables,
> > > indenting...everything which needs structures inside structures (a
> > > block inside a block, a table inside a block, a block inside a table...)
> > 
> > You can see that there is still some work ahead. Some features will
> > probably never be implemented because RTF is such a crappy format. I
> > doubt we will ever get 100% functionality on nested tables.
> 
> Only depends on the sort of algorithm. If I reuse the PDF-rendering and
> strip all the layout managing, it'd be enough to adjust the parent
> classes. I'm just checking the possibilities, to reuse as much code as
> possible. It's needed for nesting of positioning attributes (nested
> blocks, tables, row widths...) and I don't want to re-invent layout
> managing nor implement a totally different way of guessing a letter's
> placement within a block. That makes it necessary to insert page breaks
> which goes against FO/RTF-philosophy.

Hmm, not sure how to interpret this.

> > Before you take the RTF handler (not renderer!) apart you should
> > understand RTF first. It seems you have a few things to catch up on. As I
> > said above, converting the RTFHandler into a Renderer defeats the
> > purpose of the RTF format in the first place.
> 
> That's top of my list, to read through important parts of the
> RTF-standard (tables, indenting, references). I don't want to turn the
> handler into a renderer but improve the rendering features, so that
> some «unexpected results» disappear. So I spent more time thinking about an
> optimal implementation, than coding some «dirty hacks».
> 
> To conclude all this: Maybe it's a lack of the «Big Picture», how
> everything works together in FOP and it's still in progress, therefore
> I'll have to think much more about it...Thanks so far for your
> constructive responses...

You're welcome. I think the biggest problem we have is the differing
language. I currently can't see where you really want to go in the end.
Anyway, I currently cannot invest too much time into this discussion as
I'm very busy and RTF is only a personal interest. At any rate, further
development discussion should move to fop-dev. Maybe our RTF specialist
Peter Herweg can also chime in if he has time.


Jeremias Maerki

