Re: White space handling Wiki page

Manuel Mall Fri, 28 Oct 2005 09:02:57 -0700
On Fri, 28 Oct 2005 11:08 pm, Luca Furini wrote:
> Manuel Mall wrote:
> > Side note: FOP doesn't quite do the same internally, i.e. a
> > character explicitly specified using <fo:character.../> is handled
> > separately from 'plain text'. If someone would write a style sheet
> > which does a transform of every character into a <fo:character />
> > object and would feed the output to FOP the formatting results
> > would be, let's say, VERY DISAPPOINTING. Actually something like:
> > <fo:block background-color="yellow">word1<fo:character
> > character=" "/><fo:character character=" "/>word2<fo:character
> > character=" "/>word3<fo:character character=" "/></fo:block>
> > currently causes an exception!
>
> This is a problem of the whitespace-related code, but anyway the
> CharacterLM always creates a sequence of elements corresponding to a
> non-space character, so the only feasible breaks recognized by the
> algorithm would be the hyphenation points inside the words ...
>
> I think that just as TextArea and Character both extend an
> AbstractTextArea, TextLM and CharLM should have a common super class
> holding the createElementsFor*() methods. It would not be necessary
> to add a SpaceArea or a WordArea child to a Character area, anyway
> (but we could decide to do it anyway just for analogy).
>
Yes, I agree, but it is IMO a bit more complicated. The Unicode line
breaking algorithm requires more than one character to make its
decisions. Simple example: no break after an opening punctuation
character or before a closing one, e.g. (, [, ], ). So in a sequence
like "( HELP )" neither the space following "(" nor the space
preceding ")" would be a legal break opportunity. If someone wrote:
...(<fo:inline font-weight="bold"> HELP </fo:inline>)...
then even if the current getNextKnuth functions implemented the
Unicode algorithm, we would still create a break opportunity at the
spaces, because the fo snippet above generates 3 calls to
getNextKnuth: 3 different LMs are created, one for '...(', one
for ' HELP ', and the last for ')...', and each does its analysis
limited to its own piece of text. Therefore having the
createElementsFor*() methods centralised solves only part of the
problem.
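
To make the point concrete, here is a rough sketch in Java. It is not
FOP code and not the full Unicode line breaking algorithm: the class
BreakContextSketch and its methods are made up for illustration, and
the only rule modelled is "no break after the spaces that follow an
opening punctuation character, and no break before a closing one". It
compares analysing the whole text at once with analysing the three
pieces separately, as three independent LMs would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in FOP.
final class BreakContextSketch {

    static boolean isOpening(char c) { return c == '(' || c == '['; }

    static boolean isClosing(char c) { return c == ')' || c == ']'; }

    // Returns the index of every space after which a break is allowed.
    // Only the last space of a run of spaces is considered.
    static List<Integer> breaksAfterSpaces(String text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != ' ') {
                continue;
            }
            boolean hasNext = i + 1 < text.length();
            if (hasNext && text.charAt(i + 1) == ' ') {
                continue; // not the last space of the run
            }
            // Look back over the run of spaces for the previous character.
            int p = i - 1;
            while (p >= 0 && text.charAt(p) == ' ') {
                p--;
            }
            boolean afterOpening = p >= 0 && isOpening(text.charAt(p));
            boolean beforeClosing = hasNext && isClosing(text.charAt(i + 1));
            if (!afterOpening && !beforeClosing) {
                result.add(i);
            }
        }
        return result;
    }

    // Analyses each fragment in isolation, the way separate LMs would,
    // and maps the results back to offsets in the concatenated text.
    static List<Integer> breaksPerFragment(List<String> fragments) {
        List<Integer> result = new ArrayList<>();
        int offset = 0;
        for (String fragment : fragments) {
            for (int i : breaksAfterSpaces(fragment)) {
                result.add(offset + i);
            }
            offset += fragment.length();
        }
        return result;
    }

    public static void main(String[] args) {
        // Whole text: both spaces are protected by the brackets.
        System.out.println(breaksAfterSpaces("...( HELP )..."));
        // prints []

        // Three pieces: the brackets are invisible to the middle piece.
        System.out.println(
            breaksPerFragment(Arrays.asList("...(", " HELP ", ")...")));
        // prints [4, 9]
    }
}
```

The per-fragment run reports both spaces as break opportunities
because the middle piece never sees the "(" before it or the ")"
after it, which is the situation described above.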
> Regards
>      Luca
Cheers
        Manuel