Manuel Mall wrote:
What I observed is that most of these issues cannot be solved by looking
at a single character at a time. They need context, very often only one
character, sometimes more (e.g. a sequence of white space). More
importantly, the context needed is not limited to the fo they occur in:
they all span across fos. This is where the current LM structures, and
especially the getNextKnuthElement interface, really get in the way.
Basically one cannot create the correct Knuth sequences without the
context, but the context can come from anywhere (a superior fo, a
subordinate fo, or a neighboring fo). So one needs look-ahead and
backtracking across all these boundaries, and it feels extremely messy.
It appears conceptually much simpler to have a single loop iterating
over all the characters in a paragraph, doing all the character/glyph
manipulation, word breaking (hyphenation), and line breaking analysis,
and generating the Knuth sequences in one place. An example where this
is currently done is the white-space handling during refinement: one
loop at block level, based on a recursive char iterator that supports
deletion and character replacement, does the job. Very simple and easy
to understand. I have something similar in mind for inline Knuth
sequence generation. Of course the iterator would not only return the
character but also the relevant formatting information for it, e.g. the
font, so that the width etc. can be calculated. The iterator may also
have to indicate start/end border/padding and conditional border/padding
elements.
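To make the iterator idea concrete, here is a minimal sketch in Java. All names here (FormattedCharIterator, CharContext, WhiteSpaceDemo) are hypothetical illustrations, not existing FOP classes; the white-space collapsing loop stands in for the refinement pass described above, and the context object stands in for whatever resolved formatting information the real iterator would carry:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical context carried with each character: enough resolved
    formatting information to compute widths, borders, etc. */
class CharContext {
    final char ch;
    final String fontName;  // resolved font for this character
    final int fontSize;     // e.g. in millipoints

    CharContext(char ch, String fontName, int fontSize) {
        this.ch = ch;
        this.fontName = fontName;
        this.fontSize = fontSize;
    }
}

/** Hypothetical iterator over all characters of a block, across fo
    boundaries, supporting the deletion and replacement operations
    used by the refinement white-space loop. */
class FormattedCharIterator {
    private final List<CharContext> chars;
    private int pos = -1;

    FormattedCharIterator(List<CharContext> chars) { this.chars = chars; }

    boolean hasNext() { return pos + 1 < chars.size(); }

    CharContext next() { return chars.get(++pos); }

    /** Remove the character last returned by next(). */
    void remove() { chars.remove(pos--); }

    /** Replace the character last returned by next(), keeping its context. */
    void replace(char c) {
        CharContext old = chars.get(pos);
        chars.set(pos, new CharContext(c, old.fontName, old.fontSize));
    }
}

public class WhiteSpaceDemo {
    /** One block-level loop: collapse runs of white space to a single
        space, roughly in the spirit of XSL-FO white-space handling. */
    static String collapse(FormattedCharIterator it) {
        StringBuilder sb = new StringBuilder();
        boolean prevSpace = false;
        while (it.hasNext()) {
            CharContext c = it.next();
            if (Character.isWhitespace(c.ch)) {
                if (prevSpace) { it.remove(); continue; } // drop repeated space
                it.replace(' ');                           // normalize to ' '
                prevSpace = true;
                sb.append(' ');
            } else {
                prevSpace = false;
                sb.append(c.ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<CharContext> chars = new ArrayList<>();
        for (char c : "Hello   \t world".toCharArray()) {
            chars.add(new CharContext(c, "Helvetica", 12000));
        }
        System.out.println(collapse(new FormattedCharIterator(chars)));
        // prints: Hello world
    }
}
```

The point of the sketch is only that a single loop sees every character with its context, so decisions that span fo boundaries need no look-ahead or backtracking across LM calls.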
I think there are two different "layers" that affect the generation of
the elements. One is the "text layer" (or maybe semantic layer), where
we have the text and can easily handle white space, recognize word
boundaries, and find hyphenation points, regardless of the actual fo
(and its depth) where the text lives. The other is the "formatting
layer", where we have the resolved values for properties like font,
size, borders, etc. These layers speak different languages: one knows
words and spaces, the other elements and attributes.
At the moment, the getNextKnuthElements() method works at the formatting
layer: each LM knows the relevant properties but has a limited view of
the text, hence the current difficulties.
Your proposal is to work at the text layer (correct me if I'm wrong),
with the LineLM centralizing the handling of the text for a whole block.
I wonder if, doing so, we would not find it difficult to know the
resolved property values applying to each piece of text.
I'm not saying that we don't need changes in the LM interactions; I'm
just asking myself (and asking you all, of course :-)) whether it is
really possible to have both breaking and element generation *in one
place*.
What if we first had centralized control at the text layer (the LineLM
putting together all the text, finding words, normalizing spaces,
performing hyphenation ...) and then localized element generation (each
LM, building on what the LineLM did and using its local properties)?
Something somewhat similar (though limited to single words) happens at
the moment with the getChangedKnuthElements() method, which is called
only after the LineLM has reconstructed a word, found its breaking
points and told the inline LMs where the breaks are.
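The two-phase idea above could be sketched roughly as follows. Everything here is hypothetical (TwoPhaseDemo, findBreaks, generateElements are illustrations, not the actual FOP LM interfaces): a first, fo-independent pass analyzes the whole paragraph's text and finds break opportunities; a second pass has each local generator (standing in for an inline LM) emit elements for its own slice using its own resolved properties:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-phase element generation: a shared text-level
    pass, then per-slice element generation with local properties. */
public class TwoPhaseDemo {

    /** Phase 1: text-layer analysis over the whole paragraph,
        independent of the fos the text lives in. Here, simply
        treat each space as a break opportunity. */
    static List<Integer> findBreaks(String text) {
        List<Integer> breaks = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == ' ') breaks.add(i);
        }
        return breaks;
    }

    /** Phase 2: local element generation for one slice, using local
        resolved properties (a bare font size stands in for the real
        property set) plus the break points found in phase 1. */
    static List<String> generateElements(String slice, int fontSize,
                                         List<Integer> breaks, int offset) {
        List<String> elements = new ArrayList<>();
        for (int i = 0; i < slice.length(); i++) {
            if (breaks.contains(offset + i)) {
                elements.add("glue(width=" + fontSize / 3 + ")");
            } else {
                elements.add("box(char=" + slice.charAt(i)
                        + ", width=" + fontSize / 2 + ")");
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        // Two "inline fos" with different resolved properties, but one
        // shared text-layer analysis over the concatenated paragraph.
        String part1 = "bold ", part2 = "plain";
        List<Integer> breaks = findBreaks(part1 + part2);

        List<String> all = new ArrayList<>();
        all.addAll(generateElements(part1, 14000, breaks, 0));
        all.addAll(generateElements(part2, 12000, breaks, part1.length()));
        System.out.println(all.size() + " elements, breaks at " + breaks);
    }
}
```

The design point is the division of labor: word finding, space normalization and hyphenation need the whole text but no properties, while widths and borders need the properties but only a local view, so each phase works entirely within one of the two "languages".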
I don't know if what I just wrote makes any sense; as I have never tried
to do what you suggest, or what I just attempted to describe, I really
look forward to seeing your code in action!
Regards
Luca