Re: Leading/trailing space removal in LineLM

2005-11-04 Thread J.Pietschmann
Luca Furini wrote: "note that a word with a soft hyphen in its middle would not be hyphenated, unless we ignore this character when collecting word fragments." Well, in order to prepare for hyphenation, other characters like joiners have to be removed too. We should probably also use Unicode
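
A minimal sketch of the idea discussed in this message: dropping the soft hyphen and joiner characters from a word before it is handed to a hyphenator, while remembering where each kept character came from. The function name `strip_ignorable` and the exact set of characters treated as ignorable are assumptions made for illustration; they are not taken from the FOP code base.

```python
# Hypothetical helper, for illustration only; not FOP's actual implementation.
SOFT_HYPHEN = "\u00ad"
ZERO_WIDTH_NON_JOINER = "\u200c"
ZERO_WIDTH_JOINER = "\u200d"
WORD_JOINER = "\u2060"

# Assumed set of characters to ignore when collecting word fragments.
IGNORABLE = {SOFT_HYPHEN, ZERO_WIDTH_NON_JOINER, ZERO_WIDTH_JOINER, WORD_JOINER}


def strip_ignorable(word):
    """Return (cleaned, positions).

    cleaned is the word without ignorable characters; positions[i] is the
    index in the original word of cleaned[i], so that hyphenation points
    found in cleaned can be mapped back to the original text.
    """
    kept = [(i, ch) for i, ch in enumerate(word) if ch not in IGNORABLE]
    cleaned = "".join(ch for _, ch in kept)
    positions = [i for i, _ in kept]
    return cleaned, positions
```

Keeping the position list means a break point computed on the cleaned word can still be applied to the original character sequence.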

Re: zero width space

2005-11-04 Thread J.Pietschmann
Manuel Mall wrote: "What about character composition/decomposition?" Good question? Where is the answer? Let's clarify the problem first. Let's say the input contains the sequence U+0061 U+0308 (Latin small a, combining diaeresis), the font has a glyph for U+00E4 but not for U+0308. Obviously,