Re: Unicode issues

2007-01-15 Thread J.Pietschmann
Simon Pepping wrote:
> Aren't ligatures a feature of the font,

Yes and no. While the font may provide a glyph, it is the responsibility of the content rendering code to decide whether a ligature should be used. Deciding whether a ligature is applicable is not necessarily trivial, for example ther

Re: Unicode issues

2007-01-15 Thread Simon Pepping
On Mon, Jan 15, 2007 at 04:42:12PM +0100, J.Pietschmann wrote:
> As for Ligatures and character shaping: an algorithm for automatically
> detecting ligature points may use a pattern lookup similar to the
> pattern based hyphenation. The pattern dictionary should store only
> either NFD or NFC for

Re: Unicode issues

2007-01-15 Thread Simon Pepping
On Sun, Jan 14, 2007 at 11:05:33PM +0100, J.Pietschmann wrote:
> There are libraries which already implement UAX#15 properly, e.g. icu4j,
> but especially icu4j is a rather large blob of a jar. I think Unicode
> normalization should be handled like PDF encryption: do it if the
> library is availabl
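
The "do it if the library is available" idea quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `OptionalFeature` and its methods are hypothetical and not taken from the thread, and the only fact relied on is that ICU4J ships a class named `com.ibm.icu.text.Normalizer`. The check uses reflection so that nothing links against the optional jar at compile time.

```java
class OptionalFeature {

    /** Returns true if the named class can be found on the classpath (without initializing it). */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, OptionalFeature.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing or unusable library: treat the optional feature as unavailable.
            return false;
        }
    }

    /** Hypothetical switch: normalize only when the optional ICU4J jar is present. */
    static boolean normalizationAvailable() {
        return isClassPresent("com.ibm.icu.text.Normalizer");
    }

    public static void main(String[] args) {
        System.out.println("Unicode normalization enabled: " + normalizationAvailable());
    }
}
```

With this shape the large jar stays an optional dependency: the caller skips the normalization step when the check returns false instead of failing at startup.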

Re: Unicode issues

2007-01-15 Thread J.Pietschmann
Manuel Mall wrote:
> Font selection in combination with character substitution. Ligatures and character shaping. Joerg, can you elaborate on this for me please.

Fonts may contain glyphs for precomposed Unicode characters, or they may not. If a list of fonts is searched for a glyph of a charact
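
A minimal sketch of how font selection and character substitution might interact, under stated assumptions. The `GlyphFallback` and `Font` classes are hypothetical stand-ins (a font is reduced to a name plus the set of code points it has glyphs for), and the search order shown, trying every form in one font before moving to the next font, is just one possible policy, not something the thread settles. Normalization uses the JDK's `java.text.Normalizer`.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Set;

class GlyphFallback {

    /** Hypothetical stand-in for a real font: a name plus the code points it has glyphs for. */
    static final class Font {
        final String name;
        final Set<Integer> coverage = new HashSet<>();

        Font(String name, int... codePoints) {
            this.name = name;
            for (int cp : codePoints) {
                coverage.add(cp);
            }
        }

        /** True if the font has a glyph for every code point in the text. */
        boolean covers(String text) {
            return text.codePoints().allMatch(coverage::contains);
        }
    }

    /**
     * Walks the font list in order. For each font, tries the text as given,
     * then its NFC form, then its NFD form. Returns {fontName, textToRender},
     * or null if no font covers any of the three forms.
     */
    static String[] select(String text, Font... fonts) {
        String[] candidates = {
            text,
            Normalizer.normalize(text, Normalizer.Form.NFC),
            Normalizer.normalize(text, Normalizer.Form.NFD)
        };
        for (Font font : fonts) {
            for (String candidate : candidates) {
                if (font.covers(candidate)) {
                    return new String[] { font.name, candidate };
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // First font has only "e" and the combining acute accent; second has precomposed U+00E9.
        Font baseOnly = new Font("BaseOnly", 0x0065, 0x0301);
        Font precomposed = new Font("Precomposed", 0x00E9);
        String[] hit = select("\u00e9", baseOnly, precomposed);
        System.out.println(hit[0] + " selected, " + hit[1].length() + " code unit(s) to render");
    }
}
```

With this ordering, a font earlier in the list that only has the base letter and the combining mark wins over a later font with the precomposed glyph. Swapping the two loops would instead prefer the text as written across all fonts before any substitution is attempted.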

Re: Unicode issues

2007-01-14 Thread Manuel Mall
On Monday 15 January 2007 07:05, J.Pietschmann wrote:
> Manuel Mall wrote:
> > 2. Unicode text boundaries (UAX#29) especially word boundaries. Do
> > we need this? It does not determine the word breaks to which the
> > word spacing property is applied to as this is determined by the
> > treat-as-wo

Re: Unicode issues

2007-01-14 Thread J.Pietschmann
Manuel Mall wrote: 2. Unicode text boundaries (UAX#29) especially word boundaries. Do we need this? It does not determine the word breaks to which the word spacing property is applied to as this is determined by the treat-as-word-space property. It could be used to determine the words for hyph