[copied to [email protected] from a discussion which started
at [EMAIL PROTECTED]]

The "combining accents" problem seems really complicated.
Depending on the font and the rendering engine, combining accents
are sometimes displayed or printed correctly and sometimes not.

Here are the results of some experiments, trying to display the
following combining (not precomposed) accents:

à made by a + U+0300
á made by a + U+0301
ã made by a + U+0303
ả made by a + U+0309
ạ made by a + U+0323
a with comma below made by a + U+0326
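
For readers who want to reproduce these sequences, here is a minimal
sketch in Python, using only the standard `unicodedata` module. It builds
each "a + combining mark" pair from the list above and checks whether NFC
normalization can replace it with a single precomposed character. The
names `MARKS` and `describe` are made up for this illustration. It says
nothing about how a given font or rendering engine draws the result.

```python
import unicodedata

# The combining marks from the list above (assumed base letter: "a").
MARKS = [0x0300, 0x0301, 0x0303, 0x0309, 0x0323, 0x0326]

def describe(mark):
    """Return (decomposed sequence, NFC form, True if NFC is one character)."""
    seq = "a" + chr(mark)                    # base letter + combining mark
    nfc = unicodedata.normalize("NFC", seq)  # precomposed form, if one exists
    return seq, nfc, len(nfc) == 1

if __name__ == "__main__":
    for mark in MARKS:
        seq, nfc, precomposed = describe(mark)
        code_points = " ".join(f"U+{ord(c):04X}" for c in nfc)
        print(f"a + U+{mark:04X} ({unicodedata.name(chr(mark))}): "
              f"NFC = {code_points}, precomposed = {precomposed}")
```

Of the six marks, only a + U+0326 has no precomposed equivalent, so that
sequence always depends on the renderer placing the mark itself.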

Now sometimes this works (the accents are displayed above or below
the base letters) and sometimes it does not (the accents are
displayed after the base letters).

Legend:

OO : rendering of combining accents in OpenOffice
GE : rendering of combining accents in Pango (gedit/paps)
TW : rendering in paps in Thomas Wolff's experiments
CA : "combining accents" present in font file
CL : "OT Glyph Class" (whatever that is) of combining accents
     (as shown by fontforge)
GS : font file has a GSUB table

-  : not at all
-- : font does not have code positions for combining accents
     (Type 1 font file)
+  : some accents (or "yes")
++ : all accents (especially including U+0323, U+0326)
A  : OT Glyph Class is "Automatic"
M  : OT Glyph Class is "Mark"
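
The GS column can be checked without fontforge. The following is a rough
sketch, assuming the font is an sfnt container (TrueType or OpenType): it
reads the table directory at the start of the file and lists the table
tags. The function names `sfnt_table_tags` and `has_gsub` are hypothetical
helpers written for this example. A Type 1 file is not an sfnt container
and is rejected.

```python
import struct

def sfnt_table_tags(data):
    """List the table tags found in the directory of an sfnt font."""
    if len(data) < 12:
        raise ValueError("too short to be an sfnt font")
    # Header: 4-byte version tag, then the number of tables (big-endian).
    version, num_tables = struct.unpack(">4sH", data[:6])
    if version not in (b"\x00\x01\x00\x00", b"OTTO", b"true"):
        raise ValueError("not an sfnt font (for example a Type 1 file)")
    tags = []
    for i in range(num_tables):
        start = 12 + 16 * i                  # records follow the 12-byte header
        record = data[start:start + 16]
        if len(record) < 16:
            raise ValueError("truncated table directory")
        # Each record: tag, checksum, offset, length.
        tag, _checksum, _offset, _length = struct.unpack(">4sIII", record)
        tags.append(tag.decode("latin-1"))
    return tags

def has_gsub(data):
    """True if the font's table directory lists a GSUB table."""
    return "GSUB" in sfnt_table_tags(data)
```

Usage would look like `has_gsub(open("/path/to/font.ttf", "rb").read())`,
where the path is a placeholder. The same listing also shows whether tables
such as GDEF or GPOS are present.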


Font                       OO    GE   TW   CA   CL   GS
====                       ==    ==   ==   ==   ==   ==

Andale Mono                 -     -         -    A    -

Arial                      ++    ++         +    A    +

AR PL KaitiM GB            ++    ++         -    A

Baekmuk Dotum              ++    ++         -    A

Bitstream Vera Serif       ++    ++         -    A

Bitstream Vera Sans Mono    -     -    -    -    A

Code2000                   ++    ++        ++    M

Comic Sans MS              ++    ++         -    A    -

Courier (10 pitch)          -    ++    -   --

Courier New                 -     -    +    +    A    +

FreeMono                    +     +        ++    M

FreeSans                   ++    ++        ++    M

FreeSerif                  ++    ++        ++    M

Lucida Bright               +     +         -    A

Luxi Mono                   -     -         -    A

Luxi Sans                   +     +         -    A

Luxi Serif                  +     +         -    A

Times New Roman            ++    ++         +    A    +

Trebuchet MS               ++    ++         -    A    -

URW Bookman L              ++    ++        --

Verdana                     -     -         +    A    -


Well, I cannot make heads or tails of this. In general, the
behaviour of Pango seems to be the same as that of OpenOffice,
apart from the case of Courier 10 pitch (which is a Type 1 font).
But why the combining accents work in some fonts and not in others,
I have no clue. Are there bugs in the rendering engines? Or (more
likely) in the fonts? But what are these bugs exactly?

Thomas Wolff also showed paps results with "vera sans mono" which
show the accents *before* the base letters. Which font is this
exactly? It is obviously not the same as "Bitstream Vera Sans Mono".

I include (for members of the linux-utf-8 list) a test file made
by Thomas Wolff, containing "combining accents".

Regards, Jan


Combined characters:
                  ┌───────────────┐
  à b̂ ç d̗ e̊ f̪ ģ   │ STARGΛ̊TE SG-1 │
                  └───────────────┘
  wrongly placed accents:
  à U+300  á U+301  ã U+303
  ả U+309  ạ U+323  a̦ U+326

  ๏ 
แผ่นดินฮั่นเสื่อมโทรมแสนสังเวช
  พระปกเกศกองบู๊กู้ขึ้นใหม่

Scripts:
’Άνδρα μοι ’έννεπε μούσα, πολύτροπον ’ός 
μαλλα πόλλα πλάνχθη ’έπι Τροιής. 
Мы очень рады, что вы посетите наш 
международный сервер.
ابةتثجحخدذرزسشصضطظعغـفقكلمنهوىي
ტექსტების დამუშავებასა და 
მრავალენოვან კომპიუტერულ 
სისტემებში.
替洼渎溏潺瀚灯烫虫调达逯遘醋长闫阚顺驼髓
ㅱㆈㆌㅭㄺㄶ공곽껫끓뇽늙등뗍뛴룸많맹뫘볶

Punctuation and symbols:
  ╔═══════════════════════╗
  ║   • “smart quotes”    ║
  ╚═══════════════════════╝

 ∮ E⋅da = Q,  n → ∞, ∑ f(i) = ∏ g(i), ∀x∈ℝ: ⌈x⌉ = 
−⌊−x⌋, α ∧ ¬β = ¬(¬α ∨ β)

This line has a DOS line end.
This paragraph has Unicode line-ends,
this is the second line, the paragraph 
ends here.
This line has no line-end.
