Re: [HACKERS] psql display of Unicode combining characters in 8.2

2006-12-27 Thread Tom Lane
I wrote:
> Actually, looking at the comments for ucs_wcwidth() in wchar.c, it seems
> that this is already accounted for in the dsplen output: characters
> for which -1 is returned are control characters, characters for which
> 0 is returned should be printed as-is and counted as zero width.  So the
> bug is just that pg_wcsformat conflates the two cases.

I've applied the attached patch to fix this, but not being much of a
user of languages that have combining characters, I can't test it very
well.  Please check out the behavior and see if you like it.
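
For readers following along, the distinction drawn in the quoted comment can be sketched roughly as below. This is only an illustration in Python, not the patch itself: display_width() is a hypothetical, simplified stand-in for a wcwidth-style function (it is not PostgreSQL's ucs_wcwidth()), format_for_display() is a made-up name, and the \xNN escape format is an assumption chosen for the example. The point is that a width of -1 leads to escaping, while a width of 0 means the character is emitted unchanged and adds nothing to the column count.

```python
import unicodedata
from typing import Tuple


def display_width(ch: str) -> int:
    """Simplified, hypothetical stand-in for a wcwidth-style function."""
    cat = unicodedata.category(ch)
    if cat == "Cc":
        return -1                      # control character
    if cat in ("Mn", "Me"):
        return 0                       # combining mark: zero display width
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2                       # wide or fullwidth character
    return 1


def format_for_display(text: str) -> Tuple[str, int]:
    """Return (printable text, display width), keeping the two cases apart."""
    out = []
    width = 0
    for ch in text:
        w = display_width(ch)
        if w < 0:
            # Control character: replace with a visible escape
            # (the escape format here is illustrative only).
            esc = "\\x%02X" % ord(ch)
            out.append(esc)
            width += len(esc)
        else:
            # Width 0, 1, or 2: print as-is; zero-width characters
            # simply add nothing to the running column count.
            out.append(ch)
            width += w
    return "".join(out), width
```

With this sketch, "e" followed by U+0301 (combining acute accent) comes back unchanged with a width of 1, whereas a string containing a BEL character has that one character escaped.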

regards, tom lane



Attachment: zerowidth.patch



Re: [HACKERS] psql display of Unicode combining characters in 8.2

2006-12-27 Thread Michael Fuhr
On Wed, Dec 27, 2006 at 02:49:41PM -0500, Tom Lane wrote:
> I've applied the attached patch to fix this, but not being much of a
> user of languages that have combining characters, I can't test it very
> well.  Please check out the behavior and see if you like it.

Looks good so far.  I've tested languages like Vietnamese (Latin
script with lots of diacritics), polytonic Greek, and pointed Hebrew,
with text normalized to both NFC and NFD.  Before the patch the NFD
text had lots of \u escapes; after the patch it looks identical to
the NFC text aside from a few minor differences in the rendered
glyphs, which tells me that I am indeed receiving the decomposed
sequences.
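
A small sketch of the NFC/NFD difference described above, using Python's standard unicodedata module. The sample word and the helper name combining_marks() are chosen only for illustration and are not taken from the test data used in this thread. It shows that the decomposed form carries separate combining characters, which is what a display routine must treat as zero width rather than escape.

```python
import unicodedata

# "Việt", written with the precomposed character U+1EC7
# (e with circumflex and dot below).
word = "Vi\u1EC7t"

nfc = unicodedata.normalize("NFC", word)   # composed form
nfd = unicodedata.normalize("NFD", word)   # decomposed form


def combining_marks(s: str) -> list:
    """List the combining characters in s as U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in s if unicodedata.combining(ch)]


print(len(nfc), combining_marks(nfc))
print(len(nfd), combining_marks(nfd))
```

The NFC string has four code points and no combining marks; the NFD string has six, with the base letter "e" followed by U+0323 (dot below) and U+0302 (circumflex). Both should occupy four columns on screen.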

Thanks!

-- 
Michael Fuhr
