> It depends what language you want to sort. Lots of languages do not  
> have a sort alphabet. For example, Japanese. It can be quite  
> difficult to sort unusual languages like this. I am not aware of any  
> standard technique for sorting Japanese text other than keeping an  
> arbitrarily sorted dictionary (courtesy of whatever the most popular  
> Japanese dictionary at the time happens to be perhaps) and then doing  
> hash lookups in it for indexing values. As you can imagine, this is
> not particularly fast. I have not actually tried this, but I expect
> PostgreSQL will simply sort in a fairly binary fashion. As in, it gets
> sorted according to the binary value of the characters, or the
> UTF-8 offsets, or something like that.

The above is almost correct, but sorting by JIS code order is usually
enough for most Japanese applications (I believe the same can be said
of Chinese). I do not recommend using the locale for sorting
Japanese. Quite frequently, the locale support for multibyte
encodings is totally broken. See the recent posting titled
"[GENERAL] Japanese words not distinguished" for more details.

If you have to live with a UTF-8 database, I recommend turning off
locale support and using CONVERT to sort Japanese. For example:

SELECT * FROM t1 ORDER BY CONVERT(col1 USING utf_8_to_euc_jp);
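
To illustrate why the conversion changes the result, here is a minimal
sketch in Python (not PostgreSQL itself) that sorts the same two kanji by
their raw bytes in UTF-8 and in EUC-JP. The helper name sort_by_encoding
is hypothetical, and the sketch assumes that a database with locale
support turned off compares strings byte by byte, as discussed above.

```python
# Hypothetical helper: only str.encode() and sorted() are standard Python.
def sort_by_encoding(words, encoding):
    """Sort strings by the byte values of their representation in `encoding`."""
    return sorted(words, key=lambda w: w.encode(encoding))


# Two kanji: U+4E00 (the first CJK unified ideograph in Unicode) and
# U+4E9C (the first kanji in JIS X 0208).
words = ["一", "亜"]

# Byte order of the UTF-8 form follows Unicode code point order.
utf8_order = sort_by_encoding(words, "utf-8")

# Byte order of the EUC-JP form follows JIS code order.
eucjp_order = sort_by_encoding(words, "euc_jp")

print("UTF-8 byte order: ", utf8_order)
print("EUC-JP byte order:", eucjp_order)
```

The two orders differ, which is why ordering by the converted value
gives JIS code order even though the column is stored as UTF-8.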

> On 12 Jul 2005, at 15:48, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
> 
> > Our product will be storing its character data in utf-8 format  
> > (unicode encoding).
> >
> > What is the best way to achieve culturally sensitive sorting using
> > the utf-8 data?
> >
> > Is it possible to have the locale apply to a connection?
> >
> > If so, is the cultural sorting support mature in PostgreSQL?
> >
> > What type of performance can be expected as compared with the  
> > normal c locale sorting?
> >
> > Thanks very much,
> >
> > Steve.
> >
> > ---------------------------(end of  
> > broadcast)---------------------------
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> >       subscribe-nomail command to [EMAIL PROTECTED] so that  
> > your
> >       message can get through to the mailing list cleanly
> >
> 
> 
