Re: [HACKERS] improve Chinese locale performance

2013-09-09 Thread Quan Zongliang
On 09/06/2013 01:02 AM, Robert Haas wrote: On Wed, Sep 4, 2013 at 11:02 PM, Quan Zongliang quanzongli...@gmail.com wrote: I think of a new idea. Add a compare method column to pg_collation. Every collation has its own compare function or null. When function varstr_cmp is called, if specified

Re: [HACKERS] improve Chinese locale performance

2013-09-09 Thread Robert Haas
On Mon, Sep 9, 2013 at 5:22 AM, Quan Zongliang quanzongli...@gmail.com wrote: Understood. I just try to speed up text compare, not redesign locale. Do you have a plan to do this? Not any time soon, anyway.

Re: [HACKERS] improve Chinese locale performance

2013-09-05 Thread Robert Haas
On Wed, Sep 4, 2013 at 11:02 PM, Quan Zongliang quanzongli...@gmail.com wrote: I think of a new idea. Add a compare method column to pg_collation. Every collation has its own compare function or null. When function varstr_cmp is called, if specified collation has compare function, call it
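The per-collation compare function proposed above can be pictured with a small sketch. This is only an illustration in Python, not PostgreSQL code: the `COLLATION_CMP` table, the `fast_utf8` collation name, and the `utf8_byte_cmp` comparator are all hypothetical, and `varstr_cmp` here merely borrows the name of the function mentioned in the message.

```python
import locale
from typing import Callable, Dict, Optional

CompareFn = Callable[[str, str], int]


def utf8_byte_cmp(a: str, b: str) -> int:
    """Hypothetical custom comparator: order strings by their raw UTF-8 bytes."""
    ab, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ab > bb) - (ab < bb)


# Hypothetical stand-in for the proposed pg_collation column: each collation
# maps to its own compare function, or to None when it has none.
COLLATION_CMP: Dict[str, Optional[CompareFn]] = {
    "fast_utf8": utf8_byte_cmp,  # made-up collation that supplies a function
    "default": None,             # no function registered
}


def varstr_cmp(a: str, b: str, collation: str) -> int:
    """Use the collation's own compare function if present, else strcoll."""
    custom = COLLATION_CMP.get(collation)
    if custom is not None:
        return custom(a, b)
    # Fallback path: the locale-aware comparison from the C library.
    return locale.strcoll(a, b)
```

The point of the design is that the expensive `strcoll` path is only taken when a collation has not registered a faster function of its own.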

Re: [HACKERS] improve Chinese locale performance

2013-09-04 Thread Quan Zongliang
On 07/23/2013 09:42 PM, Craig Ringer wrote: (Replying on phone, please forgive bad quoting) Isn't this pretty much what adopting ICU is supposed to give us? OS-independent collations? I'd be interested in seeing the test data for this performance report, partly as I'd like to see how ICU

Re: [HACKERS] improve Chinese locale performance

2013-08-01 Thread Robert Haas
On Sun, Jul 28, 2013 at 5:39 AM, Martijn van Oosterhout klep...@svana.org wrote: On Tue, Jul 23, 2013 at 10:34:21AM -0400, Robert Haas wrote: I pretty much lost interest in ICU upon reading that they use UTF-16 as their internal format.

Re: [HACKERS] improve Chinese locale performance

2013-07-28 Thread Martijn van Oosterhout
On Tue, Jul 23, 2013 at 10:34:21AM -0400, Robert Haas wrote: I pretty much lost interest in ICU upon reading that they use UTF-16 as their internal format. http://userguide.icu-project.org/strings#TOC-Strings-in-ICU The UTF-8 support has been steadily improving: For example,

Re: [HACKERS] improve Chinese locale performance

2013-07-23 Thread Robert Haas
On Mon, Jul 22, 2013 at 12:49 PM, Greg Stark st...@mit.edu wrote: On Mon, Jul 22, 2013 at 12:50 PM, Peter Eisentraut pete...@gmx.net wrote: I think part of the problem is that we call strcoll for each comparison, instead of doing strxfrm once for each datum and then just strcmp for each

Re: [HACKERS] improve Chinese locale performance

2013-07-23 Thread Craig Ringer
(Replying on phone, please forgive bad quoting) Isn't this pretty much what adopting ICU is supposed to give us? OS-independent collations? I'd be interested in seeing the test data for this performance report, partly as I'd like to see how ICU collations would compare when ICU is crudely

Re: [HACKERS] improve Chinese locale performance

2013-07-23 Thread Robert Haas
On Tue, Jul 23, 2013 at 9:42 AM, Craig Ringer cr...@2ndquadrant.com wrote: (Replying on phone, please forgive bad quoting) Isn't this pretty much what adopting ICU is supposed to give us? OS-independent collations? Yes. I'd be interested in seeing the test data for this performance report,

Re: [HACKERS] improve Chinese locale performance

2013-07-23 Thread Quan Zongliang
On 07/23/2013 09:42 PM, Craig Ringer wrote: (Replying on phone, please forgive bad quoting) Isn't this pretty much what adopting ICU is supposed to give us? OS-independent collations? Yes, we need OS-independent collations. I'd be interested in seeing the test data for this performance

Re: [HACKERS] improve Chinese locale performance

2013-07-22 Thread Craig Ringer
On 07/22/2013 12:17 PM, Quan Zongliang wrote: Hi hackers, I tried to improve performance when database is Chinese. Under openSUSE, create index on table with 54996 rows: locale=C, 140ms; locale=zh_CN, 985ms. I think the function strcoll() of Linux is too slow. So, I made a new utf8 to

Re: [HACKERS] improve Chinese locale performance

2013-07-22 Thread Quan Zongliang
On 07/22/2013 03:54 PM, Craig Ringer wrote: On 07/22/2013 12:17 PM, Quan Zongliang wrote: Hi hackers, I tried to improve performance when database is Chinese. Under openSUSE, create index on table with 54996 rows: locale=C, 140ms; locale=zh_CN, 985ms. I think the function strcoll() of Linux is

Re: [HACKERS] improve Chinese locale performance

2013-07-22 Thread Peter Eisentraut
On 7/22/13 3:54 AM, Craig Ringer wrote: It might be worth looking at glibc's strcoll() implementation. See if it performs better when you use the latest glibc, and if not try to improve glibc's strcoll(). I think part of the problem is that we call strcoll for each comparison, instead of doing

Re: [HACKERS] improve Chinese locale performance

2013-07-22 Thread Greg Stark
On Mon, Jul 22, 2013 at 12:50 PM, Peter Eisentraut pete...@gmx.net wrote: I think part of the problem is that we call strcoll for each comparison, instead of doing strxfrm once for each datum and then just strcmp for each comparison. That is effectively equivalent to what the proposal

Re: [HACKERS] improve Chinese locale performance

2013-07-22 Thread Andrew Dunstan
On 07/22/2013 12:49 PM, Greg Stark wrote: On Mon, Jul 22, 2013 at 12:50 PM, Peter Eisentraut pete...@gmx.net wrote: I think part of the problem is that we call strcoll for each comparison, instead of doing strxfrm once for each datum and then just strcmp for each comparison. That is
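The strcoll-versus-strxfrm point quoted throughout this thread can be sketched as follows. This is a minimal illustration using Python's `locale` module, which wraps the C library's `strcoll` and `strxfrm`; it is not PostgreSQL's sort code. The helper names and the sample words are assumptions made for the example, and the code falls back to the "C" locale when `zh_CN.UTF-8` is not installed.

```python
import functools
import locale


def set_collation(preferred: str = "zh_CN.UTF-8") -> str:
    """Try the preferred locale; fall back to "C" if it is not installed."""
    try:
        return locale.setlocale(locale.LC_COLLATE, preferred)
    except locale.Error:
        return locale.setlocale(locale.LC_COLLATE, "C")


def sort_with_strcoll(values):
    # strcoll is invoked for every comparison the sort performs.
    return sorted(values, key=functools.cmp_to_key(locale.strcoll))


def sort_with_strxfrm(values):
    # strxfrm is invoked once per value; the sort then compares the
    # transformed keys directly, with no further locale-aware calls.
    return sorted(values, key=locale.strxfrm)


set_collation()
words = ["中文", "数据库", "索引", "abc", "性能", "排序"]  # illustrative sample data
by_strcoll = sort_with_strcoll(words)
by_strxfrm = sort_with_strxfrm(words)
```

Both functions are expected to produce the same ordering. The transformed keys are typically larger than the original strings, so the second approach trades memory for fewer locale-aware calls.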