[PATCHES] UTF8MatchText

ITAGAKI Takahiro Sun, 01 Apr 2007 21:58:03 -0700

"Andrew - Supernews" <[EMAIL PROTECTED]> wrote:

>  ITAGAKI> I think all "safe ASCII-supersets" encodings are comparable
>  ITAGAKI> by bytes, not only UTF-8.
> 
> This is false, particularly for EUC.


Umm, I see. I updated the optimization so that it is used only in the UTF8
case. I also added some inlining hints that help on my machine (Pentium 4).
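
To illustrate why UTF8 can be matched by bytes, here is a minimal sketch. It is
not the attached patch; the names utf8_next_char() and utf8_like_sketch() are
hypothetical, and it assumes valid UTF-8 input and no escape character. The
point is that every continuation byte in UTF-8 has the form 10xxxxxx, so a
literal can be compared one byte at a time, and only '_' and '%' need to step
over whole characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if b is a UTF-8 continuation byte (10xxxxxx). */
static bool
is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Step over one character starting at s (s < end). Assumes valid UTF-8. */
static const char *
utf8_next_char(const char *s, const char *end)
{
    s++;
    while (s < end && is_utf8_continuation((unsigned char) *s))
        s++;
    return s;
}

/*
 * Simplified LIKE matcher: supports '%' and '_', no escape handling.
 * Literal characters are compared byte by byte.
 */
static bool
utf8_like_sketch(const char *t, size_t tlen, const char *p, size_t plen)
{
    const char *tend;
    const char *pend;

    assert(t != NULL && p != NULL);
    tend = t + tlen;
    pend = p + plen;

    while (p < pend)
    {
        if (*p == '%')
        {
            /* Collapse a run of '%'; a trailing '%' matches anything. */
            while (p < pend && *p == '%')
                p++;
            if (p == pend)
                return true;

            /* Try the rest of the pattern at each character boundary. */
            while (t < tend)
            {
                if (utf8_like_sketch(t, (size_t) (tend - t),
                                     p, (size_t) (pend - p)))
                    return true;
                t = utf8_next_char(t, tend);
            }
            /* The rest of the pattern needs at least one more character. */
            return false;
        }

        if (t == tend)
            return false;

        if (*p == '_')
        {
            /* '_' consumes one whole character, however many bytes. */
            t = utf8_next_char(t, tend);
            p++;
        }
        else
        {
            /* Plain byte comparison, no character-length lookup needed. */
            if (*t != *p)
                return false;
            t++;
            p++;
        }
    }
    return t == tend;
}
```

With this sketch, utf8_like_sketch("xfooy", 5, "%foo%", 5) returns true, and
the two-byte character "\xC3\xA9" matches the pattern "_" but not "__".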


1000 runs of LIKE '%foo%' on 10000-row tables [ms]
 encoding  | HEAD  |  P1   |  P2   |  P3  
-----------+-------+-------+-------+-------
 SQL_ASCII |  7094 |  7120 |  7063 |  7031
 LATIN1    |  7083 |  7130 |  7057 |  7031
 UTF8      | 17974 | 10859 | 10839 |  9682
 EUC_JP    | 17032 | 17557 | 17599 | 15240

- P1: UTF8MatchText()
- P2: P1 + __inline__ GenericMatchText()
- P3: P2 + __inline__ wchareq()
      (The attached patch is P3.)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

utf8matchtext.patch
Description: Binary data
