Dennis Gearon wrote:

John Sidney-Woollett wrote:

For what it's worth, we have a Unicode-encoded 7.4.1 database that gives us the sorting and searching behaviour we expect (with the exception of the upper and lower functions). We access the data via JDBC, so we don't have to deal with encoding issues per se, as the driver does any translation for us.

Currently we don't use any LIKE statements, but if we did, and wanted them optimized, then we'd use the appropriate operator class when defining the index. We also don't use any regular expressions. And we'll shortly be experimenting with tsearch2...

        List of databases
     Name      |  Owner   | Encoding
---------------+----------+----------
 test          | postgres | UNICODE

Setting the psql client encoding to Latin1 and inserting the following data...

# select * from johntest;
 id | value
----+-------
 1 | test
 2 | tést
 3 | tèst
 4 | taste
 5 | TEST
 6 | TÉST
 7 | TÈST
 8 | TASTE
(8 rows)

[snip]

Using a LIKE operation also works as expected (again, with no index on the value field):

# select * from johntest where value like 't%';
 id | value
----+-------
 1 | test
 2 | tést
 3 | tèst
 4 | taste
(4 rows)

LIKE works, but it can't use an index, so performance would be horrible compared with the situation where it CAN use an index. I believe this is how Postgres works now.


If you define the index with one of the pattern operator classes (OPCLASSes), then LIKE operations should be able to use the index, I believe.

See http://www.postgresql.org/docs/7.4/static/indexes-opclass.html
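
As a rough sketch of what that page describes, something like the following might work for the johntest table above. This is untested here and rests on assumptions: that the value column is a varchar (for a text column, text_pattern_ops would be the matching operator class), and the index name is just made up for illustration.

```sql
-- Assumption: johntest.value is a varchar column; use text_pattern_ops
-- instead if the column is declared as text.
-- The index name johntest_value_pattern_idx is hypothetical.
CREATE INDEX johntest_value_pattern_idx
    ON johntest (value varchar_pattern_ops);

-- Left-anchored pattern, the case these operator classes are meant for.
-- EXPLAIN shows whether the planner actually chooses the index.
EXPLAIN SELECT * FROM johntest WHERE value LIKE 't%';
```

With only eight rows the planner will most likely still choose a sequential scan, so the index would only show up in the plan on a larger table. The same documentation page also notes that an index built this way does not serve ordinary comparisons such as < or >, so a second index with the default operator class would be needed for those.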

John Sidney-Woollett
