Dennis Gearon wrote:
John Sidney-Woollett wrote:
For what it's worth, we have a Unicode 7.4.1 database which gives us the sorting and searching behaviour that we expect (with the exception of the upper and lower functions). We access the data via JDBC, so we don't have to deal with encoding issues per se, as the driver does any translation for us.

LIKE works, but it can't use an index, and so would have horrible performance vs. the situation where it CAN use an index. I believe this is how Postgres is working now.
Currently we don't use any LIKE statements, but if we did, and wanted them optimized then we'd use the appropriate OP Class when defining the index. We also don't use any REGEX expressions. And we'll shortly be experimenting with tsearch2...
        List of databases
     Name      |  Owner   | Encoding
---------------+----------+----------
 test          | postgres | UNICODE
Setting the psql client encoding to Latin1 and inserting the following data...
# select * from johntest;
 id | value
----+-------
  1 | test
  2 | tést
  3 | tèst
  4 | taste
  5 | TEST
  6 | TÉST
  7 | TÈST
  8 | TASTE
(8 rows)
[snip]
Using a LIKE operation also works as expected (again, with no index on the value field):
# select * from johntest where value like 't%';
 id | value
----+-------
  1 | test
  2 | tést
  3 | tèst
  4 | taste
(4 rows)
If you use one of the OPCLASSes then LIKE operations using indexes should work, I believe.
See http://www.postgresql.org/docs/7.4/static/indexes-opclass.html
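As a rough sketch of what that could look like for the johntest table above: the column type is an assumption (the table definition isn't shown here), and the index name is made up for the example, so adjust both to suit.

```sql
-- Assumption: johntest.value is a varchar column. For a text column use
-- text_pattern_ops instead, and bpchar_pattern_ops for char(n).
-- The index name below is hypothetical.
CREATE INDEX johntest_value_pattern_idx
    ON johntest (value varchar_pattern_ops);

-- Refresh the planner statistics, then look at the plan chosen for a
-- left-anchored LIKE pattern.
ANALYZE johntest;
EXPLAIN SELECT * FROM johntest WHERE value LIKE 't%';
```

With only eight rows the planner may well still pick a sequential scan, so the plan is only worth checking on a realistically sized table. Note also that an index built with one of the pattern operator classes is only intended for pattern matching; if ordinary comparisons on the column should use an index too, a second index with the default operator class is needed alongside it.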
John Sidney-Woollett