Dag Lem <d...@nimrod.no> writes:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> (We do have methods for dealing with non-ASCII test cases, but
>> I can't see that this patch is using any of them.)

> I naively assumed that tests would be run in an UTF8 environment.

Nope, not necessarily.

Our current best practice for this is to separate out encoding-dependent
test cases into their own test script, and guard the script with an
initial test on database encoding.  You can see an example in
src/test/modules/test_regex/sql/test_regex_utf8.sql
and the two associated expected-files.  It's a good idea to also cover
as much as you can with pure-ASCII test cases that will run regardless
of the prevailing encoding.

> Running "ack -l '[\x80-\xff]'" in the contrib/ directory reveals that
> two other modules are using UTF8 characters in tests - citext and
> unaccent.

Yeah, neither of those have been upgraded to said best practice.
(If you feel like doing the legwork to improve that situation,
that'd be great.)

> Looking into the unaccent module, I don't quite understand how it will
> work with various encodings, since it doesn't seem to decode its input -
> will it fail if run under anything but ASCII or UTF8?

Its Makefile seems to be forcing the test database to use UTF8.
I think this is a less-than-best-practice choice, because then we
have zero test coverage for other encodings; but it does prevent
test failures.

			regards, tom lane
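
For reference, the encoding guard described above might look roughly like the sketch below. It assumes psql's \gset, \if and \quit meta-commands and the getdatabaseencoding() function; the script name and the sample query are hypothetical, and the actual contents of test_regex_utf8.sql may differ.

```sql
-- Hypothetical script: sql/mymodule_utf8.sql (name is illustrative only).
-- Skip everything below unless the test database uses UTF8.
SELECT getdatabaseencoding() <> 'UTF8' AS skip_test \gset
\if :skip_test
\quit
\endif

-- Encoding-dependent test cases go here (illustrative query).
SELECT length('é') AS len;
```

With a guard like this, one expected-file would hold the full output for a UTF8 database and the other would hold only the output up to the \quit, which is presumably why the example has two expected-files.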