I found a crash (assertion failure) when running psql -f utf8_encoded_script.sql with client_encoding = shift_jis set in postgresql.conf. Although the encoding mismatch is obviously the user's fault, a crash doesn't explain anything to the user.
The problem is that prepare_buffer() in psqlscan.l assumes PQmblen() always returns an appropriate length, but it actually doesn't. When the input ends in the middle of what looks like a multibyte character, the write into newtxt can run past the end of the data and overwrite the two trailing NUL bytes that are set at the beginning of prepare_buffer(). As a result, yy_scan_buffer() returns NULL by design, because the last two bytes of the input buffer are no longer NUL. psql_assert(state->scanbufhandle) in psql_scan() then trips later.

This bug can occur not only with shift_jis but also with big5, gbk, and the other "unsafe" encodings.

The attached patch fixes it. It just adds a check so that the 0xff padding is never written beyond the end of the input buffer. If you need a test case I'll send one, but the problem seems clear.

Regards,

--
Hitoshi Harada
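For illustration, the overflow described above can be reproduced outside psql with a minimal sketch. This is not the attached patch: fake_mblen() is a hypothetical stand-in for PQmblen() that simply treats any byte >= 0x80 as the lead byte of a 2-byte character, and mask_copy() is a made-up name for a loop with the same shape as the one in prepare_buffer(). The only point is the extra "i < len" bound on the inner loop.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for PQmblen(): any byte >= 0x80 is assumed to
 * start a 2-byte character. Real encodings are more complicated.
 */
static int
fake_mblen(const unsigned char *s)
{
	return (*s >= 0x80) ? 2 : 1;
}

/*
 * Copy txt into a fresh buffer followed by two NUL bytes (flex wants
 * them), replacing the non-first bytes of each multibyte character
 * with 0xFF. The "i < len" test in the inner loop keeps the padding
 * from spilling into the trailing NULs when the reported character
 * length runs past the end of the input.
 */
static char *
mask_copy(const char *txt, int len)
{
	char	   *newtxt = malloc(len + 2);
	int			i = 0;

	if (newtxt == NULL)
		return NULL;
	newtxt[len] = newtxt[len + 1] = '\0';

	while (i < len)
	{
		int			thislen = fake_mblen((const unsigned char *) (txt + i));

		/* the first byte is copied as-is */
		newtxt[i] = txt[i];
		i++;
		/* pad the rest of the character, but never beyond len */
		while (--thislen > 0 && i < len)
			newtxt[i++] = (char) 0xFF;
	}
	return newtxt;
}
```

Without the "i < len" condition, an input whose last byte is >= 0x80 would have 0xFF written over the first trailing NUL, which is the state that makes yy_scan_buffer() reject the buffer.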
Attachment: psql_encoding_bugfix.patch