This patch fixes a rare parsing bug with unicode characters on Mac OS X. The problem is that the behaviour of isspace() on Mac OS X changes with the locale. The patch uses scanner_isspace instead, which only returns true for ASCII whitespace. Other places in the Postgres code appear to have run into this already, since a number of them use scanner_isspace. However, there are still a lot of other calls to isspace(). I'll try to take a quick look to see if there might be other instances of this bug.
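As a rough illustration of the difference (a sketch, not the actual Postgres source), an ASCII-only check can be written without consulting the locale at all. The name ascii_only_isspace is made up for this example, and the exact set of characters that scanner_isspace accepts is an assumption here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper for illustration only; not the Postgres
 * implementation of scanner_isspace. It returns true for a fixed set of
 * ASCII whitespace bytes, so the result cannot change with the locale.
 */
bool
ascii_only_isspace(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\f';
}
```

Because the function compares against literal ASCII bytes, any byte with the high bit set (for example a byte inside a multibyte UTF-8 sequence) is never reported as whitespace.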
The bug is that in the following hstore value the key is parsed incorrectly: the unicode character "disappears" and the key becomes "key\xc4" (the trailing 0x85 byte of the UTF-8 encoding of "ą" is lost).

select E'keyą=>value'::hstore;
     hstore
-----------------
 "keyą"=>"value"
(1 row)

select 'keyą=>value'::hstore::text::bytea;
              bytea
----------------------------------
 \x226b6579c4223d3e2276616c756522
(1 row)

The correct result should be:

     hstore
-----------------
 "keyą"=>"value"
(1 row)

That query is added to the regression test. The query works on Linux, but failed on Mac OS X.

For a more detailed explanation of how isspace() works on Mac OS X, see:

https://github.com/evanj/isspace_locale

Thanks!

Evan Jones
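To make the failure mode concrete, here is a small self-contained sketch. It is not the hstore parser: unquoted_key_len, ascii_space and locale_like_space are hypothetical names, and locale_like_space only imitates a locale-dependent isspace() by treating byte 0x85 as whitespace, which is an assumption based on the bytea output in this message. The letter "ą" is encoded in UTF-8 as the two bytes 0xC4 0x85.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*space_fn) (unsigned char);

/* ASCII-only whitespace test: independent of the locale. */
bool
ascii_space(unsigned char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\f';
}

/*
 * Stand-in for a locale-dependent isspace() (assumption): in addition to
 * ASCII whitespace it also reports byte 0x85 as whitespace.
 */
bool
locale_like_space(unsigned char ch)
{
    return ascii_space(ch) || ch == 0x85;
}

/*
 * Simplified, hypothetical key scanner: counts the bytes of an unquoted
 * key, stopping at the first whitespace byte, '=' or end of string.
 */
size_t
unquoted_key_len(const char *input, space_fn is_space)
{
    size_t      len = 0;

    while (input[len] != '\0' && input[len] != '=' &&
           !is_space((unsigned char) input[len]))
        len++;
    return len;
}
```

For the input bytes "key\xc4\x85=>value", the ASCII-only predicate yields a 5-byte key, while the predicate that accepts 0x85 stops after 4 bytes and leaves "key\xc4", which matches the bytea output shown in this message.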
hstore-isspace.patch