Re: [PyGreSQL] PyGreSQL Commit r615 - trunk/module

Christoph Zwerschke Tue, 24 Nov 2015 12:03:06 -0800

On 24.11.2015 at 17:38, D'Arcy J.M. Cain wrote:
> It no longer segfaults.  I am not sure that you needed to make such a
> drastic fix though.  Did you consider casting to unsigned int?  I
> suspect that the problem was chars > 127 being converted to negative
> numbers.  The only negative number allowed to those macros is -1.


Yes, that's the root problem.

In a German locale with UTF-8, PostgreSQL outputs "34,25 \xe2\x82\xac", where the last three bytes together are the Euro character in UTF-8 encoding (yes, it needs three bytes since it came late to the party).

Now Pygres goes through this string without awareness of the encoding; it checks all three bytes with isdigit(). As you said, '\xac' cast to int becomes negative (-84), and for whatever strange reason isdigit() considers it a digit (strange because the other two negative bytes are not considered digits, and because \xac and \xffac are not considered digits in Latin-1 or Unicode).
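
To make the arithmetic concrete, here is a minimal sketch of why those three bytes are a problem. It assumes an 8-bit, two's complement signed char, and all function names are made up for illustration; this is not the Pygres code. The <ctype.h> functions are only defined for EOF and for values in the range 0..UCHAR_MAX, so a value like -84 is outside their domain and anything can happen.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Value a byte has after being read through an 8-bit signed char and
 * promoted to int (assumption: two's complement representation). */
static int as_promoted_signed_char(unsigned char byte)
{
    return byte >= 0x80 ? (int)byte - 256 : (int)byte;
}

/* The <ctype.h> functions are only defined for EOF and 0..UCHAR_MAX. */
static int in_ctype_domain(int value)
{
    return value == EOF || (value >= 0 && value <= UCHAR_MAX);
}

/* Count the bytes that would fall outside that domain on a platform
 * where plain char is signed. */
static size_t count_out_of_domain(const unsigned char *bytes, size_t len)
{
    size_t count = 0;
    size_t i;

    assert(bytes != NULL);
    for (i = 0; i < len; i++)
        if (!in_ctype_domain(as_promoted_signed_char(bytes[i])))
            count++;
    return count;
}
```

With the money string from above, all three bytes of the Euro sign end up outside the valid range, and 0xac is the one that maps to -84.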

One solution is, as you say, to not cast to int, but to unsigned char, which is what isdigit expects. Or to use -funsigned-char, but we should not rely on that and also cast properly since the compiler flag may not be supported on all platforms (it's probably a gcc thing only).
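
A minimal sketch of the cast approach, with a hypothetical helper name (not taken from the module): the only change compared to the naive loop is the (unsigned char) cast, which keeps every argument to isdigit() inside 0..UCHAR_MAX no matter whether plain char is signed on the platform.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count the digits in a NUL-terminated string using isdigit().
 * The cast to unsigned char guarantees a non-negative argument. */
static size_t count_digits_ctype(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (isdigit((unsigned char)*s))
            count++;
    return count;
}
```

This removes the undefined behavior, but what counts as a digit for bytes above 127 is still left to the C library.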

However, I think my solution is better because calling isdigit() is unnecessary overhead. Remember it's a function call, not a macro, that also takes the locale into account. So checking >= '0' && <= '9' is faster, but moreover we want to be as restrictive as possible and not have other characters considered digits because of whatever strange interpretation of the locale. For instance, '\xb2' would be considered a digit on Windows because it is a superscript 2 in cp1252.
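
As a sketch of the range check (helper names are hypothetical, and this is an illustration rather than the committed code): the comparison never consults the locale and works the same whether char is signed or unsigned, since any byte above 127 is either negative or greater than '9'.

```c
#include <assert.h>
#include <stddef.h>

/* Locale-independent test: only the ten ASCII digits qualify. */
static int is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Length of the leading run of ASCII digits in a NUL-terminated string. */
static size_t ascii_digit_span(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (is_ascii_digit(s[n]))
        n++;
    return n;
}
```

Bytes such as '\xb2' or '\xac' are rejected here regardless of the code page in use.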

You can still add the -funsigned-char, it cannot harm and should make things a bit more deterministic.


-- Christoph