New submission from Mark Dickinson:

[Broken out of the discussion in issue 15144]

Some of the newly-optimized code in Objects/unicodeobject.c contains strict aliasing violations; under the C standards, this is undefined behaviour (C99 6.5p7).  An example occurs in ascii_decode:

    unsigned long value = *(const unsigned long *) _p;

Here the pointer dereference violates the strict aliasing rule.

I think these portions of Objects/unicodeobject.c should be rewritten to avoid the undefined behaviour.

This is not a purely theoretical problem: compilers are known to make optimizations based on the assumption that strict aliasing is not violated.  Early versions of David Gay's dtoa.c gave incorrect results as a result of strict aliasing violations, for example; see [1].  [2] gives a stackoverflow reference explaining strict aliasing.

[1] http://patrakov.blogspot.co.uk/2009/03/dont-use-old-dtoac.html
[2] http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule

----------
components: Interpreter Core
messages: 170841
nosy: mark.dickinson, storchaka
priority: normal
severity: normal
status: open
title: Strict aliasing violations in Objects/unicodeobject.c
type: behavior
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15992>
_______________________________________
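One common way to avoid this kind of violation is to copy the bytes into an unsigned long with memcpy instead of dereferencing a cast pointer; compilers typically turn a fixed-size memcpy into a single load. The following is only a minimal sketch of that idea, not the code in Objects/unicodeobject.c. The helpers load_ulong and all_ascii are hypothetical names introduced for illustration, and the mask computation assumes 8-bit bytes and no padding bits in unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CPython): read sizeof(unsigned long)
   bytes starting at p.  memcpy copies the object representation, so no
   object is accessed through an lvalue of an incompatible type. */
static unsigned long
load_ulong(const char *p)
{
    unsigned long value;
    assert(p != NULL);
    memcpy(&value, p, sizeof(value));
    return value;
}

/* Hypothetical helper: return 1 if the first n bytes at p are all ASCII
   (high bit clear), 0 otherwise. */
static int
all_ascii(const char *p, size_t n)
{
    /* 0x80 repeated in every byte; assumes 8-bit bytes, no padding bits. */
    const unsigned long mask = (~0UL / 0xFFUL) * 0x80UL;
    size_t i = 0;

    /* Word-at-a-time scan using the aliasing-safe load. */
    for (; i + sizeof(unsigned long) <= n; i += sizeof(unsigned long)) {
        if (load_ulong(p + i) & mask)
            return 0;
    }
    /* Remaining tail bytes, one at a time. */
    for (; i < n; i++) {
        if ((unsigned char)p[i] & 0x80)
            return 0;
    }
    return 1;
}
```

Because memcpy takes any byte address, this form of the load also imposes no alignment requirement on p, unlike the cast-and-dereference form.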