New submission from Mark Dickinson:

[Broken out of the discussion in issue 15144]

Some of the newly optimized code in Objects/unicodeobject.c contains strict
aliasing violations; under the C standards, this is undefined behaviour (C99
6.5p7).

An example occurs in ascii_decode:

    unsigned long value = *(const unsigned long *) _p;

Here the dereference accesses character data through an lvalue of type
unsigned long, which violates the strict aliasing rule.

I think these portions of Objects/unicodeobject.c should be rewritten to avoid 
the undefined behaviour.
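
One possible shape for such a rewrite, offered only as a sketch and not as the
actual fix: copy the bytes into an unsigned long with memcpy instead of
dereferencing a cast pointer. The names load_ulong, HIGH_BITS_MASK and
count_leading_ascii below are hypothetical and do not appear in
Objects/unicodeobject.c; the scan loop is a simplified stand-in for the
word-at-a-time check, not a copy of ascii_decode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from unicodeobject.c): read one unsigned long
   from an arbitrary byte pointer.  memcpy accesses the bytes as characters,
   so no strict aliasing rule is broken, and no alignment is required. */
static unsigned long
load_ulong(const char *p)
{
    unsigned long value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* 0x80 repeated in every byte, for whatever width unsigned long has. */
#define HIGH_BITS_MASK ((unsigned long)-1 / 0xFF * 0x80)

/* Return the number of leading bytes in [start, end) that are ASCII. */
static size_t
count_leading_ascii(const char *start, const char *end)
{
    const char *p = start;

    assert(start <= end);

    /* Word-at-a-time scan: stop at the first word with a high bit set. */
    while ((size_t)(end - p) >= sizeof(unsigned long)) {
        if (load_ulong(p) & HIGH_BITS_MASK)
            break;
        p += sizeof(unsigned long);
    }

    /* Byte-at-a-time scan for the remainder. */
    while (p < end && !((unsigned char)*p & 0x80))
        p++;

    return (size_t)(p - start);
}
```

Calling count_leading_ascii(buf, buf + len) on a byte buffer returns how many
bytes precede the first non-ASCII byte. Optimizing compilers commonly turn a
fixed-size memcpy like this into a single load, but that is an expectation to
check against the generated code, not a guarantee.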

This is not a purely theoretical problem: compilers are known to make
optimizations based on the assumption that strict aliasing is not violated.
For example, early versions of David Gay's dtoa.c produced incorrect results
because of strict aliasing violations; see [1].

[2] is a Stack Overflow question that explains the strict aliasing rule.

[1] http://patrakov.blogspot.co.uk/2009/03/dont-use-old-dtoac.html
[2] http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule

----------
components: Interpreter Core
messages: 170841
nosy: mark.dickinson, storchaka
priority: normal
severity: normal
status: open
title: Strict aliasing violations in Objects/unicodeobject.c
type: behavior
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15992>
_______________________________________