On Sun, Jan 19, 2014 at 9:21 AM, spir <denis.s...@gmail.com> wrote:
> I guess (not sure) python optimises access of dicts used as scopes (also of
> object attributes) by interning id-strings and thus being able to replace
> them by hash values already computed once for interning, or other numeric

A string object caches its hash, but the hash isn't computed until it
is first needed (the cached field starts out as -1). The `dict` lookup
fast path applies when the hash is already cached, the key wasn't
probed to a different table index by a hash collision, and the string
is interned, so it has the same address (CPython ID) as the key in the
table.
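
The identity check on that fast path can be made visible with a small
experiment. This is only a sketch: `CountingKey` is a hypothetical class
invented here (it is not part of CPython) that counts calls to its
`__eq__`, standing in for a string key so the comparisons can be
observed from Python.

```python
class CountingKey:
    """Hypothetical key type that counts how often __eq__ runs."""
    eq_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.name == other.name


key = CountingKey('year1900')
table = {key: []}

# Look up with the very same object: the identity check succeeds,
# so the equality method should not be needed.
CountingKey.eq_calls = 0
table[key]
same_object_calls = CountingKey.eq_calls

# Look up with an equal but distinct object: the hashes match, the
# addresses differ, so the dict has to fall back to __eq__.
CountingKey.eq_calls = 0
table[CountingKey('year1900')]
distinct_object_calls = CountingKey.eq_calls

print(same_object_calls, distinct_object_calls)
```

An interned string used as a key plays the role of `key` here; an equal
string at a different address plays the role of the second lookup.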

CPython setattr is implemented by calling PyObject_SetAttr, which
interns the name. Interning an equal string later returns that same
object. Code objects also intern strings, but those will already have
been instantiated for the module, and we're dealing with dynamic
strings here like `"year" + str(1900)`. That leaves dynamic code that
you compile/exec.

    years = type('Years', (), {})
    setattr(years, 'year' + str(1900), [])
    s1 = next(s for s in vars(years) if s == 'year1900')

Compile a code object containing the string 'year1900' as a constant:

    code = compile('"year1900"', '<string>', 'exec')
    s2 = next(s for s in code.co_consts if s == 'year1900')

The code object interned the string constant because it consists only
of name characters (ASCII letters, digits, and underscore):

    >>> s2 is s1
    True

You could also call sys.intern to get the interned string, but it
isn't worth the cost of the call each time:

    >>> import sys
    >>> s3 = 'year' + str(1900)
    >>> s3 is s1
    False
    >>> s4 = sys.intern(s3)
    >>> s4 is s1
    True

Thankfully a dict with only string keys uses an efficient equality
comparison, so all of this is really a moot point:

http://hg.python.org/cpython/file/3.3/Objects/stringlib/eq.h
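
As a quick check that interning is optional for correctness, the sketch
below looks up the same attribute with an interned name and with a
string built at run time. The variable names are made up for this
example, and whether `dynamic` ends up as a separate object is an
implementation detail that the code does not rely on.

```python
import sys

years = type('Years', (), {})
setattr(years, 'year' + str(1900), [])

interned = sys.intern('year1900')        # the canonical interned object
dynamic = ''.join(['year', str(1900)])   # equal text, built at run time

namespace = vars(years)
found_by_dynamic = namespace[dynamic]
found_by_interned = namespace[interned]

# Both lookups reach the same list stored on the class.
same_value = found_by_dynamic is found_by_interned
print(same_value)
```

Either key finds the entry; interning only decides whether the lookup
can stop at the address comparison or has to compare the string data.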
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
