On Saturday, 8 October 2011 16:54:06, Martin v. Löwis wrote:
> In benchmarking PEP 393, I noticed that many UTF-8 decode
> calls originate from C code with static strings, in particular
> PyObject_CallMethod. Many of such calls already have been optimized
> to cache a string object, however, PyObject_CallMethod remains
> unoptimized since it requires a char*.

Because all identifiers in the C code base are ASCII, another idea is to
use a structure similar to PyASCIIObject, but with an additional pointer
to the constant char* string:

    typedef struct {
        PyASCIIObject _base;
        const char *ascii;
    } PyConstASCIIObject;

The characters don't have to be copied, only the pointer, but you still
have to allocate a structure. Because the size of the structure is also
constant, we can have an efficient free list. Pseudo-code to create such
an object:

    PyObject*
    create_const_ascii(const char *str)
    {
        PyConstASCIIObject *obj;
        /* maybe ensure that str is ASCII only? */
        obj = get_from_freelist();  /* reset the object (e.g. hash) */
        if (!obj) {
            obj = allocate_new_const_ascii();
            if (!obj)
                return NULL;
        }
        obj->ascii = str;
        return (PyObject *)obj;
    }

Except for PyUnicode_DATA, such a structure should be fully compatible
with the other PEP 393 structures.

We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string.
Later we can add new functions like _PyDict_GetASCII().

Victor
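
For illustration only, the free-list idea above can be tried outside of
CPython with a small standalone sketch. Everything in it is an assumption
made for the example: ConstAscii, const_ascii_new, const_ascii_free and
FREELIST_MAX are hypothetical names, the struct is a simplified stand-in
for PyConstASCIIObject (it does not embed PyASCIIObject), and the ASCII
check is one possible answer to the "ensure that str is ASCII only?"
question, not a settled design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyConstASCIIObject: a fixed-size header plus
   a borrowed pointer to a static C string. This is not a CPython type. */
typedef struct ConstAscii {
    struct ConstAscii *next_free;  /* link used only while on the free list */
    long hash;                     /* cached hash, -1 means "not computed" */
    size_t length;                 /* number of characters */
    const char *ascii;             /* borrowed pointer, never copied or freed */
} ConstAscii;

#define FREELIST_MAX 16            /* arbitrary cap chosen for this sketch */

static ConstAscii *free_list = NULL;
static int free_count = 0;

/* Return 1 and store the length if every byte of str is ASCII, else 0. */
static int is_ascii(const char *str, size_t *length)
{
    size_t n = 0;
    for (; str[n] != '\0'; n++) {
        if ((unsigned char)str[n] > 127)
            return 0;
    }
    *length = n;
    return 1;
}

/* Create an object that points at str without copying its characters. */
ConstAscii *const_ascii_new(const char *str)
{
    ConstAscii *obj;
    size_t length;

    assert(str != NULL);
    if (!is_ascii(str, &length))
        return NULL;

    if (free_list != NULL) {
        /* Reuse a previously released structure. */
        obj = free_list;
        free_list = obj->next_free;
        free_count--;
    } else {
        obj = malloc(sizeof(*obj));
        if (obj == NULL)
            return NULL;
    }
    /* Reset every field, including the cached hash. */
    obj->next_free = NULL;
    obj->hash = -1;
    obj->length = length;
    obj->ascii = str;
    return obj;
}

/* Release an object: keep it for reuse while the free list has room. */
void const_ascii_free(ConstAscii *obj)
{
    if (obj == NULL)
        return;
    if (free_count < FREELIST_MAX) {
        obj->next_free = free_list;
        free_list = obj;
        free_count++;
    } else {
        free(obj);
    }
}
```

Because every structure has the same size, the free list is a plain
singly linked stack: releasing an object and creating the next one reuses
the same memory without going through malloc(), while the string itself
is only ever referenced through the borrowed pointer.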