Re: [Python-Dev] Identifier API

Victor Stinner Wed, 12 Oct 2011 15:52:17 -0700

On Saturday 8 October 2011 16:54:06, Martin v. Löwis wrote:
> In benchmarking PEP 393, I noticed that many UTF-8 decode
> calls originate from C code with static strings, in particular
> PyObject_CallMethod. Many of such calls already have been optimized
> to cache a string object, however, PyObject_CallMethod remains
> unoptimized since it requires a char*.


Because all identifiers are ASCII (in the C code base), another idea is to use 
a structure similar to PyASCIIObject but with an additional pointer to the 
constant char* string:

typedef struct {
  PyASCIIObject _base;
  const char *ascii;
} PyConstASCIIObject;

Characters don't have to be copied, just the pointer, but you still have to 
allocate a structure. Because the size of the structure is also constant, we 
can have an efficient free list. Pseudo-code to create such an object:

PyObject*
create_const_ascii(const char *str)
{
  PyConstASCIIObject *obj;
  /* ensure maybe that str is ASCII only? */
  obj = get_from_freelist();  /* reset the object (e.g. hash) */
  if (!obj) {
     obj = allocate_new_const_ascii();
     if (!obj) return NULL;
  }
  obj->ascii = str;
  return (PyObject *)obj;
}
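
To make the free list idea concrete, here is a minimal standalone sketch in plain C. It does not use Python.h: ConstAscii is only a stand-in for PyConstASCIIObject, and const_ascii_new(), const_ascii_free(), is_ascii() and FREELIST_MAX are hypothetical names invented for this illustration, not existing CPython API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PyConstASCIIObject (NOT the CPython layout): a fixed-size
   record that borrows a pointer to a constant string. */
typedef struct ConstAscii {
    long hash;                /* cached hash, -1 means "not computed yet" */
    const char *ascii;        /* borrowed pointer, characters are not copied */
    struct ConstAscii *next;  /* link used only while on the free list */
} ConstAscii;

#define FREELIST_MAX 8        /* arbitrary cap chosen for this sketch */

static ConstAscii *freelist = NULL;
static int freelist_len = 0;

/* Return 1 if every byte of str is in the ASCII range, 0 otherwise. */
static int is_ascii(const char *str)
{
    for (; *str; str++) {
        if ((unsigned char)*str > 127)
            return 0;
    }
    return 1;
}

/* Create an object for str, reusing a freed one when available. */
ConstAscii *const_ascii_new(const char *str)
{
    ConstAscii *obj;
    if (!is_ascii(str))
        return NULL;
    if (freelist != NULL) {
        /* Pop from the free list: no allocation needed. */
        obj = freelist;
        freelist = obj->next;
        freelist_len--;
    } else {
        obj = malloc(sizeof(ConstAscii));
        if (obj == NULL)
            return NULL;
    }
    obj->hash = -1;           /* reset state left over from a previous use */
    obj->ascii = str;
    obj->next = NULL;
    return obj;
}

/* Release an object: keep it for reuse unless the free list is full. */
void const_ascii_free(ConstAscii *obj)
{
    assert(obj != NULL);
    if (freelist_len < FREELIST_MAX) {
        obj->next = freelist;
        freelist = obj;
        freelist_len++;
    } else {
        free(obj);
    }
}
```

Since every object has the same size, any freed object can serve any later request, so the list needs no size classes. The sketch assumes the string outlives the object, which holds for string literals.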

Except for PyUnicode_DATA, such a structure should be fully compatible with 
other PEP 393 structures.
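
To illustrate why the data accessor is the exception: for a compact ASCII string the characters are stored right after the header, so the accessor is plain pointer arithmetic, whereas here they live wherever the constant string is. The sketch below is standalone C with stand-in types. AsciiHeader, ConstAsciiRef, the is_const flag and all function names are assumptions made for the example, not CPython definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for PyASCIIObject: a header only, characters are stored elsewhere. */
typedef struct {
    size_t length;
    int is_const;   /* hypothetical flag: 1 if the data is behind a pointer */
} AsciiHeader;

/* Stand-in for PyConstASCIIObject: header plus a borrowed pointer. */
typedef struct {
    AsciiHeader base;
    const char *ascii;
} ConstAsciiRef;

/* Data accessor: a compact object keeps its characters right after the
   header, a const object keeps them wherever the pointer says. */
static const char *ascii_data(const AsciiHeader *op)
{
    if (op->is_const)
        return ((const ConstAsciiRef *)op)->ascii;
    return (const char *)(op + 1);
}

/* Compact flavour: one allocation, characters copied after the header. */
static AsciiHeader *compact_new(const char *str)
{
    size_t len = strlen(str);
    AsciiHeader *op = malloc(sizeof(AsciiHeader) + len + 1);
    if (op == NULL)
        return NULL;
    op->length = len;
    op->is_const = 0;
    memcpy(op + 1, str, len + 1);   /* copy the characters and the NUL */
    return op;
}

/* Const flavour: nothing is copied, only the pointer is stored. */
static void const_init(ConstAsciiRef *op, const char *str)
{
    assert(str != NULL);
    op->base.length = strlen(str);
    op->base.is_const = 1;
    op->ascii = str;
}
```

Without the branch in ascii_data(), reading "the bytes after the header" of a const object would return the bytes of the pointer field rather than the characters.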

We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. Later 
we can add new functions like _PyDict_GetASCII().

Victor