Hi,

Lisandro Dalcin wrote:
> Stefan, I've just seen your comments on the Python-Dev list.

Actually the py3k list.


> You always
> stress the point that it is not possible (or at least difficult) to
> inherit (at the C-level) from the Py2.X builtin 'str' type. Just to be
> completely sure I understand the whole issue... This problem also
> applies to ANY object using the 'PyObject_VAR_HEAD' macro (like
> PyTupleObject or PyFrameObject), right?

Yes. The problem is the single-malloc initialisation that includes a variable
sized buffer tail right after the fixed size object struct. Strings use it for
their content and tuples for their items.

If you inherit from such an object in Cython (or Pyrex) in a cdef class, the
normal machinery will assume that additional fields and the vtab lie right
behind the original struct, so they end up inside the buffer - a sure crasher.
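
To make the overlap concrete, here is a minimal sketch in plain C. It does not
use the real CPython headers: VarBase, NaiveSub and tail_overlaps_extra() are
hypothetical stand-ins that I made up for illustration, and the tail is assumed
to hold one-byte items as in a Py2 string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PyVarObject such as a Py2 string:
   a fixed header followed directly by a variable-sized tail. */
typedef struct {
    long refcnt;     /* stands in for ob_refcnt */
    void *type;      /* stands in for ob_type */
    long size;       /* stands in for ob_size */
    char items[1];   /* the tail: 'size' bytes start here and run on */
} VarBase;

/* What a fixed-layout subclass assumes: the new field sits right
   behind the base struct. */
typedef struct {
    VarBase base;
    int extra;
} NaiveSub;

/* Returns 1 if a tail of 'size' one-byte items would run into the
   memory that NaiveSub uses for 'extra', 0 otherwise. */
int tail_overlaps_extra(long size)
{
    size_t tail_end;
    assert(size >= 0);
    tail_end = offsetof(VarBase, items) + (size_t)size;
    return tail_end > offsetof(NaiveSub, extra);
}
```

With an empty tail nothing collides, but any reasonably long content reaches
the offset where the subclass expects its first field.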

The current result of that discussion is that any PyVarObject, e.g. str and
tuple (and maybe unicode in Py3, that's not decided yet), needs special casing
when cdef subclassing it in Cython. The idea is to calculate the address of the
additional struct fields instead of assuming a fixed size combined struct. I
added the latest comments below.

We might still run into unexpected problems here (type casting and whatnot),
so this still needs some more thought. Also, I will definitely not have the
time to implement it.

Stefan


Martin von Löwis wrote:
---------------------------
As people have pointed out: add new fields *after* the variable-sized
members. To access it, you need to compute the length of the base
object, and then cast the pointer to an extension struct.

That extends to further subtypes, too.

Access is slightly slower, i.e. it's not a compile-time constant, but

  base_address + base_address[ob_len]*elem_size - more_fields_size

This still compiles efficiently, e.g. on x86, gcc compiles a struct
field access to

  movl    20(%eax), %eax

and an access with a var-sized offset into

  movl    8(%eax), %edx            ; fetch length into edx
  movl    -20(%eax,%edx,2), %eax   ; access 20-byte sized struct, assuming
                                   ; elements of size 2
---------------------------

He then corrected the expression to

  base_address + base_address[ob_len]*elem_size + more_fields_size

in a later post.
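
As a rough sketch of what that computation amounts to in C (again with a mock
header rather than the real CPython structs; VarHead, subfields_offset() and
subfields_addr() are names I invented for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of a variable-sized object header (not CPython's). */
typedef struct {
    long refcnt;
    long ob_len;     /* number of elements stored in the tail */
} VarHead;

/* Byte offset of the first subtype field: the fixed part of the base
   object plus the space taken by its variable-sized tail. */
size_t subfields_offset(size_t basicsize, size_t elem_size, long ob_len)
{
    assert(ob_len >= 0);
    return basicsize + (size_t)ob_len * elem_size;
}

/* Address of the subtype fields of 'obj' as a byte pointer. The caller
   casts it to its own extension struct, alignment permitting. */
char *subfields_addr(VarHead *obj, size_t elem_size)
{
    return (char *)obj
        + subfields_offset(sizeof(VarHead), elem_size, obj->ob_len);
}
```

The offset depends on the length stored in each object, so it has to be
computed at runtime, which is the "slightly slower" access mentioned above.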

Similarly, Antoine Pitrou answered on an example I gave:
---------------------
Stefan Behnel writes:
> >
> >     cdef class MyListSubType(PyListObject):
> >         cdef int some_additional_int_field
> >         cdef my_struct* some_struct
> >
> >         def __init__(self):
> >             self.some_struct = get_the_struct_pointer(...)
> >             self.some_additional_int_field = 1

In your example, you could wrap the additional fields (additional_int_field and
some_struct) in a dedicated struct, and define a macro which gives a pointer to
this struct when given the address of the object. Once you have the pointer to
the struct, accessing additional fields is as simple as in the non-PyVarObject
case.

Something like (pseudocode):

#define MyStrSubType_FIELDS_ADDR(op) \
  ((struct MyStrSubType_subfields*) ((char*)(op) + \
      PyString_Type.tp_basicsize + \
      ((PyVarObject*)(op))->ob_size * PyString_Type.tp_itemsize))

It's not as trivially cheap as a straight field access, but much less
expensive than a dictionary lookup.

(perhaps this needs to be a bit more complicated if you want a specific
alignment for your fields)
---------------------
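
Regarding the alignment remark: one way to handle it would be to round the
computed offset up to the alignment of the extension struct. The following is
only a sketch under my own assumptions. MySubFields, align_up() and
aligned_subfields_offset() are hypothetical names, and the base object itself
is assumed to start at a suitably aligned address (as malloc'ed memory does).

```c
#include <assert.h>
#include <stddef.h>

/* Round 'offset' up to the next multiple of 'align' (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical extension struct, mirroring the fields in the example. */
struct MySubFields {
    int some_additional_int_field;
    void *some_struct;
};

/* Offset of the extension struct behind a tail of 'n' items, padded so
   that the struct is properly aligned. */
size_t aligned_subfields_offset(size_t basicsize, size_t itemsize, size_t n)
{
    /* The offset of 'f' in this probe struct equals the alignment
       requirement of struct MySubFields. */
    struct probe { char c; struct MySubFields f; };
    return align_up(basicsize + n * itemsize, offsetof(struct probe, f));
}
```

The allocation would then have to reserve the padded offset plus
sizeof(struct MySubFields) bytes, not just the plain sum.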


_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
