On Tuesday 20 June 2006 11:24, Travis Oliphant wrote:
> Matthieu Perrot wrote:
> > hi,
> >
> > I need to handle strings through a numpy array whose data belong to
> > a C structure. There are several possible answers to this problem:
> > 1) use a numpy array of strings (PyArray_STRING), and so a (char *)
> > object in C. It works as is, but you need to define a maximum size
> > for your strings because the set of strings is contiguous in memory.
> > 2) use a numpy array of objects (PyArray_OBJECT), and wrap each
> > «C string» with a python object, using PyStringObject for example.
> > The problem is then that there are as many wrappers as data
> > elements, and I believe the data can't be shared when you create a
> > PyStringObject from a (char *) with PyString_FromStringAndSize, for
> > example.
> >
> >
> > Now, I will present a third way, which allows strings with no size
> > limit (unlike solution 1) and does not create wrappers before you
> > really need them (on demand/access).
> >
> > First, for convenience, we will use the (char **) type in C to build
> > an array of string pointers (as suggested in solution 2). Now, the
> > game is to make it work with the numpy API, and to use it in python
> > through a python array. Basically, I want behaviour very similar to
> > arrays of PyObject, where the data are not contiguous, only their
> > addresses are. So, the idea is to create a new array descr based on
> > PyArray_OBJECT and change its getitem/setitem functions to deal with
> > my own data.
> >
> > I expected numpy to work with this convenient array descr, but it
> > fails because PyArray_Scalar (arrayobject.c) doesn't call the
> > descriptor's getitem function in the PyArray_OBJECT case; instead it
> > runs two lines that were copied and pasted from the OBJECT_getitem
> > function.
> > Here is my small patch:
> > replace (arrayobject.c:983-984):
> >     Py_INCREF(*((PyObject **)data));
> >     return *((PyObject **)data);
> > by:
> >     return descr->f->getitem(data, base);
> >
> > I played a lot with my new numpy array after this change and noticed
> > that many uses work:
>
> This is an interesting solution. I was not considering it, though, and
> so I'm not surprised you have problems. You can register new types,
> but basing them off of PyArray_OBJECT can be problematic because of
> the special-casing that is done in several places to manage reference
> counting.
>
> You are supposed to register your own data-types and get your own
> typenumber. Then you can define all the functions for the entries as
> you wish.
>
> Riding on the back of PyArray_OBJECT may work if you are clever, but
> it may fail mysteriously as well because of a reference count snafu.
>
> Thanks for the tests and bug-reports. I have no problem changing the
> code as you suggest.
>
> -Travis
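
For readers following the thread, the on-demand idea quoted above can be sketched in plain C with no NumPy dependency. This is only an illustration under stated assumptions, not the code from the patch: StrWrapper is a hypothetical stand-in for a Python string object, and my_getitem/my_setitem here have simplified signatures that do not match PyArray_GetItemFunc/PyArray_SetItemFunc. It also assumes every string stored in a slot was allocated with malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Python string wrapper: owns a private copy. */
typedef struct {
    size_t len;
    char *text;
} StrWrapper;

/* getitem: `data` points at one (char *) slot of the char ** buffer.
 * The wrapper is built only when the slot is accessed. */
static StrWrapper *my_getitem(const void *data)
{
    const char *s;
    StrWrapper *w;

    assert(data != NULL);
    s = *(char *const *)data;
    if (s == NULL)
        return NULL;                 /* empty slot: nothing to wrap */
    w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->len = strlen(s);
    w->text = malloc(w->len + 1);
    if (w->text == NULL) {
        free(w);
        return NULL;
    }
    memcpy(w->text, s, w->len + 1);  /* copy includes the trailing NUL */
    return w;
}

/* setitem: store a fresh copy of `value` in the slot; only the pointer
 * lives in the buffer, so strings of any length fit. */
static int my_setitem(const char *value, void *data)
{
    char **slot = (char **)data;
    size_t n;
    char *copy;

    assert(value != NULL && data != NULL);
    n = strlen(value);
    copy = malloc(n + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, value, n + 1);
    free(*slot);                     /* free(NULL) is a no-op */
    *slot = copy;
    return 0;
}

static void free_wrapper(StrWrapper *w)
{
    if (w != NULL) {
        free(w->text);
        free(w);
    }
}
```

Each element of the buffer is sizeof(char *) wide whatever the string length, which is what removes the maximum-size constraint of solution 1; the cost is one allocation per access in my_getitem.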
Thanks for applying my suggestions. I think you are suggesting this kind
of declaration:

    PyArray_Descr *descr = PyArray_DescrNewFromType(PyArray_VOID);
    descr->f->getitem = (PyArray_GetItemFunc *) my_getitem;
    descr->f->setitem = (PyArray_SetItemFunc *) my_setitem;
    descr->elsize = sizeof(char *);
    PyArray_RegisterDataType(descr);

Without the last line, you are right: it works, and it follows the C-API
way. But if I register this array descr, the new typenumber is larger
than what PyTypeNum_ISFLEXIBLE considers to be a flexible type, so the
returned scalar object is badly formed. I then get a segmentation fault
later, because the created voidscalar has a null descr pointer.

--
Matthieu Perrot