On Sat, 7 Feb 2009 12:50:22 -0800 (PST) Rhamphoryncus <rha...@gmail.com> wrote:
> On Feb 7, 1:39 pm, <mma...@gmx.net> wrote:
> > On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
> > Rhamphoryncus <rha...@gmail.com> wrote:
> >
> > > What usecase do you have for such inconsistently structured data?
> >
> > I have a similar use case in pyspread, which is a Python spreadsheet
> > that employs numpy object arrays. Since the Python objects in the
> > numpy arrays are derived from user input, they can be anything,
> > including nested lists as well as strings, etc.
> >
> > Since I consider my work-around that treats strings as a special
> > case a rather ugly hack, I would welcome a robust, generic approach
> > to the OP's problem.
>
> Can you explain this in a little more detail?

In the application, there is one main numpy array of type "O".
Each element of the array corresponds to one cell in a grid. The user
may enter a Python expression into the grid cell. The input is
evaled and the result is stored in the numpy array (the actual process
is a bit more complicated). Therefore, the object inside a numpy array
element may be an inconsistent, nested, iterable type.

The user now may access the result grid via __getitem__. When doing
this, a numpy array that is as flat as possible while comprising the
maximum possible data depth is returned, i.e.:

1. Non-string and non-unicode iterables of similar length for each of
   the cells form extra dimensions.
2. In order to remove different container types, the result is
   flattened, cast into a numpy.array and re-shaped.
3. Dimensions of length 1 are eliminated.

Therefore, the user can conveniently use numpy ufuncs on the results.

I am referring to the flatten operation in step 2.
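
To make steps 2 and 3 concrete, here is a minimal sketch of what such a flatten could look like. It is not the pyspread code; the names flatten and cell_results are made up for illustration, and it assumes Python 3 (so str and bytes take the place of the string and unicode types mentioned above). Strings are still treated as a special case here, which is exactly the part a more generic approach would have to replace.

```python
import numpy as np

def flatten(obj):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    # Strings are iterable, but every character is again a string, so
    # descending into them would recurse forever. Treat them as leaves.
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        items = iter(obj)
    except TypeError:
        # Not iterable at all (int, float, None, ...): a plain leaf.
        yield obj
        return
    for item in items:
        yield from flatten(item)

# Mixed container types and nesting depths collapse to one flat sequence.
mixed = list(flatten([1, "ab", [2, ("cd", [3.0])]]))

# Step 2: two cells of equal length, held in different container types,
# are flattened, cast into an array and re-shaped.
cell_results = [[1, 2, 3], (4, 5, 6)]
result = np.array(list(flatten(cell_results))).reshape(2, 3)

# Step 3: dimensions of length 1 are eliminated.
single_cell = np.array(list(flatten([[1, 2, 3]]))).reshape(1, 3)
squeezed = np.squeeze(single_cell)
```

Note that with this sketch a dict would be flattened to its keys and a generator would be consumed, which may or may not be what is wanted for user-supplied cell contents.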