Hatem Nassrat (on 2008-02-06 at 08:23:33 -0400) said::

> I however have a slightly different problem that maybe a little more
> complicated, maybe not. I have structures of the following pythonic
> form.
> 
>     [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
> 
> Which if you like is a python representation of a combinatorial design.
> I would like to place them in my database. From what I read in the above
> threads and others, I understood that:
> 
>     - VLArray has a large overhead

Well, the overhead may or may not be a problem depending on your
particular data and access patterns, so my advice is to try several
approaches and use the one that gives the best results. ::

> So in my case it might be better to pickle these objects in a fixed
> width field. However, Is it possible to have a single VLArray to host a
> number of the above mentioned structure. I.e. Can the VLArray have 3
> dimensions, to finally look like:
> 
>     VLArray ( [
>                 [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]],
>                 [[1, 2], [3, 4, 5], [7, 8, 9, 6]],
>                 [[4, 5], [6], [7, 3, 2, 1]],
>                 [[1, 2, 3], [4, 5], [6]],
>                 [[1], [4, 5], [7, 8, 2, 3]],
>                
>                ...
>               ] )
> 
> Or place vlarray instances in an enlargeable array?

A ``VLArray`` consists of a variable number of rows, each of them
containing an array of atoms.  Arrays in different rows may have
different lengths, but all atoms share the same type and shape.  Each of
your rows is itself a list of variable-length lists, so your structure
cannot be mapped directly to a ``VLArray``.

However, you may create a ``VLArray`` of ``ObjectAtom``, which will save
every row as a pickled Python object.  Pickling into a fixed-width field
in a table (as you mention) or into a row in an enlargeable array is
also possible, but both involve manual (un)pickling.
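
As a minimal sketch of the ``ObjectAtom`` approach, assuming PyTables is
installed: the snake_case names below (``open_file``, ``create_vlarray``)
are those of PyTables 3.x, while releases from the time of this thread
spelled them ``openFile`` and ``createVLArray``.  The file name and the
node name ``designs`` are made up for illustration.

```python
import os
import tempfile

import tables  # PyTables; assumed to be installed

# Two of the designs from the question, each a list of variable-length blocks.
designs = [
    [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]],
    [[1, 2], [3, 4, 5], [7, 8, 9, 6]],
]

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "designs.h5")

with tables.open_file(path, mode="w") as h5:
    # With an ObjectAtom, every appended row is pickled as one Python object.
    vla = h5.create_vlarray(h5.root, "designs", atom=tables.ObjectAtom())
    for design in designs:
        vla.append(design)

with tables.open_file(path, mode="r") as h5:
    # Reading unpickles each row back into a nested list.
    restored = h5.root.designs.read()
```

Here the (un)pickling is done by the ``VLArray`` itself, at the cost of
the rows being opaque to HDF5 tools that do not know about pickles.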

You may also use two ``VLArray`` nodes, one for the flat list of numbers
and another one for the indexes where each sublist starts::

    vlarray1 = [                        vlarray2 = [
      [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],    [0, 3, 5, 6],
      [1, 2, 3, 4, 5, 7, 8, 9, 6],        [0, 2, 5],
      [4, 5, 6, 7, 3, 2, 1],              [0, 2, 3],
      [1, 2, 3, 4, 5, 6],                 [0, 3, 5],
      [1, 4, 5, 7, 8, 2, 3],              [0, 1, 3],
      ...                                 ...
    ]                                   ]
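
The conversion between one nested design and such a pair of rows can be
sketched in plain Python.  The helper names ``flatten`` and ``unflatten``
are hypothetical and are not part of PyTables.

```python
def flatten(design):
    """Return (flat, starts) for a list of variable-length blocks."""
    flat, starts = [], []
    for block in design:
        starts.append(len(flat))  # position in flat where this block begins
        flat.extend(block)
    return flat, starts


def unflatten(flat, starts):
    """Rebuild the list of blocks from the flat list and the start indexes."""
    ends = starts[1:] + [len(flat)]  # each block ends where the next begins
    return [flat[s:e] for s, e in zip(starts, ends)]


design = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
flat, starts = flatten(design)
rebuilt = unflatten(flat, starts)
```

Each ``flat`` list would then be appended as a row of the first
``VLArray`` and each ``starts`` list as the matching row of the second,
both with an integer atom, so that row ``i`` of one node pairs with row
``i`` of the other.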

Which one of them is best depends on your data, your problem and your
access patterns.  It would be great, however, if you could report your
measurements or conclusions back to the list, so everybody can learn
from real examples! :)

::

        Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
               Cárabos Coop. V.  V  V   Enjoy Data
                                  ""
