On the other hand, Tom,
if you know that you will be doing fewer than N insertions in the future,
you can always pre-allocate a Table / Array of size N that is
pre-loaded with null values. You can then 'insert' by
overwriting the nth row. Furthermore, you can always append
another chunk of N null rows whenever you run out of space.
For most of my problems this has worked just fine, especially when
dealing with sparse data.
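
A minimal sketch of that idea, in case it helps. It assumes the PyTables 3.x
names (open_file, create_earray; the 2.x releases spell these openFile and
createEArray), uses NaN as the 'null' value, and keeps the number of filled
slots in a node attribute I have called 'used'. The helpers reserve() and
put(), the chunk size N = 4, and the file name are all made up for
illustration, not part of PyTables.

```python
import numpy as np
import tables

N = 4  # hypothetical chunk size: the expected number of future insertions


def reserve(earray, n=N):
    """Append n NaN placeholders so later 'inserts' are plain overwrites."""
    earray.append(np.full(n, np.nan))


def put(earray, value):
    """Overwrite the next unused slot, appending a new chunk if none is left."""
    used = int(earray.attrs.used)
    if used == earray.nrows:
        reserve(earray)
    earray[used] = value
    earray.attrs.used = used + 1


# In-memory HDF5 file (CORE driver without a backing store, so nothing
# is written to disk); use a normal open_file(path, "w") for real data.
with tables.open_file("sketch.h5", "w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5:
    data = h5.create_earray(h5.root, "data",
                            atom=tables.Float64Atom(), shape=(0,))
    data.attrs.used = 0
    reserve(data)                      # pre-allocate the first N slots
    for v in (1.5, 2.5, 3.5, 4.5, 5.5):
        put(data, v)                   # the fifth value triggers a second chunk
    used = int(data.attrs.used)
    nrows = int(data.nrows)
    values = data[:used]               # the real entries
    padding = data[used:]              # still-unused NaN slots
```

Note that this only covers filling reserved slots in order. Putting a value
in the middle of existing data still means shifting everything after it,
which is the rewrite Francesc describes below.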
Be Well
Anthony
On Wed, Mar 21, 2012 at 9:18 AM, Francesc Alted <fal...@gmail.com> wrote:
> On Mar 21, 2012, at 7:08 AM, Tom Diethe wrote:
>
> >>> I'm writing a wrapper for sparse matrices (CSR format) and therefore
> >>> need to store three vectors and 3 scalars:
> >>>
> >>> - data (float64 vector)
> >>> - indices (int32 vector)
> >>> - indptr (int32 vector)
> >>>
> >>> - nrows (int32 scalar)
> >>> - ncols (int32 scalar)
> >>> - nnz (int32 scalar)
> >>>
> >>> data and indices will always be the same length as each other (=nnz)
> >>> but the indptr vector is much shorter.
> >>>
> >>> I've written routines that allow you to insert/update/delete rows or
> >>> columns of the matrix by acting on these vectors only. However I'm
> >>> struggling to work out the best pytables structure to store these,
> >>> that allows me to append/insert/delete elements easily, and is
> >>> efficient.
> >>>
> >>> I was using a Group with an EArray for each vector. This works ok but
> >>> it seems like you are unable to delete items - is that correct?
> >>>
> >>> I also tried using a Group with a separate Table for each of the
> >>> vectors (I could possibly just have two - one for data and indices and
> >>> the other for indptr), but this seems to add a lot of overhead in
> >>> manipulating the arrays.
> >>>
> >>> Is there something simple I'm missing?
>
> Inserting into PyTables objects is not supported. The reason is that they
> are implemented on top of HDF5 datasets, which do not support this either.
> HDF5 is meant for dealing with large datasets, and implementing insertions (or
> deletions) is not an efficient operation (it requires a complete rewrite of
> the dataset). So, if you are going to need a lot of insertions or
> deletions, then PyTables / HDF5 is probably not what you want.
>
> HTH,
>
> -- Francesc Alted
>
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>