On Thursday 11 March 2010 12:51:21 Jorge Scandaliaris wrote:
> Francesc Alted <faltet <at> pytables.org> writes:
> > Yes, having everything in a single table and using compression to reduce
> > unused space is the simplest option.  If your maximum length for variable
> > length field is high (>1000 bytes), you can still use a VLArray for
> > keeping them, and add another level of indirection in case you want to
> > add/remove rows.
> 
> The variable length field is typically larger than 1000 bytes. It holds the
> (x,y) coordinates of a user-selected region from a large image. The size
> varies, but it is not uncommon to have a 100 by 100 pixel region, thus
> ~20000 values.
> 
> > The level of indirection can be just an external EArray that keeps track
> > of the affected operations in the VLArray.
> 
> Nice, I hadn't thought about this approach. However, the EArray itself can
> only be appended to, right? Wouldn't this mean that I still have the original
> limitation? I guess I could replace the EArray with a table, and your proposed
> solution would work OK for me.

Ah, yes, you are right.  Sorry for the confusion.
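
A minimal sketch of the table-based indirection, in case it helps.  It
assumes the snake_case API of PyTables 3.x (the 2.x series spells the same
calls openFile, createVLArray and createTable).  The file name, the node
names and the RegionIndex description with its region_id/vl_row fields are
made up for illustration; the coordinates are stored flattened as
x0, y0, x1, y1, ... which is just one possible layout.

```python
import os
import tempfile

import numpy as np
import tables


class RegionIndex(tables.IsDescription):
    """Hypothetical index record: one row per stored region."""
    region_id = tables.Int64Col(pos=0)  # identifier chosen by the user
    vl_row = tables.Int64Col(pos=1)     # row number inside the VLArray


# Scratch file in a temporary directory (illustrative path).
path = os.path.join(tempfile.mkdtemp(), "regions.h5")

with tables.open_file(path, mode="w") as h5:
    # Append-only store for the variable-length coordinate lists.
    coords = h5.create_vlarray("/", "coords", atom=tables.Int32Atom())
    # Small table mapping each region to its row in the VLArray.
    index = h5.create_table("/", "index", RegionIndex)

    for region_id, npoints in [(10, 3), (11, 5), (12, 2)]:
        xy = np.arange(2 * npoints, dtype=np.int32).reshape(npoints, 2)
        vl_row = coords.nrows        # position the new entry will get
        coords.append(xy.ravel())    # stored flattened: x0, y0, x1, y1, ...
        row = index.row
        row["region_id"] = region_id
        row["vl_row"] = vl_row
        row.append()
    index.flush()

    # Look up region 11 through the index table, then read its coordinates.
    hit = index.read_where("region_id == 11")
    vl_row_11 = int(hit["vl_row"][0])
    region_11 = coords[vl_row_11].reshape(-1, 2)
```

With this layout, dropping a region only means removing its row from the
index table; the corresponding VLArray entry stays in the file as unused
space.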

> Just curious, am I looking at a weird use case? Is this the reason Arrays
> are append-only, or is it too complex/computationally expensive to implement
> it?

No, it is not a very strange case.  Deleting rows in *Array objects is not
supported for two reasons.  First, HDF5 does not support this out of the box
(except through the HL API, and only for tables; this is where PyTables
inherits the feature from).  Second, it is quite inefficient, especially when
your arrays/tables are large.

In general, you should use the removal capability judiciously if you want to
keep performance sane.  But if you *must* use removal, it is best done in
relatively small tables (fewer than, say, 10 million elements) and with very
few fields (1 is optimal, of course).
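
As a rough illustration of removal on a small, single-field table, here is a
sketch that assumes the PyTables 3.x spelling Table.remove_rows (removeRows
in the 2.x series).  The file name, the node name and the vl_row field are
invented for the example.

```python
import os
import tempfile

import tables

# Scratch file in a temporary directory (illustrative path).
path = os.path.join(tempfile.mkdtemp(), "small.h5")

with tables.open_file(path, mode="w") as h5:
    # One-field table: the cheapest layout for removals.
    table = h5.create_table("/", "ids", {"vl_row": tables.Int64Col()})
    table.append([(i,) for i in range(10)])
    table.flush()

    rows_before = table.nrows
    # Remove rows 3, 4 and 5; start and stop behave like a slice here.
    table.remove_rows(3, 6)
    table.flush()
    rows_after = table.nrows

    remaining = table.col("vl_row").tolist()
```

Both start and stop are passed explicitly because the meaning of a missing
stop has differed between PyTables versions.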

> Well, I suppose most people deal with fixed length data, and then you
> have tables...

Cheers,

-- 
Francesc Alted
