Francesc Alted <faltet <at> pytables.org> writes:

> > Question 2: Further on I only need to work with the last 20% or so of each
> > row. Is there an efficient way to slice from a row without having to load
> > it all from disk?
> >
> > for i in range(len(y)):
> >     yj = y[i][-2000:] # not having to read y[i][:6500]
> >     ...
> 
> I'm afraid you can't.  The thing is that the VL types cannot be divided and 
> the entire data element must be transferred.  See:
> 
> http://www.hdfgroup.org/HDF5/doc/UG/11_Datatypes.html
> 
> section 4.3.2.3 for more info on this.

Ouch. This means I'll have to store my own "abstracts" of the data, and in many 
cases it will be faster to re-compute the details I need than to store them all 
in HDF5 files. I already have something like this: a table with a field for the 
VLArray row index. I'll need to expand it with more fields, but I vaguely recall 
that it's not possible to alter table descriptions (add/drop fields). I guess a 
new table aligned with the old one is the easiest way out, or a manual copy 
loop:

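# get description of old_table
# add new fields
# create new_table (it starts out empty, so rows must be appended to it)
newrow = new_table.row
for oldrow in old_table:
    for field in old_fields:
        newrow[field] = oldrow[field]
    newrow.append()  # new fields keep their default values for now
new_table.flush()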
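
For reference, the three commented steps and the copy loop might be fleshed out roughly as follows. This is only a sketch under assumptions: it uses the snake_case names of current PyTables releases (`create_table`, `coldescrs`), it assumes a flat (non-nested) table, and `OldRecord`, `vl_index` and `copy_with_new_fields` are made-up names for illustration.

```python
import tables


class OldRecord(tables.IsDescription):
    # Hypothetical original layout: one field holding the VLArray row index.
    vl_index = tables.Int64Col(pos=0)


def copy_with_new_fields(h5file, old_table, new_name, extra_fields):
    """Create a wider table next to old_table and copy the old columns over.

    extra_fields maps new column names to Col instances.
    """
    # coldescrs maps column name -> Col; copy each Col so the new
    # description does not share objects with the old table.
    description = {name: col.copy()
                   for name, col in old_table.coldescrs.items()}
    description.update(extra_fields)
    new_table = h5file.create_table(old_table._v_parent, new_name, description)

    newrow = new_table.row
    for oldrow in old_table:
        for name in old_table.colnames:
            newrow[name] = oldrow[name]
        newrow.append()  # the extra fields keep their default values
    new_table.flush()
    return new_table
```

Something like `copy_with_new_fields(h5, old_table, "new_table", {"tail_mean": tables.Float64Col()})` would then give a table with the same rows plus one zero-filled column to fill in later.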

I'm a little surprised that the design of HDF5 does not permit striding and 
slicing of VLArray rows; I thought a VLArray mostly behaved like any other 
array.
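
One possible form of such an "abstract", sketched here under assumptions rather than as a recommended design: copy the tail of each VLArray row once into a regular two-dimensional EArray, which, unlike a VLArray row, can be sliced on disk. The function name `store_tails`, the node name `tails`, and the requirement that every row is at least `tail` items long are all made up for this illustration; the API names are those of current PyTables releases.

```python
import numpy as np
import tables

TAIL = 2000  # tail length taken from the y[i][-2000:] example above


def store_tails(h5file, vlarray, name="tails", tail=TAIL):
    """Copy the last `tail` items of each VLArray row into a 2-D EArray."""
    tails = h5file.create_earray(
        vlarray._v_parent, name, atom=vlarray.atom, shape=(0, tail)
    )
    for row in vlarray:  # each row is still read in full, but only once
        if len(row) < tail:
            raise ValueError("row shorter than the requested tail")
        # EArray.append expects the extendable (first) dimension explicitly.
        tails.append(np.asarray(row[-tail:]).reshape(1, tail))
    return tails
```

After that one-off pass, a read such as `tails[i, 1500:2000]` is a partial read of a fixed-shape dataset, so the full row no longer has to be transferred.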

Thank you for a very clarifying answer!

Best regards,
Jon Olav



_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
