2011/3/28 Adriano Vilela Barbosa <adriano.vil...@yahoo.com>

> > Okay.  The problem was twofold.  First of all, a bug in the way
> > PyTables deals with the defaults caused the MemoryError (this has been
> > fixed in trunk).  Secondly, due to HDF5 limitations, you cannot use
> > atoms that are larger than 64 KB.  The canonical way to handle this is
> > to add more dimensions to the datasets in HDF5 and then use the slice
> > selection capabilities to retrieve the images.  Look at this:
>
> Actually, what you did below was the first thing I tried when moving away
> from strings. However, it resulted in my code running dozens of times
> slower and my HDF files being considerably bigger. That's why I tried using
> bigger atoms (one atom per optical flow frame), to see if this would run
> faster and/or produce smaller files, and then I ran into the error I
> reported.
>
> However, I later noticed that the shape of your array is
>
> array_shape = (n_frames, n_rows, n_cols)
>
> whereas I had tried
>
> array_shape = (n_rows, n_cols, n_frames)
>
> This makes a huge difference. Using a shape (n_frames, n_rows, n_cols) for
> the CArray results in the code running only about 15% slower and producing
> a file only about 10% bigger when compared to using strings. This is much
> better than the results I was getting when using a shape
> (n_rows, n_cols, n_frames). I guess this has to do with the way the data is
> laid out on disk?
>

Yes.  Data on disk is written in C order, so the leading dimension varies
the slowest.  Make sure the frame index is that leading dimension (i.e. as
I have set it).
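
A small sketch of why the ordering matters, in plain Python with no PyTables
involved.  The helper names (c_order_offset, frame_offsets) and the toy
dimensions are made up for illustration; the code only computes where each
element of one frame lands in a C-ordered flat buffer, and it ignores any
chunking that HDF5 applies on top.

```python
def c_order_offset(index, shape):
    """Flat element offset of `index` in a C-ordered array of `shape`."""
    offset = 0
    for i, n in zip(index, shape):
        # Horner-style accumulation: the last axis ends up varying fastest.
        offset = offset * n + i
    return offset


def frame_offsets(frame, n_frames, n_rows, n_cols, frames_first):
    """Sorted flat offsets of every element that belongs to one frame."""
    offsets = []
    for r in range(n_rows):
        for c in range(n_cols):
            if frames_first:
                idx, shape = (frame, r, c), (n_frames, n_rows, n_cols)
            else:
                idx, shape = (r, c, frame), (n_rows, n_cols, n_frames)
            offsets.append(c_order_offset(idx, shape))
    return sorted(offsets)


# Toy sizes (hypothetical, not the real optical flow dimensions).
first = frame_offsets(1, n_frames=4, n_rows=2, n_cols=3, frames_first=True)
last = frame_offsets(1, n_frames=4, n_rows=2, n_cols=3, frames_first=False)

print(first)  # [6, 7, 8, 9, 10, 11]: one contiguous run
print(last)   # [1, 5, 9, 13, 17, 21]: one element every n_frames positions
```

With the frame index first, reading one frame touches a single contiguous
block; with the frame index last, the same frame is scattered across the
whole buffer.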


> As for the atom size limit (64 kB), I guess that doesn't apply to string
> atoms? When using strings, I construct the atom in the following way
>
> array_atom = tables.StringAtom(len(matrix.tostring()))
>
> where len(matrix.tostring()) = 691200 bytes = 675 kB.
>
> I mean, the size of the string atom is well above the 64 kB limit and yet
> it doesn't produce any errors.
>

To be exact, the problem is not the atom size, but rather the maximum
attribute size.  In this case, one attribute is used to keep the defaults
for the atom, and it cannot be larger than 64 KB.  Perhaps I should avoid
writing the attribute when the defaults have, well, the default value (i.e.
zero).  I'm not certain why this problem does not affect the string types,
though.
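
For reference, a rough sanity check of the numbers in this thread.  The
constant and helper names are hypothetical, and the 64 KB figure is simply
the one quoted above, not something queried from HDF5 at runtime; whether
the real limit is inclusive is an assumption here.

```python
ATTR_LIMIT_BYTES = 64 * 1024  # attribute size limit as quoted in this thread


def defaults_fit_in_attribute(atom_nbytes, limit=ATTR_LIMIT_BYTES):
    """True if a defaults value of `atom_nbytes` bytes fits under `limit`."""
    return atom_nbytes <= limit


frame_nbytes = 691200  # len(matrix.tostring()) reported above

print(frame_nbytes / 1024)                      # 675.0 (kB)
print(defaults_fit_in_attribute(frame_nbytes))  # False
```

So a one-frame atom is roughly ten times larger than what a single attribute
can hold, which matches the failure when the defaults are written out.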

-- 
Francesc Alted
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
