2011/3/28 Adriano Vilela Barbosa <adriano.vil...@yahoo.com>

> > Okay. The problem was two-fold. First of all, a bug in the way
> > PyTables deals with the defaults caused the MemoryError (this has been
> > fixed in trunk). Secondly, and due to HDF5 limitations, you cannot use
> > atoms that are larger than 64 KB. The canonical way to handle this is
> > to add more dimensions to the datasets in HDF5 and then use the slice
> > selection capabilities to retrieve the images. Look at this:
>
> Actually, what you did below was the first thing I tried when moving away
> from strings. However, it resulted in my code running dozens of times
> slower and my HDF files being quite a bit bigger. That's why I tried using
> bigger atoms (one atom per optical flow frame), to see if this would run
> faster and/or produce smaller files, and then I ran into the error I
> reported.
>
> However, I later noticed that the shape of your array is
>
> array_shape = (n_frames, n_rows, n_cols)
>
> whereas I had tried
>
> array_shape = (n_rows, n_cols, n_frames)
>
> This makes a huge difference. Using a shape (n_frames, n_rows, n_cols) for
> the CArray results in the code running only about 15% slower and producing
> a file only about 10% bigger when compared to using strings. This is much
> better than the results I was getting when using a shape
> (n_rows, n_cols, n_frames). I guess this has to do with the way the data
> is laid out on disk?
>
Yes. Data on disk is written in C order, so you must make sure that the
leading dimension varies the slowest (i.e. as I have set it).

> As for the atom size limit (64 kB), I guess that doesn't apply to string
> atoms? When using strings, I construct the atom in the following way
>
> array_atom = tables.StringAtom(len(matrix.tostring()))
>
> where len(matrix.tostring()) = 691200 bytes = 675 kB.
>
> I mean, the size of the string atom is well above the 64 kB limit and yet
> it doesn't produce any errors.

To be exact, the problem is not the atom size, but rather the maximum
attribute size. In this case, one attribute is used to keep the defaults
for the atom, and it cannot be larger than 64 KB. Perhaps I should avoid
writing the attribute when the defaults have, well, the default value
(i.e. zero). I'm not certain why this problem does not affect the string
types, though.

-- 
Francesc Alted
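
A rough sketch of the C-order point above, in pure Python so it runs without PyTables or HDF5. It computes how far apart the bytes of a single frame end up under each of the two shapes discussed. The helper names (c_order_strides, frame_byte_span) and the small dimensions are made up for illustration, and the sketch deliberately ignores HDF5 chunking, so it only shows the plain row-major intuition, not what PyTables actually writes to disk.

```python
def c_order_strides(shape, itemsize):
    """Byte strides of a C-ordered (row-major) array: the last axis varies fastest."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def frame_byte_span(shape, frame_axis, itemsize):
    """Bytes from the first to the last byte of one frame (fixed index on frame_axis)."""
    strides = c_order_strides(shape, itemsize)
    span = itemsize
    for axis, (dim, stride) in enumerate(zip(shape, strides)):
        if axis != frame_axis:
            span += (dim - 1) * stride
    return span


# Hypothetical sizes, chosen small so the numbers are easy to check by hand.
n_frames, n_rows, n_cols = 100, 4, 6
itemsize = 4  # e.g. a 4-byte float

frame_nbytes = n_rows * n_cols * itemsize

# Frames on the leading (slowest-varying) axis: one frame is one contiguous run.
frames_first = frame_byte_span((n_frames, n_rows, n_cols), 0, itemsize)

# Frames on the trailing (fastest-varying) axis: one frame is scattered
# across almost the whole array.
frames_last = frame_byte_span((n_rows, n_cols, n_frames), 2, itemsize)

print(frame_nbytes, frames_first, frames_last)
```

With these sizes a frame holds 96 bytes. In the (n_frames, n_rows, n_cols) layout those 96 bytes are adjacent, while in the (n_rows, n_cols, n_frames) layout they are spread over 9204 of the array's 9600 bytes, which is consistent with the slowdown reported for the second shape.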
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users