Re: [Pytables-users] PerformanceWarning: The Leaf ``/tetrahedrons`` is exceeding the maximum recommended rowsize

Dominik Szczerba Mon, 13 Dec 2010 05:57:02 -0800

On Mon, Dec 13, 2010 at 2:20 PM, Francesc Alted <fal...@pytables.org> wrote:


> On Monday 13 December 2010 14:05:28 Dominik Szczerba wrote:
> > On Mon, Dec 13, 2010 at 1:48 PM, Francesc Alted <fal...@pytables.org>
> wrote:
> > > As we know, HDF5 is agnostic about how the data in the file is ordered.
> > > So, if you have created the dataset using a Fortran program, then
> > > clearly the data is ordered column-wise on disk.  But, as you are
> > > reading the file by using a C-based app, then columns and rows
> > > will appear to be *transposed*.
> > >
> > > So, if what you want is to read column i *of your original Fortran
> > > array*, then the correct way to do this in PyTables should be:
> > >
> > > for i in range(NCELL):
> > >   col = tetrahedrons[i,:]
> >
> > This does not work. It only works as I wrote previously. Please see
> > below:
> >
> > In [3]: tets = array(fid.getNode("/tetrahedrons").read())
> > In [4]: tets.shape
> > Out[4]: (4, 4624802)
> > In [5]: tets[:,0]
> > Out[5]: array([715692, 707733, 707734, 159966], dtype=int32)
> > In [6]: tets[0,:]
> > Out[6]: array([715692, 365237, 555693, ..., 706208, 706208, 511217],
> > dtype=int32)
> >
> > so in tetrahedrons[i,:] the index i runs over 0..3 and not 0..NCELL-1
> >
> > Did you make a typo above, or have we not reached a conclusion yet?
>
> That was not a typo, but a mistake on my part (I forgot that HDF5
> reverses the shape of the matrices when using the Fortran binding).  So,
> yes, your version is okay for accessing columns.
>
> But to know whether accessing columns this way is efficient in your
> case, I'd need more info on your datasets.  Are they contiguous or
> chunked?  If chunked, what chunkshape have you chosen?
>
>
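For reference, the shape reversal discussed above can be reproduced with plain NumPy, without touching the HDF5 file. This is only a sketch: the toy size `ncell = 5` and the assumed Fortran-side layout of (ncell, 4) are illustrative assumptions, not something stated in the thread.

```python
import numpy as np

# Hypothetical toy size; the dataset in this thread has 4624802 cells.
ncell = 5

# Assumed Fortran-side layout: one row per cell, 4 node ids per cell,
# stored column-major (first index varies fastest).
fortran_side = np.asfortranarray(
    np.arange(ncell * 4, dtype=np.int32).reshape(ncell, 4)
)

# The bytes as they would be laid out in Fortran (column-major) order.
raw = fortran_side.tobytes(order="F")

# A C-ordered reader interprets the same bytes with the shape reversed.
c_side = np.frombuffer(raw, dtype=np.int32).reshape(4, ncell)

print(c_side.shape)          # shape is reversed relative to the Fortran side
print(c_side[:, 0])          # the 4 node ids of cell 0
print(c_side[0, :].shape)    # one value per cell
print(c_side[:, 0].strides)  # strided access: ncell * 4 bytes between items
```

Under these assumptions `c_side` equals `fortran_side.T`, which matches what In [4] to In [6] show: `tets[:,i]` yields 4 values and `tets[i,:]` yields one value per cell.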
Both. Files saved from MATLAB are uncompressed and contiguous; the ones
saved from my program are usually compressed and chunked, with a chunk
size of around 1024^2/sizeof(type).
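
To put rough numbers on that chunk size, here is a small arithmetic sketch. The dataset shape and the int32 dtype come from the session output quoted above; the two candidate chunkshapes are purely hypothetical (the actual chunkshape is not given here), and the estimate ignores any chunk caching that HDF5 may do.

```python
import math
import numpy as np

shape = (4, 4624802)                    # dataset shape as seen from PyTables
itemsize = np.dtype(np.int32).itemsize  # bytes per element for int32

# Elements per chunk for a chunk of 1024^2 bytes.
chunk_elems = 1024**2 // itemsize
print(chunk_elems)

def chunks_touched_by_column(shape, chunkshape):
    """Number of chunks intersected by a read of tets[:, i]."""
    return math.ceil(shape[0] / chunkshape[0])

def bytes_read_per_column(shape, chunkshape, itemsize):
    """Bytes of chunk data that must be loaded to serve tets[:, i]."""
    n_chunks = chunks_touched_by_column(shape, chunkshape)
    return n_chunks * chunkshape[0] * chunkshape[1] * itemsize

# Two hypothetical chunkshapes, both holding chunk_elems elements.
for chunkshape in [(4, 65536), (1, 262144)]:
    print(chunkshape,
          chunks_touched_by_column(shape, chunkshape),
          bytes_read_per_column(shape, chunkshape, itemsize))
```

In both hypothetical cases a single column read returns only 16 useful bytes but has to load at least one full chunk, so looping over individual columns could be costly on the compressed files unless the chunks are cached or the columns are read in larger blocks.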

Many thanks and regards,
Dominik
