Re: [Pytables-users] PerformanceWarning: The Leaf ``/tetrahedrons`` is exceeding the maximum recommended rowsize

Francesc Alted Mon, 13 Dec 2010 05:21:28 -0800

On Monday 13 December 2010 14:05:28 Dominik Szczerba wrote:
> On Mon, Dec 13, 2010 at 1:48 PM, Francesc Alted <[email protected]> wrote:
> > As we know, HDF5 is agnostic about how the data in the file is ordered.
> > So, if you have created the dataset using a Fortran program, then
> > clearly the data is ordered column-wise on disk.  But, as you are
> > reading the file by using a C-based app, then columns and rows
> > will appear to be *transposed*.
> > 
> > So, if what you want is to read column i *of your original Fortran
> > array*, then the correct way to do this in PyTables should be:
> > 
> > for i in range(NCELL):
> >   col = tetrahedrons[i,:]
> 
> This does not work. It only works as I wrote previously. Please see
> below:
> 
> In [3]: tets = array(fid.getNode("/tetrahedrons").read())
> In [4]: tets.shape
> Out[4]: (4, 4624802)
> In [5]: tets[:,0]
> Out[5]: array([715692, 707733, 707734, 159966], dtype=int32)
> In [6]: tets[0,:]
> Out[6]: array([715692, 365237, 555693, ..., 706208, 706208, 511217],
> dtype=int32)
> 
> so tetrahedrons[i,:] runs 0..3 and not 0...NC-1
> 
> Did you make a typo above, or have we not reached a conclusion yet?


That was not a typo, but a mistake on my part (I forgot that HDF5 
reverses the shape of the matrices when using the Fortran binding).  So, 
yes, your version is okay for accessing columns.
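
To make the shape reversal concrete, here is a rough sketch in plain Python. The sizes and helper names are made up for illustration and are not taken from your file; the flat list only stands in for the bytes on disk. It shows that a column of a column-major (Fortran) array occupies the same contiguous bytes as a row of the row-major (C) array with the reversed shape, while a column of that C array is strided.

```python
# Illustrative sizes only; NCELL here is a made-up small number.
NCELL = 5
fortran_shape = (NCELL, 4)      # dims as declared in a hypothetical Fortran program
c_shape = fortran_shape[::-1]   # dims as a C-based reader sees them: (4, NCELL)

# One flat buffer stands in for the bytes on disk.
buf = list(range(NCELL * 4))

def fortran_column(buf, j, shape):
    """Column j of a column-major (Fortran-ordered) 2-D array."""
    nrows, _ = shape
    return [buf[i + j * nrows] for i in range(nrows)]

def c_row(buf, i, shape):
    """Row i of a row-major (C-ordered) 2-D array."""
    _, ncols = shape
    return [buf[i * ncols + j] for j in range(ncols)]

def c_column(buf, j, shape):
    """Column j of a row-major (C-ordered) 2-D array (strided access)."""
    nrows, ncols = shape
    return [buf[i * ncols + j] for i in range(nrows)]

# Same bytes, two views: Fortran column 1 is C row 1 of the reversed shape.
print(fortran_column(buf, 1, fortran_shape))
print(c_row(buf, 1, c_shape))
# A C column picks one element out of every row, so it jumps through the buffer.
print(c_column(buf, 0, c_shape))
```

So with the reversed shape, something like tets[:,0] touches one element from each of the 4 long rows rather than one contiguous run.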

But to tell whether accessing columns this way is efficient in your case, 
I'd need more info on your datasets.  Are they contiguous or chunked?  
If chunked, which chunkshape did you choose?
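
In case it helps, below is a minimal sketch of how you could check this from Python. It assumes the leaf object exposes a `chunkshape` attribute that is None for contiguous (non-chunked) datasets, which is how I expect PyTables leaves to behave; `describe_layout` is a hypothetical helper name and the file name in the comment is a placeholder.

```python
def describe_layout(leaf):
    """Return a short description of how a leaf is laid out on disk.

    Assumption: `leaf.chunkshape` is a tuple for chunked datasets and
    None for contiguous ones; objects without the attribute are
    reported as contiguous too.
    """
    chunkshape = getattr(leaf, "chunkshape", None)
    if chunkshape is None:
        return "contiguous"
    return "chunked, chunkshape=%s" % (tuple(chunkshape),)

# Possible usage (file name is a placeholder; recent PyTables spells the
# calls open_file/get_node, older releases openFile/getNode):
#
#   import tables
#   fid = tables.open_file("mesh.h5", "r")
#   print(describe_layout(fid.get_node("/tetrahedrons")))
#   fid.close()
```

If the dataset turns out to be chunked, the chunkshape tells you how many chunks a single column read has to touch.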

-- 
Francesc Alted

