LittleBigBrain :
> Hi Everyone,
>
> I am testing the I/O speed on my machine.
> I got a array in HDF5 file:
> File(xxx.h5, title='', mode='r', rootUEP='/',
> filters=Filters(complevel=0, shuffle=False, fletcher32=False))
> / (RootGroup) ''
> /rawSignal (Array(3000000, 24)) ''
>   atom := Float64Atom(shape=(), dflt=0.0)
>   maindim := 0
>   flavor := 'numpy'
>   byteorder := 'big'
>   chunkshape := None
>
> h5file=openFile("xxx.h5", mode = "r")
> a1=h5file.root.rawSignal
>
> then I try to load some chunks into a numpy array:
> import numpy as npy
> def test(a_len):
>     a=npy.zeros((a_len,1))
>     t1=time.time()
>     a[:,0]=a1[:a_len,1]
>     t2=time.time()-t1
>     print 'a size = ',a.shape,'dt = ',t2
>
> Then I got following results:
> data lenght = 100.0
> a size = (100L, 1L) dt = 0.0
> data lenght = 1000.0
> a size = (1000L, 1L) dt = 0.0310001373291
> data lenght = 10000.0
> a size = (10000L, 1L) dt = 0.281000137329
> data lenght = 100000.0
> a size = (100000L, 1L) dt = 2.70299983025
> data lenght = 1000000.0
> a size = (1000000L, 1L) dt = 27.0629999638
>
> another file:
> File(filename=yyy.h5, title='', mode='r', rootUEP='/',
> filters=Filters(complevel=0, shuffle=False, fletcher32=False))
> / (RootGroup) ''
> /rawSignal (CArray(3000000, 24)) ''
>   atom := Float64Atom(shape=(), dflt=0.0)
>   maindim := 0
>   flavor := 'numpy'
>   byteorder := 'little'
>   chunkshape := (1365, 24)
>
> a size = (100L, 1L) dt = 0.0
> a size = (1000L, 1L) dt = 0.0
> a size = (10000L, 1L) dt = 0.0160000324249
> a size = (100000L, 1L) dt = 0.0149998664856
> a size = (1000000L, 1L) dt = 0.31200003624
>
> Is this speed normal? So the major difference is due to the chunkshape I
> guess. But how could I specify the chunckshape when I open a file
> without a chunkshape infomation?
>
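For reference, the quoted timings can be turned into throughput figures. The sketch below is only my own arithmetic on those numbers, under a few assumptions: it counts just the bytes of the single float64 column that is delivered, takes 1 MB as 10**6 bytes, and skips the dt = 0.0 entries, which presumably fall below the timer resolution. The helper name mb_per_minute is made up for this illustration.

```python
# Timings copied from the quoted runs: (rows read, seconds elapsed).
contiguous = [(1000, 0.0310001373291), (10000, 0.281000137329),
              (100000, 2.70299983025), (1000000, 27.0629999638)]
chunked = [(10000, 0.0160000324249), (100000, 0.0149998664856),
           (1000000, 0.31200003624)]

BYTES_PER_VALUE = 8   # Float64Atom
N_COLUMNS = 24        # second dimension of /rawSignal


def mb_per_minute(rows, seconds):
    """Useful data delivered (one float64 column) in MB per minute.

    Assumption: 1 MB = 10**6 bytes; only the requested column is counted.
    """
    return rows * BYTES_PER_VALUE / 1e6 / seconds * 60.0


for label, runs in (("Array (contiguous)", contiguous),
                    ("CArray (chunked)", chunked)):
    for rows, seconds in runs:
        print("%-20s rows=%8d  %10.1f MB/min"
              % (label, rows, mb_per_minute(rows, seconds)))

# Extent of the file covered by the first 1,000,000 rows (all 24 columns).
span_mb = 1000000 * N_COLUMNS * BYTES_PER_VALUE / 1e6

# Ratio of the two 1,000,000-row timings.
slowdown = 27.0629999638 / 0.31200003624
```

For the 1,000,000-row read this works out to about 17.7 MB/min for the contiguous Array, and the chunked CArray finishes roughly 87 times sooner. Since the contiguous dataset is stored row by row, those 1,000,000 values of column 1 are spread over 192 MB of the file.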
I checked the HDF5 documentation again. The H5D_CONTIGUOUS layout should not be this slow (about 15 MB/min). Is there any parameter I should specify in PyTables to make I/O for this layout faster?

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
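One thing that might be worth trying in the meantime, although I have not verified that it helps with this file: instead of asking for a single column directly (a1[:a_len,1]), read whole rows in blocks and pick the column out in memory. For a contiguous, row-major dataset a range of full rows is one contiguous region of the file. The sketch below is only an illustration; the function name read_column_blockwise and the default block size are my own choices, and it is exercised here on a plain NumPy array standing in for the PyTables node.

```python
import numpy as np


def read_column_blockwise(arr, col, n_rows, block_rows=65536):
    """Copy arr[:n_rows, col] into an (n_rows, 1) array, block by block.

    `arr` is assumed to support 2-D row slicing and to return a NumPy
    array, as a PyTables Array node or a plain ndarray does.
    `block_rows` is an arbitrary illustrative default, not a tuned value.
    """
    out = np.zeros((n_rows, 1))
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        block = arr[start:stop]              # read full rows for this block
        out[start:stop, 0] = block[:, col]   # select the column in memory
    return out


# Stand-in data with the same number of columns as /rawSignal.
demo = np.arange(10 * 24, dtype=np.float64).reshape(10, 24)
column = read_column_blockwise(demo, col=1, n_rows=7, block_rows=3)
```

With the file from the original post the call would presumably look like read_column_blockwise(h5file.root.rawSignal, 1, 1000000). Each block of 65536 rows of 24 float64 values takes about 12.6 MB of memory, so block_rows can be lowered if that is too much.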