LittleBigBrain :
> Hi Everyone,
>
> I am testing the I/O speed on my machine.
> I have an array in an HDF5 file:
> File(xxx.h5, title='', mode='r', rootUEP='/',
> filters=Filters(complevel=0, shuffle=False, fletcher32=False))
> / (RootGroup) ''
> /rawSignal (Array(3000000, 24)) ''
> atom := Float64Atom(shape=(), dflt=0.0)
> maindim := 0
> flavor := 'numpy'
> byteorder := 'big'
> chunkshape := None
> from tables import openFile
> h5file = openFile("xxx.h5", mode="r")
> a1 = h5file.root.rawSignal
> Then I try to load some chunks into a numpy array:
> import time
> import numpy as npy
> def test(a_len):
>     a = npy.zeros((a_len, 1))
>     t1 = time.time()
>     a[:, 0] = a1[:a_len, 1]  # read column 1 of the first a_len rows
>     t2 = time.time() - t1
>     print 'a size = ', a.shape, 'dt = ', t2
> Then I got the following results:
> data lenght = 100.0
> a size = (100L, 1L) dt = 0.0
> data lenght = 1000.0
> a size = (1000L, 1L) dt = 0.0310001373291
> data lenght = 10000.0
> a size = (10000L, 1L) dt = 0.281000137329
> data lenght = 100000.0
> a size = (100000L, 1L) dt = 2.70299983025
> data lenght = 1000000.0
> a size = (1000000L, 1L) dt = 27.0629999638
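
As a rough check on these timings, the effective throughput can be worked out from the number of values returned and the elapsed time. The sketch below counts only the float64 values actually delivered (8 bytes each), not the bytes HDF5 may have touched on disk. The helper name throughput_mb_per_min is made up for illustration. For the 1,000,000-row read it gives a bit under 18 MB/min, the same order as the 15MB/min quoted in this message.

```python
def throughput_mb_per_min(n_values, seconds, bytes_per_value=8):
    """Useful data rate in MB/min (1 MB = 1e6 bytes); float64 is 8 bytes."""
    return n_values * bytes_per_value / 1e6 / seconds * 60.0


# Timings for the 1,000,000-row read, taken from the two runs in this post.
contiguous = throughput_mb_per_min(1000000, 27.0629999638)  # Array, chunkshape None
chunked = throughput_mb_per_min(1000000, 0.31200003624)     # CArray, chunkshape (1365, 24)

print("contiguous: %.1f MB/min" % contiguous)
print("chunked:    %.1f MB/min" % chunked)
print("ratio:      %.1fx" % (chunked / contiguous))
```
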
>
> another file:
> File(filename=yyy.h5, title='', mode='r', rootUEP='/',
> filters=Filters(complevel=0, shuffle=False, fletcher32=False))
> / (RootGroup) ''
> /rawSignal (CArray(3000000, 24)) ''
> atom := Float64Atom(shape=(), dflt=0.0)
> maindim := 0
> flavor := 'numpy'
> byteorder := 'little'
> chunkshape := (1365, 24)
>
> a size = (100L, 1L) dt = 0.0
> a size = (1000L, 1L) dt = 0.0
> a size = (10000L, 1L) dt = 0.0160000324249
> a size = (100000L, 1L) dt = 0.0149998664856
> a size = (1000000L, 1L) dt = 0.31200003624
> Is this speed normal? So the major difference is due to the chunkshape, I
> guess. But how could I specify the chunkshape when I open a file
> without chunkshape information?
>   
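
In HDF5 the storage layout is fixed when a dataset is created, so a chunkshape cannot be supplied when opening an existing file. One possible workaround, not confirmed in this thread, is to copy the data once into a new CArray that has an explicit chunkshape. The sketch below uses the PyTables 3.x spellings (open_file, create_carray); under the 2.x API used above the equivalents are openFile and createCArray. The function name copy_to_chunked, the destination path, and the default block_rows are assumptions made for illustration.

```python
import numpy as np
import tables


def copy_to_chunked(src_path, dst_path, node_path="/rawSignal",
                    chunkshape=(1365, 24), block_rows=65536):
    """Copy a 2-D array node into a new file as a CArray with the given chunkshape."""
    with tables.open_file(src_path, mode="r") as src, \
            tables.open_file(dst_path, mode="w") as dst:
        node = src.get_node(node_path)
        out = dst.create_carray(dst.root, node.name, atom=node.atom,
                                shape=node.shape, chunkshape=chunkshape)
        # Copy whole rows in blocks so the full array never has to fit in RAM.
        n_rows = node.shape[0]
        for start in range(0, n_rows, block_rows):
            stop = min(start + block_rows, n_rows)
            out[start:stop, :] = node[start:stop, :]
```

A call such as copy_to_chunked("xxx.h5", "xxx_chunked.h5") would then write a chunked copy, and the timing test could be repeated against the new file. Whether this recovers the speed seen with yyy.h5 would still have to be measured.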

I checked the HDF5 documentation again. The H5D_CONTIGUOUS layout should
not be that slow (15MB/min). Is there any parameter I should specify in
PyTables to make I/O for this layout faster?
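
One thing that could be tried without rewriting the file is sketched below. It assumes the slow part is the strided selection of a single column from a row-major contiguous dataset, which has not been verified here. The idea is to read whole rows in blocks, which is a sequential read, and pick the column in memory. The helper name read_column_blockwise and the block size are made up for illustration, and a plain numpy array stands in for h5file.root.rawSignal so that the snippet runs on its own.

```python
import numpy as np


def read_column_blockwise(node, col, n_rows, block_rows=65536):
    """Return node[:n_rows, col], fetching full-row blocks and slicing in RAM."""
    out = np.empty(n_rows, dtype=np.float64)
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        block = node[start:stop, :]      # whole rows: one contiguous read
        out[start:stop] = block[:, col]  # column selection happens in memory
    return out


# Stand-in for the HDF5 node; any 2-D object that supports slicing works.
data = np.arange(10 * 24, dtype=np.float64).reshape(10, 24)
column = read_column_blockwise(data, col=1, n_rows=7, block_rows=3)
```

The trade-off is that all 24 columns are transferred in order to keep one of them, so this only pays off if the strided read really is the bottleneck.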

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
