HDF5 files support compression. It is enabled via a flag when writing the dataset; when reading, the data are decompressed automatically. I assume that compression would greatly reduce the file size.
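A minimal sketch of what that could look like, combined with the UInt8 suggestion quoted below. It assumes a recent HDF5.jl release in which `create_dataset` replaces the `d_create` call used in the quoted code and accepts `chunk` and `deflate` keywords; older releases spell these options differently. The stand-in data, the chunk shape, and the compression level 3 are arbitrary choices for illustration, not tuned values. HDF5 only applies compression filters to chunked datasets, which is why a chunk shape is given.

```julia
using HDF5

# Stand-in for the real data: small integer values, mostly zeros (assumption).
D = rand(UInt8[0, 0, 0, 1, 2], 1000, 17)

# Write to a temporary location so nothing existing is overwritten.
path = joinpath(mktempdir(), "D.h5")

h5open(path, "w") do fid
    # One byte per value (UInt8), chunked layout, deflate (gzip) level 3.
    dset = create_dataset(fid, "D", datatype(UInt8), dataspace(size(D));
                          chunk=(1000, 17), deflate=3)
    dset[:, :] = D
end

# Reading needs no extra flag; decompression happens transparently.
D2 = h5read(path, "D")

println("bytes in memory: ", sizeof(D))
println("bytes on disk:   ", filesize(path))
```

How much the file shrinks depends on the data and on the chunk shape, so it is worth comparing `filesize` with and without the `deflate` keyword on a sample of the real data.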
-erik

On Tue, Jul 21, 2015 at 1:21 PM, Stefan Karpinski <[email protected]> wrote:

> In your example data, each value is represented with two bytes: one for
> the value, one for a comma or newline. Each Int64 value is 8 bytes. If all
> your values are between 0 and 255, you could use UInt8 to represent them
> and cut the size in half.
>
> On Tue, Jul 21, 2015 at 1:16 PM, paul analyst <[email protected]>
> wrote:
>
>> I have data in a txt file, some millions of lines like this:
>> 0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>> 0,0,0,0,0,0,0,2,0,0,0,2,0,0,0,0,1
>> 0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1
>>
>> Encoding: win1250.
>>
>> The size of dane.txt is 1.3 GB.
>>
>> D=readcsv("dane.txt")
>> k,l=size(D)
>>
>> using HDF5, JLD
>> hfi=h5open("D.h5","w")
>> close(hfi)
>>
>> fid = h5open("D.h5","r+")
>> g = fid["/"]
>> dset1 = d_create(g, "/D", datatype(Int64), dataspace(k,l))
>> dset1[:,:]=D
>> close(fid)
>>
>> After saving to an h5 file, the file is 6.3 GB. Why is the new file 4 times bigger?
>> Paul
>>
>
>

-- 
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
