In my thesis, I'm working on a project that produces huge amounts of output 
in text files - about 25-30 GB spread across a million or more files per 
simulation run. If I compress the files using e.g. `tar --xz --create -f 
archive.tar.xz tracefiles/` I can reduce the size on disk by a factor of 5-6 
or even more. I postprocess all this data in Julia, and reading the data 
files seems to be a major bottleneck.
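
To make that concrete, here is a minimal sketch of the kind of round trip I mean, assuming GNU tar with xz support is installed. The file names and contents are made-up stand-ins for the real trace files, and everything happens in a temporary directory.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the real output directory, built in a temp dir.
workdir=$(mktemp -d)
mkdir "$workdir/tracefiles"
printf '1 2 3\n' > "$workdir/tracefiles/trace_0001.txt"
printf '4 5 6\n' > "$workdir/tracefiles/trace_0002.txt"

# Pack and compress in one step; the .tar.xz name matches the --xz compressor.
tar --xz --create -f "$workdir/archive.tar.xz" -C "$workdir" tracefiles

# List the archive members without unpacking anything to disk.
listing=$(tar --xz --list -f "$workdir/archive.tar.xz")

# Stream a single member to stdout instead of extracting it to a file.
first_trace=$(tar --xz --extract --to-stdout \
    -f "$workdir/archive.tar.xz" tracefiles/trace_0001.txt)

echo "$first_trace"

# Remove the temporary directory again.
rm -rf "$workdir"
```

Streaming with `--to-stdout` avoids unpacking a million small files, but a tar archive has no index, so pulling out one member still means decompressing and scanning from the start of the archive.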

Has any effort been made toward reading compressed tar archives directly in Julia? 
I've seen [ZipFile](https://github.com/fhs/ZipFile.jl) for handling the 
.zip format, but unfortunately `zip` isn't available on our cluster, while 
`tar` is.

If there hasn't been any work on this, I might take a stab at it sometime - 
but first I must finish my thesis...

// T
