On occasion, I need to sort extremely large files that compress well with programs such as gzip or bzip2.

I can emulate sorting a gzipped file, while keeping both input and output compressed, by using shell pipes, e.g.
   zcat in.gz | sort | gzip > out.gz
However, if there is not enough temporary space available for sort to write its intermediate files (usually in /tmp), then sort will fail.
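
A partial workaround, assuming a sort that accepts the -T option (GNU sort does), is to point sort at a directory on a filesystem with more free space. The sketch below is only an illustration; the function name sort_gz and the directory in the usage line are hypothetical.

```shell
# Sketch: sort a gzipped file, placing sort's temporary files in a
# caller-chosen directory instead of the default (usually /tmp).
# sort_gz and its argument names are hypothetical, for illustration only.
sort_gz() {
    src=$1 dst=$2 tmpdir=$3
    # -T tells sort where to write its intermediate files.
    gzip -dc < "$src" | sort -T "$tmpdir" | gzip > "$dst"
}
```

For example, `sort_gz in.gz out.gz /scratch/tmp`. This only moves the problem: the temporary files are still written uncompressed, so the chosen directory must hold roughly the uncompressed size of the input.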

In a similar vein, I can sort multiple large gzipped files, where each file is small enough to sort in the available temporary space but
the combined data is not, by making use of FIFOs:
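for i in *.gz
do
   # Sort each file on its own and keep the result compressed.
   zcat "$i" | sort | gzip > "$i.out"
   mkfifo "$i.out.fifo"
   # Decompress into the FIFO in the background; this blocks until sort reads it.
   zcat < "$i.out" > "$i.out.fifo" &
done
# Merge the already-sorted streams without re-sorting them.
sort -m *.out.fifo | gzip > out.gz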
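
The merge step above can be wrapped so that the FIFOs live in a private directory and are removed afterwards. This is a rough sketch under my own assumptions: merge_sorted_gz is a hypothetical helper, mktemp -d is assumed to be available, and the inputs are assumed to be gzipped and already sorted.

```shell
# Sketch: merge already-sorted gzipped files through FIFOs, then clean up.
# merge_sorted_gz is a hypothetical helper; usage:
#   merge_sorted_gz out.gz a.gz.out b.gz.out ...
merge_sorted_gz() {
    out=$1; shift
    [ "$#" -gt 0 ] || return 1
    fifodir=$(mktemp -d) || return 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        mkfifo "$fifodir/$n" || return 1
        # Each decompressor blocks until sort opens its FIFO for reading.
        gzip -dc < "$f" > "$fifodir/$n" &
    done
    # sort -m merges sorted inputs without re-sorting them.
    sort -m "$fifodir"/* | gzip > "$out"
    status=$?          # exit status of the final gzip
    wait               # let the background decompressors finish
    rm -rf "$fifodir"
    return $status
}
```

Error handling here is minimal: if sort exits before opening every FIFO, the remaining background writers stay blocked and the wait will not return, so a real script would need to kill them.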

However, things would be easier if sort, like tar, supported the use of compressors and decompressors (a) for its input and (b) for its temporary files. Are there any difficulties in adding such features to sort that I haven't envisaged?

Many thanks

Craig


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
