On Saturday, September 3, 2011, Edward Kirton <eskir...@lbl.gov> wrote:
> of course there is a computational cost to compressing/uncompressing
> files but that's probably better than storing unnecessarily huge
> files.  it's a trade-off.

It may still be faster overall because there is less disk IO to do; that
probably depends on your hardware.

> since i'm rapidly running out of storage, i think the best immediate
> solution for me is to deprecate all the fastq datatypes in favor of a
> new fastqsangergz and to bundle the read qc tools to eliminate
> intermediate files.  sure, users won't be able to play around with
> their data as much, but my disk is 88% full and my cluster has been
> 100% occupied for 2-months straight, so less choice is probably
> better.

In your position I agree that is a pragmatic choice. You might be able to
modify the file upload code to gzip any FASTQ files... that would prevent
uncompressed FASTQ getting into new histories.
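
As a rough sketch of what such an upload step could look like (this is not
Galaxy's actual upload code; the function names is_gzipped and gzip_in_place
and the ".gz" suffix convention are made up for illustration), using only the
Python standard library:

```python
import gzip
import os
import shutil

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def gzip_in_place(path, level=6):
    """Compress path to path + '.gz', delete the original, return the new path.

    Files that are already gzipped are left alone and returned unchanged.
    """
    if is_gzipped(path):
        return path
    gz_path = path + ".gz"
    with open(path, "rb") as src:
        with gzip.open(gz_path, "wb", compresslevel=level) as dst:
            # Stream in chunks so large FASTQ files are never held in memory.
            shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path
```

Checking the magic bytes rather than the file extension means a user who
uploads an already compressed file does not get it compressed twice.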

I wonder if Galaxy would benefit from a new fastqsanger-gzip (etc) datatype?
However, compression seems generally useful (not just for FASTQ), so perhaps a
more general mechanism would be better, where tool XML files can say which file
types they accept and which of those can/must be compressed (possibly not
just gzip format?).
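
On the tool side, one way this might work is for tools to open their inputs
through a helper that detects the compression itself. The following is only a
sketch under my own assumptions (the names OPENERS, open_maybe_compressed and
count_fastq_records are hypothetical, nothing here exists in Galaxy), and it
assumes plain four-line FASTQ records:

```python
import bz2
import gzip

# Hypothetical table of supported formats: leading magic bytes -> opener.
OPENERS = {
    b"\x1f\x8b": gzip.open,  # gzip
    b"BZh": bz2.open,        # bzip2
}


def open_maybe_compressed(path, mode="rt"):
    """Open path as text, decompressing on the fly if it is gzip or bzip2."""
    with open(path, "rb") as handle:
        head = handle.read(3)
    for magic, opener in OPENERS.items():
        if head.startswith(magic):
            return opener(path, mode)
    # No known magic bytes, so treat the file as uncompressed.
    return open(path, mode)


def count_fastq_records(path):
    """Count records, assuming exactly four lines per FASTQ record."""
    with open_maybe_compressed(path) as handle:
        return sum(1 for _ in handle) // 4
```

A tool written like this would not care whether it is handed fastqsanger or a
compressed variant, so the tool XML would only need to declare which
compression formats it can cope with.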

Peter
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/
