On Saturday, September 3, 2011, Edward Kirton <eskir...@lbl.gov> wrote:
> of course there is a computational cost to compressing/uncompressing
> files but that's probably better than storing unnecessarily huge
> files. it's a trade-off.

It may still be faster due to less IO; it probably depends on your hardware.

> since i'm rapidly running out of storage, i think the best immediate
> solution for me is to deprecate all the fastq datatypes in favor of a
> new fastqsangergz and to bundle the read qc tools to eliminate
> intermediate files. sure, users won't be able to play around with
> their data as much, but my disk is 88% full and my cluster has been
> 100% occupied for 2-months straight, so less choice is probably
> better.

In your position I agree that is a pragmatic choice. You might be able to modify the file upload code to gzip any FASTQ files... that would prevent uncompressed FASTQ getting into new histories.

I wonder if Galaxy would benefit from a new fastqsanger-gzip (etc) datatype? However, this seems generally useful (not just for FASTQ), so perhaps a more general mechanism would be better, where tool XML files can say which file types they accept and which of those can/must be compressed (possibly not just gzip format?).

Peter
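
P.S. To make the "gzip on upload" idea concrete, here is a minimal sketch using only the Python standard library. It is not Galaxy's upload code: the helper names (is_gzipped, gzip_fastq, open_fastq) are hypothetical, and where such a hook would sit in the upload path is an assumption left open here.

```python
import gzip
import os
import shutil

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def gzip_fastq(path):
    """Compress path to path + '.gz', delete the original, return the new path.

    Files that are already gzipped are returned unchanged.
    """
    if is_gzipped(path):
        return path
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        # Stream in chunks so large FASTQ files are never held in memory.
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def open_fastq(path):
    """Open a FASTQ file for text reading, gzipped or not."""
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "r")
```

A tool wrapper could then read through open_fastq() and not care whether the dataset on disk is compressed, which is roughly what a more general "can/must be compressed" declaration in the tool XML would have to arrange.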