Hello Zack,

There are two options:
 - use the public Main server, but run the data through in batches
 - use a local or cloud instance, for the whole project or in batches
   http://getgalaxy.org

To decide how to batch the data, run just one of your datasets through your planned analysis pipeline and note how much disk it uses at each step. From that you can decide how many datasets to run at a time. Also consider how much disk needs to be reserved for the summary data that will be pooled at the end for the final steps. It is worth confirming that the public instance is the right resource for your project as a whole before you start.
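
To make that concrete, here is a back-of-the-envelope estimate in plain Python. Every number is an illustrative placeholder, not your actual quota or footprint; substitute the figures from your own trial run:

    # All numbers below are placeholders -- use your own quota and the
    # peak disk usage measured from a single trial dataset.
    quota_gb      = 250.0   # account quota (placeholder; check your user preferences)
    raw_gb        = 50.0    # one raw input dataset
    peak_extra_gb = 100.0   # intermediates + outputs at the pipeline's widest point
    summary_gb    = 2.0     # pooled summary kept per dataset for the final steps
    n_datasets    = 8

    per_dataset_peak_gb = raw_gb + peak_extra_gb
    reserved_gb = summary_gb * n_datasets   # held back for the pooled final steps
    usable_gb = quota_gb - reserved_gb

    batch_size = max(1, int(usable_gb // per_dataset_peak_gb))
    print("Run %d dataset(s) per batch; purge between batches." % batch_size)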

Often a workflow has many intermediate steps where the same dataset is changed in small ways as it is transformed. These intermediate files are usually not needed and can be purged (permanently deleted) once processing is past a certain point and the results are confirmed to be OK. There are also steps where very large raw data is reduced to much smaller summary data; the large data can often be archived and then purged.
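
As a minimal sketch of that cleanup step, assuming the BioBlend Python client (a separate library) and an API key for the server; the URL, key, history ID, and the "intermediate_" naming convention are all placeholders:

    from bioblend.galaxy import GalaxyInstance

    # Placeholders throughout -- substitute your own server, key, and IDs.
    gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")
    history_id = "YOUR_HISTORY_ID"

    # Walk the history and purge datasets already confirmed disposable.
    for ds in gi.histories.show_history(history_id, contents=True):
        if ds["name"].startswith("intermediate_"):  # hypothetical convention
            gi.histories.delete_dataset(history_id, ds["id"], purge=True)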

The optimal workflow alternates between phases where processing streams straight through and phases where it pauses for data review and cleanup, freeing space for downstream analysis steps or further analysis cycles.
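
In outline (plain Python; run_pipeline, results_look_ok, and purge_intermediates are stand-ins for your real tool runs, manual review, and cleanup):

    def run_pipeline(sample):
        # Stand-in for the real tool chain (mapping, filtering, summarizing).
        print("processing %s" % sample)

    def results_look_ok(batch):
        # Stand-in for the pause-and-review phase; always "yes" in this sketch.
        return True

    def purge_intermediates(batch):
        # Stand-in for purging this batch's intermediate datasets.
        print("purging intermediates for %s" % ", ".join(batch))

    datasets = ["sample_%d" % i for i in range(1, 9)]
    batch_size = 1  # e.g. from the estimate above

    for start in range(0, len(datasets), batch_size):
        batch = datasets[start:start + batch_size]
        for sample in batch:
            run_pipeline(sample)        # streaming phase
        if not results_look_ok(batch):  # pause phase: inspect before cleanup
            break
        purge_intermediates(batch)      # free quota for the next cycle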

Before purging data from the public Galaxy instance, you can first save it to your own computer, servers, or cloud resource, either as individual files or as entire histories. Saved histories work both as reviewable archives and as actionable work environments: they can be used later in a local or cloud instance, or loaded back into the public Galaxy instance.
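
For example, with the (assumed) BioBlend client again; the dataset ID and local path are placeholders:

    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")

    # Save a local copy before purging the dataset on the server.
    gi.datasets.download_dataset("YOUR_DATASET_ID",
                                 file_path="/archive/hiseq_project",
                                 use_default_filename=True)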

For more about data management, please see the guidelines and tips on this wiki. Of most relevance for your case (after Quotas) will be the last section, about "delete" (recoverable; counts toward disk quota) and "permanently delete" (a.k.a. purge; non-recoverable; does not count toward disk quota):

http://wiki.g2.bx.psu.edu/Learn/Managing%20Datasets#Data_size_and_disk_Quotas
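
The same distinction shows up in the API; again a sketch assuming the BioBlend client, with placeholder IDs:

    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")

    # "delete": recoverable, still counts toward your disk quota.
    gi.histories.delete_dataset("YOUR_HISTORY_ID", "YOUR_DATASET_ID")

    # "permanently delete" (purge): non-recoverable, frees the quota.
    gi.histories.delete_dataset("YOUR_HISTORY_ID", "YOUR_DATASET_ID", purge=True)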

We realize that this is not a simple solution. Quotas were a very difficult, but necessary, project decision. What we can do is provide as much support as possible to help our community manage their data effectively. If you need more help or have questions, please let us know.

Take care,

Jen
Galaxy team

On 1/30/12 7:42 AM, Zack Liu wrote:
Dear galaxy admins,

I am working on a HiSeq project, where each file is ~50G. I have 8 of
them. After I uploaded my files, I realized that I had reached my quota
limit. As a result, I can't do any mapping or other actions on these files
since I have no disk space for them.

I was wondering if there's any way I can get more disk storage on Galaxy.
Thanks!

Zack Liu

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/
