Re: [galaxy-dev] Galaxy database size and location

2014-03-06 Thread Hans-Rudolf Hotz

Hi Ravi

I don't quite understand the question. It looks like you are mixing up two 
different things? A few comments, which might clarify and help you:


 - the postgresql db does not store the data. It tracks the users,
   their jobs and their histories. Hence, it stays pretty small.

 - the actual data is stored in ~/galaxy_dist/database/files/
   And this directory (or rather its numbered subdirectories) can grow
   pretty quickly - depending on the kind of jobs you run.

 - there are clean-up scripts which you can use to remove 'deleted'
   history items (i.e. the data), see: 
https://wiki.galaxyproject.org/Admin/Config/Performance/Purge%20Histories%20and%20Datasets
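
To see how quickly the numbered subdirectories mentioned above are growing, something like the following could be run against the files directory. This is only a minimal sketch, not part of Galaxy: the function name directory_sizes is made up for illustration, and it assumes the datasets are plain files under the immediate subdirectories of the directory you pass in.

```python
import os


def directory_sizes(files_dir):
    """Return {subdirectory name: total bytes} for each immediate
    subdirectory of files_dir. Hypothetical helper, not a Galaxy API."""
    sizes = {}
    for entry in os.scandir(files_dir):
        # Only the (numbered) subdirectories are reported; loose files
        # at the top level are skipped.
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for root, _dirs, names in os.walk(entry.path):
            for name in names:
                path = os.path.join(root, name)
                # Skip symlinks so linked data is not counted.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes
```

For example, directory_sizes(os.path.expanduser("~/galaxy_dist/database/files")) would give one byte count per subdirectory. Sorting the result by value shows where the space is going before and after a clean-up run.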



Hope this helps, Hans-Rudolf


On 03/06/2014 02:39 AM, Ravi Alla wrote:

Hi fellow galaxy devs,

I am trying to understand how to implement the galaxy database and get an idea 
of how big it could get. Currently we are running galaxy on a webserver, and 
want to have the postgresql db on a locally mounted partition and not on an NFS 
partition. This limits us to around 100GB of storage for the db. We will create 
data libraries for users to load their data without copying to galaxy, so input 
files won't be duplicated. Is there anything we can do about the output files? 
Do these files need to end up in the database or can we put them on the NFS 
partition somewhere with the db holding information about their location?
I noticed that on a routine small analysis I could easily have 20GB or more of 
output files and history and all this is in the database.
If output files and history files are written to the database, are they cleaned 
up daily to avoid storage issues?

Please advise.
Thanks
Ravi Alla
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/




Re: [galaxy-dev] Galaxy database size and location

2014-03-06 Thread Ravi Alla
I figured it out. There is an option in the universe_wsgi.ini file called 
file_path, which currently points to database/files and can be changed to a 
different location.
Thanks
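
A quick way to check which directory a given config actually points at is to read the option back out of the ini file. The sketch below rests on several assumptions that are not stated in this thread: that the option lives in an [app:main] section, that it falls back to database/files when unset, and that a relative value is taken relative to the Galaxy root. The function name and the paths in the usage note are hypothetical.

```python
import configparser
import os


def resolve_file_path(ini_path, galaxy_root):
    """Return the dataset directory configured in ini_path.
    Hypothetical helper; section name and default are assumptions."""
    # interpolation=None keeps '%' characters in the ini file literal;
    # strict=False tolerates duplicate keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(ini_path)
    # Assumed section and default value; adjust to match your install.
    value = parser.get("app:main", "file_path", fallback="database/files")
    if os.path.isabs(value):
        return value
    # Assumption: relative paths are resolved against the Galaxy root.
    return os.path.join(galaxy_root, value)
```

With file_path set to a location on the NFS partition (say /mnt/nfs/galaxy/files, a made-up path), resolve_file_path("universe_wsgi.ini", "/home/galaxy/galaxy_dist") would return that path unchanged, while the postgresql db can stay on the local partition.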



