Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

2015-07-29 Thread Shane Kelly
Hi Iyad,
Thanks for those links and the sound advice. I'll read and
absorb, and see what kind of a plan I can come up with.

So good of you to take the time to answer, many thanks.

Regards,
Shane



Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

2015-07-29 Thread Shane Kelly
Hi Iyad,
Thanks for taking the time to get back to me.
I am an IT guy, and would not know how to answer these questions except
to say that the server is to be used to facilitate biological and
medical research in the areas of genomics, transcriptomics, epigenomics
and metagenomics. (straight from the mission statement :-) )

I think that they would say that overall performance is less
important than being able to run large datasets (1/2-3TB).

I like the idea of a separate box for the web server, but I am
not sure how the web server would communicate with the pipeline box -
is that ability built into Galaxy, or is it a well-worn path with plenty
of examples that I could plagiarize?

Sorry to be such a newb, but I don't know much about Galaxy at
all. Luckily I have 2-3 months to put this in place...

Again, thank you for your time.

Regards,
Shane





___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

2015-07-29 Thread Kandalaft, Iyad
What you should do (in my opinion) is install a grid scheduler (SGE, Torque,
etc.) on the big server.  If you run Galaxy on a separate server, it can be
configured to submit jobs to that scheduler.  Galaxy also has the concept of
Web App and Handler components.  Essentially, handlers take care of talking
with the scheduler while the Web App serves pages to users.  By default,
the Web App and Handler are combined in the same process, but you can configure
Galaxy to start up multiple handler processes and multiple web app processes.
Then you can use Apache or Nginx to load balance user requests between the
various Galaxy Web Apps.
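
To make the split concrete, here is a toy Python model of it. This is not Galaxy code and uses none of Galaxy's APIs: the class names, the in-memory queue, the "bwa" tool name and fake_scheduler_submit are all made up for illustration. It only shows the division of labour: web processes queue work, a handler drains the queue and talks to the scheduler, and a proxy rotates user requests across the web processes.

```python
from collections import deque
from itertools import cycle


class WebApp:
    """Toy stand-in for a web process: it answers users and only queues jobs."""

    def __init__(self, name, job_queue):
        self.name = name
        self.job_queue = job_queue

    def handle_request(self, user, tool):
        # The web process never talks to the scheduler; it just records the job.
        self.job_queue.append((user, tool))
        return f"{self.name} queued {tool} for {user}"


class Handler:
    """Toy stand-in for a handler process: it drains the queue and submits jobs."""

    def __init__(self, name, job_queue, submit):
        self.name = name
        self.job_queue = job_queue
        self.submit = submit

    def run_once(self):
        # Submit every queued job in arrival order and report what was sent.
        submitted = []
        while self.job_queue:
            user, tool = self.job_queue.popleft()
            submitted.append(self.submit(user, tool))
        return submitted


def fake_scheduler_submit(user, tool):
    # Hypothetical placeholder for a real submission to SGE, Torque, etc.
    return f"submitted {tool} for {user}"


def demo():
    job_queue = deque()  # shared state between web processes and the handler
    web_apps = [WebApp("web0", job_queue), WebApp("web1", job_queue)]
    handler = Handler("handler0", job_queue, fake_scheduler_submit)
    proxy = cycle(web_apps)  # round-robin, as a load-balancing proxy might do
    responses = [next(proxy).handle_request(user, "bwa")
                 for user in ("alice", "bob", "carol")]
    return responses, handler.run_once()


if __name__ == "__main__":
    responses, submitted = demo()
    print(responses)
    print(submitted)
```

Adding a second WebApp or Handler to the lists is the toy equivalent of scaling out; the real configuration is covered in the recommended reading.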

My recommendation is that you start small to accommodate about 5 concurrent 
users without noticeable performance issues:
1 Galaxy web app
1 Galaxy handler

Recommended reading:
https://wiki.galaxyproject.org/Admin/Config/Performance/Scaling 
https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster



Re: [galaxy-dev] I need some advice on what type of Galaxy server to implement

2015-07-28 Thread Kandalaft, Iyad
When you say NGS, is it genome assembly?  If so, what type of genomes, and do
you have experience with their memory and CPU requirements?  We noted that
servers with a large amount of memory and many cores can hit a memory bus
bottleneck.  The other aspect is that heavy processing on the server will
degrade the performance of Galaxy unless Galaxy is given higher priority.  Note
that if you overcommit the server, it can destabilize and bring down the Galaxy
web app and database.
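
As a rough illustration of the overcommit point, the sketch below checks whether a set of jobs fits on one machine while leaving headroom for the Galaxy web app and database. The job sizes and the reserved amounts (4 cores, 32 GB) are invented for the example; only the 128-core / 1 TB figure comes from the all-in-one option in the original post.

```python
def fits_with_headroom(jobs, total_cores, total_mem_gb,
                       reserved_cores=4, reserved_mem_gb=32):
    """Return True if the jobs fit while keeping the reserved headroom free."""
    cores_needed = sum(job["cores"] for job in jobs)
    mem_needed = sum(job["mem_gb"] for job in jobs)
    return (cores_needed <= total_cores - reserved_cores
            and mem_needed <= total_mem_gb - reserved_mem_gb)


# Illustrative job mix (made-up numbers, not measurements).
JOBS = [
    {"name": "assembly", "cores": 64, "mem_gb": 768},
    {"name": "mapping", "cores": 32, "mem_gb": 128},
]

if __name__ == "__main__":
    # All-in-one server: 128 cores, 1 TB (1024 GB) RAM.
    print(fits_with_headroom(JOBS, 128, 1024))
    extra = JOBS + [{"name": "annotation", "cores": 16, "mem_gb": 200}]
    print(fits_with_headroom(extra, 128, 1024))
```

With these numbers the first mix fits (96 cores, 896 GB), while adding the third job pushes memory to 1096 GB and no longer fits. A grid scheduler enforces this kind of limit for you.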

My general approach is to run the Galaxy web app + proxy on a separate machine
from the handlers.  The analysis server then runs either a grid scheduler or
the handlers.

I recommend multiple smaller servers if you can get away with it, as long as
you have one that can accommodate your LARGE workloads.  If you don't care
about overall performance, large servers are the way to go as they are more
"versatile".

Regards,

Iyad Kandalaft

Acting Chief Bioinformatician in Biodiversity, STB
Agriculture and Agri-Food Canada / Government of Canada
iyad.kandal...@agr.gc.ca / Tel: 613-759-1228 / TTY: 613-773-2600

Bioinformaticien chef de la  biodiversite interim, Direction générale des 
Science et de la technologie
Agriculture et Agroalimentaire Canada / Gouvernement du Canada
iyad.kandal...@agr.gc.ca / Tel: 613-759-1228 / TTY: 613-773-2600
 




-----Original Message-----
From: galaxy-dev [mailto:galaxy-dev-boun...@lists.galaxyproject.org] On Behalf 
Of Shane Kelly
Sent: July-28-15 8:23 AM
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] I need some advice on what type of Galaxy server to 
implement

Hi
I have been tasked with getting a Galaxy server up and running for a 
group at work.

1. No one can tell me how many users (concurrent or otherwise) there will be.
2. Most of the analyses will be NGS.
3. Tools will be developed in-house but we will use public domain tools also.
4. There will be a guy running the server/developing tools pretty much full time.

I have two favoured solutions at the moment:

1. A pipeline processor (64 cores, 512 GB RAM, with about 150 TB of DAS), a
web server to act as frontend and database server, and another, smaller box
with a full install of Galaxy that is used only for development work.

2. An all-in-one server with 128 cores, 1 TB RAM and 150 TB of DAS storage,
with development work done on a VM.

Any input would be helpful.

Thanks
Shane