First of all, thanks for the long reply - to keep things short, I'll
answer inline:
Scott McManus wrote:
It may help to consider other pieces aside from compute nodes
that you will need, such as nodes for proxies and databases,
networking gear (such as switches and cables), and so on.
http://usegalaxy.org/production has some details, and there are
high-level pieces explained at
Thanks, I read through it - that gives me some evidence to point to.
One non-technical note regarding the organization (techies: skip this):
This is exactly the point we are at currently - we had a first
non-technical conversation about a month ago, and in the last few days
funding was suddenly released, which put us in "zugzwang" (as far as I
know, the term also describes in English the pressure to (re)act).
The structure is roughly as follows: there is the IT provider for the
whole hospital campus (consisting of several clinics and some medical
school institutions; we belong to the latter) and our own institute's
IT, which serves science and research internally. We had hours of
discussions inside our institute and agreed that we are neither able nor
willing to manage everything on our own (the system is intended for
everyone doing NGS research on the campus). The campus IT provider was
not involved in the non-technical conversation mentioned above.
You should also talk to your institution's IT folks about power
requirements, how those costs are passed on, off-site backup storage
(though it sounds like you're counting on RAID 5/6), etc.
Regarding the technical environment, everything is under way; today
we'll have another meeting (the "main" IT folks are bothered by the
"custom" hardware we are targeting). Backup is also part of the
conversations in September - we don't want to rely on RAID 6 alone. This
topic is additionally very politics-driven (who pays for what?)...
Technically, the need is beyond question.
Fortunately I have been part of the Czars group since the first meeting
and also took part in the GCC2012 breakout session. You're absolutely
right. Too bad there is so little time left until we have to act -
that's why I included the whole list, hoping that someone has already
done some benchmarking. We planned to, but our first server behaved
quite "moodily"...
Sharing some experiences or hard numbers, including system specs, would
be great for other people who are at the point of ordering hardware and
are forced to justify their choices.
It may also help if folks could share their experiences with benchmarking
their own systems along with the tools that they've been using.
The Galaxy Czars conference call could help - you could bring this
up at the next meeting.
Planned for the HDD connection is a RAID controller offering 1 GB/s -
the array in our first server, by the way, delivered a measured
450 MB/s. The network should not be a problem for this concept, as it is
intended to be relatively self-contained. Network load will only occur
while loading data from an archive or from the sequencer itself. A
10 Gbit/s connection is available. InfiniBand was considered briefly but
would exceed the current funding. A cluster is available, but its
interconnect is quite slow (since it is used mostly for statistical
analyses).
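For anyone who has to produce such a number themselves, here is a
minimal sequential-write sketch in plain Python (path and sizes are
placeholders; it gives a rough lower bound, not a replacement for
dedicated tools like fio or dd):

    import os, time

    PATH = "/mnt/raid/bench.tmp"   # hypothetical mount point
    BLOCK = 1024 * 1024            # 1 MiB per write
    BLOCKS = 4096                  # ~4 GiB total, to exceed controller cache

    buf = os.urandom(BLOCK)
    t0 = time.time()
    with open(PATH, "wb") as f:
        for _ in range(BLOCKS):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())       # make sure the data really hit the disks
    dt = time.time() - t0
    print("sequential write: %.0f MB/s" % (BLOCK * BLOCKS / dt / 1e6))
    os.remove(PATH)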
I've answered inline, but in general I think that the bottleneck
for your planned architecture will be I/O with respect to disk.
The next bottleneck may be with respect to the network - if you
have a disk farm with a 1 Gbps (125 MBps) connection, then it
doesn't matter if your disks can write 400+ MBps. (Nate also
included this in his presentation.) You may want to consider
Infiniband over Ethernet - I think the Galaxy Czars call would
be really helpful in this respect.
This is what we wanted to do (see above), but we did not get that far
due to the technical issues mentioned above (RAID controller, HDD
crashes, etc.).
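For reference, this is the kind of simple wall-clock measurement we had
in mind once the hardware behaves (tool, parameters, and file names are
placeholders for whatever step of a sample workflow is being measured):

    import subprocess, time

    # Hypothetical example: time one mapping step in isolation.
    cmd = ["bwa", "mem", "-t", "8", "ref.fa", "reads_1.fq", "reads_2.fq"]

    t0 = time.time()
    with open("out.sam", "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)
    print("wall clock: %.1f s" % (time.time() - t0))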
1. Using the described bioinformatics software: where are the
system bottlenecks? (connections between CPUs, RAM, HDDs)
One way to get a better idea is to start with existing resources,
create a sample workflow or two, and measure performance. Again,
the Galaxy czars call could be a good bet.
From plain theory I would expect the Needleman-Wunsch algorithm to be
highly relevant, and it is basically integer arithmetic - at least in
the case of pairwise sequence alignment. MSAs may be different (they may
require floating-point calculations). Unfortunately, GPU and/or FPGA
usage is currently far out of scope for this first concept, although it
has been in the back of my mind for quite a while :). In a standard CPU
setting I would also expect the I/O between CPU and RAM to be a
bottleneck for tasks that are not that CPU-intensive (more details
below).
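To illustrate the integer-only point, here is a toy Needleman-Wunsch
score computation in plain Python (global alignment with unit costs;
purely illustrative - real aligners are far more elaborate):

    def nw_score(a, b, match=1, mismatch=-1, gap=-1):
        # Classic O(len(a) * len(b)) dynamic programming fill;
        # every cell update is pure integer arithmetic.
        prev = [j * gap for j in range(len(b) + 1)]
        for i in range(1, len(a) + 1):
            cur = [i * gap]
            for j in range(1, len(b) + 1):
                diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
                cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
            prev = cur
        return prev[-1]

    print(nw_score("GATTACA", "GCATGCU"))  # -> 0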
2. What is the expected ratio of integer-based to floating-point-based
calculations that will load the CPU cores?
This also depends on the tools being used. This might be more
relevant if your architecture were to use more specialized hardware
(such as GPUs or FPGAs), but this should be a secondary concern.
Inspired by a tool that calls BWA several times on one memory-mapped
file ('Stampy'), our idea was that, given the HDD bottleneck, as much
RAM as possible should be available. Our medium-term idea would
therefore be to optimize I/O-intensive workflows with wrappers that
enable memory-mapping as far as possible.
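As a sketch of the mechanism such a wrapper would build on (the file
name is a placeholder; this is not a finished wrapper):

    import mmap

    # Map a (hypothetical) FASTQ file into memory once; repeated passes
    # over it then hit the page cache instead of going back to the disks.
    with open("sample.fastq", "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            lines = 0
            for line in iter(mm.readline, b""):
                lines += 1
            print("lines:", lines, "records:", lines // 4)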
As far as I understand it so far, the CPU<->RAM I/O would be a
bottleneck for trimming, simple filter steps, etc. - everything where
masses of data/strings don't really challenge the CPU, as long as the
source data is already in main memory and does not have to be loaded
from disk.
3. Regarding the architectural differences (strengths, weaknesses):
Would an AMD or an Intel system be more suitable?
I really can't answer which processor line is more suitable, but
I think that having enough RAM per core is more important. Nate shows
that main.g2.bx.psu.edu has 4 GB RAM per core.
4. How much I/O (read and write) can be expected at the memory
controllers? Which tasks are most I/O-intensive (regarding RAM and/or
HDD)?
Workflows currently write all output to disk and read all input from
disk. This gets back to previous questions on benchmarking.
Then I should be quite safe with 128+ GB of RAM. Sure, if one core has
to access RAM outside of its own NUMA node, via a neighbouring socket,
speed drops significantly - but this should happen rarely in our setting.
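As a quick sanity check against the 4 GB/core figure (the core count
below is an assumption on my part; adjust it to the actual CPU
configuration):

    total_ram_gb = 128
    cores = 32   # assumed, e.g. a dual-socket 2 x 16-core box
    print("GB per core:", total_ram_gb / cores)  # 4.0, matching main.g2.bx.psu.edu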
5. Roughly separated into mapping and clustering jobs: how much main
memory can a single job be expected to require (given Illumina exome
data, 50x coverage)? As far as I know, mapping should need around 4 GB,
clustering much more (it may reach the high double digits).
Nate's presentation shows that main.g2.bx.psu.edu has 24 to 48 GB per
8 core reservation, and as before it shows that there is 4 GB per core.
It was less about choosing the filesystem (we recently abandoned ZFS due
to issues, which afterwards turned out to (maybe) trace back to some
strange hardware behaviour); we are going with ext4 for now. Gluster or
Ceph may become interesting when we move to an extended system, after
having built up a working stand-alone concept. I'll come back to you on
that - and also to the Czars group, where this topic was one of the
first to come up.
6. HDD access (R/W) happens mainly in bigger blocks rather than in
masses of short operations - correct?
Again, this all depends on the tool being used, and some benchmarks
could help. This question sounds like it's mostly related to choosing the
filesystem - is that right? If so, then you may want to consider a
compressing file system such as ZFS or BtrFS. You may also want to consider
filesystems like Ceph or Gluster (now Red Hat). I know that Ceph can
run on top of XFS and BtrFS, but you should look into BtrFS's churn rate -
it might still be evolving quickly. Again, a ping to the Galaxy Czars call
may help on any and possibly all of these questions.
Thanks, I'll need some of that.
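In case it helps others on the list, here is a minimal sketch
contrasting the two access patterns from question 6 (hypothetical path;
use a test file much larger than RAM, or drop the page cache first, to
get honest numbers):

    import os, random, time

    PATH = "/mnt/raid/bench.tmp"      # hypothetical test file, e.g. 4 GiB
    BLOCK = 4096                      # one small read per operation
    N = 20000
    size = os.path.getsize(PATH)

    with open(PATH, "rb") as f:
        t0 = time.time()
        for _ in range(N):            # sequential: neighbouring blocks
            f.read(BLOCK)
        seq = time.time() - t0

        t0 = time.time()
        for _ in range(N):            # random: seek anywhere in the file
            f.seek(random.randrange(0, size - BLOCK))
            f.read(BLOCK)
        rnd = time.time() - t0

    print("sequential: %.2f s, random: %.2f s" % (seq, rnd))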
UPDATE: in the meantime the meeting took place and we gained a few more
days - at least until the middle of next week. Anyway, if anyone could
share some measured values or experiences, I would be very glad.
Sebastian Schaaf, M.Sc. Bioinformatics
Chair of Biometry and Bioinformatics
Department of Medical Information Sciences, Biometry and Epidemiology
University of Munich
Marchioninistr. 15, K U1 (postal)
Marchioninistr. 17, U 006 (office)
D-81377 Munich (Germany)
Tel: +49 89 2180-78178