Re: building a department GPU cluster

Colin McCabe Mon, 28 Jan 2013 14:34:38 -0800

On Thu, Jan 17, 2013 at 12:24 AM, Roberto Nunnari
<[email protected]> wrote:


> Hi all.
>
> I'm writing to ask for advice or a pointer in the right direction.
>
> In our department, more and more researchers ask us (IT administrators) to
> assemble (or to buy) GPGPU powered workstations to do parallel computing.
>
> As I already manage a small CPU cluster (resources managed using SGE),
> with my boss we talked about building a new GPU cluster. The problem is
> that I have no experience at all with GPU clusters.
>
> Apart from the already running GPU workstations, we already have some new
> HW that looks promising to me as a starting point for a GPU cluster.
>
> - 1x Dell PowerEdge R720
> - 1x Dell PowerEdge C410x
> - 1x NVIDIA M2090 PCIe x16
> - 1x NVIDIA iPASS Cable Kit
> (Dell forgot to include the iPASS adapter for the R720!! :-D)
>
> I'd be grateful if you could give me some advice or a pointer in the
> right direction.
>
> In particular I'm interested in your opinion on:
> 1) is the above HW suitable for a small (2 to 4/6 GPUs) GPU cluster?
> 2) is Apache Hadoop suitable (or what could we use?) as a queuing and
> resource management system? We would like the cluster to be usable by many
> users at once in a way that no user has to worry about resources, just like
> we do on the CPU cluster with SGE.
>

My understanding (although I could be wrong) is that only one task can use
a GPU at a time, so you'll have to take that into account when configuring
MapReduce, for example by limiting how many tasks run concurrently on each
GPU node.

> 3) What distribution of Linux would be more appropriate?
>

Whatever NVIDIA's kernel module supports best, probably RHEL.

> 4) necessary stack of sw? (cuda, hadoop, other?)
>

You probably want to write the code in C or C++ and use Hadoop Streaming
plus whatever libraries you need in order to use CUDA.  nvidia.com should
have more information about that.  CUDA is an NVIDIA-proprietary technology.
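
To make the streaming idea concrete, here is a rough sketch of what such a
mapper could look like. Hadoop Streaming hands records to the mapper on
stdin, one per line, and reads tab-separated key/value lines back from
stdout. Everything else here is an assumption for illustration: the
"key<TAB>number" input format and the names process_batch and run_mapper
are hypothetical, and process_batch does its work on the CPU as a stand-in
for the point where a real job would launch a CUDA kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the GPU step. A real job would copy `values`
// to the device, launch a CUDA kernel, and copy the results back. Here each
// value is squared on the CPU so the sketch builds without nvcc.
std::vector<double> process_batch(const std::vector<double>& values) {
    std::vector<double> out;
    out.reserve(values.size());
    for (double v : values) out.push_back(v * v);
    return out;
}

// Reads "key<TAB>number" lines (an assumed format), collects the numbers
// into one batch, runs the batch through process_batch, and writes
// "key<TAB>result" lines in the original order.
void run_mapper(std::istream& in, std::ostream& out) {
    std::vector<std::string> keys;
    std::vector<double> values;
    std::string line;
    while (std::getline(in, line)) {
        std::size_t tab = line.find('\t');
        if (tab == std::string::npos) continue;  // no tab: skip the record
        std::istringstream field(line.substr(tab + 1));
        double v;
        if (!(field >> v)) continue;             // not a number: skip it
        keys.push_back(line.substr(0, tab));
        values.push_back(v);
    }
    std::vector<double> results = process_batch(values);
    assert(results.size() == keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i)
        out << keys[i] << '\t' << results[i] << '\n';
}
```

Adding a main() that calls run_mapper(std::cin, std::cout) turns this into
an executable you can pass to the streaming jar's -mapper option. Batching
the whole input split before the GPU call keeps the number of device
transfers low, but it also holds the split in memory, so a real mapper
would probably process fixed-size chunks instead.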

Colin
