Hi Patrick,

This is indeed pretty application-specific. While you could modify Spark to 
list GPUs and assign tasks to them, I think a simpler solution would be to 
manage GPU use at the application level. Create a static object, say 
GPUManager, that lists the GPUs on each machine (somehow) and receives 
requests from tasks to run on them. Then have it launch those requests on the 
available GPUs, handing each one a GPU ID. This isn't perfect, because tasks 
will block waiting on this object, but it might still be good enough.
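Roughly, the idea could look like the following sketch in Python (assuming a 
fixed, known GPU count per worker; the GPUManager class, the num_gpus 
parameter, and the run_on_gpu helper are all illustrative names, not Spark 
API):

```python
import threading
import queue

class GPUManager:
    """Per-worker singleton that hands out GPU IDs to tasks.

    Tasks block in acquire() until a GPU is free, run their GPU code
    bound to that ID, then release() it back to the pool.
    """
    _instance = None
    _lock = threading.Lock()

    def __init__(self, num_gpus):
        # FIFO queue of free GPU IDs, 0 .. num_gpus-1
        self._free = queue.Queue()
        for gpu_id in range(num_gpus):
            self._free.put(gpu_id)

    @classmethod
    def get(cls, num_gpus=4):
        # Lazily create one manager per worker process
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(num_gpus)
        return cls._instance

    def acquire(self):
        # Blocks the calling task until a GPU ID is available
        return self._free.get()

    def release(self, gpu_id):
        self._free.put(gpu_id)


def run_on_gpu(fn, *args):
    """Run fn(gpu_id, *args) on whichever GPU is free, then return it."""
    mgr = GPUManager.get()
    gpu_id = mgr.acquire()
    try:
        return fn(gpu_id, *args)
    finally:
        mgr.release(gpu_id)
```

A map task would then call run_on_gpu with its GPU function; with 10 task 
slots and 4 GPUs, at most 4 tasks run GPU code at once and the rest wait in 
acquire().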

Matei

On Oct 9, 2013, at 12:01 PM, Patrick Grinaway <[email protected]> wrote:

> Hello all,
> 
> I've seen some rumblings of questions about heterogeneous computing 
> environments, but I think I've got a bit of a unique case (at least one not 
> documented on here):
> 
> Some of the functions that I will be using as "maps" in our pipeline actually 
> require the use of a GPU. The issue here becomes twofold:
> 
> 1) Most nodes have 4 or fewer GPUs, but at least 10 available CPU cores. Is 
> there some attribute with which I can tag workers so that they know about 
> this resource limitation, without limiting me to 4 or fewer CPUs?
> 
> 2) The code that uses the GPU needs a gpuid to know which GPU to bind (it 
> pretty much takes the whole thing). Is there some way that my function can 
> know *which* worker it is in (just a numerical ID, maybe 0-(n-1)) so that it 
> knows which GPU to bind?
> 
> I suspect this may not be currently feasible, and if not, I'll try to write 
> it myself, but I figured I'd ask first.
> 
> Thanks!
> 
> Patrick Grinaway
> 
