I have tried this with the following batch script

  #!/bin/bash

  #SBATCH --job-name=easybuild_gpu
  #SBATCH --ntasks=4
  #SBATCH --time=12:00:00
  #SBATCH --mem-per-cpu=1G
  #SBATCH --partition=gpu
  #SBATCH --qos=medium

  srun eb Keras-2.2.4-fosscuda-2019a-Python-3.7.2.eb --robot
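(As an aside, a sketch of a single-task variant of the script above: `srun` with `--ntasks=4` launches four concurrent `eb` processes, which can collide on the shared build directory. This is an untested assumption, not a confirmed fix; `--parallel=4` here just tells `eb` how many cores the build may use.)

```shell
#!/bin/bash
# Hypothetical variant: run a single eb process and give it four cores,
# instead of launching four eb instances via srun --ntasks=4.
#SBATCH --job-name=easybuild_gpu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=12:00:00
#SBATCH --mem-per-cpu=1G
#SBATCH --partition=gpu
#SBATCH --qos=medium

eb Keras-2.2.4-fosscuda-2019a-Python-3.7.2.eb --robot --parallel=4
```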

but get the error

  == FAILED: Installation ended unsuccessfully (build directory:
  /trinity/shared/easybuild/build/TensorFlow/1.13.1/fosscuda-2019a-Python-3.7.2):
  build failed (first 300 chars): Failed to chmod/chown several paths:
  ['/trinity/shared/easybuild/build/TensorFlow/1.13.1/fosscuda-2019a-Python-3.7.2',
  '/trinity/shared/easybuild/build/TensorFlow/1.13.1/fosscuda-2019a-Python-3.7.2/protobufpython',
  '/trinity/shared/easybuild/build/TensorFlow/1.13.1/fosscuda-2019a-Python-3.7.2/abslpy
  (took 4 sec)

I'm running the Slurm job as the same user I always use to run
EasyBuild, so all the above directories are already owned by that user.

Any ideas about what I might be doing wrong?

Cheers,

Loris

Åke Sandgren <[email protected]> writes:

> If you're building a single easyconfig you could also just make a small
> submit file that runs eb the same way you would do manually, and send it
> to the right node.
>
> On 12/4/19 10:23 AM, Loris Bennett wrote:
>> Hi,
>> 
>> With Kenneth's help I have realised/remembered that I need to compile
>> Keras / TensorFlow on a machine with the appropriate CUDA drivers.  This
>> seemed like a good scenario in which to use the
>> 
>>   --job
>> 
>> option.  However, if I write 
>> 
>>   ... --job --job-backend=Slurm --job-cores=4
>> 
>> I would still need to specify the partition with the GPU nodes, so that
>> the job is indeed scheduled to a machine on which the drivers are
>> installed. 
>> 
>> How would I do that?
>> 
>> Cheers,
>> 
>> Loris
>> 
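(Regarding the quoted question about selecting the partition when using `--job`: one possible approach, assuming the Slurm backend ultimately submits via `sbatch`, is that `sbatch` honours `SBATCH_*` input environment variables, so partition and QOS could be set in the environment before invoking `eb`. A hedged sketch, not a confirmed recipe:)

```shell
# Assumption: EasyBuild's Slurm job backend submits with sbatch, which
# reads SBATCH_PARTITION / SBATCH_QOS from the environment.
export SBATCH_PARTITION=gpu
export SBATCH_QOS=medium

eb Keras-2.2.4-fosscuda-2019a-Python-3.7.2.eb --robot \
   --job --job-backend=Slurm --job-cores=4
```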
-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
