Hey all,

We are trying to set up a Slurm cluster with both CPU and GPU partitions for research and education (courses) in the computer science faculty at my university. Everything seems to work fine and we have managed to accomplish almost everything needed, except a few things:
A. Is it possible to set up global default values for srun/sbatch (e.g. number of cores, email, etc.)? If so, how can it be done?

A2. Is it possible to make some srun/sbatch parameters required (e.g. a user cannot run a job via srun unless they specify an email)? If so, how?

B. We have Active Directory (AD) in our faculty, and we would prefer to manage users/groups from there. Is that possible? Is there a guide available somewhere?

C. What is the recommended way to handle data files? Meaning, a user wants his data/code files (for example, a data set of pictures for GPU deep learning) to be accessible on the nodes allocated to him, and wants to get the results back easily without sshing to those nodes (I want to close the nodes to ssh if possible). So far we have investigated NFS (low performance vs. files stored locally on the server) and Nextcloud (file syncing back and forth). Is there a better way we overlooked? (A rough sketch of the staging workflow we currently picture is at the end of this mail.)

D. We need to give a specific known user the ability to run his jobs on specific nodes at specific hours, while no other jobs are allowed to run concurrently (exclusion). We saw there are reservations, but a reservation takes the resources even if that user doesn't end up using it. Another solution was to create a partition with a priority higher than all the others, put this partition in the DOWN state, give only that user the right to submit jobs to it, and then put a script in crontab that changes the state of the partition in the required time window (a sketch of this idea is also at the end of this mail). What do you think? Is there a more elegant way?

Our most common OS is Ubuntu, and we are using Slurm 17.02.7.

Thanks in advance for your time and effort,
Nadav
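P.S. To make C a bit more concrete, the staging workflow we currently picture (copy the data set from the shared NFS area to node-local scratch at job start, copy the results back at the end) would look roughly like the sbatch script below. All paths, the partition name and the training command are just placeholders, not our actual setup:

    #!/bin/bash
    #SBATCH --job-name=stage-example
    #SBATCH --partition=gpu
    #SBATCH --gres=gpu:1

    # Placeholder paths -- adjust to the real shared and node-local file systems.
    SHARED=/nfs/home/$USER/pictures
    SCRATCH=/tmp/$SLURM_JOB_ID

    mkdir -p "$SCRATCH"
    cp -r "$SHARED" "$SCRATCH/"           # stage the data set in from the NFS share

    # Placeholder for the user's actual workload.
    python train.py --data "$SCRATCH/pictures" --out "$SCRATCH/results"

    cp -r "$SCRATCH/results" "$SHARED/"   # stage the results back to the share
    rm -rf "$SCRATCH"                     # clean up node-local scratch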
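P.P.S. To make D concrete, this is roughly what we had in mind; the partition name, node list, group and time window are placeholders:

    # slurm.conf -- extra partition, normally kept DOWN, only group "specialgrp" may submit to it
    PartitionName=window Nodes=node[01-04] AllowGroups=specialgrp PriorityTier=10 Default=NO State=DOWN

    # /etc/cron.d/window-partition -- open the partition Tuesdays 18:00-22:00 (example window)
    0 18 * * 2   root   /usr/bin/scontrol update PartitionName=window State=UP
    0 22 * * 2   root   /usr/bin/scontrol update PartitionName=window State=DOWN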