The cluster has nodes containing 3-8 NVIDIA GPUs each, with Slurm as the scheduler. The GPUs are used mainly for AI and image processing; display to a remote system is a secondary use. Requirements include:
- Only the user submitting the batch job has access to the GPU, and the user has access only to the GPU(s) allocated through the batch system.
- Ideally, Xorg or an equivalent daemon would be started when the batch job starts and killed when the job exits. The daemon should run as the user, possibly with /dev/nvidia? owned by the user; a chown could be included in the Slurm prolog script (see the sketch after this list).
- If Xorg has to run continuously, it should not take resources (GPU time, system time, or memory) away from the non-display jobs while they have the GPU allocated. Do we need one daemon per GPU, and how do we restrict access based on Slurm resource requests?
- More minor, but still a problem: running Xorg headless still blocks access to the virtual consoles when using HPE servers and iLO to connect.
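For the prolog chown mentioned above, a minimal sketch might look like the following. It assumes SLURM_JOB_UID and SLURM_JOB_GPUS (a comma-separated list of allocated device indices) are present in the prolog environment, which is the case in recent Slurm releases when GPUs are requested through gres/gpu:

    #!/bin/bash
    # Slurm prolog sketch: give the submitting user exclusive access to the
    # GPU device nodes allocated to the job. Runs as root on the compute node.
    # SLURM_JOB_GPUS is a comma-separated list of device indices, e.g. "0,2".
    for gpu in ${SLURM_JOB_GPUS//,/ }; do
        chown "${SLURM_JOB_UID}" "/dev/nvidia${gpu}"
        chmod 600 "/dev/nvidia${gpu}"
    done

A matching epilog would chown the devices back to root. Note that this sketch does not touch /dev/nvidiactl or /dev/nvidia-uvm, and Slurm's cgroup-based device isolation (ConstrainDevices=yes in cgroup.conf) may achieve the same per-job restriction more cleanly.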
