Hi,

1. For the user home PVC, make sure you have the correct fsGid configured. If you use a docker-stacks (jupyter/*) based notebook image, its start script should also properly chown the user home directory before su-ing to the jovyan user.
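For reference, a minimal sketch of that setting in a zero-to-jupyterhub values.yaml (the uid/fsGid values below assume the docker-stacks defaults, i.e. jovyan = uid 1000 and the users group = gid 100; check them against your image):

```yaml
singleuser:
  uid: 1000    # user the notebook server runs as (jovyan in docker-stacks images)
  fsGid: 100   # applied as the pod's fsGroup so mounted volumes become group-accessible
```

Note that fsGroup-based ownership management only applies to volume types Kubernetes can chown (such as PVC-backed volumes); a raw hostPath mount is left untouched.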
2. Is your single-user image built with the tensorflow-gpu package or the plain tensorflow package? Beware that conda can pull a non-GPU build from mixed channels even if you explicitly install tensorflow-gpu.

3. limit: 0 does not take the GPUs away. You need to set NVIDIA_VISIBLE_DEVICES=none as an extra environment variable in that case.

Best,
clkao

On Thu, Sep 13, 2018 at 6:53 PM, Benedikt Bäumle <[email protected]> wrote:

> Hey guys,
>
> I am currently setting up a bare-metal single-node Kubernetes cluster plus
> JupyterHub to have control over resources for our users. I use Helm to set
> up JupyterHub with a custom singleuser-notebook image for deep learning.
>
> The idea is to set up the hub to have better control over the NVIDIA GPUs
> on the server.
>
> I am struggling with a few things I can't figure out how to do, or whether
> they are even possible:
>
> 1. I mount the home directory of the user into the notebook container (in
> our case /home/dbvis/) in the Helm chart values.yaml:
>
>     extraVolumes:
>       - name: home
>         hostPath:
>           path: /home/{username}
>     extraVolumeMounts:
>       - name: home
>         mountPath: /home/dbvis/data
>
> It is indeed mounted like this, but with root:root ownership, and I can't
> add/remove/change anything inside the container at /home/dbvis/data. What
> I tried:
>
> - Changing the ownership in the Dockerfile by running
>   'chown -R dbvis:dbvis /home/dbvis/' at the end as the root user
> - The following postStart hook in values.yaml:
>
>     lifecycleHooks:
>       postStart:
>         exec:
>           command: ["chown", "-R", "dbvis:dbvis", "/home/dbvis/data"]
>
> Neither worked. As storage class I set up rook with rook-ceph-block
> storage. Any ideas?
>
> 2. We have several NVIDIA GPUs and I would like to control them and set
> limits for the Jupyter singleuser notebooks. I set up the NVIDIA device
> plugin (https://github.com/NVIDIA/k8s-device-plugin).
> When I use 'kubectl describe node' I see the GPU as a resource:
>
>     Allocatable:
>      cpu:                16
>      ephemeral-storage:  189274027310
>      hugepages-1Gi:      0
>      hugepages-2Mi:      0
>      memory:             98770548Ki
>      nvidia.com/gpu:     1
>      pods:               110
>     ...
>     Allocated resources:
>       (Total limits may be over 100 percent, i.e., overcommitted.)
>       Resource        Requests      Limits
>       --------        --------      ------
>       cpu             2250m (14%)   4100m (25%)
>       memory          2238Mi (2%)   11146362880 (11%)
>       nvidia.com/gpu  0             0
>     Events:           <none>
>
> Inside the Jupyter singleuser notebooks I can see the GPU when executing
> 'nvidia-smi'. But if I run e.g. TensorFlow to list the GPU with the
> following code:
>
>     from tensorflow.python.client import device_lib
>
>     device_lib.list_local_devices()
>
> I just get the CPU device:
>
>     [name: "/device:CPU:0"
>      device_type: "CPU"
>      memory_limit: 268435456
>      locality {
>      }
>      incarnation: 232115754901553261]
>
> Any idea what I am doing wrong?
>
> Further, I would like to limit the number of GPUs (it is just a test
> environment with one GPU; we have more). I tried the following, which
> doesn't seem to have an effect:
>
> - Adding the following config to values.yaml, in every combination
>   possible:
>
>     extraConfig: |
>       c.Spawner.notebook_dir = '/home/dbvis'
>       c.Spawner.extra_resource_limits: {'nvidia.com/gpu': '0'}
>       c.Spawner.extra_resource_guarantees: {'nvidia.com/gpu': '0'}
>       c.Spawner.args = ['--device=/dev/nvidiactl',
>         '--device=/dev/nvidia-uvm', '--device=/dev/nvidia-uvm-tools',
>         '/dev/nvidia0']
>
> - Adding the GPU to the resources in the singleuser configuration in
>   values.yaml:
>
>     singleuser:
>       image:
>         name: benne4444/dbvis-singleuser
>         tag: test3
>       nvidia.com/gpu:
>         limit: 1
>         guarantee: 1
>
> Is what I am trying even possible right now?
>
> Further information: the server runs
>
> - Ubuntu 18.04.1 LTS
> - nvidia-docker
> - JupyterHub Helm chart version 0.8-ea0cf9a
>
> I attached the complete values.yaml.
>
> If you need additional information please let me know.
> Any help is appreciated a lot.
>
> Thank you,
> Benedikt
>
> --
> You received this message because you are subscribed to the Google Groups
> "Project Jupyter" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jupyter/585d4d0b-5d8d-4cf2-b109-2c16f93d1f62%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
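Following up on points 2 and 3 of the reply, a sketch of the relevant values.yaml fragments (the key names extraEnv and extraResource are assumed from the 0.8-era zero-to-jupyterhub chart; verify them against your chart version). Note also that the quoted extraConfig writes extra_resource_limits with ':' instead of '=', which Python 3.6+ parses as a bare annotation and silently ignores, so that line never sets anything:

```yaml
singleuser:
  # Hiding the GPUs entirely (point 3): limit: 0 alone leaves the devices
  # visible, so tell the NVIDIA runtime to expose none of them.
  extraEnv:
    NVIDIA_VISIBLE_DEVICES: "none"
  # Or, to schedule exactly one GPU per user instead, drop extraEnv above
  # and use the chart's resource mapping rather than raw extraConfig
  # (assumed key names; check your chart's reference):
  # extraResource:
  #   limits:
  #     nvidia.com/gpu: "1"
  #   guarantees:
  #     nvidia.com/gpu: "1"
```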
