I promised a report back, and here it is. The DRI permission issue was solved by running vglserver_config and starting the VNC session outside of the job controller. When I run VNC inside the job controller, I still see the same issue, which makes me think this is a cgroup problem. Are there any pointers for using VirtualGL with cgroups?
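One way to narrow this down, offered as a rough sketch rather than a confirmed diagnosis: run the same probe once inside a job-controller session and once outside it, and compare the results. The device globs below (/dev/nvidia* and /dev/dri/*) are assumptions about where the GPU nodes live on this machine, and the helper names are made up for illustration. My understanding is that a device cgroup denial shows up as EPERM on open, whereas ordinary file-mode problems show up as EACCES, but treat that as a hint only.

```python
import errno
import glob
import os

# Assumed locations of the GPU device nodes; adjust for the actual machine.
DEVICE_GLOBS = ("/dev/nvidia*", "/dev/dri/*")


def probe_device(path):
    """Try to open a node read/write and return "ok" or the errno name."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        # EPERM despite permissive file modes hints at a device cgroup rule;
        # EACCES points at plain ownership/mode bits instead.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


def current_cgroups(proc_file="/proc/self/cgroup"):
    """Return the cgroup membership lines for this process (empty if unreadable)."""
    try:
        with open(proc_file) as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


def main():
    for line in current_cgroups():
        print("cgroup:", line)
    for pattern in DEVICE_GLOBS:
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                continue  # skip directories such as /dev/dri/by-path
            print(path, "->", probe_device(path))


if __name__ == "__main__":
    main()
```

If the nodes report "ok" outside the job controller but EPERM inside it, that would support the cgroup theory; identical results in both places would point somewhere else.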
On Thu, Oct 12, 2017 at 12:43 PM Jafaruddin Lie <[email protected]> wrote:

> Thanks :)
> I'll try this and report back.
>
> On Tuesday, October 10, 2017 at 4:31:44 PM UTC+11, DRC wrote:
>
>> Ah, OK. That sounds like a DRI permissions issue. Normally the DRI
>> device permissions will be set such that only the user who logged into
>> the X server will be able to use direct rendering. If you configure the
>> 3D X server as recommended in the VirtualGL User's Guide, including:
>>
>> -- Installing a display manager such as GDM and starting it when the
>> system starts
>> -- Running vglserver_config to grant limited access to the X server when
>> the display manager is active
>>
>> this will automatically configure the DRI permissions appropriately for
>> nVidia GPUs. You will not be able to share the 3D X server if you start
>> it using startx. You need to configure the 3D X server as specified in
>> the User's Guide in order for it to be sharable.
>>
>> DRC
>>
>> On 10/9/17 11:33 PM, Jafaruddin Lie wrote:
>> > Thank you for your clarification.
>> > See if I get this correct:
>> > So what I have is two accounts connected to the same machine, each
>> > connected with two VNC sessions (the 2D X server) and this is working
>> > fine as I can see the desktops.
>> > I have one 3D X server instance configured using xorg.conf, with 4
>> > Nvidia GPUs and as far as I can tell from the document, this xorg.conf
>> > has been configured as headless.
>> >
>> > My issue now, it seems, only one user is able to access the 3D X server
>> > properly. The other account, when running vglrun, will throw errors like
>> > [VGL] WARNING: The OpenGL rendering context obtained on X display
>> > [VGL] :0.3 is indirect, which may cause performance to suffer.
>> > [VGL] If :0.3 is a local X display, then the framebuffer device
>> > [VGL] permissions may be set incorrectly.
>> >
>> > I am just trying to get the two accounts to use the 3D X server, so I
>> > was wondering if the issue lies with my xorg.conf or somewhere else?
>> > Again, thank you for your replies. I am learning (albeit slowly).
>> >
>> > On Tue, Oct 10, 2017 at 3:14 PM DRC <[email protected]> wrote:
>> >
>> > You are either still fundamentally misunderstanding how to configure a
>> > VirtualGL server, or you are not describing your configuration properly.
>> > "I have two VNC servers (from two different users) running on an
>> > instance of X on :0" makes no sense. As I explained before, VNC is the
>> > 2D X server, not the 3D X server. xorg.conf is used to configure the 3D
>> > X server. In a normal VirtualGL environment, :0.0, :0.1, :0.2, etc. are
>> > screens on the 3D X server, each assigned to a different GPU. In a
>> > normal VirtualGL environment, :1.0, :2.0, :3.0, etc. are X proxy/2D X
>> > server instances.
>> >
>> > On 10/9/17 9:09 PM, Jafaruddin Lie wrote:
>> > > Thanks, DRC.
>> > > I commented out the InputDevice(s) lines, now I have two VNC servers
>> > > (from two different users) running on an instance of X on :0 (ps aux
>> > > shows "X: 0")
>> > >
>> > > The first user to login via the job controller is allocated the first
>> > > two graphic cards, and this user has no issue
>> > > running vglrun glxgears (just to test).
>> > > The second user is allocated the remaining graphic cards, and got this
>> > > message:
>> > >
>> > > [VGL] WARNING: The OpenGL rendering context obtained on X display
>> > > [VGL] :0.3 is indirect, which may cause performance to suffer.
>> > > [VGL] If :0.3 is a local X display, then the framebuffer device
>> > > [VGL] permissions may be set incorrectly.
>> > >
>> > > And this error from glxinfo:
>> > > name of display: :2
>> > > X Error of failed request: GLXBadContext
>> > > Major opcode of failed request: 152 (GLX)
>> > > Minor opcode of failed request: 6 (X_GLXIsDirect)
>> > > Serial number of failed request: 43
>> > > Current serial number in output stream: 43
>> > >
>> > > That error persists with all the display number that I've tried (-d 0.0
>> > > to -d 0.3).
>> > > Is there anything I am missing from my X configuration here??
>> > >
>> > > On Tue, Oct 10, 2017 at 12:18 PM DRC <[email protected]> wrote:
>> > >
>> > > You're missing the point. In a VirtualGL environment, there are two X
>> > > servers:
>> > >
>> > > -- The "3D X server" is a shared resource. It is used by multiple users
>> > > simultaneously in order to access the GPU, so it needs to remain
>> > > running. It is only necessary to run one instance of the 3D X server,
>> > > although multiple GPUs can be assigned to different screens on that X
>> > > server. This allows VirtualGL to address the GPUs separately, which can
>> > > be useful for load balancing (assigning different users to different
>> > > GPUs) in large multi-user environments.
>> > >
>> > > -- The "2D X server" (X proxy) is per-user and can be started/stopped as
>> > > needed. TurboVNC is our recommended 2D X server/X proxy solution, but
>> > > VirtualGL can also be used with other X proxy solutions, including
>> > > TigerVNC, Xpra, FreeNX, etc. The 2D X server receives and displays the
>> > > rendered 3D images from VirtualGL, as well as rendering the output of
>> > > other (non-3D) X applications. It is this X server that you would want
>> > > to start/stop using your job controller.
>> > >
>> > > The basic purpose of VirtualGL is two-fold:
>> > >
>> > > (1) To allow multiple users to share a single GPU (via the 3D X server
>> > > instance)
>> > > (2) To split 3D and 2D rendering to different X servers, so as to enable
>> > > hardware-accelerated OpenGL in an X proxy environment
>> > >
>> > > The relationship between the 3D X server and the 2D X server is thus
>> > > one-to-many. Your test results made me remember a discussion about this
>> > > on the old virtualgl-users mailing list, and IIRC someone told me that
>> > > you have to disable the keyboard and mouse drivers in xorg.conf in order
>> > > for the 3D X server instance to be truly headless, i.e. for you to be
>> > > able to disconnect it from the console without encountering errors. Try
>> > > commenting out
>> > >
>> > > InputDevice "Mouse0" "CorePointer"
>> > > InputDevice "Keyboard0" "CoreKeyboard"
>> > >
>> > > and see if that changes the situation. If so, I'll add that to the
>> > > how-to on VirtualGL.org.
>> > >
>> > > On 10/9/17 7:03 PM, Jafaruddin Lie wrote:
>> > > > Thanks, our xorg.conf is already set to headless according to that page.
>> > > > Here's part of the section:
>> > > >
>> > > > Section "Screen"
>> > > >     Identifier "Screen0"
>> > > >     Device "Device0"
>> > > >     Monitor "Monitor0"
>> > > >     DefaultDepth 24
>> > > >     Option "UseDisplayDevice" "none"
>> > > >     SubSection "Display"
>> > > >         Depth 24
>> > > >     EndSubSection
>> > > > EndSection
>> > > >
>> > > > Section "Device"
>> > > >     Identifier "Device0"
>> > > >     Driver "nvidia"
>> > > >     VendorName "NVIDIA Corporation"
>> > > >     BoardName "Tesla K80"
>> > > >     BusID "PCI:0:6:0"
>> > > > EndSection
>> > > >
>> > > > Section "Monitor"
>> > > >     Identifier "Monitor0"
>> > > >     VendorName "Unknown"
>> > > >     ModelName "Unknown"
>> > > >     HorizSync 28.0 - 33.0
>> > > >     VertRefresh 43.0 - 72.0
>> > > >     Option "DPMS"
>> > > > EndSection
>> > > >
>> > > > I'll test it with a single X instance and see how we go.
>> > > > The reason we want to have multiple X is that these desktop sessions are
>> > > > started by a job controller (user will submit their request for a
>> > > > desktop, which includes starting X).
>> > > > We would like the X server to be killed whenever a user finishes their
>> > > > job, but one step at a time I suppose :)
>> > > >
>> > > > On Tue, Oct 10, 2017 at 5:12 AM DRC <[email protected]> wrote:
>> > > >
>> > > > Yes, it is possible. You don't actually even need multiple X servers
>> > > > for multiple users to use VirtualGL at the same time. Multiple users
>> > > > can share the same GPU using VirtualGL. That is one of its purposes.
>> > > > In order to use VirtualGL with multiple GPUs, generally the easiest way
>> > > > to do it is to configure a single X server with multiple screens, so GPU
>> > > > 0 would be accessible by setting VGL_DISPLAY=:0.0 and GPU 1 would be
>> > > > accessible by setting VGL_DISPLAY=:0.1, etc. The reason why you are
>> > > > getting the pixel readback error is because, unless the X server is
>> > > > headless, it has to be attached to the physical display in order for
>> > > > pixel readback to work. The only way to use VirtualGL with multiple X
>> > > > servers (:0.0, :1.0, etc.) is if one or more of them is configured to be
>> > > > headless. (https://virtualgl.org/Documentation/HeadlessNV explains how
>> > > > to configure a headless X server with an nVidia GPU.)
>> > > >
>> > > > On 10/9/17 5:35 AM, Jafaruddin Lie wrote:
>> > > > > Hi all
>> > > > > Simple question is if it is possible, on a single machine with multiple
>> > > > > GPUs, running different X servers on those GPUs, for multiple users to
>> > > > > use vglrun at the same time?
>> > > > > This is our current setup:
>> > > > >
>> > > > > 1 machine with 4 Nvidia K80 cards (latest drivers), running CentOS 7,
>> > > > > VirtualGL 2.5.2, TightVNC, and Mate Desktop.
>> > > > >
>> > > > > We have 2 xorg.conf (2 cards configured on each xorg.conf), and
>> > > > > currently we are testing whether we can bring up 2 desktop sessions via VNC.
>> > > > > The desktop loads, and we can see two different X servers running on
>> > > > > those 4 cards and the VNC servers running on different displays.
>> > > > >
>> > > > > The issue is with VirtualGL. The first user can do startx and run vglrun
>> > > > > glxgears, this will work fine.
>> > > > > When the second user startx, the first user's vglrun session will be
>> > > > > terminated with this error:
>> > > > >
>> > > > > [VGL] ERROR: OpenGL error 0x0502
>> > > > > [VGL] ERROR: in readpixels---
>> > > > > [VGL] 439: Could not read pixels
>> > > > >
>> > > > > The first user can run vglrun again once the second user stops their X
>> > > > > session.
>> > > > > My understanding is that with our setup, this should be do-able, right?
>> > > > >
>> > > > > Thanks.
>> > > >
>> > > > --
>> > > > You received this message because you are subscribed to a topic in
>> > > > the Google Groups "VirtualGL User Discussion/Support" group.
>> > > > To unsubscribe from this topic, visit
>> > > > https://groups.google.com/d/topic/virtualgl-users/5B331QalCaI/unsubscribe.
>> > > > To unsubscribe from this group and all its topics, send an email to
>> > > > [email protected].
>> > > > To view this discussion on the web visit
>> > > > https://groups.google.com/d/msgid/virtualgl-users/e38fd010-f8e2-7faf-7e0b-2e3f68317985%40virtualgl.org.
>> > > > For more options, visit https://groups.google.com/d/optout.
>> > > >
>> > > > --
>> > > > You received this message because you are subscribed to the Google
>> > > > Groups "VirtualGL User Discussion/Support" group.
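For reference, the one-screen-per-GPU scheme DRC describes in the quoted thread (GPU 0 on :0.0, GPU 1 on :0.1, and so on) could be wired into a job script roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes a single 3D X server on display :0 with one screen per GPU, as in the thread. VGL_DISPLAY and vglrun are the only VirtualGL pieces it relies on.

```python
import os
import subprocess


def vgl_display_for_gpu(gpu_index, x_display=0):
    """Map a GPU index to a screen on the shared 3D X server, e.g. 1 -> ':0.1'."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    return ":{}.{}".format(x_display, gpu_index)


def vglrun_environment(gpu_index, base_env=None):
    """Copy the environment and point VGL_DISPLAY at the screen for this GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["VGL_DISPLAY"] = vgl_display_for_gpu(gpu_index)
    return env


def run_on_gpu(gpu_index, command):
    """Run a command under vglrun on the chosen GPU; returns its exit status."""
    return subprocess.call(["vglrun"] + list(command),
                           env=vglrun_environment(gpu_index))


if __name__ == "__main__":
    # Example only: needs a configured VirtualGL server and a running 2D X session.
    print(vglrun_environment(1, {})["VGL_DISPLAY"])
```

A job controller could then call something like run_on_gpu(1, ["glxgears"]) for the user assigned to the second GPU, while the 3D X server itself keeps running outside any individual job.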
