Re: [Spice-devel] how can i trace monitor change (etc) events
On 05/07/2014 05:41 PM, David Mansfield wrote:
> On 05/02/2014 09:05 AM, Marc-André Lureau wrote:
>> Hi
>>
>>> FYI: I have been running with the attached patch (not the inline above) to spice-gtk for one week now, and so has my colleague. Dual monitor works perfectly. There is one other crash scenario (xorg in guest crashes and won't restart) which sometimes happens consistently upon logging in. This can be fixed by removing the file ~/.config/monitors.xml (in the guest).
>>>
>>> I recommend strongly that the attached fix be included. I understand it is a band-aid, but so is "fixme: goto whole" (see source code), which is broken by definition.
>>
>> You are basically ignoring the surface_id from the monitor config with this patch. This isn't helping much; most probably this points to an invalid config, so I'd rather not do that and keep displaying the whole surface instead.
>>
>> Did you manage to trace back to where that surface id was generated? That's what I would do. It could be corrupted memory...
>
> I have done a bit of poking into this. I added this into qxl.ko's qxl_object.c:
>
> --- qxl_object.c~   2014-03-30 23:40:15.0 -0400
> +++ qxl_object.c    2014-05-07 17:05:24.989185760 -0400
> @@ -311,6 +311,7 @@
>          ret = qxl_hw_surface_alloc(qdev, bo, NULL);
>          if (ret)
>              return ret;
> +        DRM_ERROR("qxl_bo_check_id: %p %d %d", bo, (int)bo->is_primary, (int)bo->surface_id);
>      }
>      return 0;
>  }
>
> And lo and behold, the system keeps creating new primary surfaces, and not one of the resulting surface_ids is 0.
>
> [ 25.393840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 88020967 1 1
> [ 86.104660] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 8801fb5dcc00 1 2   <- this is the one that remote-viewer sees with surface_id #2
> [ 118.560840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 8801fb748800 1 3
>
> I think that the entire concept of is_primary vs surface_id=0 is very broken in this code, and that sometimes the real surface_id (which is not 0) is leaking out into the protocol. These ids are allocated by idr.
>
> Or perhaps the #2 surface is allocated as a primary before the #1 has been freed, and somewhere a check that only one primary is allowed is hit, and so the primary gets toggled off. (Surface #3 is when I resized the second monitor and can be ignored.)

I have an explanation for this, but not a fix. The fix needs to be made by the owner of this code (Alon or Dave according to the header!)

The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there is a conditional termed recreate_primary. The logic for this is based on qcrtc->index. In other words, the recreate_primary branch is taken only on the first head. In this case, the surface_id is FORCED to 0, and for all other heads, the actual surface_id is used.

However, no _actual_ surface_ids are 0. See above. All surfaces (valid for this context) have surface_id > 0. So it is IMPOSSIBLE for the monitor_config for the second monitor to have surface_id = 0.

So you may be asking yourself (or me) - so how is it working (in gnome3)? Well, that is funny. It's just working by coincidence for gnome3, but broken by design. Here's how that works.

When the bo is created, it _initially_ gets surface_id = 0. The actual surface id (assigned by qxl_bo_check_id) isn't assigned until the qxl_update_area_ioctl is called (by userspace).

In gnome3, this ioctl is called AFTER setting the mode on the 2nd monitor, so the second branch above, which returns the actual surface_id, still returns 0 because the surface hasn't been checked (whatever that means). In MATE and other xrandr environments, the ioctl is called right BEFORE setting the mode, so the actual surface_id is assigned and we get the trap in spice-widget.c (which I have worked around).

Thoughts?

--
Thanks,
David Mansfield
Cobite, INC.
___
Spice-devel mailing list
Spice-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/spice-devel
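[Editor's illustration] The ordering dependence described in the message above can be modeled in a few lines of plain C. This is only a toy model, assuming the behaviour is exactly as the message describes it: the struct and function names (model_bo, model_bo_check_id, model_head_surface_id) are made up for the sketch and are not the qxl.ko code, and a simple counter stands in for the kernel's idr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a qxl buffer object; NOT the kernel's struct qxl_bo. */
struct model_bo {
    bool     is_primary;
    uint32_t surface_id;   /* stays 0 until the bo has been "checked" */
};

/* Counter standing in for the idr allocator; the first id handed out is 1. */
static uint32_t next_surface_id = 1;

/* Models the effect attributed to qxl_bo_check_id(): the first call
 * gives the bo a real, non-zero surface id. */
void model_bo_check_id(struct model_bo *bo)
{
    if (bo->surface_id == 0)
        bo->surface_id = next_surface_id++;
}

/* Models the branch described for qxl_crtc_mode_set(): head 0 always
 * reports surface 0, every other head reports the bo's current id. */
uint32_t model_head_surface_id(int crtc_index, const struct model_bo *bo)
{
    bool recreate_primary = (crtc_index == 0);
    return recreate_primary ? 0 : bo->surface_id;
}
```

In this model, calling model_head_surface_id(1, &bo) before model_bo_check_id(&bo) yields 0 (the GNOME 3 ordering), while calling it afterwards yields a non-zero id (the MATE ordering), which is the value the client then refuses.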
Re: [Spice-devel] how can i trace monitor change (etc) events
Hi

Thanks a lot for your analysis so far! Adding David Airlie in CC

----- Original Message -----
> I have an explanation for this, but not a fix. The fix needs to be made by the owner of this code (Alon or Dave according to the header!)
>
> The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there is a conditional termed recreate_primary. The logic for this is based on qcrtc->index. In other words, the recreate_primary branch is taken only on the first head. In this case, the surface_id is FORCED to 0, and for all other heads, the actual surface_id is used.
>
> However, no _actual_ surface_ids are 0. See above. All surfaces (valid for this context) have surface_id > 0. So it is IMPOSSIBLE for the monitor_config for the second monitor to have surface_id = 0.

Ok

> So you may be asking yourself (or me) - so how is it working (in gnome3)? Well, that is funny. It's just working by coincidence for gnome3, but broken by design. Here's how that works.
>
> When the bo is created, it _initially_ gets surface_id = 0. The actual surface id (assigned by qxl_bo_check_id) isn't assigned until the qxl_update_area_ioctl is called (by userspace).

(There seem to be other paths to get a surface checked/allocated, with the qxl_release stuff. But I have no idea how this works.)

> In gnome3, this ioctl is called AFTER setting the mode on the 2nd monitor, so the second branch above, which returns the actual surface_id, still returns 0 because the surface hasn't been checked (whatever that means). In MATE and other xrandr environments, the ioctl is called right BEFORE setting the mode, so the actual surface_id is assigned and we get the trap in spice-widget.c (which I have worked around).
>
> Thoughts?

How come gnome3 (and mate) draw all the monitors correctly on surface 0 (that's the only thing spice-gtk shows, even with your patch), since the crtc seems to be associated with a different surface?

Hopefully Alon or Dave can shed some light here.
Re: [Spice-devel] how can i trace monitor change (etc) events
On 05/08/2014 10:59 AM, Marc-André Lureau wrote:
> Hi
>
> Thanks a lot for your analysis so far! Adding David Airlie in CC
>
> ----- Original Message -----
>> I have an explanation for this, but not a fix. The fix needs to be made by the owner of this code (Alon or Dave according to the header!)
>>
>> The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there is a conditional termed recreate_primary. The logic for this is based on qcrtc->index. In other words, the recreate_primary branch is taken only on the first head. In this case, the surface_id is FORCED to 0, and for all other heads, the actual surface_id is used.
>>
>> However, no _actual_ surface_ids are 0. See above. All surfaces (valid for this context) have surface_id > 0. So it is IMPOSSIBLE for the monitor_config for the second monitor to have surface_id = 0.
>
> Ok
>
>> So you may be asking yourself (or me) - so how is it working (in gnome3)? Well, that is funny. It's just working by coincidence for gnome3, but broken by design. Here's how that works.
>>
>> When the bo is created, it _initially_ gets surface_id = 0. The actual surface id (assigned by qxl_bo_check_id) isn't assigned until the qxl_update_area_ioctl is called (by userspace).
>
> (There seem to be other paths to get a surface checked/allocated, with the qxl_release stuff. But I have no idea how this works.)
>
>> In gnome3, this ioctl is called AFTER setting the mode on the 2nd monitor, so the second branch above, which returns the actual surface_id, still returns 0 because the surface hasn't been checked (whatever that means). In MATE and other xrandr environments, the ioctl is called right BEFORE setting the mode, so the actual surface_id is assigned and we get the trap in spice-widget.c (which I have worked around).
>>
>> Thoughts?
>
> How come gnome3 (and mate) draw all the monitors correctly on surface 0 (that's the only thing spice-gtk shows, even with your patch), since the crtc seems to be associated with a different surface?

AFAICT, there's no such thing as surface 0, at least not in qxl.ko. That's a fabrication used to denote two completely different things. surface_id = 0 can mean:

- a bo that hasn't been checked yet
- a substitute for the actual surface_id when "primary" is being conveyed downstream (spice-server/spice-widget)

In reality, all surfaces (that are used in this context) have a non-zero surface_id inside qxl.ko. These two things seem to get confused.

> Hopefully Alon or Dave can shed some light here.

Yes, please!

--
Thanks,
David Mansfield
Cobite, INC.
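[Editor's illustration] The ambiguity described above, one value carrying two meanings, can be made concrete with a small sketch. Everything here is hypothetical: the enum, its members, and classify_surface_ref() do not exist in qxl.ko or the SPICE protocol. The only point is that the numeric id alone cannot tell the two zero cases apart, so some extra piece of state (modeled as sent_as_primary) is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification; none of these names exist in qxl.ko. */
enum surface_ref_kind {
    SURFACE_REF_UNCHECKED,      /* bo exists but has no id allocated yet   */
    SURFACE_REF_PRIMARY_ALIAS,  /* 0 sent downstream to mean "the primary" */
    SURFACE_REF_REAL            /* real, non-zero id allocated in the guest */
};

/* A zero id is ambiguous on its own: the caller has to say whether the
 * value was deliberately substituted for the primary surface. */
enum surface_ref_kind classify_surface_ref(uint32_t surface_id,
                                           bool sent_as_primary)
{
    if (surface_id != 0)
        return SURFACE_REF_REAL;
    return sent_as_primary ? SURFACE_REF_PRIMARY_ALIAS
                           : SURFACE_REF_UNCHECKED;
}
```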
Re: [Spice-devel] how can i trace monitor change (etc) events
Hi

----- Original Message -----
> On 05/08/2014 10:59 AM, Marc-André Lureau wrote:
>> Hi
>>
>> Thanks a lot for your analysis so far! Adding David Airlie in CC
>>
>> ----- Original Message -----
>>> I have an explanation for this, but not a fix. The fix needs to be made by the owner of this code (Alon or Dave according to the header!)
>>>
>>> The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there is a conditional termed recreate_primary. The logic for this is based on qcrtc->index. In other words, the recreate_primary branch is taken only on the first head. In this case, the surface_id is FORCED to 0, and for all other heads, the actual surface_id is used.
>>>
>>> However, no _actual_ surface_ids are 0. See above. All surfaces (valid for this context) have surface_id > 0. So it is IMPOSSIBLE for the monitor_config for the second monitor to have surface_id = 0.
>>
>> Ok
>>
>>> So you may be asking yourself (or me) - so how is it working (in gnome3)? Well, that is funny. It's just working by coincidence for gnome3, but broken by design. Here's how that works.
>>>
>>> When the bo is created, it _initially_ gets surface_id = 0. The actual surface id (assigned by qxl_bo_check_id) isn't assigned until the qxl_update_area_ioctl is called (by userspace).
>>
>> (There seem to be other paths to get a surface checked/allocated, with the qxl_release stuff. But I have no idea how this works.)
>>
>>> In gnome3, this ioctl is called AFTER setting the mode on the 2nd monitor, so the second branch above, which returns the actual surface_id, still returns 0 because the surface hasn't been checked (whatever that means). In MATE and other xrandr environments, the ioctl is called right BEFORE setting the mode, so the actual surface_id is assigned and we get the trap in spice-widget.c (which I have worked around).
>>>
>>> Thoughts?
>>
>> How come gnome3 (and mate) draw all the monitors correctly on surface 0 (that's the only thing spice-gtk shows, even with your patch), since the crtc seems to be associated with a different surface?
>
> AFAICT, there's no such thing as surface 0, at least not in qxl.ko. That's a fabrication used to denote two completely different things. surface_id = 0 can mean:
>
> - a bo that hasn't been checked yet
> - a substitute for the actual surface_id when "primary" is being conveyed downstream (spice-server/spice-widget)

It seems to me surface 0 is the primary surface; it is associated with crtc 0 (kernel and userspace).

In xf86-video-qxl, crtc_resize() allocates a primary surface with the _whole virtual size_, on crtc 0.

In drmmode_set_mode_major(), all crtcs seem to share the same primary fb (except when rotated?):

    fb_id = drmmode->fb_id;
    if (drmmode_crtc->rotate_fb_id) {
        fb_id = drmmode_crtc->rotate_fb_id;
        x = y = 0;
    }
    ret = drmModeSetCrtc(drmmode->fd, drmmode_crtc->mode_crtc->crtc_id,
                         fb_id, x, y, output_ids, output_count, kmode);

That would explain why surface 0 has the right size and the right content.
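[Editor's illustration] If every crtc scans out of one shared framebuffer at its own (x, y) offset, as suggested above, then that framebuffer has to span all heads. A minimal sketch of that "whole virtual size" computation follows; struct head_rect and virtual_size() are names invented for this example and do not come from xf86-video-qxl.

```c
#include <stddef.h>
#include <stdint.h>

/* One head, in the WxH+X+Y sense of a "monitor config" line.
 * Hypothetical type, for illustration only. */
struct head_rect {
    uint32_t x, y, width, height;
};

/* Smallest surface, anchored at (0,0), that contains every head. */
void virtual_size(const struct head_rect *heads, size_t n,
                  uint32_t *out_width, uint32_t *out_height)
{
    uint32_t w = 0, h = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t right  = heads[i].x + heads[i].width;
        uint32_t bottom = heads[i].y + heads[i].height;

        if (right > w)
            w = right;
        if (bottom > h)
            h = bottom;
    }
    *out_width = w;
    *out_height = h;
}
```

With the two heads from the GNOME 3 trace posted earlier in the thread (1024x768+0+0 and 400x377+1024+0), this gives a 1424x768 surface.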
Re: [Spice-devel] how can i trace monitor change (etc) events
> ----- Original Message -----
>> I have an explanation for this, but not a fix. The fix needs to be made by the owner of this code (Alon or Dave according to the header!)
>>
>> The bug lies in qxl_display.c:qxl_crtc_mode_set. In this method, there is a conditional termed recreate_primary. The logic for this is based on qcrtc->index. In other words, the recreate_primary branch is taken only on the first head. In this case, the surface_id is FORCED to 0, and for all other heads, the actual surface_id is used.
>>
>> However, no _actual_ surface_ids are 0. See above. All surfaces (valid for this context) have surface_id > 0. So it is IMPOSSIBLE for the monitor_config for the second monitor to have surface_id = 0.
>
> Ok
>
>> So you may be asking yourself (or me) - so how is it working (in gnome3)? Well, that is funny. It's just working by coincidence for gnome3, but broken by design. Here's how that works.
>>
>> When the bo is created, it _initially_ gets surface_id = 0. The actual surface id (assigned by qxl_bo_check_id) isn't assigned until the qxl_update_area_ioctl is called (by userspace).
>
> (There seem to be other paths to get a surface checked/allocated, with the qxl_release stuff. But I have no idea how this works.)
>
>> In gnome3, this ioctl is called AFTER setting the mode on the 2nd monitor, so the second branch above, which returns the actual surface_id, still returns 0 because the surface hasn't been checked (whatever that means). In MATE and other xrandr environments, the ioctl is called right BEFORE setting the mode, so the actual surface_id is assigned and we get the trap in spice-widget.c (which I have worked around).
>>
>> Thoughts?
>
> How come gnome3 (and mate) draw all the monitors correctly on surface 0 (that's the only thing spice-gtk shows, even with your patch), since the crtc seems to be associated with a different surface?
>
> Hopefully Alon or Dave can shed some light here.

Okay, I barely remember how this code works. My memory says I wrote it that way to avoid hitting asserts in the host spice server code, which were close to impossible to avoid, but I'd have to go set up a test box to look into it.

I think the idea was that for the first crtc we had to use the primary surface, but for non-first crtcs we need a surface id. However, if you fix things so that the secondary head's surface id is correct and the qxl host then spews, I'm not sure it's really possible to implement proper KMS support on qxl. We should be able to avoid this problem by always pointing at surface 0 and hoping it's big enough, but wayland multi-head would be badly broken even then.

Dave.
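[Editor's illustration] The "always point at surface 0 and hope it's big enough" fallback mentioned above depends on one condition: each head's rectangle must lie inside surface 0. A minimal sketch of that check is below. The function head_fits_surface() is hypothetical and is not taken from qxl.ko or spice-server; it only spells out the "big enough" condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the head rectangle WxH+X+Y lies entirely inside a surface
 * of the given size whose origin is (0,0). */
bool head_fits_surface(uint32_t x, uint32_t y,
                       uint32_t width, uint32_t height,
                       uint32_t surface_width, uint32_t surface_height)
{
    /* 64-bit sums so that x + width cannot wrap around. */
    uint64_t right  = (uint64_t)x + width;
    uint64_t bottom = (uint64_t)y + height;

    return right <= surface_width && bottom <= surface_height;
}
```

For example, a 400x377 head placed at +1024+0 fits a 1424x768 surface but not a 1024x768 one.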
Re: [Spice-devel] how can i trace monitor change (etc) events
On Wed, Apr 16, 2014 at 06:32:30AM +1000, Lindsay Mathieson wrote:
> Monitor resize is handled by the spice-vdagent app in the *guest* system, it must be installed and running in the guest for that to work. It serves the same purpose as the guest tools in vmware, virtualbox etc.

With a recent qemu (with spice client monitor config support) and a guest with the kms qxl driver, the agent is no longer involved in resizing the guest.

Christophe
Re: [Spice-devel] how can i trace monitor change (etc) events
On 04/16/2014 03:15 AM, Christophe Fergeau wrote:
> On Wed, Apr 16, 2014 at 06:32:30AM +1000, Lindsay Mathieson wrote:
>> Monitor resize is handled by the spice-vdagent app in the *guest* system, it must be installed and running in the guest for that to work. It serves the same purpose as the guest tools in vmware, virtualbox etc.
>
> With a recent qemu (with spice client monitor config support) and a guest with the kms qxl driver, the agent is no longer involved in resizing the guest.

Ok, good to know. So how can I trace the interaction of kms/qxl.ko <=> spice-server <=> remote-viewer?

I have already posted a debug trace from remote-viewer showing that the monitor config events received by remote-viewer are different when using MATE vs GNOME3. In particular, with MATE we get a bunch of:

(remote-viewer:12916): GSpice-WARNING **: FIXME: only support monitor config with primary surface 0, but given config surface 5

This seems suspicious to me, given that these warnings are followed immediately by incorrect behavior and don't happen in GNOME3.

I have already tried using spice-gtk 0.25 compiled from source (instead of F20's 0.23) and it makes no difference. I'm willing to compile qemu or whatever to track this down, but some pointers would be really helpful.

Thanks,
David
[Spice-devel] how can i trace monitor change (etc) events
Hi All:

I'd like to be able to trace all events going back and forth between spice server / client and also between Xorg driver (qxl) and spice server regarding monitor connect, resize etc.

The reason for this is that I'd like to debug why gnome3 works in a dual head setup but other desktop environments don't (e.g. mate desktop, fluxbox). There must be something gnome3 desktop is doing via xrandr that is different - or perhaps it's an agent message that is being generated / sent.

Seems the minimum would be to trace:

* spice agent messages
* xrandr events in the guest
* the various surfaces (if that is the correct term) that the spice-server is managing.

Any pointers?

--
Thanks,
David Mansfield
Cobite, INC.
Re: [Spice-devel] how can i trace monitor change (etc) events
----- Original Message -----
> Hi All:
>
> I'd like to be able to trace all events going back and forth between spice server / client and also between Xorg driver (qxl) and spice server regarding monitor connect, resize etc.
>
> The reason for this is that I'd like to debug why gnome3 works in a dual head setup but other desktop environments don't (e.g. mate desktop, fluxbox). There must be something gnome3 desktop is doing via xrandr that is different - or perhaps it's an agent message that is being generated / sent.

gnome3 / mutter is watching xrandr events, and reconfiguring the display to follow the preferred resolution.

https://git.gnome.org/browse/mutter/tree/src/backends/x11/meta-monitor-manager-xrandr.c

> Seems the minimum would be to trace:
>
> * spice agent messages
> * xrandr events in the guest
> * the various surfaces (if that is the correct term) that the spice-server is managing.
>
> Any pointers?

There is no easy solution; each component has different tracing functionality. In this case (monitor configuration issues), it's usually easier to add a few printfs, break on points of interest and follow the code. Having 10 traces from various components won't help you much, as you will struggle and be overwhelmed trying to understand the interactions.
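[Editor's illustration] As a rough idea of what "add a few printfs" might look like in a userspace component, here is a small helper that tags each message with file, line and function, so that output from several components can be told apart afterwards. This is a generic sketch in standard C: trace_format() and TRACE() are made-up names, not part of spice-gtk, spice-server or the qxl driver, and kernel code would use the kernel's own logging macros instead.

```c
#include <stdarg.h>
#include <stdio.h>

/* Formats "file:line func(): message" into buf.
 * Returns the snprintf-style length, or a negative value on error. */
int trace_format(char *buf, size_t size, const char *file, int line,
                 const char *func, const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(buf, size, "%s:%d %s(): ", file, line, func);

    if (n < 0 || (size_t)n >= size)
        return n;               /* error, or prefix already truncated */

    va_start(ap, fmt);
    int m = vsnprintf(buf + n, size - (size_t)n, fmt, ap);
    va_end(ap);

    return m < 0 ? m : n + m;
}

/* Convenience wrapper: format at the call site and print to stderr. */
#define TRACE(...) do { \
        char trace_buf_[256]; \
        trace_format(trace_buf_, sizeof trace_buf_, \
                     __FILE__, __LINE__, __func__, __VA_ARGS__); \
        fprintf(stderr, "%s\n", trace_buf_); \
    } while (0)
```

A call such as TRACE("surface_id=%d", id); placed at a point of interest then produces one line on stderr that can be grepped for.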
Re: [Spice-devel] how can i trace monitor change (etc) events
On 04/15/2014 10:35 AM, David Mansfield wrote:
> Hi All:
>
> I'd like to be able to trace all events going back and forth between spice server / client and also between Xorg driver (qxl) and spice server regarding monitor connect, resize etc.
>
> The reason for this is that I'd like to debug why gnome3 works in a dual head setup but other desktop environments don't (e.g. mate desktop, fluxbox). There must be something gnome3 desktop is doing via xrandr that is different - or perhaps it's an agent message that is being generated / sent.
>
> Seems the minimum would be to trace:
>
> * spice agent messages
> * xrandr events in the guest
> * the various surfaces (if that is the correct term) that the spice-server is managing.
>
> Any pointers?

Further information/evidence:

I've enabled debugging in remote-viewer (--spice-debug) and can clearly see from the client side that the monitor config is different between gnome3 and mate desktop when activating the second display.

Here are the monitor config events (i.e. grep for 'monitor config' in remote-viewer output).

In mate:

(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #0 1024x768+400+0 @ 32 bpp
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #1 400x377+0+0 @ 32 bpp
(remote-viewer:12916): GSpice-WARNING **: FIXME: only support monitor config with primary surface 0, but given config surface 5
(remote-viewer:12916): GSpice-WARNING **: FIXME: only support monitor config with primary surface 0, but given config surface 5
(remote-viewer:12916): GSpice-WARNING **: FIXME: only support monitor config with primary surface 0, but given config surface 5
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #0 1024x768+0+0 @ 32 bpp
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #1 400x377+0+0 @ 32 bpp

And here is gnome3:

(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #0 1024x768+400+0 @ 32 bpp
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #1 400x377+0+0 @ 32 bpp
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #0 1024x768+0+0 @ 32 bpp
(remote-viewer:12916): GSpice-DEBUG: channel-main.c:1183 main-1:0: monitor config: #1 400x377+1024+0 @ 32 bpp

Where do the monitor config events originate? spice-vdagentd?

Clearly remote-viewer is being told to display the wrong area in the mate-desktop case. I have the entire debug output of remote-viewer available for each scenario if anyone is interested.

Thanks,
David
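[Editor's illustration] The difference between the two traces above can also be checked mechanically. The sketch below parses the "monitor config: #N WxH+X+Y @ B bpp" part of a remote-viewer debug line and tests whether two heads cover a common area. It assumes the line format is exactly as shown in the traces; struct monitor_config, parse_monitor_config() and heads_overlap() are names invented for this example, not spice-gtk API.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Fields of one "monitor config" debug line. Hypothetical type. */
struct monitor_config {
    int id, width, height, x, y, bpp;
};

/* Parses the "monitor config: #N WxH+X+Y @ B bpp" part of a log line.
 * Returns false for lines that do not carry such a record. */
bool parse_monitor_config(const char *line, struct monitor_config *out)
{
    const char *p = strstr(line, "monitor config: #");

    if (p == NULL)
        return false;
    return sscanf(p, "monitor config: #%d %dx%d+%d+%d @ %d bpp",
                  &out->id, &out->width, &out->height,
                  &out->x, &out->y, &out->bpp) == 6;
}

/* True when the two head rectangles share at least one pixel. */
bool heads_overlap(const struct monitor_config *a,
                   const struct monitor_config *b)
{
    return a->x < b->x + b->width  && b->x < a->x + a->width &&
           a->y < b->y + b->height && b->y < a->y + a->height;
}
```

Applied to the last two lines of each trace, the MATE heads (1024x768+0+0 and 400x377+0+0) overlap, while the GNOME 3 heads (1024x768+0+0 and 400x377+1024+0) do not.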
Re: [Spice-devel] how can i trace monitor change (etc) events
On Tue, 15 Apr 2014 10:35:32 AM David Mansfield wrote:
> I'd like to be able to trace all events going back and forth between spice server / client and also between Xorg driver (qxl) and spice server regarding monitor connect, resize etc.

Monitor resize is handled by the spice-vdagent app in the *guest* system, it must be installed and running in the guest for that to work. It serves the same purpose as the guest tools in vmware, virtualbox etc.

Dual monitors need the qxl drivers working (xserver-xorg-video-qxl) and kernel support - can't remember which version, but I know there were some kernel patches related to that. Fedora 20 didn't have them but they may have been added since.

--
Lindsay
Re: [Spice-devel] how can i trace monitor change (etc) events
On 04/15/2014 04:32 PM, Lindsay Mathieson wrote:
> On Tue, 15 Apr 2014 10:35:32 AM David Mansfield wrote:
>> I'd like to be able to trace all events going back and forth between spice server / client and also between Xorg driver (qxl) and spice server regarding monitor connect, resize etc.
>
> Monitor resize is handled by the spice-vdagent app in the *guest* system, it must be installed and running in the guest for that to work. It serves the same purpose as the guest tools in vmware, virtualbox etc.
>
> Dual monitors need the qxl drivers working (xserver-xorg-video-qxl) and kernel support - can't remember which version, but I know there were some kernel patches related to that. Fedora 20 didn't have them but they may have been added since.

Thanks for the clarification. All of this is somewhat known to me, although I did not know there are recent kernel patches that may be missing from F20's qxl.ko.

My situation is: dual monitors work (with qxl.ko loaded in the F20 guest of course) when the guest desktop environment is gnome3, but do not work when the guest desktop environment is mate or fluxbox.

I am trying to identify what is different between these two environments, so I'd like to be able to trace the messages going back and forth to some degree.

How can I see the messages being generated/received by spice-vdagentd in the guest?

FYI: Host, guest and client are all F20 fully updated (and running on the same machine).

Thanks,
David