In tracking down a rare crash in GTK-VNC clients, I think I've discovered a race in the way QEMU processes resize events during the initial connection handshake.

The symptom of the problem is that the client receives a framebuffer update which stretches outside the boundaries of the guest framebuffer, as known to the VNC client. This GTK-VNC debug trace shows the initial framebuffer size is '640x480' but the first framebuffer update received is for a region of size '720x400':

  (virt-viewer:20171): gtk-vnc-DEBUG: Pixel format BPP: 32, Depth: 24, Byte order: 1234, True color: 1
               Mask  red: 255, green: 255, blue: 255
               Shift red: 16, green: 8, blue: 0
  (virt-viewer:20171): gtk-vnc-DEBUG: Display name 'QEMU (f14i686)'
  (virt-viewer:20171): gtk-vnc-DEBUG: Setting depth color to 24 (32 bpp)
  (virt-viewer:20171): gtk-vnc-DEBUG: Do resize 0x117ec10 1 640 480 0
  (virt-viewer:20171): gtk-vnc-DEBUG: Visual mask: 16711680 65280 255 shift: 16 8 0
  (virt-viewer:20171): gtk-vnc-DEBUG: Mask local: 255 255 255 remote: 255 255 255 merged: 255 255 255
  (virt-viewer:20171): gtk-vnc-DEBUG: Pixel shifts right: 16 8 0 left: 16 8 0
  (virt-viewer:20171): gtk-vnc-DEBUG: Running main loop
  (virt-viewer:20171): gtk-vnc-DEBUG: Expose 0x0 @ 640,480
  (virt-viewer:20171): gtk-vnc-DEBUG: FramebufferUpdate(-258, 0, 0, 720, 400)
  (virt-viewer:20171): gtk-vnc-DEBUG: Using evdev keycode mapping
  (virt-viewer:20171): gtk-vnc-DEBUG: FramebufferUpdate(-257, 0, 0, 720, 400)
  (virt-viewer:20171): gtk-vnc-DEBUG: FramebufferUpdate(5, 0, 0, 720, 400)
  (virt-viewer:20171): gtk-vnc-DEBUG: Framebuffer update 720x400 outside boundary 640x480

I can reproduce this perhaps 1 time in 5, if I connect the VNC client at exactly the time the QEMU guest starts while also using XSync.

At a protocol level the initial startup sequence is

  1. Client & server negotiate version
  2. Client & server negotiate auth
  3. Client sets the 'shared flag'
  4. Server sends width + height
  5. Server sends pixel format
  6. Server sends display name
  7. Client sends its supported framebuffer update encodings
  8. Client requests first framebuffer update
  9. Server sends first framebuffer update

What I believe I am seeing is the guest resizing its display at some point between steps 4 & 9, and QEMU failing to send a DESKTOPRESIZE message before doing step 9.

Looking at the QEMU code seems to confirm this hypothesis. In vnc.c, the desktop-resize messages are triggered from vnc_dpy_resize(). This code checks:

    if (size_changed) {
        if (vs->csock != -1 && vnc_has_feature(vs, VNC_FEATURE_RESIZE)) {
            ....send the resize message ....
        }
    }

The VNC_FEATURE_RESIZE feature does not get set until step 7. So there is a clear window between steps 4 and 7 in which vnc_dpy_resize() can be invoked while the test against VNC_FEATURE_RESIZE returns false, and thus the client does not get notified of the resize even though it is capable of handling it.

Furthermore, in the case where desktop resize is not supported, QEMU is not clipping its framebuffer updates to the client's view of the framebuffer size.

A crude fix would be to send an immediate DESKTOPRESIZE update the moment the client tells the server it supports the DESKTOPRESIZE pseudo-encoding.

The more involved fix is to record each client's expected width + height in the per-client VncState struct. Instead of sending the DESKTOPRESIZE updates directly from the vnc_dpy_resize() method, simply update the VncState struct with the new width/height. Then, in the send_framebuffer_update() method, check whether a DESKTOPRESIZE needs to be triggered or, if it is not supported, clip the update to the client's boundary.

Thoughts ?

Regards,
Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|