On Fri, May 27, 2016 at 04:18:48PM +0200, David Coppa wrote:
> On Fri, 27 May 2016, Carlin Bingham wrote:
> 
> > On Fri, May 27, 2016 at 01:07:09AM +0200, Theo Buehler wrote:
> > > On Thu, May 26, 2016 at 05:54:30PM -0400, Andre Smagin wrote:
> > > > On Sat, 14 May 2016 21:01:29 +0200 (CEST)
> > > > [email protected] wrote:
> > > > 
> > > > > >Synopsis:    radeon(4) drm crashing on current/amd64
> > > > [...]
> > > > > drm:pid77501:radeon_fence_wait_empty_locked *ERROR* error waiting for 
> > > > > ring[3] to become idle (-1601868)
> > > > 
> > > > 
> > > > I am seeing the same issue, very infrequently (maybe once every week
> > > > or two):
> > > > 
> > > > drm:pid55825:radeon_fence_wait_empty_locked *ERROR* error waiting for 
> > > > ring[3] to become idle (-6007676)
> > > > i3(49392): syscall 97 "inet"
> > > > 
> > > > Not sure what happens to i3 as X crashes, but I get that pledge message 
> > > > every time.
> > > > (Previously mentioned i3 to dcoppa, but before realizing it was related 
> > > > to radeon issue.)
> > > 
> > > I combed through the i3 source code hoping to find out what might
> > > cause that socket(2) call to break i3's pledge promise.
> > > I couldn't find anything: all socket calls use AF_LOCAL, which should
> > > be covered by the "unix" pledge.
> > > 
> > > Without seeing a ktrace output, I don't think I can make any progress
> > > here.
> > > 
> > 
> > If i3's restore_xcb_check_cb() (src/restore_layout.c) sees that the
> > connection to X has been lost, it calls restore_connect(), which calls
> > libxcb's xcb_connect().
> > 
> > In libxcb that calls xcb_connect_to_display_with_auth_info() which calls
> > _xcb_open(), which calls _xcb_open_unix(), and usually that would be it,
> > but if opening the unix socket fails (because X has fallen over) it tries
> > again to connect by calling _xcb_open_tcp() which sets up an AF_INET
> > addrinfo and passes that to _xcb_socket()... and you can probably guess
> > what happens next.
> 
> Fallback code could be removed with no (imho) dramatic consequences.

That seems to be enough to fix the present issue, but there is still
this part at the start of _xcb_open():

    if ((!protocol || (strcmp("unix",protocol) != 0)) &&
        (*host != '\0') && (strcmp("unix",host) != 0))
    {
        /* display specifies TCP */
        unsigned short port = X_TCP_PORT + display;
        return _xcb_open_tcp(host, protocol, port);
    }

Can't this also be triggered in some circumstances in Carlin's codepath?
Unfortunately, I don't know my way around X well enough to make this
happen.
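
To make the question concrete, here is a minimal sketch of that test
pulled out into a standalone predicate. The name wants_tcp() is
hypothetical and is not part of libxcb; it assumes host is non-NULL and
that host and protocol have already been split out of the display
string, as they are by the time _xcb_open() sees them.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxcb: returns 1 when the branch at
 * the start of _xcb_open() would go straight to _xcb_open_tcp(), and 0
 * otherwise.  Assumes host is non-NULL; protocol may be NULL.
 */
static int
wants_tcp(const char *host, const char *protocol)
{
    return (!protocol || strcmp("unix", protocol) != 0) &&
        *host != '\0' && strcmp("unix", host) != 0;
}
```

With this, wants_tcp("", NULL) and wants_tcp("unix", NULL) are 0, while
wants_tcp("localhost", NULL) and wants_tcp("localhost", "tcp") are 1.
So if I read it correctly, an empty host never takes this branch, but
any non-empty host other than "unix" does, unless the protocol is
explicitly "unix".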

> Index: src/xcb_util.c
> ===================================================================
> RCS file: /cvs/xenocara/dist/libxcb/src/xcb_util.c,v
> retrieving revision 1.11
> diff -u -p -u -p -r1.11 xcb_util.c
> --- src/xcb_util.c    2 Feb 2016 18:42:22 -0000       1.11
> +++ src/xcb_util.c    27 May 2016 14:18:23 -0000
> @@ -297,11 +297,6 @@ static int _xcb_open(const char *host, c
>      fd = _xcb_open_unix(protocol, file);
>      free(file);
>  
> -    if (fd < 0 && !protocol && *host == '\0') {
> -            unsigned short port = X_TCP_PORT + display;
> -            fd = _xcb_open_tcp(host, protocol, port);
> -    }
> -
>      return fd;
>  #endif /* !_WIN32 */
>      return -1; /* if control reaches here then something has gone wrong */
