On 12/07/2017 07:35 AM, Alan Somers wrote:
On Thu, Dec 7, 2017 at 2:33 AM, Andriy Gapon <a...@freebsd.org> wrote:


    [cc-ing current@ to raise more awareness]

    On 05/12/2017 16:03, Alexey Dokuchaev wrote:
     > On Fri, Nov 24, 2017 at 11:31:51AM +0200, Andriy Gapon wrote:
     >>
     >> I have reported a couple of nvidia-driver issues in the FreeBSD section
     >> of the nVidia developer forum, but no replies so far.
     >>
     >> Well, the first issue is not with the driver, but with a utility that
     >> comes with it, nvidia-smi:
     >>
     >> https://devtalk.nvidia.com/default/topic/1026589/freebsd/nvidia-smi-query-gpu-spins-forever-on-freebsd-head-amd64-/
     >>
     >> I wonder if I am the only one affected or if I see the problem because
     >> I am on head or something else.
     >> I am pretty sure that the problem is caused by a programming bug related
     >> to strtok_r.
     >
     > I'll try to reproduce it and report back.

    I've done some work with a debugger and it seems that there is code that
    does something like this:

    char *last = NULL;

    while (1) {
             if (last == NULL)
                     p = strtok_r(str, sep, &last);
             else
                     p = strtok_r(NULL, sep, &last);
             if (p == NULL)
                     break;
             ...
    }

    The problem is that when 'p' points to the last token, 'last' is NULL (in
    the FreeBSD implementation of strtok_r).  That means that on the next
    iteration the parsing starts all over again, leading to an endless loop.
    The code is incorrect from the standards point of view, because the value
    of 'last' is completely opaque and should not be used for anything other
    than passing it back to strtok_r.
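
    To illustrate the point above, here is a minimal sketch. It assumes a
    tokenizer that stores NULL in its state pointer after returning the final
    token, as described for FreeBSD's strtok_r. The names model_strtok_r,
    count_buggy and count_correct are made up for this illustration, the
    model is not the actual libc source, and the 'limit' cap exists only so
    that the faulty loop terminates.

```c
#include <string.h>

/*
 * Hypothetical model (an assumption, not the real libc code) of a
 * strtok_r that sets *last to NULL once it has returned the final token.
 */
static char *
model_strtok_r(char *s, const char *delim, char **last)
{
	char *end;

	if (s == NULL && (s = *last) == NULL)
		return (NULL);
	s += strspn(s, delim);		/* skip leading separators */
	if (*s == '\0') {
		*last = NULL;
		return (NULL);
	}
	end = s + strcspn(s, delim);	/* find the end of the token */
	if (*end == '\0') {
		*last = NULL;		/* final token: state becomes NULL */
	} else {
		*end = '\0';
		*last = end + 1;
	}
	return (s);
}

/*
 * Faulty pattern: inspects 'last' to decide whether this is the first
 * call. Returns the number of tokens seen, stopping at 'limit'.
 */
static int
count_buggy(char *str, const char *sep, int limit)
{
	char *last = NULL, *p;
	int n = 0;

	while (n < limit) {
		if (last == NULL)
			p = model_strtok_r(str, sep, &last);
		else
			p = model_strtok_r(NULL, sep, &last);
		if (p == NULL)
			break;
		n++;
	}
	return (n);
}

/*
 * Conforming pattern: 'last' is never inspected, only passed back.
 * The string is given on the first call and NULL on every later call.
 */
static int
count_correct(char *str, const char *sep)
{
	char *last, *p;
	int n = 0;

	for (p = model_strtok_r(str, sep, &last); p != NULL;
	    p = model_strtok_r(NULL, sep, &last))
		n++;
	return (n);
}
```

    With a writable copy of "a,b,c", "," as the separator and a cap of 100,
    count_buggy runs into the cap because it keeps restarting at the first
    token, while count_correct returns 3.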

    I used gdb -w to change the logic to:

    char *last = (char *)1;

    while (1) {
             if (last == (char *)1)
                     p = strtok_r(str, sep, &last);
             else
                     p = strtok_r(NULL, sep, &last);
             ...
    }

    Here 1 is used as an "impossible" pointer value that is neither NULL nor
    a valid pointer that strtok_r could store.  It's not ideal, but editing
    binary code is not as easy as editing source code.

    The binary patch is here:
    https://people.freebsd.org/~avg/nvidia-smi.bsdiff

     >> The second issue is with the FreeBSD support for the kernel driver:
     >>
     >> https://devtalk.nvidia.com/default/topic/1026645/freebsd/panic-related-to-nvkms_timers-lock-sx-lock-/
     >>
     >> I would like to get some feedback on my analysis.
     >> I am testing this patch right now:
     >>
     >> https://people.freebsd.org/~avg/extra-patch-src_nvidia-modeset_nvidia-modeset-freebsd.c
     >
     > Unfortunately, I'm not expert enough in kernel locking primitives to
     > give you a proper review; let's see what others have to say.

    It's been a while since I posted the patch and there are no comments yet.
    I can only add that I am running an INVARIANTS and WITNESS enabled kernel
    all the time and before the patch I was getting kernel panics every now
    and then.
    Since I started using the patch I haven't had a single nvidia panic yet.

     >> Also, what's the best place or who are the best people with whom to
     >> discuss such issues?
     >
     > Yes, this is a problem now: Christian Zander has left nVidia, and he
     > could not tell me who their next liaison with the FreeBSD community
     > would be. :-(

    Oh, I didn't know about Christian's departure.
    So, we are not in a very good position now.


How about Aaron Plattner (CC'd)?  Aaron, are you still working on FreeBSD driver issues?

Thanks for the heads up, Alan. I filed bug 2032249 to track this.

-- Aaron
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
