> I keep on getting a kernel crash in "conskbd" and
> I've narrowed it down to:
> 
> * Pressing a control, windows key, alt or shift
> * Another key (possibly the left hand side of a
> QWERTY keyboard)

A wild guess: CAPSLOCK, perhaps?


> I know that sounds awfully vague, but I'm not yet
> able to reproduce it.

> Attaching mdb to the latest crash I see:
> 
> (Full log at http://www.adam.com.au/lloy0076/crash-log2.log)


> I'm not sure what more I can do in mdb to help find the cause of the 
> problem, but can anyone see something I could do that could help 
> pinpoint it better?


Hmm, looks like it crashes in conskbd_mux_dequeue_msg(),
called from conskbd_mux_upstream_msg().

In conskbd_mux_upstream_msg() I see:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/conskbd.c#1824

   1834         msg = conskbd_mux_find_msg(mp);
   ...
   1869         conskbd_mux_dequeue_msg(msg);


conskbd_mux_find_msg() is searching a linked list,
starting at "conskbd_msg_queue", looking for a certain
element that matches some ioctl id.  A pointer to the
found linked list element is returned.

conskbd_mux_dequeue_msg() is supposed to remove from
the linked list the element that was previously
found and returned by conskbd_mux_find_msg().

It seems conskbd_mux_dequeue_msg() is unable to
find that message on the list, in which case it would crash the
kernel with a NULL pointer dereference panic.
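
To make the find/dequeue pairing concrete, here is a minimal user-space
sketch of the same pattern. It is not the conskbd.c code: pending_msg_t,
find_msg() and dequeue_msg() are hypothetical stand-ins, and the real
driver also takes a lock around the list. The point is only that a
dequeue routine can report "not on the list" instead of dereferencing
a NULL pointer when the element has already been removed.

```c
#include <stddef.h>

/* Hypothetical stand-in for conskbd_pending_msg_t: just a link and an id. */
typedef struct pending_msg {
	struct pending_msg *next;	/* plays the role of kpm_next */
	unsigned int id;		/* ioctl id to match on */
} pending_msg_t;

/* Return the first element whose id matches, or NULL if there is none. */
static pending_msg_t *
find_msg(pending_msg_t *head, unsigned int id)
{
	pending_msg_t *p;

	for (p = head; p != NULL; p = p->next) {
		if (p->id == id)
			return (p);
	}
	return (NULL);
}

/*
 * Unlink msg from the list rooted at *headp.
 * Returns 1 if msg was unlinked, 0 if it was not on the list.
 */
static int
dequeue_msg(pending_msg_t **headp, pending_msg_t *msg)
{
	pending_msg_t **pp;

	/* pp always points at the link that leads to the current element. */
	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == msg) {
			*pp = msg->next;
			msg->next = NULL;
			return (1);
		}
	}
	return (0);
}
```

In this sketch, calling dequeue_msg() a second time for the same element
simply returns 0, which is the situation the panic seems to correspond to.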



I think it would be interesting to print:

- all the elements of the linked list, starting with "conskbd_msg_queue"

    conskbd_msg_queue::print
    conskbd_msg_queue::print [0]
    conskbd_msg_queue::print [0].kpm_next[0]
    conskbd_msg_queue::print [0].kpm_next[0].kpm_next[0]
    ....


- The argument for conskbd_mux_dequeue_msg(), fffffffed1aacc08.
  This is the element that is supposed to be on that linked list
  but cannot be found.  Print that as a conskbd_pending_msg_t
  struct, too:

   fffffffed1aacc08::print conskbd_pending_msg_t 



Hmm, looking at the code in conskbd_mux_upstream_msg()...

   1834         msg = conskbd_mux_find_msg(mp);
   1835 
   1836         if (!msg) {
   1837                 /*
   1838                  * Here we discard the response if:
   1839                  *
   1840                  *   1. It's an KIOCSLED request; see conskbd_streams_setled().
   1841                  *   2. The application has already closed the upper stream;
   1842                  *              see conskbdclose()
   1843                  */
   1844                 freemsg(mp);
   1845                 return;
   1846         }
   1847 
   1848         /*
   1849          * We use the b_next field of mblk_t structure to link all
   1850          * response coming from lower queues into a linkage list,
   1851          * and make use of the b_prev field to save a pointer to
   1852          * the lower queue from which the current response message
   1853          * comes.
   1854          */
   1855         ASSERT(mp->b_next == NULL && mp->b_prev == NULL);
   1856         mutex_enter(&msg->kpm_lock);
   1857         mp->b_next = msg->kpm_resp_list;
   1858         mp->b_prev = (mblk_t *)lqs;
   1859         msg->kpm_resp_list = mp;
   1860         msg->kpm_resp_nums ++;
   1861         mutex_exit(&msg->kpm_lock);
   1862 
   1863         if (msg->kpm_resp_nums < msg->kpm_req_nums)
   1864                 return;
   1865 
   1866         ASSERT(msg->kpm_resp_nums == msg->kpm_req_nums);
   1867         ASSERT(mp == msg->kpm_resp_list);
   1868 
   1869         conskbd_mux_dequeue_msg(msg);

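Shouldn't the if condition at line 1863 be protected by the mutex?
As written, kpm_resp_nums is compared after the lock has been dropped,
so two responses arriving at the same time from different lower queues
might both pass the check and both call conskbd_mux_dequeue_msg() for
the same msg.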

Something like:

        msg->kpm_resp_nums ++;
        if (msg->kpm_resp_nums < msg->kpm_req_nums) {
                mutex_exit(&msg->kpm_lock);
                return;
        }
        mutex_exit(&msg->kpm_lock);
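
The same idea as a self-contained user-space sketch, using POSIX threads
in place of the kernel mutex calls. pending_t and record_response() are
hypothetical names, not anything from conskbd.c. The decision "was this
the last response?" is made while the lock is still held and handed back
to the caller, so only the caller that completes the set would go on to
dequeue.

```c
#include <pthread.h>

/* Hypothetical stand-in for the counters in conskbd_pending_msg_t. */
typedef struct pending {
	pthread_mutex_t lock;	/* plays the role of kpm_lock */
	int req_nums;		/* responses expected */
	int resp_nums;		/* responses seen so far */
} pending_t;

/*
 * Count one response.  Returns 0 while responses are still outstanding
 * and 1 once the count has reached req_nums.  The comparison happens
 * under the lock, mirroring the "resp < req" test in the fragment above.
 */
static int
record_response(pending_t *p)
{
	int done;

	pthread_mutex_lock(&p->lock);
	p->resp_nums++;
	done = !(p->resp_nums < p->req_nums);
	pthread_mutex_unlock(&p->lock);

	return (done);
}
```

A caller would then dequeue only when record_response() returns 1.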
 
 
This message posted from opensolaris.org