#25125: KIST: Bug: scheduler_release_channel: Non-fatal assertion
!(smartlist_pos(channels_pending, chan) == -1)
------------------------------+--------------------------------
     Reporter:  dgoulet       |      Owner:  dgoulet
         Type:  defect        |     Status:  assigned
     Priority:  High          |  Milestone:  Tor: 0.3.3.x-final
    Component:  Core Tor/Tor  |    Version:
     Severity:  Normal        |   Keywords:  032-backport
Actual Points:                |  Parent ID:
       Points:                |   Reviewer:
      Sponsor:                |
------------------------------+--------------------------------
 Because of commit `cb5654f300312a8f` and `adaf3e9b89f62d68` merged last
 night, we now see a new bug (benign as far as I can tell).

 Here is the `SCHED_BUG()` on one of my test relay:

 {{{
 Feb 02 12:56:21.930 [warn] tor_bug_occurred_(): Bug:
 src/or/scheduler.c:648: scheduler_release_channel: Non-fatal assertion
 !(smartlist_pos(channels_pending, chan) == -1) failed. (on Tor 0.3.3.1
 -alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.930 [warn] Bug: Non-fatal assertion
 !(smartlist_pos(channels_pending, chan) == -1) failed in
 scheduler_release_channel at src/or/scheduler.c:648. Stack trace: (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.930 [warn] Bug:
 /root/git/tor/src/or/tor(log_backtrace+0x42) [0x5579b56bb582] (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.930 [warn] Bug:
 /root/git/tor/src/or/tor(tor_bug_occurred_+0xb9) [0x5579b56d6919] (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:
 /root/git/tor/src/or/tor(scheduler_release_channel+0x1aa) [0x5579b55edb0a]
 (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /root/git/tor/src/or/tor(+0x10bd94)
 [0x5579b563fd94] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:
 /root/git/tor/src/or/tor(connection_handle_write+0x2e) [0x5579b564067e]
 (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /root/git/tor/src/or/tor(+0xbb229)
 [0x5579b55ef229] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /root/git/tor/src/or/tor(+0xb8c8f)
 [0x5579b55ecc8f] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /usr/lib/x86_64-linux-
 gnu/libevent-2.0.so.5(event_base_loop+0x819) [0x7fcbd0fe94c9] (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:
 /root/git/tor/src/or/tor(do_main_loop+0x25c) [0x5579b5587f0c] (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:
 /root/git/tor/src/or/tor(tor_run_main+0x275) [0x5579b5589475] (on Tor
 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:
 /root/git/tor/src/or/tor(tor_main+0x3a) [0x5579b5582b3a] (on Tor 0.3.3.1
 -alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /root/git/tor/src/or/tor(main+0x19)
 [0x5579b55828a9] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /lib/x86_64-linux-
 gnu/libc.so.6(__libc_start_main+0xf0) [0x7fcbcff01830] (on Tor 0.3.3.1
 -alpha-dev 0fd91772b1ece9e0)
 Feb 02 12:56:21.931 [warn] Bug:     /root/git/tor/src/or/tor(_start+0x29)
 [0x5579b55828f9] (on Tor 0.3.3.1-alpha-dev 0fd91772b1ece9e0)
 }}}

 It appears that we can release a channel from within
 `connection_handle_write()`. At that point, the channel was popped from
 the pending list and within the scheduler loop, we are writing the channel
 connection's outbuf onto the wire so when it is release, this bug is
 triggered in `scheduler_release_channel()`:

 {{{
   if (chan->scheduler_state == SCHED_CHAN_PENDING) {
     if (SCHED_BUG(smartlist_pos(channels_pending, chan) == -1, chan)) {
       log_warn(LD_SCHED, "Scheduler asked to release channel %" PRIu64 " "
                          "but it wasn't in channels_pending",
                chan->global_identifier);
 }}}

 This doesn't affect tor in any ways and the current codeflow of channels
 suggest that the release should accept that the channel is not in the
 pending list even if the scheduler state is in PENDING.

 The reason is that while in the scheduler loop, once the channel is popped
 from the pending list, we *have* to keep it in PENDING state (see #24700)
 until the scheduler is done with it and decides to put it back in the list
 or change its state. This is because a channel can be rescheduled in while
 it is in the scheduler loop which is *not* ideal for KIST but this is the
 way it is for now.

 We have to do a quick fix for 032 and 033 but this should be resolved in
 034 with #24554.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25125>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Reply via email to