Le Thu, Apr 15, 2021 at 08:56:09AM +0200, Eric Dumazet a écrit : > > > On 4/15/21 8:39 AM, Du Cheng wrote: > > There is a reproducible sequence from the userland that will trigger a > > WARN_ON() > > condition in taprio_get_start_time, which causes kernel to panic if > > configured > > as "panic_on_warn". Remove this WARN_ON() to prevent kernel from crashing by > > userland-initiated syscalls. > > > > Reported as bug on syzkaller: > > https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c > > > > Reported-by: syzbot+d50710fd0873a9c6b...@syzkaller.appspotmail.com > > Signed-off-by: Du Cheng <duche...@gmail.com> > > --- > > Detailed explanation: > > > > In net/sched/sched_taprio.c:999 > > The condition WARN_ON(!cycle) will be triggered if cycle == 0. Value of > > cycle > > comes from sched->cycle_time, where sched is of type(struct > > sched_gate_list*). > > > > sched->cycle_time is accumulated within `parse_taprio_schedule()` during > > `taprio_init()`, in the following 2 ways: > > > > 1. from nla_get_s64(tb[TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME]); > > 2. (if zero) from parse_sched_list(..., > > tb[TCA_TAPRIO_ATTR_SCHED_ENTRY_LIST], ...); > > > > note: tb is a map parsed from netlink attributes provided via sendmsg() > > from the userland: > > > > If both two attributes (TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME, > > TCA_TAPRIO_ATTR_SCHED_ENTRY_LIST) contain 0 values or are missing, this > > will result > > in sched->cycle_time == 0 and hence trigger the WARN_ON(!cycle). > > > > Reliable reproducable steps: > > 1. add net device team0 > > 2. add team_slave_0, team_slave_1 > > 3. sendmsg(struct msghdr { > > .iov = struct nlmsghdr { > > .type = RTM_NEWQDISC, > > } > > struct tcmsg { > > .tcm_ifindex = ioctl(SIOCGIFINDEX, "team0"), > > .nlattr[] = { > > TCA_KIND: "taprio", > > TCA_OPTIONS: { > > .nlattr = { > > TCA_TAPRIO_ATTR_PRIOMAP: ..., > > TCA_TAPRIO_ATTR_SCHED_ENTRY_LIST: {0}, > > TCA_TAPRIO_ATTR_SCHED_CLICKID: 0, > > } > > } > > } > > } > > } > > > > Callstack: > > > > parse_taprio_schedule() > > taprio_change() > > taprio_init() > > qdisc_create() > > tc_modify_qdisc() > > rtnetlink_rcv_msg() > > ... > > sendmsg() > > > > These steps are extracted from syzkaller reproducer: > > https://syzkaller.appspot.com/text?tag=ReproC&x=15727cf1900000 > > > > net/sched/sch_taprio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > > index 8287894541e3..5f2ff0f15d5c 100644 > > --- a/net/sched/sch_taprio.c > > +++ b/net/sched/sch_taprio.c > > @@ -996,7 +996,7 @@ static int taprio_get_start_time(struct Qdisc *sch, > > * something went really wrong. In that case, we should warn about this > > * inconsistent state and return error. > > */ > > - if (WARN_ON(!cycle)) > > + if (!cycle) { > > return -EFAULT; > > > > /* Schedule the start time for the beginning of the next > > > > This 'fix' is wrong, not even compiled, thus not tested. > > sched->cycle_time MUST not be zero ever, there are plenty > of other places where other crashes will happen. > > You are silencing a condition that should be caught much earlier. > > There is even a fat comment to explain this, that can be partially seen in > your patch. > > > Correct fix will probably be : > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > index > 8287894541e3ce5f290be2e592c0dcbdf2ec6b60..189c617a582a2eecc92e35187379d4c2889289df > 100644 > --- a/net/sched/sch_taprio.c > +++ b/net/sched/sch_taprio.c > @@ -901,6 +901,8 @@ static int parse_taprio_schedule(struct taprio_sched *q, > struct nlattr **tb, > > list_for_each_entry(entry, &new->entries, list) > cycle = ktime_add_ns(cycle, entry->interval); > + if (!cycle) > + return -EINVAL; > new->cycle_time = cycle; > } > > >
Hi Eric, Sorry for the typo in the previous PATCH, I must have let the '{' slip out , when I tried to cleanup my own debugging lines. Anyway, my intention was indeed to silently return error back to the userland, if the cycle is 0. The removal of WARN_ON() is so that this would not cause the kernel to panic. `cycle` can be 0 if the userland provides invalid input, which should be anticipated and gracefully handled, hence panic would be too dramatic. With the removal of WARN_ON(), the original logic was not altered: return -EINVAL to the userland and abort subsequent steps. I will later submit a v2 to correct my typo. Thank you for your input. Regards, Du Cheng