On Thu, Aug 22, 2019 at 12:18:07PM -0400, tony camuso wrote: > On 8/22/19 12:13 PM, Corey Minyard wrote: > > On Thu, Aug 22, 2019 at 10:05:05AM -0400, Tony Camuso wrote: > > > This includes the fix for the bug I introduced in the last patch. > > > > Got it, thanks for the update. > > > > In the future, could you send patches with git send-email and put > > comments after the --- line? It looks like you are using > > git format-patch and then inserting it into an email, which > > means I have to save it in a file, edit it, then use git am. > > Plus I'm getting "Corey," as the subject line for some strange > > reason. Most kernel maintainers probably won't take patches that > > way. > > > > -corey > > Yes, sorry about that. I'm using git send-email --annotate. > I'm just putting my note in the wrong place. > > Do you need me to resend?
No, it's not a big deal, I'm just trying to avoid you getting a hostile response in some other situation :). -corey > > > > > > > > > From bca3736afa746dcd92ebf1b04417462dcd46283c Mon Sep 17 00:00:00 2001 > > > From: Tony Camuso <tcam...@redhat.com> > > > Date: Thu, 22 Aug 2019 08:24:53 -0400 > > > Subject: [PATCH v2] ipmi: move message error checking to avoid deadlock > > > > > > V1->V2: in handle_one_rcv_msg, if data_size > 2, set requeue to zero and > > > goto out instead of calling ipmi_free_msg. > > > Kosuke Tatsukawa <ta...@ab.jp.nec.com> > > > > > > In the source stack trace below, function set_need_watch tries to > > > take out the same si_lock that was taken earlier by ipmi_thread. > > > > > > ipmi_thread() [drivers/char/ipmi/ipmi_si_intf.c:995] > > > smi_event_handler() [drivers/char/ipmi/ipmi_si_intf.c:765] > > > handle_transaction_done() [drivers/char/ipmi/ipmi_si_intf.c:555] > > > deliver_recv_msg() [drivers/char/ipmi/ipmi_si_intf.c:283] > > > ipmi_smi_msg_received() [drivers/char/ipmi/ipmi_msghandler.c:4503] > > > intf_err_seq() [drivers/char/ipmi/ipmi_msghandler.c:1149] > > > smi_remove_watch() [drivers/char/ipmi/ipmi_msghandler.c:999] > > > set_need_watch() [drivers/char/ipmi/ipmi_si_intf.c:1066] > > > > > > Upstream commit e1891cffd4c4896a899337a243273f0e23c028df adds code to > > > ipmi_smi_msg_received() to call smi_remove_watch() via intf_err_seq() > > > and this seems to be causing the deadlock. > > > > > > commit e1891cffd4c4896a899337a243273f0e23c028df > > > Author: Corey Minyard <cminy...@mvista.com> > > > Date: Wed Oct 24 15:17:04 2018 -0500 > > > ipmi: Make the smi watcher be disabled immediately when not needed > > > > > > The fix is to put all messages in the queue and move the message > > > checking code out of ipmi_smi_msg_received and into handle_one_recv_msg, > > > which processes the message checking after ipmi_thread releases its > > > locks. > > > > > > Additionally,Kosuke Tatsukawa <ta...@ab.jp.nec.com> reported that > > > handle_new_recv_msgs calls ipmi_free_msg when handle_one_rcv_msg returns > > > zero, so that the call to ipmi_free_msg in handle_one_rcv_msg introduced > > > another panic when "ipmitool sensor list" was run in a loop. He > > > submitted this part of the patch. > > > > > > +free_msg: > > > + requeue = 0; > > > + goto out; > > > > > > Reported by: Osamu Samukawa <osa-samuk...@tg.jp.nec.com> > > > Characterized by: Kosuke Tatsukawa <ta...@ab.jp.nec.com> > > > Signed-off-by: Tony Camuso <tcam...@redhat.com> > > > --- > > > drivers/char/ipmi/ipmi_msghandler.c | 114 > > > ++++++++++++++++++------------------ > > > 1 file changed, 57 insertions(+), 57 deletions(-) > > > > > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c > > > b/drivers/char/ipmi/ipmi_msghandler.c > > > index 0dcea9d256a0..57ca446f6a60 100644 > > > --- a/drivers/char/ipmi/ipmi_msghandler.c > > > +++ b/drivers/char/ipmi/ipmi_msghandler.c > > > @@ -4206,7 +4206,53 @@ static int handle_one_recv_msg(struct ipmi_smi > > > *intf, > > > int chan; > > > ipmi_debug_msg("Recv:", msg->rsp, msg->rsp_size); > > > - if (msg->rsp_size < 2) { > > > + > > > + if ((msg->data_size >= 2) > > > + && (msg->data[0] == (IPMI_NETFN_APP_REQUEST << 2)) > > > + && (msg->data[1] == IPMI_SEND_MSG_CMD) > > > + && (msg->user_data == NULL)) { > > > + > > > + if (intf->in_shutdown) > > > + goto free_msg; > > > + > > > + /* > > > + * This is the local response to a command send, start > > > + * the timer for these. The user_data will not be > > > + * NULL if this is a response send, and we will let > > > + * response sends just go through. > > > + */ > > > + > > > + /* > > > + * Check for errors, if we get certain errors (ones > > > + * that mean basically we can try again later), we > > > + * ignore them and start the timer. Otherwise we > > > + * report the error immediately. > > > + */ > > > + if ((msg->rsp_size >= 3) && (msg->rsp[2] != 0) > > > + && (msg->rsp[2] != IPMI_NODE_BUSY_ERR) > > > + && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR) > > > + && (msg->rsp[2] != IPMI_BUS_ERR) > > > + && (msg->rsp[2] != IPMI_NAK_ON_WRITE_ERR)) { > > > + int ch = msg->rsp[3] & 0xf; > > > + struct ipmi_channel *chans; > > > + > > > + /* Got an error sending the message, handle it. */ > > > + > > > + chans = READ_ONCE(intf->channel_list)->c; > > > + if ((chans[ch].medium == IPMI_CHANNEL_MEDIUM_8023LAN) > > > + || (chans[ch].medium == IPMI_CHANNEL_MEDIUM_ASYNC)) > > > + ipmi_inc_stat(intf, sent_lan_command_errs); > > > + else > > > + ipmi_inc_stat(intf, sent_ipmb_command_errs); > > > + intf_err_seq(intf, msg->msgid, msg->rsp[2]); > > > + } else > > > + /* The message was sent, start the timer. */ > > > + intf_start_seq_timer(intf, msg->msgid); > > > +free_msg: > > > + requeue = 0; > > > + goto out; > > > + > > > + } else if (msg->rsp_size < 2) { > > > /* Message is too small to be correct. */ > > > dev_warn(intf->si_dev, > > > "BMC returned too small a message for netfn %x > > > cmd %x, got %d bytes\n", > > > @@ -4463,62 +4509,16 @@ void ipmi_smi_msg_received(struct ipmi_smi *intf, > > > unsigned long flags = 0; /* keep us warning-free. */ > > > int run_to_completion = intf->run_to_completion; > > > - if ((msg->data_size >= 2) > > > - && (msg->data[0] == (IPMI_NETFN_APP_REQUEST << 2)) > > > - && (msg->data[1] == IPMI_SEND_MSG_CMD) > > > - && (msg->user_data == NULL)) { > > > - > > > - if (intf->in_shutdown) > > > - goto free_msg; > > > - > > > - /* > > > - * This is the local response to a command send, start > > > - * the timer for these. The user_data will not be > > > - * NULL if this is a response send, and we will let > > > - * response sends just go through. > > > - */ > > > - > > > - /* > > > - * Check for errors, if we get certain errors (ones > > > - * that mean basically we can try again later), we > > > - * ignore them and start the timer. Otherwise we > > > - * report the error immediately. > > > - */ > > > - if ((msg->rsp_size >= 3) && (msg->rsp[2] != 0) > > > - && (msg->rsp[2] != IPMI_NODE_BUSY_ERR) > > > - && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR) > > > - && (msg->rsp[2] != IPMI_BUS_ERR) > > > - && (msg->rsp[2] != IPMI_NAK_ON_WRITE_ERR)) { > > > - int ch = msg->rsp[3] & 0xf; > > > - struct ipmi_channel *chans; > > > - > > > - /* Got an error sending the message, handle it. */ > > > - > > > - chans = READ_ONCE(intf->channel_list)->c; > > > - if ((chans[ch].medium == IPMI_CHANNEL_MEDIUM_8023LAN) > > > - || (chans[ch].medium == IPMI_CHANNEL_MEDIUM_ASYNC)) > > > - ipmi_inc_stat(intf, sent_lan_command_errs); > > > - else > > > - ipmi_inc_stat(intf, sent_ipmb_command_errs); > > > - intf_err_seq(intf, msg->msgid, msg->rsp[2]); > > > - } else > > > - /* The message was sent, start the timer. */ > > > - intf_start_seq_timer(intf, msg->msgid); > > > - > > > -free_msg: > > > - ipmi_free_smi_msg(msg); > > > - } else { > > > - /* > > > - * To preserve message order, we keep a queue and deliver from > > > - * a tasklet. > > > - */ > > > - if (!run_to_completion) > > > - spin_lock_irqsave(&intf->waiting_rcv_msgs_lock, flags); > > > - list_add_tail(&msg->link, &intf->waiting_rcv_msgs); > > > - if (!run_to_completion) > > > - spin_unlock_irqrestore(&intf->waiting_rcv_msgs_lock, > > > - flags); > > > - } > > > + /* > > > + * To preserve message order, we keep a queue and deliver from > > > + * a tasklet. > > > + */ > > > + if (!run_to_completion) > > > + spin_lock_irqsave(&intf->waiting_rcv_msgs_lock, flags); > > > + list_add_tail(&msg->link, &intf->waiting_rcv_msgs); > > > + if (!run_to_completion) > > > + spin_unlock_irqrestore(&intf->waiting_rcv_msgs_lock, > > > + flags); > > > if (!run_to_completion) > > > spin_lock_irqsave(&intf->xmit_msgs_lock, flags); > > > -- > > > 2.18.1 > > > > _______________________________________________ Openipmi-developer mailing list Openipmi-developer@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openipmi-developer