Re: [PATCH v2] ipmi: refactor deprecated strncpy

2023-09-13 Thread Corey Minyard
On Wed, Sep 13, 2023 at 05:13:04PM +0000, Justin Stitt wrote:
> `strncpy` is deprecated for use on NUL-terminated destination strings [1].

Thanks, applied to my next tree.

-corey

> 
> In this case, strncpy is being used specifically for its NUL-padding
> behavior (and has been commented as such). Moreover, the destination
> string is not required to be NUL-terminated [2].
> 
> We can use a more robust and less ambiguous interface in
> `memcpy_and_pad` which makes the code more readable and even eliminates
> the need for that comment.
> 
> Let's also use `strnlen` instead of `strlen()` with an upper-bounds
> check as this is intrinsically a part of `strnlen`.
> 
> Also included in this patch is a simple 1:1 change of `strncpy` to
> `strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
> should opt again for `strscpy_pad`.
> 
> Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> Link: https://lore.kernel.org/all/zqeadybl0uz1n...@mail.minyard.net/ [2]
> Link: https://github.com/KSPP/linux/issues/90
> Cc: linux-harden...@vger.kernel.org
> Cc: Kees Cook 
> Signed-off-by: Justin Stitt 
> ---
> Changes in v2:
> - use memcpy_and_pad (thanks Corey)
> - Link to v1: https://lore.kernel.org/r/20230912-strncpy-drivers-char-ipmi-ipmi-v1-1-cc43e0d1c...@google.com
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 11 +++--------
>  drivers/char/ipmi/ipmi_ssif.c   |  2 +-
>  2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 186f1fee7534..d6f14279684d 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi *intf, char *str)
>  
>   j = 0;
>   while (*p) {
> - int size = strlen(p);
> + int size = strnlen(p, 11);
>  
> - if (size > 11)
> - size = 11;
>   data[0] = 0;
>   data[1] = 0;
>   data[2] = 0xf0; /* OEM event without timestamp. */
>   data[3] = intf->addrinfo[0].address;
>   data[4] = j++; /* sequence # */
> - /*
> -  * Always give 11 bytes, so strncpy will fill
> -  * it with zeroes for me.
> -  */
> - strncpy(data+5, p, 11);
> +
> + memcpy_and_pad(data+5, 11, p, size, '\0');
>   p += size;
>  
>   ipmi_panic_request_and_wait(intf, &addr, &msg);
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 3b921c78ba08..edcb83765dce 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char *adapter_name,
>   }
>   }
>  
> - strncpy(addr_info->binfo.type, DEVICE_NAME,
> + strscpy(addr_info->binfo.type, DEVICE_NAME,
>   sizeof(addr_info->binfo.type));
>   addr_info->binfo.addr = addr;
>   addr_info->binfo.platform_data = addr_info;
> 
> ---
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd
> 
> Best regards,
> --
> Justin Stitt 
> 
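
For readers following along outside the kernel tree, the loop in the v2 diff
above can be approximated in userspace C. This is only a sketch: `pad_copy()`
and `bounded_len()` are hypothetical stand-ins for the kernel's
`memcpy_and_pad()` and `strnlen()`, and `fill_panic_chunk()` is an
illustrative helper that does not exist in ipmi_msghandler.c. The 16-byte
record and the 11-byte payload are taken from the diff.

```c
#include <stddef.h>
#include <string.h>

#define PANIC_CHUNK 11	/* payload bytes per OEM event, as in the patch */

/* Hypothetical stand-in for strnlen(): length of s, capped at max. */
static size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/* Hypothetical userspace stand-in for the kernel's memcpy_and_pad(). */
static void pad_copy(void *dest, size_t dest_len, const void *src,
		     size_t count, int pad)
{
	if (count > dest_len)
		count = dest_len;
	memcpy(dest, src, count);
	memset((char *)dest + count, pad, dest_len - count);
}

/*
 * Illustrative helper: fill one 16-byte event record from the string at p
 * and return how many bytes of the string were consumed.
 */
static size_t fill_panic_chunk(unsigned char data[16], const char *p,
			       unsigned char addr, unsigned char seq)
{
	size_t size = bounded_len(p, PANIC_CHUNK);

	data[0] = 0;
	data[1] = 0;
	data[2] = 0xf0;	/* OEM event without timestamp. */
	data[3] = addr;
	data[4] = seq;	/* sequence # */
	/* No NUL terminator is written when all 11 bytes are used. */
	pad_copy(data + 5, PANIC_CHUNK, p, size, '\0');
	return size;
}
```

A caller would loop while `*p` is non-zero, advancing `p` by the returned
size and incrementing the sequence number, which mirrors the `while (*p)`
loop in the patch.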


Re: [PATCH] ipmi: refactor deprecated strncpy

2023-09-13 Thread Corey Minyard
On Tue, Sep 12, 2023 at 05:55:02PM -0700, Justin Stitt wrote:
> On Tue, Sep 12, 2023 at 5:19 PM Corey Minyard  wrote:
> >
> > On Tue, Sep 12, 2023 at 11:43:05PM +0000, Justin Stitt wrote:
> > > `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> > >
> > > In this case, strncpy is being used specifically for its NUL-padding
> > > behavior (and has been commented as such). We can use a more robust and
> > > less ambiguous interface in `strscpy_pad` which makes the code more
> > > readable and even eliminates the need for that comment.
> > >
> > > Let's also use `strnlen` instead of `strlen()` with an upper-bounds
> > > check as this is intrinsically a part of `strnlen`.
> > >
> > > Also included in this patch is a simple 1:1 change of `strncpy` to
> > > `strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
> > > should opt again for `strscpy_pad`.
> > >
> > > Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> > > Link: https://github.com/KSPP/linux/issues/90
> > > Cc: linux-harden...@vger.kernel.org
> > > Cc: Kees Cook 
> > > Signed-off-by: Justin Stitt 
> > > ---
> > >  drivers/char/ipmi/ipmi_msghandler.c | 11 +++--------
> > >  drivers/char/ipmi/ipmi_ssif.c   |  2 +-
> > >  2 files changed, 4 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> > > index 186f1fee7534..04f7622cb703 100644
> > > --- a/drivers/char/ipmi/ipmi_msghandler.c
> > > +++ b/drivers/char/ipmi/ipmi_msghandler.c
> > > @@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi *intf, char *str)
> > >
> > >   j = 0;
> > >   while (*p) {
> > > - int size = strlen(p);
> > > + int size = strnlen(p, 11);
> > >
> > > - if (size > 11)
> > > - size = 11;
> > >   data[0] = 0;
> > >   data[1] = 0;
> > >   data[2] = 0xf0; /* OEM event without timestamp. */
> > >   data[3] = intf->addrinfo[0].address;
> > >   data[4] = j++; /* sequence # */
> > > - /*
> > > -  * Always give 11 bytes, so strncpy will fill
> > > -  * it with zeroes for me.
> > > -  */
> > > - strncpy(data+5, p, 11);
> > > +
> > > + strscpy_pad(data+5, p, 11);
> >
> > This is incorrect: the destination should *not* be NUL-terminated if the
> > destination is full.  strncpy does exactly what is needed here.
> 
> Could we use `memcpy_and_pad()` as this matches the behavior of
> strncpy in this case? I understand strncpy works here but I'm really
> keen on snuffing out all its uses -- treewide.

Sure, I think "memcpy_and_pad(data + 5, 11, p, size, 0);" should work.
And that's self-documenting.

-corey
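
As a quick sanity check of the "should work" above, the following userspace
sketch compares the bytes left in an 11-byte field by `strncpy()` and by a
pad-copy of `strnlen()`-bounded length. `pad_copy()` and `bounded_len()` are
hypothetical stand-ins for the kernel's `memcpy_and_pad()` and `strnlen()`,
and `fields_match()` is only an illustrative name.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 11

/* Hypothetical stand-in for strnlen(): length of s, capped at max. */
static size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/* Hypothetical userspace stand-in for the kernel's memcpy_and_pad(). */
static void pad_copy(void *dest, size_t dest_len, const void *src,
		     size_t count, int pad)
{
	if (count > dest_len)
		count = dest_len;
	memcpy(dest, src, count);
	memset((char *)dest + count, pad, dest_len - count);
}

/* Returns 1 when both approaches leave identical bytes in the field. */
static int fields_match(const char *p)
{
	char with_strncpy[FIELD_LEN];
	char with_pad_copy[FIELD_LEN];

	/* Poison both buffers so an unwritten byte would show up in memcmp(). */
	memset(with_strncpy, 0xAA, sizeof(with_strncpy));
	memset(with_pad_copy, 0xAA, sizeof(with_pad_copy));

	strncpy(with_strncpy, p, FIELD_LEN);
	pad_copy(with_pad_copy, FIELD_LEN, p, bounded_len(p, FIELD_LEN), 0);

	return memcmp(with_strncpy, with_pad_copy, FIELD_LEN) == 0;
}
```

Under these assumptions the two agree for empty, short, exactly-full, and
over-long inputs, which is the behaviour the panic event payload relies on.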

> 
> >
> > A comment should be added here; this is not the first time this has been
> > brought up.
> >
> > >   p += size;
> > >
> > >   ipmi_panic_request_and_wait(intf, &addr, &msg);
> > > diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> > > index 3b921c78ba08..edcb83765dce 100644
> > > --- a/drivers/char/ipmi/ipmi_ssif.c
> > > +++ b/drivers/char/ipmi/ipmi_ssif.c
> > > @@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char *adapter_name,
> > >   }
> > >   }
> > >
> > > - strncpy(addr_info->binfo.type, DEVICE_NAME,
> > > + strscpy(addr_info->binfo.type, DEVICE_NAME,
> > >   sizeof(addr_info->binfo.type));
> >
> > This one is good.
> >
> > -corey
> >
> > >   addr_info->binfo.addr = addr;
> > >   addr_info->binfo.platform_data = addr_info;
> > >
> > > ---
> > > base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> > > change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd
> > >
> > > Best regards,
> > > --
> > > Justin Stitt 
> > >


Re: [PATCH] ipmi: refactor deprecated strncpy

2023-09-12 Thread Corey Minyard
On Tue, Sep 12, 2023 at 11:43:05PM +0000, Justin Stitt wrote:
> `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> 
> In this case, strncpy is being used specifically for its NUL-padding
> behavior (and has been commented as such). We can use a more robust and
> less ambiguous interface in `strscpy_pad` which makes the code more
> readable and even eliminates the need for that comment.
> 
> Let's also use `strnlen` instead of `strlen()` with an upper-bounds
> check as this is intrinsically a part of `strnlen`.
> 
> Also included in this patch is a simple 1:1 change of `strncpy` to
> `strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
> should opt again for `strscpy_pad`.
> 
> Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> Link: https://github.com/KSPP/linux/issues/90
> Cc: linux-harden...@vger.kernel.org
> Cc: Kees Cook 
> Signed-off-by: Justin Stitt 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 11 +++--------
>  drivers/char/ipmi/ipmi_ssif.c   |  2 +-
>  2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 186f1fee7534..04f7622cb703 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi *intf, char *str)
>  
>   j = 0;
>   while (*p) {
> - int size = strlen(p);
> + int size = strnlen(p, 11);
>  
> - if (size > 11)
> - size = 11;
>   data[0] = 0;
>   data[1] = 0;
>   data[2] = 0xf0; /* OEM event without timestamp. */
>   data[3] = intf->addrinfo[0].address;
>   data[4] = j++; /* sequence # */
> - /*
> -  * Always give 11 bytes, so strncpy will fill
> -  * it with zeroes for me.
> -  */
> - strncpy(data+5, p, 11);
> +
> + strscpy_pad(data+5, p, 11);

This is incorrect: the destination should *not* be NUL-terminated if the
destination is full.  strncpy does exactly what is needed here.

A comment should be added here; this is not the first time this has been
brought up.
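
To make the objection concrete, here is a userspace sketch of what changes
when the field is full. `strscpy_pad_sketch()` is a hypothetical stand-in
that models the documented behaviour of the kernel's `strscpy_pad()` (always
NUL-terminate, zero-fill the tail, report truncation); it is not the kernel
implementation, and it returns -1 where the kernel helper returns -E2BIG.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strnlen(): length of s, capped at max. */
static size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/*
 * Hypothetical model of strscpy_pad(): the destination is always
 * NUL-terminated, so at most size - 1 bytes of src are kept.
 */
static long strscpy_pad_sketch(char *dest, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -1;

	len = bounded_len(src, size);
	if (len == size) {
		/* Source does not fit: keep size - 1 bytes, then terminate. */
		memcpy(dest, src, size - 1);
		dest[size - 1] = '\0';
		return -1;	/* the kernel helper returns -E2BIG here */
	}

	memcpy(dest, src, len);
	memset(dest + len, 0, size - len);	/* terminator plus padding */
	return (long)len;
}
```

With an input of 11 or more characters, `strncpy(dest, p, 11)` stores all 11
bytes, while the sketch stores 10 bytes and a NUL. Since the v1 loop still
advances `p` by `strnlen(p, 11)`, the 11th character of each full chunk would
apparently never reach the event log.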

>   p += size;
>  
>   ipmi_panic_request_and_wait(intf, &addr, &msg);
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 3b921c78ba08..edcb83765dce 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char *adapter_name,
>   }
>   }
>  
> - strncpy(addr_info->binfo.type, DEVICE_NAME,
> + strscpy(addr_info->binfo.type, DEVICE_NAME,
>   sizeof(addr_info->binfo.type));

This one is good.

-corey

>   addr_info->binfo.addr = addr;
>   addr_info->binfo.platform_data = addr_info;
> 
> ---
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd
> 
> Best regards,
> --
> Justin Stitt 
> 


Re: [PATCH v2 00/21] ipmi: Allow raw access to KCS devices

2021-04-08 Thread Corey Minyard
On Thu, Apr 08, 2021 at 10:27:46AM +0930, Andrew Jeffery wrote:
> Hi Corey,
> 
> On Fri, 19 Mar 2021, at 16:49, Andrew Jeffery wrote:
> > Hello,
> > 
> > This series is a bit of a mix of things, but its primary purpose is to
> > expose BMC KCS IPMI devices to userspace in a way that enables userspace
> > to talk to host firmware using protocols that are not IPMI.
> > 
> > v1 can be found here:
> > 
> > https://lore.kernel.org/openbmc/20210219142523.3464540-1-and...@aj.id.au/
> > 
> > Changes in v2 include:
> > 
> > * A rebase onto v5.12-rc2
> > * Incorporation of off-list feedback on SerIRQ configuration from
> >   Chiawei
> > * Further validation on hardware for ASPEED KCS devices 2, 3 and 4
> > * Lifting the existing single-open constraint of the IPMI chardev
> > * Fixes addressing Rob's feedback on the conversion of the ASPEED KCS
> >   binding to dt-schema
> > * Fixes addressing Rob's feedback on the new aspeed,lpc-interrupts
> >   property definition for the ASPEED KCS binding
> > 
> > A new chardev device is added whose implementation exposes the Input
> > Data Register (IDR), Output Data Register (ODR) and Status Register
> > (STR) via read() and write(), and implements poll() for event
> > monitoring.
> > 
> > The existing /dev/ipmi-kcs* chardev interface exposes the KCS devices in
> > a way which encoded the IPMI protocol in its behaviour. However, as
> > LPC[0] KCS devices give us bi-directional interrupts between the host
> > and a BMC with both a data and status byte, they are useful for purposes
> > beyond IPMI.
> > 
> > As a concrete example, libmctp[1] implements a vendor-defined MCTP[2]
> > binding using a combination of LPC Firmware cycles for bulk data
> > transfer and a KCS device via LPC IO cycles for out-of-band protocol
> > control messages[3]. This gives a throughput improvement over the
> > standard KCS binding[4] while continuing to exploit the ease of setup of
> > the LPC bus for early boot firmware on the host processor.
> > 
> > The series takes a bit of a winding path to achieve its aim:
> > 
> > 1. It begins with patches 1-5 put together by Chia-Wei, which I've
> > rebased on v5.12-rc2. These fix the ASPEED LPC bindings and other
> > non-KCS LPC-related ASPEED device drivers in a way that enables the
> > SerIRQ patches at the end of the series. With Joel's review I'm hoping
> > these 5 can go through the aspeed tree, and that the rest can go through
> > the IPMI tree.
> > 
> > 2. Next, patches 6-13 fairly heavily refactor the KCS support in the
> > IPMI part of the tree, re-architecting things such that it's possible to
> > support multiple chardev implementations sitting on top of the ASPEED
> > and Nuvoton device drivers. However, the KCS code didn't really have
> > great separation of concerns as it stood, so even if we disregard the
> > multiple-chardev support I think the cleanups are worthwhile.
> > 
> > 3. Patch 14 adds some interrupt management capabilities to the KCS
> > device drivers in preparation for patch 16, which introduces the new
> > "raw" KCS device interface. I'm not stoked about the device name/path,
> > so if people are looking to bikeshed something then feel free to lay
> > into that.
> > 
> > 4. The remaining patches switch the ASPEED KCS devicetree binding to
> > dt-schema, add a new interrupt property to describe the SerIRQ behaviour
> > of the device and finally clean up Serial IRQ support in the ASPEED KCS
> > driver.
> > 
> > Rob: The dt-binding patches still come before the relevant driver
> > changes, I tried to keep the two close together in the series, hence the
> > bindings changes not being patches 1 and 2.
> > 
> > I've exercised the series under qemu with the rainier-bmc machine plus
> > additional patches for KCS support[5]. I've also substituted this series in
> > place of a hacky out-of-tree driver that we've been using for the
> > libmctp stack and successfully booted the host processor under our
> > internal full-platform simulation tools for a Rainier system.
> > 
> > Note that this work touches the Nuvoton driver as well as ASPEED's, but
> > I don't have the capability to test those changes or the IPMI chardev
> > path. Tested-by tags would be much appreciated if you can exercise one
> > or both.
> > 
> > Please review!
> 
> Unfortunately the cover letter got detached from the rest of the series.
> 
> Any chance you can take a look at the patches?

There were some minor concerns that went unanswered, and many of the
patches received no review from others.

I would like to take this patch set; it makes some good cleanups.  But I
would like some more review and testing by others, if possible.  I'm fairly
sure that has already been done; it just needs to be documented.

-corey

> 
> https://lore.kernel.org/linux-arm-kernel/20210319062752.145730-1-and...@aj.id.au/
> 
> Cheers,
> 
> Andrew


Re: [PATCH v2 2/3] drivers: char: ipmi: Add Aspeed SSIF BMC driver

2021-04-07 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:10:28PM +0700, Quan Nguyen wrote:
> The SMBus system interface (SSIF) IPMI BMC driver can be used to perform
> in-band IPMI communication with the host from the management (BMC) side.
> 
> This commit adds support specifically for the Aspeed AST2500, which is
> commonly used as a Baseboard Management Controller.

Two major comments:

This needs to be two patches: one with the generic SSIF slave code, and
one with the aspeed-specific code.  It's hard to tell that you are
adding generic code otherwise.

If you are going to add a generic interface like this, you need to add
documentation on how to use it, both from userland and from within the
kernel.

And some other comments:

Did you run this through checkpatch?

I think there is a high-level race condition in this code.  Consider the
following sequence:

  1) host writes a request to the BMC
  2) BMC reads the request
  3) host aborts the operation
  4) host writes a new request to the BMC
  5) BMC sends the response to the first message

You probably need something to say that a response can only go out if
there is a request that has come in that has been read by the BMC.
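
One way to picture that gating is the small state sketch below. Every name
in it (`struct ssif_gate`, `host_writes_request()`, and so on) is
hypothetical and is not taken from ssif_bmc.c; a real driver would also need
locking around these flags, which is omitted here.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one SSIF channel; not from ssif_bmc.c. */
struct ssif_gate {
	bool request_available;	/* host wrote a request, BMC has not read it */
	bool request_read;	/* BMC read a request and owes a response */
};

static void host_writes_request(struct ssif_gate *g)
{
	g->request_available = true;
	g->request_read = false;	/* a new request supersedes the old one */
}

static bool bmc_reads_request(struct ssif_gate *g)
{
	if (!g->request_available)
		return false;
	g->request_available = false;
	g->request_read = true;
	return true;
}

static void host_aborts(struct ssif_gate *g)
{
	g->request_available = false;
	g->request_read = false;
}

/* A response may go out only for a request the BMC has actually read. */
static bool bmc_sends_response(struct ssif_gate *g)
{
	if (!g->request_read)
		return false;	/* stale response: drop it */
	g->request_read = false;
	return true;
}
```

Replaying steps 1 to 5 against this sketch, the response in step 5 is
refused because the abort and the new write both cleared `request_read`.
Note that two flags alone cannot tell which request a response belongs to
once the second request has also been read; that would need something like a
sequence number.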

Other comments inline.

> 
> Signed-off-by: Quan Nguyen 
> ---
>  drivers/char/ipmi/Kconfig   |  22 +
>  drivers/char/ipmi/Makefile  |   2 +
>  drivers/char/ipmi/ssif_bmc.c| 645 
>  drivers/char/ipmi/ssif_bmc.h|  92 
>  drivers/char/ipmi/ssif_bmc_aspeed.c | 132 ++
>  5 files changed, 893 insertions(+)
>  create mode 100644 drivers/char/ipmi/ssif_bmc.c
>  create mode 100644 drivers/char/ipmi/ssif_bmc.h
>  create mode 100644 drivers/char/ipmi/ssif_bmc_aspeed.c
> 
> diff --git a/drivers/char/ipmi/Kconfig b/drivers/char/ipmi/Kconfig
> index 07847d9a459a..45be57023577 100644
> --- a/drivers/char/ipmi/Kconfig
> +++ b/drivers/char/ipmi/Kconfig
> @@ -133,6 +133,28 @@ config ASPEED_BT_IPMI_BMC
> found on Aspeed SOCs (AST2400 and AST2500). The driver
> implements the BMC side of the BT interface.
>  
> +config SSIF_IPMI_BMC
> + tristate "SSIF IPMI BMC driver"
> + select I2C
> + select I2C_SLAVE
> + help
> +   This enables the IPMI SMBus system interface (SSIF) at the
> +   management (BMC) side.
> +
> +   The driver implements the BMC side of the SMBus system
> +   interface (SSIF).
> +
> +config ASPEED_SSIF_IPMI_BMC
> + depends on ARCH_ASPEED || COMPILE_TEST
> + select SSIF_IPMI_BMC
> + tristate "Aspeed SSIF IPMI BMC driver"
> + help
> +   Provides a driver for the SSIF IPMI interface found on
> +   Aspeed AST2500 SoC.
> +
> +   The driver implements the BMC side of the SMBus system
> +   interface (SSIF), specific for Aspeed AST2500 SoC.
> +
>  config IPMB_DEVICE_INTERFACE
>   tristate 'IPMB Interface handler'
>   depends on I2C
> diff --git a/drivers/char/ipmi/Makefile b/drivers/char/ipmi/Makefile
> index 0822adc2ec41..05b993f7335b 100644
> --- a/drivers/char/ipmi/Makefile
> +++ b/drivers/char/ipmi/Makefile
> @@ -27,3 +27,5 @@ obj-$(CONFIG_ASPEED_BT_IPMI_BMC) += bt-bmc.o
>  obj-$(CONFIG_ASPEED_KCS_IPMI_BMC) += kcs_bmc_aspeed.o
>  obj-$(CONFIG_NPCM7XX_KCS_IPMI_BMC) += kcs_bmc_npcm7xx.o
>  obj-$(CONFIG_IPMB_DEVICE_INTERFACE) += ipmb_dev_int.o
> +obj-$(CONFIG_SSIF_IPMI_BMC) += ssif_bmc.o
> +obj-$(CONFIG_ASPEED_SSIF_IPMI_BMC) += ssif_bmc_aspeed.o
> diff --git a/drivers/char/ipmi/ssif_bmc.c b/drivers/char/ipmi/ssif_bmc.c
> new file mode 100644
> index 000000000000..ae6e8750c795
> --- /dev/null
> +++ b/drivers/char/ipmi/ssif_bmc.c
> @@ -0,0 +1,645 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * The driver for BMC side of SSIF interface
> + *
> + * Copyright (c) 2021, Ampere Computing LLC
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of
> + * the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ssif_bmc.h"
> +
> +/*
> + * Call in WRITE context
> + */
> +static int send_ssif_bmc_response(struct ssif_bmc_ctx *ssif_bmc, bool non_blocking)
> +{
> + unsigned long flags;
> + int ret;
> +
> + if (!non_blocking) {
> +retry:
> + ret = wait_event_interruptible(ssif_bmc->wait_queue,
> +!ssif_bmc->response_in_progress);
> + if (ret)
> +  

Re: [PATCH v2 0/3] Add Aspeed SSIF BMC driver

2021-04-07 Thread Corey Minyard
On Wed, Apr 07, 2021 at 08:09:50PM +0700, Quan Nguyen wrote:
> Hi Corey,
> 
> Thank you for reviewing.
> I'll put my responses inline below.
> 
> -Quan
> 
> On 02/04/2021 21:21, Corey Minyard wrote:
> > On Tue, Mar 30, 2021 at 09:10:26PM +0700, Quan Nguyen wrote:
> > > This series adds support for the Aspeed-specific SSIF BMC driver, which
> > > performs in-band IPMI communication with the host from the management
> > > (BMC) side.
> > 
> > I don't have any specific feedback for this, but I'm wondering if it's
> > really necessary.
> > 
> > Why can't the BMC just open the I2C device and use it?  Is there any
> > functionality that this provides that cannot be accomplished from
> > userland access to the I2C device?  I don't see any.
> > 
> > If it tied into some existing framework to give abstract access to a BMC
> > slave side interface, I'd be ok with this.  But I don't see that.
> > 
> 
> The SSIF at the BMC side acts as an I2C slave, and we think that a kernel
> driver is unavoidable to handle the I2C slave events
> (https://www.kernel.org/doc/html/latest/i2c/slave-interface.html)
> 
> And to make it work with the existing OpenBMC IPMI stack, a userspace part,
> ssifbridge, is needed (https://github.com/openbmc/ssifbridge). This
> ssifbridge connects this driver with the OpenBMC IPMI stack so the IPMI
> stack can communicate via the SSIF channel in a similar way to what was
> implemented for BT and KCS (i.e. btbridge/kcsbridge and their corresponding
> kernel drivers (https://github.com/openbmc/btbridge and
> https://github.com/openbmc/kcsbridge))

Dang, I don't know why there's not a generic userland interface for
the slave.  And I've made this mistake before :(.

Anyway, you are right, you need a driver.  I'll review.

-corey

> 
> > Unless there is a big need to have this in the kernel, I'm against
> > including this and would suggest you do all this work in userland.
> > Perhaps write a library.  Sorry, but I'm trying to do my part to reduce
> > unnecessary things in the kernel.
> > 
> > Thanks,
> > 
> > -corey
> > 


Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-06 Thread Corey Minyard
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> kernel.h has been used as a dumping ground for all kinds of stuff for a
> long time. Here is an attempt to start cleaning it up by splitting out the
> panic and oops helpers.
> 
> At the same time, convert users in the header and lib folders to use the
> new header. For the time being, though, include the new header back in
> kernel.h to avoid twisted indirect includes for existing users.

For the IPMI portion:

Acked-by: Corey Minyard 

> 
> Signed-off-by: Andy Shevchenko 
> ---
>  arch/powerpc/kernel/setup-common.c   |  1 +
>  arch/x86/include/asm/desc.h  |  1 +
>  arch/x86/kernel/cpu/mshyperv.c   |  1 +
>  arch/x86/kernel/setup.c  |  1 +
>  drivers/char/ipmi/ipmi_msghandler.c  |  1 +
>  drivers/remoteproc/remoteproc_core.c |  1 +
>  include/asm-generic/bug.h|  3 +-
>  include/linux/kernel.h   | 84 +---
>  include/linux/panic.h| 98 
>  include/linux/panic_notifier.h   | 12 
>  kernel/hung_task.c   |  1 +
>  kernel/kexec_core.c  |  1 +
>  kernel/panic.c   |  1 +
>  kernel/rcu/tree.c|  2 +
>  kernel/sysctl.c  |  1 +
>  kernel/trace/trace.c |  1 +
>  16 files changed, 126 insertions(+), 84 deletions(-)
>  create mode 100644 include/linux/panic.h
>  create mode 100644 include/linux/panic_notifier.h
> 
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 74a98fff2c2f..046fe21b5c3b 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -9,6 +9,7 @@
>  #undef DEBUG
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> index 476082a83d1c..ceb12683b6d1 100644
> --- a/arch/x86/include/asm/desc.h
> +++ b/arch/x86/include/asm/desc.h
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 
>  
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 22f13343b5da..9e5c6f2b044d 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 59e5e0903b0c..570699eecf90 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 8a0e97b33cae..e96cb5c4f97a 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -16,6 +16,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 626a6b90fba2..76dd8e2b1e7e 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index 76a10e0dca9f..719410b93f99 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -17,7 +17,8 @@
>  #endif
>  
>  #ifndef __ASSEMBLY__
> -#include 
> +#include 
> +#include 
>  
>  #ifdef CONFIG_BUG
>  
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 09035ac67d4b..6c5a05ac1ecb 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -70,7 +71,6 @@
>  #define lower_32_bits(n) ((u32)((n) & 0xffffffff))
>  
>  struct completion;
> -struct pt_regs;
>  struct user;
>  
>  #ifdef CONFIG_PREEMPT_VOLUNTARY
> @@ -175,14 +175,6 @@ void __might_fault(const char *file, int line);
>  static inline void might_fault(void) { }
>  #endif
>  
> -extern struct atomic_notifier_head panic_notifier_list;
> -extern long (*panic_blink)(int state);
> -__printf(1, 2)
> -void panic(const char *fmt, ...) __noreturn __cold;
> -void nmi_panic(struct pt_regs *regs, const char *msg);
> -extern void oops_enter(void);
> -extern void oops_exit(void);
> -extern bool o

Re: [PATCH v2 00/10] ipmi_si: Set of clean ups

2021-04-02 Thread Corey Minyard
On Fri, Apr 02, 2021 at 08:43:24PM +0300, Andy Shevchenko wrote:
> The series contains a set of cleanups, the main parts of which are:
>  - use the new platform_get_mem_or_io() API
>  - use the match_string() API

As I have already said, a very nice set of cleanups.  Thank you.
These are applied and in the ipmi linux-next tree.

-corey

> 
> Since v2:
> - patch 3: rephrased commit message (Corey)
> - patch 5: added a comment that array maps to enum (Corey)
> - patch 5: added "ipmi" prefix to the name of the array
> - patch 6: just exported array w/o moving to header (Corey)
> - wrapped up cover letter
> 
> Andy Shevchenko (10):
>   ipmi_si: Switch to use platform_get_mem_or_io()
>   ipmi_si: Remove bogus err_free label
>   ipmi_si: Utilize temporary variable to hold device pointer
>   ipmi_si: Use proper ACPI macros to check error code for failures
>   ipmi_si: Introduce ipmi_panic_event_str[] array
>   ipmi_si: Reuse si_to_str[] array in ipmi_hardcode_init_one()
>   ipmi_si: Get rid of ->addr_source_cleanup()
>   ipmi_si: Use strstrip() to remove surrounding spaces
>   ipmi_si: Drop redundant check before calling put_device()
>   ipmi_si: Join string literals back
> 
>  drivers/char/ipmi/ipmi_msghandler.c  | 54 ++--
>  drivers/char/ipmi/ipmi_si.h  |  8 ++-
>  drivers/char/ipmi/ipmi_si_hardcode.c | 73 -
>  drivers/char/ipmi/ipmi_si_hotmod.c   | 24 ++-
>  drivers/char/ipmi/ipmi_si_intf.c | 32 --
>  drivers/char/ipmi/ipmi_si_pci.c  | 22 ++-
>  drivers/char/ipmi/ipmi_si_platform.c | 95 
>  7 files changed, 112 insertions(+), 196 deletions(-)
> 
> -- 
> 2.30.2
> 


Re: [PATCH v2 0/3] Add Aspeed SSIF BMC driver

2021-04-02 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:10:26PM +0700, Quan Nguyen wrote:
> This series adds support for the Aspeed-specific SSIF BMC driver, which
> performs in-band IPMI communication with the host from the management
> (BMC) side.

I don't have any specific feedback for this, but I'm wondering if it's
really necessary.

Why can't the BMC just open the I2C device and use it?  Is there any
functionality that this provides that cannot be accomplished from
userland access to the I2C device?  I don't see any.

If it tied into some existing framework to give abstract access to a BMC
slave side interface, I'd be ok with this.  But I don't see that.

Unless there is a big need to have this in the kernel, I'm against
including this and would suggest you do all this work in userland.
Perhaps write a library.  Sorry, but I'm trying to do my part to reduce
unnecessary things in the kernel.

Thanks,

-corey

> 
> v2:
>   + Fixed compiling error with COMPILE_TEST for arc
> 
> Quan Nguyen (3):
>   i2c: i2c-core-smbus: Expose PEC calculate function for generic use
>   drivers: char: ipmi: Add Aspeed SSIF BMC driver
>   bindings: ipmi: Add binding for Aspeed SSIF BMC driver
> 
>  .../bindings/ipmi/aspeed-ssif-bmc.txt |  18 +
>  drivers/char/ipmi/Kconfig |  22 +
>  drivers/char/ipmi/Makefile|   2 +
>  drivers/char/ipmi/ssif_bmc.c  | 645 ++
>  drivers/char/ipmi/ssif_bmc.h  |  92 +++
>  drivers/char/ipmi/ssif_bmc_aspeed.c   | 132 
>  drivers/i2c/i2c-core-smbus.c  |  12 +-
>  include/linux/i2c.h   |   1 +
>  8 files changed, 922 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/ipmi/aspeed-ssif-bmc.txt
>  create mode 100644 drivers/char/ipmi/ssif_bmc.c
>  create mode 100644 drivers/char/ipmi/ssif_bmc.h
>  create mode 100644 drivers/char/ipmi/ssif_bmc_aspeed.c
> 
> -- 
> 2.28.0
> 


Re: [PATCH v1 01/10] ipmi_si: Switch to use platform_get_mem_or_io()

2021-04-02 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:16:40PM +0300, Andy Shevchenko wrote:
> Switch to use new platform_get_mem_or_io() instead of home grown analogue.
> Note, we also introduce ipmi_set_addr_data_and_space() helper here.

You didn't send a part 0 that I saw, so just using this.  This is a nice
cleanup set, I just had a few very minor nits.  Thanks for this.

-corey

> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/char/ipmi/ipmi_si_platform.c | 40 +++-
>  1 file changed, 16 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_platform.c b/drivers/char/ipmi/ipmi_si_platform.c
> index 129b5713f187..d7bd093f80e9 100644
> --- a/drivers/char/ipmi/ipmi_si_platform.c
> +++ b/drivers/char/ipmi/ipmi_si_platform.c
> @@ -100,35 +100,32 @@ static int acpi_gpe_irq_setup(struct si_sm_io *io)
>  }
>  #endif
>  
> +static void ipmi_set_addr_data_and_space(struct resource *r, struct si_sm_io *io)
> +{
> + io->addr_data = r->start;
> + if (resource_type(r) == IORESOURCE_IO)
> + io->addr_space = IPMI_IO_ADDR_SPACE;
> + else
> + io->addr_space = IPMI_MEM_ADDR_SPACE;
> +}
> +
>  static struct resource *
>  ipmi_get_info_from_resources(struct platform_device *pdev,
>struct si_sm_io *io)
>  {
> - struct resource *res, *res_second;
> + struct resource *res, *second;
>  
> - res = platform_get_resource(pdev, IORESOURCE_IO, 0);
> - if (res) {
> - io->addr_space = IPMI_IO_ADDR_SPACE;
> - } else {
> - res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - if (res)
> - io->addr_space = IPMI_MEM_ADDR_SPACE;
> - }
> + res = platform_get_mem_or_io(pdev, 0);
>   if (!res) {
>   dev_err(&pdev->dev, "no I/O or memory address\n");
>   return NULL;
>   }
> - io->addr_data = res->start;
> + ipmi_set_addr_data_and_space(res, io);
>  
>   io->regspacing = DEFAULT_REGSPACING;
> - res_second = platform_get_resource(pdev,
> -(io->addr_space == IPMI_IO_ADDR_SPACE) ?
> - IORESOURCE_IO : IORESOURCE_MEM,
> -1);
> - if (res_second) {
> - if (res_second->start > io->addr_data)
> - io->regspacing = res_second->start - io->addr_data;
> - }
> + second = platform_get_mem_or_io(pdev, 1);
> + if (second && resource_type(second) == resource_type(res) && second->start > io->addr_data)
> + io->regspacing = second->start - io->addr_data;
>  
>   return res;
>  }
> @@ -275,12 +272,7 @@ static int of_ipmi_probe(struct platform_device *pdev)
>   io.addr_source  = SI_DEVICETREE;
>   io.irq_setup= ipmi_std_irq_setup;
>  
> - if (resource.flags & IORESOURCE_IO)
> - io.addr_space = IPMI_IO_ADDR_SPACE;
> - else
> - io.addr_space = IPMI_MEM_ADDR_SPACE;
> -
> - io.addr_data= resource.start;
> + ipmi_set_addr_data_and_space(&resource, &io);
>  
>   io.regsize  = regsize ? be32_to_cpup(regsize) : DEFAULT_REGSIZE;
>   io.regspacing   = regspacing ? be32_to_cpup(regspacing) : 
> DEFAULT_REGSPACING;
> -- 
> 2.30.2
> 


Re: [PATCH v1 06/10] ipmi_si: Reuse si_to_str array in ipmi_hardcode_init_one()

2021-04-02 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:16:45PM +0300, Andy Shevchenko wrote:
> Instead of making the comparison one by one, reuse si_to_str array
> in ipmi_hardcode_init_one() in conjunction with match_string() API.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/char/ipmi/ipmi_si.h  |  3 +++
>  drivers/char/ipmi/ipmi_si_hardcode.c | 23 +--
>  drivers/char/ipmi/ipmi_si_intf.c |  2 --
>  3 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si.h b/drivers/char/ipmi/ipmi_si.h
> index bac0ff86e48e..fd3167d1e1e9 100644
> --- a/drivers/char/ipmi/ipmi_si.h
> +++ b/drivers/char/ipmi/ipmi_si.h
> @@ -22,6 +22,9 @@ enum si_type {
>   SI_TYPE_INVALID, SI_KCS, SI_SMIC, SI_BT
>  };
>  
> +/* 'invalid' to allow a firmware-specified interface to be disabled */
> +static __maybe_unused const char *const si_to_str[] = { "invalid", "kcs", 
> "smic", "bt" };

Can we just make this non-static and leave the definition where it is?
That would save a little space and wouldn't affect performance at all.

-corey

> +
>  enum ipmi_addr_space {
>   IPMI_IO_ADDR_SPACE, IPMI_MEM_ADDR_SPACE
>  };
> diff --git a/drivers/char/ipmi/ipmi_si_hardcode.c 
> b/drivers/char/ipmi/ipmi_si_hardcode.c
> index f6ece7569504..cf3797523469 100644
> --- a/drivers/char/ipmi/ipmi_si_hardcode.c
> +++ b/drivers/char/ipmi/ipmi_si_hardcode.c
> @@ -80,26 +80,21 @@ static void __init ipmi_hardcode_init_one(const char 
> *si_type_str,
> enum ipmi_addr_space addr_space)
>  {
>   struct ipmi_plat_data p;
> + int t;
>  
>   memset(&p, 0, sizeof(p));
>  
>   p.iftype = IPMI_PLAT_IF_SI;
> - if (!si_type_str || !*si_type_str || strcmp(si_type_str, "kcs") == 0) {
> + if (!si_type_str || !*si_type_str) {
>   p.type = SI_KCS;
> - } else if (strcmp(si_type_str, "smic") == 0) {
> - p.type = SI_SMIC;
> - } else if (strcmp(si_type_str, "bt") == 0) {
> - p.type = SI_BT;
> - } else if (strcmp(si_type_str, "invalid") == 0) {
> - /*
> -  * Allow a firmware-specified interface to be
> -  * disabled.
> -  */
> - p.type = SI_TYPE_INVALID;
>   } else {
> - pr_warn("Interface type specified for interface %d, was 
> invalid: %s\n",
> - i, si_type_str);
> - return;
> + t = match_string(si_to_str, ARRAY_SIZE(si_to_str), si_type_str);
> + if (t < 0) {
> + pr_warn("Interface type specified for interface %d, was 
> invalid: %s\n",
> + i, si_type_str);
> + return;
> + }
> + p.type = t;
>   }
>  
>   p.regsize = regsizes[i];
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index be41a473e3c2..ff448098f185 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -70,8 +70,6 @@ enum si_intf_state {
>  #define IPMI_BT_INTMASK_CLEAR_IRQ_BIT2
>  #define IPMI_BT_INTMASK_ENABLE_IRQ_BIT   1
>  
> -static const char * const si_to_str[] = { "invalid", "kcs", "smic", "bt" };
> -
>  static bool initialized;
>  
>  /*
> -- 
> 2.30.2
> 
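
The table lookup discussed in this message can be sketched in userspace C. This is only an illustration: match_string_sketch() and parse_si_type() are hypothetical stand-ins, not the kernel helpers, and the sketch returns -1 where the kernel's match_string() returns -EINVAL. One point that may matter for the non-static suggestion: if the header only carried an extern declaration without a size, ARRAY_SIZE(si_to_str) would not compile in other files, so the size would have to be declared or the array NULL-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Index order mirrors enum si_type from the quoted patch. */
enum si_type { SI_TYPE_INVALID, SI_KCS, SI_SMIC, SI_BT };

static const char *const si_to_str[] = { "invalid", "kcs", "smic", "bt" };

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Userspace stand-in for the kernel's match_string(): returns the index
 * of the first exact match, or -1 if nothing matches. Like the kernel
 * helper, it stops early at a NULL entry.
 */
static int match_string_sketch(const char *const *array, size_t n,
			       const char *string)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!array[i])
			break;
		if (strcmp(array[i], string) == 0)
			return (int)i;
	}
	return -1;
}

/* Hypothetical helper: a missing or empty type string defaults to KCS. */
static int parse_si_type(const char *s)
{
	if (!s || !*s)
		return SI_KCS;
	return match_string_sketch(si_to_str, ARRAY_SIZE(si_to_str), s);
}
```

Because the array index doubles as the enum value, the result can be assigned to the type field directly once it is known to be non-negative.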


Re: [PATCH v1 05/10] ipmi_si: Introduce panic_event_str array

2021-04-02 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:16:44PM +0300, Andy Shevchenko wrote:
> Instead of repeating the constant literals twice, introduce the
> panic_event_str array. It allows simplifying the code with the
> help of the match_string() API.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 49 ++---
>  1 file changed, 17 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index f19f0f967e28..c7d37366d7bb 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -52,8 +52,12 @@ static bool drvregistered;
>  enum ipmi_panic_event_op {
>   IPMI_SEND_PANIC_EVENT_NONE,
>   IPMI_SEND_PANIC_EVENT,
> - IPMI_SEND_PANIC_EVENT_STRING
> + IPMI_SEND_PANIC_EVENT_STRING,
> + IPMI_SEND_PANIC_EVENT_MAX
>  };

This is a nice change.  Can you add a comment here so that readers know
that the above enum and the following array are tied numerically?

-corey

> +
> +static const char *const panic_event_str[] = { "none", "event", "string", 
> NULL };
> +
>  #ifdef CONFIG_IPMI_PANIC_STRING
>  #define IPMI_PANIC_DEFAULT IPMI_SEND_PANIC_EVENT_STRING
>  #elif defined(CONFIG_IPMI_PANIC_EVENT)
> @@ -68,46 +72,27 @@ static int panic_op_write_handler(const char *val,
> const struct kernel_param *kp)
>  {
>   char valcp[16];
> - char *s;
> -
> - strncpy(valcp, val, 15);
> - valcp[15] = '\0';
> + int e;
>  
> - s = strstrip(valcp);
> -
> - if (strcmp(s, "none") == 0)
> - ipmi_send_panic_event = IPMI_SEND_PANIC_EVENT_NONE;
> - else if (strcmp(s, "event") == 0)
> - ipmi_send_panic_event = IPMI_SEND_PANIC_EVENT;
> - else if (strcmp(s, "string") == 0)
> - ipmi_send_panic_event = IPMI_SEND_PANIC_EVENT_STRING;
> - else
> - return -EINVAL;
> + strscpy(valcp, val, sizeof(valcp));
> + e = match_string(panic_event_str, -1, strstrip(valcp));
> + if (e < 0)
> + return e;
>  
> + ipmi_send_panic_event = e;
>   return 0;
>  }
>  
>  static int panic_op_read_handler(char *buffer, const struct kernel_param *kp)
>  {
> - switch (ipmi_send_panic_event) {
> - case IPMI_SEND_PANIC_EVENT_NONE:
> - strcpy(buffer, "none\n");
> - break;
> -
> - case IPMI_SEND_PANIC_EVENT:
> - strcpy(buffer, "event\n");
> - break;
> -
> - case IPMI_SEND_PANIC_EVENT_STRING:
> - strcpy(buffer, "string\n");
> - break;
> + const char *event_str;
>  
> - default:
> - strcpy(buffer, "???\n");
> - break;
> - }
> + if (ipmi_send_panic_event >= IPMI_SEND_PANIC_EVENT_MAX)
> + event_str = "???";
> + else
> + event_str = panic_event_str[ipmi_send_panic_event];
>  
> - return strlen(buffer);
> + return sprintf(buffer, "%s\n", event_str);
>  }
>  
>  static const struct kernel_param_ops panic_op_ops = {
> -- 
> 2.30.2
> 
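
One way to make the numeric tie between the enum and the string array explicit, as requested in the review comment above, is sketched here. It is an assumption about how such a comment and check might look, not the change that was eventually merged; panic_event_name() is a hypothetical helper and the code is plain userspace C.

```c
#include <stddef.h>

enum ipmi_panic_event_op {
	IPMI_SEND_PANIC_EVENT_NONE,
	IPMI_SEND_PANIC_EVENT,
	IPMI_SEND_PANIC_EVENT_STRING,
	IPMI_SEND_PANIC_EVENT_MAX
};

/*
 * Indexed by enum ipmi_panic_event_op. Designated initializers keep each
 * string attached to its enum value even if the enum is reordered.
 */
static const char *const panic_event_str[] = {
	[IPMI_SEND_PANIC_EVENT_NONE]   = "none",
	[IPMI_SEND_PANIC_EVENT]        = "event",
	[IPMI_SEND_PANIC_EVENT_STRING] = "string",
	[IPMI_SEND_PANIC_EVENT_MAX]    = NULL,	/* terminator for match_string() */
};

/* Fails the build if an enum value is added without a matching string. */
_Static_assert(sizeof(panic_event_str) / sizeof(panic_event_str[0]) ==
	       IPMI_SEND_PANIC_EVENT_MAX + 1,
	       "panic_event_str must cover every enum value plus terminator");

/* Hypothetical read-side helper mirroring the bounds check in the patch. */
static const char *panic_event_name(unsigned int op)
{
	if (op >= IPMI_SEND_PANIC_EVENT_MAX)
		return "???";
	return panic_event_str[op];
}
```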


Re: [PATCH v1 03/10] ipmi_si: Utilize temporary variable to hold device pointer

2021-04-02 Thread Corey Minyard
On Tue, Mar 30, 2021 at 09:16:42PM +0300, Andy Shevchenko wrote:
> One of the previous cleanup changes gave us a temporary variable to hold
> a device pointer. It can be utilized in other calls in ->probe() to
> save a few lines of code.

The description here isn't accurate, there is no previous change where a
temporary variable comes in.  This change adds the temporary variable.

This change is ok, but doesn't add much value.

-corey

> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/char/ipmi/ipmi_si_platform.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_platform.c 
> b/drivers/char/ipmi/ipmi_si_platform.c
> index 009563073d30..954c297b459b 100644
> --- a/drivers/char/ipmi/ipmi_si_platform.c
> +++ b/drivers/char/ipmi/ipmi_si_platform.c
> @@ -309,6 +309,7 @@ static int find_slave_address(struct si_sm_io *io, int 
> slave_addr)
>  
>  static int acpi_ipmi_probe(struct platform_device *pdev)
>  {
> + struct device *dev = &pdev->dev;
>   struct si_sm_io io;
>   acpi_handle handle;
>   acpi_status status;
> @@ -318,21 +319,20 @@ static int acpi_ipmi_probe(struct platform_device *pdev)
>   if (!si_tryacpi)
>   return -ENODEV;
>  
> - handle = ACPI_HANDLE(&pdev->dev);
> + handle = ACPI_HANDLE(dev);
>   if (!handle)
>   return -ENODEV;
>  
>   memset(&io, 0, sizeof(io));
>   io.addr_source = SI_ACPI;
> - dev_info(&pdev->dev, "probing via ACPI\n");
> + dev_info(dev, "probing via ACPI\n");
>  
>   io.addr_info.acpi_info.acpi_handle = handle;
>  
>   /* _IFT tells us the interface type: KCS, BT, etc */
>   status = acpi_evaluate_integer(handle, "_IFT", NULL, &tmp);
>   if (ACPI_FAILURE(status)) {
> - dev_err(&pdev->dev,
> - "Could not find ACPI IPMI interface type\n");
> + dev_err(dev, "Could not find ACPI IPMI interface type\n");
>   return -EINVAL;
>   }
>  
> @@ -349,10 +349,11 @@ static int acpi_ipmi_probe(struct platform_device *pdev)
>   case 4: /* SSIF, just ignore */
>   return -ENODEV;
>   default:
> - dev_info(&pdev->dev, "unknown IPMI type %lld\n", tmp);
> + dev_info(dev, "unknown IPMI type %lld\n", tmp);
>   return -EINVAL;
>   }
>  
> + io.dev = dev;
>   io.regsize = DEFAULT_REGSIZE;
>   io.regshift = 0;
>  
> @@ -376,9 +377,7 @@ static int acpi_ipmi_probe(struct platform_device *pdev)
>  
>   io.slave_addr = find_slave_address(&io, io.slave_addr);
>  
> - io.dev = &pdev->dev;
> -
> - dev_info(io.dev, "%pR regsize %d spacing %d irq %d\n",
> + dev_info(dev, "%pR regsize %d spacing %d irq %d\n",
>res, io.regsize, io.regspacing, io.irq);
>  
>   request_module("acpi_ipmi");
> -- 
> 2.30.2
> 


Re: [PATCH] ipmi: Handle device properties with software node API

2021-03-05 Thread Corey Minyard
On Thu, Mar 04, 2021 at 12:03:12PM +0300, Heikki Krogerus wrote:
> The old device property API is going to be removed.
> Replace the device_add_properties() call with the software
> node API equivalent, device_create_managed_software_node().

Ok, this has been queued for next release.

Thanks,

-corey

> 
> Signed-off-by: Heikki Krogerus 
> ---
>  drivers/char/ipmi/ipmi_plat_data.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_plat_data.c 
> b/drivers/char/ipmi/ipmi_plat_data.c
> index 28471ff2a3a3e..747b51ae01a80 100644
> --- a/drivers/char/ipmi/ipmi_plat_data.c
> +++ b/drivers/char/ipmi/ipmi_plat_data.c
> @@ -102,7 +102,7 @@ struct platform_device *ipmi_platform_add(const char 
> *name, unsigned int inst,
>   goto err;
>   }
>   add_properties:
> - rv = platform_device_add_properties(pdev, pr);
> + rv = device_create_managed_software_node(&pdev->dev, pr, NULL);
>   if (rv) {
>   dev_err(&pdev->dev,
>   "Unable to add hard-code properties: %d\n", rv);
> -- 
> 2.30.1
> 


Re: [PATCH] ipmi:ssif: make ssif_i2c_send() void

2021-03-01 Thread Corey Minyard
This looks ok, it's queued for 5.12.

Thanks,

-corey

On Mon, Mar 01, 2021 at 10:05:15PM +0800, Liguang Zhang wrote:
> This function actually needs no return value. So remove the unneeded
> check and make it void.
> 
> Signed-off-by: Liguang Zhang 
> ---
>  drivers/char/ipmi/ipmi_ssif.c | 81 +--
>  1 file changed, 20 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 0416b9c9d410..20d5af92966d 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -510,7 +510,7 @@ static int ipmi_ssif_thread(void *data)
>   return 0;
>  }
>  
> -static int ssif_i2c_send(struct ssif_info *ssif_info,
> +static void ssif_i2c_send(struct ssif_info *ssif_info,
>   ssif_i2c_done handler,
>   int read_write, int command,
>   unsigned char *data, unsigned int size)
> @@ -522,7 +522,6 @@ static int ssif_i2c_send(struct ssif_info *ssif_info,
>   ssif_info->i2c_data = data;
>   ssif_info->i2c_size = size;
>   complete(&ssif_info->wake_thread);
> - return 0;
>  }
>  
>  
> @@ -531,22 +530,12 @@ static void msg_done_handler(struct ssif_info 
> *ssif_info, int result,
>  
>  static void start_get(struct ssif_info *ssif_info)
>  {
> - int rv;
> -
>   ssif_info->rtc_us_timer = 0;
>   ssif_info->multi_pos = 0;
>  
> - rv = ssif_i2c_send(ssif_info, msg_done_handler, I2C_SMBUS_READ,
> -   SSIF_IPMI_RESPONSE,
> -   ssif_info->recv, I2C_SMBUS_BLOCK_DATA);
> - if (rv < 0) {
> - /* request failed, just return the error. */
> - if (ssif_info->ssif_debug & SSIF_DEBUG_MSG)
> - dev_dbg(&ssif_info->client->dev,
> - "Error from i2c_non_blocking_op(5)\n");
> -
> - msg_done_handler(ssif_info, -EIO, NULL, 0);
> - }
> + ssif_i2c_send(ssif_info, msg_done_handler, I2C_SMBUS_READ,
> +   SSIF_IPMI_RESPONSE,
> +   ssif_info->recv, I2C_SMBUS_BLOCK_DATA);
>  }
>  
>  static void retry_timeout(struct timer_list *t)
> @@ -620,7 +609,6 @@ static void msg_done_handler(struct ssif_info *ssif_info, 
> int result,
>  {
>   struct ipmi_smi_msg *msg;
>   unsigned long oflags, *flags;
> - int rv;
>  
>   /*
>* We are single-threaded here, so no need for a lock until we
> @@ -666,17 +654,10 @@ static void msg_done_handler(struct ssif_info 
> *ssif_info, int result,
>   ssif_info->multi_len = len;
>   ssif_info->multi_pos = 1;
>  
> - rv = ssif_i2c_send(ssif_info, msg_done_handler, I2C_SMBUS_READ,
> -   SSIF_IPMI_MULTI_PART_RESPONSE_MIDDLE,
> -   ssif_info->recv, I2C_SMBUS_BLOCK_DATA);
> - if (rv < 0) {
> - if (ssif_info->ssif_debug & SSIF_DEBUG_MSG)
> - dev_dbg(&ssif_info->client->dev,
> - "Error from i2c_non_blocking_op(1)\n");
> -
> - result = -EIO;
> - } else
> - return;
> + ssif_i2c_send(ssif_info, msg_done_handler, I2C_SMBUS_READ,
> +  SSIF_IPMI_MULTI_PART_RESPONSE_MIDDLE,
> +  ssif_info->recv, I2C_SMBUS_BLOCK_DATA);
> + return;
>   } else if (ssif_info->multi_pos) {
>   /* Middle of multi-part read.  Start the next transaction. */
>   int i;
> @@ -738,19 +719,12 @@ static void msg_done_handler(struct ssif_info 
> *ssif_info, int result,
>  
>   ssif_info->multi_pos++;
>  
> - rv = ssif_i2c_send(ssif_info, msg_done_handler,
> -I2C_SMBUS_READ,
> -SSIF_IPMI_MULTI_PART_RESPONSE_MIDDLE,
> -ssif_info->recv,
> -I2C_SMBUS_BLOCK_DATA);
> - if (rv < 0) {
> - if (ssif_info->ssif_debug & SSIF_DEBUG_MSG)
> - dev_dbg(&ssif_info->client->dev,
> - "Error from ssif_i2c_send\n");
> -
> - result = -EIO;
> - } else
> - return;
> + ssif_i2c_send(ssif_info, msg_done_handler,
> +   I2C_SMBUS_READ,
> +   SSIF_IPMI_MULTI_PART_RESPONSE_MIDDLE,
> +   ssif_info->recv,
> +   I2C_SMBUS_BLOCK_DATA);
> + return;
>   }
>   }
>  
> @@ -908,8 +882,6 @@ static void msg_done_handler(struct ssif_info *ssif_info, 
> int result,
>  static void msg_written_handler(struct ssif_info *ssif_info, int result,
> 

[GIT PULL] IPMI bug fixes for 5.12

2021-02-22 Thread Corey Minyard
The following changes since commit 76c057c84d286140c6c416c3b4ba832cd1d8984e:

  Merge branch 'parisc-5.11-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux (2021-01-27 
11:06:15 -0800)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.12-1

for you to fetch changes up to fc26067c7417e7fafed7bcc97bda155d91988734:

  ipmi: remove open coded version of SMBus block write (2021-01-28 07:15:12 
-0600)


Pull request for IPMI for 5.12

Only one change in this pull, but it's required for other things, so it
needs to go in.

-corey


Wolfram Sang (1):
  ipmi: remove open coded version of SMBus block write

 drivers/char/ipmi/ipmb_dev_int.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)



Re: [PATCH RESEND] ipmi: remove open coded version of SMBus block write

2021-01-28 Thread Corey Minyard
On Thu, Jan 28, 2021 at 01:53:50PM +0100, Wolfram Sang wrote:
> On Thu, Jan 28, 2021 at 06:37:57AM -0600, Corey Minyard wrote:
> > Looks good, do you want this in the IPMI tree or are you handling this
> > another way?
> 
> I can take it but would prefer the IPMI tree.

Ok, it's queued for next merge window.

-corey

> 
> Thanks!
> 




Re: [PATCH RESEND] ipmi: remove open coded version of SMBus block write

2021-01-28 Thread Corey Minyard
Looks good, do you want this in the IPMI tree or are you handling this
another way?

Thanks,

-corey

On Thu, Jan 28, 2021 at 09:55:43AM +0100, Wolfram Sang wrote:
> The block-write function of the core was not used because there was no
> client-struct to use. However, in this case it seems appropriate to use a
> temporary client struct, because we are answering a request we received
> when being a client ourselves. So, convert the code to use a temporary
> client and use the block-write function of the I2C core.
> 
> Signed-off-by: Wolfram Sang 
> Reviewed-by: Asmaa Mnebhi 
> Acked-by: Corey Minyard 
> ---
> 
> No change since V1, only added tags given in private communication.
> 
>  drivers/char/ipmi/ipmb_dev_int.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 382b28f1cf2f..49b8f22fdcf0 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -137,7 +137,7 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>  {
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   u8 rq_sa, netf_rq_lun, msg_len;
> - union i2c_smbus_data data;
> + struct i2c_client *temp_client;
>   u8 msg[MAX_MSG_LEN];
>   ssize_t ret;
>  
> @@ -160,21 +160,21 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>   }
>  
>   /*
> -  * subtract rq_sa and netf_rq_lun from the length of the msg passed to
> -  * i2c_smbus_xfer
> +  * subtract rq_sa and netf_rq_lun from the length of the msg. Fill the
> +  * temporary client. Note that its use is an exception for IPMI.
>*/
>   msg_len = msg[IPMB_MSG_LEN_IDX] - SMBUS_MSG_HEADER_LENGTH;
> - if (msg_len > I2C_SMBUS_BLOCK_MAX)
> - msg_len = I2C_SMBUS_BLOCK_MAX;
> + temp_client = kmemdup(ipmb_dev->client, sizeof(*temp_client), 
> GFP_KERNEL);
> + if (!temp_client)
> + return -ENOMEM;
> +
> + temp_client->addr = rq_sa;
>  
> - data.block[0] = msg_len;
> - memcpy(&data.block[1], msg + SMBUS_MSG_IDX_OFFSET, msg_len);
> - ret = i2c_smbus_xfer(ipmb_dev->client->adapter, rq_sa,
> -  ipmb_dev->client->flags,
> -  I2C_SMBUS_WRITE, netf_rq_lun,
> -  I2C_SMBUS_BLOCK_DATA, &data);
> + ret = i2c_smbus_write_block_data(temp_client, netf_rq_lun, msg_len,
> +  msg + SMBUS_MSG_IDX_OFFSET);
> + kfree(temp_client);
>  
> - return ret ? : count;
> + return ret < 0 ? ret : count;
>  }
>  
>  static __poll_t ipmb_poll(struct file *file, poll_table *wait)
> -- 
> 2.28.0
> 
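
The change of the return expression at the end of ipmb_write() can be illustrated with a small userspace sketch. The old form, "ret ? : count", is the GNU conditional with an omitted middle operand and is written out in standard C below. result_old() and result_new() are hypothetical names. As far as I know both SMBus helpers return zero on success and a negative errno on failure, so the two forms should agree for these calls; they only differ if a positive value ever comes back.

```c
#include <stddef.h>

/* Old form: any nonzero ret is passed through, even a positive one. */
static long result_old(int ret, size_t count)
{
	return ret ? ret : (long)count;
}

/* New form: only a negative errno is treated as a failure. */
static long result_new(int ret, size_t count)
{
	return ret < 0 ? ret : (long)count;
}
```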


Re: [PATCH v5 3/5] ipmi: kcs: aspeed: Adapt to new LPC DTS layout

2021-01-22 Thread Corey Minyard
On Fri, Jan 22, 2021 at 09:55:56AM +, ChiaWei Wang wrote:
> Hi Corey,
> 
> Could you help to review this patch to kcs_bmc_aspeed.c?
> It mainly fixes the register layout/offsets of Aspeed LPC controller.

I am not really qualified to review this.  It looks ok from a structural
and style point of view, but that's all I can tell.  So I'm ok with it.

Acked-by: Corey Minyard 

> 
> Thanks,
> Chiawei
> 
> > -Original Message-
> > From: Andrew Jeffery 
> > Sent: Wednesday, January 20, 2021 1:03 PM
> > Subject: Re: [PATCH v5 3/5] ipmi: kcs: aspeed: Adapt to new LPC DTS layout
> > 
> > 
> > 
> > On Thu, 14 Jan 2021, at 23:46, Chia-Wei, Wang wrote:
> > > Add check against LPC device v2 compatible string to ensure that the
> > > fixed device tree layout is adopted.
> > > The LPC register offsets are also fixed accordingly.
> > >
> > > Signed-off-by: Chia-Wei, Wang 
> > > Acked-by: Haiyue Wang 
> > 
> > Reviewed-by: Andrew Jeffery 
> 
> Thanks for the review.


Re: [PATCH RFC 3/3] ipmi: remove open coded version of SMBus block write

2021-01-13 Thread Corey Minyard
On Tue, Jan 12, 2021 at 05:41:29PM +0100, Wolfram Sang wrote:
> The block-write function of the core was not used because there was no
> client-struct to use. However, in this case it seems appropriate to use a
> temporary client struct, because we are answering a request we received
> when being a client ourselves. So, convert the code to use a temporary
> client and use the block-write function of the I2C core.

I asked the original authors of this about the change, and apparently it
results in a stack size warning.  Arnd Bergmann asked for it to be changed
from what you are suggesting to what it currently is.  See:

https://www.lkml.org/lkml/2019/6/19/440

So apparently this change will cause compile warnings due to the size of
struct i2c_client.

-corey

> 
> Signed-off-by: Wolfram Sang 
> ---
>  drivers/char/ipmi/ipmb_dev_int.c | 21 -
>  1 file changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 382b28f1cf2f..10d89886e5f3 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -137,7 +137,7 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>  {
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   u8 rq_sa, netf_rq_lun, msg_len;
> - union i2c_smbus_data data;
> + struct i2c_client temp_client;
>   u8 msg[MAX_MSG_LEN];
>   ssize_t ret;
>  
> @@ -160,21 +160,16 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>   }
>  
>   /*
> -  * subtract rq_sa and netf_rq_lun from the length of the msg passed to
> -  * i2c_smbus_xfer
> +  * subtract rq_sa and netf_rq_lun from the length of the msg. Fill the
> +  * temporary client. Note that its use is an exception for IPMI.
>*/
>   msg_len = msg[IPMB_MSG_LEN_IDX] - SMBUS_MSG_HEADER_LENGTH;
> - if (msg_len > I2C_SMBUS_BLOCK_MAX)
> - msg_len = I2C_SMBUS_BLOCK_MAX;
> + memcpy(&temp_client, ipmb_dev->client, sizeof(temp_client));
> + temp_client.addr = rq_sa;
>  
> - data.block[0] = msg_len;
> - memcpy(&data.block[1], msg + SMBUS_MSG_IDX_OFFSET, msg_len);
> - ret = i2c_smbus_xfer(ipmb_dev->client->adapter, rq_sa,
> -  ipmb_dev->client->flags,
> -  I2C_SMBUS_WRITE, netf_rq_lun,
> -  I2C_SMBUS_BLOCK_DATA, &data);
> -
> - return ret ? : count;
> + ret = i2c_smbus_write_block_data(&temp_client, netf_rq_lun, msg_len,
> +  msg + SMBUS_MSG_IDX_OFFSET);
> + return ret < 0 ? ret : count;
>  }
>  
>  static __poll_t ipmb_poll(struct file *file, poll_table *wait)
> -- 
> 2.29.2
> 
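
The stack-size concern raised in this reply goes away if the temporary copy lives on the heap, which is what the later RESEND version of the patch does with kmemdup(). A rough userspace analogue is sketched below. struct big_client, memdup_sketch() and send_with_temp_client() are all hypothetical, and the "transfer" is reduced to reading back the address.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a large structure such as struct i2c_client. */
struct big_client {
	unsigned short addr;
	char payload[1024];
};

/* Userspace analogue of kmemdup(): heap-allocate a copy of src. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/*
 * Copy the client, retarget the copy at rq_sa, use it, then free it.
 * Returns the address the "transfer" saw, or -1 if allocation failed.
 * The original client is never modified and nothing large sits on the stack.
 */
static int send_with_temp_client(const struct big_client *orig,
				 unsigned short rq_sa)
{
	struct big_client *tmp = memdup_sketch(orig, sizeof(*tmp));
	int used_addr;

	if (!tmp)
		return -1;

	tmp->addr = rq_sa;
	used_addr = tmp->addr;	/* stands in for the real transfer call */
	free(tmp);
	return used_addr;
}
```

The trade-off is an allocation per write and one more error path, in exchange for a small stack frame.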


Re: ipmi_msghandler.c question

2021-01-08 Thread Corey Minyard
On Fri, Jan 08, 2021 at 11:37:04PM +, Asmaa Mnebhi wrote:
> Hi Corey,
> 
> I have a question for you related to the following function in 
> ipmi_msghandler.c
> 
> static void __get_guid(struct ipmi_smi *intf)
> {
>   int rv;
>   struct bmc_device *bmc = intf->bmc;
> 
>   bmc->dyn_guid_set = 2;
>   intf->null_user_handler = guid_handler;
>   rv = send_guid_cmd(intf, 0);
>   if (rv)
>   /* Send failed, no GUID available. */
>   bmc->dyn_guid_set = 0;
>   else
>   wait_event(intf->waitq, bmc->dyn_guid_set != 2);
> 
>   /* dyn_guid_set makes the guid data available. */
>   smp_rmb();
> 
>   intf->null_user_handler = NULL;
> }
> 
> Why is wait_event used as opposed to wait_event_timeout? In the context where 
> the dyn_guid_set value doesn't change from 2, this would run forever. 
> Wouldn't we want to timeout after a certain amount of time?
> 

The low-level IPMI driver is guaranteed to return a response to a
message, though if something goes wrong with the BMC it can take a few
seconds to return the failure message.  So it shouldn't be an issue.

-corey

> Thanks.
> Asmaa


Re: [PATCH 2/2] drivers:tty:pty: Fix a race causing data loss on close

2021-01-02 Thread Corey Minyard
On Mon, Nov 23, 2020 at 06:49:02PM -0600, miny...@acm.org wrote:
> From: Corey Minyard 
> 
> Remove the tty_vhangup() from the pty code and just release the
> redirect.  The tty_vhangup() results in data loss and data out of order
> issues.

It's been a while, so ping on this.  I'm pretty sure this is the right
fix, the more I've thought about it.

Thanks,

-corey

> 
> If you write to a pty master and immediately close the pty master, the
> receiver might get a chunk of data dropped, but then receive some later
> data.  That's obviously something rather unexpected for a user.  It
> certainly confused my test program.
> 
> It turns out that tty_vhangup() on the slave pty gets called from
> pty_close(), and that causes the data on the slave side to be flushed,
> but due to races more data can be copied into the slave side's buffer
> after that.  Consider the following sequence:
> 
> thread1  thread2 thread3
> ---  --- ---
>  ||-write data into buffer,
>  || n_tty buffer is filled
>  || along with other buffers
>  ||-pty_close(master)
>  ||--tty_vhangup(slave)
>  ||---tty_ldisc_hangup()
>  ||n_tty_flush_buffer()
>  ||-reset_buffer_flags()
>  |-n_tty_read()   |
>  |--up_read(>termios_rwsem);
>  ||--down_read(>termios_rwsem)
>  ||--clear n_tty buffer contents
>  ||--up_read(>termios_rwsem)
>  |--tty_buffer_flush_work()   |
>  |--schedules work calling|
>  |  flush_to_ldisc()  |
>  ||-flush_to_ldisc()
>  ||--receive_buf()
>  ||---tty_port_default_receive_buf()
>  ||tty_ldisc_receive_buf()
>  ||-n_tty_receive_buf2()
>  ||--n_tty_receive_buf_common()
>  ||---down_read(>termios_rwsem)
>  ||---__receive_buf()
>  ||   copies data into n_tty buffer
>  ||---up_read(>termios_rwsem)
>  |--down_read(>termios_rwsem)
>  |--copy buffer data to user
> 
> From this sequence, you can see that thread2 writes to the buffer then
> only clears the part of the buffer in n_tty.  The n_tty receive buffer
> code then copies more data into the n_tty buffer.
> 
> But part of the vhangup, releasing the redirect, is still required to
> avoid issues with consoles running on pty slaves.  So do that.
> As far as I can tell, that is all that should be required.
> 
> Signed-off-by: Corey Minyard 
> ---
>  drivers/tty/pty.c| 15 +--
>  drivers/tty/tty_io.c |  5 +++--
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
> index 23368cec7ee8..29be6b985e76 100644
> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -67,7 +67,8 @@ static void pty_close(struct tty_struct *tty, struct file 
> *filp)
>   wake_up_interruptible(&tty->link->read_wait);
>   wake_up_interruptible(&tty->link->write_wait);
>   if (tty->driver->subtype == PTY_TYPE_MASTER) {
> - set_bit(TTY_OTHER_CLOSED, &tty->flags);
> + struct file *f;
> +
>  #ifdef CONFIG_UNIX98_PTYS
>   if (tty->driver == ptm_driver) {
>   mutex_lock(_mutex);
> @@ -76,7 +77,17 @@ static void pty_close(struct tty_struct *tty, struct file 
> *filp)
>   mutex_unlock(_mutex);
>   }
>  #endif
> - tty_vhangup(tty->link);
> +
> + /*
> +  * This hack is required because a program can open a
> +  * pty and redirect a console to it, but if the pty is
> +  * closed and the console is not released, then the
> +  * slave side will never close.  So release the
> +  * redirect when the master closes.
> +  */
> + f = tty_release_redirect(tty->link);
> + if (f)
> + fput(f);
>   }
>  }
>  
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 571b1d7d4d5a..91c33a0df3c4 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -547,7 +547,9 @@ EXPORT_SYMBOL_GPL(tty_wakeup);
>   *   @tty: tty device
>   *
>   *   This is available to the pty code so if the master closes, if the
> - *   slave is a redirect it can release t

[GIT PULL] IPMI bug fixes for 5.11

2020-12-16 Thread Corey Minyard
Some very minor fixes.  One came in today, but it was just changing
some commas to semicolons.  The rest have been lying around a month or
more.

The following changes since commit 9ff9b0d392ea08090cd1780fb196f36dbb586529:

  Merge tag 'net-next-5.10' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next (2020-10-15 
18:42:13 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.11-1

for you to fetch changes up to fad0319cacdf02a8d4d31aa1d8dc18c5bd5e397e:

  char: ipmi: convert comma to semicolon (2020-12-16 07:54:54 -0600)


Some very minor changes

Nothing functional, just little syntax cleanups and a RCU warning
suppression.


Qinglang Miao (1):
  ipmi: msghandler: Suppress suspicious RCU usage warning

Tom Rix (1):
  char: ipmi: remove unneeded break

Yejune Deng (1):
  ipmi/watchdog: replace atomic_add() and atomic_sub()

Zheng Yongjun (1):
  char: ipmi: convert comma to semicolon

 drivers/char/ipmi/bt-bmc.c  | 6 +++---
 drivers/char/ipmi/ipmi_devintf.c| 1 -
 drivers/char/ipmi/ipmi_msghandler.c | 3 ++-
 drivers/char/ipmi/ipmi_watchdog.c   | 8 
 4 files changed, 9 insertions(+), 9 deletions(-)



Re: [PATCH] ipmi: msghandler: Suppress suspicious RCU usage warning

2020-11-19 Thread Corey Minyard
On Thu, Nov 19, 2020 at 03:08:39PM +0800, Qinglang Miao wrote:
> while running ipmi, ipmi_smi_watcher_register() caused
> a suspicious RCU usage warning.

Thanks.  I had looked at this and found it was ok, but I hadn't spent
the time to figure out how to suppress it.  It's in my next queue.

-corey

> 
> -
> 
> =
> WARNING: suspicious RCU usage
> 5.10.0-rc3+ #1 Not tainted
> -
> drivers/char/ipmi/ipmi_msghandler.c:750 RCU-list traversed in non-reader 
> section!!
> other info that might help us debug this:
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by syz-executor.0/4254:
> stack backtrace:
> CPU: 0 PID: 4254 Comm: syz-executor.0 Not tainted 5.10.0-rc3+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 
> 04/ 01/2014
> Call Trace:
> dump_stack+0x19d/0x200
> ipmi_smi_watcher_register+0x2d3/0x340 [ipmi_msghandler]
> acpi_ipmi_init+0xb1/0x1000 [acpi_ipmi]
> do_one_initcall+0x149/0x7e0
> do_init_module+0x1ef/0x700
> load_module+0x3467/0x4140
> __do_sys_finit_module+0x10d/0x1a0
> do_syscall_64+0x34/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x468ded
> 
> -
> 
> It is safe because smi_watchers_mutex is locked and srcu_read_lock
> has been used, so simply pass lockdep_is_held() to the
> list_for_each_entry_rcu() to suppress this warning.
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Qinglang Miao 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 8774a3b8f..c44ad1846 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -747,7 +747,8 @@ int ipmi_smi_watcher_register(struct ipmi_smi_watcher 
> *watcher)
>   list_add(&watcher->link, &smi_watchers);
>  
>   index = srcu_read_lock(&ipmi_interfaces_srcu);
> - list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
> + list_for_each_entry_rcu(intf, &ipmi_interfaces, link,
> + lockdep_is_held(&smi_watchers_mutex)) {
>   int intf_num = READ_ONCE(intf->intf_num);
>  
>   if (intf_num == -1)
> -- 
> 2.23.0
> 


Re: [PATCH] ipmi/watchdog: replace atomic_add() and atomic_sub()

2020-11-17 Thread Corey Minyard
On Mon, Nov 16, 2020 at 03:30:07PM +0800, Yejune Deng wrote:
> atomic_inc() and atomic_dec() look better

Yes, that's a little neater.  Queued for next release.

Thanks,

-corey

> 
> Signed-off-by: Yejune Deng 
> ---
>  drivers/char/ipmi/ipmi_watchdog.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_watchdog.c 
> b/drivers/char/ipmi/ipmi_watchdog.c
> index f78156d..32c334e 100644
> --- a/drivers/char/ipmi/ipmi_watchdog.c
> +++ b/drivers/char/ipmi/ipmi_watchdog.c
> @@ -495,7 +495,7 @@ static void panic_halt_ipmi_heartbeat(void)
>   msg.cmd = IPMI_WDOG_RESET_TIMER;
>   msg.data = NULL;
>   msg.data_len = 0;
> - atomic_add(1, _done_count);
> + atomic_inc(_done_count);
>   rv = ipmi_request_supply_msgs(watchdog_user,
> (struct ipmi_addr *) ,
> 0,
> @@ -505,7 +505,7 @@ static void panic_halt_ipmi_heartbeat(void)
> _halt_heartbeat_recv_msg,
> 1);
>   if (rv)
> - atomic_sub(1, _done_count);
> + atomic_dec(_done_count);
>  }
>  
>  static struct ipmi_smi_msg panic_halt_smi_msg = {
> @@ -529,12 +529,12 @@ static void panic_halt_ipmi_set_timeout(void)
>   /* Wait for the messages to be free. */
>   while (atomic_read(_done_count) != 0)
>   ipmi_poll_interface(watchdog_user);
> - atomic_add(1, _done_count);
> + atomic_inc(_done_count);
>   rv = __ipmi_set_timeout(_halt_smi_msg,
>   _halt_recv_msg,
>   _heartbeat_now);
>   if (rv) {
> - atomic_sub(1, _done_count);
> + atomic_dec(_done_count);
>   pr_warn("Unable to extend the watchdog timeout\n");
>   } else {
>   if (send_heartbeat_now)
> -- 
> 1.9.1
> 
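
The increment-then-undo-on-failure pattern touched by this patch can be sketched in userspace with C11 atomics. This is an analogy, not the kernel API: atomic_fetch_add(x, 1) plays the role of atomic_inc(), and pending_count, try_send() and heartbeat_sketch() are hypothetical names.

```c
#include <stdatomic.h>

/* Counts requests that have been issued but not yet completed. */
static atomic_int pending_count;

/* Hypothetical send routine: returns 0 on success, nonzero on failure. */
static int try_send(int fail)
{
	return fail ? -1 : 0;
}

static int heartbeat_sketch(int fail)
{
	int rv;

	atomic_fetch_add(&pending_count, 1);	/* like atomic_inc() */
	rv = try_send(fail);
	if (rv)
		atomic_fetch_sub(&pending_count, 1);	/* like atomic_dec() */
	return rv;
}
```

On success the counter stays raised until a completion handler lowers it; on failure it is restored immediately, so a waiter polling for zero is not left hanging.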


Re: [PATCH] char: ipmi: remove unneeded break

2020-10-27 Thread Corey Minyard
On Mon, Oct 19, 2020 at 12:48:05PM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> A break is not needed if it is preceded by a return

Ok, it's in my next tree.

Thanks,

-corey

> 
> Signed-off-by: Tom Rix 
> ---
>  drivers/char/ipmi/ipmi_devintf.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_devintf.c 
> b/drivers/char/ipmi/ipmi_devintf.c
> index f7b1c004a12b..3dd1d5abb298 100644
> --- a/drivers/char/ipmi/ipmi_devintf.c
> +++ b/drivers/char/ipmi/ipmi_devintf.c
> @@ -490,7 +490,6 @@ static long ipmi_ioctl(struct file   *file,
>   }
>  
>   return ipmi_set_my_address(priv->user, val.channel, val.value);
> - break;
>   }
>  
>   case IPMICTL_GET_MY_CHANNEL_ADDRESS_CMD:
> -- 
> 2.18.1
> 


Re: [Openipmi-developer] [PATCH] ipmi_si: replace spin_lock_irqsave by spin_lock in hard IRQ

2020-10-27 Thread Corey Minyard
On Sat, Oct 17, 2020 at 09:40:10AM +0800, Tian Tao wrote:
> It is redundant to do irqsave and irqrestore in hardIRQ context.

Are ACPI GPEs run in hardirq context?  I looked around a bit and
couldn't tell.  If not, then I can't take this patch.  Otherwise, it's
ok.

-corey

> 
> Signed-off-by: Tian Tao 
> ---
>  drivers/char/ipmi/ipmi_si_intf.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index 45546ac..97452a8 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1116,7 +1116,6 @@ static void smi_timeout(struct timer_list *t)
>  irqreturn_t ipmi_si_irq_handler(int irq, void *data)
>  {
>   struct smi_info *smi_info = data;
> - unsigned long   flags;
>  
>   if (smi_info->io.si_type == SI_BT)
>   /* We need to clear the IRQ flag for the BT interface. */
> @@ -1124,14 +1123,14 @@ irqreturn_t ipmi_si_irq_handler(int irq, void *data)
>IPMI_BT_INTMASK_CLEAR_IRQ_BIT
>| IPMI_BT_INTMASK_ENABLE_IRQ_BIT);
>  
> - spin_lock_irqsave(&(smi_info->si_lock), flags);
> + spin_lock(&(smi_info->si_lock));
>  
>   smi_inc_stat(smi_info, interrupts);
>  
>   debug_timestamp("Interrupt");
>  
>   smi_event_handler(smi_info, 0);
> - spin_unlock_irqrestore(&(smi_info->si_lock), flags);
> + spin_unlock(&(smi_info->si_lock));
>   return IRQ_HANDLED;
>  }
>  
> -- 
> 2.7.4
> 
> 
> 
> ___
> Openipmi-developer mailing list
> openipmi-develo...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer


[GIT PULL] IPMI bug fixes for 5.10

2020-10-13 Thread Corey Minyard
The following changes since commit fc80c51fd4b23ec007e88d4c688f2cac1b8648e7:

  Merge tag 'kbuild-v5.9' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild 
(2020-08-09 14:10:26 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.10-1

for you to fetch changes up to 8fe7990ceda8597e407d06bffc4bdbe835a93ece:

  ipmi_si: Fix wrong return value in try_smi_init() (2020-10-05 13:30:51 -0500)


Bug fix pull for IPMI for 5.10

Some minor bug fixes, return values, cleanups of prints, conversion of
tasklets to the new API.

The biggest change is retrying the initial information fetch from the
management controller.  If that fails, the interface is not operational,
and one group was having trouble with the management controller not
being ready when the OS started up.  So a retry was added.


Allen Pais (1):
  char: ipmi: convert tasklets to use new tasklet_setup() API

Corey Minyard (1):
  ipmi: Clean up some printks

Dan Carpenter (1):
  ipmi: msghandler: Fix a signedness bug

Markus Boehme (1):
  ipmi: Reset response handler when failing to send the command

Tianjia Zhang (1):
  ipmi_si: Fix wrong return value in try_smi_init()

Xianting Tian (3):
  ipmi:sm: Print current state when the state is invalid
  ipmi:msghandler: retry to get device id on an error
  ipmi: add retry in try_get_dev_id()

Xiongfeng Wang (1):
  ipmi: add a newline when printing parameter 'panic_op' by sysfs

 drivers/char/ipmi/ipmi_bt_sm.c  |  4 ++-
 drivers/char/ipmi/ipmi_kcs_sm.c | 15 +++
 drivers/char/ipmi/ipmi_msghandler.c | 52 +
 drivers/char/ipmi/ipmi_si_intf.c| 19 +-
 drivers/char/ipmi/ipmi_smic_sm.c| 35 +++--
 include/linux/ipmi.h|  2 ++
 include/uapi/linux/ipmi_msgdefs.h   |  2 ++
 7 files changed, 93 insertions(+), 36 deletions(-)



Re: [Openipmi-developer] [PATCH 3/3] ipmi: Add timeout waiting for channel information

2020-10-07 Thread Corey Minyard
On Thu, Sep 10, 2020 at 11:08:40AM +, Boehme, Markus via Openipmi-developer 
wrote:
> > > - && ipmi_version_minor(id) >= 5)) {
> > > - unsigned int set;
> > > + if (ipmi_version_major(id) == 1 && ipmi_version_minor(id) < 5) {
> > This is incorrect, it will not correctly handle IPMI 0.x BMCs.  Yes,
> > they exist.
> 
> Interesting! I wasn't aware of those. Searching the web doesn't turn up
> much and the spec doesn't mention them either. Are these pre-release
> implementations of the IPMI 1.0 spec or some kind of "IPMI light"?

There was an 0.9 version of the spec that some machines implemented.
It's not really a "light" version, it's just a really early version.  I
don't know how many machines out there still implement it, but I try to
keep them working if I can.
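
As a rough illustration of the problem with the quoted check (a sketch
only; the function names and the plain maj/min arguments are made up for
the example and are not the driver's ipmi_version_major() and
ipmi_version_minor() helpers), a test for "older than 1.5" has to treat
every 0.x version as old too:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when the BMC reports IPMI 1.5 or later.
 * maj/min stand in for the values the driver reads from the device id.
 */
bool ipmi_is_1_5_or_later(unsigned int maj, unsigned int min)
{
	return maj > 1 || (maj == 1 && min >= 5);
}

/* The condition from the quoted hunk: only matches 1.0 through 1.4. */
bool quoted_pre_1_5_check(unsigned int maj, unsigned int min)
{
	return maj == 1 && min < 5;
}

/* A pre-1.5 test that also covers 0.x BMCs such as 0.9. */
bool ipmi_is_pre_1_5(unsigned int maj, unsigned int min)
{
	return !ipmi_is_1_5_or_later(maj, min);
}
```

With these definitions, assert(ipmi_is_pre_1_5(0, 9)) holds while
quoted_pre_1_5_check(0, 9) is false, which is the 0.x case being missed.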

Thanks,

-corey

> 
> Markus


Re: [PATCH] ipmi_si: Fix wrong return value in try_smi_init()

2020-10-05 Thread Corey Minyard
On Mon, Oct 05, 2020 at 10:52:12PM +0800, Tianjia Zhang wrote:
> On an error exit path, a negative error code should be returned
> instead of a positive return value.

Thanks!  In my tree for the next release.
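
For anyone wondering why the sign matters, here is a minimal userspace
sketch of the convention.  try_init() and init_failed() are hypothetical
names, and the "rv < 0" test is an assumption about how a caller might
check the result, not a description of the actual callers of
try_smi_init():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical init routine: returns 0 on success, negative errno on error. */
int try_init(const void *dev)
{
	if (dev == NULL)
		return -EIO;	/* negative errno signals failure */
	return 0;
}

/* A caller-side check that only treats negative values as errors. */
int init_failed(int rv)
{
	return rv < 0;
}
```

A positive EIO would slip past a check like init_failed(), so the error
would be silently treated as success.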

-corey

> 
> Fixes: 90b2d4f15ff7 ("ipmi_si: Remove hacks for adding a dummy platform 
> devices")
> Cc: Corey Minyard 
> Signed-off-by: Tianjia Zhang 
> ---
>  drivers/char/ipmi/ipmi_si_intf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index 77b8d551ae7f..dd559661c15b 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1963,7 +1963,7 @@ static int try_smi_init(struct smi_info *new_smi)
>   /* Do this early so it's available for logs. */
>   if (!new_smi->io.dev) {
>   pr_err("IPMI interface added with no device\n");
> - rv = EIO;
> + rv = -EIO;
>   goto out_err;
>   }
>  
> -- 
> 2.24.3 (Apple Git-128)
> 


Re: [PATCH] dt-bindings: Another round of adding missing 'additionalProperties'

2020-10-03 Thread Corey Minyard
> +additionalProperties: false
> +
>  examples:
>- |
>  #include 
> diff --git a/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml 
> b/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml
> index 12aa16f24772..f8a932be0d10 100644
> --- a/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml
> +++ b/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml
> @@ -61,6 +61,8 @@ required:
>- sharp,proximity-far-hysteresis
>- sharp,proximity-close-hysteresis
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  #include 
> diff --git 
> a/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8975.yaml 
> b/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8975.yaml
> index f0b336ac39c9..a25590a16ba7 100644
> --- 
> a/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8975.yaml
> +++ 
> b/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8975.yaml
> @@ -55,6 +55,8 @@ required:
>- compatible
>- reg
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  #include 
> diff --git 
> a/Documentation/devicetree/bindings/iio/proximity/vishay,vcnl3020.yaml 
> b/Documentation/devicetree/bindings/iio/proximity/vishay,vcnl3020.yaml
> index 51dba64037f6..fbd3a2e32280 100644
> --- a/Documentation/devicetree/bindings/iio/proximity/vishay,vcnl3020.yaml
> +++ b/Documentation/devicetree/bindings/iio/proximity/vishay,vcnl3020.yaml
> @@ -47,6 +47,8 @@ required:
>- compatible
>- reg
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  i2c {
> diff --git 
> a/Documentation/devicetree/bindings/interrupt-controller/ingenic,intc.yaml 
> b/Documentation/devicetree/bindings/interrupt-controller/ingenic,intc.yaml
> index 02a3cf470518..0a046be8d1cd 100644
> --- a/Documentation/devicetree/bindings/interrupt-controller/ingenic,intc.yaml
> +++ b/Documentation/devicetree/bindings/interrupt-controller/ingenic,intc.yaml
> @@ -49,6 +49,8 @@ required:
>- "#interrupt-cells"
>- interrupt-controller
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  intc: interrupt-controller@10001000 {
> diff --git 
> a/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-msi.yaml
>  
> b/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-msi.yaml
> index 1b256d9dd92a..1f6fd73d4624 100644
> --- 
> a/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-msi.yaml
> +++ 
> b/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-msi.yaml
> @@ -46,6 +46,8 @@ required:
>- loongson,msi-base-vec
>- loongson,msi-num-vecs
>  
> +additionalProperties: true #fixme
> +
>  examples:
>- |
>  #include 
> diff --git 
> a/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-pic.yaml
>  
> b/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-pic.yaml
> index a6dcbb2971a9..fdd6a38a31db 100644
> --- 
> a/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-pic.yaml
> +++ 
> b/Documentation/devicetree/bindings/interrupt-controller/loongson,pch-pic.yaml
> @@ -41,6 +41,8 @@ required:
>- interrupt-controller
>- '#interrupt-cells'
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  #include 
> diff --git a/Documentation/devicetree/bindings/ipmi/ipmi-smic.yaml 
> b/Documentation/devicetree/bindings/ipmi/ipmi-smic.yaml
> index 58fa76ee6176..898e3267893a 100644
> --- a/Documentation/devicetree/bindings/ipmi/ipmi-smic.yaml
> +++ b/Documentation/devicetree/bindings/ipmi/ipmi-smic.yaml
> @@ -49,6 +49,8 @@ required:
>- compatible
>- reg
>  
> +additionalProperties: false
> +
>  examples:
>- |
>  smic@fff3a000 {

For IPMI:

Reviewed-by: Corey Minyard

> diff --git a/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml 
> b/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml
> index b1bb3feb0f4d..cd877e817ad1 100644
> --- a/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml
> +++ b/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml
> @@ -58,6 +58,12 @@ properties:
>- 2 # D1~6 with VOUT, D7~9 with VDD
>- 3 # D1~9 are connected to VOUT
>  
> +  '#address-cells':
> +const: 1
> +
> +  '#size-cells':
> +const: 0
> +
>  patternProperties:
>"(^led@[0-9a-f]$|led)":
>  type: object
> @@ -98,6 +104,8 @@ required:
>- compatible
>- reg
>  
> +additionalProperties: false
> +
>  examples:
>- |
> #include 
> diff --git a/Documentation/devicetree/bindings/me

Re: [PATCH] MAINTAINERS: exclude char maintainers from things they do not maintain

2020-09-30 Thread Corey Minyard
On Wed, Sep 30, 2020 at 02:10:07PM +0200, Greg Kroah-Hartman wrote:
> There are a number of subdirectories and files in drivers/char/ that
> have their own maintainers and developers and ways of getting patches to
> Linus.  This includes random.c, IPMI, hardware random drivers, TPM
> drivers, and agp drivers.  Instead of sending those patches to Arnd and
> myself, who can't do anything with them, send them to the proper
> developers instead.
> 
> Cc: Arnd Bergmann 
> Signed-off-by: Greg Kroah-Hartman 

Yes, please do.  No reason for you to get all the noise from these.

Acked-by: Corey Minyard 

> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d6b9445649e5..a6f0a3ec0047 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4101,6 +4101,11 @@ T: git 
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
>  F:   drivers/char/
>  F:   drivers/misc/
>  F:   include/linux/miscdevice.h
> +X:   drivers/char/agp/
> +X:   drivers/char/hw_random/
> +X:   drivers/char/ipmi/
> +X:   drivers/char/random.c
> +X:   drivers/char/tpm/
>  
>  CHECKPATCH
>  M:   Andy Whitcroft 


Re: Bug with data getting dropped on a pty

2020-09-27 Thread Corey Minyard
On Sun, Sep 27, 2020 at 01:37:51PM +0200, Greg Kroah-Hartman wrote:
> On Fri, Sep 25, 2020 at 05:05:36PM -0500, Corey Minyard wrote:
> > I've been trying to track down a bug in a library I support (named
> > gensio; it does all kinds of stream I/O) and I have figured out that
> > the problem is not in the library, it's in the kernel.  I have
> > attached a reproducer program, more on how to run it later.
> > 
> > Basically, if you have a pty master and do the following:
> > 
> >   write(ptym, data, size);
> >   close(ptym);
> > 
> > The other end will occasionally not get the first 4095 bytes of data,
> > but it will get byte 4095 and on.  This only happens on SMP systems; I
> > couldn't reproduce with just one processor.  (Running under qemu I
> > have seen it drop 2048 bytes, but it has always been 4095 outside of a
> > VM.)  I have tested on Ubuntu 18.04.5 x86_64, the base 5.4 kernel, on a
> > > Raspberry Pi running Raspbian, 5.4.51 kernel, and the latest on the
> > master branch of Linus' tree running under qemu on x86_64.
> > 
> > I have never seen it fail going the other way (writing to the slave
> > and reading from the master) and that's part of the test suite.
> > 
> > I'm ok with it not getting any of the data, I'm ok with it getting
> > some of the data at the beginning, but dropping a chunk of the data
> > and getting later data is a problem.
> > 
> > I've looked at the pty and tty code and I haven't found anything
> > obvious, but I haven't looked that hard and I don't know that code
> > very well.
> > 
> > To run the reproducer:
> > 
> >   gcc -g -o testpty testpty.c
> >   ulimit -c unlimited
> >   while ./testpty; do echo pass; done
> > 
> > It should fail pretty quickly; it asserts when it detects the error.
> > You can load the core dump into the debugger.  Note that I wasn't able
> > to reproduce running it in the debugger.
> > 
> > In the debugger, you can back up to the assert and look at the readbuf:
> > 
> > > (gdb) x/30xb readbuf
> > > 0x559e5e9c6080: 0xff 0x08 0x00 0x08 0x01 0x08 0x02 0x08
> > > 0x559e5e9c6088: 0x03 0x08 0x04 0x08 0x05 0x08 0x06 0x08
> > > 0x559e5e9c6090: 0x07 0x08 0x08 0x08 0x09 0x08 0x0a 0x08
> > > 0x559e5e9c6098: 0x0b 0x08 0x0c 0x08 0x0d 0x08
> > > 
> > > versus the data that was sent:
> > > 
> > > 0x559e5e9b6080: 0x00 0x00 0x00 0x01 0x00 0x02 0x00 0x03
> > > 0x559e5e9b6088: 0x00 0x04 0x00 0x05 0x00 0x06 0x00 0x07
> > > 0x559e5e9b6090: 0x00 0x08 0x00 0x09 0x00 0x0a 0x00 0x0b
> > > 0x559e5e9b6098: 0x00 0x0c 0x00 0x0d 0x00 0x0e
> > 
> > > The data is ascending two-byte big-endian numbers.  The data in readbuf
> > > that was read by the reader thread is the data starting at position
> > > 4095 in the data buffer that was transmitted.  Since n_tty has a 4096
> > > byte buffer, that's somewhat suspicious.
> > > 
> > > Though the reproducer always fails on the first buffer, the test
> > > program I had would close in random places, and it would fail at places
> > > besides the beginning of the buffer.
> > 
> > I searched and I couldn't find any error report on this.
> > 
> > -corey
> 
> > 
> > #define _XOPEN_SOURCE 600
> > #define _DEFAULT_SOURCE
> > 
> > > #include <assert.h>
> > > #include <errno.h>
> > > #include <fcntl.h>
> > > #include <pthread.h>
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > > #include <termios.h>
> > > #include <unistd.h>
> > > 
> > > static int pty_make_raw(int ptym)
> > > {
> > > struct termios t;
> > > int err;
> > > 
> > > err = tcgetattr(ptym, &t);
> > > if (err)
> > > return err;
> > > 
> > > cfmakeraw(&t);
> > > return tcsetattr(ptym, TCSANOW, &t);
> > > }
> > 
> > unsigned char data[65536];
> > unsigned char readbuf[65536];
> > int slavefd, slaveerr;
> > size_t readsize;
> > 
> > int
> > cmp_mem(unsigned char *buf, unsigned char *buf2, size_t len, size_t pos)
> > {
> > size_t i;
> > int rv = 0;
> > 
> > for (i = 0; i < len; i++) {
> > if (buf[i] != buf2[i]) {
> > printf("Mismatch on byte %lu, expected 0x%2.2x, got 0x%2.2x\n",
> > >(unsigned long) (i + pos), buf[i], buf2[i]);
> > fflush(stdout);
> > rv = -1;
> > break;
> > }
> > }
> > return 

Re: [PATCH 09/11] drivers/char/ipmi: convert stats to use counter_atomic32

2020-09-25 Thread Corey Minyard
On Fri, Sep 25, 2020 at 05:47:23PM -0600, Shuah Khan wrote:
> counter_atomic* is introduced to be used when a variable is used as
> a simple counter and doesn't guard object lifetimes. This clearly
> differentiates atomic_t usages that guard object lifetimes.
> 
> counter_atomic* variables will wrap around to 0 when they overflow and
> should not be used to guard resource lifetimes, device usage and
> open counts that control state changes, and pm states.
> 
> atomic_t variables used for stats are atomic counters. Overflow will
> wrap around and reset the stats, and the conversion does not change that.
> 
> Convert them to use counter_atomic32.
> 
> Signed-off-by: Shuah Khan 

Reviewed-by: Corey Minyard 

I assume for this conversion that the plan is to eliminate atomic_t
completely and convert all atomic counters used for object lifetime to
struct kref?  The new naming is certainly more clear and I'm happy with
this change.

-corey

> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 9 +
>  drivers/char/ipmi/ipmi_si_intf.c| 9 +
>  2 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6b24ea..36c0b1be22fb 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define IPMI_DRIVER_VERSION "39.2"
>  
> @@ -584,7 +585,7 @@ struct ipmi_smi {
>   struct ipmi_my_addrinfo addrinfo[IPMI_MAX_CHANNELS];
>   bool channels_ready;
>  
> - atomic_t stats[IPMI_NUM_STATS];
> + struct counter_atomic32 stats[IPMI_NUM_STATS];
>  
>   /*
>* run_to_completion duplicate of smb_info, smi_info
> @@ -630,9 +631,9 @@ static LIST_HEAD(smi_watchers);
>  static DEFINE_MUTEX(smi_watchers_mutex);
>  
>  #define ipmi_inc_stat(intf, stat) \
> - atomic_inc(&(intf)->stats[IPMI_STAT_ ## stat])
> + counter_atomic32_inc(&(intf)->stats[IPMI_STAT_ ## stat])
>  #define ipmi_get_stat(intf, stat) \
> - ((unsigned int) atomic_read(&(intf)->stats[IPMI_STAT_ ## stat]))
> + ((unsigned int) counter_atomic32_read(&(intf)->stats[IPMI_STAT_ ## 
> stat]))
>  
>  static const char * const addr_src_to_str[] = {
>   "invalid", "hotmod", "hardcoded", "SPMI", "ACPI", "SMBIOS", "PCI",
> @@ -3448,7 +3449,7 @@ int ipmi_add_smi(struct module *owner,
>   INIT_LIST_HEAD(>cmd_rcvrs);
>   init_waitqueue_head(>waitq);
>   for (i = 0; i < IPMI_NUM_STATS; i++)
> - atomic_set(>stats[i], 0);
> + counter_atomic32_set(>stats[i], 0);
>  
>   mutex_lock(_interfaces_mutex);
>   /* Look for a hole in the numbers. */
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index 77b8d551ae7f..0909a3461f05 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -43,6 +43,7 @@
>  #include "ipmi_si_sm.h"
>  #include 
>  #include 
> +#include 
>  
>  /* Measure times between events in the driver. */
>  #undef DEBUG_TIMING
> @@ -237,7 +238,7 @@ struct smi_info {
>   bool dev_group_added;
>  
>   /* Counters and things for the proc filesystem. */
> - atomic_t stats[SI_NUM_STATS];
> + struct counter_atomic32 stats[SI_NUM_STATS];
>  
>   struct task_struct *thread;
>  
> @@ -245,9 +246,9 @@ struct smi_info {
>  };
>  
>  #define smi_inc_stat(smi, stat) \
> - atomic_inc(&(smi)->stats[SI_STAT_ ## stat])
> + counter_atomic32_inc(&(smi)->stats[SI_STAT_ ## stat])
>  #define smi_get_stat(smi, stat) \
> - ((unsigned int) atomic_read(&(smi)->stats[SI_STAT_ ## stat]))
> + ((unsigned int) counter_atomic32_read(&(smi)->stats[SI_STAT_ ## stat]))
>  
>  #define IPMI_MAX_INTFS 4
>  static int force_kipmid[IPMI_MAX_INTFS];
> @@ -2013,7 +2014,7 @@ static int try_smi_init(struct smi_info *new_smi)
>   atomic_set(_smi->req_events, 0);
>   new_smi->run_to_completion = false;
>   for (i = 0; i < SI_NUM_STATS; i++)
> - atomic_set(_smi->stats[i], 0);
> + counter_atomic32_set(_smi->stats[i], 0);
>  
>   new_smi->interrupt_disabled = true;
>   atomic_set(_smi->need_watch, 0);
> -- 
> 2.25.1
> 


Bug with data getting dropped on a pty

2020-09-25 Thread Corey Minyard
I've been trying to track down a bug in a library I support (named
gensio; it does all kinds of stream I/O) and I have figured out that
the problem is not in the library, it's in the kernel.  I have
attached a reproducer program, more on how to run it later.

Basically, if you have a pty master and do the following:

  write(ptym, data, size);
  close(ptym);

The other end will occasionally not get the first 4095 bytes of data,
but it will get byte 4095 and on.  This only happens on SMP systems; I
couldn't reproduce with just one processor.  (Running under qemu I
have seen it drop 2048 bytes, but it has always been 4095 outside of a
VM.)  I have tested on Ubuntu 18.04.5 x86_64, the base 5.4 kernel, on a
Raspberry Pi running Raspbian, 5.4.51 kernel, and the latest on the
master branch of Linus' tree running under qemu on x86_64.

I have never seen it fail going the other way (writing to the slave
and reading from the master) and that's part of the test suite.

I'm ok with it not getting any of the data, I'm ok with it getting
some of the data at the beginning, but dropping a chunk of the data
and getting later data is a problem.

I've looked at the pty and tty code and I haven't found anything
obvious, but I haven't looked that hard and I don't know that code
very well.

To run the reproducer:

  gcc -g -o testpty testpty.c
  ulimit -c unlimited
  while ./testpty; do echo pass; done

It should fail pretty quickly; it asserts when it detects the error.
You can load the core dump into the debugger.  Note that I wasn't able
to reproduce running it in the debugger.

In the debugger, you can back up to the assert and look at the readbuf:

(gdb) x/30xb readbuf
0x559e5e9c6080: 0xff 0x08 0x00 0x08 0x01 0x08 0x02 0x08
0x559e5e9c6088: 0x03 0x08 0x04 0x08 0x05 0x08 0x06 0x08
0x559e5e9c6090: 0x07 0x08 0x08 0x08 0x09 0x08 0x0a 0x08
0x559e5e9c6098: 0x0b 0x08 0x0c 0x08 0x0d 0x08

versus the data that was sent:

0x559e5e9b6080: 0x00 0x00 0x00 0x01 0x00 0x02 0x00 0x03
0x559e5e9b6088: 0x00 0x04 0x00 0x05 0x00 0x06 0x00 0x07
0x559e5e9b6090: 0x00 0x08 0x00 0x09 0x00 0x0a 0x00 0x0b
0x559e5e9b6098: 0x00 0x0c 0x00 0x0d 0x00 0x0e

The data is ascending two-byte big-endian numbers.  The data in readbuf
that was read by the reader thread is the data starting at position
4095 in the data buffer that was transmitted.  Since n_tty has a 4096
byte buffer, that's somewhat suspicious.
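
To make the offset arithmetic concrete, here is a small sketch.  It is
not part of the reproducer; pattern_byte() and find_pattern_offset() are
names invented for this example.  It regenerates the test pattern and
searches it for the first bytes seen in readbuf.  Assuming the pattern is
filled the same way the reproducer fills data[], the bytes
0xff 0x08 0x00 0x08 0x01 0x08 first occur at offset 4095:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte at offset pos of the test pattern: a 16-bit counter stored
 * big-endian, so even offsets hold the high byte and odd offsets the
 * low byte of (pos / 2).
 */
unsigned char pattern_byte(size_t pos)
{
	size_t n = pos / 2;

	return (pos % 2 == 0) ? (unsigned char)(n >> 8) : (unsigned char)n;
}

/*
 * Return the first offset below limit at which the len bytes in buf
 * match the pattern, or -1 if there is no such offset.
 */
long find_pattern_offset(const unsigned char *buf, size_t len, size_t limit)
{
	size_t pos, i;

	for (pos = 0; pos + len <= limit; pos++) {
		for (i = 0; i < len; i++) {
			if (buf[i] != pattern_byte(pos + i))
				break;
		}
		if (i == len)
			return (long)pos;
	}
	return -1;
}
```

The counter value at offset 4095 is 2047 (0x07ff), whose low byte is
0xff, followed by 2048 (0x0800), which matches the start of the dump.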

Though the reproducer always fails on the first buffer, the test
program I had would close in random places, and it would fail at places
besides the beginning of the buffer.

I searched and I couldn't find any error report on this.

-corey

#define _XOPEN_SOURCE 600
#define _DEFAULT_SOURCE

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

static int pty_make_raw(int ptym)
{
struct termios t;
int err;

err = tcgetattr(ptym, &t);
if (err)
	return err;

cfmakeraw(&t);
return tcsetattr(ptym, TCSANOW, &t);
}

unsigned char data[65536];
unsigned char readbuf[65536];
int slavefd, slaveerr;
size_t readsize;

int
cmp_mem(unsigned char *buf, unsigned char *buf2, size_t len, size_t pos)
{
size_t i;
int rv = 0;

for (i = 0; i < len; i++) {
	if (buf[i] != buf2[i]) {
	printf("Mismatch on byte %lu, expected 0x%2.2x, got 0x%2.2x\n",
		   (unsigned long) (i + pos), buf[i], buf2[i]);
	fflush(stdout);
	rv = -1;
	break;
	}
}
return rv;
}

static void *read_thread(void *dummy)
{
ssize_t i;

do {
	i = read(slavefd, readbuf + readsize, sizeof(readbuf) - readsize);
	if (i <= -1) {
	if (errno == EAGAIN)
		continue;
	if (errno == EIO)
		/* Remote close causes an EIO. */
		return NULL;
	perror("read");
	slaveerr = errno;
	return NULL;
	}
	if (i + readsize > sizeof(data)) {
	slaveerr = E2BIG;
	return NULL;
	}
	if (i && cmp_mem(data + readsize, readbuf + readsize, i, readsize)) {
	fprintf(stderr, "Data mismatch, starting at %ld, %ld bytes\n",
		(long) readsize, (long) i);
	assert(0);
	slaveerr = EBADMSG;
	return NULL;
	}
	readsize += i;
} while (i != 0);

return NULL;
}

int main(int argc, char *argv[])
{
int ptym, err;
char *slave;
ssize_t i;
pthread_t slavethr;

for (i = 0; i < sizeof(data); i += 2) {
	data[i] = (i / 2) >> 8;
	data[i + 1] = i / 2;
}

ptym = posix_openpt(O_RDWR | O_NOCTTY);
if (ptym == -1) {
	perror("posix_openpt");
	exit(1);
}

if (fcntl(ptym, F_SETFL, O_NONBLOCK) == -1) {
	perror("fcntl ptym");
	exit(1);
}

if (pty_make_raw(ptym)) {
	perror("pty_make_raw");
	exit(1);
}

if (unlockpt(ptym) < 0) {
	perror("unlockpt");
	exit(1);
}

slave = ptsname(ptym);
slavefd = open(slave, O_RDWR);
if (slavefd == -1) {
	perror("open");
	exit(1);
}

err = 

Re: [Openipmi-developer] [PATCH] x86: Fix MCE error handing when kdump is enabled

2020-09-23 Thread Corey Minyard
On Wed, Sep 23, 2020 at 04:48:31PM +0800, Wu Bo wrote:
> On 2020/9/23 2:43, Corey Minyard wrote:
> > On Tue, Sep 22, 2020 at 01:29:40PM -0500, miny...@acm.org wrote:
> > > From: Corey Minyard 
> > > 
> > > If kdump is enabled, the handling of shooting down CPUs does not use the
> > > RESET_VECTOR irq before trying to use NMIs to shoot down the CPUs.
> > > 
> > > For normal errors that is fine.  MCEs, however, are already running in
> > > an NMI, so sending them an NMI won't do anything.  The MCE code is set
> > > up to receive the RESET_VECTOR because it disables CPUs, but it won't
> >  ^ should be "enables irqs"
> > > work on the NMI-only case.
> > > 
> > > There is already code in place to scan for the NMI callback being ready,
> > > simply call that from the MCE's wait_for_panic() code so it will pick up
> > > and handle it if an NMI shootdown is requested.  This required
> > > propagating the registers down to wait_for_panic().
> > > 
> > > Signed-off-by: Corey Minyard 
> > > ---
> > > After looking at it a bit, I think this is the proper way to fix the
> > > issue, though I'm not an expert on this code so I'm not sure.
> > > 
> > > I have not even tested this patch, I have only compiled it.  But from
> > > what I can tell, things waiting in NMIs for a shootdown should call
> > > run_crash_ipi_callback() in their wait loop.
> 
> Hi,
> 
> In my VM (using qemu-kvm), kdump is enabled, and I used mce-inject to inject
> an uncorrectable error. I had an issue with the IPMI driver's panic handling
> running while the other CPUs were sitting in "wait_for_panic()" with
> interrupts on, and IPMI interrupts interfering with the panic handling. As a
> result, the IPMI panic handling hung for more than 3000 seconds.
> 
> After I applied and tested this patch, the IPMI hang disappeared.
> It should be a solution to the problem.

Thanks for testing this.  I have submitted the patch to the MCE
maintainers.

-corey

> 
> 
> Thanks,
> 
> Wu Bo
> 
> > > 
> > >   arch/x86/kernel/cpu/mce/core.c | 67 ++
> > >   1 file changed, 44 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/cpu/mce/core.c 
> > > b/arch/x86/kernel/cpu/mce/core.c
> > > index f43a78bde670..3a842b3773b3 100644
> > > --- a/arch/x86/kernel/cpu/mce/core.c
> > > +++ b/arch/x86/kernel/cpu/mce/core.c
> > > @@ -282,20 +282,35 @@ static int fake_panic;
> > >   static atomic_t mce_fake_panicked;
> > >   /* Panic in progress. Enable interrupts and wait for final IPI */
> > > -static void wait_for_panic(void)
> > > +static void wait_for_panic(struct pt_regs *regs)
> > >   {
> > >   long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
> > >   preempt_disable();
> > >   local_irq_enable();
> > > - while (timeout-- > 0)
> > > + while (timeout-- > 0) {
> > > + /*
> > > +  * We are in an NMI waiting to be stopped by the
> > > +  * handing processor.  For kdump handling, we need to
> > > +  * be monitoring crash_ipi_issued since that is what
> > > +  * is used for an NMI stop used by kdump.  But we also
> > > +  * need to have interrupts enabled some so that
> > > +  * RESET_VECTOR will interrupt us on a normal
> > > +  * shutdown.
> > > +  */
> > > + local_irq_disable();
> > > + run_crash_ipi_callback(regs);
> > > + local_irq_enable();
> > > +
> > >   udelay(1);
> > > + }
> > >   if (panic_timeout == 0)
> > >   panic_timeout = mca_cfg.panic_timeout;
> > >   panic("Panicing machine check CPU died");
> > >   }
> > > -static void mce_panic(const char *msg, struct mce *final, char *exp)
> > > +static void mce_panic(const char *msg, struct mce *final, char *exp,
> > > +   struct pt_regs *regs)
> > >   {
> > >   int apei_err = 0;
> > >   struct llist_node *pending;
> > > @@ -306,7 +321,7 @@ static void mce_panic(const char *msg, struct mce 
> > > *final, char *exp)
> > >* Make sure only one CPU runs in machine check panic
> > >*/
> > >   if (atomic_inc_return(_panicked) > 1)
> >

Re: [Openipmi-developer] [PATCH] x86: Fix MCE error handing when kdump is enabled

2020-09-22 Thread Corey Minyard
On Tue, Sep 22, 2020 at 01:29:40PM -0500, miny...@acm.org wrote:
> From: Corey Minyard 
> 
> If kdump is enabled, the handling of shooting down CPUs does not use the
> RESET_VECTOR irq before trying to use NMIs to shoot down the CPUs.
> 
> For normal errors that is fine.  MCEs, however, are already running in
> an NMI, so sending them an NMI won't do anything.  The MCE code is set
> up to receive the RESET_VECTOR because it disables CPUs, but it won't
^ should be "enables irqs"
> work on the NMI-only case.
> 
> There is already code in place to scan for the NMI callback being ready,
> simply call that from the MCE's wait_for_panic() code so it will pick up
> and handle it if an NMI shootdown is requested.  This required
> propagating the registers down to wait_for_panic().
> 
> Signed-off-by: Corey Minyard 
> ---
> After looking at it a bit, I think this is the proper way to fix the
> issue, though I'm not an expert on this code so I'm not sure.
> 
> I have not even tested this patch, I have only compiled it.  But from
> what I can tell, things waiting in NMIs for a shootdown should call
> run_crash_ipi_callback() in their wait loop.
> 
>  arch/x86/kernel/cpu/mce/core.c | 67 ++
>  1 file changed, 44 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index f43a78bde670..3a842b3773b3 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -282,20 +282,35 @@ static int fake_panic;
>  static atomic_t mce_fake_panicked;
>  
>  /* Panic in progress. Enable interrupts and wait for final IPI */
> -static void wait_for_panic(void)
> +static void wait_for_panic(struct pt_regs *regs)
>  {
>   long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
>  
>   preempt_disable();
>   local_irq_enable();
> - while (timeout-- > 0)
> + while (timeout-- > 0) {
> + /*
> +  * We are in an NMI waiting to be stopped by the
> +  * handing processor.  For kdump handling, we need to
> +  * be monitoring crash_ipi_issued since that is what
> +  * is used for an NMI stop used by kdump.  But we also
> +  * need to have interrupts enabled some so that
> +  * RESET_VECTOR will interrupt us on a normal
> +  * shutdown.
> +  */
> + local_irq_disable();
> + run_crash_ipi_callback(regs);
> + local_irq_enable();
> +
>   udelay(1);
> + }
>   if (panic_timeout == 0)
>   panic_timeout = mca_cfg.panic_timeout;
>   panic("Panicing machine check CPU died");
>  }
>  
> -static void mce_panic(const char *msg, struct mce *final, char *exp)
> +static void mce_panic(const char *msg, struct mce *final, char *exp,
> +   struct pt_regs *regs)
>  {
>   int apei_err = 0;
>   struct llist_node *pending;
> @@ -306,7 +321,7 @@ static void mce_panic(const char *msg, struct mce *final, 
> char *exp)
>* Make sure only one CPU runs in machine check panic
>*/
>   if (atomic_inc_return(_panicked) > 1)
> - wait_for_panic();
> + wait_for_panic(regs);
>   barrier();
>  
>   bust_spinlocks(1);
> @@ -817,7 +832,7 @@ static atomic_t mce_callin;
>  /*
>   * Check if a timeout waiting for other CPUs happened.
>   */
> -static int mce_timed_out(u64 *t, const char *msg)
> +static int mce_timed_out(u64 *t, const char *msg, struct pt_regs *regs)
>  {
>   /*
>* The others already did panic for some reason.
> @@ -827,12 +842,12 @@ static int mce_timed_out(u64 *t, const char *msg)
>*/
>   rmb();
>   if (atomic_read(&mce_panicked))
> - wait_for_panic();
> + wait_for_panic(regs);
>   if (!mca_cfg.monarch_timeout)
>   goto out;
>   if ((s64)*t < SPINUNIT) {
>   if (mca_cfg.tolerant <= 1)
> - mce_panic(msg, NULL, NULL);
> + mce_panic(msg, NULL, NULL, regs);
>   cpu_missing = 1;
>   return 1;
>   }
> @@ -866,7 +881,7 @@ static int mce_timed_out(u64 *t, const char *msg)
>   * All the spin loops have timeouts; when a timeout happens a CPU
>   * typically elects itself to be Monarch.
>   */
> -static void mce_reign(void)
> +static void mce_reign(struct pt_regs *regs)
>  {
>   int cpu;
>   struct mce *m = NULL;
> @@ -896,7 +911,7 @@ static void mce_reign(void)
>* other

Re: [RFC PATCH V2] ipmi: ssif: Fix out of bounds in write_next_byte()

2020-09-22 Thread Corey Minyard
On Tue, Sep 22, 2020 at 08:31:44AM -0500, Corey Minyard wrote:
> On Mon, Sep 21, 2020 at 10:00:08PM +0800, Wu Bo wrote:
> > In my virtual machine (have 4 cpus), Use mce_inject to inject errors
> > into the system. After mce-inject injects an uncorrectable error,
> > there is a probability that the virtual machine is not reset immediately,
> > but hangs for more than 3000 seconds, and appeared unable to
> > handle kernel paging request.
> > 
> > The analysis reasons are as follows:
> > 1) MCE appears on all CPUs, Currently all CPUs are in the NMI interrupt
> >context. cpu0 is the first to seize the opportunity to run panic
> >routines, and panic event should stop the other processors before
> >do ipmi flush_messages(). but cpu1, cpu2 and cpu3 has already
> >in NMI interrupt context, So the Second NMI interrupt(IPI)
> >will not be processed again by cpu1, cpu2 and cpu3.
> >At this time, cpu1,cpu2 and cpu3 did not stopped.
> > 
> > 2) cpu1, cpu2 and cpu3 are waitting for cpu0 to finish the panic process.
> >if a timeout waiting for other CPUs happened, do wait_for_panic(),
> >the irq is enabled in the wait_for_panic() function.
> > 
> > 3) ipmi IRQ occurs on the cpu3, and the cpu0 is doing the panic,
> >they have the opportunity to call the smi_event_handler()
> >function concurrently. the ipmi IRQ affects the panic process of cpu0.
> > 
> >   CPU0CPU3
> > 
> >|-nmi_handle do mce_panic   |-nmi_handle do_machine_check
> >|   |
> >|-panic()   |-wait_for_panic()
> >|   |
> >|-stop other cpus  NMI --> (Ignore, already in nmi interrupt)
> 
> There is a step that happens before this.  In native_stop_other_cpus()
> it uses the REBOOT_VECTOR irq to stop the other CPUs before it does the
> NMI.
> 
> The question is: Why isn't that working?  That's why irqs are enabled in
> wait_for_panic, so this REBOOT_VECTOR will work.
> 
> Again, changing the IPMI driver to account for this is ignoring the root
> problem, and the root problem will cause other issues.
> 
> You mention you are running in a VM, but you don't mention which one.
> Maybe the problem is in the VM?  I can't see how this is an issue unless
> you are not using native_stop_other_cpus() (using Xen?) or you have
> other kernel code changes.

I looked a bit more, and I'm guessing you are using kdump.
kdump_nmi_shootdown_cpus() does not have the same handling for sending
the REBOOT_VECTOR interrupts as native_stop_other_cpus().

The kdump version of the code is not going to work with an MCE as it
only uses the NMI, and the NMI is not going to work.

I'm still not sure of the right way to fix this.  I can see two options:

* Do the REBOOT_VECTOR handling in kdump_nmi_shootdown_cpus().

* In mce_panic, detect if you are going to do a kdump and handle it
  appropriately there.

We really need to get the x86 maintainers to see the issue and fix it.

-corey

> 
> -corey
> 
> >|   |
> >|-notifier call(ipmi panic_event)   |<-ipmi IRQ occurs
> >|   |
> >   \|/ \|/
> > do smi_event_handler() do smi_event_handler()
> > 
> > 
> > The current scene encountered is a simulation injection error of the mce 
> > software, 
> > and it is not confirmed whether there are other similar scenes. 
> > 
> > so add the try spinlocks in the IPMI panic handler to solve the concurrency 
> > problem of 
> > panic process and IRQ process, and also to prevent the panic process from 
> > deadlock.
> > 
> > Steps to reproduce (Have a certain probability):
> > 1. # vim /tmp/uncorrected
> > CPU 1 BANK 4
> > STATUS uncorrected 0xc0
> > MCGSTATUS  EIPV MCIP
> > ADDR 0x1234
> > RIP 0xdeadbabe
> > RAISINGCPU 0
> > MCGCAP SER CMCI TES 0x6
> > 
> > 2. # modprobe mce_inject
> > 3. # cd /tmp
> > 4. # mce-inject uncorrected
> > 
> > The logs:
> > [   55.086670] core: [Hardware Error]: RIP 00:<deadbabe>
> > [   55.086671] core: [Hardware Error]: TSC 2e11aff65eea ADDR 1234
> > [   55.086673] core: [Hardware Error]: PROCESSOR 0:50654 TIME 1598967234 
> > SOCKET 0 APIC 1 microcode 1
> > [   55.086674] core: [Hardware Error]: Run the above through 'mcelog 
> > --ascii'
> > [   55.086675] core: [Hardware Error]: Machine check: In kernel and no 
> > restart IP
> > [   55.086676]

Re: [RFC PATCH V2] ipmi: ssif: Fix out of bounds in write_next_byte()

2020-09-22 Thread Corey Minyard
On Mon, Sep 21, 2020 at 10:00:08PM +0800, Wu Bo wrote:
> In my virtual machine (have 4 cpus), Use mce_inject to inject errors
> into the system. After mce-inject injects an uncorrectable error,
> there is a probability that the virtual machine is not reset immediately,
> but hangs for more than 3000 seconds, and appeared unable to
> handle kernel paging request.
> 
> The analysis reasons are as follows:
> 1) MCE appears on all CPUs, Currently all CPUs are in the NMI interrupt
>context. cpu0 is the first to seize the opportunity to run panic
>routines, and panic event should stop the other processors before
>do ipmi flush_messages(). but cpu1, cpu2 and cpu3 has already
>in NMI interrupt context, So the Second NMI interrupt(IPI)
>will not be processed again by cpu1, cpu2 and cpu3.
>At this time, cpu1,cpu2 and cpu3 did not stopped.
> 
> 2) cpu1, cpu2 and cpu3 are waitting for cpu0 to finish the panic process.
>if a timeout waiting for other CPUs happened, do wait_for_panic(),
>the irq is enabled in the wait_for_panic() function.
> 
> 3) ipmi IRQ occurs on the cpu3, and the cpu0 is doing the panic,
>they have the opportunity to call the smi_event_handler()
>function concurrently. the ipmi IRQ affects the panic process of cpu0.
> 
>   CPU0CPU3
> 
>|-nmi_handle do mce_panic   |-nmi_handle do_machine_check
>|   |
>|-panic()   |-wait_for_panic()
>|   |
>|-stop other cpus  NMI --> (Ignore, already in nmi interrupt)

There is a step that happens before this.  In native_stop_other_cpus()
it uses the REBOOT_VECTOR irq to stop the other CPUs before it does the
NMI.

The question is: Why isn't that working?  That's why irqs are enabled in
wait_for_panic, so this REBOOT_VECTOR will work.

Again, changing the IPMI driver to account for this is ignoring the root
problem, and the root problem will cause other issues.

You mention you are running in a VM, but you don't mention which one.
Maybe the problem is in the VM?  I can't see how this is an issue unless
you are not using native_stop_other_cpus() (using Xen?) or you have
other kernel code changes.

-corey

>|   |
>|-notifier call(ipmi panic_event)   |<-ipmi IRQ occurs
>|   |
>   \|/ \|/
> do smi_event_handler() do smi_event_handler()
> 
> 
> The current scene encountered is a simulation injection error of the mce 
> software, 
> and it is not confirmed whether there are other similar scenes. 
> 
> so add the try spinlocks in the IPMI panic handler to solve the concurrency 
> problem of 
> panic process and IRQ process, and also to prevent the panic process from 
> deadlock.
> 
> Steps to reproduce (Have a certain probability):
> 1. # vim /tmp/uncorrected
> CPU 1 BANK 4
> STATUS uncorrected 0xc0
> MCGSTATUS  EIPV MCIP
> ADDR 0x1234
> RIP 0xdeadbabe
> RAISINGCPU 0
> MCGCAP SER CMCI TES 0x6
> 
> 2. # modprobe mce_inject
> 3. # cd /tmp
> 4. # mce-inject uncorrected
> 
> The logs:
> [   55.086670] core: [Hardware Error]: RIP 00:
> [   55.086671] core: [Hardware Error]: TSC 2e11aff65eea ADDR 1234
> [   55.086673] core: [Hardware Error]: PROCESSOR 0:50654 TIME 1598967234 
> SOCKET 0 APIC 1 microcode 1
> [   55.086674] core: [Hardware Error]: Run the above through 'mcelog --ascii'
> [   55.086675] core: [Hardware Error]: Machine check: In kernel and no 
> restart IP
> [   55.086676] Kernel panic - not syncing: Fatal machine check
> [   55.086677] kernel fault(0x5) notification starting on CPU 0
> [   55.086682] kernel fault(0x5) notification finished on CPU 0
> [ 4767.947960] BUG: unable to handle kernel paging request at 893e4000
> [ 4767.947962] PGD 13c001067 P4D 13c001067 PUD 0
> [ 4767.947965] Oops:  [#1] SMP PTI
> [ 4767.947967] CPU: 0 PID: 0 Comm: swapper/0
> [ 4767.947968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.10.2-0-g5f4c7b1-20181220_00-szxrtosci1 04/01/2014
> [ 4767.947972] RIP: 0010:kcs_event+0x3c2/0x890 [ipmi_si]
> [ 4767.947974] Code: 74 0e 48 8b 7b 08 31 f6 48 8b 07 e8 98 4f 44 cd 83 bb 24 
> 01
> [ 4767.947975] RSP: 0018:fe007658 EFLAGS: 00010046
> [ 4767.947976] RAX: 0c7c5ff0 RBX: 893e3383a000 RCX: 
> 
> [ 4767.947976] RDX: 0ca2 RSI:  RDI: 
> 893e2fdf6e40
> [ 4767.947977] RBP: 0001 R08:  R09: 
> 0a35
> [ 4767.947978] R10: 0002 R11: 0006 R12: 
> 
> [ 4767.947978] R13: fe007b28 R14: 893e34bd R15: 
> 
> [ 4767.947979] FS:  () GS:893e3ec0() 
> knlGS:
> [ 4767.947980] CS:  0010 DS:  ES:  CR0: 80050033
> [ 

Re: [RFC PATCH] mce: don't not enable IRQ in wait_for_panic()

2020-09-17 Thread Corey Minyard
On Thu, Sep 17, 2020 at 06:37:50PM +0800, Wu Bo wrote:
> In my virtual machine (have 4 cpus), Use mce_inject to inject errors
> into the system. After mce-inject injects an uncorrectable error, 
> there is a probability that the virtual machine is not reset immediately, 
> but hangs for more than 3000 seconds, and appeared unable to 
> handle kernel paging request.
> 
> The analysis reasons are as follows:
> 1) MCE appears on all CPUs, Currently all CPUs are in the NMI interrupt 
>context. cpu0 is the first to seize the opportunity to run panic 
>routines, and panic event should stop the other processors before 
>do ipmi flush_messages(). but cpu1, cpu2 and cpu3 has already 
>in NMI interrupt context, So the Second NMI interrupt(IPI) 
>will not be processed again by cpu1, cpu2 and cpu3.
>At this time, cpu1,cpu2 and cpu3 did not stopped.
> 
> 2) cpu1, cpu2 and cpu3 are waitting for cpu0 to finish the panic process. 
>if a timeout waiting for other CPUs happened, do wait_for_panic(), 
>the irq is enabled in the wait_for_panic() function.
> 
> 3) ipmi IRQ occurs on the cpu3, and the cpu0 is doing the panic, 
>they have the opportunity to call the smi_event_handler() 
>function concurrently. the ipmi IRQ affects the panic process of cpu0.
> 
>   CPU0CPU3
> 
>|-nmi_handle do mce_panic   |-nmi_handle do_machine_check
>|   |
>|-panic()   |-wait_for_panic()
>|   |
>|-stop other cpus  NMI --> (Ignore, already in nmi interrupt)
>|   |
>|-notifier call(ipmi panic_event)   |<-ipmi IRQ occurs
>|   |
>   \|/ \|/
> do smi_event_handler() do smi_event_handler()
> 
> My understanding is that panic() runs with only one operational CPU
> in the system and the other CPUs should be stopped. If the other CPUs do
> not stop, at least IRQs should be disabled on them. On x86, disabling
> IRQs will not affect the IPI during an MCE panic, because the IPI is
> delivered as an NMI. If my analysis is not right, please correct me,
> thanks.

I'm not sure this is the right fix, but I'm not sure what the right fix
is.  I think this will prevent the other CPUs from being interrupted to
disable them in some cases.

The group at Huawei has an issue with the IPMI driver's panic handling
running while the other CPUs are sitting in "wait_for_panic()" with
interrupts on, and IPMI interrupts interfering with the panic handling,
as they describe above.

It is my understanding that in a panic all other CPUs should be in a
state where they won't do anything and won't take interrupts.  Is that
a correct assumption?  If not, I have some work to do.

-corey

> 
> Steps to reproduce (Have a certain probability):
> 1. # vim /tmp/uncorrected
> CPU 1 BANK 4
> STATUS uncorrected 0xc0
> MCGSTATUS  EIPV MCIP
> ADDR 0x1234
> RIP 0xdeadbabe
> RAISINGCPU 0
> MCGCAP SER CMCI TES 0x6
>  
> 2. # modprobe mce_inject
> 3. # cd /tmp
> 4. # mce-inject uncorrected
> 
> The logs:
> [   55.086670] core: [Hardware Error]: RIP 00:
> [   55.086671] core: [Hardware Error]: TSC 2e11aff65eea ADDR 1234
> [   55.086673] core: [Hardware Error]: PROCESSOR 0:50654 TIME 1598967234 
> SOCKET 0 APIC 1 microcode 1
> [   55.086674] core: [Hardware Error]: Run the above through 'mcelog --ascii'
> [   55.086675] core: [Hardware Error]: Machine check: In kernel and no 
> restart IP
> [   55.086676] Kernel panic - not syncing: Fatal machine check
> [   55.086677] kernel fault(0x5) notification starting on CPU 0
> [   55.086682] kernel fault(0x5) notification finished on CPU 0
> [ 4767.947960] BUG: unable to handle kernel paging request at 893e4000
> [ 4767.947962] PGD 13c001067 P4D 13c001067 PUD 0
> [ 4767.947965] Oops:  [#1] SMP PTI
> [ 4767.947967] CPU: 0 PID: 0 Comm: swapper/0
> [ 4767.947968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.10.2-0-g5f4c7b1-20181220_00-szxrtosci1 04/01/2014
> [ 4767.947972] RIP: 0010:kcs_event+0x3c2/0x890 [ipmi_si]
> [ 4767.947974] Code: 74 0e 48 8b 7b 08 31 f6 48 8b 07 e8 98 4f 44 cd 83 bb 24 
> 01
> [ 4767.947975] RSP: 0018:fe007658 EFLAGS: 00010046
> [ 4767.947976] RAX: 0c7c5ff0 RBX: 893e3383a000 RCX: 
> 
> [ 4767.947976] RDX: 0ca2 RSI:  RDI: 
> 893e2fdf6e40
> [ 4767.947977] RBP: 0001 R08:  R09: 
> 0a35
> [ 4767.947978] R10: 0002 R11: 0006 R12: 
> 
> [ 4767.947978] R13: fe007b28 R14: 893e34bd R15: 
> 
> [ 4767.947979] FS:  () GS:893e3ec0() 
> knlGS:
> [ 4767.947980] CS:  0010 DS:  ES:  CR0: 

Re: [PATCH] ipmi: add retry in try_get_dev_id()

2020-09-16 Thread Corey Minyard
On Wed, Sep 16, 2020 at 02:21:29PM +0800, Xianting Tian wrote:
> Use a retry mechanism to give the device more opportunities to respond
> correctly to the kernel when we receive specific completion codes.
> 
> This is similar to what we have done in __get_device_id().

Thanks.  I moved GET_DEVICE_ID_MAX_RETRY to include/linux/ipmi.h since
uapi is for things exported to userspace.  But this is good, it's in my
next tree.

-corey

> 
> Signed-off-by: Xianting Tian 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c |  2 --
>  drivers/char/ipmi/ipmi_si_intf.c| 17 +
>  include/uapi/linux/ipmi.h   |  2 ++
>  3 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index b9685093e..75cb7e062 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -62,8 +62,6 @@ enum ipmi_panic_event_op {
>  #define IPMI_PANIC_DEFAULT IPMI_SEND_PANIC_EVENT_NONE
>  #endif
>  
> -#define GET_DEVICE_ID_MAX_RETRY  5
> -
>  static enum ipmi_panic_event_op ipmi_send_panic_event = IPMI_PANIC_DEFAULT;
>  
>  static int panic_op_write_handler(const char *val,
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index 77b8d551a..beeb705f1 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1316,6 +1316,7 @@ static int try_get_dev_id(struct smi_info *smi_info)
>   unsigned char *resp;
>   unsigned long resp_len;
>   int   rv = 0;
> + unsigned int  retry_count = 0;
>  
>   resp = kmalloc(IPMI_MAX_MSG_LENGTH, GFP_KERNEL);
>   if (!resp)
> @@ -1327,6 +1328,8 @@ static int try_get_dev_id(struct smi_info *smi_info)
>*/
>   msg[0] = IPMI_NETFN_APP_REQUEST << 2;
>   msg[1] = IPMI_GET_DEVICE_ID_CMD;
> +
> +retry:
>   smi_info->handlers->start_transaction(smi_info->si_sm, msg, 2);
>  
>   rv = wait_for_msg_done(smi_info);
> @@ -1339,6 +1342,20 @@ static int try_get_dev_id(struct smi_info *smi_info)
>   /* Check and record info from the get device id, in case we need it. */
>   rv = ipmi_demangle_device_id(resp[0] >> 2, resp[1],
>   resp + 2, resp_len - 2, &smi_info->device_id);
> + if (rv) {
> + /* record completion code */
> + char cc = *(resp + 2);
> +
> + if ((cc == IPMI_DEVICE_IN_FW_UPDATE_ERR
> + || cc == IPMI_DEVICE_IN_INIT_ERR
> + || cc == IPMI_NOT_IN_MY_STATE_ERR)
> + && ++retry_count <= GET_DEVICE_ID_MAX_RETRY) {
> + dev_warn(smi_info->io.dev,
> + "retry to get device id as completion code 
> 0x%x\n",
> +  cc);
> + goto retry;
> + }
> + }
>  
>  out:
>   kfree(resp);
> diff --git a/include/uapi/linux/ipmi.h b/include/uapi/linux/ipmi.h
> index 32d148309..bc57f07e3 100644
> --- a/include/uapi/linux/ipmi.h
> +++ b/include/uapi/linux/ipmi.h
> @@ -426,4 +426,6 @@ struct ipmi_timing_parms {
>  #define IPMICTL_GET_MAINTENANCE_MODE_CMD _IOR(IPMI_IOC_MAGIC, 30, int)
>  #define IPMICTL_SET_MAINTENANCE_MODE_CMD _IOW(IPMI_IOC_MAGIC, 31, int)
>  
> +#define GET_DEVICE_ID_MAX_RETRY  5
> +
>  #endif /* _UAPI__LINUX_IPMI_H */
> -- 
> 2.17.1
> 


Re: [PATCH] [v2] ipmi: retry to get device id when error

2020-09-15 Thread Corey Minyard
On Tue, Sep 15, 2020 at 09:40:02AM +, Tianxianting wrote:
> Hi Corey,
> Thanks for your comments,
> Please review these two patches, which are based on your guide.
> 1, [PATCH] ipmi: print current state when error
> https://lkml.org/lkml/2020/9/15/183 
> 2, [PATCH] [v3] ipmi: retry to get device id when error
> https://lkml.org/lkml/2020/9/15/156 

Patches are applied and in my next tree.

> 
> As you said "You are having the same issue in the ipmi_si code. It's choosing 
> defaults, but that's not ideal.  You probably need to handle this there, too, 
> in a separate patch."
> I am not sure whether I grasped what you said, 
> The print ' device id demangle failed: -22' in commit message, is just 
> triggered by bmc_device_id_handler->ipmi_demangle_device_id, this is the 
> issue we met and is solving.
> I found try_get_dev_id(in drivers/char/ipmi/ipmi_si_intf.c) also called 
> ipmi_demangle_device_id(), do you mean if this ipmi_demangle_device_id() 
> returned error, we also need to retry?

Yes, I think retrying in try_get_dev_id() would be a good idea.  You are
probably getting sub-optimal performance if you don't do this.

Thanks,

-corey

> 
> Thanks a lot.
> 
> -Original Message-
> From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
> Sent: Monday, September 14, 2020 11:40 PM
> To: tianxianting (RD) 
> Cc: a...@arndb.de; gre...@linuxfoundation.org; 
> openipmi-develo...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] [v2] ipmi: retry to get device id when error
> 
> On Mon, Sep 14, 2020 at 04:13:13PM +0800, Xianting Tian wrote:
> > We can't get bmc's device id with low probability when loading ipmi 
> > driver, it caused bmc device register failed. When this issue 
> > happened, we got below kernel printks:
> 
> This patch is moving in the right direction.  For the final patch(es), I can 
> clean up the english grammar issues, since that's not your native language.  
> A few comments:
> 
> > [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler: 
> > device id demangle failed: -22
> 
> You are having the same issue in the ipmi_si code.  It's choosing defaults, 
> but that's not ideal.  You probably need to handle this there, too, in a 
> separate patch.
> 
> Can you create a separate patch to add a dev_warn() to the BT code when it 
> returns IPMI_NOT_IN_MY_STATE_ERR, like I asked previously?  And print the 
> current state when it happens.  That way we know where this issue is coming 
> from and possibly fix the state machine.  I'm thinking that the BMC is just 
> not responding, but I'd like to be sure.
> 
> Other comments inline...
> 
> > [Wed Sep  9 19:52:03 2020] IPMI BT: using default values
> > [Wed Sep  9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2
> > [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the device 
> > id: -5
> > [Wed Sep  9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register 
> > device: error -5
> > 
> > When this issue happened, we want to manually unload the driver and 
> > try to load it again, but it can't be unloaded by 'rmmod' as it is already 
> > 'in use'.
> > 
> > We add below 'printk' in handle_one_recv_msg(), when this issue 
> > happened, the msg we received is "Recv: 1c 01 d5", which means the 
> > data_len is 1, data[0] is 0xd5(completion code), which means "bmc cannot 
> > execute command.
> > Command, or request parameter(s), not supported in present state".
> > Debug code:
> > static int handle_one_recv_msg(struct ipmi_smi *intf,
> >struct ipmi_smi_msg *msg) {
> > printk("Recv: %*ph\n", msg->rsp_size, msg->rsp);
> > ... ...
> > }
> > Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7'
> > and 'data[0] != 0'.
> > 
> > We used this patch to retry to get device id when error happen, we 
> > reproduced this issue again and the retry succeed on the first retry, 
> > we finally got the correct msg and then all is ok:
> > Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00
> > 
> > So use retry machanism in this patch to give bmc more opportunity to 
> > correctly response kernel when we received specific completion code.
> > 
> > Signed-off-by: Xianting Tian 
> > ---
> >  drivers/char/ipmi/ipmi_msghandler.c | 29 +
> >  include/uapi/linux/ipmi_msgdefs.h   |  2 ++
> >  2 files changed, 27 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/char/ipmi/i

Re: [PATCH] [v2] ipmi: retry to get device id when error

2020-09-14 Thread Corey Minyard
On Mon, Sep 14, 2020 at 04:13:13PM +0800, Xianting Tian wrote:
> We can't get bmc's device id with low probability when loading ipmi driver,
> it caused bmc device register failed. When this issue happened, we got
> below kernel printks:

This patch is moving in the right direction.  For the final patch(es), I
can clean up the English grammar issues, since that's not your native
language.  A few comments:

>   [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler: 
> device id demangle failed: -22

You are having the same issue in the ipmi_si code.  It's choosing
defaults, but that's not ideal.  You probably need to handle this
there, too, in a separate patch.

Can you create a separate patch to add a dev_warn() to the BT code when
it returns IPMI_NOT_IN_MY_STATE_ERR, like I asked previously?  And print
the current state when it happens.  That way we know where this issue is
coming from and possibly fix the state machine.  I'm thinking that the
BMC is just not responding, but I'd like to be sure.

Other comments inline...

>   [Wed Sep  9 19:52:03 2020] IPMI BT: using default values
>   [Wed Sep  9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2
>   [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the device 
> id: -5
>   [Wed Sep  9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register 
> device: error -5
> 
> When this issue happened, we want to manually unload the driver and try to
> load it again, but it can't be unloaded by 'rmmod' as it is already 'in use'.
> 
> We add below 'printk' in handle_one_recv_msg(), when this issue happened,
> the msg we received is "Recv: 1c 01 d5", which means the data_len is 1,
> data[0] is 0xd5(completion code), which means "bmc cannot execute command.
> Command, or request parameter(s), not supported in present state".
>   Debug code:
>   static int handle_one_recv_msg(struct ipmi_smi *intf,
>struct ipmi_smi_msg *msg) {
>   printk("Recv: %*ph\n", msg->rsp_size, msg->rsp);
>   ... ...
>   }
> Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7'
> and 'data[0] != 0'.
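
The decoding described above can be sketched in plain userspace C.  This is
a minimal sketch, not the kernel's ipmi_demangle_device_id(): the function
name check_get_device_id_rsp() and the two macro names are assumptions made
for illustration, and only the checks mentioned in the text (netfn/cmd,
data_len < 7, data[0] != 0) are modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Values taken from the log: 0x1c >> 2 == 0x07 (App response), cmd 0x01. */
#define NETFN_APP_RESPONSE 0x07 /* illustrative name */
#define GET_DEVICE_ID_CMD  0x01 /* illustrative name */

/*
 * Hypothetical helper: rsp is the raw message as printed by
 * "Recv: %*ph", i.e. rsp[0] = netfn << 2, rsp[1] = cmd, rsp[2..] = data.
 * Returns 0 if the payload looks like a usable Get Device ID response,
 * -EINVAL otherwise.
 */
static int check_get_device_id_rsp(const unsigned char *rsp, size_t rsp_size)
{
	const unsigned char *data;
	size_t data_len;

	if (rsp_size < 2)
		return -EINVAL;
	if ((rsp[0] >> 2) != NETFN_APP_RESPONSE || rsp[1] != GET_DEVICE_ID_CMD)
		return -EINVAL;

	data = rsp + 2;
	data_len = rsp_size - 2;

	/* Too short to hold a device id ("1c 01 d5" has data_len == 1). */
	if (data_len < 7)
		return -EINVAL;
	/* data[0] is the completion code; non-zero means the command failed. */
	if (data[0] != 0)
		return -EINVAL;

	return 0;
}
```

For the failing message "1c 01 d5" this returns -EINVAL, and for the
18-byte response received after the retry it returns 0.
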
> 
> We used this patch to retry to get device id when error happen, we
> reproduced this issue again and the retry succeed on the first retry, we
> finally got the correct msg and then all is ok:
> Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00
> 
> So use retry machanism in this patch to give bmc more opportunity to
> correctly response kernel when we received specific completion code.
> 
> Signed-off-by: Xianting Tian 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 29 +
>  include/uapi/linux/ipmi_msgdefs.h   |  2 ++
>  2 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6b2..07d5be2cd 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define IPMI_DRIVER_VERSION "39.2"
>  
> @@ -60,6 +61,9 @@ enum ipmi_panic_event_op {
>  #else
>  #define IPMI_PANIC_DEFAULT IPMI_SEND_PANIC_EVENT_NONE
>  #endif
> +
> +#define GET_DEVICE_ID_MAX_RETRY  5
> +
>  static enum ipmi_panic_event_op ipmi_send_panic_event = IPMI_PANIC_DEFAULT;
>  
>  static int panic_op_write_handler(const char *val,
> @@ -317,6 +321,7 @@ struct bmc_device {
>   intdyn_guid_set;
>   struct krefusecount;
>   struct work_struct remove_work;
> + char   cc; /* completion code */
>  };
>  #define to_bmc_device(x) container_of((x), struct bmc_device, pdev.dev)
>  
> @@ -2381,6 +2386,8 @@ static void bmc_device_id_handler(struct ipmi_smi *intf,
>   msg->msg.data, msg->msg.data_len, &intf->bmc->fetch_id);
>   if (rv) {
>   dev_warn(intf->si_dev, "device id demangle failed: %d\n", rv);
> + /* record completion code when error */
> + intf->bmc->cc = msg->msg.data[0];
>   intf->bmc->dyn_id_set = 0;
>   } else {
>   /*
> @@ -2426,19 +2433,34 @@ send_get_device_id_cmd(struct ipmi_smi *intf)
>  static int __get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc)
>  {
>   int rv;
> -
> - bmc->dyn_id_set = 2;
> + unsigned int retry_count = 0;

You need to initialize bmc->cc to 0 here.

>  
>   intf->null_user_handler = bmc_device_id_handler;
>  
> +retry:
> + bmc->dyn_id_set = 2;
> +
>   rv = send_get_device_id_cmd(intf);
>   if (rv)
>   return rv;
>  
>   wait_event(intf->waitq, bmc->dyn_id_set != 2);
>  
> - if (!bmc->dyn_id_set)
> + if (!bmc->dyn_id_set) {
> + if ((bmc->cc == IPMI_NOT_IN_MY_STATE_ERR
> +  || bmc->cc == IPMI_NOT_IN_MY_STATE_ERR_1
> +  || bmc->cc == IPMI_NOT_IN_MY_STATE_ERR_2)
> + 

Re: [PATCH] ipmi: retry to get device id when error

2020-09-13 Thread Corey Minyard
On Sun, Sep 13, 2020 at 02:10:01PM +, Tianxianting wrote:
> Hi Corey
> Thanks for your quick reply,
> We didn't try the method you mentioned; actually, I didn't know about it
> before you told me:(
> The issue has occurred on 2 of our ceph storage servers, both with low
> probability.
> We finally used this patch to solve the issue; it recovers automatically
> when the issue happens, so there is no need to detect it and reload the ipmi
> driver manually or develop additional scripts to do it.
> The 1 second delay is acceptable to us.
> 
> If there really isn't a BMC out there, the ipmi driver will not be loaded, is
> that right?

No, there are systems that specify an IPMI controller in the
ACPI/SMBIOS tables but have no IPMI controller.

> 
> Maybe we can adjust it to retry 3 times with a 500ms interval?

Maybe there is another way.  What I'm guessing is happening is not that
the interface is lossy, but that the BMC is in the middle of a reset or
an upgrade.  The D5 completion code means: Cannot execute command. Command,
or request parameter(s), not supported in present state.

That error is also returned from bt_start_transaction() in the driver if
the driver is not in the right state when starting a transaction,
which may point to a bug in the BT state machine.  Search for
IPMI_NOT_IN_MY_STATE_ERR in drivers/char/ipmi/ipmi_bt_sm.c.

If it's coming from the state machine, it would be handy to know what
state it was in when the error happened to help trace the bug down.
That's what I would suggest first.  Fix the fundamental bug, if you can.
A pr_warn() added to the code there would be all that's needed, and
that's probably a good permanent addition.

I would be ok with a patch that retried some number of times if it got a
D5 completion code.  That wouldn't have any effect on other systems.
Plus you could add a D1 and D2 completion code, which are similar
things.  Add the new completion codes to include/uapi/linux/ipmi_msgdefs.h
and implement the retry.  That makes sense from a general point of view.
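
A minimal sketch of that retry idea, in plain userspace C rather than
driver code.  The D1/D2/D5 values are the completion codes named above;
the names cc_is_retryable(), get_device_id_with_retry(), the scripted_bmc
stub and the limit of 5 retries are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Completion codes discussed above (names are illustrative). */
#define CC_DEVICE_IN_FW_UPDATE 0xd1
#define CC_DEVICE_IN_INIT      0xd2
#define CC_NOT_IN_MY_STATE     0xd5

#define MAX_RETRY 5 /* assumed retry limit */

/* Only these "try again later" codes trigger a retry. */
static bool cc_is_retryable(unsigned char cc)
{
	return cc == CC_DEVICE_IN_FW_UPDATE ||
	       cc == CC_DEVICE_IN_INIT ||
	       cc == CC_NOT_IN_MY_STATE;
}

/* Hypothetical transport hook: one Get Device ID exchange, returns the cc. */
typedef unsigned char (*get_id_fn)(void *ctx);

/*
 * Repeat the exchange while the BMC answers with a retryable code, at
 * most MAX_RETRY extra times.  Returns the last completion code seen and
 * stores the number of exchanges performed in *attempts.
 */
static unsigned char get_device_id_with_retry(get_id_fn xfer, void *ctx,
					       unsigned int *attempts)
{
	unsigned int retries = 0;
	unsigned char cc;

	for (;;) {
		cc = xfer(ctx);
		if (!cc_is_retryable(cc) || retries >= MAX_RETRY)
			break;
		retries++;
	}
	if (attempts)
		*attempts = retries + 1;
	return cc;
}

/* Stub BMC that replays a fixed list of codes, repeating the last one. */
struct scripted_bmc {
	const unsigned char *codes;
	size_t len;
	size_t pos;
};

static unsigned char scripted_xfer(void *ctx)
{
	struct scripted_bmc *bmc = ctx;
	unsigned char cc = bmc->codes[bmc->pos];

	if (bmc->pos + 1 < bmc->len)
		bmc->pos++;
	return cc;
}
```

Because only those three codes are retried, a system that answers with any
other error, or one that succeeds right away, sees exactly one exchange.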

Thanks,

-corey

> 
> Thanks in advance if you can feedback again.
> 
> -Original Message-
> From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
> Sent: Sunday, September 13, 2020 8:40 PM
> To: tianxianting (RD) 
> Cc: a...@arndb.de; gre...@linuxfoundation.org; 
> openipmi-develo...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ipmi: retry to get device id when error
> 
> On Sun, Sep 13, 2020 at 08:02:03PM +0800, Xianting Tian wrote:
> > We can't get bmc's device id with low probability when loading ipmi 
> > driver, it caused bmc device register failed. This issue may caused by 
> > bad lpc signal quality. When this issue happened, we got below kernel 
> > printks:
> > [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler: 
> > device id demangle failed: -22
> > [Wed Sep  9 19:52:03 2020] IPMI BT: using default values
> > [Wed Sep  9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2
> > [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the device 
> > id: -5
> > [Wed Sep  9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register 
> > device: error -5
> > 
> > When this issue happened, we want to manually unload the driver and 
> > try to load it again, but it can't be unloaded by 'rmmod' as it is already 
> > 'in use'.
> 
> I'm not sure this patch is a good idea; it would cause a long boot delay in 
> situations where there really isn't a BMC out there.  Yes, it happens.
> 
> > You don't have to reload the driver to add a device, though.  You can hot-add 
> > devices using /sys/module/ipmi_si/parameters/hotmod.  Look in 
> > Documentation/driver-api/ipmi.rst for details.
> 
> Does that work for you?
> 
> -corey
> 
> > 
> > We add below 'printk' in handle_one_recv_msg(), when this issue 
> > happened, the msg we received is "Recv: 1c 01 d5", which means the 
> > data_len is 1, data[0] is 0xd5.
> > Debug code:
> > static int handle_one_recv_msg(struct ipmi_smi *intf,
> >struct ipmi_smi_msg *msg) {
> > printk("Recv: %*ph\n", msg->rsp_size, msg->rsp);
> > ... ...
> > }
> > Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7'
> > and 'data[0] != 0'.
> > 
> > We used this patch to retry to get device id when error happen, we 
> > reproduced this issue again and the retry succeed on the first retry, 
> > we finally got the correct msg and then all is ok:
> > Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00
> > 
> > So use retry machanism in this patch to give bmc

Re: [PATCH] ipmi: retry to get device id when error

2020-09-13 Thread Corey Minyard
On Sun, Sep 13, 2020 at 08:02:03PM +0800, Xianting Tian wrote:
> We can't get bmc's device id with low probability when loading ipmi driver,
> it caused bmc device register failed. This issue may caused by bad lpc
> signal quality. When this issue happened, we got below kernel printks:
>   [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler: 
> device id demangle failed: -22
>   [Wed Sep  9 19:52:03 2020] IPMI BT: using default values
>   [Wed Sep  9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2
>   [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the device 
> id: -5
>   [Wed Sep  9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register 
> device: error -5
> 
> When this issue happened, we want to manually unload the driver and try to
> load it again, but it can't be unloaded by 'rmmod' as it is already 'in use'.

I'm not sure this patch is a good idea; it would cause a long boot delay
in situations where there really isn't a BMC out there.  Yes, it
happens.

You don't have to reload the driver to add a device, though.  You can
hot-add devices using /sys/module/ipmi_si/parameters/hotmod.  Look in
Documentation/driver-api/ipmi.rst for details.

Does that work for you?

-corey

> 
> We add below 'printk' in handle_one_recv_msg(), when this issue happened,
> the msg we received is "Recv: 1c 01 d5", which means the data_len is 1,
> data[0] is 0xd5.
>   Debug code:
>   static int handle_one_recv_msg(struct ipmi_smi *intf,
>struct ipmi_smi_msg *msg) {
>   printk("Recv: %*ph\n", msg->rsp_size, msg->rsp);
>   ... ...
>   }
> Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7'
> and 'data[0] != 0'.
> 
> We used this patch to retry to get device id when error happen, we
> reproduced this issue again and the retry succeed on the first retry, we
> finally got the correct msg and then all is ok:
> Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00
> 
> So use retry machanism in this patch to give bmc more opportunity to
> correctly response kernel.
> 
> Signed-off-by: Xianting Tian 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6b2..bfb2de77a 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define IPMI_DRIVER_VERSION "39.2"
>  
> @@ -60,6 +61,9 @@ enum ipmi_panic_event_op {
>  #else
>  #define IPMI_PANIC_DEFAULT IPMI_SEND_PANIC_EVENT_NONE
>  #endif
> +
> +#define GET_DEVICE_ID_MAX_RETRY  5
> +
>  static enum ipmi_panic_event_op ipmi_send_panic_event = IPMI_PANIC_DEFAULT;
>  
>  static int panic_op_write_handler(const char *val,
> @@ -2426,19 +2430,26 @@ send_get_device_id_cmd(struct ipmi_smi *intf)
>  static int __get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc)
>  {
>   int rv;
> -
> - bmc->dyn_id_set = 2;
> + unsigned int retry_count = 0;
>  
>   intf->null_user_handler = bmc_device_id_handler;
>  
> +retry:
> + bmc->dyn_id_set = 2;
> +
>   rv = send_get_device_id_cmd(intf);
>   if (rv)
>   return rv;
>  
>   wait_event(intf->waitq, bmc->dyn_id_set != 2);
>  
> - if (!bmc->dyn_id_set)
> + if (!bmc->dyn_id_set) {
> + msleep(1000);
> + if (++retry_count <= GET_DEVICE_ID_MAX_RETRY)
> + goto retry;
> +
>   rv = -EIO; /* Something went wrong in the fetch. */
> + }
>  
>   /* dyn_id_set makes the id data available. */
>   smp_rmb();
> -- 
> 2.17.1
> 


Re: [PATCH 3/3] ipmi: Add timeout waiting for channel information

2020-09-07 Thread Corey Minyard
On Mon, Sep 07, 2020 at 06:25:37PM +0200, Markus Boehme wrote:
> We have observed hosts with misbehaving BMCs that receive a Get Channel
> Info command but don't respond. This leads to an indefinite wait in the
> ipmi_msghandler's __scan_channels function, showing up as hung task
> messages for modprobe.
> 
> Add a timeout waiting for the channel scan to complete. If the scan
> fails to complete within that time, treat that like IPMI 1.0 and only
> assume the presence of the primary IPMB channel at channel number 0.

This patch is a significant rewrite of the function.  This makes me a
little uncomfortable.  It's generally better to fix the bug in a minimal
patch.  It would be easier to read if you had two patches, one to
restructure the code and one to fix the bug.

One comment inline, but it doesn't matter, because...

While thinking about this, I realized an issue with these patches.
There should be timers in the lower layers that detect that the BMC does
not respond and should return an error response.  This is supposed to be
guaranteed by the lower layer, you shouldn't need timers here.  In fact,
if you abort with a timer here, you should get a lower layer response
later, causing other issues.

So, this is wrong.  If you are never getting a response, there is a bug
in the lower layer.  If you are not getting the error response as
quickly as you would like, I'm not sure what to do about that.

The first patch, of course, is an obvious bug fix.  I'll include that.

-corey

> 
> Signed-off-by: Stefan Nuernberger 
> Signed-off-by: Markus Boehme 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 72 
> -
>  1 file changed, 39 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 2a2e8b2..9de9ba6 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3315,46 +3315,52 @@ channel_handler(struct ipmi_smi *intf, struct 
> ipmi_recv_msg *msg)
>   */
>  static int __scan_channels(struct ipmi_smi *intf, struct ipmi_device_id *id)
>  {
> - int rv;
> + long rv;
> + unsigned int set;
>  
> - if (ipmi_version_major(id) > 1
> - || (ipmi_version_major(id) == 1
> - && ipmi_version_minor(id) >= 5)) {
> - unsigned int set;
> + if (ipmi_version_major(id) == 1 && ipmi_version_minor(id) < 5) {

This is incorrect, it will not correctly handle IPMI 0.x BMCs.  Yes,
they exist.
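
A quick way to see the difference, as a sketch only: the three helpers
below are hypothetical names, taking the major/minor numbers that
ipmi_version_major() and ipmi_version_minor() would return.  The first
is the original test, the second is the test as rewritten in this
patch, and the third is the exact negation of the original.

```c
#include <stdbool.h>

/* Original test: scan channels on IPMI 1.5 and later. */
static bool scan_original(unsigned int major, unsigned int minor)
{
	return major > 1 || (major == 1 && minor >= 5);
}

/* Test as rewritten in the patch: skip the scan only for 1.0 to 1.4. */
static bool scan_patched(unsigned int major, unsigned int minor)
{
	return !(major == 1 && minor < 5);
}

/* Exact negation of the original: also skips everything below 1.0. */
static bool scan_fixed(unsigned int major, unsigned int minor)
{
	return !(major < 1 || (major == 1 && minor < 5));
}
```

For a 0.9 BMC the original and the third helper both say "don't scan",
while the rewritten test says "scan".  For 1.0, 1.5 and 2.0 all three
agree.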

> + set = intf->curr_working_cset;
> + goto single_ipmb_channel;
> + }
>  
> - /*
> -  * Start scanning the channels to see what is
> -  * available.
> -  */
> - set = !intf->curr_working_cset;
> - intf->curr_working_cset = set;
> - memset(&intf->wchannels[set], 0,
> -sizeof(struct ipmi_channel_set));
> -
> - intf->null_user_handler = channel_handler;
> - intf->curr_channel = 0;
> - rv = send_channel_info_cmd(intf, 0);
> - if (rv) {
> - dev_warn(intf->si_dev,
> -  "Error sending channel information for channel 
> 0, %d\n",
> -  rv);
> - intf->null_user_handler = NULL;
> - return -EIO;
> - }
> + /*
> +  * Start scanning the channels to see what is
> +  * available.
> +  */
> + set = !intf->curr_working_cset;
> + intf->curr_working_cset = set;
> + memset(&intf->wchannels[set], 0, sizeof(struct ipmi_channel_set));
>  
> - /* Wait for the channel info to be read. */
> - wait_event(intf->waitq, intf->channels_ready);
> + intf->null_user_handler = channel_handler;
> + intf->curr_channel = 0;
> + rv = send_channel_info_cmd(intf, 0);
> + if (rv) {
> + dev_warn(intf->si_dev,
> +  "Error sending channel information for channel 0, 
> %ld\n",
> +  rv);
>   intf->null_user_handler = NULL;
> - } else {
> - unsigned int set = intf->curr_working_cset;
> + return -EIO;
> + }
>  
> - /* Assume a single IPMB channel at zero. */
> - intf->wchannels[set].c[0].medium = IPMI_CHANNEL_MEDIUM_IPMB;
> - intf->wchannels[set].c[0].protocol = IPMI_CHANNEL_PROTOCOL_IPMB;
> - intf->channel_list = intf->wchannels + set;
> - intf->channels_ready = true;
> + /* Wait for the channel info to be read. */
> + rv = wait_event_timeout(intf->waitq, intf->channels_ready, 5 * HZ);
> + if (rv == 0) {
> + dev_warn(intf->si_dev,
> +  "Timed out waiting for channel information. Assuming a 
> single IPMB channel at 0.\n");
> + goto single_ipmb_channel;
>   }
>  
> + goto out;
> +
> 

Re: [PATCH 2/3] ipmi: Add timeout waiting for device GUID

2020-09-07 Thread Corey Minyard
On Mon, Sep 07, 2020 at 06:25:36PM +0200, Markus Boehme wrote:
> We have observed hosts with misbehaving BMCs that receive a Get Device
> GUID command but don't respond. This leads to an indefinite wait in the
> ipmi_msghandler's __get_guid function, showing up as hung task messages
> for modprobe.
> 
> According to IPMI 2.0 specification chapter 20, the implementation of
> the Get Device GUID command is optional. Therefore, add a timeout to
> waiting for its response and treat the lack of one the same as missing a
> device GUID.

This patch looks good.  It's a little bit of a rewrite, but the reasons
are obvious.

-corey

> 
> Signed-off-by: Stefan Nuernberger 
> Signed-off-by: Markus Boehme 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 2b213c9..2a2e8b2 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3184,18 +3184,26 @@ static void guid_handler(struct ipmi_smi *intf, 
> struct ipmi_recv_msg *msg)
>  
>  static void __get_guid(struct ipmi_smi *intf)
>  {
> - int rv;
> + long rv;
>   struct bmc_device *bmc = intf->bmc;
>  
>   bmc->dyn_guid_set = 2;
>   intf->null_user_handler = guid_handler;
>   rv = send_guid_cmd(intf, 0);
> - if (rv)
> + if (rv) {
>   /* Send failed, no GUID available. */
>   bmc->dyn_guid_set = 0;
> - else
> - wait_event(intf->waitq, bmc->dyn_guid_set != 2);
> + goto out;
> + }
>  
> + rv = wait_event_timeout(intf->waitq, bmc->dyn_guid_set != 2, 5 * HZ);
> + if (rv == 0) {
> + dev_warn_once(intf->si_dev,
> +   "Timed out waiting for GUID. Assuming GUID is not 
> available.\n");
> + bmc->dyn_guid_set = 0;
> + }
> +
> +out:
>   /* dyn_guid_set makes the guid data available. */
>   smp_rmb();
>  
> -- 
> 2.7.4
> 


Re: [PATCH 1/3] ipmi: Reset response handler when failing to send the command

2020-09-07 Thread Corey Minyard
On Mon, Sep 07, 2020 at 06:25:35PM +0200, Markus Boehme wrote:
> When failing to send a command we don't expect a response. Clear the
> `null_user_handler` like is done in the success path.

This is correct.  I guess, from the next two patches, I know how you
found this.
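
For anyone reading along, the shape of the fix is the usual single-exit
cleanup.  A minimal userspace sketch, assuming made-up names (struct
fake_intf, fake_send(), fake_handler()) in place of the real driver
structures:

```c
#include <stddef.h>

typedef void (*handler_fn)(void);

struct fake_intf {
	handler_fn null_user_handler;	/* set only while a response is expected */
};

static void fake_handler(void)
{
}

/* Hypothetical send function; returns nonzero on failure. */
static int fake_send(int fail)
{
	return fail ? -22 : 0;
}

static int get_device_id(struct fake_intf *intf, int fail_send)
{
	int rv;

	intf->null_user_handler = fake_handler;

	rv = fake_send(fail_send);
	if (rv)
		goto out_reset_handler;	/* was "return rv", handler left set */

	/* ... wait for and process the response here ... */

out_reset_handler:
	intf->null_user_handler = NULL;
	return rv;
}
```

Either way out of the function, the handler pointer ends up NULL, so a
stray response after a failed send has nowhere to go.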

I can include this, but I will ask some questions in the later patches.

-corey

> 
> Signed-off-by: Markus Boehme 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6..2b213c9 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -2433,7 +2433,7 @@ static int __get_device_id(struct ipmi_smi *intf, 
> struct bmc_device *bmc)
>  
>   rv = send_get_device_id_cmd(intf);
>   if (rv)
> - return rv;
> + goto out_reset_handler;
>  
>   wait_event(intf->waitq, bmc->dyn_id_set != 2);
>  
> @@ -2443,6 +2443,7 @@ static int __get_device_id(struct ipmi_smi *intf, 
> struct bmc_device *bmc)
>   /* dyn_id_set makes the id data available. */
>   smp_rmb();
>  
> +out_reset_handler:
>   intf->null_user_handler = NULL;
>  
>   return rv;
> @@ -3329,6 +3330,7 @@ static int __scan_channels(struct ipmi_smi *intf, 
> struct ipmi_device_id *id)
>   dev_warn(intf->si_dev,
>"Error sending channel information for channel 
> 0, %d\n",
>rv);
> + intf->null_user_handler = NULL;
>   return -EIO;
>   }
>  
> -- 
> 2.7.4
> 


Re: [PATCH] ipmi: add a newline when printing parameter 'panic_op' by sysfs

2020-09-03 Thread Corey Minyard
On Thu, Sep 03, 2020 at 07:01:13PM +0800, Xiongfeng Wang wrote:
> When I cat ipmi_msghandler parameter 'panic_op' by sysfs, it displays as
> follows. It's better to add a newline for easy reading.
> 
> root@(none):/# cat /sys/module/ipmi_msghandler/parameters/panic_op
> noneroot@(none):/#

Thanks, it's in my for-next queue.

-corey

> 
> Signed-off-by: Xiongfeng Wang 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6..6ebb549 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -89,19 +89,19 @@ static int panic_op_read_handler(char *buffer, const 
> struct kernel_param *kp)
>  {
>   switch (ipmi_send_panic_event) {
>   case IPMI_SEND_PANIC_EVENT_NONE:
> - strcpy(buffer, "none");
> + strcpy(buffer, "none\n");
>   break;
>  
>   case IPMI_SEND_PANIC_EVENT:
> - strcpy(buffer, "event");
> + strcpy(buffer, "event\n");
>   break;
>  
>   case IPMI_SEND_PANIC_EVENT_STRING:
> - strcpy(buffer, "string");
> + strcpy(buffer, "string\n");
>   break;
>  
>   default:
> - strcpy(buffer, "???");
> + strcpy(buffer, "???\n");
>   break;
>   }
>  
> -- 
> 1.7.12.4
> 


Re: [PATCH] char: ipmi: convert tasklets to use new tasklet_setup() API

2020-08-18 Thread Corey Minyard
On Tue, Aug 18, 2020 at 02:46:23PM +0530, Allen wrote:
> > >
> > > Signed-off-by: Romain Perier 
> > > Signed-off-by: Allen Pais 
> >
> > This looks good to me.
> >
> > Reviewed-by: Corey Minyard 
> >
> > Are you planning to push this, or do you want me to take it?  If you
> > want me to take it, what is the urgency?
> 
>  Thanks. Well, not hurry, as long as it goes into 5.9 with all other
> changes.

Ok, this is queued in my for-next branch.

-corey

> 
> 
> >
> > -corey
> >
> > > ---
> > >  drivers/char/ipmi/ipmi_msghandler.c | 13 ++---
> > >  1 file changed, 6 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> > > b/drivers/char/ipmi/ipmi_msghandler.c
> > > index 737c0b6b24ea..e1814b6a1225 100644
> > > --- a/drivers/char/ipmi/ipmi_msghandler.c
> > > +++ b/drivers/char/ipmi/ipmi_msghandler.c
> > > @@ -39,7 +39,7 @@
> > >
> > >  static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void);
> > >  static int ipmi_init_msghandler(void);
> > > -static void smi_recv_tasklet(unsigned long);
> > > +static void smi_recv_tasklet(struct tasklet_struct *t);
> > >  static void handle_new_recv_msgs(struct ipmi_smi *intf);
> > >  static void need_waiter(struct ipmi_smi *intf);
> > >  static int handle_one_recv_msg(struct ipmi_smi *intf,
> > > @@ -3430,9 +3430,8 @@ int ipmi_add_smi(struct module *owner,
> > >   intf->curr_seq = 0;
> > >   spin_lock_init(&intf->waiting_rcv_msgs_lock);
> > >   INIT_LIST_HEAD(&intf->waiting_rcv_msgs);
> > > - tasklet_init(&intf->recv_tasklet,
> > > -  smi_recv_tasklet,
> > > -  (unsigned long) intf);
> > > + tasklet_setup(&intf->recv_tasklet,
> > > +  smi_recv_tasklet);
> > >   atomic_set(&intf->watchdog_pretimeouts_to_deliver, 0);
> > >   spin_lock_init(&intf->xmit_msgs_lock);
> > >   INIT_LIST_HEAD(&intf->xmit_msgs);
> > > @@ -4467,10 +4466,10 @@ static void handle_new_recv_msgs(struct ipmi_smi 
> > > *intf)
> > >   }
> > >  }
> > >
> > > -static void smi_recv_tasklet(unsigned long val)
> > > +static void smi_recv_tasklet(struct tasklet_struct *t)
> > >  {
> > >   unsigned long flags = 0; /* keep us warning-free. */
> > > - struct ipmi_smi *intf = (struct ipmi_smi *) val;
> > > + struct ipmi_smi *intf = from_tasklet(intf, t, recv_tasklet);
> > >   int run_to_completion = intf->run_to_completion;
> > >   struct ipmi_smi_msg *newmsg = NULL;
> > >
> > > @@ -4542,7 +4541,7 @@ void ipmi_smi_msg_received(struct ipmi_smi *intf,
> > >   spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
> > >
> > >   if (run_to_completion)
> > > - smi_recv_tasklet((unsigned long) intf);
> > > + smi_recv_tasklet(&intf->recv_tasklet);
> > >   else
> > >   tasklet_schedule(&intf->recv_tasklet);
> > >  }
> > > --
> > > 2.17.1
> > >
> 
> 
> 
> -- 
>- Allen


Re: Oops on current Raspian when closing an SCTP connection

2020-08-17 Thread Corey Minyard
On Mon, Aug 17, 2020 at 10:44:57AM -0300, Marcelo Ricardo Leitner wrote:
> On Sun, Aug 16, 2020 at 06:06:24PM -0500, Corey Minyard wrote:
> > I'm seeing the following when an SCTP connection terminates.  This is on
> > Raspian on a Raspberry Pi, version is Linux version 5.4.51-v7+.  That's
> > 32-bit ARM.
> > 
> > I haven't looked into it yet, I thought I would report before trying to
> > chase anything down.  I'm not seeing it on 5.4 x86_64 systems.
> > 
> > Aug 16 17:59:01 access kernel: [510640.438008] Hardware name: BCM2835
> > Aug 16 17:59:01 access kernel: [510640.443823] PC is at 
> > sctp_ulpevent_free+0x38/0xa0 [sctp]
> > Aug 16 17:59:01 access kernel: [510640.451498] LR is at 
> > sctp_queue_purge_ulpevents+0x34/0x50 [sctp]
> 
> Not ringing a bell here. Can you pinpoint on which line this crash
> was? It seems, by the 0x8 offset and these function offsets, that this
> could be when it was trying to access event->rmem_len, but if event
> was NULL then it should have crashed earlier.
> 
>   Marcelo

I think so:

00015e38 :
   15e38:   e1a0c00dmov ip, sp
   15e3c:   e92dd878push{r3, r4, r5, r6, fp, ip, lr, pc}
   15e40:   e24cb004sub fp, ip, #4
   15e44:   e52de004push{lr}; (str lr, [sp, #-4]!)
   15e48:   ebfebl  0 <__gnu_mcount_nc>

ulpevent.c:1102 if (sctp_ulpevent_is_notification(event))
   15e4c:   e1d032f0ldrsh   r3, [r0, #32]
   15e50:   e1a05000mov r5, r0
   15e54:   e353cmp r3, #0
   15e58:   ba11blt 15ea4 

This is the false branch from the above (high bit isn't set).

ulpevent.c:1061 if (!skb->data_len) (just the fetch part)
   15e5c:   e5903040ldr r3, [r0, #64]   ; 0x40

ulpevent.c:1059 len = skb->len;
   15e60:   e590603cldr r6, [r0, #60]   ; 0x3c

ulpevent.c:1061 if (!skb->data_len) (the compare part)
   15e64:   e353cmp r3, #0
   15e68:   0a08beq 15e90 


ulpevent.c:1065 skb_walk_frags(skb, frag) {
skbuff.h:1395   return skb->end; (inside skb_shinfo())
   15e6c:   e5903084ldr r3, [r0, #132]  ; 0x84

skbuff.h:3461   iter = skb_shinfo(skb)->frag_list
   15e70:   e5934008ldr r4, [r3, #8] <-- Crash occurs here

   15e74:   e354cmp r4, #0
   15e78:   0a04beq 15e90 
   15e7c:   e2840018add r0, r4, #24
   15e80:   ebfffa34bl  14758 


So it appears that skb->end is NULL.
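
The faulting address fits that reading.  As a rough model, assuming the
5.4 layout of struct skb_shared_info has four u8 fields and two
unsigned shorts ahead of frag_list (the struct below is a simplified
mock I wrote for illustration, not the kernel definition):

```c
#include <stddef.h>
#include <stdint.h>

struct mock_skb;

/* Simplified mock of the leading fields of struct skb_shared_info. */
struct mock_shared_info {
	uint8_t  unused;
	uint8_t  meta_len;
	uint8_t  nr_frags;
	uint8_t  tx_flags;
	uint16_t gso_size;
	uint16_t gso_segs;
	struct mock_skb *frag_list;
};

/* Address a load of ->frag_list touches for a given shared-info base. */
static uintptr_t frag_list_addr(uintptr_t shinfo_base)
{
	return shinfo_base + offsetof(struct mock_shared_info, frag_list);
}
```

With a base of 0 the load lands on address 8, which matches both the
"ldr r4, [r3, #8]" above and the fault address in the oops.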

-corey


Re: [PATCH] char: ipmi: convert tasklets to use new tasklet_setup() API

2020-08-17 Thread Corey Minyard
On Mon, Aug 17, 2020 at 02:45:57PM +0530, Allen Pais wrote:
> From: Allen Pais 
> 
> In preparation for unconditionally passing the
> struct tasklet_struct pointer to all tasklet
> callbacks, switch to using the new tasklet_setup()
> and from_tasklet() to pass the tasklet pointer explicitly.
> 
> Signed-off-by: Romain Perier 
> Signed-off-by: Allen Pais 

This looks good to me.
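
For reference, my understanding is that from_tasklet() boils down to
container_of() on the embedded tasklet.  A minimal userspace sketch of
that idea follows; mock_tasklet, mock_smi and smi_from_tasklet() are
made-up names, and the container_of() here is a simplified stand-in for
the kernel macro.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_tasklet {
	void (*callback)(struct mock_tasklet *t);
};

struct mock_smi {
	int run_to_completion;
	struct mock_tasklet recv_tasklet;	/* embedded, as in struct ipmi_smi */
};

/* Recover the enclosing structure from the tasklet pointer. */
static struct mock_smi *smi_from_tasklet(struct mock_tasklet *t)
{
	return container_of(t, struct mock_smi, recv_tasklet);
}
```

That is why the direct call in the run_to_completion path has to pass
the address of the embedded tasklet rather than the interface pointer
cast to unsigned long.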

Reviewed-by: Corey Minyard 

Are you planning to push this, or do you want me to take it?  If you
want me to take it, what is the urgency?

-corey

> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 737c0b6b24ea..e1814b6a1225 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -39,7 +39,7 @@
>  
>  static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void);
>  static int ipmi_init_msghandler(void);
> -static void smi_recv_tasklet(unsigned long);
> +static void smi_recv_tasklet(struct tasklet_struct *t);
>  static void handle_new_recv_msgs(struct ipmi_smi *intf);
>  static void need_waiter(struct ipmi_smi *intf);
>  static int handle_one_recv_msg(struct ipmi_smi *intf,
> @@ -3430,9 +3430,8 @@ int ipmi_add_smi(struct module *owner,
>   intf->curr_seq = 0;
>   spin_lock_init(&intf->waiting_rcv_msgs_lock);
>   INIT_LIST_HEAD(&intf->waiting_rcv_msgs);
> - tasklet_init(&intf->recv_tasklet,
> -  smi_recv_tasklet,
> -  (unsigned long) intf);
> + tasklet_setup(&intf->recv_tasklet,
> +  smi_recv_tasklet);
>   atomic_set(&intf->watchdog_pretimeouts_to_deliver, 0);
>   spin_lock_init(&intf->xmit_msgs_lock);
>   INIT_LIST_HEAD(&intf->xmit_msgs);
> @@ -4467,10 +4466,10 @@ static void handle_new_recv_msgs(struct ipmi_smi 
> *intf)
>   }
>  }
>  
> -static void smi_recv_tasklet(unsigned long val)
> +static void smi_recv_tasklet(struct tasklet_struct *t)
>  {
>   unsigned long flags = 0; /* keep us warning-free. */
> - struct ipmi_smi *intf = (struct ipmi_smi *) val;
> + struct ipmi_smi *intf = from_tasklet(intf, t, recv_tasklet);
>   int run_to_completion = intf->run_to_completion;
>   struct ipmi_smi_msg *newmsg = NULL;
>  
> @@ -4542,7 +4541,7 @@ void ipmi_smi_msg_received(struct ipmi_smi *intf,
>   spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
>  
>   if (run_to_completion)
> - smi_recv_tasklet((unsigned long) intf);
> + smi_recv_tasklet(&intf->recv_tasklet);
>   else
>   tasklet_schedule(&intf->recv_tasklet);
>  }
> -- 
> 2.17.1
> 


Oops on current Raspian when closing an SCTP connection

2020-08-16 Thread Corey Minyard
I'm seeing the following when an SCTP connection terminates.  This is on
Raspian on a Raspberry Pi, version is Linux version 5.4.51-v7+.  That's
32-bit ARM.

I haven't looked into it yet, I thought I would report before trying to
chase anything down.  I'm not seeing it on 5.4 x86_64 systems.

Aug 16 17:59:00 access kernel: [510640.326415] Unable to handle kernel NULL 
pointer dereference at virtual address 0008
Aug 16 17:59:00 access kernel: [510640.341624] pgd = c00fc16c
Aug 16 17:59:00 access kernel: [510640.347834] [0008] *pgd=355ef835, 
*pte=, *ppte=
Aug 16 17:59:00 access kernel: [510640.357731] Internal error: Oops: 17 [#22] 
SMP ARM
Aug 16 17:59:01 access kernel: [510640.365931] Modules linked in: md5 sctp 
ftdi_sio cp210x usbserial raspberrypi_hwmon bcm2835_codec(C) v4l2_mem2mem 
bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) videobuf2_vmalloc 
videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 snd_bcm2835(C) 
videobuf2_common i2c_bcm2835 snd_pcm snd_timer videodev snd mc vc_sm_cma(C) 
uio_pdrv_genirq uio fixed nf_nat_pptp nf_conntrack_pptp nf_nat nf_conntrack 
nf_defrag_ipv4 rtc_ds1307 regmap_i2c i2c_dev ip_tables x_tables ipv6 
nf_defrag_ipv6
Aug 16 17:59:01 access kernel: [510640.425420] CPU: 1 PID: 4592 Comm: gtlsshd 
Tainted: G  D  C5.4.51-v7+ #1327
Aug 16 17:59:01 access kernel: [510640.438008] Hardware name: BCM2835
Aug 16 17:59:01 access kernel: [510640.443823] PC is at 
sctp_ulpevent_free+0x38/0xa0 [sctp]
Aug 16 17:59:01 access kernel: [510640.451498] LR is at 
sctp_queue_purge_ulpevents+0x34/0x50 [sctp]
Aug 16 17:59:01 access kernel: [510640.459748] pc : [<7f1f3e08>]lr : 
[<7f1f3ea4>]psr: 2013
Aug 16 17:59:01 access kernel: [510640.468311] sp : ad397dd0  ip : ad397df0  fp 
: ad397dec
Aug 16 17:59:01 access kernel: [510640.475811] r10: 0008  r9 : b082c2c0  r8 
: 80d04f48
Aug 16 17:59:01 access kernel: [510640.483282] r7 :   r6 :   r5 
: b1ebc6a8  r4 : 
Aug 16 17:59:01 access kernel: [510640.492079] r3 :   r2 :   r1 
:   r0 : b1ebc6a8
Aug 16 17:59:01 access kernel: [510640.500815] Flags: nzCv  IRQs on  FIQs on  
Mode SVC_32  ISA ARM  Segment user
Aug 16 17:59:01 access kernel: [510640.510156] Control: 10c5387d  Table: 
2028806a  DAC: 0055
Aug 16 17:59:01 access kernel: [510640.518148] Process gtlsshd (pid: 4592, 
stack limit = 0x4ac0cb39)
Aug 16 17:59:01 access kernel: [510640.526489] Stack: (0xad397dd0 to 0xad398000)
Aug 16 17:59:01 access kernel: [510640.533055] 7dc0:
   b1ec2a10 b1ea1300
Aug 16 17:59:01 access kernel: [510640.545471] 7de0: ad397e04 ad397df0 7f1f3ea4 
7f1f3ddc b1ec2600 80d87e00 ad397e54 ad397e08
Aug 16 17:59:01 access kernel: [510640.557828] 7e00: 7f1fb208 7f1f3e7c ad397e94 
 802d4fec 8020b210 ad397e44 ad397e28
Aug 16 17:59:01 access kernel: [510640.570172] 7e20: aeacaf00 9ff167fd 8010bbbc 
b1ec2600 b082c240 7f21c380 b64b9550 94d54bb0
Aug 16 17:59:01 access kernel: [510640.582536] 7e40: b082c2c0 0008 ad397e6c 
ad397e58 80826f9c 7f1fb1ac b1ec2600 b082c240
Aug 16 17:59:01 access kernel: [510640.595093] 7e60: ad397e84 ad397e70 7f00a23c 
80826f5c b082c240 b082c340 ad397ea4 ad397e88
Aug 16 17:59:01 access kernel: [510640.607835] 7e80: 80751c38 7f00a20c 80751cac 
ad0006c0 000e0003 b082c2c0 ad397eb4 ad397ea8
Aug 16 17:59:01 access kernel: [510640.620747] 7ea0: 80751ccc 80751bf4 ad397ef4 
ad397eb8 802e43f0 80751cb8  
Aug 16 17:59:01 access kernel: [510640.633918] 7ec0: 8010cce8 ad0006c8 ad0006c0 
 b3c160b0 b3c15b80 b3c160d4 80dab2c0
Aug 16 17:59:01 access kernel: [510640.647206] 7ee0: ad0006c0 0006 ad397f04 
ad397ef8 802e459c 802e4358 ad397f2c ad397f08
Aug 16 17:59:01 access kernel: [510640.660620] 7f00: 801408e4 802e4590 ad396000 
ad397fb0 80d04f48 80d04f4c 0004 801011c4
Aug 16 17:59:01 access kernel: [510640.674165] 7f20: ad397fac ad397f30 8010cce8 
80140838 ad397f4c ad397f40 802e4b1c 802e4a30
Aug 16 17:59:01 access kernel: [510640.687762] 7f40: ad397f6c ad397f50 802ddeb8 
802e4b0c 0052  b4335600 
Aug 16 17:59:01 access kernel: [510640.701356] 7f60: ad397f94 ad397f70 80304e0c 
802dde54 01614360 01614150 01615b18 9ff167fd
Aug 16 17:59:01 access kernel: [510640.714947] 7f80: 801011c4 01614360 01614150 
01615b18 0006 801011c4 ad396000 0006
Aug 16 17:59:01 access kernel: [510640.728540] 7fa0:  ad397fb0 80101034 
8010c884   11f0 
Aug 16 17:59:01 access kernel: [510640.742130] 7fc0: 01614360 01614150 01615b18 
0006 76ea3c04 0001  0009
Aug 16 17:59:01 access kernel: [510640.755714] 7fe0: 76ecd4a0 7ea20418 76e8d2f8 
76b8c2f0 6010   
Aug 16 17:59:01 access kernel: [510640.769284] Backtrace:
Aug 16 17:59:01 access kernel: [510640.774544] [<7f1f3dd0>] (sctp_ulpevent_free 
[sctp]) from [<7f1f3ea4>] (sctp_queue_purge_ulpevents+0x34/0x50 [sctp])
Aug 16 17:59:01 access kernel: 

[GIT PULL] IPMI bug fixes for 5.9

2020-08-08 Thread Corey Minyard
The following changes since commit a5dc8300df75e8b8384b4c82225f1e4a0b4d9b55:

  scripts/decode_stacktrace: warn when modpath is needed but is unset 
(2020-06-15 15:37:24 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.9-1

for you to fetch changes up to 634b06def11cf7ecf6282c79735f08004e323983:

  ipmi/watchdog: add missing newlines when printing parameters by sysfs 
(2020-07-21 06:29:15 -0500)


Minor cleanups to the IPMI driver for 5.9

Nothing of any major consequence.  Duplicate code, some missing \n's in
sysfs files, some documentation and comment changes.

-corey


Jing Xiangfeng (1):
  ipmi: remve duplicate code in __ipmi_bmc_register()

Misono Tomohiro (2):
  Doc: driver-api: ipmi: Add description of alerts_broken module param
  ipmi: ssif: Remove finished TODO comment about SMBus alert

Xiongfeng Wang (1):
  ipmi/watchdog: add missing newlines when printing parameters by sysfs

 Documentation/driver-api/ipmi.rst   | 4 
 drivers/char/ipmi/ipmi_msghandler.c | 2 --
 drivers/char/ipmi/ipmi_ssif.c   | 5 -
 drivers/char/ipmi/ipmi_watchdog.c   | 9 +++--
 4 files changed, 11 insertions(+), 9 deletions(-)



Re: [PATCH v2] ipmi/watchdog: add missing newlines when printing parameters by sysfs

2020-07-21 Thread Corey Minyard
On Tue, Jul 21, 2020 at 02:35:09PM +0800, Xiongfeng Wang wrote:
> When I cat some ipmi_watchdog parameters by sysfs, it displays as
> follows. It's better to add a newline for easy reading.
> 
> root@(none):/# cat /sys/module/ipmi_watchdog/parameters/action
> resetroot@(none):/# cat /sys/module/ipmi_watchdog/parameters/preaction
> pre_noneroot@(none):/# cat /sys/module/ipmi_watchdog/parameters/preop
> preop_noneroot@(none):/#
> 
> Signed-off-by: Xiongfeng Wang 

Thanks, included in my queue for next release.

-corey

> ---
>  drivers/char/ipmi/ipmi_watchdog.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_watchdog.c 
> b/drivers/char/ipmi/ipmi_watchdog.c
> index 55986e1..f78156d 100644
> --- a/drivers/char/ipmi/ipmi_watchdog.c
> +++ b/drivers/char/ipmi/ipmi_watchdog.c
> @@ -232,12 +232,17 @@ static int set_param_str(const char *val, const struct 
> kernel_param *kp)
>  static int get_param_str(char *buffer, const struct kernel_param *kp)
>  {
>   action_fn fn = (action_fn) kp->arg;
> - int   rv;
> + int rv, len;
>  
>   rv = fn(NULL, buffer);
>   if (rv)
>   return rv;
> - return strlen(buffer);
> +
> + len = strlen(buffer);
> + buffer[len++] = '\n';
> + buffer[len] = 0;
> +
> + return len;
>  }
>  
>  
> -- 
> 1.7.12.4
> 


Re: [PATCH] ipmi/watchdog: add missing newlines when printing parameters by sysfs

2020-07-20 Thread Corey Minyard
On Mon, Jul 20, 2020 at 10:03:25AM +0800, Xiongfeng Wang wrote:
> When I cat some ipmi_watchdog parameters by sysfs, it displays as
> follows. It's better to add a newline for easy reading.
> 
> root@(none):/# cat /sys/module/ipmi_watchdog/parameters/action
> resetroot@(none):/# cat /sys/module/ipmi_watchdog/parameters/preaction
> pre_noneroot@(none):/# cat /sys/module/ipmi_watchdog/parameters/preop
> preop_noneroot@(none):/#

Yeah, that's not consistent with other things displayed in the kernel.

> 
> Signed-off-by: Xiongfeng Wang 
> ---
>  drivers/char/ipmi/ipmi_watchdog.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_watchdog.c 
> b/drivers/char/ipmi/ipmi_watchdog.c
> index 55986e1..3e05a1d 100644
> --- a/drivers/char/ipmi/ipmi_watchdog.c
> +++ b/drivers/char/ipmi/ipmi_watchdog.c
> @@ -232,12 +232,16 @@ static int set_param_str(const char *val, const struct 
> kernel_param *kp)
>  static int get_param_str(char *buffer, const struct kernel_param *kp)
>  {
>   action_fn fn = (action_fn) kp->arg;
> - int   rv;
> + int rv, len;
>  
>   rv = fn(NULL, buffer);
>   if (rv)
>   return rv;
> - return strlen(buffer);
> +
> + len = strlen(buffer);
> + len += sprintf(buffer + len, "\n");

sprintf is kind of overkill to stick a \n on the end of a line.  How
about:

buffer[len++] = '\n';

Since you are returning the length, you shouldn't need to NUL-terminate
the string.
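
Spelled out as a self-contained sketch (append_newline() is a
hypothetical helper, and it assumes the caller has room for one more
byte in the buffer):

```c
#include <string.h>

/*
 * Append a newline to the text already in buffer and return the new
 * length.  The caller must guarantee room for one more byte.
 */
static int append_newline(char *buffer)
{
	int len = (int)strlen(buffer);

	buffer[len++] = '\n';
	return len;
}
```

The returned length already covers the newline, so the reader of the
buffer knows where the data ends without looking for a terminator.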

An unrelated question: Are you using any of the special functions of the
IPMI watchdog, like the pretimeout?

-corey

> +
> + return len;
>  }
>  
>  
> -- 
> 1.7.12.4
> 


Re: [PATCH] ipmi: remve duplicate code in __ipmi_bmc_register()

2020-07-20 Thread Corey Minyard
On Mon, Jul 20, 2020 at 04:08:38PM +0800, Jing Xiangfeng wrote:
> __ipmi_bmc_register() jumps to the label 'out_free_my_dev_name' in an
> error path. So we can remove duplicate code in the if (rv).

Looks correct, queued for next release.

Thanks,

-corey

> 
> Signed-off-by: Jing Xiangfeng 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index e1b22fe0916c..737c0b6b24ea 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3080,8 +3080,6 @@ static int __ipmi_bmc_register(struct ipmi_smi *intf,
>   rv = sysfs_create_link(&bmc->pdev.dev.kobj, &intf->si_dev->kobj,
>  intf->my_dev_name);
>   if (rv) {
> - kfree(intf->my_dev_name);
> - intf->my_dev_name = NULL;
>   dev_err(intf->si_dev, "Unable to create symlink to bmc: %d\n",
>   rv);
>   goto out_free_my_dev_name;
> -- 
> 2.17.1
> 


Re: [PATCH net] sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket

2020-06-24 Thread Corey Minyard
On Wed, Jun 24, 2020 at 05:34:18PM -0300, Marcelo Ricardo Leitner wrote:
> If a socket is set ipv6only, it will still send IPv4 addresses in the
> INIT and INIT_ACK packets. This potentially misleads the peer into using
> them, which then would cause association termination.
> 
> The fix is to not add IPv4 addresses to ipv6only sockets.

Fixes the issue for me.

Tested-by: Corey Minyard 

Thanks a bunch.
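
For the archives, the logic of the fix as I read it, in a small
userspace sketch.  The flag values are the ones from the patch;
bind_addr_flags() and may_copy_v4() are hypothetical helpers that only
model the checks, they are not functions in the SCTP code.

```c
#include <stdbool.h>

/* Flag values as renumbered by the patch. */
#define SCTP_ADDR4_ALLOWED   0x0001
#define SCTP_ADDR6_ALLOWED   0x0002
#define SCTP_ADDR4_PEERSUPP  0x0004
#define SCTP_ADDR6_PEERSUPP  0x0008

/* Hypothetical helper modelling sctp_assoc_set_bind_addr_from_ep(). */
static int bind_addr_flags(bool sk_is_inet6, bool ipv6only,
			   bool peer_v4, bool peer_v6)
{
	int flags = sk_is_inet6 ? SCTP_ADDR6_ALLOWED : 0;

	if (!ipv6only)
		flags |= SCTP_ADDR4_ALLOWED;
	if (peer_v4)
		flags |= SCTP_ADDR4_PEERSUPP;
	if (peer_v6)
		flags |= SCTP_ADDR6_PEERSUPP;
	return flags;
}

/* An IPv4 address is advertised only if both sides permit it. */
static bool may_copy_v4(int flags)
{
	return (flags & SCTP_ADDR4_ALLOWED) && (flags & SCTP_ADDR4_PEERSUPP);
}
```

With ipv6only set, the IPv4 addresses are held back even when the peer
supports IPv4, which is the case I was hitting.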

-corey

> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Corey Minyard 
> Signed-off-by: Marcelo Ricardo Leitner 
> ---
>  include/net/sctp/constants.h | 8 +---
>  net/sctp/associola.c | 5 -
>  net/sctp/bind_addr.c | 1 +
>  net/sctp/protocol.c  | 3 ++-
>  4 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
> index 
> 15b4d9aec7ff278e67a7183f10c14be237227d6b..122d9e2d8dfde33b787d575fc42d454732550698
>  100644
> --- a/include/net/sctp/constants.h
> +++ b/include/net/sctp/constants.h
> @@ -353,11 +353,13 @@ enum {
>ipv4_is_anycast_6to4(a))
>  
>  /* Flags used for the bind address copy functions.  */
> -#define SCTP_ADDR6_ALLOWED   0x0001  /* IPv6 address is allowed by
> +#define SCTP_ADDR4_ALLOWED   0x0001  /* IPv4 address is allowed by
>  local sock family */
> -#define SCTP_ADDR4_PEERSUPP  0x0002  /* IPv4 address is supported by
> +#define SCTP_ADDR6_ALLOWED   0x0002  /* IPv6 address is allowed by
> +local sock family */
> +#define SCTP_ADDR4_PEERSUPP  0x0004  /* IPv4 address is supported by
>  peer */
> -#define SCTP_ADDR6_PEERSUPP  0x0004  /* IPv6 address is supported by
> +#define SCTP_ADDR6_PEERSUPP  0x0008  /* IPv6 address is supported by
>  peer */
>  
>  /* Reasons to retransmit. */
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 
> 72315137d7e7f20d5182291ef4b01102f030078b..8d735461fa196567ab19c583703aad098ef8e240
>  100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -1565,12 +1565,15 @@ void sctp_assoc_rwnd_decrease(struct sctp_association 
> *asoc, unsigned int len)
>  int sctp_assoc_set_bind_addr_from_ep(struct sctp_association *asoc,
>enum sctp_scope scope, gfp_t gfp)
>  {
> + struct sock *sk = asoc->base.sk;
>   int flags;
>  
>   /* Use scoping rules to determine the subset of addresses from
>* the endpoint.
>*/
> - flags = (PF_INET6 == asoc->base.sk->sk_family) ? SCTP_ADDR6_ALLOWED : 0;
> + flags = (PF_INET6 == sk->sk_family) ? SCTP_ADDR6_ALLOWED : 0;
> + if (!inet_v6_ipv6only(sk))
> + flags |= SCTP_ADDR4_ALLOWED;
>   if (asoc->peer.ipv4_address)
>   flags |= SCTP_ADDR4_PEERSUPP;
>   if (asoc->peer.ipv6_address)
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index 53bc61537f44f4e766c417fcef72234df52ecd04..701c5a4e441d9c248df9472f22db5b78987f9e44 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -461,6 +461,7 @@ static int sctp_copy_one_addr(struct net *net, struct sctp_bind_addr *dest,
>* well as the remote peer.
>*/
>   if ((((AF_INET == addr->sa.sa_family) &&
> +   (flags & SCTP_ADDR4_ALLOWED) &&
> (flags & SCTP_ADDR4_PEERSUPP))) ||
>   (((AF_INET6 == addr->sa.sa_family) &&
> (flags & SCTP_ADDR6_ALLOWED) &&
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 092d1afdee0d23cd974210839310fbf406dd443f..cde29f3c7fb3c40ee117636fa3b4b7f0a03e4fba 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -148,7 +148,8 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
>* sock as well as the remote peer.
>*/
>   if (addr->a.sa.sa_family == AF_INET &&
> - !(copy_flags & SCTP_ADDR4_PEERSUPP))
> + (!(copy_flags & SCTP_ADDR4_ALLOWED) ||
> +  !(copy_flags & SCTP_ADDR4_PEERSUPP)))
>   continue;
>   if (addr->a.sa.sa_family == AF_INET6 &&
>   (!(copy_flags & SCTP_ADDR6_ALLOWED) ||
> -- 
> 2.25.4
> 
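
As a rough userspace sketch of the check this patch adds (not the kernel
code itself): an address is copied only when its family is both allowed
by the local socket and supported by the peer.  The flag values mirror
the patched constants.h; local_flags() and addr_copy_allowed() are
hypothetical helper names used only for illustration.

```c
#include <stdbool.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Flag values as defined by the patch to include/net/sctp/constants.h. */
#define SCTP_ADDR4_ALLOWED  0x0001 /* IPv4 allowed by local sock family */
#define SCTP_ADDR6_ALLOWED  0x0002 /* IPv6 allowed by local sock family */
#define SCTP_ADDR4_PEERSUPP 0x0004 /* IPv4 supported by peer */
#define SCTP_ADDR6_PEERSUPP 0x0008 /* IPv6 supported by peer */

/* Hypothetical helper: models how the association derives local flags. */
static int local_flags(int sk_family, bool v6only)
{
    int flags = (sk_family == AF_INET6) ? SCTP_ADDR6_ALLOWED : 0;

    if (!v6only)
        flags |= SCTP_ADDR4_ALLOWED;
    return flags;
}

/* Hypothetical helper: models the patched per-address test. */
static bool addr_copy_allowed(int addr_family, int flags)
{
    if (addr_family == AF_INET)
        return (flags & SCTP_ADDR4_ALLOWED) && (flags & SCTP_ADDR4_PEERSUPP);
    if (addr_family == AF_INET6)
        return (flags & SCTP_ADDR6_ALLOWED) && (flags & SCTP_ADDR6_PEERSUPP);
    return false;
}
```

With a v6only socket the IPv4 branch is rejected even when the peer
supports IPv4, which is the behaviour the commit message describes.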


Re: Strange problem with SCTP+IPv6

2020-06-23 Thread Corey Minyard
On Tue, Jun 23, 2020 at 01:17:28PM +, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 22 June 2020 19:33
> > On Mon, Jun 22, 2020 at 08:01:24PM +0200, Michael Tuexen wrote:
> > > > On 22. Jun 2020, at 18:57, Corey Minyard  wrote:
> > > >
> > > > On Mon, Jun 22, 2020 at 08:01:23PM +0800, Xin Long wrote:
> > > >> On Sun, Jun 21, 2020 at 11:56 PM Corey Minyard  wrote:
> > > >>>
> > > >>> I've stumbled upon a strange problem with SCTP and IPv6.  If I create 
> > > >>> an
> > > >>> sctp listening socket on :: and set the IPV6_V6ONLY socket option on 
> > > >>> it,
> > > >>> then I make a connection to it using ::1, the connection will drop 
> > > >>> after
> > > >>> 2.5 seconds with an ECONNRESET error.
> > > >>>
> > > >>> It only happens on SCTP, it doesn't have the issue if you connect to a
> > > >>> full IPv6 address instead of ::1, and it doesn't happen if you don't
> > > >>> set IPV6_V6ONLY.  I have verified current end of tree kernel.org.
> > > >>> I tried on an ARM system and x86_64.
> > > >>>
> > > >>> I haven't dug into the kernel to see if I could find anything yet, 
> > > >>> but I
> > > >>> thought I would go ahead and report it.  I am attaching a reproducer.
> > > >>> Basically, compile the following code:
> > > >> The code only set IPV6_V6ONLY on server side, so the client side will
> > > >> still bind all the local ipv4 addresses (as you didn't call bind() to
> > > >> bind any specific addresses ). Then after the connection is created,
> > > >> the client will send HB on the v4 paths to the server. The server
> > > >> will abort the connection, as it can't support v4.
> > > >>
> > > >> So you can work around it by either:
> > > >>
> > > >>  - set IPV6_V6ONLY on client side.
> > > >>
> > > >> or
> > > >>
> > > >>  - bind to the specific v6 addresses on the client side.
> > > >>
> > > >> I don't see RFC said something about this.
> > > >> So it may not be a good idea to change the current behaviour
> > > >> to not establish the connection in this case, which may cause 
> > > >> regression.
> > > >
> > > > Ok, I understand this.  It's a little strange, but I see why it works
> > > > this way.
> > > I don't. I would expect it to work as I described in my email.
> > > Could someone explain me how and why it is behaving different from
> > > my expectation?
> > 
> > It looks like a bug to me. Testing with this test app here, I can see
> > the INIT_ACK being sent with a bunch of ipv4 addresses in it and
> > that's unexpected for a v6only socket. As is, it's the server saying
> > "I'm available at these other addresses too, but not."
> 
> Does it even make sense to mix IPv4 and IPv6 addresses on the same
> connection?
> I don't remember ever seeing both types of address in a message,
> but may not have looked.

That's an interesting question.  Do the RFCs say anything?  I would
assume it was ok unless ipv6only was set.

> 
> I also wonder whether the connection should be dropped for an error
> response on a path that has never been validated.

That actually bothered me a bit more.  Shouldn't it stay up if any path
is up?  That's kind of the whole point of multihoming.

> 
> OTOH the whole 'multi-homing' part of SCTP sucks.

I don't think so.

> The IP addresses a server needs to bind to depend on where the
> incoming connection will come from.
> A local connection may be able to use a 192.168.x.x address
> but a remote connection must not - as it may be defined locally
> at the remote system.
> But both connections can come into the public (routable) address.
> We have to tell customers to explicitly configure the local IP
> addresses - which means the application has to know what they are.
> Fortunately these apps are pretty static - usually M3UA.

Umm, no.  If you have a private address, it better be behind a firewall,
and the firewall should handle rewriting the packet to fix the addresses.

It doesn't appear that Linux netfilter does this.  There is a TODO in
the code for this.  But that's how it *should* work.

-corey

> 
>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
> 


Re: Strange problem with SCTP+IPv6

2020-06-23 Thread Corey Minyard
On Tue, Jun 23, 2020 at 11:40:21PM +0800, Xin Long wrote:
> On Tue, Jun 23, 2020 at 9:29 PM Corey Minyard  wrote:
> >
> > On Tue, Jun 23, 2020 at 06:13:30PM +0800, Xin Long wrote:
> > > On Tue, Jun 23, 2020 at 2:34 AM Michael Tuexen
> > >  wrote:
> > > >
> > > > > On 22. Jun 2020, at 20:32, Marcelo Ricardo Leitner 
> > > > >  wrote:
> > > > >
> > > > > On Mon, Jun 22, 2020 at 08:01:24PM +0200, Michael Tuexen wrote:
> > > > >>> On 22. Jun 2020, at 18:57, Corey Minyard  wrote:
> > > > >>>
> > > > >>> On Mon, Jun 22, 2020 at 08:01:23PM +0800, Xin Long wrote:
> > > > >>>> On Sun, Jun 21, 2020 at 11:56 PM Corey Minyard  
> > > > >>>> wrote:
> > > > >>>>>
> > > > >>>>> I've stumbled upon a strange problem with SCTP and IPv6.  If I 
> > > > >>>>> create an
> > > > >>>>> sctp listening socket on :: and set the IPV6_V6ONLY socket option 
> > > > >>>>> on it,
> > > > >>>>> then I make a connection to it using ::1, the connection will 
> > > > >>>>> drop after
> > > > >>>>> 2.5 seconds with an ECONNRESET error.
> > > > >>>>>
> > > > >>>>> It only happens on SCTP, it doesn't have the issue if you connect 
> > > > >>>>> to a
> > > > >>>>> full IPv6 address instead of ::1, and it doesn't happen if you 
> > > > >>>>> don't
> > > > >>>>> set IPV6_V6ONLY.  I have verified current end of tree kernel.org.
> > > > >>>>> I tried on an ARM system and x86_64.
> > > > >>>>>
> > > > >>>>> I haven't dug into the kernel to see if I could find anything 
> > > > >>>>> yet, but I
> > > > >>>>> thought I would go ahead and report it.  I am attaching a 
> > > > >>>>> reproducer.
> > > > >>>>> Basically, compile the following code:
> > > > >>>> The code only set IPV6_V6ONLY on server side, so the client side 
> > > > >>>> will
> > > > >>>> still bind all the local ipv4 addresses (as you didn't call bind() 
> > > > >>>> to
> > > > >>>> bind any specific addresses ). Then after the connection is 
> > > > >>>> created,
> > > > >>>> the client will send HB on the v4 paths to the server. The server
> > > > >>>> will abort the connection, as it can't support v4.
> > > > >>>>
> > > > >>>> So you can work around it by either:
> > > > >>>>
> > > > >>>> - set IPV6_V6ONLY on client side.
> > > > >>>>
> > > > >>>> or
> > > > >>>>
> > > > >>>> - bind to the specific v6 addresses on the client side.
> > > > >>>>
> > > > >>>> I don't see RFC said something about this.
> > > > >>>> So it may not be a good idea to change the current behaviour
> > > > >>>> to not establish the connection in this case, which may cause 
> > > > >>>> regression.
> > > > >>>
> > > > >>> Ok, I understand this.  It's a little strange, but I see why it 
> > > > >>> works
> > > > >>> this way.
> > > > >> I don't. I would expect it to work as I described in my email.
> > > > >> Could someone explain me how and why it is behaving different from
> > > > >> my expectation?
> > > > >
> > > > > It looks like a bug to me. Testing with this test app here, I can see
> > > > > the INIT_ACK being sent with a bunch of ipv4 addresses in it and
> > > > > that's unexpected for a v6only socket. As is, it's the server saying
> > > > > "I'm available at these other addresses too, but not."
> > > > I agree.
> > > Then we need a fix in sctp_bind_addrs_to_raw():
> > >
> > > @@ -238,6 +240,9 @@ union sctp_params sctp_bind_addrs_to_raw(const
> > > struct sctp_bind_addr *bp,
> > > addrparms = retval;
> > >
> > > list_for_each_entry(

Re: Strange problem with SCTP+IPv6

2020-06-23 Thread Corey Minyard
On Tue, Jun 23, 2020 at 06:13:30PM +0800, Xin Long wrote:
> On Tue, Jun 23, 2020 at 2:34 AM Michael Tuexen
>  wrote:
> >
> > > On 22. Jun 2020, at 20:32, Marcelo Ricardo Leitner 
> > >  wrote:
> > >
> > > On Mon, Jun 22, 2020 at 08:01:24PM +0200, Michael Tuexen wrote:
> > >>> On 22. Jun 2020, at 18:57, Corey Minyard  wrote:
> > >>>
> > >>> On Mon, Jun 22, 2020 at 08:01:23PM +0800, Xin Long wrote:
> > >>>> On Sun, Jun 21, 2020 at 11:56 PM Corey Minyard  wrote:
> > >>>>>
> > >>>>> I've stumbled upon a strange problem with SCTP and IPv6.  If I create 
> > >>>>> an
> > >>>>> sctp listening socket on :: and set the IPV6_V6ONLY socket option on 
> > >>>>> it,
> > >>>>> then I make a connection to it using ::1, the connection will drop 
> > >>>>> after
> > >>>>> 2.5 seconds with an ECONNRESET error.
> > >>>>>
> > >>>>> It only happens on SCTP, it doesn't have the issue if you connect to a
> > >>>>> full IPv6 address instead of ::1, and it doesn't happen if you don't
> > >>>>> set IPV6_V6ONLY.  I have verified current end of tree kernel.org.
> > >>>>> I tried on an ARM system and x86_64.
> > >>>>>
> > >>>>> I haven't dug into the kernel to see if I could find anything yet, 
> > >>>>> but I
> > >>>>> thought I would go ahead and report it.  I am attaching a reproducer.
> > >>>>> Basically, compile the following code:
> > >>>> The code only set IPV6_V6ONLY on server side, so the client side will
> > >>>> still bind all the local ipv4 addresses (as you didn't call bind() to
> > >>>> bind any specific addresses ). Then after the connection is created,
> > >>>> the client will send HB on the v4 paths to the server. The server
> > >>>> will abort the connection, as it can't support v4.
> > >>>>
> > >>>> So you can work around it by either:
> > >>>>
> > >>>> - set IPV6_V6ONLY on client side.
> > >>>>
> > >>>> or
> > >>>>
> > >>>> - bind to the specific v6 addresses on the client side.
> > >>>>
> > >>>> I don't see RFC said something about this.
> > >>>> So it may not be a good idea to change the current behaviour
> > >>>> to not establish the connection in this case, which may cause 
> > >>>> regression.
> > >>>
> > >>> Ok, I understand this.  It's a little strange, but I see why it works
> > >>> this way.
> > >> I don't. I would expect it to work as I described in my email.
> > >> Could someone explain me how and why it is behaving different from
> > >> my expectation?
> > >
> > > It looks like a bug to me. Testing with this test app here, I can see
> > > the INIT_ACK being sent with a bunch of ipv4 addresses in it and
> > > that's unexpected for a v6only socket. As is, it's the server saying
> > > "I'm available at these other addresses too, but not."
> > I agree.
> Then we need a fix in sctp_bind_addrs_to_raw():
> 
> @@ -238,6 +240,9 @@ union sctp_params sctp_bind_addrs_to_raw(const
> struct sctp_bind_addr *bp,
> addrparms = retval;
> 
> list_for_each_entry(addr, &bp->address_list, list) {
> +   if ((PF_INET6 == sk->sk_family) && inet_v6_ipv6only(sk) &&
> +   (AF_INET == addr->a.sa.sa_family))
> +   continue;

This does not compile in the latest mainline.  sk is not defined.
Also, if you could send a normal git patch, that would be easier to 
manage.

Thanks,

-corey

> af = sctp_get_af_specific(addr->a.v4.sin_family);
> len = af->to_addr_param(&addr->a, &rawaddr);
> memcpy(addrparms.v, &rawaddr, len);
> 
> >
> > Best regards
> > Michael
> > >
> > > Thanks,
> > > Marcelo
> > >
> > >>
> > >> Best regards
> > >> Michael
> > >>>
> > >>> Thanks,
> > >>>
> > >>> -corey
> > >>>
> > >>>>
> > >>>>>
> > >>>>> gcc -g -o sctptest -Wall sctptest.c
> > >>>>>
> > >>&

Re: Strange problem with SCTP+IPv6

2020-06-22 Thread Corey Minyard
On Mon, Jun 22, 2020 at 08:01:23PM +0800, Xin Long wrote:
> On Sun, Jun 21, 2020 at 11:56 PM Corey Minyard  wrote:
> >
> > I've stumbled upon a strange problem with SCTP and IPv6.  If I create an
> > sctp listening socket on :: and set the IPV6_V6ONLY socket option on it,
> > then I make a connection to it using ::1, the connection will drop after
> > 2.5 seconds with an ECONNRESET error.
> >
> > It only happens on SCTP, it doesn't have the issue if you connect to a
> > full IPv6 address instead of ::1, and it doesn't happen if you don't
> > set IPV6_V6ONLY.  I have verified current end of tree kernel.org.
> > I tried on an ARM system and x86_64.
> >
> > I haven't dug into the kernel to see if I could find anything yet, but I
> > thought I would go ahead and report it.  I am attaching a reproducer.
> > Basically, compile the following code:
> The code only set IPV6_V6ONLY on server side, so the client side will
> still bind all the local ipv4 addresses (as you didn't call bind() to
> bind any specific addresses ). Then after the connection is created,
> the client will send HB on the v4 paths to the server. The server
> will abort the connection, as it can't support v4.
> 
> So you can work around it by either:
> 
>   - set IPV6_V6ONLY on client side.
> 
> or
> 
>   - bind to the specific v6 addresses on the client side.
> 
> I don't see RFC said something about this.
> So it may not be a good idea to change the current behaviour
> to not establish the connection in this case, which may cause regression.

Ok, I understand this.  It's a little strange, but I see why it works
this way.

Thanks,

-corey
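
A minimal sketch of the first workaround listed above, setting
IPV6_V6ONLY on the client socket before connect().  set_v6only() is a
hypothetical helper name, and this has not been run against the SCTP
reproducer; it only shows where the option would be set.

```c
#include <netinet/in.h>   /* IPPROTO_IPV6, IPV6_V6ONLY */
#include <sys/socket.h>

/*
 * Hypothetical helper: restrict an AF_INET6 socket to IPv6 only.
 * Call it on the client socket after socket() and before connect().
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 */
static int set_v6only(int fd)
{
    int optval = 1;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
                      &optval, sizeof(optval));
}
```

In the reproducer's do_client() this call would sit between socket()
and connect().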

> 
> >
> >   gcc -g -o sctptest -Wall sctptest.c
> >
> > and run it in one window as a server:
> >
> >   ./sctptest a
> >
> > (Pass in any option to be the server) and run the following in another
> > window as the client:
> >
> >   ./sctptest
> >
> > It disconnects after about 2.5 seconds.  If it works, it should just sit
> > there forever.
> >
> > -corey
> >
> >
> > #include <stdio.h>
> > #include <stdbool.h>
> > #include <string.h>
> > #include <unistd.h>
> > #include <sys/types.h>
> > #include <sys/socket.h>
> > #include <netinet/in.h>
> > #include <netdb.h>
> >
> > static int
> > getaddr(const char *addr, const char *port, bool listen,
> > struct addrinfo **rai)
> > {
> > struct addrinfo *ai, hints;
> >
> > memset(&hints, 0, sizeof(hints));
> > hints.ai_flags = AI_ADDRCONFIG;
> > if (listen)
> > hints.ai_flags |= AI_PASSIVE;
> > hints.ai_family = AF_UNSPEC;
> > hints.ai_socktype = SOCK_STREAM;
> > hints.ai_protocol = IPPROTO_SCTP;
> > if (getaddrinfo(addr, port, &hints, &ai)) {
> > perror("getaddrinfo");
> > return -1;
> > }
> >
> > *rai = ai;
> > return 0;
> > }
> >
> > static int
> > waitread(int s)
> > {
> > char data[1];
> > ssize_t rv;
> >
> > rv = read(s, data, sizeof(data));
> > if (rv == -1) {
> > perror("read");
> > return -1;
> > }
> > printf("Read %d bytes\n", (int) rv);
> > return 0;
> > }
> >
> > static int
> > do_server(void)
> > {
> > int err, ls, s, optval;
> > struct addrinfo *ai;
> >
> > printf("Server\n");
> >
> > err = getaddr("::", "3023", true, &ai);
> > if (err)
> > return err;
> >
> > ls = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
> > if (ls == -1) {
> > perror("socket");
> > return -1;
> > }
> >
> > optval = 1;
> > if (setsockopt(ls, SOL_SOCKET, SO_REUSEADDR,
> >    (void *)&optval, sizeof(optval)) == -1) {
> > perror("setsockopt reuseaddr");
> > return -1;
> > }
> >
> > /* Comment this out and it will work. */
> > if (setsockopt(ls, IPPROTO_IPV6, IPV6_V6ONLY, &optval,
> >sizeof(optval)) == -1) {
> > perror("setsockopt ipv6 only");
> > return -1;
> > }
> >
> > err = bind(ls, ai->ai_addr, ai->ai_addrlen);
> > if (err == -1) {
> > perror("bind");
> > return -1;
> > }
> >
> > err = listen(ls, 5);
> > if (err == -1) {
> > perror("

Strange problem with SCTP+IPv6

2020-06-21 Thread Corey Minyard
I've stumbled upon a strange problem with SCTP and IPv6.  If I create an
sctp listening socket on :: and set the IPV6_V6ONLY socket option on it,
then I make a connection to it using ::1, the connection will drop after
2.5 seconds with an ECONNRESET error.

It only happens with SCTP; it doesn't happen if you connect to a full
IPv6 address instead of ::1, and it doesn't happen if you don't set
IPV6_V6ONLY.  I have verified this on the current end-of-tree kernel
from kernel.org, on both an ARM system and x86_64.

I haven't dug into the kernel to see if I could find anything yet, but I
thought I would go ahead and report it.  I am attaching a reproducer.
Basically, compile the following code:

  gcc -g -o sctptest -Wall sctptest.c

and run it in one window as a server:

  ./sctptest a

(Pass in any option to be the server) and run the following in another
window as the client:

  ./sctptest

It disconnects after about 2.5 seconds.  If it works, it should just sit
there forever.

-corey


#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

static int
getaddr(const char *addr, const char *port, bool listen,
        struct addrinfo **rai)
{
    struct addrinfo *ai, hints;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_ADDRCONFIG;
    if (listen)
        hints.ai_flags |= AI_PASSIVE;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = IPPROTO_SCTP;
    if (getaddrinfo(addr, port, &hints, &ai)) {
        perror("getaddrinfo");
        return -1;
    }

    *rai = ai;
    return 0;
}

static int
waitread(int s)
{
    char data[1];
    ssize_t rv;

    rv = read(s, data, sizeof(data));
    if (rv == -1) {
        perror("read");
        return -1;
    }
    printf("Read %d bytes\n", (int) rv);
    return 0;
}

static int
do_server(void)
{
    int err, ls, s, optval;
    struct addrinfo *ai;

    printf("Server\n");

    err = getaddr("::", "3023", true, &ai);
    if (err)
        return err;

    ls = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (ls == -1) {
        perror("socket");
        return -1;
    }

    optval = 1;
    if (setsockopt(ls, SOL_SOCKET, SO_REUSEADDR,
                   (void *)&optval, sizeof(optval)) == -1) {
        perror("setsockopt reuseaddr");
        return -1;
    }

    /* Comment this out and it will work. */
    if (setsockopt(ls, IPPROTO_IPV6, IPV6_V6ONLY, &optval,
                   sizeof(optval)) == -1) {
        perror("setsockopt ipv6 only");
        return -1;
    }

    err = bind(ls, ai->ai_addr, ai->ai_addrlen);
    if (err == -1) {
        perror("bind");
        return -1;
    }

    err = listen(ls, 5);
    if (err == -1) {
        perror("listen");
        return -1;
    }

    s = accept(ls, NULL, NULL);
    if (s == -1) {
        perror("accept");
        return -1;
    }

    close(ls);

    err = waitread(s);
    close(s);
    return err;
}

static int
do_client(void)
{
    int err, s;
    struct addrinfo *ai;

    printf("Client\n");

    err = getaddr("::1", "3023", false, &ai);
    if (err)
        return err;

    s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (s == -1) {
        perror("socket");
        return -1;
    }

    err = connect(s, ai->ai_addr, ai->ai_addrlen);
    if (err == -1) {
        perror("connect");
        return -1;
    }

    err = waitread(s);
    close(s);
    return err;
}

int
main(int argc, char *argv[])
{
    int err;

    if (argc > 1)
        err = do_server();
    else
        err = do_client();
    return !!err;
}



Re: [PATCH] ipmi: code cleanup and prevent potential issue.

2020-06-09 Thread Corey Minyard
On Tue, Jun 09, 2020 at 01:04:10AM -0500, wu000...@umn.edu wrote:
> From: Qiushi Wu 
> 
> All the previous get/put operations against intf->refcount are
> inside the mutex. Thus, put the last kref_put() also inside mutex
> to make sure get/put functions execute in order and prevent the
> potential race condition.

No, this can result in a crash.  intf and intf->bmc_reg_mutex will
be freed by intf_free.  In fact, every call to kref_put() on intf
better be outside any mutex/lock in intf.  If you saw any, that
is a bug, please report that.  kref_get() is fine inside the
mutex.

Plus, this is not a race condition.  get/put is atomic.

-corey

> 
> Signed-off-by: Qiushi Wu 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index e1b22fe0916c..d34343e34272 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -2583,10 +2583,11 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, 
> struct bmc_device *bmc,
>   *guid =  bmc->guid;
>   }
>  
> + kref_put(&intf->refcount, intf_free);
> +
>   mutex_unlock(&bmc->dyn_mutex);
>   mutex_unlock(&intf->bmc_reg_mutex);
>  
> - kref_put(&intf->refcount, intf_free);
>   return rv;
>  }
>  
> -- 
> 2.17.1
> 
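
The objection in the reply above can be sketched in userspace: when the
lock lives inside the refcounted object, the final put frees the lock
too, so the put has to come after the unlock.  This is an illustration
only, using pthreads and C11 atomics as stand-ins for the kernel mutex
and kref; struct obj and the obj_*() helpers are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-in for an object whose lock lives inside it. */
struct obj {
    atomic_int refcount;
    pthread_mutex_t lock;
    int value;
};

static struct obj *obj_new(int value)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    atomic_init(&o->refcount, 1);
    pthread_mutex_init(&o->lock, NULL);
    o->value = value;
    return o;
}

static void obj_get(struct obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

/* Drops one reference; frees the object and returns 1 on the last put. */
static int obj_put(struct obj *o)
{
    if (atomic_fetch_sub(&o->refcount, 1) != 1)
        return 0;
    pthread_mutex_destroy(&o->lock);
    free(o);
    return 1;
}

static int obj_read(struct obj *o)
{
    int v;

    obj_get(o);
    pthread_mutex_lock(&o->lock);
    v = o->value;
    pthread_mutex_unlock(&o->lock);
    /* Put only after unlocking: the last put frees o, including o->lock. */
    obj_put(o);
    return v;
}
```

Moving obj_put() above the unlock would leave obj_read() unlocking a
mutex in freed memory whenever it happened to hold the last reference.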


[GIT PULL] IPMI bug fixes for 5.8

2020-06-04 Thread Corey Minyard
The following changes since commit b9bbe6ed63b2b9f2c9ee5cbd0f2c946a2723f4ce:

  Linux 5.7-rc6 (2020-05-17 16:48:37 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.8-1

for you to fetch changes up to 2a556ce779e39b15cbb74e896ca640e86baeb1a1:

  ipmi:ssif: Remove dynamic platform device handing (2020-05-27 18:25:56 -0500)


IPMI update for 5.8

A few small fixes for things, nothing earth shattering.

-corey


Andy Shevchenko (1):
  ipmi: Replace guid_copy() with import_guid() where it makes sense

Corey Minyard (2):
  Try to load acpi_ipmi when an SSIF ACPI IPMI interface is added
  ipmi:ssif: Remove dynamic platform device handing

Feng Tang (1):
  ipmi: use vzalloc instead of kmalloc for user creation

Stuart Hayes (1):
  ipmi_si: Load acpi_ipmi when ACPI IPMI interface added

Tang Bin (3):
  ipmi:bt-bmc: Avoid unnecessary check
  ipmi:bt-bmc: Fix some format issue of the code
  ipmi:bt-bmc: Fix error handling and status check

 drivers/char/ipmi/bt-bmc.c   | 21 +
 drivers/char/ipmi/ipmi_msghandler.c  |  9 +
 drivers/char/ipmi/ipmi_si_platform.c |  2 ++
 drivers/char/ipmi/ipmi_ssif.c| 24 ++--
 4 files changed, 18 insertions(+), 38 deletions(-)



[GIT PULL] IPMI second update for 5.7

2020-05-17 Thread Corey Minyard
The following changes since commit ae83d0b416db002fe95601e7f97f64b59514d936:

  Linux 5.7-rc2 (2020-04-19 14:35:30 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.7-2

for you to fetch changes up to 653d374771601a345296dd2904a10e6e479ad866:

  char: ipmi: convert to use i2c_new_client_device() (2020-05-14 15:37:31 -0500)


Convert i2c_new_device() to i2c_new_client_device()

Wolfram Sang has asked to have this included in 5.7 so the deprecated
API can be removed next release.  There should be no functional
difference.

I think that this entire section of code can be removed; it is leftover
from other things that have since changed, but this is the safer thing
to do for now.  The full removal can happen next release.

Thanks,

-corey


Wolfram Sang (1):
  char: ipmi: convert to use i2c_new_client_device()

 drivers/char/ipmi/ipmi_ssif.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)



Re: [PATCH 1/1] char: ipmi: convert to use i2c_new_client_device()

2020-05-13 Thread Corey Minyard
On Wed, May 13, 2020 at 09:10:04AM +0200, Wolfram Sang wrote:
> 
> > > - addr_info->added_client = i2c_new_device(to_i2c_adapter(adev),
> > > -  &addr_info->binfo);
> > > + addr_info->added_client = i2c_new_client_device(to_i2c_adapter(adev),
> > > + &addr_info->binfo);
> > 
> > i2c_new_client_device returns an ERR_PTR, not NULL on error.  So this
> 
> Yes, this is the main motivation for the new API.
> 
> > needs some more work.  I'll send something out soon.
> 
> Why does it need that work? 'added_client' is only used with
> i2c_unregister_device() which has been fixed to handle ERR_PTR as well.
> Or am I missing something?
> 

No, I didn't look to see if i2c_unregister_device could handle that.

-corey
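
For readers unfamiliar with the ERR_PTR convention discussed in this
thread, here is a small userspace imitation.  It is a sketch only:
err_ptr(), is_err() and ptr_err() are stand-ins for the kernel's
ERR_PTR(), IS_ERR() and PTR_ERR(), and new_client() is a hypothetical
function, not the i2c API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if ptr lies in the top MAX_ERRNO addresses, i.e. encodes an errno. */
static bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Hypothetical API that reports failure via an error pointer, never NULL. */
static void *new_client(bool fail)
{
    static int client;

    return fail ? err_ptr(-ENODEV) : &client;
}
```

The failing return value is non-NULL, so a caller that only tests for
NULL would treat the error as a valid pointer.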


Re: [PATCH] char: ipmi: convert to use i2c_new_client_device()

2020-05-13 Thread Corey Minyard
On Wed, May 13, 2020 at 10:37:46AM +0200, Wolfram Sang wrote:
> On Tue, May 12, 2020 at 04:45:32PM -0500, miny...@acm.org wrote:
> > From: Wolfram Sang 
> > 
> > Move away from the deprecated API.
> > 
> > Based on a patch by Wolfram Sang .
> > 
> > Signed-off-by: Corey Minyard 
> > ---
> > I think this works.
> 
> Yes, we can do it like this (despite the question from earlier if it is
> really needed). I fixed other drivers using this pattern, too.

I was wondering whether this is really needed, too, but I'm not 100%
sure it can be removed in all cases.  This is the safer route.

> 
> As Stephen Rothwell pointed out, you either need to remove my "From:" or
> add my SoB. I am fine with both.

It was enough of a rewrite that you as the author didn't seem right.
I've fixed the From line, sorry about that.

-corey

> 
> Thanks,
> 
>Wolfram
> 




Re: linux-next: Signed-off-by missing for commit in the ipmi tree

2020-05-13 Thread Corey Minyard
On Wed, May 13, 2020 at 10:30:34AM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Commit
> 
>   73d0824e48eb ("char: ipmi: convert to use i2c_new_client_device()")
> 
> is missing a Signed-off-by from its author.

Fixed, thanks.

-corey

> 
> -- 
> Cheers,
> Stephen Rothwell




Re: [PATCH 1/1] char: ipmi: convert to use i2c_new_client_device()

2020-05-12 Thread Corey Minyard
On Thu, Mar 26, 2020 at 10:09:58PM +0100, Wolfram Sang wrote:
> Move away from the deprecated API.

Well, I should have looked a little closer first... comment inline

> 
> Signed-off-by: Wolfram Sang 
> ---
>  drivers/char/ipmi/ipmi_ssif.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 8ac390c2b514..2791b799e33d 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -1945,8 +1945,8 @@ static int ssif_adapter_handler(struct device *adev, void *opaque)
>   if (adev->type != &i2c_adapter_type)
>   return 0;
>  
> - addr_info->added_client = i2c_new_device(to_i2c_adapter(adev),
> -  &addr_info->binfo);
> + addr_info->added_client = i2c_new_client_device(to_i2c_adapter(adev),
> + &addr_info->binfo);

i2c_new_client_device returns an ERR_PTR, not NULL on error.  So this
needs some more work.  I'll send something out soon.

-corey

>  
>   if (!addr_info->adapter_name)
>   return 1; /* Only try the first I2C adapter by default. */
> -- 
> 2.20.1
> 


Re: [PATCH 1/1] char: ipmi: convert to use i2c_new_client_device()

2020-05-12 Thread Corey Minyard
On Thu, Mar 26, 2020 at 10:09:58PM +0100, Wolfram Sang wrote:
> Move away from the deprecated API.
> 
> Signed-off-by: Wolfram Sang 

Ok by me.

Acked-by: Corey Minyard 

Do you want me to take this, or is this part of something else?  I can
submit it if you like.

-corey

> ---
>  drivers/char/ipmi/ipmi_ssif.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 8ac390c2b514..2791b799e33d 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -1945,8 +1945,8 @@ static int ssif_adapter_handler(struct device *adev, void *opaque)
>   if (adev->type != &i2c_adapter_type)
>   return 0;
>  
> - addr_info->added_client = i2c_new_device(to_i2c_adapter(adev),
> -  &addr_info->binfo);
> + addr_info->added_client = i2c_new_client_device(to_i2c_adapter(adev),
> + &addr_info->binfo);
>  
>   if (!addr_info->adapter_name)
>   return 1; /* Only try the first I2C adapter by default. */
> -- 
> 2.20.1
> 


Re: [PATCH v3] ipmi:bt-bmc: Fix error handling and status check

2020-05-05 Thread Corey Minyard
On Tue, May 05, 2020 at 06:29:06PM +0800, Tang Bin wrote:
> If the function platform_get_irq() failed, the negative value
> returned will not be detected here. So fix error handling in
> bt_bmc_config_irq(). And in the function bt_bmc_probe(),
> when get irq failed, it will print error message. So use
> platform_get_irq_optional() to simplify code. Finally in the
> function bt_bmc_remove() should make the right status check
> if get irq failed.

Ok, this is included in my tree.

Thanks,

-corey

> 
> Signed-off-by: Shengju Zhang 
> Signed-off-by: Tang Bin 
> ---
> Changes from v2
>  - fix the commit message and the code of status check
> Changes from v1
>  - fix the code of status check
> ---
>  drivers/char/ipmi/bt-bmc.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
> index d36aeacb2..88ee54767 100644
> --- a/drivers/char/ipmi/bt-bmc.c
> +++ b/drivers/char/ipmi/bt-bmc.c
> @@ -399,9 +399,9 @@ static int bt_bmc_config_irq(struct bt_bmc *bt_bmc,
>   struct device *dev = &pdev->dev;
>   int rc;
>  
> - bt_bmc->irq = platform_get_irq(pdev, 0);
> - if (!bt_bmc->irq)
> - return -ENODEV;
> + bt_bmc->irq = platform_get_irq_optional(pdev, 0);
> + if (bt_bmc->irq < 0)
> + return bt_bmc->irq;
>  
>   rc = devm_request_irq(dev, bt_bmc->irq, bt_bmc_irq, IRQF_SHARED,
> DEVICE_NAME, bt_bmc);
> @@ -477,7 +477,7 @@ static int bt_bmc_probe(struct platform_device *pdev)
>  
>   bt_bmc_config_irq(bt_bmc, pdev);
>  
> - if (bt_bmc->irq) {
> + if (bt_bmc->irq >= 0) {
>   dev_info(dev, "Using IRQ %d\n", bt_bmc->irq);
>   } else {
>   dev_info(dev, "No IRQ; using timer\n");
> @@ -503,7 +503,7 @@ static int bt_bmc_remove(struct platform_device *pdev)
>   struct bt_bmc *bt_bmc = dev_get_drvdata(&pdev->dev);
>  
>   misc_deregister(&bt_bmc->miscdev);
> - if (!bt_bmc->irq)
> + if (bt_bmc->irq < 0)
>   del_timer_sync(&bt_bmc->poll_timer);
>   return 0;
>  }
> -- 
> 2.20.1.windows.1
> 
> 
> 


Re: [PATCH v2] ipmi:bt-bmc: Fix error handling and status check

2020-05-04 Thread Corey Minyard
On Sun, Apr 19, 2020 at 02:29:26PM +0800, Tang Bin wrote:
> Hi, Corey:
> 
> On 2020/4/18 21:49, Corey Minyard wrote:
> > On Sat, Apr 18, 2020 at 04:02:29PM +0800, Tang Bin wrote:
> > > If the function platform_get_irq() failed, the negative
> > > value returned will not be detected here. So fix error
> > > handling in bt_bmc_config_irq(). And if devm_request_irq()
> > > failed, 'bt_bmc->irq' is assigned to zero maybe redundant,
> > > it may be more suitable for using the correct negative values
> > > to make the status check in the function bt_bmc_remove().
> > You need to mention changing platform_get_irq to
> > platform_get_irq_optional in the header.
> > 
> > Another comment inline below.
> > 
> > Otherwise, this looks good.
> 
> Got it. The v3 will be as follows:
> 
> If the function platform_get_irq() failed, the negative value
> 
> returned will not be detected here. So fix error handling in
> 
> bt_bmc_config_irq(). And in the function bt_bmc_probe(),
> 
> when get irq failed, it will print error message. So use
> 
> platform_get_irq_optional() to simplify code. Finally in the
> 
> function bt_bmc_remove() should make the right status
> 
> check if get irq failed.
> 
> > 
> > You need to set this to rc.  Otherwise it will remain the interrupt
> > number assigned by platform_get_irq_optional().
> 
> Yes, I think you are right. I'm not as considerate as you. Thank you for
> your instruction.
> 
> When get irq failed, the 'bt_bmc->irq' is negative; when request irq failed,
> the 'bt_bmc->irq = 0' is right.
> 
> So 'bt_bmc->irq <= 0' means irq failed.

Sorry, I missed your question here and was waiting for v3.

Well, we want bt_bmc->irq < 0 to mean the irq request failed.

> 
> Now let me rearrange the logic here:
> 
>     In bt_bmc_probe():
> 
>         bt_bmc_config_irq(bt_bmc, pdev);
> 
>         if (bt_bmc->irq > 0) {

Should be >= 0.

> 
>         }
> 
> 
>     In bt_bmc_remove():
> 
>         if (bt_bmc->irq <= 0)
>             del_timer_sync(&bt_bmc->poll_timer);

Should be < 0.  But other than that, I think it's correct.

-corey

> 
> 
> If you think this logic is correct, I'll submit v3.
> 
> Thanks,
> 
> Tang Bin
> 
> > 
> > 
> > 
> > 
> 
> 
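
The convention agreed on in this thread (a negative irq means "no usable
interrupt, poll with the timer"; probe tests >= 0, remove tests < 0) can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct fake_bt_bmc, config_irq(), probe() and
remove_stops_timer() are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real struct bt_bmc. */
struct fake_bt_bmc {
	int irq;             /* >= 0: interrupt in use, < 0: negative errno */
	bool timer_running;  /* polling fallback */
};

/*
 * Models the irq setup step. get_rc is what the "get irq" call returned
 * (an irq number or a negative errno); request_rc is what the
 * "request irq" call returned (0 or a negative errno).
 */
static int config_irq(struct fake_bt_bmc *bmc, int get_rc, int request_rc)
{
	bmc->irq = get_rc;
	if (bmc->irq < 0)
		return bmc->irq;

	if (request_rc < 0) {
		/* Store the error so later checks see a negative value. */
		bmc->irq = request_rc;
		return request_rc;
	}
	return 0;
}

/* Models probe: fall back to the polling timer only when irq < 0. */
static void probe(struct fake_bt_bmc *bmc, int get_rc, int request_rc)
{
	config_irq(bmc, get_rc, request_rc);
	bmc->timer_running = (bmc->irq < 0);
}

/* Models remove: stop the timer only when polling was in use. */
static bool remove_stops_timer(struct fake_bt_bmc *bmc)
{
	assert(bmc != NULL);
	if (bmc->irq < 0) {
		bmc->timer_running = false;
		return true;
	}
	return false;
}
```

With a single sentinel (negative means failure) the probe and remove paths
cannot disagree about whether the timer was started, which is the problem
the mixed "0 or negative" scheme had.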


Re: [PATCH 04/14] docs: move IPMI.txt to the driver API book

2020-05-01 Thread Corey Minyard
On Fri, May 01, 2020 at 05:37:48PM +0200, Mauro Carvalho Chehab wrote:
> The IPMI is under drivers/char. This doc describes the kAPI
> part of the IPMI (mainly).
> 
> So, move it to the driver-api directory and add it to the
> corresponding index.rst file.
> 
> Signed-off-by: Mauro Carvalho Chehab 

This is fine with me.

Acked-by: Corey Minyard 

> ---
>  Documentation/driver-api/index.rst  | 1 +
>  Documentation/{IPMI.txt => driver-api/ipmi.rst} | 0
>  2 files changed, 1 insertion(+)
>  rename Documentation/{IPMI.txt => driver-api/ipmi.rst} (100%)
> 
> diff --git a/Documentation/driver-api/index.rst 
> b/Documentation/driver-api/index.rst
> index dcc47c029f8e..6567187e7687 100644
> --- a/Documentation/driver-api/index.rst
> +++ b/Documentation/driver-api/index.rst
> @@ -39,6 +39,7 @@ available subsections can be seen below.
> spi
> i2c
> ipmb
> +   ipmi
> i3c/index
> interconnect
> devfreq
> diff --git a/Documentation/IPMI.txt b/Documentation/driver-api/ipmi.rst
> similarity index 100%
> rename from Documentation/IPMI.txt
> rename to Documentation/driver-api/ipmi.rst
> -- 
> 2.25.4
> 


Re: [PATCH] ipmi: Don't allow device module unload when in use

2019-10-22 Thread Corey Minyard
On Tue, Oct 22, 2019 at 10:29:12AM -0400, Tony Camuso wrote:
> Corey,
> 
> Testing shows that this patch works as expected.

Thanks, I'll add a Tested-by for you.  It's queued for the next merge
window.

-corey

> 
> Regards,
> Tony
> 
> 
> On 10/14/19 11:46 AM, miny...@acm.org wrote:
> > From: Corey Minyard 
> > 
> > If something has the IPMI driver open, don't allow the device
> > module to be unloaded.  Before it would unload and the user would
> > get errors on use.
> > 
> > This change is made on user request, and it makes it consistent
> > with the I2C driver, which has the same behavior.
> > 
> > It does change things a little bit with respect to kernel users.
> > If the ACPI or IPMI watchdog (or any other kernel user) has
> > created a user, then the device module cannot be unloaded.  Before
> > it could be unloaded,
> > 
> > This does not affect hot-plug.  If the device goes away (it's on
> > something removable that is removed or is hot-removed via sysfs)
> > then it still behaves as it did before.
> > 
> > Reported-by: tony camuso 
> > Signed-off-by: Corey Minyard 
> > ---
> > Tony, here is a suggested change for this.  Can you look it over and
> > see if it looks ok?
> > 
> > Thanks,
> > 
> > -corey
> > 
> >   drivers/char/ipmi/ipmi_msghandler.c | 23 ---
> >   include/linux/ipmi_smi.h| 12 
> >   2 files changed, 24 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> > b/drivers/char/ipmi/ipmi_msghandler.c
> > index 2aab80e19ae0..15680de18625 100644
> > --- a/drivers/char/ipmi/ipmi_msghandler.c
> > +++ b/drivers/char/ipmi/ipmi_msghandler.c
> > @@ -448,6 +448,8 @@ enum ipmi_stat_indexes {
> >   #define IPMI_IPMB_NUM_SEQ 64
> >   struct ipmi_smi {
> > +   struct module *owner;
> > +
> > /* What interface number are we? */
> > int intf_num;
> > @@ -1220,6 +1222,11 @@ int ipmi_create_user(unsigned int  if_num,
> > if (rv)
> > goto out_kfree;
> > +   if (!try_module_get(intf->owner)) {
> > +   rv = -ENODEV;
> > +   goto out_kfree;
> > +   }
> > +   
> > /* Note that each existing user holds a refcount to the interface. */
> > kref_get(&intf->refcount);
> > @@ -1349,6 +1356,7 @@ static void _ipmi_destroy_user(struct ipmi_user *user)
> > }
> > kref_put(&intf->refcount, intf_free);
> > +   module_put(intf->owner);
> >   }
> >   int ipmi_destroy_user(struct ipmi_user *user)
> > @@ -2459,7 +2467,7 @@ static int __get_device_id(struct ipmi_smi *intf, 
> > struct bmc_device *bmc)
> >* been recently fetched, this will just use the cached data.  Otherwise
> >* it will run a new fetch.
> >*
> > - * Except for the first time this is called (in ipmi_register_smi()),
> > + * Except for the first time this is called (in ipmi_add_smi()),
> >* this will always return good data;
> >*/
> >   static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device 
> > *bmc,
> > @@ -3377,10 +3385,11 @@ static void redo_bmc_reg(struct work_struct *work)
> > kref_put(&intf->refcount, intf_free);
> >   }
> > -int ipmi_register_smi(const struct ipmi_smi_handlers *handlers,
> > - void *send_info,
> > - struct device*si_dev,
> > - unsigned charslave_addr)
> > +int ipmi_add_smi(struct module *owner,
> > +const struct ipmi_smi_handlers *handlers,
> > +void  *send_info,
> > +struct device *si_dev,
> > +unsigned char slave_addr)
> >   {
> > int  i, j;
> > int  rv;
> > @@ -3406,7 +3415,7 @@ int ipmi_register_smi(const struct ipmi_smi_handlers 
> > *handlers,
> > return rv;
> > }
> > -
> > +   intf->owner = owner;
> > intf->bmc = &intf->tmp_bmc;
> > INIT_LIST_HEAD(&intf->bmc->intfs);
> > mutex_init(&intf->bmc->dyn_mutex);
> > @@ -3514,7 +3523,7 @@ int ipmi_register_smi(const struct ipmi_smi_handlers 
> > *handlers,
> > return rv;
> >   }
> > -EXPORT_SYMBOL(ipmi_register_smi);
> > +EXPORT_SYMBOL(ipmi_add_smi);
> >   static void deliver_smi_err_response(struct ipmi_smi *intf,
> >  struct ipmi_smi_msg *msg,
> > diff --git a/include/linux/ipmi_smi.h b/incl
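
The effect of pairing try_module_get() in ipmi_create_user() with
module_put() in _ipmi_destroy_user() can be modelled in userspace. The
sketch below is an assumption-laden illustration: struct fake_module and
the fake_* helpers are hypothetical and only mimic the behaviour described
in the patch text, not the kernel's module loader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a module; not the kernel's struct module. */
struct fake_module {
	int refcount;   /* number of users pinning the module */
	bool going;     /* set once unload has started */
};

/* Models try_module_get(): fails once the module is being unloaded. */
static bool fake_try_module_get(struct fake_module *m)
{
	assert(m != NULL);
	if (m->going)
		return false;
	m->refcount++;
	return true;
}

/* Models module_put(): drops one reference. */
static void fake_module_put(struct fake_module *m)
{
	assert(m != NULL && m->refcount > 0);
	m->refcount--;
}

/* Models a non-forced unload: refused while anyone holds a reference. */
static bool fake_unload(struct fake_module *m)
{
	assert(m != NULL);
	if (m->refcount > 0)
		return false;
	m->going = true;
	return true;
}
```

Each created user holds one reference, so the unload is refused until the
last user is destroyed. Once the unload has begun, new users are turned
away, which corresponds to the -ENODEV return in the patch.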

Re: [PATCH] ipmi: Fix memory leak in __ipmi_bmc_register

2019-10-22 Thread Corey Minyard
On Mon, Oct 21, 2019 at 03:06:48PM -0500, Navid Emamdoost wrote:
> In the implementation of __ipmi_bmc_register() the allocated memory for
> bmc should be released in case ida_simple_get() fails.

Thanks, queued for next merge window.

-corey

> 
> Fixes: 68e7e50f195f ("ipmi: Don't use BMC product/dev ids in the BMC name")
> Signed-off-by: Navid Emamdoost 
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 2aab80e19ae0..e4928ed46396 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3031,8 +3031,11 @@ static int __ipmi_bmc_register(struct ipmi_smi *intf,
>   bmc->pdev.name = "ipmi_bmc";
>  
>   rv = ida_simple_get(&ipmi_bmc_ida, 0, 0, GFP_KERNEL);
> - if (rv < 0)
> + if (rv < 0) {
> + kfree(bmc);
>   goto out;
> + }
> +
>   bmc->pdev.dev.driver = &ipmidriver.driver;
>   bmc->pdev.id = rv;
>   bmc->pdev.dev.release = release_bmc_device;
> -- 
> 2.17.1
> 
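
The leak pattern this patch fixes (an allocation followed by an early exit
that skips the free) can be shown with a small userspace model. Everything
here is hypothetical: struct fake_bmc, the counting allocator and
fake_bmc_register() only mirror the shape of the fix, with id_rc standing
in for the id allocator's return value.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* allocations not yet freed */

/* Allocator wrappers that count outstanding allocations. */
static void *counted_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void counted_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct fake_bmc { int id; };

/*
 * Models the register path: allocate, then obtain an id.
 * id_rc is an id >= 0, or a negative errno on failure.
 */
static int fake_bmc_register(int id_rc, struct fake_bmc **out)
{
	struct fake_bmc *bmc;

	assert(out != NULL);
	bmc = counted_alloc(sizeof(*bmc));
	if (!bmc)
		return -1;

	if (id_rc < 0) {
		/* Without this free, the early exit would leak bmc. */
		counted_free(bmc);
		return id_rc;
	}

	bmc->id = id_rc;
	*out = bmc;
	return 0;
}
```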


Re: [PATCH] ipmi: Don't allow device module unload when in use

2019-10-16 Thread Corey Minyard
On Wed, Oct 16, 2019 at 03:25:56PM -0400, Tony Camuso wrote:
> On 10/14/19 11:46 AM, miny...@acm.org wrote:
> > From: Corey Minyard 
> > 
> > If something has the IPMI driver open, don't allow the device
> > module to be unloaded.  Before it would unload and the user would
> > get errors on use.
> > 
> > This change is made on user request, and it makes it consistent
> > with the I2C driver, which has the same behavior.
> > 
> > It does change things a little bit with respect to kernel users.
> > If the ACPI or IPMI watchdog (or any other kernel user) has
> > created a user, then the device module cannot be unloaded.  Before
> > it could be unloaded.
> > 
> > This does not affect hot-plug.  If the device goes away (it's on
> > something removable that is removed or is hot-removed via sysfs)
> > then it still behaves as it did before.
> > 
> > Reported-by: tony camuso 
> > Signed-off-by: Corey Minyard 
> > ---
> > Tony, here is a suggested change for this.  Can you look it over and
> > see if it looks ok?
> > 
> > Thanks,
> > 
> > -corey
> > 
> >   drivers/char/ipmi/ipmi_msghandler.c | 23 ---
> >   include/linux/ipmi_smi.h| 12 
> >   2 files changed, 24 insertions(+), 11 deletions(-)
> 
> Hi Corey.
> 
> You changed ipmi_register_smi to ipmi_add_smi in ipmi_msghandler, but you
> did not change it where it is actually called.
> 
> # grep ipmi_register_smi drivers/char/ipmi/*.c
> drivers/char/ipmi/ipmi_powernv.c: rc = ipmi_register_smi(&ipmi_powernv_smi_handlers, ipmi, dev, 0);
> drivers/char/ipmi/ipmi_si_intf.c: rv = ipmi_register_smi(&handlers,
> drivers/char/ipmi/ipmi_ssif.c:rv = ipmi_register_smi(&ssif_info->handlers,
> 
> Is there a reason for changing the interface name? Is this something
> that I could do instead of troubling you with it?

I don't understand.  You don't say that anything went wrong, you just
referenced something I changed.

I changed the name so I could create a macro with that name to pass in
the module name.  Pretty standard to do in the kernel.  Is there a
compile or runtime issue?

-corey

> 
> Regards,
> Tony
> 
> 
> > 
> > diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> > b/drivers/char/ipmi/ipmi_msghandler.c
> > index 2aab80e19ae0..15680de18625 100644
> > --- a/drivers/char/ipmi/ipmi_msghandler.c
> > +++ b/drivers/char/ipmi/ipmi_msghandler.c
> > @@ -448,6 +448,8 @@ enum ipmi_stat_indexes {
> >   #define IPMI_IPMB_NUM_SEQ 64
> >   struct ipmi_smi {
> > +   struct module *owner;
> > +
> > /* What interface number are we? */
> > int intf_num;
> > @@ -1220,6 +1222,11 @@ int ipmi_create_user(unsigned int  if_num,
> > if (rv)
> > goto out_kfree;
> > +   if (!try_module_get(intf->owner)) {
> > +   rv = -ENODEV;
> > +   goto out_kfree;
> > +   }
> > +   
> > /* Note that each existing user holds a refcount to the interface. */
> > kref_get(&intf->refcount);
> > @@ -1349,6 +1356,7 @@ static void _ipmi_destroy_user(struct ipmi_user *user)
> > }
> > kref_put(&intf->refcount, intf_free);
> > +   module_put(intf->owner);
> >   }
> >   int ipmi_destroy_user(struct ipmi_user *user)
> > @@ -2459,7 +2467,7 @@ static int __get_device_id(struct ipmi_smi *intf, 
> > struct bmc_device *bmc)
> >* been recently fetched, this will just use the cached data.  Otherwise
> >* it will run a new fetch.
> >*
> > - * Except for the first time this is called (in ipmi_register_smi()),
> > + * Except for the first time this is called (in ipmi_add_smi()),
> >* this will always return good data;
> >*/
> >   static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device 
> > *bmc,
> > @@ -3377,10 +3385,11 @@ static void redo_bmc_reg(struct work_struct *work)
> > kref_put(&intf->refcount, intf_free);
> >   }
> > -int ipmi_register_smi(const struct ipmi_smi_handlers *handlers,
> > - void *send_info,
> > - struct device*si_dev,
> > - unsigned charslave_addr)
> > +int ipmi_add_smi(struct module *owner,
> > +const struct ipmi_smi_handlers *handlers,
> > +void  *send_info,
> > +struct device *si_dev,
> > +unsigned char slave_addr)
> >   {
> > int  i, j;
> > int  rv;
> > @@ -3406,7 +3415,7
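
The reply above explains the rename: the function takes the owning module
explicitly, and a macro with the old name supplies it, so existing callers
need no source change. The header part of the diff is cut off in this
archive, so the following is only a userspace sketch of that pattern. All
names (fake_add_smi, fake_register_smi, FAKE_THIS_MODULE) are hypothetical
and the argument list is reduced to one parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-module THIS_MODULE. */
struct fake_module { const char *name; };
static struct fake_module this_module = { "ipmi_si" };
#define FAKE_THIS_MODULE (&this_module)

/* Records the owner passed to the most recent registration. */
static const struct fake_module *last_owner;

/* The renamed function takes the owner explicitly. */
static int fake_add_smi(const struct fake_module *owner, int slave_addr)
{
	assert(owner != NULL);
	last_owner = owner;
	return slave_addr >= 0 ? 0 : -1;
}

/* The old name survives as a macro, so existing callers compile unchanged. */
#define fake_register_smi(slave_addr) \
	fake_add_smi(FAKE_THIS_MODULE, slave_addr)

/* A caller written against the old name; it never mentions the owner. */
static int legacy_caller(void)
{
	return fake_register_smi(0x20);
}
```

Because the macro expands in the caller's translation unit, each driver
passes its own module without the call sites having to change.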

Re: [PATCH -next] ipmi: bt-bmc: use devm_platform_ioremap_resource() to simplify code

2019-10-16 Thread Corey Minyard
On Wed, Oct 16, 2019 at 04:41:07PM +0200, Cédric Le Goater wrote:
> On 16/10/2019 16:19, Corey Minyard wrote:
> > On Wed, Oct 16, 2019 at 05:21:31PM +0800, YueHaibing wrote:
> >> Use devm_platform_ioremap_resource() to simplify the code a bit.
> >> This is detected by coccinelle.
> > 
> > Adding the module author and others. I can't see a reason to not do
> > this.
> 
> yes. Looks good to me.
> 
> Reviewed-by: Cédric Le Goater 

Queued for next merge window, unless someone protests.

-corey

> 
> Thanks,
> 
> C.
> 
> > -corey
> > 
> >>
> >> Signed-off-by: YueHaibing 
> >> ---
> >>  drivers/char/ipmi/bt-bmc.c | 4 +---
> >>  1 file changed, 1 insertion(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
> >> index 40b9927..d36aeac 100644
> >> --- a/drivers/char/ipmi/bt-bmc.c
> >> +++ b/drivers/char/ipmi/bt-bmc.c
> >> @@ -444,15 +444,13 @@ static int bt_bmc_probe(struct platform_device *pdev)
> >>  
> >>bt_bmc->map = syscon_node_to_regmap(pdev->dev.parent->of_node);
> >>if (IS_ERR(bt_bmc->map)) {
> >> -  struct resource *res;
> >>void __iomem *base;
> >>  
> >>/*
> >> * Assume it's not the MFD-based devicetree description, in
> >> * which case generate a regmap ourselves
> >> */
> >> -  res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> >> -  base = devm_ioremap_resource(&pdev->dev, res);
> >> +  base = devm_platform_ioremap_resource(pdev, 0);
> >>if (IS_ERR(base))
> >>return PTR_ERR(base);
> >>  
> >> -- 
> >> 2.7.4
> >>
> >>
> 


Re: [PATCH -next] ipmi: bt-bmc: use devm_platform_ioremap_resource() to simplify code

2019-10-16 Thread Corey Minyard
On Wed, Oct 16, 2019 at 05:21:31PM +0800, YueHaibing wrote:
> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.

Adding the module author and others. I can't see a reason to not do
this.

-corey

> 
> Signed-off-by: YueHaibing 
> ---
>  drivers/char/ipmi/bt-bmc.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
> index 40b9927..d36aeac 100644
> --- a/drivers/char/ipmi/bt-bmc.c
> +++ b/drivers/char/ipmi/bt-bmc.c
> @@ -444,15 +444,13 @@ static int bt_bmc_probe(struct platform_device *pdev)
>  
>   bt_bmc->map = syscon_node_to_regmap(pdev->dev.parent->of_node);
>   if (IS_ERR(bt_bmc->map)) {
> - struct resource *res;
>   void __iomem *base;
>  
>   /*
>* Assume it's not the MFD-based devicetree description, in
>* which case generate a regmap ourselves
>*/
> - res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - base = devm_ioremap_resource(&pdev->dev, res);
> + base = devm_platform_ioremap_resource(pdev, 0);
>   if (IS_ERR(base))
>   return PTR_ERR(base);
>  
> -- 
> 2.7.4
> 
> 


Re: [PATCH v2] ipmi: use %*ph to print small buffer

2019-10-14 Thread Corey Minyard
On Fri, Oct 11, 2019 at 06:50:36PM +0300, Andy Shevchenko wrote:
> From: Andy Shevchenko 
> 
> Use %*ph format to print small buffer as hex string.
> 
> The change is safe since the specifier can handle up to 64 bytes and taking
> into account the buffer size of 100 bytes on stack the function has never been
> used to dump more than 32 bytes. Note, this also avoids potential buffer
> overflow if the length of the input buffer is bigger.

This is an improvement, thanks.  It is in my next tree and
queued for the next merge window.

Thanks, Andy and Jes, for sorting this out while I was on vacation,

-corey

> 
> This completely eliminates ipmi_debug_msg() in favour of Dynamic Debug.
> 
> Signed-off-by: Andy Shevchenko 
> Signed-off-by: Andy Shevchenko 
> ---
> - eliminate ipmi_debug_msg() in favour of Dynamic Debug (Joe)
>  drivers/char/ipmi/ipmi_msghandler.c | 27 ---
>  1 file changed, 4 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 2aab80e19ae0..1768b81aaf78 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -44,25 +44,6 @@ static void need_waiter(struct ipmi_smi *intf);
>  static int handle_one_recv_msg(struct ipmi_smi *intf,
>  struct ipmi_smi_msg *msg);
>  
> -#ifdef DEBUG
> -static void ipmi_debug_msg(const char *title, unsigned char *data,
> -unsigned int len)
> -{
> - int i, pos;
> - char buf[100];
> -
> - pos = snprintf(buf, sizeof(buf), "%s: ", title);
> - for (i = 0; i < len; i++)
> - pos += snprintf(buf + pos, sizeof(buf) - pos,
> - " %2.2x", data[i]);
> - pr_debug("%s\n", buf);
> -}
> -#else
> -static void ipmi_debug_msg(const char *title, unsigned char *data,
> -unsigned int len)
> -{ }
> -#endif
> -
>  static bool initialized;
>  static bool drvregistered;
>  
> @@ -2267,7 +2248,7 @@ static int i_ipmi_request(struct ipmi_user *user,
>   ipmi_free_smi_msg(smi_msg);
>   ipmi_free_recv_msg(recv_msg);
>   } else {
> - ipmi_debug_msg("Send", smi_msg->data, smi_msg->data_size);
> + pr_debug("Send: %*ph\n", smi_msg->data_size, smi_msg->data);
>  
>   smi_send(intf, intf->handlers, smi_msg, priority);
>   }
> @@ -3730,7 +3711,7 @@ static int handle_ipmb_get_msg_cmd(struct ipmi_smi 
> *intf,
>   msg->data[10] = ipmb_checksum(&msg->data[6], 4);
>   msg->data_size = 11;
>  
> - ipmi_debug_msg("Invalid command:", msg->data, msg->data_size);
> + pr_debug("Invalid command: %*ph\n", msg->data_size, msg->data);
>  
>   rcu_read_lock();
>   if (!intf->in_shutdown) {
> @@ -4217,7 +4198,7 @@ static int handle_one_recv_msg(struct ipmi_smi *intf,
>   int requeue;
>   int chan;
>  
> - ipmi_debug_msg("Recv:", msg->rsp, msg->rsp_size);
> + pr_debug("Recv: %*ph\n", msg->rsp_size, msg->rsp);
>  
>   if ((msg->data_size >= 2)
>   && (msg->data[0] == (IPMI_NETFN_APP_REQUEST << 2))
> @@ -4576,7 +4557,7 @@ smi_from_recv_msg(struct ipmi_smi *intf, struct 
> ipmi_recv_msg *recv_msg,
>   smi_msg->data_size = recv_msg->msg.data_len;
>   smi_msg->msgid = STORE_SEQ_IN_MSGID(seq, seqid);
>  
> - ipmi_debug_msg("Resend: ", smi_msg->data, smi_msg->data_size);
> + pr_debug("Resend: %*ph\n", smi_msg->data_size, smi_msg->data);
>  
>   return smi_msg;
>  }
> -- 
> 2.23.0
> 
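
For readers unfamiliar with %*ph: it prints a small buffer as
space-separated two-digit hex bytes. The removed helper built a similar
string by hand, and its "pos += snprintf(...)" loop could push pos past the
end of the buffer, because snprintf() returns the length it wanted to
write. A bounded userspace equivalent might look like this (hex_dump() is
a hypothetical name, not a kernel function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Writes data[0..len) into out as space-separated two-digit hex,
 * truncating safely if out is too small. Returns the number of
 * characters written, not counting the terminating NUL.
 */
static size_t hex_dump(char *out, size_t out_size,
		       const unsigned char *data, size_t len)
{
	size_t pos = 0;
	size_t i;

	assert(out != NULL && out_size > 0);
	out[0] = '\0';

	for (i = 0; i < len; i++) {
		int n = snprintf(out + pos, out_size - pos, "%s%02x",
				 i ? " " : "", (unsigned int)data[i]);

		/* snprintf returns the length it wanted; stop on truncation. */
		if (n < 0 || (size_t)n >= out_size - pos) {
			out[pos] = '\0';  /* drop the partially written byte */
			break;
		}
		pos += (size_t)n;
	}
	return pos;
}
```

The check before advancing pos is what the old helper lacked. With it, pos
never reaches out_size, so "out_size - pos" cannot wrap around.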


Re: [PATCH 4.19 012/106] ipmi_si: Only schedule continuously in the thread in maintenance mode

2019-10-08 Thread Corey Minyard
On Tue, Oct 08, 2019 at 11:49:15AM +0200, Pavel Machek wrote:
> Hi!
> 
> > @@ -1013,11 +1016,20 @@ static int ipmi_thread(void *data)
> > spin_unlock_irqrestore(&(smi_info->si_lock), flags);
> > busy_wait = ipmi_thread_busy_wait(smi_result, smi_info,
> >   &busy_until);
> > -   if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
> > +   if (smi_result == SI_SM_CALL_WITHOUT_DELAY) {
> > ; /* do nothing */
> > -   else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
> > -   schedule();
> > -   else if (smi_result == SI_SM_IDLE) {
> > +   } else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait) {
> > +   /*
> > +* In maintenance mode we run as fast as
> > +* possible to allow firmware updates to
> > +* complete as fast as possible, but normally
> > +* don't bang on the scheduler.
> > +*/
> > +   if (smi_info->in_maintenance_mode)
> > +   schedule();
> > +   else
> > +   usleep_range(100, 200);
> > +   } else if (smi_result == SI_SM_IDLE) {
> 
> This is quite crazy code. usleep() will need to do magic with high
> resolution timers to provide 200usec sleep... when all you want to do
> is unload the scheduler.
> 
> cond_resched() should be okay to call in a loop, can the code use that
> instead?

According to Tejun Heo, spinning in a loop sleeping was causing all
sorts of issues with banging on scheduler locks on systems with lots of
cores.  I forgot to add him to the CC on the patch, adding him now
for comment.

If cond_resched() would work, though, I'd be happy with that, it's
certainly simpler.

-corey

> 
> Best regards,
>   Pavel
> 
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
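
The branch structure in the quoted hunk can be restated as a pure decision
function, which makes the policy easier to see: yield continuously only in
maintenance mode, otherwise take a short sleep. This is a sketch under
assumptions: the enum names are hypothetical, and only the branches
visible in the hunk are modelled (everything else maps to ACT_OTHER).

```c
#include <stdbool.h>

/* Hypothetical mirror of the state machine results seen in the hunk. */
enum fake_si_result { CALL_WITHOUT_DELAY, CALL_WITH_DELAY, RESULT_IDLE };

enum wait_action {
	ACT_NONE,         /* loop again immediately */
	ACT_YIELD,        /* schedule(): run as fast as possible */
	ACT_SHORT_SLEEP,  /* usleep_range(100, 200) */
	ACT_OTHER         /* branches not shown in the quoted hunk */
};

static enum wait_action pick_wait(enum fake_si_result r, bool busy_wait,
				  bool in_maintenance_mode)
{
	if (r == CALL_WITHOUT_DELAY)
		return ACT_NONE;

	if (r == CALL_WITH_DELAY && busy_wait)
		return in_maintenance_mode ? ACT_YIELD : ACT_SHORT_SLEEP;

	return ACT_OTHER;
}
```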




[GIT PULL] IPMI bug fixes for 5.4

2019-09-19 Thread Corey Minyard
The following changes since commit 5c6207539aea8b22490f9569db5aa72ddfd0d486:

  Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (2019-07-31 13:26:54 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.4-1

for you to fetch changes up to c9acc3c4f8e42ae538aea7f418fddc16f257ba75:

  ipmi_si_intf: Fix race in timer shutdown handling (2019-09-12 16:03:18 -0500)


IPMI: A few minor fixes and some cosmetic changes.

Nothing big here, but some minor things that people have found and
some minor reworks for names and include files.

Thanks,

-corey


Corey Minyard (6):
  ipmi_si: Convert timespec64 to timespec
  ipmi_si: Rework some include files
  ipmi_si: Convert device attr permissions to octal
  ipmi_si: Remove ipmi_ from the device attr names
  ipmi_si: Only schedule continuously in the thread in maintenance mode
  ipmi: Free receive messages when in an oops

Jes Sorensen (1):
  ipmi_si_intf: Fix race in timer shutdown handling

Kamlakant Patel (1):
  ipmi_ssif: avoid registering duplicate ssif interface

Tony Camuso (1):
  ipmi: move message error checking to avoid deadlock

 drivers/char/ipmi/ipmi_dmi.c |   1 -
 drivers/char/ipmi/ipmi_dmi.h |   1 +
 drivers/char/ipmi/ipmi_msghandler.c  | 121 ++-
 drivers/char/ipmi/ipmi_si.h  |  57 -
 drivers/char/ipmi/ipmi_si_intf.c |  98 
 drivers/char/ipmi/ipmi_si_mem_io.c   |   2 +-
 drivers/char/ipmi/ipmi_si_pci.c  |   2 +-
 drivers/char/ipmi/ipmi_si_platform.c |   2 +-
 drivers/char/ipmi/ipmi_si_port_io.c  |   2 +-
 drivers/char/ipmi/ipmi_si_sm.h   |  54 ++--
 drivers/char/ipmi/ipmi_ssif.c|  79 ++-
 11 files changed, 260 insertions(+), 159 deletions(-)



Re: [Openipmi-developer] [PATCH 0/1] Fix race in ipmi timer cleanup

2019-09-14 Thread Corey Minyard
> 
> > 
> > {disable,enable}_si_irq() themselves are racy:
> > 
> > static inline bool disable_si_irq(struct smi_info *smi_info)
> > {
> > if ((smi_info->io.irq) && (!smi_info->interrupt_disabled)) {
> > smi_info->interrupt_disabled = true;
> > 
> > Basically interrupt_disabled need to be atomic here to have any value,
> > unless you ensure to have a spin lock around every access to it.
> 
> It needs to be atomic, yes, but I think just adding the spinlock like
> I suggested will work.  You are right, the check for timer_running is
> not necessary here, and I'm fine with removing it, but there are other
> issues with interrupt_disabled (as you said) and with memory ordering
> in the timer case.  So even if you remove the timer running check, the
> lock is still required here.

It turns out you were right, all that really needs to be done is the
del_timer_sync().  I've added your patch to my queue.

Sorry for the trouble.

Thanks,

-corey


Re: [PATCH 0/1] Fix race in ipmi timer cleanup

2019-08-29 Thread Corey Minyard
On Wed, Aug 28, 2019 at 08:53:47PM -0400, Jes Sorensen wrote:
> On 8/28/19 6:32 PM, Corey Minyard wrote:
> > On Wed, Aug 28, 2019 at 04:36:24PM -0400, Jes Sorensen wrote:
> >> From: Jes Sorensen 
> >>
> >> I came across this in 4.16, but I believe the bug is still present
> >> in current 5.x, even if it is less likely to trigger.
> >>
> >> Basically stop_timer_and_thread() only calls del_timer_sync() if
> >> timer_running == true. However smi_mod_timer enables the timer before
> >> setting timer_running = true.
> > 
> > All the modifications/checks for timer_running should be done under
> > the si_lock.  It looks like a lock is missing in shutdown_smi(),
> > probably starting before setting interrupt_disabled to true and
> > after stop_timer_and_thread.  I think that is the right fix for
> > this problem.
> 
> Hi Corey,
> 
> I agree a spin lock could deal with this specific issue too, but calling
> del_timer_sync() is safe to call on an already disabled timer. The whole
> flagging of timer_running really doesn't make much sense in the first
> place either.
> 
> As for interrupt_disabled that is even worse. There's multiple places in
> the code where interrupt_disabled is checked, some of them are not
> protected by a spin lock, including shutdown_smi() where you have this
> sequence:
> 
> while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)){
> poll(smi_info);
> schedule_timeout_uninterruptible(1);
> }
> if (smi_info->handlers)
> disable_si_irq(smi_info);
> while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)){
> poll(smi_info);
> schedule_timeout_uninterruptible(1);
> }

This one doesn't matter.  At this point the driver is single-threaded,
no interrupts, timeouts, or calls from the upper layer can happen.

> 
> In this case you'll have to drop and retake the lock several times.
> 
> You also have this call sequence which leads to disable_si_irq() which
> checks interrupt_disabled:
> 
>   flush_messages()
> smi_event_handler()
>   handle_transaction_done()
> handle_flags()
>   alloc_msg_handle_irq()
> disable_si_irq()

This one only happens in run-to-completion mode.  Which is strange,
but a number of people had issues with getting into a new kernel before
the watchdog timeout went off, so the run-to-completion mode runs at
panic time so the driver can run without scheduling so it can extend
the watchdog and store panic information in the IPMI log.

So you actually *don't* want a lock here, since the panic may have
occurred when the IPMI driver was holding the lock.

> 
> {disable,enable}_si_irq() themselves are racy:
> 
> static inline bool disable_si_irq(struct smi_info *smi_info)
> {
> if ((smi_info->io.irq) && (!smi_info->interrupt_disabled)) {
> smi_info->interrupt_disabled = true;
> 
> Basically interrupt_disabled need to be atomic here to have any value,
> unless you ensure to have a spin lock around every access to it.

It needs to be atomic, yes, but I think just adding the spinlock like
I suggested will work.  You are right, the check for timer_running is
not necessary here, and I'm fine with removing it, but there are other
issues with interrupt_disabled (as you said) and with memory ordering
in the timer case.  So even if you remove the timer running check, the
lock is still required here.

It also might be a good idea to add a WARN_ON() to smi_mod_timer() and
alloc_msg_handle_irq() if the lock is not held, just to be sure.

Thanks,

-corey

> 
> Cheers,
> Jes


Re: [PATCH 0/1] Fix race in ipmi timer cleanup

2019-08-28 Thread Corey Minyard
On Wed, Aug 28, 2019 at 04:36:24PM -0400, Jes Sorensen wrote:
> From: Jes Sorensen 
> 
> I came across this in 4.16, but I believe the bug is still present
> in current 5.x, even if it is less likely to trigger.
> 
> Basically stop_timer_and_thread() only calls del_timer_sync() if
> timer_running == true. However smi_mod_timer enables the timer before
> setting timer_running = true.

All the modifications/checks for timer_running should be done under
the si_lock.  It looks like a lock is missing in shutdown_smi(),
probably starting before setting interrupt_disabled to true and
after stop_timer_and_thread.  I think that is the right fix for
this problem.

-corey

> 
> I was able to reproduce this in 4.16 running the following on a host
> 
>while :; do rmmod ipmi_si ; modprobe ipmi_si; done
> 
> while rebooting the BMC on it in parallel.
> 
> 5.2 moves the error handling around and does it more centralized, but
> relying on timer_running still seems dubious to me.
> 
> static void smi_mod_timer(struct smi_info *smi_info, unsigned long new_val)
> {
> if (!smi_info->timer_can_start)
> return;
> smi_info->last_timeout_jiffies = jiffies;
> mod_timer(&smi_info->si_timer, new_val);
> smi_info->timer_running = true;
> }
> 
> static inline void stop_timer_and_thread(struct smi_info *smi_info)
> {
> if (smi_info->thread != NULL) {
> kthread_stop(smi_info->thread);
> smi_info->thread = NULL;
> }
> 
> smi_info->timer_can_start = false;
> if (smi_info->timer_running)
> del_timer_sync(&smi_info->si_timer);
> }
> 
> Cheers,
> Jes
> 
> Jes Sorensen (1):
>   ipmi_si_intf: Fix race in timer shutdown handling
> 
>  drivers/char/ipmi/ipmi_si_intf.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> -- 
> 2.21.0
> 
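
The window described in this cover letter can be replayed
deterministically in userspace by splitting smi_mod_timer() into its two
steps and running the stop path between them. This is a hypothetical
model, not driver code: struct fake_smi and the helpers below stand in for
the real fields and for mod_timer()/del_timer_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant smi_info fields. */
struct fake_smi {
	bool timer_can_start;
	bool timer_armed;    /* what the timer core knows */
	bool timer_running;  /* the driver's own shadow flag */
};

/* First half of the arm path: the timer is armed... */
static bool arm_timer(struct fake_smi *s)
{
	assert(s != NULL);
	if (!s->timer_can_start)
		return false;
	s->timer_armed = true;
	return true;
}

/* ...and only afterwards is the shadow flag set. */
static void mark_running(struct fake_smi *s)
{
	s->timer_running = true;
}

/* Stop path as quoted: trusts the shadow flag. */
static void stop_checked(struct fake_smi *s)
{
	s->timer_can_start = false;
	if (s->timer_running)
		s->timer_armed = false;   /* stands in for del_timer_sync() */
}

/* Stop path without the check: deleting an idle timer is harmless. */
static void stop_unconditional(struct fake_smi *s)
{
	s->timer_can_start = false;
	s->timer_armed = false;           /* stands in for del_timer_sync() */
}
```

If the stop path runs after arm_timer() but before mark_running(),
stop_checked() leaves the timer pending, while stop_unconditional() does
not depend on the flag at all.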


Removal of IPMI watchdog features

2019-08-23 Thread Corey Minyard
I am considering moving the IPMI watchdog over to the standard
watchdog framework.  This will require the removal of the feature
that provides a byte of read data when the pretimeout occurs,
since that is not available in the standard framework.

Before I remove this, I thought I would ask: Is anyone using
this feature?  If they are, I'll need to rethink what is done.

Thanks,

-corey


Re: [PATCH v1 1/1] Fix uninitialized variable in ipmb_dev_int.c

2019-07-24 Thread Corey Minyard
On Wed, Jul 24, 2019 at 03:32:57PM -0400, Asmaa Mnebhi wrote:
> ret at line 112 of ipmb_dev_int.c is uninitialized which
> results in a warning during build regressions.
> This warning was found by build regression/improvement
> testing for v5.3-rc1.

Applied, thanks for sticking with it :).

-corey

> 
> Reported-by: build regression/improvement testing for v5.3-rc1.
> Fixes: 51bd6f291583 ("Add support for IPMB driver")
> Signed-off-by: Asmaa Mnebhi 
> ---
>  drivers/char/ipmi/ipmb_dev_int.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 5720433..285e0b8 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -76,7 +76,7 @@ static ssize_t ipmb_read(struct file *file, char __user 
> *buf, size_t count,
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   struct ipmb_request_elem *queue_elem;
>   struct ipmb_msg msg;
> - ssize_t ret;
> + ssize_t ret = 0;
>  
>   memset(&msg, 0, sizeof(msg));
>  
> -- 
> 2.1.2
> 


Re: [PATCH v1 1/1] Fix uninitialized variable in ipmb_dev_int.c

2019-07-24 Thread Corey Minyard
On Wed, Jul 24, 2019 at 01:45:57PM -0400, Asmaa Mnebhi wrote:
> Signed-off-by: Asmaa Mnebhi 
> Reported-by: Geert Uytterhoeven 

Sorry to be picky here, but it's considered bad style to have an
empty message.  I probably wasn't clear before, but you should
add some text like "Found by build regression/improvement testing."
or something like that.  Just so people know where it was found.

Could you also add a "Fixes" field?  This is important in case
someone pulls the original patch, they can look forward and see
if any bugs were fixed.  From the kernel docs:

If your patch fixes a bug in a specific commit, e.g. you found an issue using
``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
the SHA-1 ID, and the one line summary.  Do not split the tag across multiple
lines, tags are exempt from the "wrap at 75 columns" rule in order to simplify
parsing scripts.  For example::

Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed")

I was going to do that myself, but since another spin is required...

Thanks,

-corey

> ---
>  drivers/char/ipmi/ipmb_dev_int.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 5720433..285e0b8 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -76,7 +76,7 @@ static ssize_t ipmb_read(struct file *file, char __user 
> *buf, size_t count,
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   struct ipmb_request_elem *queue_elem;
>   struct ipmb_msg msg;
> - ssize_t ret;
> + ssize_t ret = 0;
>  
>   memset(&msg, 0, sizeof(msg));
>  
> -- 
> 2.1.2
> 


Re: [PATCH v1 1/1] Fix uninitialized variable in ipmb_dev_int.c

2019-07-24 Thread Corey Minyard
On Wed, Jul 24, 2019 at 10:36:42AM -0400, Asmaa Mnebhi wrote:
> Signed-off-by: Asmaa Mnebhi 

The patch is, of course, fine, but you should add some info
about how it was found and a Reported-by: tag.

Thanks,

-corey

> ---
>  drivers/char/ipmi/ipmb_dev_int.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 5720433..285e0b8 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -76,7 +76,7 @@ static ssize_t ipmb_read(struct file *file, char __user 
> *buf, size_t count,
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   struct ipmb_request_elem *queue_elem;
>   struct ipmb_msg msg;
> - ssize_t ret;
> + ssize_t ret = 0;
>  
>   memset(&msg, 0, sizeof(msg));
>  
> -- 
> 2.1.2
> 


[GIT PULL] IPMI bug fixes for 5.3

2019-07-12 Thread Corey Minyard
The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.3

for you to fetch changes up to ac499fba98c3c65078fd84fa0a62cd6f6d5837ed:

  docs: ipmb: place it at driver-api and convert to ReST (2019-06-30 19:33:25 -0500)


Some small fixes for various things, nothing huge, mostly found
by automated tools.

Plus add a driver that allows Linux to act as an IPMB slave device,
so it can be a satellite MC in an IPMI network.


Arnd Bergmann (1):
  ipmi: ipmb: don't allocate i2c_client on stack

Asmaa Mnebhi (1):
  Add support for IPMB driver

Kefeng Wang (3):
  ipmi_si: fix unexpected driver unregister warning
  ipmi_si: use bool type for initialized variable
  ipmi_ssif: fix unexpected driver unregister warning

Mauro Carvalho Chehab (1):
  docs: ipmb: place it at driver-api and convert to ReST

Suzuki K Poulose (1):
  drivers: ipmi: Drop device reference

YueHaibing (1):
  ipmi: ipmb: Fix build error while CONFIG_I2C is set to m

kbuild test robot (1):
  fix platform_no_drv_owner.cocci warnings

 Documentation/driver-api/index.rst   |   1 +
 Documentation/driver-api/ipmb.rst    | 105 ++
 drivers/char/ipmi/Kconfig            |   9 +
 drivers/char/ipmi/Makefile           |   1 +
 drivers/char/ipmi/ipmb_dev_int.c     | 364 +++
 drivers/char/ipmi/ipmi_si_intf.c     |   4 +-
 drivers/char/ipmi/ipmi_si_platform.c |   7 +-
 drivers/char/ipmi/ipmi_ssif.c        |   5 +-
 8 files changed, 492 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/driver-api/ipmb.rst
 create mode 100644 drivers/char/ipmi/ipmb_dev_int.c



Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-10 Thread Corey Minyard
On Wed, Jul 10, 2019 at 07:22:21AM -0700, Tejun Heo wrote:
> Hello,
> 
> > > We can go for shorter timeouts for sure but I don't think this sort of
> > > busy looping is acceptable.  Is your position that this must be a busy
> > > loop?
> > 
> > Well, no.  I want something that provides as high a throughput as
> > possible and doesn't cause scheduling issues.  But that may not be
> > possible.  Screwing up the scheduler is a lot worse than slow IPMI
> > firmware updates.
> > 
> > How short can the timeouts be and avoid issues?
> 
> We first tried msleep(1) and that was too slow even for sensor reading
> making it take longer than 50s.  With the 100us-200us sleep, it got
> down to ~5s which was good enough for our use case and the cpu /
> scheduler impact was still mostly negligible.  I can't tell for sure
> without testing but going significantly below 100us is likely to
> become visible pretty quickly.

What was the time to read the sensors before you did the change?
It depends a lot on the system, so I can't really guess.

> 
> We can also take a hybrid approach where we busy poll w/ 1us udelay
> up to, say, fifty times and then switch to sleeping poll.

I'm pretty sure we didn't try that in the original work, but I'm
not sure that would work.  Most of the initial spinning would be
pointless.
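
For illustration only, the hybrid scheme proposed above could be sketched
in userspace C roughly as follows.  The helper name and the constants are
hypothetical: the 1us spin and the fifty tries come from the proposal, and
100us is the lower bound of the usleep_range() in the patch.  Nothing here
is taken from the driver itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only, not tuned against any real BMC. */
#define SPIN_LIMIT     50u   /* busy-poll attempts before sleeping */
#define SPIN_DELAY_US  1u    /* delay between busy polls */
#define SLEEP_DELAY_US 100u  /* delay between sleeping polls */

/*
 * Hypothetical helper: return how long to wait before poll number
 * 'attempt' (counted from 0), and report through 'should_sleep'
 * whether that wait should be a sleep rather than a busy-wait.
 */
static unsigned int poll_delay_us(unsigned int attempt, bool *should_sleep)
{
	assert(should_sleep != NULL);

	if (attempt < SPIN_LIMIT) {
		*should_sleep = false;
		return SPIN_DELAY_US;
	}

	*should_sleep = true;
	return SLEEP_DELAY_US;
}
```

With these numbers the spin phase costs at most about 50us before the
first sleep, which is the part suspected above of being mostly wasted.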

I would guess that you would decrease the delay and the performance
would improve linearly until you hit a certain point, and then
decreasing the delay wouldn't make a big difference.  That's the
point you want to use, I think.

What might actually be best would be for the driver to measure the
time it takes the BMC to respond and try to set the timeout based
on that information.  BMCs vary a lot, a constant probably won't
work.
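
A minimal userspace sketch of that idea, assuming a smoothed average of
the observed response time and a poll delay set to a quarter of it.  The
structure, the function names, the 1/8 smoothing weight, the divisor and
the clamp bounds are all assumptions made for the example, not anything
the driver does today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clamp bounds for the computed poll delay. */
#define DELAY_MIN_US 10u
#define DELAY_MAX_US 1000u

struct bmc_timing {
	unsigned int avg_response_us; /* smoothed response time, 0 = no sample yet */
};

/* Fold one measured response time into the running average. */
static void bmc_timing_record(struct bmc_timing *t, unsigned int sample_us)
{
	assert(t != NULL);

	if (t->avg_response_us == 0)
		t->avg_response_us = sample_us;
	else	/* avg = 7/8 * avg + 1/8 * sample */
		t->avg_response_us = (7 * t->avg_response_us + sample_us) / 8;
}

/* Poll at a quarter of the typical response time, within the clamp bounds. */
static unsigned int bmc_poll_delay_us(const struct bmc_timing *t)
{
	unsigned int delay;

	assert(t != NULL);

	delay = t->avg_response_us / 4;
	if (delay < DELAY_MIN_US)
		delay = DELAY_MIN_US;
	if (delay > DELAY_MAX_US)
		delay = DELAY_MAX_US;
	return delay;
}
```

The clamp keeps a very fast or very slow BMC from pushing the delay into
the ranges discussed above, where it either loads the scheduler or makes
firmware updates crawl.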

And I was just saying that I wasn't expecting any big changes in
the IPMI driver any more...

> 
> Are there some tests which can be used to verify the cases which may
> get impacted by these changes?

Unfortunately not.  The original people at Dell that did the work
don't work there any more, I don't think.

I mostly use qemu now for testing, but this is not something you can
really simulate on qemu very well.  Can you do an IPMI firmware
update on your system?  That would be the easiest way to measure.

Thanks,

-corey

> 
> Thanks.
> 
> -- 
> tejun


Re: [Openipmi-developer] [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Corey Minyard
On Tue, Jul 09, 2019 at 03:11:47PM -0700, Tejun Heo wrote:
> On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> > On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote:
> > > ipmi_thread() uses back-to-back schedule() to poll for command
> > > completion which, on some machines, can push up CPU consumption and
> > > heavily tax the scheduler locks leading to noticeable overall
> > > performance degradation.
> > > 
> > > This patch replaces schedule() with usleep_range(100, 200).  This
> > > allows the sensor readings to finish reasonably fast and the cpu
> > > consumption of the kthread is kept under several percents of a core.
> > 
> > The IPMI thread was not really designed for sensor reading, it was
> > designed so that firmware updates would happen in a reasonable time
> > on systems without an interrupt on the IPMI interface.  This change
> > will degrade performance for that function.  IIRC the
> > people who did the patch tried this and it slowed things down too
> > much.
> 
> Also, can you point me to the exact patch?  I'm kinda curious what
> kind of timing they used.

I believe the change was 33979734cd35ae "IPMI: use schedule in kthread"
The original change that added the kthread was a9a2c44ff0a1350
"ipmi: add timer thread".

I mis-remembered this, we switched from doing a udelay() to
schedule(), but that udelay was 1us, so that's probably not helpful
information.

-corey

> 
> Thanks.
> 
> -- 
> tejun
> 
> 
> ___
> Openipmi-developer mailing list
> openipmi-develo...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer


Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Corey Minyard
On Tue, Jul 09, 2019 at 03:09:08PM -0700, Tejun Heo wrote:
> Hello, Corey.
> 
> On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> > I'm also a little confused because the CPU in question shouldn't
> > be doing anything else if the schedule() immediately returns here,
> > so it's not wasting CPU that could be used on another process.  Or
> > is it lock contention that is causing an issue on other CPUs?
> 
> Yeah, pretty pronounced too and it also keeps the CPU busy which makes
> the load balancer deprioritize that CPU.  Busy looping is never free.
> 
> > IMHO, this whole thing is stupid; if you design hardware with
> > stupid interfaces (byte at a time, no interrupts) you should
> > expect to get bad performance.  But I can't control what the
> > hardware vendors do.  This whole thing is a carefully tuned
> > compromise.
> 
> I'm really not sure "carefully tuned" is applicable on indefinite busy
> looping.

Well, yeah, but other things were tried and this was the only thing
we could find that worked.  That was before the kind of SMP stuff
we have now, though.

> 
> > So I can't really take this as-is.
> 
> We can go for shorter timeouts for sure but I don't think this sort of
> busy looping is acceptable.  Is your position that this must be a busy
> loop?

Well, no.  I want something that provides as high a throughput as
possible and doesn't cause scheduling issues.  But that may not be
possible.  Screwing up the scheduler is a lot worse than slow IPMI
firmware updates.

How short can the timeouts be and avoid issues?

Thanks,

-corey

> 
> Thanks.
> 
> -- 
> tejun


Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Corey Minyard
On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote:
> ipmi_thread() uses back-to-back schedule() to poll for command
> completion which, on some machines, can push up CPU consumption and
> heavily tax the scheduler locks leading to noticeable overall
> performance degradation.
> 
> This patch replaces schedule() with usleep_range(100, 200).  This
> allows the sensor readings to finish reasonably fast and the cpu
> consumption of the kthread is kept under several percents of a core.

The IPMI thread was not really designed for sensor reading, it was
designed so that firmware updates would happen in a reasonable time
on systems without an interrupt on the IPMI interface.  This change
will degrade performance for that function.  IIRC the
people who did the patch tried this and it slowed things down too
much.

I'm also a little confused because the CPU in question shouldn't
be doing anything else if the schedule() immediately returns here,
so it's not wasting CPU that could be used on another process.  Or
is it lock contention that is causing an issue on other CPUs?

IMHO, this whole thing is stupid; if you design hardware with
stupid interfaces (byte at a time, no interrupts) you should
expect to get bad performance.  But I can't control what the
hardware vendors do.  This whole thing is a carefully tuned
compromise.

So I can't really take this as-is.

-corey

> 
> Signed-off-by: Tejun Heo 
> ---
>  drivers/char/ipmi/ipmi_si_intf.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c 
> b/drivers/char/ipmi/ipmi_si_intf.c
> index f124a2d2bb9f..2143e3c10623 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1010,7 +1010,7 @@ static int ipmi_thread(void *data)
>   if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
>   ; /* do nothing */
>   else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
> - schedule();
> + usleep_range(100, 200);
>   else if (smi_result == SI_SM_IDLE) {
>   if (atomic_read(&smi_info->need_watch)) {
>   schedule_timeout_interruptible(100);


Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

2019-07-02 Thread Corey Minyard
On Tue, Jul 02, 2019 at 10:35:36AM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-07-02 09:04:18 [+0200], Kurt Kanzenbach wrote:
> > > In fact, my system doesn't boot with this commit in 5.0-rt.
> > >
> > > If I revert 90e1b18eba2ae4a729 ("swait: Delete the task from after a
> > > wakeup occured") the machine boots again.
> > >
> > > Sebastian, I think that's a bad commit, please revert it.
> > 
> > I'm having the same problem on a Cyclone V based ARM board. Reverting
> > this commit solves the boot issue for me as well.
> 
> Okay. So the original Corey fix as in v5.0.14-rt9 works for everyone.
> Peter's version as I picked it up for v5.0.21-rt14 is causing problems
> for two persons now.
> 
> I'm leaning towards reverting it back to old version for now…

Just to avoid confusion... it wasn't my patch 1921ea799b7dc56
(sched/completion: Fix a lockup in wait_for_completion()) that caused
the issue, nor was it Peter's version of it.  Instead, it was the patch
mentioned above, 90e1b18eba2ae4a729 ("swait: Delete the task from after a
wakeup occured"), which came from someone else.  I can verify by visual
inspection that that patch is broken and it should definitely be removed.
Just don't want someone to be confused and remove the wrong patch.

-corey


> 
> > Thanks,
> > Kurt
> 
> Sebastian


Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

2019-07-01 Thread Corey Minyard
On Mon, Jul 01, 2019 at 05:28:25PM -0400, Steven Rostedt wrote:
> On Mon, 1 Jul 2019 17:13:33 -0400
> Steven Rostedt  wrote:
> 
> > On Mon, 1 Jul 2019 17:06:02 -0400
> > Steven Rostedt  wrote:
> > 
> > > On Mon, 1 Jul 2019 15:43:25 -0500
> > > Corey Minyard  wrote:
> > > 
> > >   
> > > > I show that patch is already applied at
> > > > 
> > > > 1921ea799b7dc561c97185538100271d88ee47db
> > > > sched/completion: Fix a lockup in wait_for_completion()
> > > > 
> > > > git describe --contains 1921ea799b7dc561c97185538100271d88ee47db
> > > > v4.19.37-rt20~1
> > > > 
> > > > So I'm not sure what is going on.
> > > 
> > > Bah, I'm replying to the wrong commit that I'm having issues with.
> > > 
> > > I searched your name to find the patch that is of trouble, and picked
> > > this one.
> > > 
> > > I'll go find the problem patch, sorry for the noise on this one.
> > >   
> > 
> > No, I did reply to the right email, but it wasn't the top patch I was
> > having issues with. It was the patch I replied to:
> > 
> > This change below that Sebastian marked as stable-rt is what is causing
> > me an issue. Not the patch that started the thread.
> > 
> 
> In fact, my system doesn't boot with this commit in 5.0-rt.
> 
> If I revert 90e1b18eba2ae4a729 ("swait: Delete the task from after a
> wakeup occured") the machine boots again.
> 
> Sebastian, I think that's a bad commit, please revert it.

Yeah.  d_wait_lookup() does not use __SWAITQUEUE_INITIALIZER() to
initialize its queue item, but uses swake_up_all(), so it goes
into an infinite loop since it won't remove the item because remove
isn't set.

I'd suspect there are other places this is the case.
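
The failure mode can be modelled in a few lines of userspace C.  This is a
simplified sketch, not the kernel code: the list is singly linked, the
names are made up for the example, and an iteration cap stands in for the
soft-lockup watchdog.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct swait_queue. */
struct waiter {
	struct waiter *next;
	unsigned int remove; /* mirrors the field added by the patch */
};

/*
 * Model of the patched wake-all loop: "wake" the first waiter, unlink
 * it only if ->remove is set, stop once the list is empty.
 * Returns the number of iterations, or -1 if the loop was still
 * running after max_iters (the kernel would spin forever).
 */
static int wake_all_model(struct waiter *head, int max_iters)
{
	int iters = 0;

	assert(max_iters > 0);

	while (head != NULL) {
		if (iters == max_iters)
			return -1;
		iters++;
		if (head->remove)
			head = head->next; /* list_del_init() equivalent */
		/* else: the same entry stays at the front of the list */
	}
	return iters;
}
```

A waiter set up without the initializer has remove == 0, so
wake_all_model() on it returns -1, which corresponds to the hang in
swake_up_all().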

-corey

> 
> Thanks!
> 
> -- Steve
> 
> > 
> > 
> > > Now.. that will fix it, but I think it is also wrong.
> > > 
> > > The problem being that it violates FIFO, something that might be more
> > > important on -RT than elsewhere.
> > > 
> > > The regular wait API seems confused/inconsistent when it uses
> > > autoremove_wake_function and default_wake_function, which doesn't help,
> > > but we can easily support this with swait -- the problematic thing is
> > > > the custom wake functions, we mustn't do that.
> > > 
> > > (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> > > > the same to swait_ so now it's named all different :/)
> > > 
> > > Something like the below perhaps.
> > > 
> > > ---
> > > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > > index 73e06e9986d4..f194437ae7d2 100644
> > > --- a/include/linux/swait.h
> > > +++ b/include/linux/swait.h
> > > @@ -61,11 +61,13 @@ struct swait_queue_head {
> > >  struct swait_queue {
> > >   struct task_struct  *task;
> > > >   struct list_head    task_list;
> > > > + unsigned int        remove;
> > >  };
> > >  
> > >  #define __SWAITQUEUE_INITIALIZER(name) { \
> > >   .task   = current,  \
> > >   .task_list  = LIST_HEAD_INIT((name).task_list), \
> > > + .remove = 1,\
> > >  }
> > >  
> > >  #define DECLARE_SWAITQUEUE(name) \
> > > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > > index e83a3f8449f6..86974ecbabfc 100644
> > > --- a/kernel/sched/swait.c
> > > +++ b/kernel/sched/swait.c
> > > @@ -28,7 +28,8 @@ void swake_up_locked(struct swait_queue_head *q)
> > >  
> > > >   curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
> > > >   wake_up_process(curr->task);
> > > > - list_del_init(&curr->task_list);
> > > > + if (curr->remove)
> > > > + list_del_init(&curr->task_list);
> > >  }
> > >  EXPORT_SYMBOL(swake_up_locked);
> > >  
> > > @@ -57,7 +58,8 @@ void swake_up_all(struct swait_queue_head *q)
> > > >   curr = list_first_entry(&tmp, typeof(*curr), task_list);
> > > >  
> > > >   wake_up_state(curr->task, TASK_NORMAL);
> > > > - list_del_init(&curr->task_list);
> > > > + if (curr->remove)
> > > > + list_del_init(&curr->task_list);
> > > >  
> > > >   if (list_empty(&tmp))
> > >   break;  
> > 
> 


Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

2019-07-01 Thread Corey Minyard
On Mon, Jul 01, 2019 at 04:18:40PM -0400, Steven Rostedt wrote:
> On Mon, 1 Jul 2019 14:09:49 -0500
> Corey Minyard  wrote:
> 
> > On Fri, Jun 28, 2019 at 09:49:03PM -0400, Steven Rostedt wrote:
> > > On Fri, 10 May 2019 12:33:18 +0200
> > > Sebastian Andrzej Siewior  wrote:
> 
> > > 
> > > When I applied this patch to 4.19-rt, I get the following lock up:  
> > 
> > I was unable to reproduce, and I looked at the code and I can't really
> > see a connection between this change and this crash.
> > 
> > Can you reproduce at will?  If so, can you send a testcase?
> > 
> 
> Yes, it won't boot. There is no testcase as I don't even make it to a
> boot prompt. I applied the patch and it crashes, I remove the patch and
> it boots without issue.
> 
> Attached is the full dmesg and the config used. I applied it to
> 
>   ae97a0ba0197fb424008a317b79bebacd6a50213
>   Linux 4.19.56-rt23
> 
> It works fine for 5.0.14-rt9 where it was added.
> 
> -- Steve

I show that patch is already applied at

1921ea799b7dc561c97185538100271d88ee47db
sched/completion: Fix a lockup in wait_for_completion()

git describe --contains 1921ea799b7dc561c97185538100271d88ee47db
v4.19.37-rt20~1

So I'm not sure what is going on.

-corey


Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

2019-07-01 Thread Corey Minyard
On Fri, Jun 28, 2019 at 09:49:03PM -0400, Steven Rostedt wrote:
> On Fri, 10 May 2019 12:33:18 +0200
> Sebastian Andrzej Siewior  wrote:
> 
> > On 2019-05-09 14:33:20 [-0500], miny...@acm.org wrote:
> > > From: Corey Minyard 
> > > 
> > > The function call do_wait_for_common() has a race condition that
> > > can result in lockups waiting for completions.  Adding the thread
> > > to (and removing the thread from) the wait queue for the completion
> > > is done outside the do loop in that function.  However, if the thread
> > > is woken up, the swake_up_locked() function will delete the entry
> > > from the wait queue.  If that happens and another thread sneaks
> > > in and decrements the done count in the completion to zero, the
> > > loop will go around again, but the thread will no longer be in the
> > > wait queue, so there is no way to wake it up.  
> > 
> > applied, thank you.
> > 
> 
> When I applied this patch to 4.19-rt, I get the following lock up:

I was unable to reproduce, and I looked at the code and I can't really
see a connection between this change and this crash.

Can you reproduce at will?  If so, can you send a testcase?

-corey

> 
> watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [sh:745]
> Modules linked in: floppy i915 drm_kms_helper drm fb_sys_fops sysimgblt 
> sysfillrect syscopyarea iosf_mbi i2c_algo_bit video
> CPU: 2 PID: 745 Comm: sh Not tainted 4.19.56-test-rt23+ #16
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by 
> O.E.M., BIOS SDBLI944.86P 05/08/2007
> RIP: 0010:_raw_spin_unlock_irq+0x17/0x4d
> Code: 48 8b 12 0f ba e2 12 73 07 e8 f1 4a 92 ff 31 c0 5b 5d c3 66 66 66 66 90 
> 55 48 89 e5 c6 07 00 e8 de 3d a3 ff fb bf 01 00 00 00  a7 27 9a ff 65 8b 
> 05 c8 7f 93 7e 85 c0 74 1f a9 ff ff
>  ff 7f 75
> RSP: 0018:c9c8bbb8 EFLAGS: 0246 ORIG_RAX: ff13
> RAX:  RBX: c9c8bd58 RCX: 0003
> RDX:  RSI: 8108ffab RDI: 0001
> RBP: c9c8bbb8 R08: 816dcd76 R09: 00020600
> R10: 0400 R11: 001c0eef1808 R12: c9c8bbc8
> R13: c9f13ca0 R14: 888074b2d7d8 R15: 8880789efe10
> FS:  () GS:88807b30() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0030662001b8 CR3: 376ac000 CR4: 06e0
> Call Trace:
>  swake_up_all+0xa6/0xde
>  __d_lookup_done+0x7c/0xc7
>  __d_add+0x44/0xf7
>  d_splice_alias+0x208/0x218
>  ext4_lookup+0x1a6/0x1c5
>  path_openat+0x63a/0xb15
>  ? preempt_latency_stop+0x25/0x27
>  do_filp_open+0x51/0xae
>  ? trace_preempt_on+0xde/0xe7
>  ? rt_spin_unlock+0x13/0x24
>  ? __alloc_fd+0x145/0x155
>  do_sys_open+0x81/0x125
>  __x64_sys_open+0x21/0x23
>  do_syscall_64+0x5c/0x6e
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> I haven't really looked too much into it though. I ran out of time :-/
> 
> -- Steve


Re: [PATCH] docs: ipmb: place it at driver-api and convert to ReST

2019-06-30 Thread Corey Minyard
On Sat, Jun 29, 2019 at 07:36:46AM -0300, Mauro Carvalho Chehab wrote:
> No new doc should be added at the main Documentation/ directory.
> 
> Instead, new docs should be added as ReST files, within the
> Kernel documentation body.

Got it, thanks.

-corey

> 
> Fixes: 51bd6f291583 ("Add support for IPMB driver")
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/driver-api/index.rst|  1 +
>  .../{IPMB.txt => driver-api/ipmb.rst} | 62 ++-
>  2 files changed, 33 insertions(+), 30 deletions(-)
>  rename Documentation/{IPMB.txt => driver-api/ipmb.rst} (71%)
> 
> diff --git a/Documentation/driver-api/index.rst 
> b/Documentation/driver-api/index.rst
> index e33849b948c7..e49c34bf16c0 100644
> --- a/Documentation/driver-api/index.rst
> +++ b/Documentation/driver-api/index.rst
> @@ -75,6 +75,7 @@ available subsections can be seen below.
> dell_rbu
> edid
> eisa
> +   ipmb
> isa
> isapnp
> generic-counter
> diff --git a/Documentation/IPMB.txt b/Documentation/driver-api/ipmb.rst
> similarity index 71%
> rename from Documentation/IPMB.txt
> rename to Documentation/driver-api/ipmb.rst
> index cd20c9764705..3ec3baed84c4 100644
> --- a/Documentation/IPMB.txt
> +++ b/Documentation/driver-api/ipmb.rst
> @@ -32,11 +32,11 @@ This driver works with the I2C driver and a userspace
>  program such as OpenIPMI:
>  
>  1) It is an I2C slave backend driver. So, it defines a callback
> -function to set the Satellite MC as an I2C slave.
> -This callback function handles the received IPMI requests.
> +   function to set the Satellite MC as an I2C slave.
> +   This callback function handles the received IPMI requests.
>  
>  2) It defines the read and write functions to enable a user
> -space program (such as OpenIPMI) to communicate with the kernel.
> +   space program (such as OpenIPMI) to communicate with the kernel.
>  
>  
>  Load the IPMB driver
> @@ -48,34 +48,35 @@ CONFIG_IPMB_DEVICE_INTERFACE=y
>  
>  1) If you want the driver to be loaded at boot time:
>  
> -a) Add this entry to your ACPI table, under the appropriate SMBus:
> +a) Add this entry to your ACPI table, under the appropriate SMBus::
>  
> -Device (SMB0) // Example SMBus host controller
> -{
> -  Name (_HID, "<Vendor-Specific HID>") // Vendor-Specific HID
> -  Name (_UID, 0) // Unique ID of particular host controller
> -  :
> -  :
> -Device (IPMB)
> -{
> -  Name (_HID, "IPMB0001") // IPMB device interface
> -  Name (_UID, 0) // Unique device identifier
> -}
> -}
> + Device (SMB0) // Example SMBus host controller
> + {
> + Name (_HID, "<Vendor-Specific HID>") // Vendor-Specific HID
> + Name (_UID, 0) // Unique ID of particular host controller
> + :
> + :
> +   Device (IPMB)
> +   {
> + Name (_HID, "IPMB0001") // IPMB device interface
> + Name (_UID, 0) // Unique device identifier
> +   }
> + }
>  
> -b) Example for device tree:
> +b) Example for device tree::
>  
> -&i2c2 {
> - status = "okay";
> +  &i2c2 {
> +status = "okay";
>  
> - ipmb@10 {
> - compatible = "ipmb-dev";
> - reg = <0x10>;
> - };
> -};
> +ipmb@10 {
> +compatible = "ipmb-dev";
> +reg = <0x10>;
> +};
> + };
>  
> -2) Manually from Linux:
> -modprobe ipmb-dev-int
> +2) Manually from Linux::
> +
> + modprobe ipmb-dev-int
>  
>  
>  Instantiate the device
> @@ -86,15 +87,16 @@ described in 
> 'Documentation/i2c/instantiating-devices.rst'.
>  If you have multiple BMCs, each connected to your Satellite MC via
>  a different I2C bus, you can instantiate a device for each of
>  those BMCs.
> +
>  The name of the instantiated device contains the I2C bus number
> -associated with it as follows:
> +associated with it as follows::
>  
> -BMC1 -- IPMB/I2C bus 1 -|   /dev/ipmb-1
> +  BMC1 -- IPMB/I2C bus 1 -|   /dev/ipmb-1
>   Satellite MC
> -BMC1 -- IPMB/I2C bus 2 -|   /dev/ipmb-2
> +  BMC1 -- IPMB/I2C bus 2 -|   /dev/ipmb-2
>  
>  For instance, you can instantiate the ipmb-dev-int device from
> -user space at the 7 bit address 0x10 on bus 2:
> +user space at the 7 bit address 0x10 on bus 2::
>  
># echo ipmb-dev 0x1010 > /sys/bus/i2c/devices/i2c-2/new_device
>  
> -- 
> 2.21.0
> 


Re: [PATCH] ipmi: ipmb: don't allocate i2c_client on stack

2019-06-19 Thread Corey Minyard
On Wed, Jun 19, 2019 at 02:50:34PM +0200, Arnd Bergmann wrote:
> The i2c_client structure can be fairly large, which leads to
> a warning about possible kernel stack overflow in some
> configurations:
> 
> drivers/char/ipmi/ipmb_dev_int.c:115:16: error: stack frame size of 1032 
> bytes in function 'ipmb_write' [-Werror,-Wframe-larger-than=]
> 
> There is no real reason to even declare an i2c_client, as we can simply
> call i2c_smbus_xfer() directly instead of the i2c_smbus_write_block_data()
> wrapper.
> 
> Convert the ipmb_write() to use an open-coded i2c_smbus_write_block_data()
> here, without changing the behavior.
> 
> It seems that there is another problem with this implementation;
> when user space passes a length of more than I2C_SMBUS_BLOCK_MAX
> bytes, all the rest is silently ignored. This should probably be
> addressed in a separate patch, but I don't know what the intended
> behavior is here.
> 
> Fixes: 51bd6f291583 ("Add support for IPMB driver")
> Signed-off-by: Arnd Bergmann 

I broke up the line with the call to i2c_smbus_xfer(), which was
longer than 80 characters, but that's it, it's in the IPMI next queue.

Thanks,

-corey
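
The silent truncation mentioned in the commit message can be shown with a
small userspace sketch.  BLOCK_MAX is defined locally with the same value
as the kernel's I2C_SMBUS_BLOCK_MAX (32); the structure and the helper are
hypothetical stand-ins written for this example, not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_MAX 32 /* same value as the kernel's I2C_SMBUS_BLOCK_MAX */

/* Stand-in for the block member of union i2c_smbus_data. */
struct smbus_block {
	unsigned char block[BLOCK_MAX + 2]; /* block[0] holds the length */
};

/*
 * Copy at most BLOCK_MAX bytes of 'payload' into 'data', as the patched
 * ipmb_write() does, and return how many bytes were left out.
 */
static size_t fill_block(struct smbus_block *data,
			 const unsigned char *payload, size_t len)
{
	size_t copied = len > BLOCK_MAX ? BLOCK_MAX : len;

	assert(data != NULL && payload != NULL);

	data->block[0] = (unsigned char)copied;
	memcpy(&data->block[1], payload, copied);
	return len - copied;
}
```

A non-zero return value is exactly the case the commit message flags: the
caller is told the whole write succeeded even though the tail was dropped.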

> ---
>  drivers/char/ipmi/ipmb_dev_int.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmb_dev_int.c 
> b/drivers/char/ipmi/ipmb_dev_int.c
> index 2895abf72e61..c9724f6cf32d 100644
> --- a/drivers/char/ipmi/ipmb_dev_int.c
> +++ b/drivers/char/ipmi/ipmb_dev_int.c
> @@ -117,7 +117,7 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>  {
>   struct ipmb_dev *ipmb_dev = to_ipmb_dev(file);
>   u8 rq_sa, netf_rq_lun, msg_len;
> - struct i2c_client rq_client;
> + union i2c_smbus_data data;
>   u8 msg[MAX_MSG_LEN];
>   ssize_t ret;
>  
> @@ -138,17 +138,17 @@ static ssize_t ipmb_write(struct file *file, const char 
> __user *buf,
>  
>   /*
>* subtract rq_sa and netf_rq_lun from the length of the msg passed to
> -  * i2c_smbus_write_block_data_local
> +  * i2c_smbus_xfer
>*/
>   msg_len = msg[IPMB_MSG_LEN_IDX] - SMBUS_MSG_HEADER_LENGTH;
> -
> - strcpy(rq_client.name, "ipmb_requester");
> - rq_client.adapter = ipmb_dev->client->adapter;
> - rq_client.flags = ipmb_dev->client->flags;
> - rq_client.addr = rq_sa;
> -
> - ret = i2c_smbus_write_block_data(&rq_client, netf_rq_lun, msg_len,
> - msg + SMBUS_MSG_IDX_OFFSET);
> + if (msg_len > I2C_SMBUS_BLOCK_MAX)
> + msg_len = I2C_SMBUS_BLOCK_MAX;
> +
> + data.block[0] = msg_len;
> + memcpy(&data.block[1], msg + SMBUS_MSG_IDX_OFFSET, msg_len);
> + ret = i2c_smbus_xfer(ipmb_dev->client->adapter, rq_sa, 
> ipmb_dev->client->flags,
> +  I2C_SMBUS_WRITE, netf_rq_lun,
> +  I2C_SMBUS_BLOCK_DATA, &data);
>  
>   return ret ? : count;
>  }
> -- 
> 2.20.0
> 

