Re: [PATCH v1 3/3] mtd: spi-nor: mtk-quadspi: rename config to a common one

2019-01-14 Thread Ryder Lee
On Tue, 2019-01-15 at 07:34 +, tudor.amba...@microchip.com wrote:
> Hi, Ryder,
> 
> On 01/14/2019 07:12 AM, Ryder Lee wrote:
> > The quadspi is a generic communication interface which could be shared
> > with other MediaTek SoCs. Hence rename it to a common one.
> > 
> > Signed-off-by: Ryder Lee 
> > ---
> > Changes since v1: rebase to v5.0-rc1. 
> 
> The patch doesn't apply on v5.0-rc1 or rc2.
> > ---
> >  drivers/mtd/spi-nor/Kconfig  | 16 
> >  drivers/mtd/spi-nor/Makefile |  2 +-
> >  2 files changed, 9 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/mtd/spi-nor/Kconfig b/drivers/mtd/spi-nor/Kconfig
> > index b433e5f..99d9d53 100644
> > --- a/drivers/mtd/spi-nor/Kconfig
> > +++ b/drivers/mtd/spi-nor/Kconfig
> > @@ -7,14 +7,6 @@ menuconfig MTD_SPI_NOR
> >  
> >  if MTD_SPI_NOR
> >  
> > -config MTD_MT81xx_NOR
> > -   tristate "Mediatek MT81xx SPI NOR flash controller"
> > -   depends on HAS_IOMEM
> > -   help
> > - This enables access to SPI NOR flash, using MT81xx SPI NOR flash
> > - controller. This controller does not support generic SPI BUS, it only
> > - supports SPI NOR Flash.
> > -
> >  config MTD_SPI_NOR_USE_4K_SECTORS
> > bool "Use small 4096 B erase sectors"
> > default y
> > @@ -68,6 +60,14 @@ config SPI_NXP_SPIFI
> >   Flash. Enable this option if you have a device with a SPIFI
> >   controller and want to access the Flash as a mtd device.
> >  
> > +config SPI_MTK_QUADSPI
> 
> Since you are moving the config into the file, would you mind putting your
> config in alphabetical order?
> 
> Thanks,
> ta

Okay, I will send a new one to fix them.

Thanks,
Ryder

> > +   tristate "MediaTek Quad SPI controller"
> > +   depends on HAS_IOMEM
> > +   help
> > + This enables support for the Quad SPI controller in master mode.
> > + This controller does not support generic SPI. It only supports
> > + SPI NOR.
> > +
> >  config SPI_INTEL_SPI
> > tristate
> >  
> > diff --git a/drivers/mtd/spi-nor/Makefile b/drivers/mtd/spi-nor/Makefile
> > index 2adedbe..189a15c 100644
> > --- a/drivers/mtd/spi-nor/Makefile
> > +++ b/drivers/mtd/spi-nor/Makefile
> > @@ -3,7 +3,7 @@ obj-$(CONFIG_MTD_SPI_NOR)   += spi-nor.o
> >  obj-$(CONFIG_SPI_ASPEED_SMC)   += aspeed-smc.o
> >  obj-$(CONFIG_SPI_CADENCE_QUADSPI)  += cadence-quadspi.o
> >  obj-$(CONFIG_SPI_HISI_SFC) += hisi-sfc.o
> > -obj-$(CONFIG_MTD_MT81xx_NOR)+= mtk-quadspi.o
> > +obj-$(CONFIG_SPI_MTK_QUADSPI)+= mtk-quadspi.o
> >  obj-$(CONFIG_SPI_NXP_SPIFI)+= nxp-spifi.o
> >  obj-$(CONFIG_SPI_INTEL_SPI)+= intel-spi.o
> >  obj-$(CONFIG_SPI_INTEL_SPI_PCI)+= intel-spi-pci.o
> > 




[PATCH 2/2] gpio: sprd: Fix incorrect irq type setting for the async EIC

2019-01-14 Thread Baolin Wang
From: Neo Hou 

When configuring the async EIC as IRQ_TYPE_EDGE_BOTH, we failed to set the
SPRD_EIC_ASYNC_INTMODE register to 0, which selects edge detection.

This patch fixes the issue.

Signed-off-by: Neo Hou 
Signed-off-by: Baolin Wang 
---
 drivers/gpio/gpio-eic-sprd.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpio/gpio-eic-sprd.c b/drivers/gpio/gpio-eic-sprd.c
index 257df59..e41223c 100644
--- a/drivers/gpio/gpio-eic-sprd.c
+++ b/drivers/gpio/gpio-eic-sprd.c
@@ -379,6 +379,7 @@ static int sprd_eic_irq_set_type(struct irq_data *data, unsigned int flow_type)
 			irq_set_handler_locked(data, handle_edge_irq);
 			break;
 		case IRQ_TYPE_EDGE_BOTH:
+			sprd_eic_update(chip, offset, SPRD_EIC_ASYNC_INTMODE, 0);
 			sprd_eic_update(chip, offset, SPRD_EIC_ASYNC_INTBOTH, 1);
 			irq_set_handler_locked(data, handle_edge_irq);
 			break;
-- 
1.7.9.5



[PATCH 1/2] gpio: sprd: Fix the incorrect data register

2019-01-14 Thread Baolin Wang
From: Neo Hou 

Since different types of EIC each have their own data register to read, fix
the incorrect data register.

Signed-off-by: Neo Hou 
Signed-off-by: Baolin Wang 
---
 drivers/gpio/gpio-eic-sprd.c |   13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-eic-sprd.c b/drivers/gpio/gpio-eic-sprd.c
index e0d6a0a..257df59 100644
--- a/drivers/gpio/gpio-eic-sprd.c
+++ b/drivers/gpio/gpio-eic-sprd.c
@@ -180,7 +180,18 @@ static void sprd_eic_free(struct gpio_chip *chip, unsigned int offset)
 
 static int sprd_eic_get(struct gpio_chip *chip, unsigned int offset)
 {
-   return sprd_eic_read(chip, offset, SPRD_EIC_DBNC_DATA);
+   struct sprd_eic *sprd_eic = gpiochip_get_data(chip);
+
+   switch (sprd_eic->type) {
+   case SPRD_EIC_DEBOUNCE:
+   return sprd_eic_read(chip, offset, SPRD_EIC_DBNC_DATA);
+   case SPRD_EIC_ASYNC:
+   return sprd_eic_read(chip, offset, SPRD_EIC_ASYNC_DATA);
+   case SPRD_EIC_SYNC:
+   return sprd_eic_read(chip, offset, SPRD_EIC_SYNC_DATA);
+   default:
+   return -ENOTSUPP;
+   }
 }
 
 static int sprd_eic_direction_input(struct gpio_chip *chip, unsigned int offset)
-- 
1.7.9.5
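
The per-type register selection in this patch can be sketched in plain user-space C. This is only an illustration under stated assumptions: the enum names and register offsets below are hypothetical placeholders (the real definitions live in drivers/gpio/gpio-eic-sprd.c), and user-space ENOTSUP stands in for the kernel-internal ENOTSUPP.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the driver's controller types. */
enum eic_type { EIC_DEBOUNCE, EIC_LATCH, EIC_ASYNC, EIC_SYNC };

/* Placeholder register offsets (assumption, not the hardware values). */
enum { DBNC_DATA = 0x00, ASYNC_DATA = 0x14, SYNC_DATA = 0x20 };

/*
 * Return the data register offset for a controller type, or a negative
 * errno when this sketch has no data register for that type.
 */
int eic_data_reg(enum eic_type type)
{
	switch (type) {
	case EIC_DEBOUNCE:
		return DBNC_DATA;
	case EIC_ASYNC:
		return ASYNC_DATA;
	case EIC_SYNC:
		return SYNC_DATA;
	default:
		return -ENOTSUP;
	}
}
```

Before the patch, every type read the debounce register, which corresponds to always returning DBNC_DATA here.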



Re: [PATCH v1 2/3] mtd: spi-nor: mtk-quadspi: add SNOR_HWCAPS_READ for capcity setting

2019-01-14 Thread Guochun Mao
On Tue, 2019-01-15 at 06:59 +, tudor.amba...@microchip.com wrote:
> Hi, Ryder,
> 
> On 01/14/2019 07:12 AM, Ryder Lee wrote:
> > From: Guochun Mao 
> > 
> > SNOR_HWCAPS_READ is a basic read mode for both flash and controller,
> > it should be supported, so add the capcity for mtk-quadspi.
> 
> Since I couldn't find a datasheet for mt8173, I tend to share your assumption -
> SNOR_HWCAPS_READ should be supported by this controller. However, it's always
> better to test it and not rely on assumptions. You can test it by forcing the
> mask to have just SNOR_HWCAPS_READ | SNOR_HWCAPS_PP set. Or you already
> tested it?

All of our IPs support SNOR_HWCAPS_READ; Ryder and I have tested it.

> 
> You have a typo in capcity. Maybe substitute it with capability or "add this
> flag to spi_nor_hwcaps mask"

OK, we'll correct it in the next version.

Thanks.
Guochun
> 
> > 
> > Signed-off-by: Guochun Mao 
> 
> You should add your SoB tag, because you are sending a patch that is not yours.
> 
> Cheers,
> ta
> 
> > ---
> > Changes since v1: none. 
> > ---
> >  drivers/mtd/spi-nor/mtk-quadspi.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/mtd/spi-nor/mtk-quadspi.c b/drivers/mtd/spi-nor/mtk-quadspi.c
> > index 5442993..d9eed68 100644
> > --- a/drivers/mtd/spi-nor/mtk-quadspi.c
> > +++ b/drivers/mtd/spi-nor/mtk-quadspi.c
> > @@ -431,7 +431,8 @@ static int mtk_nor_init(struct mtk_nor *mtk_nor,
> > struct device_node *flash_node)
> >  {
> > const struct spi_nor_hwcaps hwcaps = {
> > -   .mask = SNOR_HWCAPS_READ_FAST |
> > +   .mask = SNOR_HWCAPS_READ |
> > +   SNOR_HWCAPS_READ_FAST |
> > SNOR_HWCAPS_READ_1_1_2 |
> > SNOR_HWCAPS_PP,
> > };
> > 
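
The test suggested above (forcing the mask to just SNOR_HWCAPS_READ | SNOR_HWCAPS_PP) can be modelled in user space. This is a rough sketch: the bit positions are placeholders, the HWCAPS_* names and pick_read_mode() are hypothetical, and the rule that the highest shared read bit wins is an assumption about how the spi-nor core selects a mode.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions (assumption); the real ones are in include/linux/mtd/spi-nor.h. */
#define HWCAPS_READ        (UINT32_C(1) << 0)
#define HWCAPS_READ_FAST   (UINT32_C(1) << 1)
#define HWCAPS_READ_1_1_2  (UINT32_C(1) << 3)
#define HWCAPS_PP          (UINT32_C(1) << 16)
#define HWCAPS_READ_ALL    (HWCAPS_READ | HWCAPS_READ_FAST | HWCAPS_READ_1_1_2)

/* Mask as proposed by the patch. */
const uint32_t patched_mask =
	HWCAPS_READ | HWCAPS_READ_FAST | HWCAPS_READ_1_1_2 | HWCAPS_PP;

/* Mask restricted for testing, so only the basic read mode can be chosen. */
const uint32_t forced_mask = HWCAPS_READ | HWCAPS_PP;

/* Pick the highest read-capability bit that controller and flash share, or 0 if none. */
uint32_t pick_read_mode(uint32_t controller, uint32_t flash)
{
	uint32_t shared = controller & flash & HWCAPS_READ_ALL;
	uint32_t best = 0;

	for (uint32_t bit = 1; bit != 0; bit <<= 1)
		if (shared & bit)
			best = bit;
	return best;
}
```

With the forced mask, a flash that supports every read mode still ends up on the basic read command, which is what makes the test meaningful.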




Re: [PATCHv2 6/7] x86/mm: remove bottom-up allocation style for x86_64

2019-01-14 Thread Pingfan Liu
On Tue, Jan 15, 2019 at 7:27 AM Dave Hansen  wrote:
>
> On 1/10/19 9:12 PM, Pingfan Liu wrote:
> > Although kaslr-kernel can avoid to stain the movable node. [1]
>
> Can you explain what staining is, or perhaps try to use some more
> standard nomenclature?  There are exactly 0 instances of the word
> "stain" in arch/x86/ or mm/.
>
I mean that KASLR may randomly choose a base address that is located in
a movable node.

> > But the
> > pgtable can still stain the movable node. That is a probability problem,
> > although low, but exist. This patch tries to make it certainty by
> > allocating pgtable on unmovable node, instead of following kernel end.
>
> Anyway, can you read my suggested summary in the earlier patch and see
> if it fits or if I missed anything?  This description is really hard to
> read.
>
Your summary in the reply to [PATCH 0/7] expresses things clearly. I
will use it to update the commit log.

> ...
> > +#ifdef CONFIG_X86_32
> > +
> > +static unsigned long min_pfn_mapped;
> > +
> >  static unsigned long __init get_new_step_size(unsigned long step_size)
> >  {
> >   /*
> > @@ -653,6 +655,32 @@ static void __init memory_map_bottom_up(unsigned long map_start,
> >   }
> >  }
> >
> > +static unsigned long __init init_range_memory_mapping32(
> > + unsigned long r_start, unsigned long r_end)
> > +{
>
> Why is this returning a value which is not used?
>
> Did you compile this?  Didn't you get a warning that you're not
> returning a value from a function returning non-void?
>
It should be void. I will fix it in the next version.

> Also, I'd much rather see something like this written:
>
> static __init
> unsigned long init_range_memory_mapping32(unsigned long r_start,
>   unsigned long r_end)
>
> than what you have above.  But, if you get rid of the 'unsigned long',
> it will look much more sane in the first place.

Yes. Thank you for your kind review.

Best Regards,
Pingfan


Re: [PATCH v1 3/3] mtd: spi-nor: mtk-quadspi: rename config to a common one

2019-01-14 Thread Tudor.Ambarus
Hi, Ryder,

On 01/14/2019 07:12 AM, Ryder Lee wrote:
> The quadspi is a generic communication interface which could be shared
> with other MediaTek SoCs. Hence rename it to a common one.
> 
> Signed-off-by: Ryder Lee 
> ---
> Changes since v1: rebase to v5.0-rc1. 

The patch doesn't apply on v5.0-rc1 or rc2.

> ---
>  drivers/mtd/spi-nor/Kconfig  | 16 
>  drivers/mtd/spi-nor/Makefile |  2 +-
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/mtd/spi-nor/Kconfig b/drivers/mtd/spi-nor/Kconfig
> index b433e5f..99d9d53 100644
> --- a/drivers/mtd/spi-nor/Kconfig
> +++ b/drivers/mtd/spi-nor/Kconfig
> @@ -7,14 +7,6 @@ menuconfig MTD_SPI_NOR
>  
>  if MTD_SPI_NOR
>  
> -config MTD_MT81xx_NOR
> - tristate "Mediatek MT81xx SPI NOR flash controller"
> - depends on HAS_IOMEM
> - help
> -   This enables access to SPI NOR flash, using MT81xx SPI NOR flash
> -   controller. This controller does not support generic SPI BUS, it only
> -   supports SPI NOR Flash.
> -
>  config MTD_SPI_NOR_USE_4K_SECTORS
>   bool "Use small 4096 B erase sectors"
>   default y
> @@ -68,6 +60,14 @@ config SPI_NXP_SPIFI
> Flash. Enable this option if you have a device with a SPIFI
> controller and want to access the Flash as a mtd device.
>  
> +config SPI_MTK_QUADSPI

Since you are moving the config into the file, would you mind putting your config
in alphabetical order?

Thanks,
ta

> + tristate "MediaTek Quad SPI controller"
> + depends on HAS_IOMEM
> + help
> +   This enables support for the Quad SPI controller in master mode.
> +   This controller does not support generic SPI. It only supports
> +   SPI NOR.
> +
>  config SPI_INTEL_SPI
>   tristate
>  
> diff --git a/drivers/mtd/spi-nor/Makefile b/drivers/mtd/spi-nor/Makefile
> index 2adedbe..189a15c 100644
> --- a/drivers/mtd/spi-nor/Makefile
> +++ b/drivers/mtd/spi-nor/Makefile
> @@ -3,7 +3,7 @@ obj-$(CONFIG_MTD_SPI_NOR) += spi-nor.o
>  obj-$(CONFIG_SPI_ASPEED_SMC) += aspeed-smc.o
>  obj-$(CONFIG_SPI_CADENCE_QUADSPI)+= cadence-quadspi.o
>  obj-$(CONFIG_SPI_HISI_SFC)   += hisi-sfc.o
> -obj-$(CONFIG_MTD_MT81xx_NOR)+= mtk-quadspi.o
> +obj-$(CONFIG_SPI_MTK_QUADSPI)+= mtk-quadspi.o
>  obj-$(CONFIG_SPI_NXP_SPIFI)  += nxp-spifi.o
>  obj-$(CONFIG_SPI_INTEL_SPI)  += intel-spi.o
>  obj-$(CONFIG_SPI_INTEL_SPI_PCI)  += intel-spi-pci.o
> 


Re: [PATCHv2 2/7] acpi: change the topo of acpi_table_upgrade()

2019-01-14 Thread Pingfan Liu
On Tue, Jan 15, 2019 at 7:12 AM Dave Hansen  wrote:
>
> On 1/10/19 9:12 PM, Pingfan Liu wrote:
> > The current acpi_table_upgrade() relies on initrd_start, but this var is
>
> "var" meaning variable?
>
> Could you please go back and try to ensure you spell out all the words
> you are intending to write?  I think "topo" probably means "topology",
> but it's a really odd word to use for changing the arguments of a
> function, so I'm not sure.
>
> There are a couple more of these in this set.
>
Yes. I will do that and fix them in the next version.

> > only valid after relocate_initrd(). There is requirement to extract the
> > acpi info from initrd before memblock-allocator can work(see [2/4]), hence
> > acpi_table_upgrade() need to accept the input param directly.
>
> "[2/4]"
>
> It looks like you quickly resent this set without updating the patch
> descriptions.
>
> > diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> > index 61203ee..84e0a79 100644
> > --- a/drivers/acpi/tables.c
> > +++ b/drivers/acpi/tables.c
> > @@ -471,10 +471,8 @@ static DECLARE_BITMAP(acpi_initrd_installed, NR_ACPI_INITRD_TABLES);
> >
> >  #define MAP_CHUNK_SIZE   (NR_FIX_BTMAPS << PAGE_SHIFT)
> >
> > -void __init acpi_table_upgrade(void)
> > +void __init acpi_table_upgrade(void *data, size_t size)
> >  {
> > - void *data = (void *)initrd_start;
> > - size_t size = initrd_end - initrd_start;
> >   int sig, no, table_nr = 0, total_offset = 0;
> >   long offset = 0;
> >   struct acpi_table_header *table;
>
> I know you are just replacing some existing variables, but we have a
> slightly higher standard for naming when you actually have to specify
> arguments to a function.  Can you please give these proper names?
>
OK, I will change it to acpi_table_upgrade(void *initrd, size_t size).

Thanks,
Pingfan


Re: [PATCH v3 1/3] powerpc/mm: prepare kernel for KAsan on PPC32

2019-01-14 Thread Christophe Leroy




On 01/14/2019 09:34 AM, Dmitry Vyukov wrote:
> On Sat, Jan 12, 2019 at 12:16 PM Christophe Leroy
>  wrote:
> >
> > In kernel/cputable.c, explicitly use memcpy() in order
> > to allow GCC to replace it with __memcpy() when KASAN is
> > selected.
> >
> > Since commit 400c47d81ca38 ("powerpc32: memset: only use dcbz once cache is
> > enabled"), memset() can be used before activation of the cache,
> > so no need to use memset_io() for zeroing the BSS.
> >
> > Signed-off-by: Christophe Leroy 
> > ---
> >  arch/powerpc/kernel/cputable.c | 4 ++--
> >  arch/powerpc/kernel/setup_32.c | 6 ++
> >  2 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> > index 1eab54bc6ee9..84814c8d1bcb 100644
> > --- a/arch/powerpc/kernel/cputable.c
> > +++ b/arch/powerpc/kernel/cputable.c
> > @@ -2147,7 +2147,7 @@ void __init set_cur_cpu_spec(struct cpu_spec *s)
> >         struct cpu_spec *t = &the_cpu_spec;
> >
> >         t = PTRRELOC(t);
> > -       *t = *s;
> > +       memcpy(t, s, sizeof(*t));

> Hi Christophe,
>
> I understand why you are doing this, but this looks a bit fragile and
> non-scalable. This may not work with the next version of the compiler,
> with a compiler version that merely differs from yours, with clang, etc.


My feeling would be that this change makes it more solid.

My understanding is that when you do *t = *s, the compiler can use 
whatever way it wants to do the copy.
When you do memcpy(), you ensure it will do it that way and not another 
way, don't you ?


My problem is that when using *t = *s, the function set_cur_cpu_spec() 
always calls memcpy(), not taking into account the following define 
which is in arch/powerpc/include/asm/string.h (other arches do the same):


#if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
/*
 * For files that are not instrumented (e.g. mm/slub.c) we
 * should use not instrumented version of mem* functions.
 */
#define memcpy(dst, src, len) __memcpy(dst, src, len)
#define memmove(dst, src, len) __memmove(dst, src, len)
#define memset(s, c, n) __memset(s, c, n)
#endif

void __init set_cur_cpu_spec(struct cpu_spec *s)
{
	struct cpu_spec *t = &the_cpu_spec;

	t = PTRRELOC(t);
	*t = *s;

	*PTRRELOC(&cur_cpu_spec) = &the_cpu_spec;
}

00000000 <set_cur_cpu_spec>:
   0:   94 21 ff f0     stwu    r1,-16(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   bf c1 00 08     stmw    r30,8(r1)
   c:   3f e0 00 00     lis     r31,0
                        e: R_PPC_ADDR16_HA      .data..read_mostly
  10:   3b ff 00 00     addi    r31,r31,0
                        12: R_PPC_ADDR16_LO     .data..read_mostly
  14:   7c 7e 1b 78     mr      r30,r3
  18:   7f e3 fb 78     mr      r3,r31
  1c:   90 01 00 14     stw     r0,20(r1)
  20:   48 00 00 01     bl      20
                        20: R_PPC_REL24 add_reloc_offset
  24:   7f c4 f3 78     mr      r4,r30
  28:   38 a0 00 58     li      r5,88
  2c:   48 00 00 01     bl      2c
                        2c: R_PPC_REL24 memcpy
  30:   38 7f 00 58     addi    r3,r31,88
  34:   48 00 00 01     bl      34
                        34: R_PPC_REL24 add_reloc_offset
  38:   93 e3 00 00     stw     r31,0(r3)
  3c:   80 01 00 14     lwz     r0,20(r1)
  40:   bb c1 00 08     lmw     r30,8(r1)
  44:   7c 08 03 a6     mtlr    r0
  48:   38 21 00 10     addi    r1,r1,16
  4c:   4e 80 00 20     blr


When replacing *t = *s with memcpy(t, s, sizeof(*t)), GCC replaces it with
__memcpy() as expected.
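
The difference described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: plain_memcpy() is a hypothetical stand-in for __memcpy(), struct spec is a hypothetical stand-in for struct cpu_spec, and the call counter exists only to make the effect observable. The point it shows is that a function-like macro only rewrites explicit memcpy() calls in the source; a struct assignment never passes through the preprocessor macro, so any memcpy() call the compiler emits for it is not redirected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct cpu_spec. */
struct spec {
	char name[32];
	unsigned long features;
	int id;
};

/* Counts how many copies went through the redirected path. */
int redirected_calls;

/* Stand-in for the uninstrumented __memcpy(). */
void *plain_memcpy(void *dst, const void *src, size_t len)
{
	redirected_calls++;
	return memmove(dst, src, len);
}

/* Same shape as the redirect in arch/powerpc/include/asm/string.h. */
#undef memcpy
#define memcpy(dst, src, len) plain_memcpy(dst, src, len)

/* Struct assignment: the macro never sees this copy. */
void copy_by_assignment(struct spec *t, const struct spec *s)
{
	*t = *s;
}

/* Explicit call: the preprocessor rewrites it to plain_memcpy(). */
void copy_by_call(struct spec *t, const struct spec *s)
{
	memcpy(t, s, sizeof(*t));
}
```

Both functions copy the same data; only the explicit call is guaranteed to reach the redirected function.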




> Does using -ffreestanding and/or -fno-builtin-memcpy (-memset) help?


No it doesn't and to be honest I can't see how it would. My 
understanding is that it could be even worse because it would mean 
adding calls to memcpy() also in all trivial places where GCC does the 
copy itself by default.


Do you see any alternative ?

Christophe


> If it helps, perhaps it makes sense to add these flags to
> KASAN_SANITIZE := n files.



> >
> >         *PTRRELOC(&cur_cpu_spec) = &the_cpu_spec;
> >  }
> > @@ -2162,7 +2162,7 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
> >         old = *t;
> >
> >         /* Copy everything, then do fixups */
> > -       *t = *s;
> > +       memcpy(t, s, sizeof(*t));
> >
> >         /*
> >          * If we are overriding a previous value derived from the real
> > diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
> > index 947f904688b0..5e761eb16a6d 100644
> > --- a/arch/powerpc/kernel/setup_32.c
> > +++ b/arch/powerpc/kernel/setup_32.c
> > @@ -73,10 +73,8 @@ notrace unsigned long __init early_init(unsigned long dt_ptr)
> >  {
> >         unsigned long offset = reloc_offset();
> >
> > -       /* First zero the BSS -- use memset_io, some platforms don't have
> > -        * caches on yet */
> > -       memset_io((void __iomem *)PTRRELOC(&__bss_start), 0,
> > -                       __bss_stop - __bss_start);
> > +       /* First zero the BSS */
> > +       memset(PTRRELOC(&__bss_start), 0, __bss_stop - __bss_start);
> >
> >         /*
> >          * Identify the CPU type and fix up code sections
> > --
> > 2.13.3


Re: Regression in 32-bit-compat TIOCGPTPEER ioctl due to 311fc65c9fb9c966bca8e6f3ff8132ce57344ab9

2019-01-14 Thread Eric W. Biederman
"Robert O'Callahan"  writes:

> This commit refactored the implementation of TIOCGPTPEER, moving "case
> TIOCGPTPEER" from pty_unix98_ioctl() to tty_ioctl().
> pty_unix98_ioctl() is called by pty_unix98_compat_ioctl(), so before
> the commit, TIOCGPTPEER worked for 32-bit userspace. Unfortunately
> tty_compat_ioctl() does not call tty_ioctl() so after the commit,
> TIOCGPTPEER from 32-bit userspace fails with ENOTTY.
>
> Testcase in https://bugzilla.kernel.org/show_bug.cgi?id=202271.
>
> I found this bug running the rr test suite.

Can you confirm this fixes it for you?

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index bfe9ad85b362..1b0847976b28 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2815,6 +2815,7 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,
case TCXONC:
case TIOCMIWAIT:
case TIOCSERCONFIG:
+   case TIOCGPTPEER:
return tty_ioctl(file, cmd, arg);
}


Thank you,
Eric Biederman
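
The regression and the proposed fix can be modelled with a simplified dispatcher. This is a sketch under stated assumptions: the CMD_* numbers are hypothetical, native_ioctl() and compat_ioctl() are hypothetical stand-ins for tty_ioctl() and tty_compat_ioctl(), and the real compat handler tries further handlers before giving up, which is collapsed here into returning -ENOTTY.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, not the real ioctl encodings. */
enum { CMD_TIOCSERCONFIG = 1, CMD_TIOCGPTPEER = 2, CMD_OTHER = 3 };

/* Stand-in for the native handler, which knows both commands. */
long native_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case CMD_TIOCSERCONFIG:
	case CMD_TIOCGPTPEER:
		return 0;
	default:
		return -ENOTTY;
	}
}

/*
 * Stand-in for the compat handler: only commands on its pass-through list
 * reach the native handler. with_fix selects whether TIOCGPTPEER is listed.
 */
long compat_ioctl(unsigned int cmd, int with_fix)
{
	switch (cmd) {
	case CMD_TIOCSERCONFIG:
		return native_ioctl(cmd);
	case CMD_TIOCGPTPEER:
		if (with_fix)
			return native_ioctl(cmd);
		break;
	}
	return -ENOTTY;
}
```

Without the extra case label, a command the native handler supports still fails for 32-bit callers, which matches the reported ENOTTY.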


Re: [PATCH v3] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-14 Thread Michal Hocko
On Mon 14-01-19 12:18:07, Shakeel Butt wrote:
> On Sun, Jan 13, 2019 at 10:34 AM Michal Hocko  wrote:
> >
> > On Fri 11-01-19 14:54:32, Shakeel Butt wrote:
> > > Hi Johannes,
> > >
> > > > On Fri, Jan 11, 2019 at 12:59 PM Johannes Weiner  wrote:
> > > >
> > > > Hi Shakeel,
> > > >
> > > > On Thu, Jan 10, 2019 at 09:44:32AM -0800, Shakeel Butt wrote:
> > > > > > If a memcg is over high limit, memory reclaim is scheduled to run on
> > > > > > return-to-userland.  However it is assumed that the memcg is the current
> > > > > > process's memcg.  With remote memcg charging for kmem or swapping in a
> > > > > > page charged to remote memcg, current process can trigger reclaim on
> > > > > > remote memcg.  So, scheduling reclaim on return-to-userland for remote
> > > > > > memcgs will ignore the high reclaim altogether. So, record the memcg
> > > > > > needing high reclaim and trigger high reclaim for that memcg on
> > > > > > return-to-userland.  However if the memcg is already recorded for high
> > > > > > reclaim and the recorded memcg is not the descendant of the memcg
> > > > > > needing high reclaim, punt the high reclaim to the work queue.
> > > >
> > > > The idea behind remote charging is that the thread allocating the
> > > > memory is not responsible for that memory, but a different cgroup
> > > > is. Why would the same thread then have to work off any high excess
> > > > this could produce in that unrelated group?
> > > >
> > > > Say you have a inotify/dnotify listener that is restricted in its
> > > > memory use - now everybody sending notification events from outside
> > > > that listener's group would get throttled on a cgroup over which it
> > > > has no control. That sounds like a recipe for priority inversions.
> > > >
> > > > It seems to me we should only do reclaim-on-return when current is in
> > > > the ill-behaved cgroup, and punt everything else - interrupts and
> > > > remote charges - to the workqueue.
> > >
> > > This is what v1 of this patch was doing but Michal suggested to do
> > > what this version is doing. Michal's argument was that the current is
> > > already charging and maybe reclaiming a remote memcg then why not do
> > > the high excess reclaim as well.
> >
> > Johannes has a good point about the priority inversion problems which I
> > haven't thought about.
> >
> > > Personally I don't have any strong opinion either way. What I actually
> > > wanted was to punt this high reclaim to some process in that remote
> > > memcg. However I didn't explore much on that direction thinking if
> > > that complexity is worth it. Maybe I should at least explore it, so,
> > > we can compare the solutions. What do you think?
> >
> > My question would be whether we really care all that much. Do we know of
> > workloads which would generate a large high limit excess?
> >
> 
> The current semantics of memory.high is that it can be breached under
> extreme conditions. However any workload where memory.high is used and
> a lot of remote memcg charging happens (inotify/dnotify example given
> by Johannes or swapping in tmpfs file or shared memory region) the
> memory.high breach will become common.

This is exactly what I am asking about. Is this something that can
happen easily? Remote charges by themselves should be rare, no?
-- 
Michal Hocko
SUSE Labs


Re: KASAN: use-after-scope Read in corrupted

2019-01-14 Thread Dmitry Vyukov
On Tue, Jan 15, 2019 at 5:43 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    1bdbe2274920 Merge tag 'vfio-v5.0-rc2' of git://github.com..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1519d39f40
> kernel config:  https://syzkaller.appspot.com/x/.config?x=edf1c3031097c304
> dashboard link: https://syzkaller.appspot.com/bug?extid=bd36b7dd9330f67037ab
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10fce14f40
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=110b201740

Based on the reproducer this is:

#syz dup: kernel panic: stack is corrupted in udp4_lib_lookup2


> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+bd36b7dd9330f6703...@syzkaller.appspotmail.com
>
> ==
> BUG: KASAN: use-after-scope in debug_lockdep_rcu_enabled.part.0+0x50/0x60
> kernel/rcu/update.c:249
> Read of size 4 at addr 8880a945eabc by task
> `9��#�(  �<�  k���E�>9hA/-2122188634
>
> CPU: 0 PID: -2122188634 Comm: ��E�O2� Not tainted 5.0.0-rc1+
> #19
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> [ cut here ]
> Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected
> to SLAB object 'task_struct' (offset 1344, size 8)!
> WARNING: CPU: 0 PID: -1455036288 at mm/usercopy.c:78
> usercopy_warn+0xeb/0x110 mm/usercopy.c:78
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 0 PID: -1455036288 Comm: ��E�O2� Not tainted 5.0.0-rc1+
> #19
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>


Re: [PATCH v15 3/6] x86/boot: Introduce efi_get_rsdp_addr() to find RSDP from EFI table

2019-01-14 Thread Chao Fan
On Mon, Jan 14, 2019 at 10:07:56AM +0100, Borislav Petkov wrote:
>On Mon, Jan 14, 2019 at 09:26:42AM +0800, Chao Fan wrote:
>> According to the code, I saw:
>> #ifdef ACPI_ASL_COMPILER
>> #define ACPI_32BIT_PHYSICAL_ADDRESS
>> #endif
>> 
>> and then
>> #ifdef ACPI_32BIT_PHYSICAL_ADDRESS
>> typedef u32 acpi_physical_address;
>> 
>> As for ACPI_ASL_COMPILER, I saw iASL in documentation, but can't find more
>> information in the code. If I miss something, please let me know.
>
>And, as a result, can acpi_physical_address ever be u32 in a kernel
>build?

From my understanding after looking into the commit message and the comments,
I think yes. For a 32-bit OS:
32-bit OS without PAE: it's u32.
32-bit OS with PAE on a 64-bit machine: it's u64.

So I think there are some situations where it's u32.
'acpi_physical_address' was always u32 for a 32-bit OS, and then, to
solve some problems, "Zheng, Lv" added the commit. So I have added
"Zheng, Lv" to Cc; I am not sure whether "Zheng, Lv" can give some suggestions
about when acpi_physical_address is defined as u32.

Thanks,
Chao Fan

>
>git annotate is also very helpful when doing git archeology like, for
>example, finding the patch which added the ifdeffery and looking at its
>commit message for more hints.
>
>-- 
>Regards/Gruss,
>Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>




[PATCH] kbuild: mark prepare0 as PHONY to fix external module build

2019-01-14 Thread Masahiro Yamada
Commit c3ff2a5193fa ("powerpc/32: add stack protector support")
caused kernel panic on PowerPC if an external module is used with
CONFIG_STACKPROTECTOR because the 'prepare' target was not executed
for the external module build.

Commit e07db28eea38 ("kbuild: fix single target build for external
module") turned it into a build error because the 'prepare' target is
now executed but the 'prepare0' target is missing for the external
module build.

External module on arm/arm64 with CONFIG_STACKPROTECTOR_PER_TASK is
also broken in the same way.

Move 'PHONY += prepare0' to the common place. Make is fine with missing
rule for phony targets.

I minimized the change so it can be easily backported to 4.20.x.

To fix v4.20 for external modules of PowerPC, please backport
e07db28eea38 ("kbuild: fix single target build for external module"),
and then this commit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=201891
Fixes: e07db28eea38 ("kbuild: fix single target build for external module")
Fixes: c3ff2a5193fa ("powerpc/32: add stack protector support")
Fixes: 189af4657186 ("ARM: smp: add support for per-task stack canaries")
Fixes: 0a1213fa7432 ("arm64: enable per-task stack canaries")
Cc: linux-stable  # v4.20
Reported-by: Samuel Holland 
Reported-by: Alexey Kardashevskiy 
Signed-off-by: Masahiro Yamada 
---

 Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 499b968..789b332 100644
--- a/Makefile
+++ b/Makefile
@@ -955,6 +955,7 @@ ifdef CONFIG_STACK_VALIDATION
   endif
 endif
 
+PHONY += prepare0
 
 ifeq ($(KBUILD_EXTMOD),)
 core-y += kernel/ certs/ mm/ fs/ ipc/ security/ crypto/ block/
@@ -1061,8 +1062,7 @@ scripts: scripts_basic scripts_dtc
 # archprepare is used in arch Makefiles and when processed asm symlink,
 # version.h and scripts_basic is processed / created.
 
-# Listed in dependency order
-PHONY += prepare archprepare prepare0 prepare1 prepare2 prepare3
+PHONY += prepare archprepare prepare1 prepare2 prepare3
 
 # prepare3 is used to check if we are building in a separate output directory,
 # and if so do:
-- 
2.7.4



Re: Plain accesses and data races in the Linux Kernel Memory Model

2019-01-14 Thread Dmitry Vyukov
On Tue, Jan 15, 2019 at 12:54 AM Paul E. McKenney  wrote:
>
> On Mon, Jan 14, 2019 at 02:41:49PM -0500, Alan Stern wrote:
> > The patch below is my first attempt at adapting the Linux Kernel
> > Memory Model to handle plain accesses (i.e., those which aren't
> > specially marked as READ_ONCE, WRITE_ONCE, acquire, release,
> > read-modify-write, or lock accesses).  This work is based on an
> > initial proposal created by Andrea Parri back in December 2017,
> > although it has grown a lot since then.
>
> Hello, Alan,
>
> Good stuff!!!
>
> I tried applying this in order to test it against the various litmus
> tests, but no joy.  Could you please tell me what commit is this patch
> based on?
>
> Thanx, Paul
>
> > The adaptation involves two main aspects: recognizing the ordering
> > induced by plain accesses and detecting data races.  They are handled
> > separately.  In fact, the code for figuring out the ordering assumes
> > there are no data races (the idea being that if a data race is
> > present then pretty much anything could happen, so there's no point
> > worrying about it -- obviously this will have to be changed if we want
> > to cover seqlocks).

Hi Alan,

Is there a mailing list dedicated to this effort? Private messages
tend to get lost over time: there is no archive, and it is not possible to
send a link or show the full history to anybody, etc.

Re seqlocks, strictly speaking, defining races for seqlocks is not
necessary. Seqlocks can be expressed without races in C by using
relaxed atomic loads within the read critical section. We may consider
this option as well.


> > This is a relatively major change to the model and it will require a
> > lot of scrutiny and testing.  At the moment, I haven't even tried to
> > compare it with the existing model on our library of litmus tests.
> >
> > The difficulty with incorporating plain accesses in the memory model
> > is that the compiler has very few constraints on how it treats plain
> > accesses.  It can eliminate them, duplicate them, rearrange them,
> > merge them, split them up, and goodness knows what else.  To make some
> > sense of this, I have taken the view that a plain access can exist
> > (perhaps multiple times) within a certain bounded region of code.
> > Ordering of two accesses X and Y means that we guarantee at least one
> > instance of the X access must be executed before any instances of the
> > Y access.  (This is assuming that neither of the accesses is
> > completely eliminated by the compiler; otherwise there is nothing to
> > order!)
> >
> > After adding some simple definitions for the sets of plain and marked
> > accesses and for compiler barriers, the patch updates the ppo
> > relation.  The basic idea here is that ppo can be broken down into
> > categories: memory barriers, overwrites, and dependencies (including
> > dep-rfi).
> >
> >   Memory barriers always provide ordering (compiler barriers do
> >   not but they have indirect effects).
> >
> >   Overwriting always provides ordering.  This may seem
> >   surprising in the case where both X and Y are plain writes,
> >   but in that case the memory model will say that X can be
> >   eliminated unless there is at least a compiler barrier between
> >   X and Y, and this barrier will enforce the ordering.
> >
> >   Some dependencies provide ordering and some don't.  Going by
> >   cases:
> >
> >   An address dependency to a read provides ordering when
> >   the source is a marked read, even when the target is a
> >   plain read.  This is necessary if rcu_dereference() is
> >   to work correctly; it is tantamount to assuming that
> >   the compiler never speculates address dependencies.
> >   However, if the source is a plain read then there is
> >   no ordering.  This is because of Alpha, which does not
> >   respect address dependencies to reads (on Alpha,
> >   marked reads include a memory barrier to enforce the
> >   ordering but plain reads do not).
> >
> >   An address dependency to a write always provides
> >   ordering.  Neither the compiler nor the CPU can
> >   speculate the address of a write, because a wrong
> >   guess could generate a data race.  (Question: do we
> >   need to include the case where the source is a plain
> >   read?)
> >
> >   A data or control dependency to a write provides
> >   ordering if the target is a marked write.  This is
> >   because the compiler is obliged to translate a marked
> >   write as a single machine instruction; if it
> >   speculates such a write there will be no opportunity
> >   to correct a mistake.
> >
> >   Dep-rfi (i.e., a data or address dependency from a
> >   read to a 

Re: [PATCH 1/2] kbuild: remove top-level built-in.a

2019-01-14 Thread Nicholas Piggin
Masahiro Yamada's on January 14, 2019 1:27 pm:
> The symbol table in the final archive is unneeded because it is passed
> to the linker after the --whole-archive option. Every object file in
> the archive is included in the link anyway.
> 
> Pass thin archives from subdirectories directly to the linker, and
> remove the final archiving step.

This seems like a good improvement. As far as I remember, it was slower 
to do the final link without the index built. Maybe I was testing it 
in a revision before moving those files into --whole-archive? If there
is no slowdown, then I have no objection.

Acked-by: Nicholas Piggin 


Re: [PATCH net V3] vhost: log dirty page correctly

2019-01-14 Thread Jason Wang



On 2019/1/15 上午2:04, Michael S. Tsirkin wrote:

On Fri, Jan 11, 2019 at 12:00:36PM +0800, Jason Wang wrote:

Vhost dirty page logging API is designed to sync through GPA. But we
try to log GIOVA when device IOTLB is enabled. This is wrong and may
lead to missing data after migration.

To solve this issue, when logging with device IOTLB enabled, we will:

1) reuse the device IOTLB translation result of the GIOVA->HVA mapping
to get the HVA: for a writable descriptor, get the HVA through the
iovec; for a used ring update, translate its GIOVA to HVA
2) traverse the GPA->HVA mapping to get the possible GPAs and log
through GPA. Note that this reverse mapping is not guaranteed
to be unique, so we should log each possible GPA in this case.

This fixes the failure of scp to guest during migration. In -next, we
will probably support passing GIOVA->GPA instead of GIOVA->HVA.

Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Reported-by: Jintack Lim
Cc: Jintack Lim
Signed-off-by: Jason Wang
---
Changes from V2:
- check and log the case of range overlap
- remove unnecessary u64 cast
- use smp_wmb() for the case of device IOTLB as well
Changes from V1:
- return error instead of warn
---
  drivers/vhost/net.c   |  3 +-
  drivers/vhost/vhost.c | 88 ---
  drivers/vhost/vhost.h |  3 +-
  3 files changed, 78 insertions(+), 16 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 36f3d0f49e60..bca86bf7189f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1236,7 +1236,8 @@ static void handle_rx(struct vhost_net *net)
if (nvq->done_idx > VHOST_NET_BATCH)
vhost_net_signal_used(nvq);
if (unlikely(vq_log))
-   vhost_log_write(vq, vq_log, log, vhost_len);
+   vhost_log_write(vq, vq_log, log, vhost_len,
+   vq->iov, in);
total_len += vhost_len;
if (unlikely(vhost_exceeds_weight(++recv_pkts, total_len))) {
vhost_poll_queue(&vq->poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9f7942cbcbb2..55a2e8f9f8ca 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1733,13 +1733,78 @@ static int log_write(void __user *log_base,
return r;
  }
  
+static int log_write_hva(struct vhost_virtqueue *vq, u64 hva, u64 len)

+{
+   struct vhost_umem *umem = vq->umem;
+   struct vhost_umem_node *u;
+   u64 start, end;
+   int r;
+   bool hit = false;
+
+   /* More than one GPAs can be mapped into a single HVA. So
+* iterate all possible umems here to be safe.
+*/
+   list_for_each_entry(u, &umem->umem_list, link) {
+   if (u->userspace_addr > hva - 1 + len ||
+   u->userspace_addr - 1 + u->size < hva)
+   continue;
+   start = max(u->userspace_addr, hva);
+   end = min(u->userspace_addr - 1 + u->size, hva - 1 + len);
+   r = log_write(vq->log_base,
+ u->start + start - u->userspace_addr,
+ end - start + 1);
+   if (r < 0)
+   return r;
+   hit = true;
+   }
+
+   if (!hit)
+   return -EFAULT;
+
+   return 0;
+}
+

I definitely like the simplicity.

But there's one point left here: if len crosses a region boundary,
but doesn't all fall within a region, we don't consistently report -EFAULT.

So I suspect we need to start by finding a region that contains hva.
If there are many of these - move right to the end of the
leftmost one and then repeat until we run out of len
or fail to find a region and exit with -EFAULT.
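
A minimal userspace sketch of the walk described above, under stated
assumptions: the regions live in a plain array instead of the vhost umem
list, and struct hva_region, log_fn, count_logged and log_hva_range are
hypothetical names, not vhost API. "Leftmost end" is read here as the
nearest region end among the regions that contain the current hva.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one GPA->HVA mapping entry. */
struct hva_region {
	uint64_t hva;	/* userspace address of the region start */
	uint64_t size;	/* region length in bytes */
	uint64_t gpa;	/* guest physical address of the region start */
};

/* Hypothetical logging callback: mark [gpa, gpa + len) dirty. */
typedef void (*log_fn)(uint64_t gpa, uint64_t len, void *ctx);

/* Example callback that only counts how many bytes were logged. */
static void count_logged(uint64_t gpa, uint64_t len, void *ctx)
{
	(void)gpa;
	*(uint64_t *)ctx += len;
}

/* Returns 0 if every byte of [hva, hva + len) was covered, else -EFAULT. */
static int log_hva_range(const struct hva_region *r, size_t n,
			 uint64_t hva, uint64_t len, log_fn log, void *ctx)
{
	while (len) {
		uint64_t step = len;
		int found = 0;
		size_t i;

		/* Pass 1: distance to the nearest end of a region holding hva. */
		for (i = 0; i < n; i++) {
			uint64_t avail;

			if (hva < r[i].hva || hva - r[i].hva >= r[i].size)
				continue;
			avail = r[i].size - (hva - r[i].hva);
			if (avail < step)
				step = avail;
			found = 1;
		}
		if (!found)
			return -EFAULT;	/* hole in the middle of the range */

		/* Pass 2: log that many bytes through every aliasing region. */
		for (i = 0; i < n; i++) {
			if (hva < r[i].hva || hva - r[i].hva >= r[i].size)
				continue;
			log(r[i].gpa + (hva - r[i].hva), step, ctx);
		}

		hva += step;
		len -= step;
	}
	return 0;
}
```

Note that in this sketch the bytes before a hole are already logged when
-EFAULT is returned; whether that is acceptable is a design choice left
open here.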



Ok, will do it in V4.

Thanks







Re: [PATCH] kvm: add proper frame pointer logic for vmx

2019-01-14 Thread Paolo Bonzini
On 15/01/19 08:04, Qian Cai wrote:
> 
> 
> On 1/15/19 1:44 AM, Qian Cai wrote:
>> compilation warning since v5.0-rc1,
>>
>> arch/x86/kvm/vmx/vmx.o: warning: objtool: vmx_vcpu_run.part.17()+0x3171:
>> call without frame pointer save/setup
>>
>> Fixes: 453eafbe65f (KVM: VMX: Move VM-Enter + VM-Exit handling to
>> non-inline sub-routines)
> 
> Oops, wrong fix. Back to square one.
> 

Hmm, maybe like this:

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index bcef2c7e9bc4..33122fa9d4bd 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -26,19 +26,17 @@ ENTRY(vmx_vmenter)
ret

 2: vmlaunch
+3:
ret

-3: cmpb $0, kvm_rebooting
-   jne 4f
-   call kvm_spurious_fault
-4: ret
-
.pushsection .fixup, "ax"
-5: jmp 3b
+4: cmpb $0, kvm_rebooting
+   jne 3b
+   jmp kvm_spurious_fault
.popsection

-   _ASM_EXTABLE(1b, 5b)
-   _ASM_EXTABLE(2b, 5b)
+   _ASM_EXTABLE(1b, 4b)
+   _ASM_EXTABLE(2b, 4b)

 ENDPROC(vmx_vmenter)


Paolo


[PATCH] nvmem: sc27xx: Convert nvmem offset to block index

2019-01-14 Thread Baolin Wang
From: Freeman Liu 

The Spreadtrum SC27XX efuse data is organized in blocks, and each block
contains 2 bytes of data. Moreover, the nvmem core always passes the offset
in bytes to the controller, so we should convert the byte offset to
the correct block index and block offset to read the data.

Signed-off-by: Freeman Liu 
Signed-off-by: Baolin Wang 
---
 drivers/nvmem/sc27xx-efuse.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/nvmem/sc27xx-efuse.c b/drivers/nvmem/sc27xx-efuse.c
index 33185d8..c6ee210 100644
--- a/drivers/nvmem/sc27xx-efuse.c
+++ b/drivers/nvmem/sc27xx-efuse.c
@@ -106,10 +106,12 @@ static int sc27xx_efuse_poll_status(struct sc27xx_efuse 
*efuse, u32 bits)
 static int sc27xx_efuse_read(void *context, u32 offset, void *val, size_t 
bytes)
 {
struct sc27xx_efuse *efuse = context;
-   u32 buf;
+   u32 buf, blk_index = offset / SC27XX_EFUSE_BLOCK_WIDTH;
+   u32 blk_offset = (offset % SC27XX_EFUSE_BLOCK_WIDTH) * BITS_PER_BYTE;
int ret;
 
-   if (offset > SC27XX_EFUSE_BLOCK_MAX || bytes > SC27XX_EFUSE_BLOCK_WIDTH)
+   if (blk_index > SC27XX_EFUSE_BLOCK_MAX ||
+   bytes > SC27XX_EFUSE_BLOCK_WIDTH)
return -EINVAL;
 
ret = sc27xx_efuse_lock(efuse);
@@ -133,7 +135,7 @@ static int sc27xx_efuse_read(void *context, u32 offset, 
void *val, size_t bytes)
/* Set the block address to be read. */
ret = regmap_write(efuse->regmap,
   efuse->base + SC27XX_EFUSE_BLOCK_INDEX,
-  offset & SC27XX_EFUSE_BLOCK_MASK);
+  blk_index & SC27XX_EFUSE_BLOCK_MASK);
if (ret)
goto disable_efuse;
 
@@ -171,8 +173,10 @@ static int sc27xx_efuse_read(void *context, u32 offset, 
void *val, size_t bytes)
 unlock_efuse:
sc27xx_efuse_unlock(efuse);
 
-   if (!ret)
+   if (!ret) {
+   buf >>= blk_offset;
memcpy(val, &buf, bytes);
+   }
 
return ret;
 }
-- 
1.7.9.5
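
The offset conversion in this patch can be sketched in plain C as
follows. This is only an illustration: the EFUSE_* macros and the
efuse_* helpers are hypothetical names, and the block width of 2 bytes is
taken from the commit message.

```c
#include <stdint.h>

#define EFUSE_BLOCK_WIDTH	2	/* bytes per efuse block (from the commit text) */
#define EFUSE_BITS_PER_BYTE	8

/* Block that contains the given byte offset. */
static uint32_t efuse_block_index(uint32_t offset)
{
	return offset / EFUSE_BLOCK_WIDTH;
}

/* Bit shift of the requested byte within its block. */
static uint32_t efuse_block_shift(uint32_t offset)
{
	return (offset % EFUSE_BLOCK_WIDTH) * EFUSE_BITS_PER_BYTE;
}

/* Drop the bytes of the raw block value that precede the offset. */
static uint32_t efuse_select(uint32_t raw, uint32_t offset)
{
	return raw >> efuse_block_shift(offset);
}
```

For example, byte offset 5 falls in block 2 with a shift of 8 bits, so a
raw block value of 0xABCD yields 0xAB.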



Re: [PATCHv2 1/7] x86/mm: concentrate the code to memblock allocator enabled

2019-01-14 Thread Pingfan Liu
On Tue, Jan 15, 2019 at 7:07 AM Dave Hansen  wrote:
>
> On 1/10/19 9:12 PM, Pingfan Liu wrote:
> > This patch identifies the point where memblock alloc start. It has no
> > functional.
>
> It has no functional ... what?  Effects?
>
While reorganizing the code, it took me a long time to figure out why
memblock_set_bottom_up(true) is added here, and how far it can be
deferred. Finally, I realized that it only takes effect after
e820__memblock_setup(), the point where the memblock allocator can work.
So I concentrated the related code, and hope this patch clarifies
this fact.

> > - memblock_set_current_limit(ISA_END_ADDRESS);
> > - e820__memblock_setup();
> > -
> >   reserve_bios_regions();
> >
> >   if (efi_enabled(EFI_MEMMAP)) {
> > @@ -1113,6 +1087,8 @@ void __init setup_arch(char **cmdline_p)
> >   efi_reserve_boot_services();
> >   }
> >
> > + memblock_set_current_limit(0, ISA_END_ADDRESS, false);
> > + e820__memblock_setup();
>
> It looks like you changed the arguments passed to
> memblock_set_current_limit().  How can this even compile?  Did you mean
> that this patch is not functional?
>
Sorry, during rebasing I merged a trivial fix into this patch by mistake.
I will build-test each patch individually.

Best regards,
Pingfan


Re: [PATCH] kvm: add proper frame pointer logic for vmx

2019-01-14 Thread Qian Cai



On 1/15/19 1:44 AM, Qian Cai wrote:
> compilation warning since v5.0-rc1,
> 
> arch/x86/kvm/vmx/vmx.o: warning: objtool: vmx_vcpu_run.part.17()+0x3171:
> call without frame pointer save/setup
> 
> Fixes: 453eafbe65f (KVM: VMX: Move VM-Enter + VM-Exit handling to
> non-inline sub-routines)

Oops, wrong fix. Back to square one.


Re: [PATCH v1 2/3] mtd: spi-nor: mtk-quadspi: add SNOR_HWCAPS_READ for capcity setting

2019-01-14 Thread Tudor.Ambarus
Hi, Ryder,

On 01/14/2019 07:12 AM, Ryder Lee wrote:
> From: Guochun Mao 
> 
> SNOR_HWCAPS_READ is a basic read mode for both flash and controller,
> it should be supported, so add the capcity for mtk-quadspi.

Since I couldn't find a datasheet for mt8173, I tend to share your assumption -
SNOR_HWCAPS_READ should be supported by this controller. However, it's always
better to test it and not rely on assumptions. You can test it by forcing the
mask to have just SNOR_HWCAPS_READ | SNOR_HWCAPS_PP set. Or you already tested 
it?

You have a typo in capcity. Maybe substitute it with capability or "add this
flag to spi_nor_hwcaps mask"

> 
> Signed-off-by: Guochun Mao 

You should add your SoB tag, because you are sending a patch that is not yours.

Cheers,
ta

> ---
> Changes since v1: none. 
> ---
>  drivers/mtd/spi-nor/mtk-quadspi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/spi-nor/mtk-quadspi.c 
> b/drivers/mtd/spi-nor/mtk-quadspi.c
> index 5442993..d9eed68 100644
> --- a/drivers/mtd/spi-nor/mtk-quadspi.c
> +++ b/drivers/mtd/spi-nor/mtk-quadspi.c
> @@ -431,7 +431,8 @@ static int mtk_nor_init(struct mtk_nor *mtk_nor,
>   struct device_node *flash_node)
>  {
>   const struct spi_nor_hwcaps hwcaps = {
> - .mask = SNOR_HWCAPS_READ_FAST |
> + .mask = SNOR_HWCAPS_READ |
> + SNOR_HWCAPS_READ_FAST |
>   SNOR_HWCAPS_READ_1_1_2 |
>   SNOR_HWCAPS_PP,
>   };
> 


[PATCH] spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch

2019-01-14 Thread Vignesh R
Commit b682cffa3ac6 ("spi: omap2-mcspi: Set FIFO DMA trigger level to word 
length")
broke SPI transfers where bits_per_word != 8. This is because of a
mismatch between McSPI FIFO level event trigger size (SPI word length) and
DMA request size (word length * maxburst). This leads to data
corruption, lockup and errors like:

spi1.0: EOW timed out

Fix this by setting DMA maxburst size to 1 so that
McSPI FIFO level event trigger size matches DMA request size.

Fixes: b682cffa3ac6 ("spi: omap2-mcspi: Set FIFO DMA trigger level to word 
length")
Cc: sta...@vger.kernel.org
Reported-by: David Lechner 
Tested-by: David Lechner 
Signed-off-by: Vignesh R 
---
 drivers/spi/spi-omap2-mcspi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
index 2fd8881fcd65..8be304379628 100644
--- a/drivers/spi/spi-omap2-mcspi.c
+++ b/drivers/spi/spi-omap2-mcspi.c
@@ -623,8 +623,8 @@ omap2_mcspi_txrx_dma(struct spi_device *spi, struct 
spi_transfer *xfer)
cfg.dst_addr = cs->phys + OMAP2_MCSPI_TX0;
cfg.src_addr_width = width;
cfg.dst_addr_width = width;
-   cfg.src_maxburst = es;
-   cfg.dst_maxburst = es;
+   cfg.src_maxburst = 1;
+   cfg.dst_maxburst = 1;
 
rx = xfer->rx_buf;
tx = xfer->tx_buf;
-- 
2.20.1
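
The size mismatch described in the commit message can be checked with a
little arithmetic. The sketch below is an assumption-laden illustration,
not driver code: the helper names are hypothetical, and it assumes the
DMA bus width equals the SPI word size in bytes (1, 2 or 4).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes per SPI word, which is also the FIFO event trigger size. */
static uint32_t fifo_trigger_bytes(uint32_t bits_per_word)
{
	if (bits_per_word <= 8)
		return 1;
	if (bits_per_word <= 16)
		return 2;
	return 4;
}

/* Bytes moved per DMA request: bus width times burst length. */
static uint32_t dma_request_bytes(uint32_t addr_width, uint32_t maxburst)
{
	return addr_width * maxburst;
}

/* True when one FIFO event corresponds to exactly one DMA request. */
static bool trigger_matches_request(uint32_t bits_per_word, uint32_t maxburst)
{
	uint32_t es = fifo_trigger_bytes(bits_per_word);

	return dma_request_bytes(es, maxburst) == es;
}
```

With maxburst set to the word size, 8-bit words still match (1 * 1 = 1),
but 16-bit words request 4 bytes per 2-byte FIFO event. With maxburst
set to 1 the sizes match for every word length.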



Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()

2019-01-14 Thread Myungho Jung
On Mon, Jan 14, 2019 at 09:37:25PM +0100, Ilya Dryomov wrote:
> On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung  wrote:
> > I reproduced on vm using syzkaller utils and verified the fix by syzbot.
> 
> Hi Myungho,
> 
> I think this might be a better fix:
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index d5718284db57..c5f5313e3537 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con)
>  {
> dout("con_keepalive %p\n", con);
> mutex_lock(&con->mutex);
> +   con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING);
> clear_standby(con);
> mutex_unlock(&con->mutex);
> -   if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 &&
> -   con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
> +
> +   if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
> queue_con(con);
>  }
>  EXPORT_SYMBOL(ceph_con_keepalive);
> 
> WRITE_PENDING can be set without con->mutex held from socket callbacks.
> This is the reason we use atomic bit ops here, so testing WRITE_PENDING
> under the lock didn't make sense to me.
> 
> At the same time, KEEPALIVE_PENDING could have been a non-atomic flag.
> I spent some time trying to make sense of conditioning queue_con() call
> on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm
> setting it with con_flag_set(), making ceph_con_keepalive() symmetric
> with ceph_con_send().
> 
> Thanks,
> 
> Ilya

Hi Ilya,

Yes, it looks clear and makes sense to have an atomic operation in the if
statement, but it still triggers the warning. KEEPALIVE_PENDING should be set
after clear_standby() because con_fault() can be called right before acquiring
the lock here, which sets the flag in standby state. I tested the change with
syzbot and confirmed there was no warning.

Thanks,
Myungho
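
For readers following the flag discussion, here is a minimal userspace
sketch of gating work on an atomic test-and-set, using C11 atomics. The
names (struct conn, flag_test_and_set, conn_keepalive) are hypothetical
and it deliberately leaves out con->mutex and clear_standby(), so it says
nothing about the lock ordering question above.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum {
	FLAG_KEEPALIVE_PENDING	= 1u << 0,
	FLAG_WRITE_PENDING	= 1u << 1,
};

/* Hypothetical connection: a flag word plus a counter standing in for
 * the work queue. */
struct conn {
	atomic_uint flags;
	int queued;
};

/* Atomically set a flag and report whether it was already set. */
static bool flag_test_and_set(struct conn *c, unsigned int flag)
{
	return atomic_fetch_or(&c->flags, flag) & flag;
}

static void conn_keepalive(struct conn *c)
{
	/* Unconditionally mark the keepalive as pending. */
	atomic_fetch_or(&c->flags, FLAG_KEEPALIVE_PENDING);

	/* Queue only if nobody else already requested a write. */
	if (!flag_test_and_set(c, FLAG_WRITE_PENDING))
		c->queued++;
}
```

Calling conn_keepalive() twice queues the work once, because the second
call finds FLAG_WRITE_PENDING already set.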


Re: [PATCH v2 00/15] powerpc/32s: Use BATs/LTLBs for STRICT_KERNEL_RWX

2019-01-14 Thread Christophe Leroy




Le 15/01/2019 à 01:33, Jonathan Neuschäfer a écrit :

On Mon, Jan 14, 2019 at 07:23:07PM +0100, Christophe Leroy wrote:



Le 13/01/2019 à 22:02, Jonathan Neuschäfer a écrit :

On Sun, Jan 13, 2019 at 08:43:07PM +0100, Christophe Leroy wrote:

Le 13/01/2019 à 19:16, Jonathan Neuschäfer a écrit :

I just tested the whole series on my Wii (I didn't test any intermediate
steps). Without CONFIG_STRICT_KERNEL_RWX, it seems to work fine, but
with it, I get the following error while booting:

[...]

I can't see anything special in your setup, and this failure looks rather
unexpected because I can't see anything done that early when
CONFIG_STRICT_KERNEL_RWX is selected.

Does CONFIG_STRICT_KERNEL_RWX work properly without my series?


I hadn't tried this before, but yes, without this series (on v5.0-rc2),
a kernel with CONFIG_STRICT_KERNEL_RWX boots.

I've checked it patch-by-patch now (with STRICT_KERNEL_RWX):

- patches 1 and 2 build and boot fine
- patches 3 to 6 build, but fail to boot with this error:


The bug is in patch 2, mmu_mapin_ram() should return base instead of 
returning 0 when __map_without_bats is set.




top of MEM2 @ 13F0

zImage starting: loaded at 0x00e0 (sp: 0x01588fa0)
Allocating 0x14e92c8 bytes for kernel...
Decompressing (0x <- 0x00e11000:0x01586ba7)...
Done! Decompressed 0xdc01f4 bytes

Linux/PowerPC load: root=/dev/mmcblk0p2 rootwait console=usbgecko1
Finalizing device tree... flat tree at 0x15897a0
[0.00] printk: bootconsole [udbg0] enabled
[0.00] Total memory = 319MB; using 1024kB for hash table (at 
(ptrval))
[0.00] RAM mapped without BATs
[0.00] RAM mapped without BATs
[0.00] [ cut here ]
[0.00] kernel BUG at arch/powerpc/mm/pgtable_32.c:223!
[0.00] Oops: Exception in kernel mode, sig: 5 [#1]
[0.00] BE PREEMPT
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 
5.0.0-rc1-wii-00024-g596f9fe23c13 #1337
[0.00] NIP:  c0017c4c LR: c0a836a0 CTR: c001edc4
[0.00] REGS: c0d9deb0 TRAP: 0700   Not tainted  
(5.0.0-rc1-wii-00024-g596f9fe23c13)
[0.00] MSR:  00020030   CR: 42000888  XER: 2000
[0.00]
[0.00] GPR00: c0a836a0 c0d9df60 c0d2a4a0 c0d29c00  
c16ff000 c0d9de28 c0dc
[0.00] GPR08: c0d9c000 0001 0001  28000824 
  
[0.00] GPR16:   0020  c086 
c0da c000 c0a7d000
[0.00] GPR24: c0acd55c c0d487c8 13f0 c0d29000 0c00 
0311 c000 c0d487c8
[0.00] NIP [c0017c4c] map_kernel_page+0x78/0xf0
[0.00] LR [c0a836a0] mapin_ram+0xe0/0x14c
[0.00] Call Trace:
[0.00] [c0d9df60] [c0a83f54] mmu_mapin_ram+0x54/0x1a4 
(unreliable)
[0.00] [c0d9df90] [c0a836a0] mapin_ram+0xe0/0x14c
[0.00] [c0d9dfd0] [c0a83578] MMU_init+0x158/0x1a0
[0.00] [c0d9dff0] [c0003418] start_here+0x40/0x78
[0.00] Instruction dump:
[0.00] 55290026 57c5b53a 7ca54a14 3d204000 7f854800 3ca5c000 
419e0088 8125
[0.00] 552afffe 552907fe 7d4a4b79 4082004c <0f0a> 54840026 
7c84eb78 9081000c
[0.00] random: get_random_bytes called from 
print_oops_end_marker+0x34/0x6c with crng_init=0
[0.00] ---[ end trace  ]---
[0.00]
[0.00] Kernel panic - not syncing: Attempted to kill the idle 
task!
[0.00] Rebooting in 180 seconds..

- patches 7 to 11 fail to build with this error (really a warning, but
   arch/powerpc doesn't allow warnings by default):

  CC  arch/powerpc/mm/ppc_mmu_32.o
../arch/powerpc/mm/ppc_mmu_32.c:133:13: error: ‘clearibat’ defined but 
not used [-Werror=unused-function]
 static void clearibat(int index)
 ^
../arch/powerpc/mm/ppc_mmu_32.c:115:13: error: ‘setibat’ defined but 
not used [-Werror=unused-function]
 static void setibat(int index, unsigned long virt, phys_addr_t phys,
 ^~~
cc1: all warnings being treated as errors


Argh ! I have to squash the patch bringing the new functions with the 
one using them (patch 12). The result is a big messy patch which is more 
difficult to review but that's life.




- patches 12 to 15 build but fail to boot with this error:


That's the one we really need to understand.

Do you have modules ? If so, can you try without ?



top of MEM2 @ 13F0

zImage starting: loaded at 0x0100 (sp: 0x0178afa0)
Allocating 0x166b2c8 bytes for kernel...
Decompressing (0x <- 0x01011000:0x017880ce)...
   

Re: [PATCH 1/1] remoteproc: fix recovery procedure

2019-01-14 Thread xiang xiao
Here is my output after applying your patch; the duplicate devices still exist:
[   48.012300] remoteproc remoteproc0: crash detected in
f921.toppwr:tl421-rproc: type watchdog
[   48.023473] remoteproc remoteproc0: handling crash #1 in
f921.toppwr:tl421-rproc
[   48.037504] remoteproc remoteproc0: recovering f921.toppwr:tl421-rproc
[   48.048837] remoteproc remoteproc0: stopped remote processor
f921.toppwr:tl421-rproc
[   48.061969] virtio_rpmsg_bus virtio2: rpmsg host is online
[   48.061976] virtio_rpmsg_bus virtio2: creating channel rpmsg-ttyADSP addr 0x1
[   48.062956] virtio_rpmsg_bus virtio2: creating channel rpmsg-clk addr 0x2
[   48.063489] virtio_rpmsg_bus virtio2: creating channel rpmsg-syslog addr 0x3
[   48.063815] virtio_rpmsg_bus virtio2: creating channel rpmsg-rtc addr 0x4
[   48.064064] virtio_rpmsg_bus virtio2: creating channel rpmsg-ttyADSP addr 0x1
[   48.064080] virtio_rpmsg_bus virtio2: channel
rpmsg-ttyADSP::1 already exist
[   48.064087] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064102] virtio_rpmsg_bus virtio2: creating channel rpmsg-ttyADSP addr 0x1
[   48.064118] virtio_rpmsg_bus virtio2: channel
rpmsg-ttyADSP::1 already exist
[   48.064127] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064139] virtio_rpmsg_bus virtio2: creating channel rpmsg-clk addr 0x2
[   48.064153] virtio_rpmsg_bus virtio2: channel rpmsg-clk::2
already exist
[   48.064159] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064174] virtio_rpmsg_bus virtio2: creating channel rpmsg-syslog addr 0x3
[   48.064192] virtio_rpmsg_bus virtio2: channel
rpmsg-syslog::3 already exist
[   48.064200] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064213] virtio_rpmsg_bus virtio2: creating channel rpmsg-rtc addr 0x4
[   48.064229] virtio_rpmsg_bus virtio2: channel rpmsg-rtc::4
already exist
[   48.064235] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064286] virtio_rpmsg_bus virtio2: creating channel rpmsg-syslog addr 0x3
[   48.064302] virtio_rpmsg_bus virtio2: channel
rpmsg-syslog::3 already exist
[   48.064306] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064318] virtio_rpmsg_bus virtio2: creating channel rpmsg-rtc addr 0x4
[   48.064334] virtio_rpmsg_bus virtio2: channel rpmsg-rtc::4
already exist
[   48.064339] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064351] virtio_rpmsg_bus virtio2: creating channel rpmsg-hostfs addr 0x5
[   48.064773] virtio_rpmsg_bus virtio2: creating channel rpmsg-hostfs addr 0x5
[   48.064789] virtio_rpmsg_bus virtio2: channel
rpmsg-hostfs::5 already exist
[   48.064797] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.064930] virtio_rpmsg_bus virtio2: creating channel  addr 0x5f7361
[   48.064945] virtio_rpmsg_bus virtio2: rpmsg_find_device failed
[   48.064957] virtio_rpmsg_bus virtio2: creating channel rpmsg-rtc addr 0x4
[   48.064973] virtio_rpmsg_bus virtio2: channel rpmsg-rtc::4
already exist
[   48.064979] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.065015] virtio_rpmsg_bus virtio2: creating channel rpmsg-hostfs addr 0x5
[   48.065030] virtio_rpmsg_bus virtio2: channel
rpmsg-hostfs::5 already exist
[   48.065035] virtio_rpmsg_bus virtio2: rpmsg_create_channel failed
[   48.352150] remoteproc remoteproc0: registered virtio2 (type 7)
[   48.358813] remoteproc remoteproc0: remote processor
f921.toppwr:tl421-rproc is now up
Am I still missing an additional patch?

Thanks
Xiang

On Tue, Jan 15, 2019 at 4:23 AM Loic PALLARDY  wrote:
>
> Hi Xiang,
>
> > -Original Message-
> > From: xiang xiao 
> > Sent: samedi 12 janvier 2019 19:29
> > To: Loic PALLARDY 
> > Cc: bjorn.anders...@linaro.org; o...@wizery.com; linux-
> > remotep...@vger.kernel.org; linux-kernel@vger.kernel.org; Arnaud
> > POULIQUEN ; benjamin.gaign...@linaro.org; s-
> > a...@ti.com
> > Subject: Re: [PATCH 1/1] remoteproc: fix recovery procedure
> >
> > Hi Loic,
> > The change just hide the problem, I think. The big issue is:
> > 1.virtio devices aren't destroyed by rpproc_stop
> Virtio devices are destroyed by rproc_stop() as vdev is registered as rproc 
> sub device.
> rproc_stop() is calling rproc_stop_subdevices() which is in charge of 
> removing virtio device and associated children.
> rproc_vdev_do_stop() --> rproc_remove_virtio_dev() --> 
> unregister_virtio_device()
>
> Please find below trace of a recovery on my ST SOC. My 2 rpmsg tty are 
> removed and re-inserted correctly
> root@stm32mp1:~# ls /dev/ttyRPMSG*
> /dev/ttyRPMSG0  /dev/ttyRPMSG1
> root@stm32mp1:~# [  154.832523] remoteproc remoteproc0: crash detected in m4: 
> type watchdog
> [  154.837725] remoteproc remoteproc0: handling crash #2 in m4
> [  154.843319] remoteproc remoteproc0: recovering m4
> [  154.849185] rpmsg_tty virtio0.rpmsg-tty-channel.-1.0: rpmsg tty device 0 
> is removed
> [  154.857572] rpmsg_tty 

[PATCH] kvm: add proper frame pointer logic for vmx

2019-01-14 Thread Qian Cai
compilation warning since v5.0-rc1,

arch/x86/kvm/vmx/vmx.o: warning: objtool: vmx_vcpu_run.part.17()+0x3171:
call without frame pointer save/setup

Fixes: 453eafbe65f (KVM: VMX: Move VM-Enter + VM-Exit handling to
non-inline sub-routines)
Signed-off-by: Qian Cai 
---
 arch/x86/kvm/vmx/vmenter.S | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index bcef2c7e9bc4..874dd3030dee 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include 
 #include 
+#include 
 
.text
 
@@ -20,18 +21,22 @@
  */
 ENTRY(vmx_vmenter)
/* EFLAGS.ZF is set if VMCS.LAUNCHED == 0 */
+   FRAME_BEGIN
je 2f
 
 1: vmresume
+   FRAME_END
ret
 
 2: vmlaunch
+   FRAME_END
ret
 
 3: cmpb $0, kvm_rebooting
jne 4f
call kvm_spurious_fault
-4: ret
+4: FRAME_END
+   ret
 
.pushsection .fixup, "ax"
 5: jmp 3b
-- 
2.17.2 (Apple Git-113)



Re: [PATCH v3 2/5] virtio-pmem: Add virtio pmem driver

2019-01-14 Thread Pankaj Gupta


> > This patch adds virtio-pmem driver for KVM guest.
> > 
> > Guest reads the persistent memory range information from
> > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > creates a nd_region object with the persistent memory
> > range information so that existing 'nvdimm/pmem' driver
> > can reserve this into system memory map. This way
> > 'virtio-pmem' driver uses existing functionality of pmem
> > driver to register persistent memory compatible for DAX
> > capable filesystems.
> > 
> > This also provides function to perform guest flush over
> > VIRTIO from 'pmem' driver when userspace performs flush
> > on DAX memory range.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/nvdimm/virtio_pmem.c |  84 ++
> >  drivers/virtio/Kconfig   |  10 
> >  drivers/virtio/Makefile  |   1 +
> >  drivers/virtio/pmem.c| 124
> >  +++
> >  include/linux/virtio_pmem.h  |  60 +++
> >  include/uapi/linux/virtio_ids.h  |   1 +
> >  include/uapi/linux/virtio_pmem.h |  10 
> 
> As with any uapi change, you need to CC the virtio dev
> mailing list (subscribers only, sorry about that).

Sure, will add virtio dev mailing list while sending v4.

Thanks,
Pankaj

> 
> 
> >  7 files changed, 290 insertions(+)
> >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> >  create mode 100644 drivers/virtio/pmem.c
> >  create mode 100644 include/linux/virtio_pmem.h
> >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > 
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > new file mode 100644
> > index 000..2a1b1ba
> > --- /dev/null
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -0,0 +1,84 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and provides a virtio based flushing
> > + * interface.
> > + */
> > +#include 
> > +#include "nd.h"
> > +
> > + /* The interrupt handler */
> > +void host_ack(struct virtqueue *vq)
> > +{
> > +   unsigned int len;
> > +   unsigned long flags;
> > +   struct virtio_pmem_request *req, *req_buf;
> > +   struct virtio_pmem *vpmem = vq->vdev->priv;
> > +
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
> > +   req->done = true;
> > +   wake_up(&req->host_acked);
> > +
> > +   if (!list_empty(&vpmem->req_list)) {
> > +   req_buf = list_first_entry(&vpmem->req_list,
> > +   struct virtio_pmem_request, list);
> > +   list_del(&vpmem->req_list);
> > +   req_buf->wq_buf_avail = true;
> > +   wake_up(&req_buf->wq_buf);
> > +   }
> > +   }
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +}
> > +EXPORT_SYMBOL_GPL(host_ack);
> > +
> > + /* The request submission function */
> > +int virtio_pmem_flush(struct nd_region *nd_region)
> > +{
> > +   int err;
> > +   unsigned long flags;
> > +   struct scatterlist *sgs[2], sg, ret;
> > +   struct virtio_device *vdev = nd_region->provider_data;
> > +   struct virtio_pmem *vpmem = vdev->priv;
> > +   struct virtio_pmem_request *req;
> > +
> > +   might_sleep();
> > +   req = kmalloc(sizeof(*req), GFP_KERNEL);
> > +   if (!req)
> > +   return -ENOMEM;
> > +
> > +   req->done = req->wq_buf_avail = false;
> > +   strcpy(req->name, "FLUSH");
> > +   init_waitqueue_head(&req->host_acked);
> > +   init_waitqueue_head(&req->wq_buf);
> > +   sg_init_one(&sg, req->name, strlen(req->name));
> > +   sgs[0] = &sg;
> > +   sg_init_one(&ret, &req->ret, sizeof(req->ret));
> > +   sgs[1] = &ret;
> > +
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
> > +   if (err) {
> > +   dev_err(&vdev->dev, "failed to send command to virtio pmem 
> > device\n");
> > +
> > +   list_add_tail(&vpmem->req_list, &req->list);
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->wq_buf, req->wq_buf_avail);
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   }
> > +   virtqueue_kick(vpmem->req_vq);
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->host_acked, req->done);
> > +   err = req->ret;
> > +   kfree(req);
> > +
> > +   return err;
> > +};
> > +EXPORT_SYMBOL_GPL(virtio_pmem_flush);
> > +MODULE_LICENSE("GPL");
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 3589764..9f634a2 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -42,6 +42,16 @@ config VIRTIO_PCI_LEGACY
> >  
> >   If unsure, say Y.
> >  
> > +config VIRTIO_PMEM
> > +   tristate "Support for virtio pmem driver"
> > +   depends on VIRTIO
> > +   depends on 

[PATCH v2 3/4] arm64: kprobes: Move exception_text check in blacklist

2019-01-14 Thread Masami Hiramatsu
Move the exception/irqentry text address check into the blacklist,
since it is a symbol-based rejection.

If we prohibit probing on the symbols in exception_text,
those should be blacklisted.

Signed-off-by: Masami Hiramatsu 
---
 arch/arm64/kernel/probes/kprobes.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/probes/kprobes.c 
b/arch/arm64/kernel/probes/kprobes.c
index 1dae500d0a81..b9e9758b6534 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -98,9 +98,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
/* copy instruction */
p->opcode = le32_to_cpu(*p->addr);
 
-   if (in_exception_text(probe_addr))
-   return -EINVAL;
-
if (search_exception_tables(probe_addr))
return -EINVAL;
 
@@ -475,7 +472,8 @@ bool arch_within_kprobe_blacklist(unsigned long addr)
(addr >= (unsigned long)__entry_text_start &&
addr < (unsigned long)__entry_text_end) ||
(addr >= (unsigned long)__idmap_text_start &&
-   addr < (unsigned long)__idmap_text_end))
+   addr < (unsigned long)__idmap_text_end) ||
+   in_exception_text(addr))
return true;
 
if (!is_kernel_in_hyp_mode()) {



[PATCH v2 4/4] arm64: kprobes: Use arch_populate_kprobe_blacklist()

2019-01-14 Thread Masami Hiramatsu
Use arch_populate_kprobe_blacklist() instead of
arch_within_kprobe_blacklist() so that we can see the full
list of blacklisted symbols under debugfs.

Signed-off-by: Masami Hiramatsu 
---
 arch/arm64/kernel/probes/kprobes.c |   42 
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kernel/probes/kprobes.c 
b/arch/arm64/kernel/probes/kprobes.c
index b9e9758b6534..6c066c34c8a4 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -465,26 +465,30 @@ kprobe_breakpoint_handler(struct pt_regs *regs, unsigned 
int esr)
return DBG_HOOK_HANDLED;
 }
 
-bool arch_within_kprobe_blacklist(unsigned long addr)
+int __init arch_populate_kprobe_blacklist(void)
 {
-   if ((addr >= (unsigned long)__kprobes_text_start &&
-   addr < (unsigned long)__kprobes_text_end) ||
-   (addr >= (unsigned long)__entry_text_start &&
-   addr < (unsigned long)__entry_text_end) ||
-   (addr >= (unsigned long)__idmap_text_start &&
-   addr < (unsigned long)__idmap_text_end) ||
-   in_exception_text(addr))
-   return true;
-
-   if (!is_kernel_in_hyp_mode()) {
-   if ((addr >= (unsigned long)__hyp_text_start &&
-   addr < (unsigned long)__hyp_text_end) ||
-   (addr >= (unsigned long)__hyp_idmap_text_start &&
-   addr < (unsigned long)__hyp_idmap_text_end))
-   return true;
-   }
-
-   return false;
+   int ret;
+
+   ret = kprobe_add_area_blacklist((unsigned long)__kprobes_text_start,
+   (unsigned long)__kprobes_text_end);
+   if (ret)
+   return ret;
+   ret = kprobe_add_area_blacklist((unsigned long)__entry_text_start,
+   (unsigned long)__entry_text_end);
+   if (ret)
+   return ret;
+   ret = kprobe_add_area_blacklist((unsigned long)__idmap_text_start,
+   (unsigned long)__idmap_text_end);
+   if (ret || is_kernel_in_hyp_mode())
+   return ret;
+
+   ret = kprobe_add_area_blacklist((unsigned long)__hyp_text_start,
+   (unsigned long)__hyp_text_end);
+   if (ret)
+   return ret;
+   ret = kprobe_add_area_blacklist((unsigned long)__hyp_idmap_text_start,
+   (unsigned long)__hyp_idmap_text_end);
+   return ret;
 }
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)



[PATCH v2 2/4] arm64: kprobes: Remove unneeded RODATA check

2019-01-14 Thread Masami Hiramatsu
Remove unneeded RODATA check from arch_prepare_kprobe().

Since check_kprobe_address_safe() has already ensured that
the probe address is in kernel text, we don't need to
check whether the address is in RODATA or not. That check
must always be false.

Signed-off-by: Masami Hiramatsu 
---
 arch/arm64/kernel/probes/kprobes.c |6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/arm64/kernel/probes/kprobes.c 
b/arch/arm64/kernel/probes/kprobes.c
index b2d4b7428ebc..1dae500d0a81 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -91,8 +91,6 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, 
struct pt_regs *regs)
 int __kprobes arch_prepare_kprobe(struct kprobe *p)
 {
unsigned long probe_addr = (unsigned long)p->addr;
-   extern char __start_rodata[];
-   extern char __end_rodata[];
 
if (probe_addr & 0x3)
return -EINVAL;
@@ -106,10 +104,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
if (search_exception_tables(probe_addr))
return -EINVAL;
 
-   if (probe_addr >= (unsigned long) __start_rodata &&
-   probe_addr <= (unsigned long) __end_rodata)
-   return -EINVAL;
-
/* decode instruction */
switch (arm_kprobe_decode_insn(p->addr, >ainsn)) {
case INSN_REJECTED: /* insn not supported */



[PATCH v2 1/4] arm64: kprobes: Move extable address check into arch_prepare_kprobe()

2019-01-14 Thread Masami Hiramatsu
Move extable address check into arch_prepare_kprobe() from
arch_within_kprobe_blacklist().
The blacklist is exposed via debugfs as a list of symbols.
The extable entries are smaller than a symbol, so they must be
filtered out by arch_prepare_kprobe().

Signed-off-by: Masami Hiramatsu 
Reviewed-by: James Morse 
---
 Update in v2:
  - Update commit message.
  - Add Reviewed-by from James.
---
 arch/arm64/kernel/probes/kprobes.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/probes/kprobes.c 
b/arch/arm64/kernel/probes/kprobes.c
index 2a5b338b2542..b2d4b7428ebc 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -102,6 +102,10 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
 
if (in_exception_text(probe_addr))
return -EINVAL;
+
+   if (search_exception_tables(probe_addr))
+   return -EINVAL;
+
if (probe_addr >= (unsigned long) __start_rodata &&
probe_addr <= (unsigned long) __end_rodata)
return -EINVAL;
@@ -477,8 +481,7 @@ bool arch_within_kprobe_blacklist(unsigned long addr)
(addr >= (unsigned long)__entry_text_start &&
addr < (unsigned long)__entry_text_end) ||
(addr >= (unsigned long)__idmap_text_start &&
-   addr < (unsigned long)__idmap_text_end) ||
-   !!search_exception_tables(addr))
+   addr < (unsigned long)__idmap_text_end))
return true;
 
if (!is_kernel_in_hyp_mode()) {



[PATCH v2 0/4] arm64: kprobes: Update blacklist checking on arm64

2019-01-14 Thread Masami Hiramatsu
Hello,

Here is the v2 series updating the kprobe blacklist
checking on arm64.

I found that some blacklist checking code was misplaced in
arch_prepare_kprobe() and arch_within_kprobe_blacklist().
Since the blacklist filters only by symbol, anything smaller than
a symbol, such as an extable entry, must be checked in
arch_prepare_kprobe(). Conversely, all function (symbol) level
checks must be done by the blacklist.

On arm64, the code currently checks the extable entry address in
the blacklist and the exception/irqentry functions in
arch_prepare_kprobe(). Also, the RODATA check is unneeded since
kernel/kprobes.c already ensures the probe address is in the
kernel text area.

In v2, I updated [1/4]'s description and added James'
Reviewed-by. Also, in this version, I added a patch which
uses arch_populate_kprobe_blacklist() instead of
arch_within_kprobe_blacklist() so that users can see the full
list of blacklisted symbols under debugfs.

Changes in v2:
 - [1/4] clarify the description and add
 James' Reviewed-by.
 - [4/4] new patch.

Thank you,

---

Masami Hiramatsu (4):
  arm64: kprobes: Move extable address check into arch_prepare_kprobe()
  arm64: kprobes: Remove unneeded RODATA check
  arm64: kprobes: Move exception_text check in blacklist
  arm64: kprobes: Use arch_populate_kprobe_blacklist()


 arch/arm64/kernel/probes/kprobes.c |   49 ++--
 1 file changed, 24 insertions(+), 25 deletions(-)

--
Masami Hiramatsu (Linaro) 
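
The split described in this cover letter can be illustrated with a small
userspace sketch. This is not the kernel's kprobe API: struct bl_range,
bl_add_area() and bl_contains() are hypothetical names, and the fixed-size
table is an assumption standing in for the kernel's list. It only shows why
a range-based blacklist suits whole sections or symbols, while anything
finer than a symbol needs a separate check at probe preparation time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one blacklisted [start, end) address range. */
struct bl_range {
	unsigned long start;
	unsigned long end;
};

#define BL_MAX 16

static struct bl_range bl_table[BL_MAX];
static size_t bl_count;

/* Record a range; returns 0 on success, -1 if invalid or the table is full. */
int bl_add_area(unsigned long start, unsigned long end)
{
	if (start >= end || bl_count == BL_MAX)
		return -1;
	bl_table[bl_count].start = start;
	bl_table[bl_count].end = end;
	bl_count++;
	return 0;
}

/* True if addr falls inside any recorded range. */
bool bl_contains(unsigned long addr)
{
	for (size_t i = 0; i < bl_count; i++) {
		if (addr >= bl_table[i].start && addr < bl_table[i].end)
			return true;
	}
	return false;
}
```

Because every entry is a plain range, the whole table can be listed for the
user, which is the property the debugfs view relies on.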


Re: [PATCHv2 0/7] x86_64/mm: remove bottom-up allocation style by pushing forward the parsing of mem hotplug info

2019-01-14 Thread Pingfan Liu
On Tue, Jan 15, 2019 at 7:02 AM Dave Hansen  wrote:
>
> On 1/10/19 9:12 PM, Pingfan Liu wrote:
> > Background
> > When kaslr kernel can be guaranteed to sit inside unmovable node
> > after [1].
>
> What does this "[1]" refer to?
>
https://lore.kernel.org/patchwork/patch/1029376/

> Also, can you clarify your terminology here a bit.  By "kaslr kernel",
> do you mean the base address?
>
It refers to the randomized load address. I looked it up, and the
usual term is "base address".

> > But if kaslr kernel is located near the end of the movable node,
> > then bottom-up allocator may create pagetable which crosses the boundary
> > between unmovable node and movable node.
>
> Again, I'm confused.  Do you literally mean a single page table page?  I
> think you mean the page tables, but it would be nice to clarify this,
> and also explicitly state which page tables these are.
>
It should be page table pages. The page table is built by init_mem_mapping().

> >  It is a probability issue,
> > two factors include -1. how big the gap between kernel end and
> > unmovable node's end.  -2. how many memory does the system own.
> > Alternative way to fix this issue is by increasing the gap by
> > boot/compressed/kaslr*.
>
> Oh, you mean the KASLR code in arch/x86/boot/compressed/kaslr*.[ch]?
>
Sorry, and yes, I mean the code in arch/x86/boot/compressed/kaslr_64.c and kaslr.c

> It took me a minute to figure out you were talking about filenames.
>
> > But taking the scenario of PB level memory, the pagetable will take
> > several MB even if using 1GB page, different page attr and fragment
> > will make things worse. So it is hard to decide how much should the
> > gap increase.
> I'm not following this.  If we move the image around, we leave holes.
> Why do we need page table pages allocated to cover these holes?
>
I mean that in arch/x86/boot/compressed/kaslr.c, store_slot_info() {
slot_area.num = (region->size - image_size) /CONFIG_PHYSICAL_ALIGN + 1
}.  Let us denote the size of page table as "X", then the formula is
changed to slot_area.num = (region->size - image_size -X)
/CONFIG_PHYSICAL_ALIGN + 1. And it is hard to decide X due to the
above factors.
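
To make the arithmetic concrete, here is a minimal sketch of the slot
formula quoted above. The function name slot_count() and the "reserve"
parameter (the "X" for page tables) are hypothetical, and the guard
conditions are my own assumption rather than anything in kaslr.c.

```c
#include <stdint.h>

/*
 * Number of candidate load slots in a region, following the formula
 * quoted above. reserve is the hypothetical "X" set aside for page tables.
 * Returns 0 if the region cannot hold the image plus the reserve.
 */
uint64_t slot_count(uint64_t region_size, uint64_t image_size,
		    uint64_t reserve, uint64_t align)
{
	if (align == 0 || region_size < image_size ||
	    region_size - image_size < reserve)
		return 0;
	return (region_size - image_size - reserve) / align + 1;
}
```

With a 1 GiB region, a 32 MiB image and 2 MiB alignment this gives 497
slots with no reserve and 493 with an 8 MiB reserve. The hard part is
picking the reserve itself, as explained above.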

> > The following figure show the defection of current bottom-up style:
> >   [startA, endA][startB, "kaslr kernel very close to" endB][startC, endC]
>
> "defection"?
>
Oh, defect.

> > If nodeA,B is unmovable, while nodeC is movable, then init_mem_mapping()
> > can generate pgtable on nodeC, which stain movable node.
>
> Let me see if I can summarize this:
> 1. The kernel ASLR decompression code picks a spot to place the kernel
>image in physical memory.
> 2. Some page tables are dynamically allocated near (after) this spot.
> 3. Sometimes, based on the random ASLR location, these page tables fall
>over into the "movable node" area.  Being unmovable allocations, this
>is not cool.
> 4. To fix this (on 64-bit at least), we stop allocating page tables
>based on the location of the kernel image.  Instead, we allocate
>using the memblock allocator itself, which knows how to avoid the
>movable node.
>
Yes, you got my idea exactly. Thanks for helping to summarize it. It is
hard for me to express it clearly in English.

> > This patch makes it a certainty instead of a probability problem. It achieves
> > this by pushing forward the parsing of mem hotplug info ahead of 
> > init_mem_mapping().
>
> What does memory hotplug have to do with this?  I thought this was all
> about early boot.

The information about hot-pluggable memory is passed to the memblock
allocator in initmem_init()->...->acpi_numa_memory_affinity_init(), where
memblock_mark_hotplug() records it. Later, when the memblock allocator
runs, __next_mem_range() checks this information with
memblock_is_hotpluggable().
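
The idea can be sketched in userspace as follows. This is only a model
under stated assumptions: struct mem_region and alloc_bottom_up() are
hypothetical names, not memblock interfaces, and real memblock handles
alignment, reservations and partial regions that are ignored here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one memory region known to the early allocator. */
struct mem_region {
	uint64_t base;
	uint64_t size;
	bool hotpluggable;	/* marked from firmware affinity information */
};

/*
 * Return the base of the first region that can satisfy the request, or
 * UINT64_MAX if none can. When avoid_hotplug is set, regions marked
 * hotpluggable are skipped, so unmovable allocations stay off them.
 */
uint64_t alloc_bottom_up(const struct mem_region *regions, size_t count,
			 uint64_t size, bool avoid_hotplug)
{
	for (size_t i = 0; i < count; i++) {
		if (avoid_hotplug && regions[i].hotpluggable)
			continue;
		if (regions[i].size >= size)
			return regions[i].base;
	}
	return UINT64_MAX;
}
```

The point is that the skip only works if the hotpluggable marks are already
in place when the first allocation happens, which is why the parsing has to
move ahead of init_mem_mapping().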

Thanks and regards,
Pingfan


[PATCH] RDMA/mlx5: Replace kzalloc with kcalloc

2019-01-14 Thread Gustavo A. R. Silva
Replace kzalloc() function with its 2-factor argument form, kcalloc().

This patch replaces cases of:

kzalloc(a * b, gfp)

with:
kcalloc(a, b, gfp)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/infiniband/hw/mlx5/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index 11e9783cefcc..89dc2c6fc173 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3823,7 +3823,7 @@ mlx5_ib_raw_fs_rule_add(struct mlx5_ib_dev *dev,
if (fs_matcher->priority > MLX5_IB_FLOW_LAST_PRIO)
return ERR_PTR(-ENOMEM);
 
-   dst = kzalloc(sizeof(*dst) * 2, GFP_KERNEL);
+   dst = kcalloc(2, sizeof(*dst), GFP_KERNEL);
if (!dst)
return ERR_PTR(-ENOMEM);
 
-- 
2.20.1
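
For readers wondering why the two-factor form is preferred: the allocator
can check the multiplication for overflow instead of trusting the caller.
A minimal userspace sketch of that idea follows. checked_zalloc_array() is
a hypothetical name, and malloc()/memset() stand in for the kernel
allocator. In this particular patch the factor is the constant 2, so the
change is mostly about using the consistent idiom.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of a two-factor zeroing allocator.
 * Returns NULL if n * size would overflow size_t, otherwise zeroed memory.
 */
void *checked_zalloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* product would wrap around */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

An open-coded a * b passed to a single-size allocator has no such check, so
a wrapped product silently yields a buffer that is too small.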



Re: [PATCH] Documentation/ABI: Correct mlxreg-io KernelVersion for 5.0

2019-01-14 Thread Andy Shevchenko
On Sun, Jan 13, 2019 at 6:32 AM Darren Hart (VMware)
 wrote:
>
> The mlxreg-io for the merge window assumed 4.21 as the next kernel
> version. Replace 4.21 with 5.0.
>

Reviewed-by: Andy Shevchenko 

> Signed-off-by: Darren Hart (VMware) 
> ---
>  Documentation/ABI/stable/sysfs-driver-mlxreg-io | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/ABI/stable/sysfs-driver-mlxreg-io 
> b/Documentation/ABI/stable/sysfs-driver-mlxreg-io
> index 9b642669cb16..169fe08a649b 100644
> --- a/Documentation/ABI/stable/sysfs-driver-mlxreg-io
> +++ b/Documentation/ABI/stable/sysfs-driver-mlxreg-io
> @@ -24,7 +24,7 @@ What: 
> /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/
> cpld3_version
>
>  Date:  November 2018
> -KernelVersion: 4.21
> +KernelVersion: 5.0
>  Contact:   Vadim Pasternak 
>  Description:   These files show with which CPLD versions have been burned
> on LED board.
> @@ -35,7 +35,7 @@ What: 
> /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/
> jtag_enable
>
>  Date:  November 2018
> -KernelVersion: 4.21
> +KernelVersion: 5.0
>  Contact:   Vadim Pasternak 
>  Description:   These files enable and disable the access to the JTAG domain.
> By default access to the JTAG domain is disabled.
> @@ -105,7 +105,7 @@ What:   
> /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/
> reset_voltmon_upgrade_fail
>
>  Date:  November 2018
> -KernelVersion: 4.21
> +KernelVersion: 5.0
>  Contact:   Vadim Pasternak 
>  Description:   These files show the system reset cause, as following: ComEx
> power fail, reset from ComEx, system platform reset, reset
> --
> 2.17.2
>
>
> --
> Darren Hart
> VMware Open Source Technology Center



-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH 2/8] libertas: change snprintf to scnprintf for possible overflow

2019-01-14 Thread Kalle Valo
Willy Tarreau  writes:

> From: Silvio Cesare 
>
> Change snprintf to scnprintf. There are generally two cases where using
> snprintf causes problems.
>
> 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
> In this case, if snprintf would have written more characters than what the
> buffer size (SIZE) is, then size will end up larger than SIZE. In later
> uses of snprintf, SIZE - size will result in a negative number, leading
> to problems. Note that size might already be too large by using
> size = snprintf before the code reaches a case of size += snprintf.
>
> 2) If size is ultimately used as a length parameter for a copy back to user
> space, then it will potentially allow for a buffer overflow and information
> disclosure when size is greater than SIZE. When the size is used to index
> the buffer directly, we can have memory corruption. This also means when
> size = snprintf... is used, it may also cause problems since size may become
> large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
> configuration.
>
> The solution to these issues is to use scnprintf which returns the number of
> characters actually written to the buffer, so the size variable will never
> exceed SIZE.
>
> Signed-off-by: Silvio Cesare 
> Cc: Kalle Valo 
> Cc: Dan Carpenter 
> Cc: Kees Cook 
> Cc: Will Deacon 
> Cc: Greg KH 
> Signed-off-by: Willy Tarreau 

I don't see any mention about which tree this should go to. Can I take
this to wireless-drivers-next?

-- 
Kalle Valo
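
The difference in return values described in the commit message can be
shown in userspace. This is a sketch, not the kernel implementation:
bounded_printf() is a hypothetical helper built on standard vsnprintf()
that returns the number of characters actually stored, which is the
behaviour the commit message attributes to scnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper mirroring the behaviour described above:
 * return the number of characters actually stored in buf (excluding the
 * terminating NUL), never the length that would have been needed.
 */
int bounded_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;		/* encoding error: nothing usable written */
	if ((size_t)n < size)
		return n;		/* everything fit */
	return size ? (int)(size - 1) : 0;	/* output was truncated */
}
```

With an 8-byte buffer and the string "hello world", plain snprintf()
returns 11 while the helper returns 7, so an accumulated "size +=" can
never run past the end of the buffer.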


Re: [PATCH] remoteproc/qcom_sysmon.c: Remove duplicate header

2019-01-14 Thread Bjorn Andersson
On Wed 09 Jan 06:29 PST 2019, Brajeswar Ghosh wrote:

> Remove linux/notifier.h which is included more than once
> 

Applied, with Souptick's ack.

Thanks,
Bjorn

> Signed-off-by: Brajeswar Ghosh 
> ---
>  drivers/remoteproc/qcom_sysmon.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/qcom_sysmon.c 
> b/drivers/remoteproc/qcom_sysmon.c
> index e976a602b015..603b813151f2 100644
> --- a/drivers/remoteproc/qcom_sysmon.c
> +++ b/drivers/remoteproc/qcom_sysmon.c
> @@ -7,7 +7,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> -- 
> 2.17.1
> 


Re: [PATCH 1/3] arm64: kprobes: Move extable address check into arch_prepare_kprobe()

2019-01-14 Thread Masami Hiramatsu
On Fri, 11 Jan 2019 18:22:38 +
James Morse  wrote:

> Hi,
> 
> On 09/01/2019 02:05, Masami Hiramatsu wrote:
> > On Tue, 8 Jan 2019 17:13:36 +
> > James Morse  wrote:
> >> On 08/01/2019 02:39, Masami Hiramatsu wrote:
> >>> On Thu, 3 Jan 2019 17:05:18 +
> >>> James Morse  wrote:
> >>>> On 17/12/2018 06:40, Masami Hiramatsu wrote:
> >>>>> Move extable address check into arch_prepare_kprobe() from
> >>>>> arch_within_kprobe_blacklist().
> >>>>
> >>>> I'm trying to work out the pattern for what should go in the blacklist,
> >>>> and what should be rejected by the arch code.
> >>>>
> >>>> It seems address-ranges should be blacklisted as the contents don't
> >>>> matter.
> >>>> easy-example: the idmap text.
> >>>
> >>> Yes, more precisely, the code smaller than a function (symbol), it must be
> >>> rejected by arch_prepare_kprobe(), since blacklist is populated based on
> >>> kallsyms.
> >>
> >> Ah, okay, so the pattern is the blacklist should only be for whole symbols,
> >> (which explains why its usually based on sections).
> > 
> > Correct. Actually, the blacklist is generated based on the symbol info
> > from symbol address.
> > 
> >> I see kprobe_add_ksym_blacklist() would go wrong if you give it something 
> >> like:
> >> platform_drv_probe+0x50/0xb0, as it will log platform_drv_probe+0x50 as the
> >> start_addr and platform_drv_probe+0x50+0xb0 as the end.
> > 
> > Yes, it expects given address is the entry of a symbol.
> 
> >> But how does anything from the arch code's blacklist get into the
> >> kprobe_blacklist list?
> > 
> > It should be done via arch_populate_kprobe_blacklist().
> 
> >> We don't have an arch_populate_kprobe_blacklist(), so rely on
> >> within_kprobe_blacklist() calling arch_within_kprobe_blacklist() with the
> >> address, as well as walking kprobe_blacklist.
> >>
> >> Is this cleanup ahead of a series that does away with
> >> arch_within_kprobe_blacklist() so that debugfs list is always complete?
> > 
> > Right, after this cleanup, I will send arch_populate_kprobe_blacklist()
> > patch for arm64 and others. My plan is to move all 
> > arch_within_kprobe_blacklist()
> > to arch_populate_kprobe_blacklist() so that user can get more precise 
> > blacklist
> > via debugfs.
> 
> Thanks, now it all makes sense!
> 
> Reviewed-by: James Morse 

Thanks!

> 
> 
> Could you include a paragraph like that in the cover-letter or commit-message?
> The 'fix' in the cover-letter subject had me looking for the bug!

Ok, I'll update commit message with your reviewed-by.

Thank you!

> 
> 
> Thanks,
> 
> James


-- 
Masami Hiramatsu 


Re: [Ocfs2-devel] [PATCH] ocfs2: fix the application IO timeout when fstrim is running

2019-01-14 Thread Gang He
Hello Changwei,

>>> On 2019/1/15 at 11:50, in message
<63adc13fd55d6546b7dece290d39e3730127825...@h3cmlb12-ex.srv.huawei-3com.com>,
Changwei Ge  wrote:
> Hi Gang,
> 
> Most parts of this patch look sane to me, just a tiny question...
> 
> On 2019/1/11 17:01, Gang He wrote:
>> The user reported this problem: the upper application IO timed out
>> while fstrim was running on this ocfs2 partition. The
>> application monitoring resource agent considered that this
>> application did not work, and then this node was fenced by the cluster
>> brain (e.g. pacemaker).
>> The root cause is that the fstrim thread always holds the main_bm
>> meta-file related locks until all the cluster groups are trimmed.
>> This patch makes the fstrim thread release the main_bm meta-file
>> related locks after each cluster group is trimmed, which gives
>> the current application IO a chance to claim clusters from the
>> main_bm meta-file.
>> 
>> Signed-off-by: Gang He 
>> ---
>>   fs/ocfs2/alloc.c   | 159 +
>>   fs/ocfs2/dlmglue.c |   5 ++
>>   fs/ocfs2/ocfs2.h   |   1 +
>>   fs/ocfs2/ocfs2_trace.h |   2 +
>>   fs/ocfs2/super.c   |   2 +
>>   5 files changed, 106 insertions(+), 63 deletions(-)
>> 
>> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
>> index d1cbb27808e2..6f0999015a44 100644
>> --- a/fs/ocfs2/alloc.c
>> +++ b/fs/ocfs2/alloc.c
>> @@ -7532,10 +7532,11 @@ static int ocfs2_trim_group(struct super_block *sb,
>>  return count;
>>   }
>>   
>> -int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>> +static
>> +int ocfs2_trim_mainbm(struct super_block *sb, struct fstrim_range *range)
>>   {
>>  struct ocfs2_super *osb = OCFS2_SB(sb);
>> -u64 start, len, trimmed, first_group, last_group, group;
>> +u64 start, len, trimmed = 0, first_group, last_group = 0, group = 0;
>>  int ret, cnt;
>>  u32 first_bit, last_bit, minlen;
>>  struct buffer_head *main_bm_bh = NULL;
>> @@ -7543,7 +7544,6 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>>  struct buffer_head *gd_bh = NULL;
>>  struct ocfs2_dinode *main_bm;
>>  struct ocfs2_group_desc *gd = NULL;
>> -struct ocfs2_trim_fs_info info, *pinfo = NULL;
>>   
>>  start = range->start >> osb->s_clustersize_bits;
>>  len = range->len >> osb->s_clustersize_bits;
>> @@ -7552,6 +7552,9 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>>  if (minlen >= osb->bitmap_cpg || range->len < sb->s_blocksize)
>>  return -EINVAL;
>>   
>> +trace_ocfs2_trim_mainbm(start, len, minlen);
>> +
>> +next_group:
>>  main_bm_inode = ocfs2_get_system_file_inode(osb,
>>  GLOBAL_BITMAP_SYSTEM_INODE,
>>  OCFS2_INVALID_SLOT);
>> @@ -7570,64 +7573,34 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>>  }
>>  main_bm = (struct ocfs2_dinode *)main_bm_bh->b_data;
>>   
>> -if (start >= le32_to_cpu(main_bm->i_clusters)) {
>> -ret = -EINVAL;
>> -goto out_unlock;
>> -}
>> -
>> -len = range->len >> osb->s_clustersize_bits;
>> -if (start + len > le32_to_cpu(main_bm->i_clusters))
>> -len = le32_to_cpu(main_bm->i_clusters) - start;
>> -
>> -trace_ocfs2_trim_fs(start, len, minlen);
>> -
>> -ocfs2_trim_fs_lock_res_init(osb);
>> -ret = ocfs2_trim_fs_lock(osb, NULL, 1);
>> -if (ret < 0) {
>> -if (ret != -EAGAIN) {
>> -mlog_errno(ret);
>> -ocfs2_trim_fs_lock_res_uninit(osb);
>> +/*
>> + * Do some check before trim the first group.
>> + */
>> +if (!group) {
>> +if (start >= le32_to_cpu(main_bm->i_clusters)) {
>> +ret = -EINVAL;
>>  goto out_unlock;
>>  }
>>   
>> -mlog(ML_NOTICE, "Wait for trim on device (%s) to "
>> - "finish, which is running from another node.\n",
>> - osb->dev_str);
>> -ret = ocfs2_trim_fs_lock(osb, , 0);
>> -if (ret < 0) {
>> -mlog_errno(ret);
>> -ocfs2_trim_fs_lock_res_uninit(osb);
>> -goto out_unlock;
>> -}
>> +if (start + len > le32_to_cpu(main_bm->i_clusters))
>> +len = le32_to_cpu(main_bm->i_clusters) - start;
>>   
>> -if (info.tf_valid && info.tf_success &&
>> -info.tf_start == start && info.tf_len == len &&
>> -info.tf_minlen == minlen) {
>> -/* Avoid sending duplicated trim to a shared device */
>> -mlog(ML_NOTICE, "The same trim on device (%s) was "
>> - "just done from node (%u), return.\n",
>> - osb->dev_str, info.tf_nodenum);
>> -range->len = info.tf_trimlen;
>> -   

Re: [PATCH] staging/android/vsoc: Remove duplicate header

2019-01-14 Thread Souptick Joarder
On Wed, Jan 9, 2019 at 8:56 PM Brajeswar Ghosh
 wrote:
>
> Remove linux/mutex.h which is included more than once
>
> Signed-off-by: Brajeswar Ghosh 

Acked-by: Souptick Joarder 

> ---
>  drivers/staging/android/vsoc.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c
> index 22571abcaa4e..8a75bd27c413 100644
> --- a/drivers/staging/android/vsoc.c
> +++ b/drivers/staging/android/vsoc.c
> @@ -29,7 +29,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include "uapi/vsoc_shm.h"
> --
> 2.17.1
>


Re: [PATCH] remoteproc/qcom_sysmon.c: Remove duplicate header

2019-01-14 Thread Souptick Joarder
On Wed, Jan 9, 2019 at 8:00 PM Brajeswar Ghosh
 wrote:
>
> Remove linux/notifier.h which is included more than once
>
> Signed-off-by: Brajeswar Ghosh 

Acked-by: Souptick Joarder 

> ---
>  drivers/remoteproc/qcom_sysmon.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/remoteproc/qcom_sysmon.c 
> b/drivers/remoteproc/qcom_sysmon.c
> index e976a602b015..603b813151f2 100644
> --- a/drivers/remoteproc/qcom_sysmon.c
> +++ b/drivers/remoteproc/qcom_sysmon.c
> @@ -7,7 +7,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> --
> 2.17.1
>


Re: [PATCH] mfd: cros_ec: Add support for MKBP more event flags

2019-01-14 Thread Gwendal Grignou
On Mon, Jan 14, 2019 at 6:04 PM Brian Norris  wrote:
>
> Hi Gwendal,
>
> On Mon, Jan 14, 2019 at 3:50 PM Gwendal Grignou  wrote:
> > On Fri, Dec 7, 2018 at 2:22 PM Brian Norris  
> > wrote:
> > > On Thu, Nov 29, 2018 at 11:55:48AM -0800, egran...@google.com wrote:
> > > > --- a/drivers/platform/chrome/cros_ec_proto.c
> > > > +++ b/drivers/platform/chrome/cros_ec_proto.c
> > > > @@ -420,10 +420,14 @@ int cros_ec_query_all(struct cros_ec_device 
> > > > *ec_dev)
> > > >   ret = cros_ec_get_host_command_version_mask(ec_dev,
> > > >   EC_CMD_GET_NEXT_EVENT,
> > > >   _mask);
> > >
> > > It's not exactly new here (although you're using 'ver_mask' in new
> > > ways), but cros_ec_get_host_command_version_mask() doesn't look 100%
> > > right. It doesn't look at msg->result, and instead just assumes that if
> > > we got some data back (send_command() > 0), then it must have been a
> > > success. I don't think that's really guaranteed in general, although it
> > > might be for the specific case of EC_CMD_GET_CMD_VERSIONS.
>
> > It is guaranteed: if msg->result is not EC_RES_SUCCESS, then ret can
> > not be greater than 0. At best it will be 0, or a negative number if
> > we can already qualify the error in the errno space (see
> > T() for instance).
>
> Sorry, where do you guarantee that? The only enforcements I see in
> those xfer implementation are:
> (1) if result == EC_RES_IN_PROGRESS, we convert that to an errno
> (2) if the expected length or checksum are bad, we turn that to an errno
>
> So technically, the EC *could* be sending a valid, checksummed
> response of the expected length, while still setting the ->result
> field to something besides EC_RES_SUCCESS or EC_RES_IN_PROGRESS. And
> we would treat that as a valid 'ver_mask'.
You're right, I misread cros_ec_pkt_xfer_i2c().
> Albeit, that seems unlikely, given understanding of how the EC is
> supposed to behave, but our code is not properly defensive AIUI. This
> is basically why cros_ec_cmd_xfer_status() exists -- so that
> sub-drivers don't get lazy and use cros_ec_cmd_xfer() without handling
> the ->result field properly.
send_command is called for a very small subset of command where the
ec_dev mutex is already held. We indeed need to be careful when
calling it directly.

Gwendal.
>
> Brian


Re: [PATCH 9/9] xen/privcmd-buf.c: Convert to use vm_insert_range_buggy

2019-01-14 Thread Souptick Joarder
On Tue, Jan 15, 2019 at 5:01 AM Boris Ostrovsky
 wrote:
>
> On 1/11/19 10:13 AM, Souptick Joarder wrote:
> > Convert to use vm_insert_range_buggy() to map range of kernel
> > memory to user vma.
> >
> > This driver has ignored vm_pgoff. We could later "fix" these drivers
> > to behave according to the normal vm_pgoff offsetting simply by
> > removing the _buggy suffix on the function name and if that causes
> > regressions, it gives us an easy way to revert.
> >
> > Signed-off-by: Souptick Joarder 
> > ---
> >  drivers/xen/privcmd-buf.c | 8 ++--
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/xen/privcmd-buf.c b/drivers/xen/privcmd-buf.c
> > index de01a6d..a9d7e97 100644
> > --- a/drivers/xen/privcmd-buf.c
> > +++ b/drivers/xen/privcmd-buf.c
> > @@ -166,12 +166,8 @@ static int privcmd_buf_mmap(struct file *file, struct 
> > vm_area_struct *vma)
> >   if (vma_priv->n_pages != count)
> >   ret = -ENOMEM;
> >   else
> > - for (i = 0; i < vma_priv->n_pages; i++) {
> > - ret = vm_insert_page(vma, vma->vm_start + i * 
> > PAGE_SIZE,
> > -  vma_priv->pages[i]);
> > - if (ret)
> > - break;
> > - }
> > + ret = vm_insert_range_buggy(vma, vma_priv->pages,
> > + vma_priv->n_pages);
>
> This can use the non-buggy version. But since the original code was
> indeed buggy in this respect I can submit this as a separate patch later.
>
> So
>
> Reviewed-by: Boris Ostrovsky 

Thanks Boris.
>
>
> >
> >   if (ret)
> >   privcmd_buf_vmapriv_free(vma_priv);
>


Re: [PATCH v3 0/6] Static calls

2019-01-14 Thread H. Peter Anvin
On 1/14/19 9:01 PM, H. Peter Anvin wrote:
> 
> This could be as simple as spinning for a limited time waiting for
> states 0 or 3 if we are not the patching CPU. It is also not necessary
> to wait for the mask to become zero for the first sync if we find
> ourselves suddenly in state 4.
> 

So this would look something like this for the #BP handler; I think this
is safe.  This uses the TLB miss on the write page intentionally to slow
down the loop a bit to reduce the risk of livelock.  Note that
"bp_write_addr" here refers to the write address for the breakpoint that
was taken.

state = atomic_read(_poke_state);
if (state == 0)
return 0;   /* No patching in progress */

recheck:
clear bit in mask

switch (state) {
case 1:
case 4:
if (smp_processor_id() != bp_patching_cpu) {
int retries = NNN;
while (retries--) {
invlpg
if (*bp_write_addr != 0xcc)
goto recheck;
state = atomic_read(_poke_state);
if (state != 1 && state != 4)
goto recheck;
}
}
state = cmpxchg(_poke_state, 1, 4);
if (state != 1 && state != 4)
goto recheck;
atomic_write(bp_write_addr, bp_old_value);
break;
case 2:
if (smp_processor_id() != bp_patching_cpu) {
invlpg
state = atomic_read(_poke_state);
if (state != 2)
goto recheck;
}
complete patch sequence
remove breakpoint
break;

case 3:
case 0:
/*
 * If we are here, the #BP will go away on its
 * own, or we will re-take it if it was a "real"
 * breakpoint.
 */
break;
}
return 1;


Problem of TCP bandwidth drops when change MTU to a small value

2019-01-14 Thread Weilong Chen
Hi, when we change the MTU to a small value, for example with
"ifconfig eth0 mtu 68", an iperf test shows a large bandwidth drop that
previous kernel versions do not show.
Git bisect points to commit
28d35bcdd3925e7293408cdb8aa5f2aac5f0d6e3 (net: ipv4: don't let PMTU
updates increase route MTU). After this patch, the stack sends packets
using the small MTU.


This problem can be reproduced easily on a qemu-kvm platform with a
virtual E1000 NIC:

# ethtool -i eth0
driver: e1000
version: 7.3.21-k8-NAPI
firmware-version:
expansion-rom-version:
bus-info: :00:03.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
# iperf -c 9.81.3.11 -t 3

Client connecting to 9.81.3.11, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local 9.83.1.202 port 44644 connected with 9.81.3.11 port 5001
[ ID] Interval   Transfer Bandwidth
[  3]  0.0- 3.0 sec   336 MBytes   938 Mbits/sec
# ifconfig eth0 mtu 68
# iperf -c 9.81.3.11 -t 3

Client connecting to 9.81.3.11, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local 9.83.1.202 port 44646 connected with 9.81.3.11 port 5001
[ ID] Interval   Transfer Bandwidth
[  3]  0.0- 3.3 sec  3.62 MBytes  9.18 Mbits/sec



Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-14 Thread Pankaj Gupta


> > > >
> > > > On Mon, Jan 14, 2019 at 02:15:40AM -0500, Pankaj Gupta wrote:
> > > > >
> > > > > > > Until you have images (and hence host page cache) shared between
> > > > > > > multiple guests. People will want to do this, because it means
> > > > > > > they
> > > > > > > only need a single set of pages in host memory for executable
> > > > > > > binaries rather than a set of pages per guest. Then you have
> > > > > > > multiple guests being able to detect residency of the same set of
> > > > > > > pages. If the guests can then, in any way, control eviction of
> > > > > > > the
> > > > > > > pages from the host cache, then we have a guest-to-guest
> > > > > > > information
> > > > > > > leak channel.
> > > > > >
> > > > > > I don't think we should ever be considering something that would
> > > > > > allow a
> > > > > > guest to evict page's from the host's pagecache [1].  The guest
> > > > > > should
> > > > > > be able to kick its own references to the host's pagecache out of
> > > > > > its
> > > > > > own pagecache, but not be able to influence whether the host or
> > > > > > another
> > > > > > guest has a read-only mapping cached.
> > > > > >
> > > > > > [1] Unless the guest is allowed to modify the host's file;
> > > > > > obviously
> > > > > > truncation, holepunching, etc are going to evict pages from the
> > > > > > host's
> > > > > > page cache.
> > > > >
> > > > > This is so correct. Guest does not not evict host page cache pages
> > > > > directly.
> > > >
> > > > They don't right now.
> > > >
> > > > But someone is going to end up asking for discard to work so that
> > > > the guest can free unused space in the underlying spares image (i.e.
> > > > make use of fstrim or mount -o discard) because they have workloads
> > > > that have bursts of space usage and they need to trim the image
> > > > files afterwards to keep their overall space usage under control.
> > > >
> > > > And then
> > > 
> > > ...we reject / push back on that patch citing the above concern.
> > 
> > So at what point do we draw the line?
> > 
> > We're allowing writable DAX mappings, but as I've pointed out that
> > means we are going to be allowing  a potential information leak via
> > files with shared extents to be directly mapped and written to.
> > 
> > But we won't allow useful admin operations that allow better
> > management of host side storage space similar to how normal image
> > files are used by guests because it's an information leak vector?
> > 
> > That's splitting some really fine hairs there...
> 
> May I summarize that th security implications need to
> be documented?
> 
> In fact that would make a fine security implications section
> in the device specification.

This is a very good suggestion.

I will document the security implications in detail in the device
specification, including which filesystem features we don't support and why.

Best regards,
Pankaj

> 
> 
> 
> 
> 
> > > > > In case of virtio-pmem & DAX, guest clears guest page cache
> > > > > exceptional entries.
> > > > > Its solely decision of host to take action on the host page cache
> > > > > pages.
> > > > >
> > > > > In case of virtio-pmem, guest does not modify host file directly i.e
> > > > > don't
> > > > > perform hole punch & truncation operation directly on host file.
> > > >
> > > > ... this will no longer be true, and the nuclear landmine in this
> > > > driver interface will have been armed
> > > 
> > > I agree with the need to be careful when / if explicit cache control
> > > is added, but that's not the case today.
> > 
> > "if"?
> > 
> > I expect it to be "when", not if. Expect the worst, plan for it now.
> > 
> > Cheers,
> > 
> > Dave.
> > --
> > Dave Chinner
> > da...@fromorbit.com
> 
> 


Re: [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-14 Thread Pankaj Gupta


> > > On Mon, Jan 14, 2019 at 02:15:40AM -0500, Pankaj Gupta wrote:
> > > >
> > > > > > Until you have images (and hence host page cache) shared between
> > > > > > multiple guests. People will want to do this, because it means they
> > > > > > only need a single set of pages in host memory for executable
> > > > > > binaries rather than a set of pages per guest. Then you have
> > > > > > multiple guests being able to detect residency of the same set of
> > > > > > pages. If the guests can then, in any way, control eviction of the
> > > > > > pages from the host cache, then we have a guest-to-guest
> > > > > > information
> > > > > > leak channel.
> > > > >
> > > > > I don't think we should ever be considering something that would
> > > > > allow a
> > > > > guest to evict page's from the host's pagecache [1].  The guest
> > > > > should
> > > > > be able to kick its own references to the host's pagecache out of its
> > > > > own pagecache, but not be able to influence whether the host or
> > > > > another
> > > > > guest has a read-only mapping cached.
> > > > >
> > > > > [1] Unless the guest is allowed to modify the host's file; obviously
> > > > > truncation, holepunching, etc are going to evict pages from the
> > > > > host's
> > > > > page cache.
> > > >
> > > > This is so correct. Guest does not not evict host page cache pages
> > > > directly.
> > >
> > > They don't right now.
> > >
> > > But someone is going to end up asking for discard to work so that
> > > the guest can free unused space in the underlying spares image (i.e.
> > > make use of fstrim or mount -o discard) because they have workloads
> > > that have bursts of space usage and they need to trim the image
> > > files afterwards to keep their overall space usage under control.
> > >
> > > And then
> > 
> > ...we reject / push back on that patch citing the above concern.
> 
> So at what point do we draw the line?
> 
> We're allowing writable DAX mappings, but as I've pointed out that
> means we are going to be allowing  a potential information leak via
> files with shared extents to be directly mapped and written to.
> 
> But we won't allow useful admin operations that allow better
> management of host side storage space similar to how normal image
> files are used by guests because it's an information leak vector?

First of all, thank you for all the useful discussion.
To summarize:

- We have to live with the limitation of not supporting fstrim and
  mount -o discard with virtio-pmem, as they would evict host page
  cache pages. We cannot allow this for virtio-pmem for security
  reasons. For now, these filesystem commands will just zero out
  unused pages.

- If a lot of space is unused and not freed, the guest can ask the
  host administrator to truncate the host backing image.
  We are also planning to support the qcow2 sparse image format on
  the host side with virtio-pmem.

- There is currently no existing solution for QEMU persistent memory
  emulation with write support. This solution gives us a
  paravirtualized way of emulating persistent memory. It does not
  emulate ACPI structures; instead it just uses VIRTIO for
  communication between guest and host. It is fast because of its
  asynchronous nature, and it works well. On the guest side it makes
  use of the libnvdimm APIs.

- If freeing disk space by trimming or truncating guest files is
  very important for users, they can still use real hardware, which
  will provide both (advanced disk features and page cache bypass).

Considering all the above, I think this feature is useful
from a virtualization point of view. As Dave rightly said, we should
be careful, and I think we are now being careful with the security
implications of this device.

Thanks again for all the inputs.

Best regards,
Pankaj  


> 
> That's splitting some really fine hairs there...
> 
> > > > In case of virtio-pmem & DAX, guest clears guest page cache exceptional
> > > > entries.
> > > > Its solely decision of host to take action on the host page cache
> > > > pages.
> > > >
> > > > In case of virtio-pmem, guest does not modify host file directly i.e
> > > > don't
> > > > perform hole punch & truncation operation directly on host file.
> > >
> > > ... this will no longer be true, and the nuclear landmine in this
> > > driver interface will have been armed
> > 
> > I agree with the need to be careful when / if explicit cache control
> > is added, but that's not the case today.
> 
> "if"?
> 
> I expect it to be "when", not if. Expect the worst, plan for it now.
> 
> Cheers,
> 
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com
> 


Re: [PATCH] pinctrl: cherryview: fix Strago DMI workaround

2019-01-14 Thread Andy Shevchenko
On Mon, Jan 14, 2019 at 07:38:36PM -0800, Dmitry Torokhov wrote:
> Well, hopefully 3rd time is a charm. We tried making that check
> DMI_BIOS_VERSION and DMI_BOARD_VERSION, but the real one is
> DMI_PRODUCT_VERSION.

Reviewed-by: Andy Shevchenko 

> Fixes: 86c5dd6860a6 ("pinctrl: cherryview: limit Strago DMI workarounds to 
> version 1.0")
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197953
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1631930
> Cc: sta...@vger.kernel.org
> Signed-off-by: Dmitry Torokhov 
> ---
>  drivers/pinctrl/intel/pinctrl-cherryview.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c 
> b/drivers/pinctrl/intel/pinctrl-cherryview.c
> index 9b0f4b9ef482..8efe8ea45602 100644
> --- a/drivers/pinctrl/intel/pinctrl-cherryview.c
> +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
> @@ -1507,7 +1507,7 @@ static const struct dmi_system_id chv_no_valid_mask[] = 
> {
>   .matches = {
>   DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
>   DMI_MATCH(DMI_PRODUCT_FAMILY, "Intel_Strago"),
> - DMI_MATCH(DMI_BOARD_VERSION, "1.0"),
> + DMI_MATCH(DMI_PRODUCT_VERSION, "1.0"),
>   },
>   },
>   {
> @@ -1515,7 +1515,7 @@ static const struct dmi_system_id chv_no_valid_mask[] = 
> {
>   .matches = {
>   DMI_MATCH(DMI_SYS_VENDOR, "HP"),
>   DMI_MATCH(DMI_PRODUCT_NAME, "Setzer"),
> - DMI_MATCH(DMI_BOARD_VERSION, "1.0"),
> + DMI_MATCH(DMI_PRODUCT_VERSION, "1.0"),
>   },
>   },
>   {
> @@ -1523,7 +1523,7 @@ static const struct dmi_system_id chv_no_valid_mask[] = 
> {
>   .matches = {
>   DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
>   DMI_MATCH(DMI_PRODUCT_NAME, "Cyan"),
> - DMI_MATCH(DMI_BOARD_VERSION, "1.0"),
> + DMI_MATCH(DMI_PRODUCT_VERSION, "1.0"),
>   },
>   },
>   {
> @@ -1531,7 +1531,7 @@ static const struct dmi_system_id chv_no_valid_mask[] = 
> {
>   .matches = {
>   DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
>   DMI_MATCH(DMI_PRODUCT_NAME, "Celes"),
> - DMI_MATCH(DMI_BOARD_VERSION, "1.0"),
> + DMI_MATCH(DMI_PRODUCT_VERSION, "1.0"),
>   },
>   },
>   {}
> -- 
> 2.20.1.97.g81188d93c3-goog
> 
> 
> -- 
> Dmitry

-- 
With Best Regards,
Andy Shevchenko




Re: [RFC PATCH] x86, numa: always initialize all possible nodes

2019-01-14 Thread Pingfan Liu
[...]
> >
> > I would appreciate a help with those architectures because I couldn't
> > really grasp how the memoryless nodes are really initialized there. E.g.
> > ppc only seem to call setup_node_data for online nodes but I couldn't
> > find any special treatment for nodes without any memory.
>
> We have a somewhat dubious hack in our hotplug code, see:
>
> e67e02a544e9 ("powerpc/pseries: Fix cpu hotplug crash with memoryless nodes")
>
> Which basically onlines the node when we hotplug a CPU into it.
>
This bug should be related to the present state of the NUMA node at
boot time. On PowerNV and pSeries, the boot code does not seem to bring
up memoryless nodes, so it cannot avoid this bug.

Thanks,
Pingfan


Re: [PATCH v2] rbtree: fix the red root

2019-01-14 Thread Qian Cai


> [  114.913404] Padding 6913c65d: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.915437] Padding 2d53f25c: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.917390] Padding 78f7d621: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.919402] Padding 63547658: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.921414] Padding 1a301f4e: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.923364] Padding 46589d24: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.925340] Padding 08fb13da: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.927291] Padding ae5cc298: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.929239] Padding d49cc239: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.931177] Padding d66ad6f5: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.933110] Padding 069ad671: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.934986] Padding ffaf648c: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.936895] Padding c96d1b58: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.938848] Padding 768e4920: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.940965] Padding 0d06b43c: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.942890] Padding af5ae9fa: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.944790] Padding 6b526f1e: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  
> [  114.946727] Padding 9c8dffe3: 00 00 00 00 00 00 00 00 00 00 00 00 
> 00 00 00 00  

Another testing angle:

It might be that something is doing __GFP_ZERO and overwriting those slabs.
This would also match the red root corruption, since RB_RED is 0 (the color
is kept in bit 0, so a zeroed node reads as red).

One thing to try is to enable page_owner=on on the command line, and then
obtain sorted_page_owner.txt before and after running the reproducer.

$ cd tools/vm
$ make page_owner_sort

$ cat /sys/kernel/debug/page_owner > page_owner_full.txt
$ grep -v ^PFN page_owner_full.txt > page_owner.txt
$ ./page_owner_sort page_owner.txt sorted_page_owner.txt
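
To illustrate why a zeroing write would show up as a red root, here is
a simplified userspace model of the color encoding. The struct and
helper names are invented for this sketch; only the RB_RED = 0 and
RB_BLACK = 1 values and the use of bit 0 follow the kernel's rbtree
code.

```c
#include <stdint.h>
#include <string.h>

/* Same values as the kernel's rbtree color constants. */
#define RB_RED   0
#define RB_BLACK 1

/* Hypothetical stand-in for struct rb_node. */
struct rb_node_model {
	uintptr_t parent_color;   /* parent pointer, with the color in bit 0 */
	struct rb_node_model *right;
	struct rb_node_model *left;
};

/* Extract the color from bit 0 of the combined word. */
static int rb_model_color(const struct rb_node_model *n)
{
	return (int)(n->parent_color & 1);
}

/* Mark a node black, as a valid root must be. */
static void rb_model_set_black(struct rb_node_model *n)
{
	n->parent_color |= RB_BLACK;
}

/* What a stray zeroing write over the node's memory would do. */
static void rb_model_zero(struct rb_node_model *n)
{
	memset(n, 0, sizeof(*n));
}
```

In this model a black root that gets zeroed comes back as RB_RED with
NULL children, which is the shape of corruption discussed in this
thread.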


Regression in 32-bit-compat TIOCGPTPEER ioctl due to 311fc65c9fb9c966bca8e6f3ff8132ce57344ab9

2019-01-14 Thread Robert O'Callahan
This commit refactored the implementation of TIOCGPTPEER, moving "case
TIOCGPTPEER" from pty_unix98_ioctl() to tty_ioctl().
pty_unix98_ioctl() is called by pty_unix98_compat_ioctl(), so before
the commit, TIOCGPTPEER worked for 32-bit userspace. Unfortunately
tty_compat_ioctl() does not call tty_ioctl() so after the commit,
TIOCGPTPEER from 32-bit userspace fails with ENOTTY.

Testcase in https://bugzilla.kernel.org/show_bug.cgi?id=202271.

I found this bug running the rr test suite.

Rob
-- 
Su ot deraeppa sah dna Rehtaf eht htiw saw hcihw, efil lanrete eht uoy
ot mialcorp ew dna, ti ot yfitset dna ti nees evah ew; deraeppa efil
eht. Efil fo Drow eht gninrecnoc mialcorp ew siht - dehcuot evah sdnah
ruo dna ta dekool evah ew hcihw, seye ruo htiw nees evah ew hcihw,
draeh evah ew hcihw, gninnigeb eht morf saw hcihw taht.


[PATCH] drm/amdgpu: Replace kzalloc with kcalloc

2019-01-14 Thread Gustavo A. R. Silva
Replace kzalloc() function with its 2-factor argument form, kcalloc().

This patch replaces cases of:

kzalloc(a * b, gfp)

with:
kcalloc(a, b, gfp)

Also, improve the coding style and the use of sizeof during
allocation by changing sizeof(struct dc_surface_update) and
sizeof(struct dc_plane_state) to sizeof(*updates) and
sizeof(*surfaces), correspondingly.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a3e65e457348..4c201de38329 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5782,11 +5782,14 @@ dm_determine_update_type_for_commit(struct dc *dc,
struct dm_crtc_state *new_dm_crtc_state, *old_dm_crtc_state;
struct dc_stream_status *status = NULL;
 
-   struct dc_surface_update *updates = kzalloc(MAX_SURFACES * 
sizeof(struct dc_surface_update), GFP_KERNEL);
-   struct dc_plane_state *surface = kzalloc(MAX_SURFACES * sizeof(struct 
dc_plane_state), GFP_KERNEL);
+   struct dc_surface_update *updates;
+   struct dc_plane_state *surface;
struct dc_stream_update stream_update;
enum surface_update_type update_type = UPDATE_TYPE_FAST;
 
+   updates = kcalloc(MAX_SURFACES, sizeof(*updates), GFP_KERNEL);
+   surface = kcalloc(MAX_SURFACES, sizeof(*surface), GFP_KERNEL);
+
if (!updates || !surface) {
DRM_ERROR("Plane or surface update failed to allocate");
/* Set type to FULL to avoid crashing in DC*/
-- 
2.20.1
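
As background for the kzalloc(a * b) to kcalloc(a, b) conversion, the
sketch below shows in userspace C the kind of overflow check a
two-factor allocator can do and an open-coded multiplication cannot.
The names mul_overflows() and calloc_checked() are hypothetical and are
not kernel functions. In this particular patch both factors appear to
be compile-time constants, so the change is presumably about idiom
rather than a live overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Returns 1 if n * size would not fit in size_t.
 * Otherwise stores the product in *bytes and returns 0.
 */
static int mul_overflows(size_t n, size_t size, size_t *bytes)
{
	if (size != 0 && n > SIZE_MAX / size)
		return 1;
	*bytes = n * size;
	return 0;
}

/* Zeroed array allocation that fails cleanly instead of wrapping around. */
static void *calloc_checked(size_t n, size_t size)
{
	size_t bytes;

	if (mul_overflows(n, size, &bytes))
		return NULL;
	return calloc(1, bytes);
}
```

The caller's error path stays the same: a NULL return covers both
"out of memory" and "the requested size does not make sense".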



Re: [PATCH V2] x86/kexec: fix a kexec_file_load failure

2019-01-14 Thread Dave Young
On 12/28/18 at 09:12am, Dave Young wrote:
> The code cleanup mentioned in Fixes tag changed the behavior of
> kexec_locate_mem_hole.  The kexec_locate_mem_hole will try to
> allocate free memory only when kbuf.mem is initialized as zero.
> 
> But in x86 kexec_file_load implementation there are a few places
> the kbuf.mem is reused like below:
>   /* kbuf initialized, kbuf.mem = 0 */
>   ...
>   kexec_add_buffer()
>   ...
>   kexec_add_buffer()
> 
>   The second kexec_add_buffer will reuse previous kbuf but not
>   reinitialize the kbuf.mem.
> 
> Thus kexec_file_load failed because the sanity check failed.
> 
> So explicitly reset kbuf.mem to fix the issue.
> 
> Fixes: b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()")
> Signed-off-by: Dave Young 
> Cc: 
> ---
> V1 -> V2: use KEXEC_BUF_MEM_UNKNOWN in code.
>  arch/x86/kernel/crash.c   | 1 +
>  arch/x86/kernel/kexec-bzimage64.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index f631a3f15587..6b7890c7889b 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -469,6 +469,7 @@ int crash_load_segments(struct kimage *image)
>  
>   kbuf.memsz = kbuf.bufsz;
>   kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer();
>   if (ret) {
>   vfree((void *)image->arch.elf_headers);
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index 278cd07228dd..0d5efa34f359 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -434,6 +434,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   kbuf.memsz = PAGE_ALIGN(header->init_size);
>   kbuf.buf_align = header->kernel_alignment;
>   kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer();
>   if (ret)
>   goto out_free_params;
> @@ -448,6 +449,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   kbuf.bufsz = kbuf.memsz = initrd_len;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.buf_min = MIN_INITRD_LOAD_ADDR;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer();
>   if (ret)
>   goto out_free_params;
> -- 
> 2.17.0
> 

Andrew, Boris, can either of you take this patch? Without this fix we have a
regression.

Thanks
Dave
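
The failure mode described in the quoted commit message can be modeled
in a few lines of userspace C. This is only a sketch: kbuf_model,
locate_hole_model() and BUF_MEM_UNKNOWN are stand-ins invented here,
and the allocator is reduced to a bump pointer.

```c
/* Stand-in for KEXEC_BUF_MEM_UNKNOWN; the commit message says it is zero. */
#define BUF_MEM_UNKNOWN 0UL

/* Hypothetical, heavily reduced version of struct kexec_buf. */
struct kbuf_model {
	unsigned long mem;    /* chosen load address, or BUF_MEM_UNKNOWN */
	unsigned long memsz;  /* size of the segment */
};

/*
 * Pick an address only when the caller has not supplied one.
 * A stale, non-zero mem from an earlier call is taken at face value.
 */
static void locate_hole_model(struct kbuf_model *kbuf, unsigned long *next_free)
{
	if (kbuf->mem != BUF_MEM_UNKNOWN)
		return;
	kbuf->mem = *next_free;
	*next_free += kbuf->memsz;
}
```

Reusing the same structure for a second segment without resetting mem
leaves the first segment's address in place, which is the reuse the
patch guards against by assigning KEXEC_BUF_MEM_UNKNOWN before each
call.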


Re: [PATCH v3 0/6] Static calls

2019-01-14 Thread H. Peter Anvin
On 1/14/19 7:05 PM, Andy Lutomirski wrote:
> On Mon, Jan 14, 2019 at 2:55 PM H. Peter Anvin  wrote:
>>
>> I think this sequence ought to work (keep in mind we are already under a
>> mutex, so the global data is safe even if we are preempted):
> 
> I'm trying to wrap my head around this.  The states are:
> 
> 0: normal operation
> 1: writing 0xcc, can be canceled
> 2: writing final instruction.  The 0xcc was definitely synced to all CPUs.
> 3: patch is definitely installed but maybe not sync_cored.
> 

4: breakpoint has been canceled; need to redo patching.

>>
>> set up page table entries
>> invlpg
>> set up bp patching global data
>>
>> cpu = get_cpu()
>>
> So we're assuming that the state is
> 
>> bp_old_value = atomic_read(bp_write_addr)
>>
>> do {
> 
> So we're assuming that the state is 0 here.  A WARN_ON_ONCE to check
> that would be nice.

The state here can be 0 or 4.

>> atomic_write(_poke_state, 1)
>>
>> atomic_write(bp_write_addr, 0xcc)
>>
>> mask <- online_cpu_mask - self
>> send IPIs
>> wait for mask = 0
>>
>> } while (cmpxchg(_poke_state, 1, 2) != 1);
>>
>> patch sites, remove breakpoints after patching each one
> 
> Not sure what you mean by patch *sites*.  As written, this only
> supports one patch site at a time, since there's only one
> bp_write_addr, and fixing that may be complicated.  Not fixing it
> might also be a scalability problem.

Fixing it isn't all that complicated; we just need to have a list of
patch locations (which we need anyway!) and walk (or search) it instead
of checking just one; I omitted that detail for simplicity.

>> atomic_write(_poke_state, 3);
>>
>> mask <- online_cpu_mask - self
>> send IPIs
>> wait for mask = 0
>>
>> atomic_write(_poke_state, 0);
>>
>> tear down patching global data
>> tear down page table entries
>>
>>
>>
>> The #BP handler would then look like:
>>
>> state = cmpxchg(_poke_state, 1, 4);
>> switch (state) {
>> case 1:
>> case 4:
> 
> What is state 4?
> 
>> invlpg
>> cmpxchg(bp_write_addr, 0xcc, bp_old_value)

I'm 85% sure that the cmpxchg here is actually unnecessary, an
atomic_write() is sufficient.

>> break;
>> case 2:
>> invlpg
>> complete patch sequence
>> remove breakpoint
>> break;
> 
> ISTM you might as well change state to 3 here, but it's arguably unnecessary.

If and only if you have only one patch location you could, but again,
unnecessary.

>> case 3:
>> /* If we are here, the #BP will go away on its own */
>> break;
>> case 0:
>> /* No patching in progress!!! */
>> return 0;
>> }
>>
>> clear bit in mask
>> return 1;
>>
>> The IPI handler:
>>
>> clear bit in mask
>> sync_core   /* Needed if multiple IPI events are chained */
> 
> I really like that this doesn't require fixups -- text_poke_bp() just
> works.  But I'm nervous about livelocks or maybe just extreme slowness
> under nasty loads.  Suppose some perf NMI code does a static call or
> uses a static call.  Now there's a situation where, under high
> frequency perf sampling, the patch process might almost always hit the
> breakpoint while in state 1.  It'll get reversed and done again, and
> we get stuck.  It would be neat if we could get the same "no
> deadlocks" property while significantly reducing the chance of a
> rollback.

This could be as simple as spinning for a limited time waiting for
states 0 or 3 if we are not the patching CPU. It is also not necessary
to wait for the mask to become zero for the first sync if we find
ourselves suddenly in state 4.

This wouldn't reduce the livelock probability to zero, but it ought to
reduce it enough that if we really are under such heavy event load we
may end up getting stuck in any number of ways...

> This is why I proposed something where we try to guarantee forward
> progress by making sure that any NMI code that might spin and wait for
> other CPUs is guaranteed to eventually sync_core(), clear its bit, and
> possibly finish a patch.  But this is a bit gross.

Yes, this gets really grotty and who knows how many code paths it would
touch.

-hpa
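
For readers following the state numbers, here is a toy userspace model
of just the state transitions discussed above, using C11 atomics. It is
a sketch under assumptions: the enum names, the action codes and the
helper functions are invented here, and nothing about page tables,
IPIs, invlpg or the actual text patching is modeled.

```c
#include <stdatomic.h>

/* State numbers as listed in the discussion above. */
enum poke_state {
	POKE_IDLE = 0,         /* normal operation */
	POKE_WRITING_BP = 1,   /* writing 0xcc, can be canceled */
	POKE_WRITING_INSN = 2, /* writing the final instruction */
	POKE_INSTALLED = 3,    /* patch installed, maybe not yet synced */
	POKE_CANCELED = 4,     /* breakpoint canceled, patching must be redone */
};

/* What the modeled #BP handler decides to do. */
enum bp_action { BP_NOT_OURS, BP_ROLL_BACK, BP_FINISH_PATCH, BP_RETRY };

static _Atomic int poke_state_model = POKE_IDLE;

/* Kernel-style cmpxchg: returns the value seen before the operation. */
static int cmpxchg_model(_Atomic int *p, int old_val, int new_val)
{
	int expected = old_val;

	atomic_compare_exchange_strong(p, &expected, new_val);
	return expected;
}

/* #BP handler: cancel a cancelable patch, otherwise act on the state. */
static enum bp_action bp_handler_model(void)
{
	switch (cmpxchg_model(&poke_state_model, POKE_WRITING_BP, POKE_CANCELED)) {
	case POKE_WRITING_BP:
	case POKE_CANCELED:
		return BP_ROLL_BACK;    /* restore the old byte */
	case POKE_WRITING_INSN:
		return BP_FINISH_PATCH; /* complete the patch sequence */
	case POKE_INSTALLED:
		return BP_RETRY;        /* the #BP will go away on its own */
	default:
		return BP_NOT_OURS;     /* no patching in progress */
	}
}

/* Patcher's 1 -> 2 step: returns 1 on success, 0 if it has to start over. */
static int patcher_try_commit_model(void)
{
	return cmpxchg_model(&poke_state_model, POKE_WRITING_BP,
			     POKE_WRITING_INSN) == POKE_WRITING_BP;
}
```

In this model a handler that runs while the state is 1 moves it to 4,
so the patcher's later commit attempt fails and it loops. That is the
rollback path the livelock concern is about.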




[PATCH v6] ftrace: support early boot function tracing

2019-01-14 Thread Abderrahmane Benbachir
Rebasing and fixing conflicts.
Thanks.

Previous changes:
PATCH v1: Initial patch
PATCH v2:
   Removed arch specific code and use the default clock.
   Add more code re-usability
   Add HAVE_EARLY_BOOT_FTRACE config option, which will be disabled by default
PATCH v3:
   Write early boot temporary buffer to a sub-buffer instead of the global one.
   Improve Kconfig help text.
PATCH v4 : Some code refactoring.
PATCH v5 : Fix the build failures on arch i386.

Patch starts here:
--

The early boot tracing will start from the beginning of start_kernel()
and will stop at ftrace_init()

start_kernel()
{
  ftrace_early_init() <--- start early boot function tracing
  ...
  (calls)
  ...
  ftrace_init()   <--- stop early boot function tracing
  early_trace_init();
  ...
}

The events are placed in a temporary buffer, which will be copied to
the trace buffer after memory setup.

Dynamic tracing is not implemented with live patching. Instead, we use
ftrace_filter and ftrace_notrace to determine which functions are
traced or not traced; then, in the callback from the mcount hook, we do
a binary search lookup to decide which functions to save and which
to discard.
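
The lookup described above could look roughly like the following
userspace sketch. It is not the patch's code: the function names are
invented, the tables are plain sorted arrays of entry addresses, and it
assumes the usual ftrace convention that an empty filter means "trace
everything" and that notrace wins over filter.

```c
#include <stddef.h>
#include <stdlib.h>

/* bsearch() comparator over function entry addresses. */
static int cmp_addr(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/* Binary search for ip in a sorted table of n addresses. */
static int addr_in(unsigned long ip, const unsigned long *tbl, size_t n)
{
	return n != 0 && bsearch(&ip, tbl, n, sizeof(*tbl), cmp_addr) != NULL;
}

/* Decide in the hook callback whether the call at ip should be recorded. */
static int early_should_trace(unsigned long ip,
			      const unsigned long *filter, size_t nfilter,
			      const unsigned long *notrace, size_t nnotrace)
{
	if (addr_in(ip, notrace, nnotrace))
		return 0;
	if (nfilter == 0)
		return 1;
	return addr_in(ip, filter, nfilter);
}
```

Each hook invocation then costs two O(log n) lookups instead of
patching call sites, which matches the trade-off the description
names.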

Signed-off-by: Abderrahmane Benbachir 
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Mathieu Desnoyers 
Cc: Linux Kernel 
---
---
 arch/x86/Kconfig|   1 +
 arch/x86/kernel/ftrace_32.S |  45 --
 arch/x86/kernel/ftrace_64.S |  14 ++
 include/linux/ftrace.h  |  18 ++-
 init/main.c |   1 +
 kernel/trace/Kconfig|  51 +++
 kernel/trace/ftrace.c   | 293 +++-
 kernel/trace/trace.c|  41 +
 8 files changed, 452 insertions(+), 12 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 15af091611e2..1073ff11b8b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -151,6 +151,7 @@ config X86
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
+   select HAVE_EARLY_BOOT_FTRACE
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT
select HAVE_IDE
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index 4c8440de3355..a247cbf4c529 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -31,12 +31,8 @@ EXPORT_SYMBOL(mcount)
 # define MCOUNT_FRAME  0   /* using frame = false */
 #endif
 
-ENTRY(function_hook)
-   ret
-END(function_hook)
-
-ENTRY(ftrace_caller)
 
+.macro save_mcount_regs
 #ifdef USING_FRAME_POINTER
 # ifdef CC_USING_FENTRY
/*
@@ -73,11 +69,9 @@ ENTRY(ftrace_caller)
 
movlfunction_trace_op, %ecx
subl$MCOUNT_INSN_SIZE, %eax
+   .endm
 
-.globl ftrace_call
-ftrace_call:
-   callftrace_stub
-
+.macro restore_mcount_regs
addl$4, %esp/* skip NULL pointer */
popl%edx
popl%ecx
@@ -90,6 +84,39 @@ ftrace_call:
addl$4, %esp/* skip parent ip */
 # endif
 #endif
+   .endm
+
+ENTRY(function_hook)
+#ifdef CONFIG_EARLY_BOOT_FUNCTION_TRACER
+   cmpl$__PAGE_OFFSET, %esp
+   jb  early_boot_stub /* Paging not enabled yet? */
+
+   cmpl$ftrace_stub, ftrace_early_boot_trace_function
+   jnz early_boot_trace
+
+early_boot_stub:
+   ret
+
+early_boot_trace:
+   save_mcount_regs
+   call*ftrace_early_boot_trace_function
+   restore_mcount_regs
+
+   jmp early_boot_stub
+#else
+   ret
+#endif
+END(function_hook)
+
+ENTRY(ftrace_caller)
+   save_mcount_regs
+
+.globl ftrace_call
+ftrace_call:
+   callftrace_stub
+
+   restore_mcount_regs
+
 .Lftrace_ret:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .globl ftrace_graph_call
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 75f2b36b41a6..e292ba26bf1d 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -151,7 +151,21 @@ EXPORT_SYMBOL(mcount)
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 ENTRY(function_hook)
+# ifdef CONFIG_EARLY_BOOT_FUNCTION_TRACER
+   cmpq $ftrace_stub, ftrace_early_boot_trace_function
+   jnz early_boot_trace
+
+early_boot_stub:
retq
+
+early_boot_trace:
+   save_mcount_regs
+   call *ftrace_early_boot_trace_function
+   restore_mcount_regs
+   jmp early_boot_stub
+# else
+   retq
+# endif
 ENDPROC(function_hook)
 
 ENTRY(ftrace_caller)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 730876187344..09977a28341f 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -239,6 +239,18 @@ static inline void ftrace_free_init_mem(void) { }
 static inline void ftrace_free_mem(struct module *mod, void *start, void *end) 
{ }
 #endif /* CONFIG_FUNCTION_TRACER */
 
+#ifdef CONFIG_EARLY_BOOT_FUNCTION_TRACER
+extern void __init ftrace_early_boot_init(char 

Re: [PATCH v2 4/8] ASoC: imx-audmux: change snprintf to scnprintf for possible overflow

2019-01-14 Thread Gustavo A. R. Silva

Hi Willy,

On 1/14/19 9:27 PM, Willy Tarreau wrote:

From: Silvio Cesare 

Change snprintf to scnprintf. There are generally two cases where using
snprintf causes problems.

1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
In this case, if snprintf would have written more characters than what the
buffer size (SIZE) is, then size will end up larger than SIZE. In later
uses of snprintf, SIZE - size will result in a negative number, leading
to problems. Note that size might already be too large by using
size = snprintf before the code reaches a case of size += snprintf.

2) If size is ultimately used as a length parameter for a copy back to user
space, then it will potentially allow for a buffer overflow and information
disclosure when size is greater than SIZE. When the size is used to index
the buffer directly, we can have memory corruption. This also means when
size = snprintf... is used, it may also cause problems since size may become
large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
configuration.

The solution to these issues is to use scnprintf which returns the number of
characters actually written to the buffer, so the size variable will never
exceed SIZE.
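
The difference between the two return values can be reproduced in
userspace. The sketch below uses the standard snprintf() and a small
wrapper, scnprintf_model(), written here for illustration. It follows
the behavior described above (return what was actually written,
excluding the trailing NUL) and is not the kernel implementation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Like snprintf(), but returns the number of characters actually stored
 * in buf (not counting the trailing NUL), so it is always below size
 * when size is non-zero.
 */
static int scnprintf_model(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (ret < 0)
		return 0;               /* encoding error: nothing usable written */
	if ((size_t)ret < size)
		return ret;             /* everything fit */
	return size ? (int)(size - 1) : 0; /* truncated */
}

/* Append two fragments the way the patched code does: ret += ...(buf + ret, size - ret). */
static int append_twice_model(char *buf, size_t size, const char *a, const char *b)
{
	int ret;

	ret = scnprintf_model(buf, size, "%s", a);
	ret += scnprintf_model(buf + ret, size - (size_t)ret, "%s", b);
	return ret;
}
```

With an 8-byte buffer and a 10-character string, snprintf() reports 10
while the wrapper reports 7, so size - ret can never go negative in the
accumulation pattern.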

Signed-off-by: Silvio Cesare 
Cc: Timur Tabi 
Cc: Nicolin Chen 
Cc: Mark Brown 
Cc: Xiubo Li 
Cc: Fabio Estevam 
Cc: Dan Carpenter 
Cc: Kees Cook 
Cc: Will Deacon 
Cc: Greg KH 
Signed-off-by: Willy Tarreau 
Acked-by: Nicolin Chen 
Reviewed-by: Kees Cook 




You are still missing some people and mailing lists:

$ scripts/get_maintainer.pl --nokeywords --nogit --nogit-fallback 
sound/soc/fsl/imx-audmux.c
Timur Tabi  (maintainer:FREESCALE SOC SOUND DRIVERS)
Nicolin Chen  (maintainer:FREESCALE SOC SOUND DRIVERS)
Xiubo Li  (maintainer:FREESCALE SOC SOUND DRIVERS)
Fabio Estevam  (reviewer:FREESCALE SOC SOUND DRIVERS)
Liam Girdwood  (supporter:SOUND - SOC LAYER / DYNAMIC 
AUDIO POWER MANAGEM...)
Mark Brown  (supporter:SOUND - SOC LAYER / DYNAMIC AUDIO 
POWER MANAGEM...)
Jaroslav Kysela  (maintainer:SOUND)
Takashi Iwai  (maintainer:SOUND)
Shawn Guo  (maintainer:ARM/FREESCALE IMX / MXC ARM 
ARCHITECTURE)
Sascha Hauer  (maintainer:ARM/FREESCALE IMX / MXC ARM 
ARCHITECTURE)
Pengutronix Kernel Team  (reviewer:ARM/FREESCALE IMX / 
MXC ARM ARCHITECTURE)
NXP Linux Team  (reviewer:ARM/FREESCALE IMX / MXC ARM 
ARCHITECTURE)
alsa-de...@alsa-project.org (moderated list:FREESCALE SOC SOUND DRIVERS)
linuxppc-...@lists.ozlabs.org (open list:FREESCALE SOC SOUND DRIVERS)
linux-arm-ker...@lists.infradead.org (moderated list:ARM/FREESCALE IMX / MXC 
ARM ARCHITECTURE)
linux-kernel@vger.kernel.org (open list)

--
Gustavo


--

v2:
   - adjust subject line
   - added alsa-devel & Mark
   - added acked-by
   - added reviewed-by

The patches in this series were sent by Silvio to the security list
to fix a minor memory disclosure issue after this article was published:
   
   http://blog.infosectcbr.com.au/2018/11/memory-bugs-in-multiple-linux-kernel.html



  sound/soc/fsl/imx-audmux.c | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/sound/soc/fsl/imx-audmux.c b/sound/soc/fsl/imx-audmux.c
index 392d5eef356d..99e07b01a2ce 100644
--- a/sound/soc/fsl/imx-audmux.c
+++ b/sound/soc/fsl/imx-audmux.c
@@ -86,49 +86,49 @@ static ssize_t audmux_read_file(struct file *file, char 
__user *user_buf,
if (!buf)
return -ENOMEM;
  
-	ret = snprintf(buf, PAGE_SIZE, "PDCR: %08x\nPTCR: %08x\n",

+   ret = scnprintf(buf, PAGE_SIZE, "PDCR: %08x\nPTCR: %08x\n",
   pdcr, ptcr);
  
  	if (ptcr & IMX_AUDMUX_V2_PTCR_TFSDIR)

-   ret += snprintf(buf + ret, PAGE_SIZE - ret,
+   ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"TxFS output from %s, ",
audmux_port_string((ptcr >> 27) & 0x7));
else
-   ret += snprintf(buf + ret, PAGE_SIZE - ret,
+   ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"TxFS input, ");
  
  	if (ptcr & IMX_AUDMUX_V2_PTCR_TCLKDIR)

-   ret += snprintf(buf + ret, PAGE_SIZE - ret,
+   ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"TxClk output from %s",
audmux_port_string((ptcr >> 22) & 0x7));
else
-   ret += snprintf(buf + ret, PAGE_SIZE - ret,
+   ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"TxClk input");
  
-	ret += snprintf(buf + ret, PAGE_SIZE - ret, "\n");

+   ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
  
  	if (ptcr & IMX_AUDMUX_V2_PTCR_SYN) {

-   ret += snprintf(buf + ret, PAGE_SIZE - ret,
+   ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"Port is symmetric");
} else {
if (ptcr & IMX_AUDMUX_V2_PTCR_RFSDIR)
-

Re: [PATCH 8/9] xen/gntdev.c: Convert to use vm_insert_range

2019-01-14 Thread Souptick Joarder
On Tue, Jan 15, 2019 at 4:58 AM Boris Ostrovsky
 wrote:
>
> On 1/11/19 10:12 AM, Souptick Joarder wrote:
> > Convert to use vm_insert_range() to map range of kernel
> > memory to user vma.
> >
> > Signed-off-by: Souptick Joarder 
>
> Reviewed-by: Boris Ostrovsky 
>
> (although it would be good to mention in the commit that you are also
> replacing count with vma_pages(vma), and why)

The original code was using count ( *count = vma_pages(vma)* ),
which is the same as in this patch. Do I need to capture it in the change log?
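
To make the point concrete, here is a small user-space sketch of what
vma_pages() computes: the length of the mapping in pages. struct
demo_vma, demo_vma_pages() and DEMO_PAGE_SHIFT are made-up names for
illustration, and the 4 KiB page size is an assumption.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, cut-down stand-in for struct vm_area_struct. */
struct demo_vma {
	unsigned long vm_start;	/* first address in the mapping */
	unsigned long vm_end;	/* first address past the mapping */
};

/* Mirrors the shape of the kernel's vma_pages() helper. */
static unsigned long demo_vma_pages(const struct demo_vma *vma)
{
	assert(vma->vm_end >= vma->vm_start);
	return (vma->vm_end - vma->vm_start) >> DEMO_PAGE_SHIFT;
}
```

Since 'count' was initialised from exactly this value and not modified
in the lines shown, calling the helper at each use site should give the
same number.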

>
>
> > ---
> >  drivers/xen/gntdev.c | 16 ++--
> >  1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> > index b0b02a5..ca4acee 100644
> > --- a/drivers/xen/gntdev.c
> > +++ b/drivers/xen/gntdev.c
> > @@ -1082,18 +1082,17 @@ static int gntdev_mmap(struct file *flip, struct 
> > vm_area_struct *vma)
> >  {
> >   struct gntdev_priv *priv = flip->private_data;
> >   int index = vma->vm_pgoff;
> > - int count = vma_pages(vma);
> >   struct gntdev_grant_map *map;
> > - int i, err = -EINVAL;
> > + int err = -EINVAL;
> >
> >   if ((vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHARED))
> >   return -EINVAL;
> >
> >   pr_debug("map %d+%d at %lx (pgoff %lx)\n",
> > - index, count, vma->vm_start, vma->vm_pgoff);
> > + index, vma_pages(vma), vma->vm_start, vma->vm_pgoff);
> >
> >   mutex_lock(&priv->lock);
> > - map = gntdev_find_map_index(priv, index, count);
> > + map = gntdev_find_map_index(priv, index, vma_pages(vma));
> >   if (!map)
> >   goto unlock_out;
> >   if (use_ptemod && map->vma)
> > @@ -1145,12 +1144,9 @@ static int gntdev_mmap(struct file *flip, struct 
> > vm_area_struct *vma)
> >   goto out_put_map;
> >
> >   if (!use_ptemod) {
> > - for (i = 0; i < count; i++) {
> > - err = vm_insert_page(vma, vma->vm_start + i*PAGE_SIZE,
> > - map->pages[i]);
> > - if (err)
> > - goto out_put_map;
> > - }
> > + err = vm_insert_range(vma, map->pages, map->count);
> > + if (err)
> > + goto out_put_map;
> >   } else {
> >  #ifdef CONFIG_X86
> >   /*
>


[PATCH v2] Staging: fbtft: Switch to the gpio descriptor interface

2019-01-14 Thread Nishad Kamdar
This switches the fbtft driver to use GPIO descriptors
rather than numerical gpios:

Utilize the GPIO library's intrinsic handling of OF GPIOs
and polarity. If the line is flagged active low, gpiolib
will deal with this.
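
As a rough illustration of the polarity handling, the following
user-space model shows the idea: callers pass a logical value and the
active-low flag is applied in one place. struct demo_gpio_desc and
demo_gpiod_set_value() are hypothetical names for this sketch; they are
not the gpiolib types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a GPIO descriptor; not struct gpio_desc. */
struct demo_gpio_desc {
	bool active_low;	/* polarity flag, e.g. from the device tree */
	int physical_level;	/* what would be driven on the wire: 0 or 1 */
};

/*
 * Models a descriptor-style setter: 'value' is logical (1 = asserted)
 * and the polarity is applied here rather than in every driver.
 */
static void demo_gpiod_set_value(struct demo_gpio_desc *desc, int value)
{
	assert(desc != NULL);
	desc->physical_level = (!!value) ^ desc->active_low;
}
```

One thing to watch with this model: code that previously drove raw
levels, such as a reset pulse written as 0 then 1, now passes logical
values, so whether the wire goes low or high depends on the polarity
flag.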

Remove gpios from the platform device structure. Neither assign
numbers statically to gpios in the platform device nor allow
gpios to be parsed as module parameters.

Signed-off-by: Nishad Kamdar 
Changes in v2:
 - Merge all patches in a single patch. This is because the
   first patch changes par->gpio from an int to a pointer
   so all the checks have to be updated in the same patch.
   Otherwise it breaks 'git bisect'.
---
 drivers/staging/fbtft/fb_agm1264k-fl.c |  52 ++--
 drivers/staging/fbtft/fb_bd663474.c|   6 +-
 drivers/staging/fbtft/fb_ili9163.c |   6 +-
 drivers/staging/fbtft/fb_ili9320.c |   2 +-
 drivers/staging/fbtft/fb_ili9325.c |   6 +-
 drivers/staging/fbtft/fb_ili9340.c |   2 +-
 drivers/staging/fbtft/fb_pcd8544.c |   4 +-
 drivers/staging/fbtft/fb_ra8875.c  |   4 +-
 drivers/staging/fbtft/fb_s6d1121.c |   6 +-
 drivers/staging/fbtft/fb_sh1106.c  |   2 +-
 drivers/staging/fbtft/fb_ssd1289.c |   6 +-
 drivers/staging/fbtft/fb_ssd1305.c |   4 +-
 drivers/staging/fbtft/fb_ssd1306.c |   4 +-
 drivers/staging/fbtft/fb_ssd1325.c |   6 +-
 drivers/staging/fbtft/fb_ssd1331.c |  10 +-
 drivers/staging/fbtft/fb_ssd1351.c |   2 +-
 drivers/staging/fbtft/fb_tls8204.c |   6 +-
 drivers/staging/fbtft/fb_uc1611.c  |   4 +-
 drivers/staging/fbtft/fb_uc1701.c  |   6 +-
 drivers/staging/fbtft/fb_upd161704.c   |   6 +-
 drivers/staging/fbtft/fb_watterott.c   |   4 +-
 drivers/staging/fbtft/fbtft-bus.c  |   6 +-
 drivers/staging/fbtft/fbtft-core.c | 173 +++--
 drivers/staging/fbtft/fbtft-io.c   |  26 +-
 drivers/staging/fbtft/fbtft.h  |  21 +-
 drivers/staging/fbtft/fbtft_device.c   | 344 +
 drivers/staging/fbtft/flexfb.c |  12 +-
 27 files changed, 143 insertions(+), 587 deletions(-)

diff --git a/drivers/staging/fbtft/fb_agm1264k-fl.c 
b/drivers/staging/fbtft/fb_agm1264k-fl.c
index f6f30f5bf15a..8f27bd8da17d 100644
--- a/drivers/staging/fbtft/fb_agm1264k-fl.c
+++ b/drivers/staging/fbtft/fb_agm1264k-fl.c
@@ -8,7 +8,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -79,14 +79,14 @@ static int init_display(struct fbtft_par *par)
 
 static void reset(struct fbtft_par *par)
 {
-   if (par->gpio.reset == -1)
+   if (!par->gpio.reset)
return;
 
dev_dbg(par->info->device, "%s()\n", __func__);
 
-   gpio_set_value(par->gpio.reset, 0);
+   gpiod_set_value(par->gpio.reset, 0);
udelay(20);
-   gpio_set_value(par->gpio.reset, 1);
+   gpiod_set_value(par->gpio.reset, 1);
mdelay(120);
 }
 
@@ -98,30 +98,30 @@ static int verify_gpios(struct fbtft_par *par)
dev_dbg(par->info->device,
"%s()\n", __func__);
 
-   if (par->EPIN < 0) {
+   if (!par->EPIN) {
dev_err(par->info->device,
"Missing info about 'wr' (aka E) gpio. Aborting.\n");
return -EINVAL;
}
for (i = 0; i < 8; ++i) {
-   if (par->gpio.db[i] < 0) {
+   if (!par->gpio.db[i]) {
dev_err(par->info->device,
"Missing info about 'db[%i]' gpio. Aborting.\n",
i);
return -EINVAL;
}
}
-   if (par->CS0 < 0) {
+   if (!par->CS0) {
dev_err(par->info->device,
"Missing info about 'cs0' gpio. Aborting.\n");
return -EINVAL;
}
-   if (par->CS1 < 0) {
+   if (!par->CS1) {
dev_err(par->info->device,
"Missing info about 'cs1' gpio. Aborting.\n");
return -EINVAL;
}
-   if (par->RW < 0) {
+   if (!par->RW) {
dev_err(par->info->device,
"Missing info about 'rw' gpio. Aborting.\n");
return -EINVAL;
@@ -139,22 +139,22 @@ request_gpios_match(struct fbtft_par *par, const struct 
fbtft_gpio *gpio)
if (strcasecmp(gpio->name, "wr") == 0) {
/* left ks0108 E pin */
par->EPIN = gpio->gpio;
-   return GPIOF_OUT_INIT_LOW;
+   return GPIOD_OUT_LOW;
} else if (strcasecmp(gpio->name, "cs0") == 0) {
/* left ks0108 controller pin */
par->CS0 = gpio->gpio;
-   return GPIOF_OUT_INIT_HIGH;
+   return GPIOD_OUT_HIGH;
} else if (strcasecmp(gpio->name, "cs1") == 0) {
/* right ks0108 controller pin */
par->CS1 = gpio->gpio;
-   return GPIOF_OUT_INIT_HIGH;
+   return GPIOD_OUT_HIGH;
}
 
/* if write (rw = 0) 

Re: [PATCH] powerpc: PCI does not require PowerNV

2019-01-14 Thread Alexey Kardashevskiy



On 15/01/2019 11:47, Jason A. Donenfeld wrote:
> Commit 0e759bd75285 moved around the declaration of pnv_npu2_init, but
> did not conditionalize it inside of the PCI pSeries driver. This meant
> that CONFIG_PCI && CONFIG_PPC_PSERIES && !CONFIG_PPC_POWERNV resulted
> in:
> 
> powerpc64le-pc-linux-gnu-ld: arch/powerpc/platforms/pseries/pci.o: in 
> function `pSeries_final_fixup':
> pci.c:(.init.text+0x1b0): undefined reference to `pnv_npu2_init'
> 
> This commit therefore wraps that line in an ifdef, so that PCI works
> without PowerNV.
> 
> Signed-off-by: Jason A. Donenfeld 
> Fixes: 0e759bd75285 ("powerpc/powernv/npu: Move OPAL calls away from context 
> manipulation")
> Cc: Alexey Kardashevskiy 
> Cc: Michael Ellerman 



Reviewed-by: Alexey Kardashevskiy 



> ---
>  arch/powerpc/platforms/pseries/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/pci.c 
> b/arch/powerpc/platforms/pseries/pci.c
> index 7725825d887d..37a77e57893e 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -264,7 +264,9 @@ void __init pSeries_final_fixup(void)
>   if (!of_device_is_compatible(nvdn->parent,
>   "ibm,power9-npu"))
>   continue;
> +#ifdef CONFIG_PPC_POWERNV
>   WARN_ON_ONCE(pnv_npu2_init(hose));
> +#endif
>   break;
>   }
>   }
> 

-- 
Alexey
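
For comparison, another common way to avoid this kind of link error is
to give the header a static inline stub for the disabled configuration,
so callers need no #ifdef of their own. The user-space sketch below only
illustrates that idiom; DEMO_CONFIG_POWERNV, demo_npu2_init() and
demo_final_fixup() are hypothetical names, and nothing here is meant to
describe the actual header touched by this patch.

```c
#include <assert.h>

/* Hypothetical config switch; left undefined to model "not built". */
/* #define DEMO_CONFIG_POWERNV 1 */

#ifdef DEMO_CONFIG_POWERNV
int demo_npu2_init(int hose);	/* real implementation lives elsewhere */
#else
/* Stub used when the platform is not built: nothing to initialise. */
static inline int demo_npu2_init(int hose)
{
	(void)hose;
	return 0;
}
#endif

/* The caller links in both configurations without an #ifdef. */
static int demo_final_fixup(int hose)
{
	assert(hose >= 0);
	return demo_npu2_init(hose);
}
```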


KASAN: use-after-scope Read in corrupted

2019-01-14 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:1bdbe2274920 Merge tag 'vfio-v5.0-rc2' of git://github.com..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1519d39f40
kernel config:  https://syzkaller.appspot.com/x/.config?x=edf1c3031097c304
dashboard link: https://syzkaller.appspot.com/bug?extid=bd36b7dd9330f67037ab
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10fce14f40
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=110b201740

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+bd36b7dd9330f6703...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-scope in debug_lockdep_rcu_enabled.part.0+0x50/0x60  
kernel/rcu/update.c:249
Read of size 4 at addr 8880a945eabc by task  
`9��#�(�<�k���E�>9hA/-2122188634


CPU: 0 PID: -2122188634 Comm: ��E�O2� Not tainted 5.0.0-rc1+  
#19
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

[ cut here ]
Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected  
to SLAB object 'task_struct' (offset 1344, size 8)!
WARNING: CPU: 0 PID: -1455036288 at mm/usercopy.c:78  
usercopy_warn+0xeb/0x110 mm/usercopy.c:78

Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: -1455036288 Comm: ��E�O2� Not tainted 5.0.0-rc1+  
#19
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


Re: [PATCH v4 0/3] Reset controller support for i.MX8MQ

2019-01-14 Thread Andrey Smirnov
On Wed, Dec 19, 2018 at 5:07 PM Andrey Smirnov  wrote:
>
> Everyone:
>
> This patch contains changes I made in order to add support for i.MX8MQ
> to reset-imx7.c in order to enable support of PCIE IP block on i.MX8MQ
> SoCs.
>
> Feedback is welcome!
>
> Thanks,
> Andrey Smirnov
>

Philipp, are there any changes that needs to be made to this series,
or is it good enough to be accepted?

Thanks,
Andrey Smirnov


[PATCH] staging: rtl8188eu: Replace kzalloc with kcalloc

2019-01-14 Thread Gustavo A. R. Silva
Replace kzalloc() function with its 2-factor argument form, kcalloc().

This patch replaces cases of:

kzalloc(a * b, gfp)

with:
kcalloc(a, b, gfp)

This code was detected with the help of Coccinelle.
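
As background on why the two-factor form is preferred, here is a small
user-space model. An open-coded a * b wraps silently if the product
overflows, whereas a two-argument allocator can check the
multiplication and fail the request. demo_calloc_checked() and
demo_open_coded_size() are hypothetical helpers for this sketch, not
the kernel implementation. In this particular patch the factors appear
to be constants, so the change is mostly about using the consistent
form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space model of a two-factor zeroing allocator:
 * refuse the request when n * size would overflow size_t.
 */
static void *demo_calloc_checked(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* product would wrap around */
	return calloc(n, size);
}

/* What the open-coded form computes when the product wraps. */
static size_t demo_open_coded_size(size_t n, size_t size)
{
	return n * size;	/* unsigned arithmetic wraps silently */
}
```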

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/staging/rtl8188eu/core/rtw_efuse.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_efuse.c 
b/drivers/staging/rtl8188eu/core/rtw_efuse.c
index b7be71f904ed..51c3dd6d7ffb 100644
--- a/drivers/staging/rtl8188eu/core/rtw_efuse.c
+++ b/drivers/staging/rtl8188eu/core/rtw_efuse.c
@@ -88,7 +88,9 @@ efuse_phymap_to_logical(u8 *phymap, u16 _offset, u16 
_size_byte, u8  *pbuf)
if (!efuseTbl)
return;
 
-   tmp = kzalloc(EFUSE_MAX_SECTION_88E * (sizeof(void *) + 
EFUSE_MAX_WORD_UNIT * sizeof(u16)), GFP_KERNEL);
+   tmp = kcalloc(EFUSE_MAX_SECTION_88E,
+ sizeof(void *) + EFUSE_MAX_WORD_UNIT * sizeof(u16),
+ GFP_KERNEL);
if (!tmp) {
DBG_88E("%s: alloc eFuseWord fail!\n", __func__);
goto eFuseWord_failed;
-- 
2.20.1



Re: [PATCH] sbitmap: Protect swap_lock from hardirq

2019-01-14 Thread Jens Axboe
On 1/14/19 9:31 PM, Linus Torvalds wrote:
> On Tue, Jan 15, 2019 at 4:28 PM Jens Axboe  wrote:
>>
>> Thanks Ming, I'll queue this up for shipping this week.
> 
> Oops. I _just_ applied it to my tree as a follow-up to Steven's
> softirq version. I just hadn't had time to build test and push out
> yet.

No big deal, fwiw, this is what I queued up:

http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus=8218a55b6b911d396565da4ed5ca8b18bf0d38fb

which has a spelling error fixed, and the indentation a bit nicer
for the locking scenario. But I can just drop it.

-- 
Jens Axboe



[PATCH] tracing: Replace kzalloc with kcalloc

2019-01-14 Thread Gustavo A. R. Silva
Replace kzalloc() function with its 2-factor argument form, kcalloc().

This patch replaces cases of:

kzalloc(a * b, gfp)

with:
kcalloc(a, b, gfp)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
---
 kernel/trace/trace_probe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 9962cb5da8ac..57f0cbaf9c58 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -429,7 +429,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, 
ssize_t *size,
 parg->count);
}
 
-   code = tmp = kzalloc(sizeof(*code) * FETCH_INSN_MAX, GFP_KERNEL);
+   code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL);
if (!code)
return -ENOMEM;
code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
@@ -501,7 +501,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, 
ssize_t *size,
code->op = FETCH_OP_END;
 
/* Shrink down the code buffer */
-   parg->code = kzalloc(sizeof(*code) * (code - tmp + 1), GFP_KERNEL);
+   parg->code = kcalloc(code - tmp + 1, sizeof(*code), GFP_KERNEL);
if (!parg->code)
ret = -ENOMEM;
else
-- 
2.20.1



Re: [PATCH] sbitmap: Protect swap_lock from hardirq

2019-01-14 Thread Linus Torvalds
On Tue, Jan 15, 2019 at 4:28 PM Jens Axboe  wrote:
>
> Thanks Ming, I'll queue this up for shipping this week.

Oops. I _just_ applied it to my tree as a follow-up to Steven's
softirq version. I just hadn't had time to build test and push out
yet.

   Linus


Re: [PATCH] sbitmap: Protect swap_lock from hardirq

2019-01-14 Thread Jens Axboe
On 1/14/19 8:59 PM, Ming Lei wrote:
> The original report is actually one real deadlock:
> 
> [  106.132865]  Possible interrupt unsafe locking scenario:
> [  106.132865]
> [  106.133659]        CPU0                    CPU1
> [  106.134194]        ----                    ----
> [  106.134733]   lock(&(&sb->map[i].swap_lock)->rlock);
> [  106.135318]                                local_irq_disable();
> [  106.136014]                                lock(&sbq->ws[i].wait);
> [  106.136747]                                lock(&(&hctx->dispatch_wait_lock)->rlock);
> [  106.137742]   <Interrupt>
> [  106.138110]     lock(&sbq->ws[i].wait);
> 
> Because we may call blk_mq_get_driver_tag() directly from
> blk_mq_dispatch_rq_list() without holding any lock, then HARDIRQ may come
> and the above DEADLOCK is triggered.
> 
> ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq") tries to fix
> this issue by using 'spin_lock_bh', which isn't enough because we complete
> requests from hardirq context directly in the multiqueue case.

Thanks Ming, I'll queue this up for shipping this week.

-- 
Jens Axboe



Re: Real deadlock being suppressed in sbitmap

2019-01-14 Thread Steven Rostedt
On Tue, 15 Jan 2019 12:14:27 +0800
Ming Lei  wrote:

> As I mentioned, it should be fine given that it is triggered only after one
> word has been exhausted.
> 
> The lockdep warning on the latest Linus tree follows:

Thanks for following up on this. Yes, this requires the irqsave then.

-- Steve


[PATCH v9 16/22] char/nvram: Add "devname:nvram" module alias

2019-01-14 Thread Finn Thain
Signed-off-by: Finn Thain 
---
 drivers/char/nvram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index adcc213c331e..c9e295d73dc5 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -503,3 +503,4 @@ module_exit(nvram_module_exit);
 
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_MISCDEV(NVRAM_MINOR);
+MODULE_ALIAS("devname:nvram");
-- 
2.19.2



[PATCH v9 05/22] m68k/atari: Implement arch_nvram_ops struct

2019-01-14 Thread Finn Thain
By implementing an arch_nvram_ops struct, a platform can re-use the
drivers/char/nvram.c module without needing any arch-specific code
in that module. Atari does so here.
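
The pattern can be sketched in user space as follows: a table of
function pointers supplied by the platform, plus a generic wrapper that
falls back to -ENODEV when a method is absent. All demo_* names are
hypothetical. Unlike the patch, the sketch passes the ops table as a
parameter so that both the "implemented" and "absent" cases can be
exercised, and it uses off_t in place of loff_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define DEMO_NVRAM_BYTES 50

/* Hypothetical cut-down ops table, modelled on the one in this patch. */
struct demo_nvram_ops {
	ssize_t (*get_size)(void);
	ssize_t (*read)(char *buf, size_t count, off_t *ppos);
};

static unsigned char demo_store[DEMO_NVRAM_BYTES];

static ssize_t demo_get_size(void)
{
	return DEMO_NVRAM_BYTES;
}

/* Copy out at most 'count' bytes, stopping at the end of the store. */
static ssize_t demo_read(char *buf, size_t count, off_t *ppos)
{
	char *p = buf;
	off_t i;

	for (i = *ppos; count > 0 && i < DEMO_NVRAM_BYTES; --count, ++i, ++p)
		*p = (char)demo_store[i];
	*ppos = i;
	return p - buf;
}

/* A platform that implements both methods... */
static const struct demo_nvram_ops demo_full_ops = {
	.get_size = demo_get_size,
	.read     = demo_read,
};

/* ...and one that implements none. */
static const struct demo_nvram_ops demo_empty_ops = { NULL, NULL };

/* Generic wrapper: falls back to -ENODEV when the method is absent. */
static ssize_t demo_nvram_read(const struct demo_nvram_ops *ops,
			       char *buf, size_t count, off_t *ppos)
{
	assert(ops != NULL);
	if (ops->read)
		return ops->read(buf, count, ppos);
	return -ENODEV;
}
```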

Acked-by: Geert Uytterhoeven 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Added static inline wrapper functions to nvram.h.
 - Removed excess whitespace.
 - Renamed functions to avoid collisions with nvram.h wrapper functions.
 - Moved nvram_check_checksum() changes to the preceding patch.
---
 arch/m68k/atari/nvram.c | 49 +
 include/linux/nvram.h   | 14 
 2 files changed, 63 insertions(+)

diff --git a/arch/m68k/atari/nvram.c b/arch/m68k/atari/nvram.c
index 1d767847ffa6..e75adebe6e7d 100644
--- a/arch/m68k/atari/nvram.c
+++ b/arch/m68k/atari/nvram.c
@@ -74,6 +74,55 @@ static void __nvram_set_checksum(void)
__nvram_write_byte(sum, ATARI_CKS_LOC + 1);
 }
 
+static ssize_t atari_nvram_read(char *buf, size_t count, loff_t *ppos)
+{
+   char *p = buf;
+   loff_t i;
+
+   spin_lock_irq(&rtc_lock);
+   if (!__nvram_check_checksum()) {
+   spin_unlock_irq(&rtc_lock);
+   return -EIO;
+   }
+   for (i = *ppos; count > 0 && i < NVRAM_BYTES; --count, ++i, ++p)
+   *p = __nvram_read_byte(i);
+   spin_unlock_irq(&rtc_lock);
+
+   *ppos = i;
+   return p - buf;
+}
+
+static ssize_t atari_nvram_write(char *buf, size_t count, loff_t *ppos)
+{
+   char *p = buf;
+   loff_t i;
+
+   spin_lock_irq(&rtc_lock);
+   if (!__nvram_check_checksum()) {
+   spin_unlock_irq(&rtc_lock);
+   return -EIO;
+   }
+   for (i = *ppos; count > 0 && i < NVRAM_BYTES; --count, ++i, ++p)
+   __nvram_write_byte(*p, i);
+   __nvram_set_checksum();
+   spin_unlock_irq(&rtc_lock);
+
+   *ppos = i;
+   return p - buf;
+}
+
+static ssize_t atari_nvram_get_size(void)
+{
+   return NVRAM_BYTES;
+}
+
+const struct nvram_ops arch_nvram_ops = {
+   .read   = atari_nvram_read,
+   .write  = atari_nvram_write,
+   .get_size   = atari_nvram_get_size,
+};
+EXPORT_SYMBOL(arch_nvram_ops);
+
 #ifdef CONFIG_PROC_FS
 static struct {
unsigned char val;
diff --git a/include/linux/nvram.h b/include/linux/nvram.h
index eb5b52a9a747..a1e01dc89759 100644
--- a/include/linux/nvram.h
+++ b/include/linux/nvram.h
@@ -5,8 +5,18 @@
 #include 
 #include 
 
+struct nvram_ops {
+   ssize_t (*get_size)(void);
+   ssize_t (*read)(char *, size_t, loff_t *);
+   ssize_t (*write)(char *, size_t, loff_t *);
+};
+
+extern const struct nvram_ops arch_nvram_ops;
+
 static inline ssize_t nvram_get_size(void)
 {
+   if (arch_nvram_ops.get_size)
+   return arch_nvram_ops.get_size();
return -ENODEV;
 }
 
@@ -21,11 +31,15 @@ static inline void nvram_write_byte(unsigned char val, int 
addr)
 
 static inline ssize_t nvram_read(char *buf, size_t count, loff_t *ppos)
 {
+   if (arch_nvram_ops.read)
+   return arch_nvram_ops.read(buf, count, ppos);
return -ENODEV;
 }
 
 static inline ssize_t nvram_write(char *buf, size_t count, loff_t *ppos)
 {
+   if (arch_nvram_ops.write)
+   return arch_nvram_ops.write(buf, count, ppos);
return -ENODEV;
 }
 
-- 
2.19.2



[PATCH v9 02/22] m68k/atari: Move Atari-specific code out of drivers/char/nvram.c

2019-01-14 Thread Finn Thain
Move the m68k-specific code out of the driver to make the driver generic.

I've used 'SPDX-License-Identifier: GPL-2.0+' for the new file because the
old file is covered by MODULE_LICENSE("GPL").
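
One piece of Atari-specific logic that moves with this patch is the
checksum. Going by the comment in the new file (the checksum covers all
bytes except the two checksum bytes at the very end), a user-space model
might look like the sketch below. The demo_* names are invented for
illustration, and a plain array stands in for the CMOS accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CKS_RANGE_START	0
#define DEMO_CKS_RANGE_END	47
#define DEMO_CKS_LOC		48
#define DEMO_NVRAM_BYTES	50

/* Sum of the covered bytes, truncated to 8 bits. */
static unsigned char demo_sum(const unsigned char *nvram)
{
	unsigned char sum = 0;
	int i;

	assert(nvram != NULL);
	for (i = DEMO_CKS_RANGE_START; i <= DEMO_CKS_RANGE_END; ++i)
		sum += nvram[i];
	return sum;
}

/* Store the inverted sum, then the sum, in the last two bytes. */
static void demo_set_checksum(unsigned char *nvram)
{
	unsigned char sum = demo_sum(nvram);

	nvram[DEMO_CKS_LOC] = (unsigned char)~sum;
	nvram[DEMO_CKS_LOC + 1] = sum;
}

/* Valid only if both stored bytes match the recomputed sum. */
static bool demo_check_checksum(const unsigned char *nvram)
{
	unsigned char sum = demo_sum(nvram);

	return nvram[DEMO_CKS_LOC] == (~sum & 0xff) &&
	       nvram[DEMO_CKS_LOC + 1] == (sum & 0xff);
}
```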

Acked-by: Geert Uytterhoeven 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Fixed an old bug by adding a missing new line character.

Changed since v7:
 - Added SPDX-License-Identifier.
---
 arch/m68k/atari/Makefile |   2 +
 arch/m68k/atari/nvram.c  | 243 +
 drivers/char/nvram.c | 280 +--
 3 files changed, 280 insertions(+), 245 deletions(-)
 create mode 100644 arch/m68k/atari/nvram.c

diff --git a/arch/m68k/atari/Makefile b/arch/m68k/atari/Makefile
index 0cac723306f9..0b86bb6cfa87 100644
--- a/arch/m68k/atari/Makefile
+++ b/arch/m68k/atari/Makefile
@@ -6,3 +6,5 @@ obj-y   := config.o time.o debug.o ataints.o stdma.o \
atasound.o stram.o
 
 obj-$(CONFIG_ATARI_KBD_CORE)   += atakeyb.o
+
+obj-$(CONFIG_NVRAM:m=y)   += nvram.o
diff --git a/arch/m68k/atari/nvram.c b/arch/m68k/atari/nvram.c
new file mode 100644
index 000000000000..a8c457e40b0b
--- /dev/null
+++ b/arch/m68k/atari/nvram.c
@@ -0,0 +1,243 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * CMOS/NV-RAM driver for Atari. Adapted from drivers/char/nvram.c.
+ * Copyright (C) 1997 Roman Hodek 
+ * idea by and with help from Richard Jelinek 
+ * Portions copyright (c) 2001,2002 Sun Microsystems (thoc...@sun.com)
+ * Further contributions from Cesar Barros, Erik Gilling, Tim Hockin and
+ * Wim Van Sebroeck.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define NVRAM_BYTES 50
+
+/* It is worth noting that these functions all access bytes of general
+ * purpose memory in the NVRAM - that is to say, they all add the
+ * NVRAM_FIRST_BYTE offset. Pass them offsets into NVRAM as if you did not
+ * know about the RTC cruft.
+ */
+
+/* Note that *all* calls to CMOS_READ and CMOS_WRITE must be done with
+ * rtc_lock held. Due to the index-port/data-port design of the RTC, we
+ * don't want two different things trying to get to it at once. (e.g. the
+ * periodic 11 min sync from kernel/time/ntp.c vs. this driver.)
+ */
+
+unsigned char __nvram_read_byte(int i)
+{
+   return CMOS_READ(NVRAM_FIRST_BYTE + i);
+}
+
+unsigned char nvram_read_byte(int i)
+{
+   unsigned long flags;
+   unsigned char c;
+
+   spin_lock_irqsave(&rtc_lock, flags);
+   c = __nvram_read_byte(i);
+   spin_unlock_irqrestore(&rtc_lock, flags);
+   return c;
+}
+EXPORT_SYMBOL(nvram_read_byte);
+
+/* This races nicely with trying to read with checksum checking */
+void __nvram_write_byte(unsigned char c, int i)
+{
+   CMOS_WRITE(c, NVRAM_FIRST_BYTE + i);
+}
+
+void nvram_write_byte(unsigned char c, int i)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&rtc_lock, flags);
+   __nvram_write_byte(c, i);
+   spin_unlock_irqrestore(&rtc_lock, flags);
+}
+
+/* On Ataris, the checksum is over all bytes except the checksum bytes
+ * themselves; these are at the very end.
+ */
+#define ATARI_CKS_RANGE_START  0
+#define ATARI_CKS_RANGE_END    47
+#define ATARI_CKS_LOC  48
+
+int __nvram_check_checksum(void)
+{
+   int i;
+   unsigned char sum = 0;
+
+   for (i = ATARI_CKS_RANGE_START; i <= ATARI_CKS_RANGE_END; ++i)
+   sum += __nvram_read_byte(i);
+   return (__nvram_read_byte(ATARI_CKS_LOC) == (~sum & 0xff)) &&
+  (__nvram_read_byte(ATARI_CKS_LOC + 1) == (sum & 0xff));
+}
+
+int nvram_check_checksum(void)
+{
+   unsigned long flags;
+   int rv;
+
+   spin_lock_irqsave(&rtc_lock, flags);
+   rv = __nvram_check_checksum();
+   spin_unlock_irqrestore(&rtc_lock, flags);
+   return rv;
+}
+EXPORT_SYMBOL(nvram_check_checksum);
+
+static void __nvram_set_checksum(void)
+{
+   int i;
+   unsigned char sum = 0;
+
+   for (i = ATARI_CKS_RANGE_START; i <= ATARI_CKS_RANGE_END; ++i)
+   sum += __nvram_read_byte(i);
+   __nvram_write_byte(~sum, ATARI_CKS_LOC);
+   __nvram_write_byte(sum, ATARI_CKS_LOC + 1);
+}
+
+#ifdef CONFIG_PROC_FS
+static struct {
+   unsigned char val;
+   const char *name;
+} boot_prefs[] = {
+   { 0x80, "TOS" },
+   { 0x40, "ASV" },
+   { 0x20, "NetBSD (?)" },
+   { 0x10, "Linux" },
+   { 0x00, "unspecified" },
+};
+
+static const char * const languages[] = {
+   "English (US)",
+   "German",
+   "French",
+   "English (UK)",
+   "Spanish",
+   "Italian",
+   "6 (undefined)",
+   "Swiss (French)",
+   "Swiss (German)",
+};
+
+static const char * const dateformat[] = {
+   "MM%cDD%cYY",
+   "DD%cMM%cYY",
+   "YY%cMM%cDD",
+   "YY%cDD%cMM",
+   "4 (undefined)",
+   "5 (undefined)",
+   "6 (undefined)",
+   "7 (undefined)",
+};
+
+static const char * const colors[] = {
+ 

[PATCH v9 03/22] char/nvram: Re-order functions to remove forward declarations and #ifdefs

2019-01-14 Thread Finn Thain
Also give functions more sensible names: nvram_misc_* for misc device ops,
nvram_proc_* for proc file ops and nvram_module_* for init and exit
functions. This prevents name collisions with nvram.h helper functions
and improves readability.

Signed-off-by: Finn Thain 
---
 drivers/char/nvram.c | 167 +++
 1 file changed, 72 insertions(+), 95 deletions(-)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index a9d4652f9e90..c660cff9faf4 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -55,11 +55,6 @@ static int nvram_open_mode;  /* special open modes */
 #define NVRAM_WRITE 1 /* opened for writing (exclusive) */
 #define NVRAM_EXCL 2 /* opened with O_EXCL */
 
-#ifdef CONFIG_PROC_FS
-static void pc_nvram_proc_read(unsigned char *contents, struct seq_file *seq,
-  void *offset);
-#endif
-
 /*
  * These functions are provided to be called internally or by other parts of
  * the kernel. It's up to the caller to ensure correct checksum before reading
@@ -171,14 +166,14 @@ void nvram_set_checksum(void)
  * The are the file operation function for user access to /dev/nvram
  */
 
-static loff_t nvram_llseek(struct file *file, loff_t offset, int origin)
+static loff_t nvram_misc_llseek(struct file *file, loff_t offset, int origin)
 {
return generic_file_llseek_size(file, offset, origin, MAX_LFS_FILESIZE,
NVRAM_BYTES);
 }
 
-static ssize_t nvram_read(struct file *file, char __user *buf,
-   size_t count, loff_t *ppos)
+static ssize_t nvram_misc_read(struct file *file, char __user *buf,
+  size_t count, loff_t *ppos)
 {
unsigned char contents[NVRAM_BYTES];
unsigned i = *ppos;
@@ -206,8 +201,8 @@ static ssize_t nvram_read(struct file *file, char __user 
*buf,
return -EIO;
 }
 
-static ssize_t nvram_write(struct file *file, const char __user *buf,
-   size_t count, loff_t *ppos)
+static ssize_t nvram_misc_write(struct file *file, const char __user *buf,
+   size_t count, loff_t *ppos)
 {
unsigned char contents[NVRAM_BYTES];
unsigned i = *ppos;
@@ -245,8 +240,8 @@ static ssize_t nvram_write(struct file *file, const char 
__user *buf,
return -EIO;
 }
 
-static long nvram_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
+static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
+unsigned long arg)
 {
int i;
 
@@ -286,7 +281,7 @@ static long nvram_ioctl(struct file *file, unsigned int cmd,
}
 }
 
-static int nvram_open(struct inode *inode, struct file *file)
+static int nvram_misc_open(struct inode *inode, struct file *file)
 {
spin_lock(_state_lock);
 
@@ -308,7 +303,7 @@ static int nvram_open(struct inode *inode, struct file 
*file)
return 0;
 }
 
-static int nvram_release(struct inode *inode, struct file *file)
+static int nvram_misc_release(struct inode *inode, struct file *file)
 {
spin_lock(_state_lock);
 
@@ -325,87 +320,6 @@ static int nvram_release(struct inode *inode, struct file 
*file)
return 0;
 }
 
-#ifndef CONFIG_PROC_FS
-static int nvram_add_proc_fs(void)
-{
-   return 0;
-}
-
-#else
-
-static int nvram_proc_read(struct seq_file *seq, void *offset)
-{
-   unsigned char contents[NVRAM_BYTES];
-   int i = 0;
-
-   spin_lock_irq(&rtc_lock);
-   for (i = 0; i < NVRAM_BYTES; ++i)
-   contents[i] = __nvram_read_byte(i);
-   spin_unlock_irq(&rtc_lock);
-
-   pc_nvram_proc_read(contents, seq, offset);
-
-   return 0;
-}
-
-static int nvram_add_proc_fs(void)
-{
-   if (!proc_create_single("driver/nvram", 0, NULL, nvram_proc_read))
-   return -ENOMEM;
-   return 0;
-}
-
-#endif /* CONFIG_PROC_FS */
-
-static const struct file_operations nvram_fops = {
-   .owner  = THIS_MODULE,
-   .llseek = nvram_llseek,
-   .read   = nvram_read,
-   .write  = nvram_write,
-   .unlocked_ioctl = nvram_ioctl,
-   .open   = nvram_open,
-   .release= nvram_release,
-};
-
-static struct miscdevice nvram_dev = {
-   NVRAM_MINOR,
-   "nvram",
-   &nvram_fops
-};
-
-static int __init nvram_init(void)
-{
-   int ret;
-
-   ret = misc_register(&nvram_dev);
-   if (ret) {
-   printk(KERN_ERR "nvram: can't misc_register on minor=%d\n",
-   NVRAM_MINOR);
-   goto out;
-   }
-   ret = nvram_add_proc_fs();
-   if (ret) {
-   printk(KERN_ERR "nvram: can't create /proc/driver/nvram\n");
-   goto outmisc;
-   }
-   ret = 0;
-   printk(KERN_INFO "Non-volatile memory driver v" NVRAM_VERSION "\n");
-out:
-   return ret;
-outmisc:
-   

[PATCH v9 04/22] nvram: Replace nvram_* function exports with static functions

2019-01-14 Thread Finn Thain
Replace nvram_* functions with static functions in nvram.h. These will
become wrappers for struct nvram_ops method calls.

This patch effectively disables existing NVRAM functionality so as to
allow the rest of the series to be bisected without build failures.
That functionality is gradually re-implemented in subsequent patches.

Replace the sole validate-checksum-and-read-byte sequence with a call to
nvram_read() which will gain the same semantics in subsequent patches.

Remove unused exports.

Acked-by: Geert Uytterhoeven 
Signed-off-by: Finn Thain 
---
 arch/m68k/atari/nvram.c   | 39 +++
 drivers/char/nvram.c  | 27 +--
 drivers/scsi/atari_scsi.c |  8 +---
 include/linux/nvram.h | 32 +---
 4 files changed, 38 insertions(+), 68 deletions(-)

diff --git a/arch/m68k/atari/nvram.c b/arch/m68k/atari/nvram.c
index a8c457e40b0b..1d767847ffa6 100644
--- a/arch/m68k/atari/nvram.c
+++ b/arch/m68k/atari/nvram.c
@@ -34,38 +34,17 @@
  * periodic 11 min sync from kernel/time/ntp.c vs. this driver.)
  */
 
-unsigned char __nvram_read_byte(int i)
+static unsigned char __nvram_read_byte(int i)
 {
return CMOS_READ(NVRAM_FIRST_BYTE + i);
 }
 
-unsigned char nvram_read_byte(int i)
-{
-   unsigned long flags;
-   unsigned char c;
-
-   spin_lock_irqsave(&rtc_lock, flags);
-   c = __nvram_read_byte(i);
-   spin_unlock_irqrestore(&rtc_lock, flags);
-   return c;
-}
-EXPORT_SYMBOL(nvram_read_byte);
-
 /* This races nicely with trying to read with checksum checking */
-void __nvram_write_byte(unsigned char c, int i)
+static void __nvram_write_byte(unsigned char c, int i)
 {
CMOS_WRITE(c, NVRAM_FIRST_BYTE + i);
 }
 
-void nvram_write_byte(unsigned char c, int i)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&rtc_lock, flags);
-   __nvram_write_byte(c, i);
-   spin_unlock_irqrestore(&rtc_lock, flags);
-}
-
 /* On Ataris, the checksum is over all bytes except the checksum bytes
  * themselves; these are at the very end.
  */
@@ -73,7 +52,7 @@ void nvram_write_byte(unsigned char c, int i)
 #define ATARI_CKS_RANGE_END    47
 #define ATARI_CKS_LOC  48
 
-int __nvram_check_checksum(void)
+static int __nvram_check_checksum(void)
 {
int i;
unsigned char sum = 0;
@@ -84,18 +63,6 @@ int __nvram_check_checksum(void)
   (__nvram_read_byte(ATARI_CKS_LOC + 1) == (sum & 0xff));
 }
 
-int nvram_check_checksum(void)
-{
-   unsigned long flags;
-   int rv;
-
-   spin_lock_irqsave(&rtc_lock, flags);
-   rv = __nvram_check_checksum();
-   spin_unlock_irqrestore(&rtc_lock, flags);
-   return rv;
-}
-EXPORT_SYMBOL(nvram_check_checksum);
-
 static void __nvram_set_checksum(void)
 {
int i;
diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index c660cff9faf4..c98775bfd896 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -74,13 +74,12 @@ static int nvram_open_mode; /* special open modes */
  * periodic 11 min sync from kernel/time/ntp.c vs. this driver.)
  */
 
-unsigned char __nvram_read_byte(int i)
+static unsigned char __nvram_read_byte(int i)
 {
return CMOS_READ(NVRAM_FIRST_BYTE + i);
 }
-EXPORT_SYMBOL(__nvram_read_byte);
 
-unsigned char nvram_read_byte(int i)
+static unsigned char pc_nvram_read_byte(int i)
 {
unsigned long flags;
unsigned char c;
@@ -90,16 +89,14 @@ unsigned char nvram_read_byte(int i)
spin_unlock_irqrestore(&rtc_lock, flags);
return c;
 }
-EXPORT_SYMBOL(nvram_read_byte);
 
 /* This races nicely with trying to read with checksum checking (nvram_read) */
-void __nvram_write_byte(unsigned char c, int i)
+static void __nvram_write_byte(unsigned char c, int i)
 {
CMOS_WRITE(c, NVRAM_FIRST_BYTE + i);
 }
-EXPORT_SYMBOL(__nvram_write_byte);
 
-void nvram_write_byte(unsigned char c, int i)
+static void pc_nvram_write_byte(unsigned char c, int i)
 {
unsigned long flags;
 
@@ -107,14 +104,13 @@ void nvram_write_byte(unsigned char c, int i)
__nvram_write_byte(c, i);
spin_unlock_irqrestore(&rtc_lock, flags);
 }
-EXPORT_SYMBOL(nvram_write_byte);
 
 /* On PCs, the checksum is built only over bytes 2..31 */
 #define PC_CKS_RANGE_START 2
 #define PC_CKS_RANGE_END   31
 #define PC_CKS_LOC 32
 
-int __nvram_check_checksum(void)
+static int __nvram_check_checksum(void)
 {
int i;
unsigned short sum = 0;
@@ -126,19 +122,6 @@ int __nvram_check_checksum(void)
__nvram_read_byte(PC_CKS_LOC+1);
return (sum & 0xffff) == expect;
 }
-EXPORT_SYMBOL(__nvram_check_checksum);
-
-int nvram_check_checksum(void)
-{
-   unsigned long flags;
-   int rv;
-
-   spin_lock_irqsave(&rtc_lock, flags);
-   rv = __nvram_check_checksum();
-   spin_unlock_irqrestore(&rtc_lock, flags);
-   return rv;
-}
-EXPORT_SYMBOL(nvram_check_checksum);
 
 static void __nvram_set_checksum(void)
 {
diff --git a/drivers/scsi/atari_scsi.c 

[PATCH v9 08/22] char/nvram: Allow the set_checksum and initialize ioctls to be omitted

2019-01-14 Thread Finn Thain
The drivers/char/nvram.c module has previously supported only RTC "CMOS"
NVRAM, for which it provides appropriate checksum ioctls. Make these
ioctls optional so the module can be re-used with other kinds of NVRAM.

The ops struct methods that implement the ioctls now return error
codes so that a multi-platform kernel binary can do the right thing when
running on hardware without a suitable NVRAM.
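
A minimal user-space sketch of the dispatch pattern described above,
assuming nothing beyond what the commit message states: the optional
methods may be NULL, and the ioctl then falls through to -ENOTTY. The
names nvram_ops_sketch, sketch_ioctl, sketch_ok and the SKETCH_* commands
are hypothetical stand-ins, not kernel identifiers, and the locking done
by the real driver is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the optional part of struct nvram_ops. */
struct nvram_ops_sketch {
	long (*initialize)(void);   /* optional: may be NULL */
	long (*set_checksum)(void); /* optional: may be NULL */
};

/* Hypothetical command numbers, for illustration only. */
enum sketch_cmd { SKETCH_INIT, SKETCH_SETCKS, SKETCH_UNKNOWN };

/*
 * Dispatch a command. The default result is -ENOTTY, so both an unknown
 * command and a method the platform did not provide are reported the
 * same way to the caller.
 */
static long sketch_ioctl(const struct nvram_ops_sketch *ops,
			 enum sketch_cmd cmd)
{
	long ret = -ENOTTY;

	switch (cmd) {
	case SKETCH_INIT:
		if (ops->initialize != NULL)
			ret = ops->initialize();
		break;
	case SKETCH_SETCKS:
		if (ops->set_checksum != NULL)
			ret = ops->set_checksum();
		break;
	default:
		break;
	}
	return ret;
}

/* Trivial method used to populate the struct in an example. */
static long sketch_ok(void)
{
	return 0;
}
```

With both members set to sketch_ok, SKETCH_INIT and SKETCH_SETCKS return
0. With a zeroed struct, the same commands return -ENOTTY, which is the
behaviour a multi-platform binary needs on hardware without a checksum.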

Signed-off-by: Finn Thain 
---
Changed since v8:
 - Renamed nvram_* functions to avoid name collisions.
---
 drivers/char/nvram.c  | 70 ---
 include/linux/nvram.h |  2 ++
 2 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index 2df391f78986..f88ef41d0598 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -136,16 +136,25 @@ static void __nvram_set_checksum(void)
__nvram_write_byte(sum & 0xff, PC_CKS_LOC + 1);
 }
 
-#if 0
-void nvram_set_checksum(void)
+static long pc_nvram_set_checksum(void)
 {
-   unsigned long flags;
+   spin_lock_irq(&rtc_lock);
+   __nvram_set_checksum();
+   spin_unlock_irq(&rtc_lock);
+   return 0;
+}
 
-   spin_lock_irqsave(&rtc_lock, flags);
+static long pc_nvram_initialize(void)
+{
+   ssize_t i;
+
+   spin_lock_irq(&rtc_lock);
+   for (i = 0; i < NVRAM_BYTES; ++i)
+   __nvram_write_byte(0, i);
__nvram_set_checksum();
-   spin_unlock_irqrestore(&rtc_lock, flags);
+   spin_unlock_irq(&rtc_lock);
+   return 0;
 }
-#endif  /*  0  */
 
 static ssize_t pc_nvram_get_size(void)
 {
@@ -156,6 +165,8 @@ const struct nvram_ops arch_nvram_ops = {
.read_byte  = pc_nvram_read_byte,
.write_byte = pc_nvram_write_byte,
.get_size   = pc_nvram_get_size,
+   .set_checksum   = pc_nvram_set_checksum,
+   .initialize = pc_nvram_initialize,
 };
 EXPORT_SYMBOL(arch_nvram_ops);
 #endif /* CONFIG_X86 */
@@ -241,51 +252,50 @@ static ssize_t nvram_misc_write(struct file *file, const char __user *buf,
 static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
 unsigned long arg)
 {
-   int i;
+   long ret = -ENOTTY;
 
switch (cmd) {
-
case NVRAM_INIT:
/* initialize NVRAM contents and checksum */
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
 
-   mutex_lock(&nvram_mutex);
-   spin_lock_irq(&rtc_lock);
-
-   for (i = 0; i < NVRAM_BYTES; ++i)
-   __nvram_write_byte(0, i);
-   __nvram_set_checksum();
-
-   spin_unlock_irq(&rtc_lock);
-   mutex_unlock(&nvram_mutex);
-   return 0;
-
+   if (arch_nvram_ops.initialize != NULL) {
+   mutex_lock(&nvram_mutex);
+   ret = arch_nvram_ops.initialize();
+   mutex_unlock(&nvram_mutex);
+   }
+   break;
case NVRAM_SETCKS:
/* just set checksum, contents unchanged (maybe useful after
 * checksum garbaged somehow...) */
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
 
-   mutex_lock(&nvram_mutex);
-   spin_lock_irq(&rtc_lock);
-   __nvram_set_checksum();
-   spin_unlock_irq(&rtc_lock);
-   mutex_unlock(&nvram_mutex);
-   return 0;
-
-   default:
-   return -ENOTTY;
+   if (arch_nvram_ops.set_checksum != NULL) {
+   mutex_lock(&nvram_mutex);
+   ret = arch_nvram_ops.set_checksum();
+   mutex_unlock(&nvram_mutex);
+   }
+   break;
}
+   return ret;
 }
 
 static int nvram_misc_open(struct inode *inode, struct file *file)
 {
spin_lock(&nvram_state_lock);
 
+   /* Prevent multiple readers/writers if desired. */
if ((nvram_open_cnt && (file->f_flags & O_EXCL)) ||
-   (nvram_open_mode & NVRAM_EXCL) ||
-   ((file->f_mode & FMODE_WRITE) && (nvram_open_mode & NVRAM_WRITE))) {
+   (nvram_open_mode & NVRAM_EXCL)) {
+   spin_unlock(&nvram_state_lock);
+   return -EBUSY;
+   }
+
+   /* Prevent multiple writers if the set_checksum ioctl is implemented. */
+   if ((arch_nvram_ops.set_checksum != NULL) &&
+   (file->f_mode & FMODE_WRITE) && (nvram_open_mode & NVRAM_WRITE)) {
spin_unlock(&nvram_state_lock);
return -EBUSY;
}
diff --git a/include/linux/nvram.h b/include/linux/nvram.h
index bb4ea8cc6ea6..31c763087746 100644
--- a/include/linux/nvram.h
+++ b/include/linux/nvram.h
@@ -31,6 +31,8 @@ struct nvram_ops {
void (*write_byte)(unsigned char, int);
ssize_t (*read)(char *, size_t, loff_t *);
ssize_t (*write)(char *, size_t, loff_t *);
+   long (*initialize)(void);
+   long

[PATCH v9 01/22] scsi/atari_scsi: Don't select CONFIG_NVRAM

2019-01-14 Thread Finn Thain
On powerpc, setting CONFIG_NVRAM=n builds a kernel with no NVRAM support.
Setting CONFIG_NVRAM=m enables the /dev/nvram misc device module without
enabling NVRAM support in drivers. Setting CONFIG_NVRAM=y enables the
misc device (built-in) and also enables NVRAM support in drivers.

m68k shares the valkyriefb driver with powerpc, and since that driver uses
NVRAM, it is affected by CONFIG_ATARI_SCSI, because of the use of
"select NVRAM". We can avoid the "select" here, but drivers still have
to interpret the CONFIG_NVRAM symbol consistently regardless of platform.

In this patch and the subsequent fbdev driver patch, the convention is
adopted across all relevant platforms whereby NVRAM functionality gets
enabled in a given device driver when the nvram misc device is built-in
or when both drivers are modules.
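
The convention above can be written out as a small truth table. The
following is only a user-space model of what IS_REACHABLE(CONFIG_NVRAM)
evaluates to, assuming the usual Kconfig tristate values; the names
tristate, CFG_* and nvram_reachable are hypothetical and exist only for
this illustration.

```c
#include <stdbool.h>

/* Tristate value of a Kconfig symbol: n, m or y. */
enum tristate { CFG_N, CFG_M, CFG_Y };

/*
 * Model of the convention: NVRAM support is usable from a driver when
 * the nvram misc device is built-in, or when it is a module and the
 * calling driver is a module as well. A built-in driver cannot call
 * into code that only exists in a loadable module.
 */
static bool nvram_reachable(enum tristate nvram, bool caller_is_module)
{
	if (nvram == CFG_Y)
		return true;
	return nvram == CFG_M && caller_is_module;
}
```

For example, nvram_reachable(CFG_M, false) is false: a built-in
atari_scsi with CONFIG_NVRAM=m would skip the NVRAM host ID lookup
instead of failing to link.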

Acked-by: Michael Schmitz 
Signed-off-by: Finn Thain 
---
This patch temporarily disables CONFIG_NVRAM on Atari, to prevent build
failures when bisecting the rest of this patch series. It gets enabled
again with the introduction of CONFIG_HAVE_ARCH_NVRAM_OPS, once the
nvram_* global functions have been moved to an ops struct.

Changed since v8:
 - Replaced defined(CONFIG_NVRAM) with IS_REACHABLE(CONFIG_NVRAM) as
suggested by James Bottomley.
 - Changed #ifdef to if as suggested by Christophe Leroy.
---
 drivers/char/Kconfig  | 5 +
 drivers/scsi/Kconfig  | 6 +++---
 drivers/scsi/atari_scsi.c | 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 2e2ffe7010aa..a8cac68de177 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -244,7 +244,7 @@ source "drivers/char/hw_random/Kconfig"
 
 config NVRAM
tristate "/dev/nvram support"
-   depends on ATARI || X86 || GENERIC_NVRAM
+   depends on X86 || GENERIC_NVRAM
---help---
  If you say Y here and create a character special file /dev/nvram
  with major number 10 and minor number 144 using mknod ("man mknod"),
@@ -262,9 +262,6 @@ config NVRAM
  should NEVER idly tamper with it. See Ralf Brown's interrupt list
  for a guide to the use of CMOS bytes by your BIOS.
 
- On Atari machines, /dev/nvram is always configured and does not need
- to be selected.
-
  To compile this driver as a module, choose M here: the
  module will be called nvram.
 
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index f38882f6f37d..8f9d9e9fa695 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1369,14 +1369,14 @@ config ATARI_SCSI
tristate "Atari native SCSI support"
depends on ATARI && SCSI
select SCSI_SPI_ATTRS
-   select NVRAM
---help---
  If you have an Atari with built-in NCR5380 SCSI controller (TT,
  Falcon, ...) say Y to get it supported. Of course also, if you have
  a compatible SCSI controller (e.g. for Medusa).
 
- To compile this driver as a module, choose M here: the
- module will be called atari_scsi.
+ To compile this driver as a module, choose M here: the module will
+ be called atari_scsi. If you also enable NVRAM support, the SCSI
+ host's ID is taken from the setting in TT RTC NVRAM.
 
  This driver supports both styles of NCR integration into the
  system: the TT style (separate DMA), and the Falcon style (via
diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index a503dc50c4f8..78b43200c99e 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -757,7 +757,7 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
 
if (setup_hostid >= 0) {
atari_scsi_template.this_id = setup_hostid & 7;
-   } else {
+   } else if (IS_REACHABLE(CONFIG_NVRAM)) {
/* Test if a host id is set in the NVRam */
if (ATARIHW_PRESENT(TT_CLK) && nvram_check_checksum()) {
unsigned char b = nvram_read_byte(16);
-- 
2.19.2



[PATCH v9 06/22] powerpc: Replace nvram_* extern declarations with standard header

2019-01-14 Thread Finn Thain
Remove the nvram_read_byte() and nvram_write_byte() declarations in
powerpc/include/asm/nvram.h and use the cross-platform static functions
in linux/nvram.h instead.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Added nvram_read_byte() and nvram_write_byte() functions to avoid a
potential build failure during 'git bisect'.
 - Brought forward some powerpc cleanup to avoid naming collisions with
nvram.h functions.
 - Replaced the ppc_md.nvram_* method wrappers with the ones in nvram.h.
---
 arch/powerpc/include/asm/nvram.h   |  6 --
 arch/powerpc/kernel/setup_32.c | 25 +-
 drivers/char/generic_nvram.c   |  1 +
 drivers/video/fbdev/matrox/matroxfb_base.c |  2 +-
 include/linux/nvram.h  |  3 +++
 5 files changed, 6 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/nvram.h b/arch/powerpc/include/asm/nvram.h
index 09a518bb7c03..56a388da9c4f 100644
--- a/arch/powerpc/include/asm/nvram.h
+++ b/arch/powerpc/include/asm/nvram.h
@@ -98,10 +98,4 @@ extern int nvram_write_os_partition(struct nvram_os_partition *part,
unsigned int err_type,
unsigned int error_log_cnt);
 
-/* Determine NVRAM size */
-extern ssize_t nvram_get_size(void);
-
-/* Normal access to NVRAM */
-extern unsigned char nvram_read_byte(int i);
-extern void nvram_write_byte(unsigned char c, int i);
 #endif /* _ASM_POWERPC_NVRAM_H */
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 947f904688b0..f5107796e2d7 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/nvram.h>
 
 #include 
 #include 
@@ -149,30 +150,6 @@ __setup("l3cr=", ppc_setup_l3cr);
 
 #ifdef CONFIG_GENERIC_NVRAM
 
-/* Generic nvram hooks used by drivers/char/gen_nvram.c */
-unsigned char nvram_read_byte(int addr)
-{
-   if (ppc_md.nvram_read_val)
-   return ppc_md.nvram_read_val(addr);
-   return 0xff;
-}
-EXPORT_SYMBOL(nvram_read_byte);
-
-void nvram_write_byte(unsigned char val, int addr)
-{
-   if (ppc_md.nvram_write_val)
-   ppc_md.nvram_write_val(addr, val);
-}
-EXPORT_SYMBOL(nvram_write_byte);
-
-ssize_t nvram_get_size(void)
-{
-   if (ppc_md.nvram_size)
-   return ppc_md.nvram_size();
-   return -1;
-}
-EXPORT_SYMBOL(nvram_get_size);
-
 void nvram_sync(void)
 {
if (ppc_md.nvram_sync)
diff --git a/drivers/char/generic_nvram.c b/drivers/char/generic_nvram.c
index ff5394f47587..0c22b9503e84 100644
--- a/drivers/char/generic_nvram.c
+++ b/drivers/char/generic_nvram.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include <linux/nvram.h>
 #include 
 #include 
 #include 
diff --git a/drivers/video/fbdev/matrox/matroxfb_base.c b/drivers/video/fbdev/matrox/matroxfb_base.c
index 838869c6490c..0a4e5bad33f4 100644
--- a/drivers/video/fbdev/matrox/matroxfb_base.c
+++ b/drivers/video/fbdev/matrox/matroxfb_base.c
@@ -111,12 +111,12 @@
 #include "matroxfb_g450.h"
 #include 
 #include 
+#include <linux/nvram.h>
 #include 
 #include 
 
 #ifdef CONFIG_PPC_PMAC
 #include 
-unsigned char nvram_read_byte(int);
 static int default_vmode = VMODE_NVRAM;
 static int default_cmode = CMODE_NVRAM;
 #endif
diff --git a/include/linux/nvram.h b/include/linux/nvram.h
index a1e01dc89759..79431dab87a1 100644
--- a/include/linux/nvram.h
+++ b/include/linux/nvram.h
@@ -15,8 +15,11 @@ extern const struct nvram_ops arch_nvram_ops;
 
 static inline ssize_t nvram_get_size(void)
 {
+#ifdef CONFIG_PPC
+#else
if (arch_nvram_ops.get_size)
return arch_nvram_ops.get_size();
+#endif
return -ENODEV;
 }
 
-- 
2.19.2



[PATCH v9 10/22] m68k/atari: Implement arch_nvram_ops methods and enable CONFIG_HAVE_ARCH_NVRAM_OPS

2019-01-14 Thread Finn Thain
Atari RTC NVRAM uses a checksum so implement the remaining arch_nvram_ops
methods for the set_checksum and initialize ioctls. Enable
CONFIG_HAVE_ARCH_NVRAM_OPS.
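
To illustrate what the two methods do, here is a user-space sketch of
"set checksum" and "initialize" over a plain byte array. It assumes the
PC layout quoted earlier in this thread (checksum over bytes 2..31,
stored big-endian at bytes 32 and 33) rather than the Atari layout, whose
constants are not shown in full here. The array, SKETCH_NVRAM_BYTES and
all sketch_* names are hypothetical, and the rtc_lock protection used by
the real code is omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NVRAM_BYTES 64 /* hypothetical size for this sketch */
#define CKS_RANGE_START    2  /* first byte covered by the checksum */
#define CKS_RANGE_END      31 /* last byte covered by the checksum */
#define CKS_LOC            32 /* high byte here, low byte at CKS_LOC + 1 */

/* Plain array standing in for the NVRAM cells. */
static uint8_t nvram[SKETCH_NVRAM_BYTES];

/* Sum the covered range into a 16-bit value. */
static uint16_t sketch_sum(void)
{
	uint16_t sum = 0;
	int i;

	for (i = CKS_RANGE_START; i <= CKS_RANGE_END; ++i)
		sum += nvram[i];
	return sum;
}

/* Return nonzero if the stored checksum matches the contents. */
static int sketch_check_checksum(void)
{
	uint16_t expect = (uint16_t)((nvram[CKS_LOC] << 8) | nvram[CKS_LOC + 1]);

	return sketch_sum() == expect;
}

/* Recompute the checksum and store it, leaving the contents unchanged. */
static long sketch_set_checksum(void)
{
	uint16_t sum = sketch_sum();

	nvram[CKS_LOC] = (uint8_t)(sum >> 8);
	nvram[CKS_LOC + 1] = (uint8_t)(sum & 0xff);
	return 0;
}

/* Zero every byte, then store a checksum that matches the zeroed contents. */
static long sketch_initialize(void)
{
	size_t i;

	for (i = 0; i < SKETCH_NVRAM_BYTES; ++i)
		nvram[i] = 0;
	return sketch_set_checksum();
}
```

Both functions return long so that they fit the initialize and
set_checksum members of the ops struct, which report errors to the ioctl
caller.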

Acked-by: Geert Uytterhoeven 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Moved the HAVE_ARCH_NVRAM_OPS symbol to common code as suggested by
Christoph Hellwig.
 - Renamed functions to avoid name collisions with nvram.h.

Changed since v7:
 - Changed the default for CONFIG_NVRAM, because "select NVRAM" was
removed from ATARI_SCSI in patch 1.
---
 arch/Kconfig  |  3 +++
 arch/m68k/Kconfig.machine |  1 +
 arch/m68k/atari/nvram.c   | 24 
 drivers/char/Kconfig  |  3 ++-
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 4cfb6de48f79..87393fb8141c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -701,6 +701,9 @@ config HAVE_ARCH_HASH
  file which provides platform-specific implementations of some
  functions in <linux/hash.h> or fs/namei.c.
 
+config HAVE_ARCH_NVRAM_OPS
+   bool
+
 config ISA_BUS_API
def_bool ISA
 
diff --git a/arch/m68k/Kconfig.machine b/arch/m68k/Kconfig.machine
index 328ba83d735b..ad584e3eb8f7 100644
--- a/arch/m68k/Kconfig.machine
+++ b/arch/m68k/Kconfig.machine
@@ -16,6 +16,7 @@ config ATARI
bool "Atari support"
depends on MMU
select MMU_MOTOROLA if MMU
+   select HAVE_ARCH_NVRAM_OPS
help
  This option enables support for the 68000-based Atari series of
  computers (including the TT, Falcon and Medusa). If you plan to use
diff --git a/arch/m68k/atari/nvram.c b/arch/m68k/atari/nvram.c
index e75adebe6e7d..c347fd206ddf 100644
--- a/arch/m68k/atari/nvram.c
+++ b/arch/m68k/atari/nvram.c
@@ -74,6 +74,26 @@ static void __nvram_set_checksum(void)
__nvram_write_byte(sum, ATARI_CKS_LOC + 1);
 }
 
+static long atari_nvram_set_checksum(void)
+{
+   spin_lock_irq(&rtc_lock);
+   __nvram_set_checksum();
+   spin_unlock_irq(&rtc_lock);
+   return 0;
+}
+
+static long atari_nvram_initialize(void)
+{
+   loff_t i;
+
+   spin_lock_irq(&rtc_lock);
+   for (i = 0; i < NVRAM_BYTES; ++i)
+   __nvram_write_byte(0, i);
+   __nvram_set_checksum();
+   spin_unlock_irq(&rtc_lock);
+   return 0;
+}
+
 static ssize_t atari_nvram_read(char *buf, size_t count, loff_t *ppos)
 {
char *p = buf;
@@ -113,6 +133,8 @@ static ssize_t atari_nvram_write(char *buf, size_t count, loff_t *ppos)
 
 static ssize_t atari_nvram_get_size(void)
 {
+   if (!MACH_IS_ATARI)
+   return -ENODEV;
return NVRAM_BYTES;
 }
 
@@ -120,6 +142,8 @@ const struct nvram_ops arch_nvram_ops = {
.read   = atari_nvram_read,
.write  = atari_nvram_write,
.get_size   = atari_nvram_get_size,
+   .set_checksum   = atari_nvram_set_checksum,
+   .initialize = atari_nvram_initialize,
 };
 EXPORT_SYMBOL(arch_nvram_ops);
 
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index a8cac68de177..ce9979529cf3 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -244,7 +244,8 @@ source "drivers/char/hw_random/Kconfig"
 
 config NVRAM
tristate "/dev/nvram support"
-   depends on X86 || GENERIC_NVRAM
+   depends on X86 || GENERIC_NVRAM || HAVE_ARCH_NVRAM_OPS
+   default M68K
---help---
  If you say Y here and create a character special file /dev/nvram
  with major number 10 and minor number 144 using mknod ("man mknod"),
-- 
2.19.2



[PATCH v9 22/22] powerpc: Adopt nvram module for PPC64

2019-01-14 Thread Finn Thain
Adopt nvram module to reduce code duplication. This means CONFIG_NVRAM
becomes available to PPC64 builds. Previously it was only available to
PPC32 builds because it depended on CONFIG_GENERIC_NVRAM.

The IOC_NVRAM_GET_OFFSET ioctl as implemented on PPC64 validates the
offset returned by pmac_get_partition(). Do the same in the nvram module.

Note that the old PPC32 generic_nvram module lacked this test.
So when CONFIG_PPC32 && CONFIG_PPC_PMAC, the IOC_NVRAM_GET_OFFSET ioctl
would have returned 0 (always). But when CONFIG_PPC64 && CONFIG_PPC_PMAC,
the IOC_NVRAM_GET_OFFSET ioctl would have returned -1 (which is -EPERM)
when the requested partition was not found.

With this patch, the result is now -EINVAL on both PPC32 and PPC64 when
the requested PowerMac NVRAM partition is not found. This is a userspace-
visible change, in the non-existent partition case, which would be in
an error path for an IOC_NVRAM_GET_OFFSET ioctl syscall.
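
A rough sketch of the validation described above, assuming that the
partition lookup reports "not found" with a negative value. The table,
SKETCH_NPARTS and the sketch_* functions are hypothetical; the real
lookup is pmac_get_partition() and the real ioctl also copies the value
to and from userspace.

```c
#include <errno.h>

#define SKETCH_NPARTS 3 /* hypothetical number of partitions */

/* Hypothetical offsets; a negative entry means "partition not found". */
static const int sketch_offsets[SKETCH_NPARTS] = { 0x100, -1, 0x400 };

/* Stand-in for the platform lookup. */
static int sketch_get_partition(int part)
{
	return sketch_offsets[part];
}

/*
 * Store the partition offset in *out and return 0, or return -EINVAL
 * for an out-of-range index or a partition that does not exist. *out is
 * left untouched on error.
 */
static long sketch_get_offset(int part, int *out)
{
	int offset;

	if (part < 0 || part >= SKETCH_NPARTS)
		return -EINVAL;
	offset = sketch_get_partition(part);
	if (offset < 0)
		return -EINVAL;
	*out = offset;
	return 0;
}
```

The point of the check is that a missing partition and a bad index both
produce the same documented error instead of 0 or a raw -1.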

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
BTW, the IOC_NVRAM_SYNC ioctl call returns an error on PPC64. This patch
retains this behaviour though it might be better to actually perform a sync
since both PPC64 and PPC32 do implement ppc_md.nvram_sync() for Core99.

Changed since v8:
 - Dropped the arch_nvram_ops struct in favour of equivalent ppc_md
method calls. Regardless of the actual implementation, the presence of
this functionality is indicated by CONFIG_HAVE_ARCH_NVRAM_OPS=y.

Changed since v7:
 - Dropped pointless comment edit.
---
 arch/powerpc/Kconfig |   2 +-
 arch/powerpc/kernel/nvram_64.c   | 158 +--
 arch/powerpc/platforms/powermac/Makefile |   2 -
 arch/powerpc/platforms/powermac/setup.c  |   2 +-
 arch/powerpc/platforms/powermac/time.c   |   2 +-
 arch/powerpc/platforms/pseries/nvram.c   |   2 -
 drivers/char/nvram.c |   4 +
 7 files changed, 9 insertions(+), 163 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f62e6a3f9c4e..621912365508 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -178,7 +178,7 @@ config PPC
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if COMPAT
-   select HAVE_ARCH_NVRAM_OPS  if PPC32
+   select HAVE_ARCH_NVRAM_OPS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_CBPF_JIT if !PPC64
diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 38b03a330cd2..244d2462e781 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -7,12 +7,6 @@
  *  2 of the License, or (at your option) any later version.
  *
  * /dev/nvram driver for PPC64
- *
- * This perhaps should live in drivers/char
- *
- * TODO: Split the /dev/nvram part (that one can use
- *   drivers/char/generic_nvram.c) from the arch & partition
- *   parsing code.
  */
 
 #include 
@@ -714,137 +708,6 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
spin_unlock_irqrestore(&lock, flags);
 }
 
-static loff_t dev_nvram_llseek(struct file *file, loff_t offset, int origin)
-{
-   if (ppc_md.nvram_size == NULL)
-   return -ENODEV;
-   return generic_file_llseek_size(file, offset, origin, MAX_LFS_FILESIZE,
-   ppc_md.nvram_size());
-}
-
-
-static ssize_t dev_nvram_read(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
-{
-   ssize_t ret;
-   char *tmp = NULL;
-   ssize_t size;
-
-   if (!ppc_md.nvram_size) {
-   ret = -ENODEV;
-   goto out;
-   }
-
-   size = ppc_md.nvram_size();
-   if (size < 0) {
-   ret = size;
-   goto out;
-   }
-
-   if (*ppos >= size) {
-   ret = 0;
-   goto out;
-   }
-
-   count = min_t(size_t, count, size - *ppos);
-   count = min(count, PAGE_SIZE);
-
-   tmp = kmalloc(count, GFP_KERNEL);
-   if (!tmp) {
-   ret = -ENOMEM;
-   goto out;
-   }
-
-   ret = ppc_md.nvram_read(tmp, count, ppos);
-   if (ret <= 0)
-   goto out;
-
-   if (copy_to_user(buf, tmp, ret))
-   ret = -EFAULT;
-
-out:
-   kfree(tmp);
-   return ret;
-
-}
-
-static ssize_t dev_nvram_write(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
-{
-   ssize_t ret;
-   char *tmp = NULL;
-   ssize_t size;
-
-   ret = -ENODEV;
-   if (!ppc_md.nvram_size)
-   goto out;
-
-   ret = 0;
-   size = ppc_md.nvram_size();
-   if (*ppos >= size || size < 0)
-   goto out;
-
-   count = min_t(size_t, count, size - *ppos);
-   count = min(count, PAGE_SIZE);
-
-   tmp = memdup_user(buf, count);
-   if (IS_ERR(tmp)) {
-   ret = PTR_ERR(tmp);
-   

[PATCH v9 18/22] powerpc: Implement nvram ioctls

2019-01-14 Thread Finn Thain
Add the powerpc-specific ioctls to the nvram module. This allows the nvram
module to replace the generic_nvram module.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
On PPC32, the IOC_NVRAM_SYNC ioctl call always returns 0, even for those
platforms that don't implement ppc_md.nvram_sync. This patch retains
that quirk. It may be better to return an error (which is what PPC64 does).

Changed since v8:
 - Changed #else to fully specified #elif conditional.
 - Changed arch_nvram_ops method calls to ppc_md method calls.
---
 drivers/char/nvram.c  | 38 ++
 include/linux/nvram.h |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index c9e295d73dc5..944f05fddacd 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -48,6 +48,9 @@
 #include 
 #include 
 
+#ifdef CONFIG_PPC
+#include <asm/nvram.h>
+#endif
 
 static DEFINE_MUTEX(nvram_mutex);
 static DEFINE_SPINLOCK(nvram_state_lock);
@@ -283,6 +286,38 @@ static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
long ret = -ENOTTY;
 
switch (cmd) {
+#ifdef CONFIG_PPC
+   case OBSOLETE_PMAC_NVRAM_GET_OFFSET:
+   pr_warn("nvram: Using obsolete PMAC_NVRAM_GET_OFFSET ioctl\n");
+   /* fall through */
+   case IOC_NVRAM_GET_OFFSET:
+   ret = -EINVAL;
+#ifdef CONFIG_PPC_PMAC
+   if (machine_is(powermac)) {
+   int part, offset;
+
+   if (copy_from_user(&part, (void __user *)arg,
+  sizeof(part)) != 0)
+   return -EFAULT;
+   if (part < pmac_nvram_OF || part > pmac_nvram_NR)
+   return -EINVAL;
+   offset = pmac_get_partition(part);
+   if (copy_to_user((void __user *)arg,
+   &offset, sizeof(offset)) != 0)
+   return -EFAULT;
+   ret = 0;
+   }
+#endif
+   break;
+   case IOC_NVRAM_SYNC:
+   if (ppc_md.nvram_sync != NULL) {
+   mutex_lock(&nvram_mutex);
+   ppc_md.nvram_sync();
+   mutex_unlock(&nvram_mutex);
+   }
+   ret = 0;
+   break;
+#elif defined(CONFIG_X86) || defined(CONFIG_M68K)
case NVRAM_INIT:
/* initialize NVRAM contents and checksum */
if (!capable(CAP_SYS_ADMIN))
@@ -306,6 +341,7 @@ static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
mutex_unlock(&nvram_mutex);
}
break;
+#endif /* CONFIG_X86 || CONFIG_M68K */
}
return ret;
 }
@@ -321,12 +357,14 @@ static int nvram_misc_open(struct inode *inode, struct file *file)
return -EBUSY;
}
 
+#if defined(CONFIG_X86) || defined(CONFIG_M68K)
/* Prevent multiple writers if the set_checksum ioctl is implemented. */
if ((arch_nvram_ops.set_checksum != NULL) &&
(file->f_mode & FMODE_WRITE) && (nvram_open_mode & NVRAM_WRITE)) {
spin_unlock(&nvram_state_lock);
return -EBUSY;
}
+#endif
 
if (file->f_flags & O_EXCL)
nvram_open_mode |= NVRAM_EXCL;
diff --git a/include/linux/nvram.h b/include/linux/nvram.h
index 9df85703735c..9e3a957c8f1f 100644
--- a/include/linux/nvram.h
+++ b/include/linux/nvram.h
@@ -31,8 +31,10 @@ struct nvram_ops {
void (*write_byte)(unsigned char, int);
ssize_t (*read)(char *, size_t, loff_t *);
ssize_t (*write)(char *, size_t, loff_t *);
+#if defined(CONFIG_X86) || defined(CONFIG_M68K)
long (*initialize)(void);
long (*set_checksum)(void);
+#endif
 };
 
 extern const struct nvram_ops arch_nvram_ops;
-- 
2.19.2



[PATCH v9 20/22] powerpc: Enable HAVE_ARCH_NVRAM_OPS and disable GENERIC_NVRAM

2019-01-14 Thread Finn Thain
Switch PPC32 kernels from the generic_nvram module to the nvram module.

Also fix a theoretical bug where CHRP omits the chrp_nvram_init() call
when CONFIG_NVRAM_MODULE=m.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
The change in the name of the module is visible to userspace. The module
that implements /dev/nvram on PowerPC now has suitable aliases, i.e.
MODULE_ALIAS_MISCDEV(NVRAM_MINOR);
MODULE_ALIAS("devname:nvram");
so that the device special file can be automatically created and the
module automatically loaded when needed. Previously this was not the case.

Changed since v8:
 - Moved the HAVE_ARCH_NVRAM_OPS symbol to common code as suggested by
Christoph Hellwig.
 - Changed arch_nvram_ops method calls to ppc_md method calls.
 - Removed the now unused nvram_sync() export.

Changed since v7:
 - Improved Kconfig help text for CONFIG_NVRAM.
 - Changed the default for CONFIG_NVRAM, which used to be "n". This is to
reduce the risk that CONFIG_GENERIC_NVRAM=y accidentally gets changed to
CONFIG_NVRAM=n.
---
 arch/powerpc/Kconfig|  6 +-
 arch/powerpc/include/asm/nvram.h|  3 ---
 arch/powerpc/kernel/setup_32.c  | 11 ---
 arch/powerpc/platforms/chrp/Makefile|  2 +-
 arch/powerpc/platforms/chrp/setup.c |  2 +-
 arch/powerpc/platforms/powermac/setup.c |  3 +--
 drivers/char/Kconfig| 19 +--
 include/linux/nvram.h   | 20 
 8 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..f62e6a3f9c4e 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -178,6 +178,7 @@ config PPC
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if COMPAT
+   select HAVE_ARCH_NVRAM_OPS  if PPC32
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_CBPF_JIT if !PPC64
@@ -274,11 +275,6 @@ config SYSVIPC_COMPAT
depends on COMPAT && SYSVIPC
default y
 
-# All PPC32s use generic nvram driver through ppc_md
-config GENERIC_NVRAM
-   bool
-   default y if PPC32
-
 config SCHED_OMIT_FRAME_POINTER
bool
default y
diff --git a/arch/powerpc/include/asm/nvram.h b/arch/powerpc/include/asm/nvram.h
index 56a388da9c4f..629a5cdcc865 100644
--- a/arch/powerpc/include/asm/nvram.h
+++ b/arch/powerpc/include/asm/nvram.h
@@ -78,9 +78,6 @@ extern int pmac_get_partition(int partition);
 extern u8  pmac_xpram_read(int xpaddr);
 extern void pmac_xpram_write(int xpaddr, u8 data);
 
-/* Synchronize NVRAM */
-extern void nvram_sync(void);
-
 /* Initialize NVRAM OS partition */
 extern int __init nvram_init_os_partition(struct nvram_os_partition *part);
 
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index f5107796e2d7..c31082233a25 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -148,17 +148,6 @@ static int __init ppc_setup_l3cr(char *str)
 }
 __setup("l3cr=", ppc_setup_l3cr);
 
-#ifdef CONFIG_GENERIC_NVRAM
-
-void nvram_sync(void)
-{
-   if (ppc_md.nvram_sync)
-   ppc_md.nvram_sync();
-}
-EXPORT_SYMBOL(nvram_sync);
-
-#endif /* CONFIG_NVRAM */
-
 static int __init ppc_init(void)
 {
/* clear the progress line */
diff --git a/arch/powerpc/platforms/chrp/Makefile b/arch/powerpc/platforms/chrp/Makefile
index 4b3bfadc70fa..dc3465cc8bc6 100644
--- a/arch/powerpc/platforms/chrp/Makefile
+++ b/arch/powerpc/platforms/chrp/Makefile
@@ -1,3 +1,3 @@
 obj-y  += setup.o time.o pegasos_eth.o pci.o
 obj-$(CONFIG_SMP)  += smp.o
-obj-$(CONFIG_NVRAM) += nvram.o
+obj-$(CONFIG_NVRAM:m=y) += nvram.o
diff --git a/arch/powerpc/platforms/chrp/setup.c b/arch/powerpc/platforms/chrp/setup.c
index e66644e0fb40..e8e804289c8e 100644
--- a/arch/powerpc/platforms/chrp/setup.c
+++ b/arch/powerpc/platforms/chrp/setup.c
@@ -550,7 +550,7 @@ static void __init chrp_init_IRQ(void)
 static void __init
 chrp_init2(void)
 {
-#ifdef CONFIG_NVRAM
+#if IS_ENABLED(CONFIG_NVRAM)
chrp_nvram_init();
 #endif
 
diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c
index 2e8221e20ee8..b47f49cf9c4d 100644
--- a/arch/powerpc/platforms/powermac/setup.c
+++ b/arch/powerpc/platforms/powermac/setup.c
@@ -316,8 +316,7 @@ static void __init pmac_setup_arch(void)
find_via_pmu();
smu_init();
 
-#if defined(CONFIG_NVRAM) || defined(CONFIG_NVRAM_MODULE) || \
-defined(CONFIG_PPC64)
+#if IS_ENABLED(CONFIG_NVRAM) || defined(CONFIG_PPC64)
pmac_nvram_init();
 #endif
 #ifdef CONFIG_PPC32
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index ce9979529cf3..72866a004f07 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -244,25 +244,24 @@ source 

[PATCH v9 07/22] char/nvram: Adopt arch_nvram_ops

2019-01-14 Thread Finn Thain
NVRAMs on different platforms and architectures have different attributes
and access methods. E.g. some platforms have byte-at-a-time accessor
functions while others have byte-range accessor functions. Some have
checksum functionality while others do not. By calling ops struct methods
via the common wrapper functions, the nvram module and other drivers can
make use of the available NVRAM functionality in a portable way.
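
A sketch of the wrapper idea, assuming one plausible fallback order: use
the byte-range accessor when the platform provides it, otherwise loop
over the byte-at-a-time accessor. This is an illustration only, not the
code in linux/nvram.h; ops_sketch, sketch_read, sketch_read_byte and the
sample cells are hypothetical, and a plain long replaces loff_t so that
the example builds in userspace. Bounds checking against the NVRAM size
is left out for brevity.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical, reduced ops struct: every method is optional. */
struct ops_sketch {
	unsigned char (*read_byte)(int);
	ssize_t (*read)(char *, size_t, long *);
};

/*
 * Portable read wrapper. Callers do not need to know which accessor
 * style the platform implements.
 */
static ssize_t sketch_read(const struct ops_sketch *ops, char *buf,
			   size_t count, long *ppos)
{
	size_t n;

	if (ops->read)
		return ops->read(buf, count, ppos);
	if (!ops->read_byte)
		return -ENODEV;
	for (n = 0; n < count; ++n) {
		buf[n] = (char)ops->read_byte((int)*ppos);
		++*ppos;
	}
	return (ssize_t)n;
}

/* Sample byte-at-a-time backend used to exercise the fallback path. */
static const unsigned char sketch_cells[4] = { 0xde, 0xad, 0xbe, 0xef };

static unsigned char sketch_read_byte(int i)
{
	return sketch_cells[i];
}
```

A platform that only sets .read_byte still supports range reads through
sketch_read(), and a platform with neither method gets -ENODEV.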

Signed-off-by: Finn Thain 
---
It might be nice if the NVRAM Kconfig symbol depended only on
HAVE_ARCH_NVRAM_OPS and all the x86 code here were moved to arch/x86.
This driver would then be more "generic". However, that x86 code would
have to be built-in when used by thinkpad_acpi or else a new module
would have to be added to arch/x86 too. Better to avoid that bloat
because most x86 platforms won't benefit.

Changed since v8:
 - Added kernel-doc comment describing the nvram_ops methods.
 - Renamed static nvram_* functions to avoid name collisions.
 - Converted arch_nvram_ops method calls to nvram.h wrapper function calls.
---
 drivers/char/nvram.c  | 30 --
 include/linux/nvram.h | 32 
 2 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index c98775bfd896..2df391f78986 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -52,9 +52,11 @@ static DEFINE_MUTEX(nvram_mutex);
 static DEFINE_SPINLOCK(nvram_state_lock);
 static int nvram_open_cnt; /* #times opened */
 static int nvram_open_mode;/* special open modes */
+static ssize_t nvram_size;
 #define NVRAM_WRITE 1 /* opened for writing (exclusive) */
 #define NVRAM_EXCL 2 /* opened with O_EXCL */
 
+#ifdef CONFIG_X86
 /*
  * These functions are provided to be called internally or by other parts of
  * the kernel. It's up to the caller to ensure correct checksum before reading
@@ -145,6 +147,19 @@ void nvram_set_checksum(void)
 }
 #endif  /*  0  */
 
+static ssize_t pc_nvram_get_size(void)
+{
+   return NVRAM_BYTES;
+}
+
+const struct nvram_ops arch_nvram_ops = {
+   .read_byte  = pc_nvram_read_byte,
+   .write_byte = pc_nvram_write_byte,
+   .get_size   = pc_nvram_get_size,
+};
+EXPORT_SYMBOL(arch_nvram_ops);
+#endif /* CONFIG_X86 */
+
 /*
  * The are the file operation function for user access to /dev/nvram
  */
@@ -152,7 +167,7 @@ void nvram_set_checksum(void)
 static loff_t nvram_misc_llseek(struct file *file, loff_t offset, int origin)
 {
return generic_file_llseek_size(file, offset, origin, MAX_LFS_FILESIZE,
-   NVRAM_BYTES);
+   nvram_size);
 }
 
 static ssize_t nvram_misc_read(struct file *file, char __user *buf,
@@ -303,8 +318,7 @@ static int nvram_misc_release(struct inode *inode, struct file *file)
return 0;
 }
 
-#ifdef CONFIG_PROC_FS
-
+#if defined(CONFIG_X86) && defined(CONFIG_PROC_FS)
 static const char * const floppy_types[] = {
"none", "5.25'' 360k", "5.25'' 1.2M", "3.5'' 720k", "3.5'' 1.44M",
"3.5'' 2.88M", "3.5'' 2.88M"
@@ -394,7 +408,7 @@ static int nvram_proc_read(struct seq_file *seq, void *offset)
 
return 0;
 }
-#endif /* CONFIG_PROC_FS */
+#endif /* CONFIG_X86 && CONFIG_PROC_FS */
 
 static const struct file_operations nvram_misc_fops = {
.owner  = THIS_MODULE,
@@ -416,13 +430,17 @@ static int __init nvram_module_init(void)
 {
int ret;
 
+   nvram_size = nvram_get_size();
+   if (nvram_size < 0)
+   return nvram_size;
+
ret = misc_register(&nvram_misc);
if (ret) {
pr_err("nvram: can't misc_register on minor=%d\n", NVRAM_MINOR);
return ret;
}
 
-#ifdef CONFIG_PROC_FS
+#if defined(CONFIG_X86) && defined(CONFIG_PROC_FS)
if (!proc_create_single("driver/nvram", 0, NULL, nvram_proc_read)) {
pr_err("nvram: can't create /proc/driver/nvram\n");
misc_deregister(&nvram_misc);
@@ -436,7 +454,7 @@ static int __init nvram_module_init(void)
 
 static void __exit nvram_module_exit(void)
 {
-#ifdef CONFIG_PROC_FS
+#if defined(CONFIG_X86) && defined(CONFIG_PROC_FS)
remove_proc_entry("driver/nvram", NULL);
 #endif
misc_deregister(&nvram_misc);
diff --git a/include/linux/nvram.h b/include/linux/nvram.h
index 79431dab87a1..bb4ea8cc6ea6 100644
--- a/include/linux/nvram.h
+++ b/include/linux/nvram.h
@@ -5,8 +5,30 @@
 #include 
 #include 
 
+/**
+ * struct nvram_ops - NVRAM functionality made available to drivers
+ * @read: validate checksum (if any) then load a range of bytes from NVRAM
+ * @write: store a range of bytes to NVRAM then update checksum (if any)
+ * @read_byte: load a single byte from NVRAM
+ * @write_byte: store a single byte to NVRAM
+ * @get_size: return the fixed number of bytes in the NVRAM
+ *
+ * Architectures which provide an nvram ops struct need not implement all
+ * of these methods. If 

[PATCH v9 17/22] powerpc: Define missing ppc_md.nvram_size for CHRP and PowerMac

2019-01-14 Thread Finn Thain
Add the nvram_size() function to those PowerPC platforms that don't already
have one: CHRP and PowerMac. This means that the ppc_md.nvram_size()
function can be called by nvram_get_size().

Since we are addressing CHRP inconsistencies here, rename chrp_nvram_read
and chrp_nvram_write, which break the naming convention used across
powerpc platforms for NVRAM accessor functions.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Renamed functions to correspond with ppc_md member names.
---
 arch/powerpc/platforms/chrp/nvram.c | 14 ++
 arch/powerpc/platforms/powermac/nvram.c |  9 +
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/chrp/nvram.c b/arch/powerpc/platforms/chrp/nvram.c
index 791b86398e1d..37ac20ccbb19 100644
--- a/arch/powerpc/platforms/chrp/nvram.c
+++ b/arch/powerpc/platforms/chrp/nvram.c
@@ -24,7 +24,7 @@ static unsigned int nvram_size;
 static unsigned char nvram_buf[4];
 static DEFINE_SPINLOCK(nvram_lock);
 
-static unsigned char chrp_nvram_read(int addr)
+static unsigned char chrp_nvram_read_val(int addr)
 {
unsigned int done;
unsigned long flags;
@@ -46,7 +46,7 @@ static unsigned char chrp_nvram_read(int addr)
return ret;
 }
 
-static void chrp_nvram_write(int addr, unsigned char val)
+static void chrp_nvram_write_val(int addr, unsigned char val)
 {
unsigned int done;
unsigned long flags;
@@ -64,6 +64,11 @@ static void chrp_nvram_write(int addr, unsigned char val)
spin_unlock_irqrestore(&nvram_lock, flags);
 }
 
+static ssize_t chrp_nvram_size(void)
+{
+   return nvram_size;
+}
+
 void __init chrp_nvram_init(void)
 {
struct device_node *nvram;
@@ -85,8 +90,9 @@ void __init chrp_nvram_init(void)
printk(KERN_INFO "CHRP nvram contains %u bytes\n", nvram_size);
of_node_put(nvram);
 
-   ppc_md.nvram_read_val = chrp_nvram_read;
-   ppc_md.nvram_write_val = chrp_nvram_write;
+   ppc_md.nvram_read_val  = chrp_nvram_read_val;
+   ppc_md.nvram_write_val = chrp_nvram_write_val;
+   ppc_md.nvram_size  = chrp_nvram_size;
 
return;
 }
diff --git a/arch/powerpc/platforms/powermac/nvram.c b/arch/powerpc/platforms/powermac/nvram.c
index ae54d7fe68f3..9360cdc408c1 100644
--- a/arch/powerpc/platforms/powermac/nvram.c
+++ b/arch/powerpc/platforms/powermac/nvram.c
@@ -147,6 +147,11 @@ static ssize_t core99_nvram_size(void)
 static volatile unsigned char __iomem *nvram_addr;
 static int nvram_mult;
 
+static ssize_t ppc32_nvram_size(void)
+{
+   return NVRAM_SIZE;
+}
+
 static unsigned char direct_nvram_read_byte(int addr)
 {
return in_8(&nvram_data[(addr & (NVRAM_SIZE - 1)) * nvram_mult]);
@@ -590,21 +595,25 @@ int __init pmac_nvram_init(void)
nvram_mult = 1;
ppc_md.nvram_read_val   = direct_nvram_read_byte;
ppc_md.nvram_write_val  = direct_nvram_write_byte;
+   ppc_md.nvram_size   = ppc32_nvram_size;
} else if (nvram_naddrs == 1) {
nvram_data = ioremap(r1.start, s1);
nvram_mult = (s1 + NVRAM_SIZE - 1) / NVRAM_SIZE;
ppc_md.nvram_read_val   = direct_nvram_read_byte;
ppc_md.nvram_write_val  = direct_nvram_write_byte;
+   ppc_md.nvram_size   = ppc32_nvram_size;
} else if (nvram_naddrs == 2) {
nvram_addr = ioremap(r1.start, s1);
nvram_data = ioremap(r2.start, s2);
ppc_md.nvram_read_val   = indirect_nvram_read_byte;
ppc_md.nvram_write_val  = indirect_nvram_write_byte;
+   ppc_md.nvram_size   = ppc32_nvram_size;
} else if (nvram_naddrs == 0 && sys_ctrler == SYS_CTRLER_PMU) {
 #ifdef CONFIG_ADB_PMU
nvram_naddrs = -1;
ppc_md.nvram_read_val   = pmu_nvram_read_byte;
ppc_md.nvram_write_val  = pmu_nvram_write_byte;
+   ppc_md.nvram_size   = ppc32_nvram_size;
 #endif /* CONFIG_ADB_PMU */
} else {
printk(KERN_ERR "Incompatible type of NVRAM\n");
-- 
2.19.2
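
The change above lets a generic caller query the NVRAM size through one optional platform hook. A minimal user-space sketch of that pattern follows. The names `struct machdep_sketch`, `example_nvram_size()` and `sketch_nvram_get_size()` are hypothetical stand-ins rather than the kernel's ppc_md or nvram.h definitions, the size 8192 is arbitrary, and the -ENODEV fallback is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the platform hook table (ppc_md in the kernel). */
struct machdep_sketch {
	ssize_t (*nvram_size)(void);	/* optional; NULL when not provided */
};

/* Example platform callback; 8192 is an arbitrary illustrative size. */
static ssize_t example_nvram_size(void)
{
	return 8192;
}

/*
 * Generic wrapper: call the hook if the platform set one, otherwise
 * report that no NVRAM is available (assumed here to be -ENODEV).
 */
static ssize_t sketch_nvram_get_size(const struct machdep_sketch *md)
{
	if (md->nvram_size)
		return md->nvram_size();
	return -ENODEV;
}
```

With every platform filling in the hook, the generic wrapper never needs a per-platform special case.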



[PATCH v9 15/22] m68k: Dispatch nvram_ops calls to Atari or Mac functions

2019-01-14 Thread Finn Thain
A multi-platform kernel binary has to decide at run-time how to dispatch
the arch_nvram_ops calls. Add a platform-independent arch_nvram_ops
struct for this, to replace the atari-specific one.

Enable CONFIG_HAVE_ARCH_NVRAM_OPS for Macs.
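
As an illustration only, run-time dispatch of this kind can be sketched in user-space C as below. The enum, the `sketch_*` functions and the sizes are hypothetical; the kernel tests the machine type with its own MACH_IS_* macros, and the -ENODEV default is an assumption.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical run-time platform identifier. */
enum sketch_machine { SKETCH_ATARI, SKETCH_MAC, SKETCH_OTHER };

static enum sketch_machine sketch_current = SKETCH_OTHER;

/* Illustrative per-platform sizes, not the real hardware values. */
static ssize_t sketch_atari_get_size(void) { return 50; }
static ssize_t sketch_mac_get_size(void)   { return 256; }

/* Platform-independent entry point that picks a backend at run time. */
static ssize_t sketch_get_size(void)
{
	switch (sketch_current) {
	case SKETCH_ATARI:
		return sketch_atari_get_size();
	case SKETCH_MAC:
		return sketch_mac_get_size();
	default:
		return -ENODEV;
	}
}
```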

Acked-by: Geert Uytterhoeven 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Adopted nvram_read_bytes() and nvram_write_bytes() where possible.
---
 arch/m68k/Kconfig.machine |  1 +
 arch/m68k/atari/nvram.c   | 21 ++--
 arch/m68k/include/asm/atarihw.h   |  6 +++
 arch/m68k/include/asm/macintosh.h |  4 ++
 arch/m68k/kernel/setup_mm.c   | 82 ++-
 arch/m68k/mac/misc.c  | 11 +
 6 files changed, 108 insertions(+), 17 deletions(-)

diff --git a/arch/m68k/Kconfig.machine b/arch/m68k/Kconfig.machine
index ad584e3eb8f7..c01e103492fd 100644
--- a/arch/m68k/Kconfig.machine
+++ b/arch/m68k/Kconfig.machine
@@ -27,6 +27,7 @@ config MAC
bool "Macintosh support"
depends on MMU
select MMU_MOTOROLA if MMU
+   select HAVE_ARCH_NVRAM_OPS
help
  This option enables support for the Apple Macintosh series of
  computers (yes, there is experimental support now, at least for part
diff --git a/arch/m68k/atari/nvram.c b/arch/m68k/atari/nvram.c
index c347fd206ddf..7000d2443aa3 100644
--- a/arch/m68k/atari/nvram.c
+++ b/arch/m68k/atari/nvram.c
@@ -74,7 +74,7 @@ static void __nvram_set_checksum(void)
__nvram_write_byte(sum, ATARI_CKS_LOC + 1);
 }
 
-static long atari_nvram_set_checksum(void)
+long atari_nvram_set_checksum(void)
 {
spin_lock_irq(_lock);
__nvram_set_checksum();
@@ -82,7 +82,7 @@ static long atari_nvram_set_checksum(void)
return 0;
 }
 
-static long atari_nvram_initialize(void)
+long atari_nvram_initialize(void)
 {
loff_t i;
 
@@ -94,7 +94,7 @@ static long atari_nvram_initialize(void)
return 0;
 }
 
-static ssize_t atari_nvram_read(char *buf, size_t count, loff_t *ppos)
+ssize_t atari_nvram_read(char *buf, size_t count, loff_t *ppos)
 {
char *p = buf;
loff_t i;
@@ -112,7 +112,7 @@ static ssize_t atari_nvram_read(char *buf, size_t count, loff_t *ppos)
return p - buf;
 }
 
-static ssize_t atari_nvram_write(char *buf, size_t count, loff_t *ppos)
+ssize_t atari_nvram_write(char *buf, size_t count, loff_t *ppos)
 {
char *p = buf;
loff_t i;
@@ -131,22 +131,11 @@ static ssize_t atari_nvram_write(char *buf, size_t count, loff_t *ppos)
return p - buf;
 }
 
-static ssize_t atari_nvram_get_size(void)
+ssize_t atari_nvram_get_size(void)
 {
-   if (!MACH_IS_ATARI)
-   return -ENODEV;
return NVRAM_BYTES;
 }
 
-const struct nvram_ops arch_nvram_ops = {
-   .read   = atari_nvram_read,
-   .write  = atari_nvram_write,
-   .get_size   = atari_nvram_get_size,
-   .set_checksum   = atari_nvram_set_checksum,
-   .initialize = atari_nvram_initialize,
-};
-EXPORT_SYMBOL(arch_nvram_ops);
-
 #ifdef CONFIG_PROC_FS
 static struct {
unsigned char val;
diff --git a/arch/m68k/include/asm/atarihw.h b/arch/m68k/include/asm/atarihw.h
index 9000b249d225..533008262b69 100644
--- a/arch/m68k/include/asm/atarihw.h
+++ b/arch/m68k/include/asm/atarihw.h
@@ -33,6 +33,12 @@ extern int atari_dont_touch_floppy_select;
 
 extern int atari_SCC_reset_done;
 
+extern ssize_t atari_nvram_read(char *, size_t, loff_t *);
+extern ssize_t atari_nvram_write(char *, size_t, loff_t *);
+extern ssize_t atari_nvram_get_size(void);
+extern long atari_nvram_set_checksum(void);
+extern long atari_nvram_initialize(void);
+
 /* convenience macros for testing machine type */
 #define MACH_IS_ST ((atari_mch_cookie >> 16) == ATARI_MCH_ST)
 #define MACH_IS_STE((atari_mch_cookie >> 16) == ATARI_MCH_STE && \
diff --git a/arch/m68k/include/asm/macintosh.h b/arch/m68k/include/asm/macintosh.h
index 08cee11180e6..d9a08bed4b12 100644
--- a/arch/m68k/include/asm/macintosh.h
+++ b/arch/m68k/include/asm/macintosh.h
@@ -19,6 +19,10 @@ extern void mac_init_IRQ(void);
 extern void mac_irq_enable(struct irq_data *data);
 extern void mac_irq_disable(struct irq_data *data);
 
+extern unsigned char mac_pram_read_byte(int);
+extern void mac_pram_write_byte(unsigned char, int);
+extern ssize_t mac_pram_get_size(void);
+
 /*
  * Macintosh Table
  */
diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
index ad0195cbe042..528484feff80 100644
--- a/arch/m68k/kernel/setup_mm.c
+++ b/arch/m68k/kernel/setup_mm.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -37,13 +38,14 @@
 #ifdef CONFIG_AMIGA
 #include 
 #endif
-#ifdef CONFIG_ATARI
 #include 
+#ifdef CONFIG_ATARI
 #include 
 #endif
 #ifdef CONFIG_SUN3X
 #include 
 #endif
+#include 
 #include 
 
 #if !FPSTATESIZE || !NR_IRQS
@@ -547,3 +549,81 @@ static int __init adb_probe_sync_enable (char *str) {
 
 __setup("adb_sync", 

[PATCH v9 11/22] m68k/mac: Adopt naming and calling conventions for PRAM routines

2019-01-14 Thread Finn Thain
Adopt the existing *_read_byte and *_write_byte naming convention.
Rename via_pram_readbyte and via_pram_writebyte to avoid confusion.
Adjust calling conventions of mac_pram_* functions to match the
struct nvram_ops methods.

Acked-by: Geert Uytterhoeven 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v7:
 - Removed some gratuitous function pointers.
---
 arch/m68k/mac/misc.c | 61 +---
 1 file changed, 23 insertions(+), 38 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index 71c4735a31ee..78c807025436 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -37,7 +37,7 @@
 static void (*rom_reset)(void);
 
 #ifdef CONFIG_ADB_CUDA
-static __u8 cuda_read_pram(int offset)
+static unsigned char cuda_pram_read_byte(int offset)
 {
struct adb_request req;
 
@@ -49,7 +49,7 @@ static __u8 cuda_read_pram(int offset)
return req.reply[3];
 }
 
-static void cuda_write_pram(int offset, __u8 data)
+static void cuda_pram_write_byte(unsigned char data, int offset)
 {
struct adb_request req;
 
@@ -62,7 +62,7 @@ static void cuda_write_pram(int offset, __u8 data)
 #endif /* CONFIG_ADB_CUDA */
 
 #ifdef CONFIG_ADB_PMU
-static __u8 pmu_read_pram(int offset)
+static unsigned char pmu_pram_read_byte(int offset)
 {
struct adb_request req;
 
@@ -74,7 +74,7 @@ static __u8 pmu_read_pram(int offset)
return req.reply[3];
 }
 
-static void pmu_write_pram(int offset, __u8 data)
+static void pmu_pram_write_byte(unsigned char data, int offset)
 {
struct adb_request req;
 
@@ -93,7 +93,7 @@ static void pmu_write_pram(int offset, __u8 data)
  * the RTC should be enabled.
  */
 
-static __u8 via_pram_readbyte(void)
+static __u8 via_rtc_recv(void)
 {
int i, reg;
__u8 data;
@@ -120,7 +120,7 @@ static __u8 via_pram_readbyte(void)
return data;
 }
 
-static void via_pram_writebyte(__u8 data)
+static void via_rtc_send(__u8 data)
 {
int i, reg, bit;
 
@@ -157,17 +157,17 @@ static void via_pram_command(int command, __u8 *data)
via1[vBufB] = (via1[vBufB] | VIA1B_vRTCClk) & ~VIA1B_vRTCEnb;
 
if (command & 0xFF00) { /* extended (two-byte) command */
-   via_pram_writebyte((command & 0xFF00) >> 8);
-   via_pram_writebyte(command & 0xFF);
+   via_rtc_send((command & 0xFF00) >> 8);
+   via_rtc_send(command & 0xFF);
is_read = command & 0x8000;
} else {/* one-byte command */
-   via_pram_writebyte(command);
+   via_rtc_send(command);
is_read = command & 0x80;
}
if (is_read) {
-   *data = via_pram_readbyte();
+   *data = via_rtc_recv();
} else {
-   via_pram_writebyte(*data);
+   via_rtc_send(*data);
}
 
/* All done, disable the RTC */
@@ -177,12 +177,12 @@ static void via_pram_command(int command, __u8 *data)
local_irq_restore(flags);
 }
 
-static __u8 via_read_pram(int offset)
+static unsigned char via_pram_read_byte(int offset)
 {
return 0;
 }
 
-static void via_write_pram(int offset, __u8 data)
+static void via_pram_write_byte(unsigned char data, int offset)
 {
 }
 
@@ -326,63 +326,48 @@ static void cuda_shutdown(void)
  *---
  */
 
-void mac_pram_read(int offset, __u8 *buffer, int len)
+unsigned char mac_pram_read_byte(int addr)
 {
-   __u8 (*func)(int);
-   int i;
-
switch (macintosh_config->adb_type) {
case MAC_ADB_IOP:
case MAC_ADB_II:
case MAC_ADB_PB1:
-   func = via_read_pram;
-   break;
+   return via_pram_read_byte(addr);
 #ifdef CONFIG_ADB_CUDA
case MAC_ADB_EGRET:
case MAC_ADB_CUDA:
-   func = cuda_read_pram;
-   break;
+   return cuda_pram_read_byte(addr);
 #endif
 #ifdef CONFIG_ADB_PMU
case MAC_ADB_PB2:
-   func = pmu_read_pram;
-   break;
+   return pmu_pram_read_byte(addr);
 #endif
default:
-   return;
-   }
-   for (i = 0 ; i < len ; i++) {
-   buffer[i] = (*func)(offset++);
+   return 0xFF;
}
 }
 
-void mac_pram_write(int offset, __u8 *buffer, int len)
+void mac_pram_write_byte(unsigned char val, int addr)
 {
-   void (*func)(int, __u8);
-   int i;
-
switch (macintosh_config->adb_type) {
case MAC_ADB_IOP:
case MAC_ADB_II:
case MAC_ADB_PB1:
-   func = via_write_pram;
+   via_pram_write_byte(val, addr);
break;
 #ifdef CONFIG_ADB_CUDA
case MAC_ADB_EGRET:
case MAC_ADB_CUDA:
-   func = cuda_write_pram;
+   cuda_pram_write_byte(val, addr);
break;
 #endif
 #ifdef CONFIG_ADB_PMU
case 

[PATCH v9 14/22] macintosh/via-cuda: Don't rely on Cuda to end a transfer

2019-01-14 Thread Finn Thain
Certain Cuda transfers have to be ended by the driver. According
to Apple's open source Cuda driver, as found in mkLinux and XNU, this
applies to any "open ended request such as PRAM read". This fixes an
infinite polling loop in cuda_pram_read_byte().

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/macintosh/via-cuda.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/via-cuda.c b/drivers/macintosh/via-cuda.c
index bbec6ac0a966..3581abfb0c6a 100644
--- a/drivers/macintosh/via-cuda.c
+++ b/drivers/macintosh/via-cuda.c
@@ -569,6 +569,7 @@ cuda_interrupt(int irq, void *arg)
 unsigned char ibuf[16];
 int ibuf_len = 0;
 int complete = 0;
+bool full;
 
 spin_lock_irqsave(&cuda_lock, flags);
 
@@ -656,12 +657,13 @@ cuda_interrupt(int irq, void *arg)
break;
 
 case reading:
-   if (reading_reply ? ARRAY_FULL(current_req->reply, reply_ptr)
- : ARRAY_FULL(cuda_rbuf, reply_ptr))
+   full = reading_reply ? ARRAY_FULL(current_req->reply, reply_ptr)
+: ARRAY_FULL(cuda_rbuf, reply_ptr);
+   if (full)
(void)in_8(&via[SR]);
else
*reply_ptr++ = in_8(&via[SR]);
-   if (!TREQ_asserted(status)) {
+   if (!TREQ_asserted(status) || full) {
if (mcu_is_egret)
assert_TACK();
/* that's all folks */
-- 
2.19.2
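
The termination rule in the hunk above can be modelled in isolation. The sketch below is a simplified user-space model, not the driver code: `struct sketch_reader`, `sketch_read_step()` and `REPLY_MAX` are hypothetical, and the hardware handshake is reduced to a boolean for the TREQ line.

```c
#include <stdbool.h>
#include <stddef.h>

#define REPLY_MAX 4	/* illustrative buffer size, not the driver's */

struct sketch_reader {
	unsigned char reply[REPLY_MAX];
	size_t len;		/* bytes stored so far */
};

/*
 * Handle one incoming byte. Returns true when the driver should end the
 * transfer: either the device dropped its request line, or the buffer
 * was already full when this byte arrived.
 */
static bool sketch_read_step(struct sketch_reader *r, unsigned char byte,
			     bool treq_asserted)
{
	bool full = (r->len == REPLY_MAX);

	if (!full)
		r->reply[r->len++] = byte;
	/* else: the byte is read and discarded */

	return !treq_asserted || full;
}
```

In this model a device that never drops TREQ still sees the transfer ended, on the first byte that arrives after the buffer has filled. Without the `|| full` term the loop would only stop when the device chose to.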



[PATCH v9 13/22] m68k/mac: Fix PRAM accessors

2019-01-14 Thread Finn Thain
PMU-based m68k Macs pre-date PowerMac-style NVRAM. Use the appropriate
PMU commands. Also implement the missing XPRAM accessors for VIA-based
Macs.

Acked-by: Geert Uytterhoeven 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v7:
 - Revised PMU response decoding due to via-pmu68k driver replacement.
---
 arch/m68k/mac/misc.c | 43 ++--
 include/uapi/linux/pmu.h |  2 ++
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index af000a015f68..d016ca2e0d10 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -66,23 +66,22 @@ static unsigned char pmu_pram_read_byte(int offset)
 {
struct adb_request req;
 
-   if (pmu_request(&req, NULL, 3, PMU_READ_NVRAM,
-   (offset >> 8) & 0xFF, offset & 0xFF) < 0)
+   if (pmu_request(&req, NULL, 3, PMU_READ_XPRAM,
+   offset & 0xFF, 1) < 0)
return 0;
-   while (!req.complete)
-   pmu_poll();
-   return req.reply[3];
+   pmu_wait_complete(&req);
+
+   return req.reply[0];
 }
 
 static void pmu_pram_write_byte(unsigned char data, int offset)
 {
struct adb_request req;
 
-   if (pmu_request(&req, NULL, 4, PMU_WRITE_NVRAM,
-   (offset >> 8) & 0xFF, offset & 0xFF, data) < 0)
+   if (pmu_request(&req, NULL, 4, PMU_WRITE_XPRAM,
+   offset & 0xFF, 1, data) < 0)
return;
-   while (!req.complete)
-   pmu_poll();
+   pmu_wait_complete(&req);
 }
 #endif /* CONFIG_ADB_PMU */
 
@@ -151,6 +150,16 @@ static void via_rtc_send(__u8 data)
 #define RTC_REG_SECONDS_3   3
 #define RTC_REG_WRITE_PROTECT   13
 
+/*
+ * Inside Mac has no information about two-byte RTC commands but
+ * the MAME/MESS source code has the essentials.
+ */
+
+#define RTC_REG_XPRAM   14
+#define RTC_CMD_XPRAM_READ  (RTC_CMD_READ(RTC_REG_XPRAM) << 8)
+#define RTC_CMD_XPRAM_WRITE (RTC_CMD_WRITE(RTC_REG_XPRAM) << 8)
+#define RTC_CMD_XPRAM_ARG(a)    (((a & 0xE0) << 3) | ((a & 0x1F) << 2))
+
 /*
  * Execute a VIA PRAM/RTC command. For read commands
  * data should point to a one-byte buffer for the
@@ -198,11 +207,25 @@ static void via_rtc_command(int command, __u8 *data)
 
 static unsigned char via_pram_read_byte(int offset)
 {
-   return 0;
+   unsigned char temp;
+
+   via_rtc_command(RTC_CMD_XPRAM_READ | RTC_CMD_XPRAM_ARG(offset), &temp);
+
+   return temp;
 }
 
 static void via_pram_write_byte(unsigned char data, int offset)
 {
+   unsigned char temp;
+
+   temp = 0x55;
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_WRITE_PROTECT), &temp);
+
+   temp = data;
+   via_rtc_command(RTC_CMD_XPRAM_WRITE | RTC_CMD_XPRAM_ARG(offset), &temp);
+
+   temp = 0x55 | RTC_FLG_WRITE_PROTECT;
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_WRITE_PROTECT), &temp);
 }
 
 /*
diff --git a/include/uapi/linux/pmu.h b/include/uapi/linux/pmu.h
index 97256f90e6df..f2fc1bd80017 100644
--- a/include/uapi/linux/pmu.h
+++ b/include/uapi/linux/pmu.h
@@ -19,7 +19,9 @@
 #define PMU_POWER_CTRL      0x11    /* control power of some devices */
 #define PMU_ADB_CMD         0x20    /* send ADB packet */
 #define PMU_ADB_POLL_OFF    0x21    /* disable ADB auto-poll */
+#define PMU_WRITE_XPRAM     0x32    /* write eXtended Parameter RAM */
 #define PMU_WRITE_NVRAM     0x33    /* write non-volatile RAM */
+#define PMU_READ_XPRAM      0x3a    /* read eXtended Parameter RAM */
 #define PMU_READ_NVRAM      0x3b    /* read non-volatile RAM */
 #define PMU_SET_RTC         0x30    /* set real-time clock */
 #define PMU_READ_RTC        0x38    /* read real-time clock */
-- 
2.19.2
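
To make the two-byte XPRAM command encoding concrete, here is a small worked example in user-space C. The macros are taken from patches 12 and 13 of this series, with the macro arguments parenthesised and `BIT()` defined locally as an assumption. The `sketch_*` helpers are hypothetical and exist only for illustration.

```c
/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n)                  (1U << (n))

/* Macros from the series, with arguments parenthesised for safety. */
#define RTC_FLG_READ            BIT(7)
#define RTC_CMD_READ(r)         (RTC_FLG_READ | ((r) << 2))
#define RTC_CMD_WRITE(r)        ((r) << 2)
#define RTC_REG_XPRAM           14
#define RTC_CMD_XPRAM_READ      (RTC_CMD_READ(RTC_REG_XPRAM) << 8)
#define RTC_CMD_XPRAM_WRITE     (RTC_CMD_WRITE(RTC_REG_XPRAM) << 8)
#define RTC_CMD_XPRAM_ARG(a)    ((((a) & 0xE0) << 3) | (((a) & 0x1F) << 2))

/* Hypothetical helper: two-byte command that reads XPRAM byte 'offset'. */
static unsigned int sketch_xpram_read_cmd(unsigned int offset)
{
	return RTC_CMD_XPRAM_READ | RTC_CMD_XPRAM_ARG(offset);
}

/* Hypothetical helper: two-byte command that writes XPRAM byte 'offset'. */
static unsigned int sketch_xpram_write_cmd(unsigned int offset)
{
	return RTC_CMD_XPRAM_WRITE | RTC_CMD_XPRAM_ARG(offset);
}

/* via_rtc_command() forces the two least significant bits to 0b01. */
static unsigned int sketch_finalize(unsigned int command)
{
	return (command & ~3U) | 1;
}
```

For offset 0xA5, the upper three address bits land in the first command byte and the lower five in the second. The read command is 0xBD14 before the low bits are forced, so the bytes sent would be 0xBD followed by 0x15.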



[PATCH v9 00/22] Re-use nvram module

2019-01-14 Thread Finn Thain
The "generic" NVRAM module, drivers/char/generic_nvram.c, implements a
/dev/nvram misc device. This module is used only by 32-bit PowerPC
platforms.

The RTC "CMOS" NVRAM module, drivers/char/nvram.c, also implements a
/dev/nvram misc device. This module is now used only by x86 and m68k
thanks to commit 3ba9faedc180 ("char: nvram: disable on ARM").

The "generic" module cannot be used by x86 or m68k platforms because it
cannot co-exist with the "CMOS" module. One reason for that is the
CONFIG_GENERIC_NVRAM kludge in drivers/char/Makefile. Another reason is
that automatically loading the appropriate module would be impossible
because only one module can provide the char-major-10-144 alias.

A multi-platform kernel binary needs a single, generic module. With this
patch series, drivers/char/nvram.c becomes more generic and some of the
arch-specific code gets moved under arch/. The nvram module is then
usable by all m68k, powerpc and x86 platforms.

This allows for removal of drivers/char/generic_nvram.c as well as a
duplicate in arch/powerpc/kernel/nvram_64.c. By reducing the number of
/dev/nvram char misc device implementations, the number of bugs and
inconsistencies is also reduced.

This approach reduces inconsistencies between PPC32 and PPC64 and also
between PPC_PMAC and MAC. A uniform API has benefits for userspace.

For example, some error codes for some ioctl calls become consistent
across PowerPC platforms. The uniform API can potentially benefit any
bootloader that works across the various platforms having XPRAM
(e.g. Emile).

This patch series was tested on Atari, Mac, PowerMac (both 32-bit and
64-bit) and ThinkPad hardware. AFAIK, it has not yet been tested on
pSeries or CHRP.

I think there are two possible merge strategies for this patch series.
The char misc maintainer could take the entire series. Alternatively,
the m68k maintainer could take patches 1 thru 16 (though some of these
have nothing to do with m68k) and after those patches reach mainline
the powerpc maintainer could take 17 thru 22.

Changed since v8:
 - Replaced defined(CONFIG_NVRAM) with IS_REACHABLE(CONFIG_NVRAM) as
suggested by James Bottomley.
 - Changed #ifdef to if as suggested by Christophe Leroy.
 - Expanded the fbdev patch to include controlfb.c and platinumfb.c.
 - Added kernel-doc comment to describe struct nvram_ops.
 - Moved the HAVE_ARCH_NVRAM_OPS symbol to common code as suggested
by Christoph Hellwig.
 - Abandoned conversion of powerpc drivers to arch_nvram_ops, as discussed
with Arnd Bergmann.
 - Dropped patch 6 ("x86/thinkpad_acpi: Use arch_nvram_ops methods").
 - Dropped patch 17 ("powerpc: Implement arch_nvram_ops.get_size() ...").
 - Dropped patch 20 ("powerpc, fbdev: Use arch_nvram_ops methods ...").
 - Dropped patch 25 ("powerpc: Remove pmac_xpram_{read,write} functions").
 - Added portable static functions to nvram.h which wrap both arch_nvram_ops
and ppc_md method calls.
 - Re-ordered and revised patches to resolve conflicts with existing extern
definitions in nvram.h and elsewhere.
 - Rebased on v5.0-rc2.
 - Added patch 14 ("macintosh/via-cuda: Don't rely on Cuda to end a transfer").

Changed since v7:
 - Rebased.
 - Dropped patch 9/26, "char/nvram: Use generic fixed_size_llseek()"
because generic_file_llseek_size() was adopted in commit b808b1d632f6.
 - Reordered the m68k and powerpc patches to simplify the merge strategy.
 - Addressed some trivial checkpatch.pl complaints.
 - Improved some commit log entries.
 - Changed the CONFIG_NVRAM default to better approximate the present code.
In particular, the CONFIG_GENERIC_NVRAM default and use of "select NVRAM".
 - Added more tested-by tags.

For older change logs, please refer to,
https://lore.kernel.org/lkml/20151101104202.301856...@telegraphics.com.au/


Finn Thain (22):
  scsi/atari_scsi: Don't select CONFIG_NVRAM
  m68k/atari: Move Atari-specific code out of drivers/char/nvram.c
  char/nvram: Re-order functions to remove forward declarations and
#ifdefs
  nvram: Replace nvram_* function exports with static functions
  m68k/atari: Implement arch_nvram_ops struct
  powerpc: Replace nvram_* extern declarations with standard header
  char/nvram: Adopt arch_nvram_ops
  char/nvram: Allow the set_checksum and initialize ioctls to be omitted
  char/nvram: Implement NVRAM read/write methods
  m68k/atari: Implement arch_nvram_ops methods and enable
CONFIG_HAVE_ARCH_NVRAM_OPS
  m68k/mac: Adopt naming and calling conventions for PRAM routines
  m68k/mac: Use macros for RTC accesses not magic numbers
  m68k/mac: Fix PRAM accessors
  macintosh/via-cuda: Don't rely on Cuda to end a transfer
  m68k: Dispatch nvram_ops calls to Atari or Mac functions
  char/nvram: Add "devname:nvram" module alias
  powerpc: Define missing ppc_md.nvram_size for CHRP and PowerMac
  powerpc: Implement nvram ioctls
  powerpc, fbdev: Use NV_CMODE and NV_VMODE only when CONFIG_PPC32 &&
CONFIG_PPC_PMAC && CONFIG_NVRAM
  powerpc: Enable HAVE_ARCH_NVRAM_OPS and disable GENERIC_NVRAM

[PATCH v9 21/22] char/generic_nvram: Remove as unused

2019-01-14 Thread Finn Thain
Signed-off-by: Finn Thain 
---
 drivers/char/Makefile|   6 +-
 drivers/char/generic_nvram.c | 160 ---
 2 files changed, 1 insertion(+), 165 deletions(-)
 delete mode 100644 drivers/char/generic_nvram.c

diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index b8d42b4e979b..fbea7dd12932 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -26,11 +26,7 @@ obj-$(CONFIG_RTC)+= rtc.o
 obj-$(CONFIG_HPET) += hpet.o
 obj-$(CONFIG_EFI_RTC)  += efirtc.o
 obj-$(CONFIG_XILINX_HWICAP)+= xilinx_hwicap/
-ifeq ($(CONFIG_GENERIC_NVRAM),y)
-  obj-$(CONFIG_NVRAM)  += generic_nvram.o
-else
-  obj-$(CONFIG_NVRAM)  += nvram.o
-endif
+obj-$(CONFIG_NVRAM)+= nvram.o
 obj-$(CONFIG_TOSHIBA)  += toshiba.o
 obj-$(CONFIG_DS1620)   += ds1620.o
 obj-$(CONFIG_HW_RANDOM)+= hw_random/
diff --git a/drivers/char/generic_nvram.c b/drivers/char/generic_nvram.c
deleted file mode 100644
index 0c22b9503e84..
--- a/drivers/char/generic_nvram.c
+++ /dev/null
@@ -1,160 +0,0 @@
-/*
- * Generic /dev/nvram driver for architectures providing some
- * "generic" hooks, that is :
- *
- * nvram_read_byte, nvram_write_byte, nvram_sync, nvram_get_size
- *
- * Note that an additional hook is supported for PowerMac only
- * for getting the nvram "partition" informations
- *
- */
-
-#define NVRAM_VERSION "1.1"
-
-#include 
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#ifdef CONFIG_PPC_PMAC
-#include 
-#endif
-
-#define NVRAM_SIZE 8192
-
-static DEFINE_MUTEX(nvram_mutex);
-static ssize_t nvram_len;
-
-static loff_t nvram_llseek(struct file *file, loff_t offset, int origin)
-{
-   return generic_file_llseek_size(file, offset, origin,
-   MAX_LFS_FILESIZE, nvram_len);
-}
-
-static ssize_t read_nvram(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
-{
-   unsigned int i;
-   char __user *p = buf;
-
-   if (!access_ok(buf, count))
-   return -EFAULT;
-   if (*ppos >= nvram_len)
-   return 0;
-   for (i = *ppos; count > 0 && i < nvram_len; ++i, ++p, --count)
-   if (__put_user(nvram_read_byte(i), p))
-   return -EFAULT;
-   *ppos = i;
-   return p - buf;
-}
-
-static ssize_t write_nvram(struct file *file, const char __user *buf,
-  size_t count, loff_t *ppos)
-{
-   unsigned int i;
-   const char __user *p = buf;
-   char c;
-
-   if (!access_ok(buf, count))
-   return -EFAULT;
-   if (*ppos >= nvram_len)
-   return 0;
-   for (i = *ppos; count > 0 && i < nvram_len; ++i, ++p, --count) {
-   if (__get_user(c, p))
-   return -EFAULT;
-   nvram_write_byte(c, i);
-   }
-   *ppos = i;
-   return p - buf;
-}
-
-static int nvram_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
-{
-   switch(cmd) {
-#ifdef CONFIG_PPC_PMAC
-   case OBSOLETE_PMAC_NVRAM_GET_OFFSET:
-   printk(KERN_WARNING "nvram: Using obsolete PMAC_NVRAM_GET_OFFSET ioctl\n");
-   case IOC_NVRAM_GET_OFFSET: {
-   int part, offset;
-
-   if (!machine_is(powermac))
-   return -EINVAL;
-   if (copy_from_user(&part, (void __user*)arg, sizeof(part)) != 0)
-   return -EFAULT;
-   if (part < pmac_nvram_OF || part > pmac_nvram_NR)
-   return -EINVAL;
-   offset = pmac_get_partition(part);
-   if (copy_to_user((void __user*)arg, &offset, sizeof(offset)) != 0)
-   return -EFAULT;
-   break;
-   }
-#endif /* CONFIG_PPC_PMAC */
-   case IOC_NVRAM_SYNC:
-   nvram_sync();
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   return 0;
-}
-
-static long nvram_unlocked_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
-{
-   int ret;
-
-   mutex_lock(&nvram_mutex);
-   ret = nvram_ioctl(file, cmd, arg);
-   mutex_unlock(&nvram_mutex);
-
-   return ret;
-}
-
-const struct file_operations nvram_fops = {
-   .owner  = THIS_MODULE,
-   .llseek = nvram_llseek,
-   .read   = read_nvram,
-   .write  = write_nvram,
-   .unlocked_ioctl = nvram_unlocked_ioctl,
-};
-
-static struct miscdevice nvram_dev = {
-   NVRAM_MINOR,
-   "nvram",
-   &nvram_fops
-};
-
-int __init nvram_init(void)
-{
-   int ret = 0;
-
-   printk(KERN_INFO "Generic non-volatile memory driver v%s\n",
-   NVRAM_VERSION);
-   ret = misc_register(&nvram_dev);
-   if (ret != 0)
-   goto out;
-
-   nvram_len = nvram_get_size();
-   if (nvram_len < 0)
-   nvram_len = NVRAM_SIZE;

[PATCH v9 19/22] powerpc, fbdev: Use NV_CMODE and NV_VMODE only when CONFIG_PPC32 && CONFIG_PPC_PMAC && CONFIG_NVRAM

2019-01-14 Thread Finn Thain
This patch addresses inconsistencies in Mac framebuffer drivers and their
use of Kconfig symbols relating to NVRAM, so PPC64 can use CONFIG_NVRAM.

The defined(CONFIG_NVRAM) condition is replaced with the weaker
IS_REACHABLE(CONFIG_NVRAM) condition, like atari_scsi.

Macintosh framebuffer drivers use default settings for color mode and
video mode that are found in NVRAM. On PCI Macs, MacOS stores display
settings in the Name Registry (NR) partition in NVRAM*. On NuBus Macs,
there is no NR partition and MacOS stores display mode settings in PRAM**.

Early-model Macs are the ones most likely to benefit from these settings,
since they are more likely to have a fixed-frequency monitor connected to
the built-in framebuffer device. Moreover, a single NV_CMODE value and
a single NV_VMODE value provide for only one display.

The NV_CMODE and NV_VMODE constants are apparently offsets into the NR
partition for Old World machines. This also suggests that these defaults
are not useful on later models. The NR partition seems to be optional on
New World machines. CONFIG_NVRAM cannot be enabled on PPC64 at present.

It is safe to say that NVRAM support in PowerMac fbdev drivers is only
applicable to CONFIG_PPC32 so make this condition explicit. This means
matroxfb driver won't crash on PPC64 when CONFIG_NVRAM becomes available
there.

For imsttfb, add the missing CONFIG_NVRAM test to prevent a build failure,
since PPC64 does not implement nvram_read_byte(). Also add a missing
machine_is(powermac) check. Change the inconsistent dependency on
CONFIG_PPC and the matching #ifdef tests to CONFIG_PPC_PMAC.

For valkyriefb, to improve clarity and consistency with the other PowerMac
fbdev drivers, test for CONFIG_PPC_PMAC instead of !CONFIG_MAC. Remove a
bogus comment regarding PRAM.

* See GetPreferredConfiguration and SavePreferredConfiguration in
"Designing PCI Cards and Drivers for Power Macintosh Computers".

** See SetDefaultMode and GetDefaultMode in "Designing Cards and Drivers
for the Macintosh Family".
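
The shape of the resulting mode selection can be sketched in plain C as follows. This is an illustration, not the driver code. The boolean `nvram_reachable` stands in for IS_REACHABLE(CONFIG_NVRAM), which in the kernel is a compile-time constant. The CMODE_* values are assumptions (the kernel's macmodes.h is authoritative), and `sketch_nvram_read_cmode()` is a hypothetical replacement for nvram_read_byte(NV_CMODE).

```c
#include <stdbool.h>

/* Assumed values for illustration only. */
#define CMODE_NVRAM  (-1)
#define CMODE_8      0
#define CMODE_32     2

/* Hypothetical stand-in for the byte stored in NVRAM. */
static int sketch_stored_cmode = CMODE_32;

static int sketch_nvram_read_cmode(void)
{
	return sketch_stored_cmode;
}

/*
 * Start from the default, consult NVRAM only when it is reachable and
 * was requested, then clamp anything out of range to CMODE_8.
 */
static int sketch_pick_cmode(int default_cmode, bool nvram_reachable)
{
	int cmode = default_cmode;

	if (nvram_reachable && cmode == CMODE_NVRAM)
		cmode = sketch_nvram_read_cmode();
	if (cmode < CMODE_8 || cmode > CMODE_32)
		cmode = CMODE_8;
	return cmode;
}
```

Because the range check now runs unconditionally, a CMODE_NVRAM default falls back to CMODE_8 when NVRAM is not reachable, with no #ifdef needed around the branch.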

Signed-off-by: Finn Thain 
---
Changed since v8:
 - Replaced defined(CONFIG_NVRAM) with IS_REACHABLE(CONFIG_NVRAM) as
suggested by James Bottomley.
 - Changed #ifdef to if as suggested by Christophe Leroy.
 - Expanded the patch to include controlfb.c and platinumfb.c due to the
conversion from '#if defined(CONFIG_NVRAM)' to
'if (IS_REACHABLE(CONFIG_NVRAM))'.
---
 drivers/video/fbdev/Kconfig|  2 +-
 drivers/video/fbdev/controlfb.c| 42 --
 drivers/video/fbdev/imsttfb.c  | 23 ++--
 drivers/video/fbdev/matrox/matroxfb_base.c |  5 +--
 drivers/video/fbdev/platinumfb.c   | 21 +--
 drivers/video/fbdev/valkyriefb.c   | 30 ++--
 6 files changed, 48 insertions(+), 75 deletions(-)

diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index ae7712c9687a..58a9590c9db6 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -536,7 +536,7 @@ config FB_IMSTT
bool "IMS Twin Turbo display support"
depends on (FB = y) && PCI
select FB_CFB_IMAGEBLIT
-   select FB_MACMODES if PPC
+   select FB_MACMODES if PPC_PMAC
help
  The IMS Twin Turbo is a PCI-based frame buffer card bundled with
  many Macintosh and compatible computers.
diff --git a/drivers/video/fbdev/controlfb.c b/drivers/video/fbdev/controlfb.c
index 9cb0ef7ac29e..7af8db28bb80 100644
--- a/drivers/video/fbdev/controlfb.c
+++ b/drivers/video/fbdev/controlfb.c
@@ -411,35 +411,23 @@ static int __init init_control(struct fb_info_control *p)
full = p->total_vram == 0x40;
 
/* Try to pick a video mode out of NVRAM if we have one. */
-#ifdef CONFIG_NVRAM
-   if (default_cmode == CMODE_NVRAM) {
+   cmode = default_cmode;
+   if (IS_REACHABLE(CONFIG_NVRAM) && cmode == CMODE_NVRAM)
cmode = nvram_read_byte(NV_CMODE);
-   if(cmode < CMODE_8 || cmode > CMODE_32)
-   cmode = CMODE_8;
-   } else
-#endif
-   cmode=default_cmode;
-#ifdef CONFIG_NVRAM
-   if (default_vmode == VMODE_NVRAM) {
+   if (cmode < CMODE_8 || cmode > CMODE_32)
+   cmode = CMODE_8;
+
+   vmode = default_vmode;
+   if (IS_REACHABLE(CONFIG_NVRAM) && vmode == VMODE_NVRAM)
vmode = nvram_read_byte(NV_VMODE);
-   if (vmode < 1 || vmode > VMODE_MAX ||
-   control_mac_modes[vmode - 1].m[full] < cmode) {
-   sense = read_control_sense(p);
-   printk("Monitor sense value = 0x%x, ", sense);
-   vmode = mac_map_monitor_sense(sense);
-   if (control_mac_modes[vmode - 1].m[full] < cmode)
-   vmode = VMODE_640_480_60;
-   }
-   } else
-#endif
-   {
-   vmode=default_vmode;
-   if (control_mac_modes[vmode - 1].m[full] < cmode) {

[PATCH v9 12/22] m68k/mac: Use macros for RTC accesses not magic numbers

2019-01-14 Thread Finn Thain
This is intended to improve code style and not affect code behaviour.

Acked-by: Geert Uytterhoeven 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 arch/m68k/mac/misc.c | 59 ++--
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index 78c807025436..af000a015f68 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -136,6 +136,21 @@ static void via_rtc_send(__u8 data)
}
 }
 
+/*
+ * These values can be found in Inside Macintosh vol. III ch. 2
+ * which has a description of the RTC chip in the original Mac.
+ */
+
+#define RTC_FLG_READ            BIT(7)
+#define RTC_FLG_WRITE_PROTECT   BIT(7)
+#define RTC_CMD_READ(r)         (RTC_FLG_READ | (r << 2))
+#define RTC_CMD_WRITE(r)        (r << 2)
+#define RTC_REG_SECONDS_0   0
+#define RTC_REG_SECONDS_1   1
+#define RTC_REG_SECONDS_2   2
+#define RTC_REG_SECONDS_3   3
+#define RTC_REG_WRITE_PROTECT   13
+
 /*
  * Execute a VIA PRAM/RTC command. For read commands
  * data should point to a one-byte buffer for the
@@ -145,13 +160,17 @@ static void via_rtc_send(__u8 data)
  * This function disables all interrupts while running.
  */
 
-static void via_pram_command(int command, __u8 *data)
+static void via_rtc_command(int command, __u8 *data)
 {
unsigned long flags;
int is_read;
 
local_irq_save(flags);
 
+   /* The least significant bits must be 0b01 according to Inside Mac */
+
+   command = (command & ~3) | 1;
+
/* Enable the RTC and make sure the strobe line is high */
 
via1[vBufB] = (via1[vBufB] | VIA1B_vRTCClk) & ~VIA1B_vRTCEnb;
@@ -159,10 +178,10 @@ static void via_pram_command(int command, __u8 *data)
if (command & 0xFF00) { /* extended (two-byte) command */
via_rtc_send((command & 0xFF00) >> 8);
via_rtc_send(command & 0xFF);
-   is_read = command & 0x8000;
+   is_read = command & (RTC_FLG_READ << 8);
} else {/* one-byte command */
via_rtc_send(command);
-   is_read = command & 0x80;
+   is_read = command & RTC_FLG_READ;
}
if (is_read) {
*data = via_rtc_recv();
@@ -201,10 +220,10 @@ static time64_t via_read_time(void)
} result, last_result;
int count = 1;
 
-   via_pram_command(0x81, &last_result.cdata[3]);
-   via_pram_command(0x85, &last_result.cdata[2]);
-   via_pram_command(0x89, &last_result.cdata[1]);
-   via_pram_command(0x8D, &last_result.cdata[0]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_0), &last_result.cdata[3]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_1), &last_result.cdata[2]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_2), &last_result.cdata[1]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_3), &last_result.cdata[0]);
 
/*
 * The NetBSD guys say to loop until you get the same reading
@@ -212,10 +231,14 @@ static time64_t via_read_time(void)
 */
 
while (1) {
-   via_pram_command(0x81, &result.cdata[3]);
-   via_pram_command(0x85, &result.cdata[2]);
-   via_pram_command(0x89, &result.cdata[1]);
-   via_pram_command(0x8D, &result.cdata[0]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_0),
+   &result.cdata[3]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_1),
+   &result.cdata[2]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_2),
+   &result.cdata[1]);
+   via_rtc_command(RTC_CMD_READ(RTC_REG_SECONDS_3),
+   &result.cdata[0]);
 
if (result.idata == last_result.idata)
return (time64_t)result.idata - RTC_OFFSET;
@@ -254,18 +277,18 @@ static void via_set_rtc_time(struct rtc_time *tm)
/* Clear the write protect bit */
 
temp = 0x55;
-   via_pram_command(0x35, &temp);
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_WRITE_PROTECT), &temp);
 
data.idata = lower_32_bits(time + RTC_OFFSET);
-   via_pram_command(0x01, &data.cdata[3]);
-   via_pram_command(0x05, &data.cdata[2]);
-   via_pram_command(0x09, &data.cdata[1]);
-   via_pram_command(0x0D, &data.cdata[0]);
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_SECONDS_0), &data.cdata[3]);
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_SECONDS_1), &data.cdata[2]);
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_SECONDS_2), &data.cdata[1]);
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_SECONDS_3), &data.cdata[0]);
 
/* Set the write protect bit */
 
-   temp = 0xD5;
-   via_pram_command(0x35, &temp);
+   temp = 0x55 | RTC_FLG_WRITE_PROTECT;
+   via_rtc_command(RTC_CMD_WRITE(RTC_REG_WRITE_PROTECT), &temp);
 }
 
 static void via_shutdown(void)
-- 
2.19.2
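
[Archive note] The claim that this patch does not affect behaviour can be checked by hand: each macro expression, after the `(command & ~3) | 1` fix-up that via_rtc_command() now applies, should give back the magic number it replaced. The following is a minimal user-space sketch of that check. The macro bodies are taken from the diff above (with the argument parenthesised); `BIT()` is redefined locally and `rtc_wire_command()` is a hypothetical helper name used only for this illustration.

```c
/* Local stand-in for the kernel's BIT() helper. */
#define BIT(n)                  (1U << (n))

/* Definitions as in the patch, with the macro argument parenthesised. */
#define RTC_FLG_READ            BIT(7)
#define RTC_FLG_WRITE_PROTECT   BIT(7)
#define RTC_CMD_READ(r)         (RTC_FLG_READ | ((r) << 2))
#define RTC_CMD_WRITE(r)        ((r) << 2)
#define RTC_REG_SECONDS_0       0
#define RTC_REG_SECONDS_1       1
#define RTC_REG_SECONDS_2       2
#define RTC_REG_SECONDS_3       3
#define RTC_REG_WRITE_PROTECT   13

/*
 * Hypothetical helper: mirrors the fix-up that via_rtc_command() applies
 * on entry, forcing the two least significant bits to 0b01.
 */
unsigned int rtc_wire_command(unsigned int command)
{
    return (command & ~3U) | 1U;
}
```

With these definitions, the four seconds-register reads come out as 0x81, 0x85, 0x89 and 0x8D, the write-protect register write as 0x35, and `0x55 | RTC_FLG_WRITE_PROTECT` as 0xD5, matching the constants removed by the diff.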



[PATCH v9 09/22] char/nvram: Implement NVRAM read/write methods

2019-01-14 Thread Finn Thain
Refactor the RTC "CMOS" NVRAM functions so that they can be used as
arch_nvram_ops methods. Checksumming logic is moved from the misc device
operations to the nvram read/write operations. This makes the misc device
implementation more generic.

This preserves the locking mechanism such that "read if checksum valid"
and "write and update checksum" remain atomic operations.

Some platforms implement byte-range read/write methods which are similar
to file_operations struct methods. Other platforms provide only
byte-at-a-time methods. The former are more efficient but may be
unavailable so fall back on the latter methods when necessary.
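
[Archive note] The fallback described in the paragraph above can be pictured roughly as follows. This is a user-space sketch under stated assumptions, not the code from the patch: `struct toy_nvram_ops`, `toy_nvram_read()` and the in-memory `toy_store` are hypothetical names, `long long` stands in for `loff_t`, and the locking and checksum handling of the real driver are left out.

```c
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical ops table: .read is optional, .read_byte is the fallback. */
struct toy_nvram_ops {
    ssize_t (*read)(char *buf, size_t count, long long *ppos);
    unsigned char (*read_byte)(int addr);
    ssize_t (*get_size)(void);
};

/* Prefer the byte-range method; otherwise loop over the byte method. */
ssize_t toy_nvram_read(const struct toy_nvram_ops *ops, char *buf,
                       size_t count, long long *ppos)
{
    ssize_t size;
    size_t done = 0;

    if (ops->read)
        return ops->read(buf, count, ppos);

    size = ops->get_size();
    while (done < count && *ppos < size) {
        buf[done++] = (char)ops->read_byte((int)*ppos);
        (*ppos)++;
    }
    return (ssize_t)done;
}

/* Toy backend that only offers byte-at-a-time access. */
unsigned char toy_store[8] = { 'n', 'v', 'r', 'a', 'm', '0', '1', '2' };

unsigned char toy_read_byte(int addr)
{
    return toy_store[addr];
}

ssize_t toy_get_size(void)
{
    return (ssize_t)sizeof(toy_store);
}

const struct toy_nvram_ops toy_byte_only_ops = {
    .read      = NULL,
    .read_byte = toy_read_byte,
    .get_size  = toy_get_size,
};
```

A read of four bytes starting at offset 6 through `toy_byte_only_ops` stops at the end of the store, returns 2 and leaves the position at 8, which is the short-read behaviour one would expect from a file-like read method.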

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Changed since v8:
 - Renamed nvram_* functions to avoid name collisions.
 - Added nvram_read_bytes() and nvram_write_bytes() helpers for use by
those platforms which access NVRAM only one-byte-at-a-time.

Changed since v7:
 - Adopted memdup_user(), like arch/powerpc/kernel/nvram_64.c.
---
 drivers/char/nvram.c  | 120 ++
 include/linux/nvram.h |  32 ++-
 2 files changed, 104 insertions(+), 48 deletions(-)

diff --git a/drivers/char/nvram.c b/drivers/char/nvram.c
index f88ef41d0598..adcc213c331e 100644
--- a/drivers/char/nvram.c
+++ b/drivers/char/nvram.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -161,7 +162,46 @@ static ssize_t pc_nvram_get_size(void)
return NVRAM_BYTES;
 }
 
+static ssize_t pc_nvram_read(char *buf, size_t count, loff_t *ppos)
+{
+   char *p = buf;
+   loff_t i;
+
+   spin_lock_irq(&rtc_lock);
+   if (!__nvram_check_checksum()) {
+   spin_unlock_irq(&rtc_lock);
+   return -EIO;
+   }
+   for (i = *ppos; count > 0 && i < NVRAM_BYTES; --count, ++i, ++p)
+   *p = __nvram_read_byte(i);
+   spin_unlock_irq(&rtc_lock);
+
+   *ppos = i;
+   return p - buf;
+}
+
+static ssize_t pc_nvram_write(char *buf, size_t count, loff_t *ppos)
+{
+   char *p = buf;
+   loff_t i;
+
+   spin_lock_irq(&rtc_lock);
+   if (!__nvram_check_checksum()) {
+   spin_unlock_irq(&rtc_lock);
+   return -EIO;
+   }
+   for (i = *ppos; count > 0 && i < NVRAM_BYTES; --count, ++i, ++p)
+   __nvram_write_byte(*p, i);
+   __nvram_set_checksum();
+   spin_unlock_irq(&rtc_lock);
+
+   *ppos = i;
+   return p - buf;
+}
+
 const struct nvram_ops arch_nvram_ops = {
+   .read   = pc_nvram_read,
+   .write  = pc_nvram_write,
.read_byte  = pc_nvram_read_byte,
.write_byte = pc_nvram_write_byte,
.get_size   = pc_nvram_get_size,
@@ -184,69 +224,57 @@ static loff_t nvram_misc_llseek(struct file *file, loff_t offset, int origin)
 static ssize_t nvram_misc_read(struct file *file, char __user *buf,
   size_t count, loff_t *ppos)
 {
-   unsigned char contents[NVRAM_BYTES];
-   unsigned i = *ppos;
-   unsigned char *tmp;
-
-   spin_lock_irq(&rtc_lock);
+   char *tmp;
+   ssize_t ret;
 
-   if (!__nvram_check_checksum())
-   goto checksum_err;
 
-   for (tmp = contents; count-- > 0 && i < NVRAM_BYTES; ++i, ++tmp)
-   *tmp = __nvram_read_byte(i);
+   if (!access_ok(buf, count))
+   return -EFAULT;
+   if (*ppos >= nvram_size)
+   return 0;
 
-   spin_unlock_irq(&rtc_lock);
+   count = min_t(size_t, count, nvram_size - *ppos);
+   count = min_t(size_t, count, PAGE_SIZE);
 
-   if (copy_to_user(buf, contents, tmp - contents))
-   return -EFAULT;
+   tmp = kmalloc(count, GFP_KERNEL);
+   if (!tmp)
+   return -ENOMEM;
 
-   *ppos = i;
+   ret = nvram_read(tmp, count, ppos);
+   if (ret <= 0)
+   goto out;
 
-   return tmp - contents;
+   if (copy_to_user(buf, tmp, ret)) {
+   *ppos -= ret;
+   ret = -EFAULT;
+   }
 
-checksum_err:
-   spin_unlock_irq(&rtc_lock);
-   return -EIO;
+out:
+   kfree(tmp);
+   return ret;
 }
 
 static ssize_t nvram_misc_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
 {
-   unsigned char contents[NVRAM_BYTES];
-   unsigned i = *ppos;
-   unsigned char *tmp;
-
-   if (i >= NVRAM_BYTES)
-   return 0;   /* Past EOF */
-
-   if (count > NVRAM_BYTES - i)
-   count = NVRAM_BYTES - i;
-   if (count > NVRAM_BYTES)
-   return -EFAULT; /* Can't happen, but prove it to gcc */
+   char *tmp;
+   ssize_t ret;
 
-   if (copy_from_user(contents, buf, count))
+   if (!access_ok(buf, count))
return -EFAULT;
+   if (*ppos >= nvram_size)
+   return 0;
 
-   spin_lock_irq(&rtc_lock);
-
-   if (!__nvram_check_checksum())
-   goto checksum_err;
-
-   for 

Re: Real deadlock being suppressed in sbitmap

2019-01-14 Thread Ming Lei
On Mon, Jan 14, 2019 at 10:50:17PM -0500, Steven Rostedt wrote:
> On Tue, 15 Jan 2019 11:23:56 +0800
> Ming Lei  wrote:
> 
> > Given 'swap_lock' can be acquired from blk_mq_dispatch_rq_list() via
> > blk_mq_get_driver_tag() directly, the above deadlock may be possible.
> > 
> > Sounds the correct fix may be the following one, and the irqsave cost
> > should be fine given sbitmap_deferred_clear is only triggered when one
> > word is run out of.
> 
> Since the lockdep splat only showed SOFTIRQ issues, I figured I would
> only protect it from that. Linus already accepted my patch, can you run
> tests on that kernel with LOCKDEP enabled and see if it will trigger
> with IRQ issues, then we can most definitely upgrade that to
> spin_lock_irqsave(). But I was trying to keep the overhead down, as
> that's a bit more heavy weight than a spin_lock_bh().

As I mentioned, the irqsave cost should be fine, given that this path is
triggered only after one word has run out.

Here is the lockdep warning on the latest Linus tree:

[  107.431033] [ cut here ]

[  107.431786] IRQs not enabled as expected
[  107.432047] 
[  107.432633] WARNING: CPU: 2 PID: 919 at kernel/softirq.c:169 
__local_bh_enable_ip+0x5c/0xe2
[  107.433302] WARNING: inconsistent lock state
[  107.433304] 5.0.0-rc2+ #554 Not tainted
[  107.434513] Modules linked in: null_blk iTCO_wdt iTCO_vendor_support 
crc32c_intel usb_storage virtio_scsi i2c_i801 i2c_core nvme lpc_ich nvme_core 
mfd_core qemu_fw_cfg ip_tables
[  107.435124] 
[  107.435126] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  107.435679] CPU: 2 PID: 919 Comm: fio Not tainted 5.0.0-rc2+ #554
[  107.438082] fio/917 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  107.438084] b6dd09e0 (>ws[i].wait){+.?.}, at: 
blk_mq_dispatch_rq_list+0x149/0x45d
[  107.438696] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.10.2-2.fc27 04/01/2014
[  107.439599] {IN-SOFTIRQ-W} state was registered at:
[  107.439604]   _raw_spin_lock_irqsave+0x46/0x55
[  107.440481] RIP: 0010:__local_bh_enable_ip+0x5c/0xe2
[  107.441239]   __wake_up_common_lock+0x61/0xd0
[  107.441241]   sbitmap_queue_clear+0x38/0x59
[  107.442468] Code: 00 00 00 75 27 83 b8 a8 0c 00 00 00 75 1e 80 3d f9 15 1d 
01 00 75 15 48 c7 c7 d4 43 e5 81 c6 05 e9 15 1d 01 01 e8 fa 91 ff ff <0f> 0b fa 
66 0f 1f 44 00 00 e8 54 a8 0e 00 65 8b 05 57 b3 f8 7e 25
[  107.443760]   __blk_mq_free_request+0x7d/0x97
[  107.443764]   scsi_end_request+0x19d/0x2f5
[  107.62] RSP: 0018:c9000268b848 EFLAGS: 00010086
[  107.445171]   scsi_io_completion+0x290/0x52d
[  107.445173]   blk_done_softirq+0xa3/0xc0
[  107.445880] RAX:  RBX: 0201 RCX: 0007
[  107.446502]   __do_softirq+0x1e7/0x3ff
[  107.446505]   run_ksoftirqd+0x2f/0x3c
[  107.447120] RDX:  RSI: 81ea4392 RDI: 
[  107.447122] RBP: 813fc0c7 R08: 0001 R09: 0001
[  107.450027]   smpboot_thread_fn+0x1d8/0x1ef
[  107.450030]   kthread+0x115/0x11d
[  107.450656] R10: 0001 R11: c9000268b6d7 R12: 
[  107.451305]   ret_from_fork+0x3a/0x50
[  107.451307] irq event stamp: 1066
[  107.452050] R13:  R14: 0001 R15: 888470254c78
[  107.452052] FS:  7f8eeefd3740() GS:888477a4() 
knlGS:
[  107.452665] hardirqs last  enabled at (1063): [] 
__local_bh_enable_ip+0xc8/0xe2
[  107.452669] hardirqs last disabled at (1064): [] 
_raw_spin_lock_irq+0x15/0x45
[  107.453241] CS:  0010 DS:  ES:  CR0: 80050033
[  107.454355] softirqs last  enabled at (1066): [] 
sbitmap_get+0xea/0x127
[  107.454357] softirqs last disabled at (1065): [] 
sbitmap_get+0x7d/0x127
[  107.454898] CR2: 7f8eeefc1000 CR3: 000471fd2003 CR4: 00760ee0
[  107.454902] DR0:  DR1:  DR2: 
[  107.455458] 
   other info that might help us debug this:
[  107.455460]  Possible unsafe locking scenario:

[  107.456482] DR3:  DR6: fffe0ff0 DR7: 0400
[  107.456483] PKRU: 5554
[  107.457515]CPU0
[  107.457517]
[  107.458114] Call Trace:
[  107.458119]  sbitmap_get+0xea/0x127
[  107.458585]   lock(>ws[i].wait);
[  107.459626]  __sbitmap_queue_get+0x3e/0x73
[  107.460207]   
[  107.460208] lock(>ws[i].wait);
[  107.460695]  blk_mq_get_tag+0xa6/0x2c6
[  107.461796] 
*** DEADLOCK ***

[  107.461798] 3 locks held by fio/917:
[  107.462954]  ? wait_woken+0x6d/0x6d
[  107.464333]  #0: e24edc0f (rcu_read_lock){}, at: 
hctx_lock+0x1a/0xcb
[  107.465561]  blk_mq_get_driver_tag+0x81/0xdb
[  107.466465]  #1: b6dd09e0 (>ws[i].wait){+.?.}, at: 
blk_mq_dispatch_rq_list+0x149/0x45d
[  107.467645]  blk_mq_dispatch_rq_list+0x1a7/0x45d
[  107.468912]  #2: b92e5983 
(&(>dispatch_wait_lock)->rlock){+...}, at: 

Re: [PATCH v2] PCI: avoid bridge feature re-probing on hotplug

2019-01-14 Thread Michael S. Tsirkin
On Thu, Dec 20, 2018 at 05:36:03PM -0500, Michael S. Tsirkin wrote:
> On Thu, Dec 20, 2018 at 04:31:58PM -0600, Bjorn Helgaas wrote:
> > On Thu, Dec 20, 2018 at 04:26:54PM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Dec 20, 2018 at 01:49:50PM -0600, Bjorn Helgaas wrote:
> > > > On Mon, Dec 17, 2018 at 07:45:41PM -0500, Michael S. Tsirkin wrote:
> > > > > commit 1f82de10d6b1 ("PCI/x86: don't assume prefetchable ranges are 
> > > > > 64bit")
> > > > > added probing of bridge support for 64 bit memory each time bridge is
> > > > > re-enumerated.
> > > > > 
> > > > > Unfortunately this probing is destructive if any device behind
> > > > > the bridge is in use at this time.
> > > > > 
> > > > > This was observed in the field, see
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01711.html
> > > > > and specifically
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
> > > > > 
> > > > > There's no real need to re-probe the bridge features as the
> > > > > registers in question never change - detect that using
> > > > > the memory flag being set (it's always set on the 1st pass since
> > > > > all PCI2PCI bridges support memory forwarding) and skip the probing.
> > > > > Thus, only the first call will perform the disruptive probing and sets
> > > > > the resource flags as required - which we can be reasonably sure 
> > > > > happens
> > > > > before any devices have been configured.
> > > > > Avoiding repeated calls to pci_bridge_check_ranges might be even 
> > > > > nicer.
> > > > > Unfortunately I couldn't come up with a clean way to do it without a
> > > > > major probing code refactoring.
> > > > 
> > > > I'm OK with major probe code refactoring as long as it's done
> > > > carefully.  Doing a special-case fix like this solves the immediate
> > > > problem but adds to the long-term maintenance problem.
> > > > 
> > > > As far as I can tell, everything in pci_bridge_check_ranges() should
> > > > be done once at enumeration-time, e.g., in the pci_read_bridge_bases()
> > > > path, and pci_bridge_check_ranges() itself should be removed.
> > > > 
> > > > If that turns out to be impossible for some reason, we need a comment
> > > > explaining why.
> > > 
> > > Maybe possible but I am not sure how.
> > > 
> > > Here's why:
> > > 
> > > Upon hotplug we want to poke at new bridges if any, but not the old
> > > ones.  The issue is that e.g. with ACPI hotplug the event that Linux
> > > knows how to handle is by design a heavy weight bus rescan.
> > 
> > Yeah, it's tricky.  But I don't think it's PCI or ACPI that makes this
> > tricky; I think it's just the historical baggage of the PCI core
> > design that makes it hard.
> > 
> > Even in the ACPI hotplug path, I think we use this pci_scan_device()
> > path:
> > 
> >   pci_scan_device
> > pci_setup_device
> >   case PCI_HEADER_TYPE_NORMAL:
> > pci_read_bases(6) # normal PCI BARs
> >   case PCI_HEADER_TYPE_BRIDGE:
> > pci_read_bases(2) # bridge BARs (not windows)
> > 
> > Unfortunately that path doesn't call pci_read_bridge_bases() to read
> > the bridge windows; that currently happens in pcibios_fixup_bus(),
> > which is only called from pci_scan_child_bus_extend().
> > 
> > This is a broken design because reading the bridge apertures is not at
> > all platform-specific, so it shouldn't be done in a pcibios hook.
> > And, more to the issue at hand, it shouldn't be done in
> > pci_scan_child_bus() either.  We might have to *update* the windows
> > when scanning child buses, but we should be able to do the work of
> > finding out what windows are implemented and their properties
> > somewhere in the pci_setup_device() path.
> > 
> > > Specifically 
> > > 
> > > pci_read_bridge_bases does not
> > > seem to be called on ACPI hotplug path.
> > > 
> > > Rather,
> > > 
> > > pci_assign_unassigned_root_bus_resources
> > > pci_assign_unassigned_bridge_resources
> > > 
> > > would be the two functions in question.
> > > 
> > > 
> > > Would above explanation be sufficient? If not, since I understand your
> > > reluctance to pile up hacks, would you be open to doing the suggested
> > > rewrite yourself? Me and xuyandong can help test it.
> > 
> > I'll be on vacation or holiday most of the time until the new year,
> > but I put a reminder on my calendar to look at this again then.  I'm
> > pretty sure we've tried to unravel this in the past, but I can't
> > remember what issues we tripped over.  Maybe we can make some progress
> > by restricting the problem we're trying to solve.
> > 
> > Thanks for bringing this up!  This is a wart in the PCI core that has
> > bothered me for a long time, and maybe this is the incentive we need
> > to make some progress on it.
> > 
> > Bjorn
> 
> 
> Sounds good, let me know when I can help with testing.

FWIW this patch has been in -next through my tree for a while now with
no ill effects. And the bug it fixes is real. So ... nudge nudge ...
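
[Archive note] For readers skimming the quoted thread, the guard described in the commit message above ("detect that using the memory flag being set ... and skip the probing") amounts to a probe-once check. The following is a toy user-space sketch of that idea only. `struct toy_bridge`, `toy_probe_windows()` and `toy_bridge_check_ranges()` are hypothetical names, and `TOY_IORESOURCE_MEM` is a local stand-in, so none of this should be read as the actual PCI core code.

```c
/* Local stand-in for a "memory window present" resource flag. */
#define TOY_IORESOURCE_MEM 0x200UL

struct toy_bridge {
    unsigned long mem_window_flags;   /* flags recorded by the first probe */
    unsigned int destructive_probes;  /* how often the hardware was poked */
};

/* Stand-in for the disruptive register probing done on first enumeration. */
void toy_probe_windows(struct toy_bridge *bridge)
{
    bridge->destructive_probes++;
    bridge->mem_window_flags |= TOY_IORESOURCE_MEM;
}

/*
 * Called on every rescan. The memory flag is always set by the first pass,
 * so its presence is taken to mean "already probed" and the probing is skipped.
 */
void toy_bridge_check_ranges(struct toy_bridge *bridge)
{
    if (bridge->mem_window_flags & TOY_IORESOURCE_MEM)
        return;
    toy_probe_windows(bridge);
}
```

However many rescans follow, the probe counter in this sketch stays at one. Bjorn's objection in the quoted discussion is not to this logic but to where it lives: he would rather move the probing to enumeration time than guard a function that is called repeatedly.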

Would 

[PATCH] sbitmap: Protect swap_lock from hardirq

2019-01-14 Thread Ming Lei
The original report actually describes a real deadlock:

[  106.132865]  Possible interrupt unsafe locking scenario:
[  106.132865]
[  106.133659]        CPU0                    CPU1
[  106.134194]        ----                    ----
[  106.134733]   lock(&(&sb->map[i].swap_lock)->rlock);
[  106.135318]                                local_irq_disable();
[  106.136014]                                lock(&sbq->ws[i].wait);
[  106.136747]                                lock(&(&hctx->dispatch_wait_lock)->rlock);
[  106.137742]   <Interrupt>
[  106.138110]     lock(&sbq->ws[i].wait);

Because we may call blk_mq_get_driver_tag() directly from
blk_mq_dispatch_rq_list() without holding any lock, a hardirq may arrive
while swap_lock is held and trigger the deadlock above.

ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq") tries to fix
this issue by using spin_lock_bh(), which isn't enough because in the
multiqueue case we complete requests directly from hardirq context.

Cc: Clark Williams 
Fixes: ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq")
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Guenter Roeck 
Cc: Steven Rostedt (VMware) 
Cc: Linus Torvalds 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ming Lei 
---
 lib/sbitmap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 864354000e04..5b382c1244ed 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -27,8 +27,9 @@ static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
 {
unsigned long mask, val;
bool ret = false;
+   unsigned long flags;
 
-   spin_lock_bh(&sb->map[index].swap_lock);
+   spin_lock_irqsave(&sb->map[index].swap_lock, flags);
 
if (!sb->map[index].cleared)
goto out_unlock;
@@ -49,7 +50,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
 
ret = true;
 out_unlock:
-   spin_unlock_bh(&sb->map[index].swap_lock);
+   spin_unlock_irqrestore(&sb->map[index].swap_lock, flags);
return ret;
 }
return ret;
 }
 
-- 
2.14.4
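
[Archive note] The reason the _bh() variant falls short here can be stated as a small truth table: a lock that is also taken from an interrupt handler is only safe on the same CPU if the process-context acquisition masks that kind of interrupt. The following is a toy model of that reasoning, not kernel code. The enum and function names are hypothetical, and "masks" is a deliberate simplification of what the three spinlock variants do.

```c
#include <stdbool.h>

enum toy_lock_variant {
    TOY_SPIN_LOCK,          /* masks nothing */
    TOY_SPIN_LOCK_BH,       /* keeps softirqs from running on this CPU */
    TOY_SPIN_LOCK_IRQSAVE,  /* disables hardirqs, so softirqs cannot run either */
};

enum toy_irq_context {
    TOY_CTX_SOFTIRQ,
    TOY_CTX_HARDIRQ,
};

/* Does taking the lock with this variant keep handlers of kind ctx out? */
bool toy_masks(enum toy_lock_variant variant, enum toy_irq_context ctx)
{
    switch (variant) {
    case TOY_SPIN_LOCK_IRQSAVE:
        return true;
    case TOY_SPIN_LOCK_BH:
        return ctx == TOY_CTX_SOFTIRQ;
    default:
        return false;
    }
}

/*
 * If process context holds the lock and a handler of kind ctx that takes
 * the same lock can still run on this CPU, the handler spins forever.
 */
bool toy_can_self_deadlock(enum toy_lock_variant variant,
                           enum toy_irq_context ctx)
{
    return !toy_masks(variant, ctx);
}
```

In this model `TOY_SPIN_LOCK_BH` is safe against a softirq completion but not against a hardirq completion, which is the multiqueue case the commit message points at. Only `TOY_SPIN_LOCK_IRQSAVE` covers both.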



Re: [PATCH] block/blk-sysfs.c: Remove last reference of blk_init_queue

2019-01-14 Thread Jens Axboe
On 1/14/19 8:06 PM, Marcos Paulo de Souza wrote:
> blk_init_queue was removed in a1ce35fa4985.

Honestly, most of that comment is wrong anyway. Since this isn't
a visible API, I'd just kill the comment completely.

> Signed-off-by: Marcos Paulo de Souza 
> ---
>  There are more two references in Documentation/block/biodoc.txt, but maybe 
> that
>  file needs a rewrite in rst anyway?

It probably needs a rewrite - period :-)

-- 
Jens Axboe



Re: [PATCH V2 2/2] Input: rotaty-encoder - Add DT binding document

2019-01-14 Thread Dmitry Torokhov
[ resending to Rob... ]
On Tue, Jan 08, 2019 at 01:42:49AM +0900, Donghoon Han wrote:
> Add DT binding document for rotary-encoder, keycode options.
> 
> Signed-off-by: Donghoon Han 
> Cc: Dmitry Torokhov 
> Cc: Daniel Mack 
> Cc: devicet...@vger.kernel.org
> To: linux-in...@vger.kernel.org
> ---
>  .../devicetree/bindings/input/rotary-encoder.txt | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/input/rotary-encoder.txt 
> b/Documentation/devicetree/bindings/input/rotary-encoder.txt
> index f99fe5cdeaec..9986ec2af2d4 100644
> --- a/Documentation/devicetree/bindings/input/rotary-encoder.txt
> +++ b/Documentation/devicetree/bindings/input/rotary-encoder.txt
> @@ -12,6 +12,10 @@ Optional properties:
>  - rotary-encoder,relative-axis: register a relative axis rather than an
>absolute one. Relative axis will only generate +1/-1 events on the input
>device, hence no steps need to be passed.
> +- rotary-encoder,relative-keys : generate pair of key events. This setting
> +  behaves just like relative-axis, generating key events instead.
> +  (Keycodes[2] corresponds to -1/1 events.)
> +- rotary-encoder,relative-keycodes : keycodes for relative-keys

Given that keycodes are Linux-specific, I think the property should be
linux,keycodes. Also, I am not sure we need a separate
rotary-encoder,relative-keys property, as we can infer that we want to
generate keys from the presence of the linux,keycodes property.

Rob, any comments?

>  - rotary-encoder,rollover: Automatic rollover when the rotary value becomes
>greater than the specified steps or smaller than 0. For absolute axis only.
>  - rotary-encoder,steps-per-period: Number of steps (stable states) per 
> period.
> @@ -48,3 +52,11 @@ Example:
>   rotary-encoder,encoding = "binary";
>   rotary-encoder,rollover;
>   };
> +
> + rotary@2 {
> + compatible = "rotary-encoder";
> + gpios = < 21 0>, < 22 0>;
> + rotary-encoder,relative-keys;
> + rotary-encoder,relative-keycode = <103>, <108>;
> + rotary-encoder,steps-per-period = <2>;
> + };
> -- 
> 2.17.1
> 

Thanks.

-- 
Dmitry
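
[Archive note] The suggestion above, inferring key mode from the presence of linux,keycodes instead of adding a separate boolean property, could look roughly like this on the driver side. This is a self-contained toy, assuming a node is just a list of property names. `struct toy_node`, `toy_has_property()` and `toy_pick_mode()` are hypothetical, and the precedence of linux,keycodes over relative-axis is an assumption made for the sketch, not something the thread decides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy device node: only the names of the properties it carries. */
struct toy_node {
    const char *const *props;
    size_t nprops;
};

bool toy_has_property(const struct toy_node *node, const char *name)
{
    size_t i;

    for (i = 0; i < node->nprops; i++)
        if (strcmp(node->props[i], name) == 0)
            return true;
    return false;
}

enum toy_mode {
    TOY_MODE_ABS_AXIS,
    TOY_MODE_REL_AXIS,
    TOY_MODE_KEYS,
};

/* Key mode is inferred from linux,keycodes alone; no extra boolean needed. */
enum toy_mode toy_pick_mode(const struct toy_node *node)
{
    if (toy_has_property(node, "linux,keycodes"))
        return TOY_MODE_KEYS;
    if (toy_has_property(node, "rotary-encoder,relative-axis"))
        return TOY_MODE_REL_AXIS;
    return TOY_MODE_ABS_AXIS;
}
```

A node that lists linux,keycodes is treated as a key-generating encoder, and a node without it keeps the existing axis behaviour, so existing device trees would be unaffected under this assumption.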


Re: [Ocfs2-devel] [PATCH] ocfs2: fix the application IO timeout when fstrim is running

2019-01-14 Thread Changwei Ge
Hi Gang,

Most parts of this patch look sane to me, just a tiny question...

On 2019/1/11 17:01, Gang He wrote:
> The user reported this problem, the upper application IO was
> timeout when fstrim was running on this ocfs2 partition. the
> application monitoring resource agent considered that this
> application did not work, then this node was fenced by the cluster
> brain (e.g. pacemaker).
> The root cause is that fstrim thread always holds main_bm meta-file
> related locks until all the cluster groups are trimmed.
> This patch will make fstrim thread release main_bm meta-file
> related locks when each cluster group is trimmed, this will let
> the current application IO has a chance to claim the clusters from
> main_bm meta-file.
> 
> Signed-off-by: Gang He 
> ---
>   fs/ocfs2/alloc.c   | 159 +
>   fs/ocfs2/dlmglue.c |   5 ++
>   fs/ocfs2/ocfs2.h   |   1 +
>   fs/ocfs2/ocfs2_trace.h |   2 +
>   fs/ocfs2/super.c   |   2 +
>   5 files changed, 106 insertions(+), 63 deletions(-)
> 
> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
> index d1cbb27808e2..6f0999015a44 100644
> --- a/fs/ocfs2/alloc.c
> +++ b/fs/ocfs2/alloc.c
> @@ -7532,10 +7532,11 @@ static int ocfs2_trim_group(struct super_block *sb,
>   return count;
>   }
>   
> -int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
> +static
> +int ocfs2_trim_mainbm(struct super_block *sb, struct fstrim_range *range)
>   {
>   struct ocfs2_super *osb = OCFS2_SB(sb);
> - u64 start, len, trimmed, first_group, last_group, group;
> + u64 start, len, trimmed = 0, first_group, last_group = 0, group = 0;
>   int ret, cnt;
>   u32 first_bit, last_bit, minlen;
>   struct buffer_head *main_bm_bh = NULL;
> @@ -7543,7 +7544,6 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>   struct buffer_head *gd_bh = NULL;
>   struct ocfs2_dinode *main_bm;
>   struct ocfs2_group_desc *gd = NULL;
> - struct ocfs2_trim_fs_info info, *pinfo = NULL;
>   
>   start = range->start >> osb->s_clustersize_bits;
>   len = range->len >> osb->s_clustersize_bits;
> @@ -7552,6 +7552,9 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>   if (minlen >= osb->bitmap_cpg || range->len < sb->s_blocksize)
>   return -EINVAL;
>   
> + trace_ocfs2_trim_mainbm(start, len, minlen);
> +
> +next_group:
>   main_bm_inode = ocfs2_get_system_file_inode(osb,
>   GLOBAL_BITMAP_SYSTEM_INODE,
>   OCFS2_INVALID_SLOT);
> @@ -7570,64 +7573,34 @@ int ocfs2_trim_fs(struct super_block *sb, struct 
> fstrim_range *range)
>   }
>   main_bm = (struct ocfs2_dinode *)main_bm_bh->b_data;
>   
> - if (start >= le32_to_cpu(main_bm->i_clusters)) {
> - ret = -EINVAL;
> - goto out_unlock;
> - }
> -
> - len = range->len >> osb->s_clustersize_bits;
> - if (start + len > le32_to_cpu(main_bm->i_clusters))
> - len = le32_to_cpu(main_bm->i_clusters) - start;
> -
> - trace_ocfs2_trim_fs(start, len, minlen);
> -
> - ocfs2_trim_fs_lock_res_init(osb);
> - ret = ocfs2_trim_fs_lock(osb, NULL, 1);
> - if (ret < 0) {
> - if (ret != -EAGAIN) {
> - mlog_errno(ret);
> - ocfs2_trim_fs_lock_res_uninit(osb);
> + /*
> +  * Do some check before trim the first group.
> +  */
> + if (!group) {
> + if (start >= le32_to_cpu(main_bm->i_clusters)) {
> + ret = -EINVAL;
>   goto out_unlock;
>   }
>   
> - mlog(ML_NOTICE, "Wait for trim on device (%s) to "
> -  "finish, which is running from another node.\n",
> -  osb->dev_str);
> - ret = ocfs2_trim_fs_lock(osb, , 0);
> - if (ret < 0) {
> - mlog_errno(ret);
> - ocfs2_trim_fs_lock_res_uninit(osb);
> - goto out_unlock;
> - }
> + if (start + len > le32_to_cpu(main_bm->i_clusters))
> + len = le32_to_cpu(main_bm->i_clusters) - start;
>   
> - if (info.tf_valid && info.tf_success &&
> - info.tf_start == start && info.tf_len == len &&
> - info.tf_minlen == minlen) {
> - /* Avoid sending duplicated trim to a shared device */
> - mlog(ML_NOTICE, "The same trim on device (%s) was "
> -  "just done from node (%u), return.\n",
> -  osb->dev_str, info.tf_nodenum);
> - range->len = info.tf_trimlen;
> - goto out_trimunlock;
> - }
> + /*
> +  * Determine first and last group to examine based on
> +  * start and len
> +  */
> +

Re: Real deadlock being suppressed in sbitmap

2019-01-14 Thread Steven Rostedt
On Tue, 15 Jan 2019 11:23:56 +0800
Ming Lei  wrote:

> Given 'swap_lock' can be acquired from blk_mq_dispatch_rq_list() via
> blk_mq_get_driver_tag() directly, the above deadlock may be possible.
> 
> Sounds the correct fix may be the following one, and the irqsave cost
> should be fine given sbitmap_deferred_clear is only triggered when one
> word is run out of.

Since the lockdep splat only showed SOFTIRQ issues, I figured I would
only protect it from that. Linus already accepted my patch, can you run
tests on that kernel with LOCKDEP enabled and see if it will trigger
with IRQ issues, then we can most definitely upgrade that to
spin_lock_irqsave(). But I was trying to keep the overhead down, as
that's a bit more heavy weight than a spin_lock_bh().

-- Steve


Re: Real deadlock being suppressed in sbitmap

2019-01-14 Thread Ming Lei
On Mon, Jan 14, 2019 at 08:41:16PM -0700, Jens Axboe wrote:
> On 1/14/19 8:23 PM, Ming Lei wrote:
> > Hi Steven,
> > 
> > On Mon, Jan 14, 2019 at 12:14:14PM -0500, Steven Rostedt wrote:
> >> It was brought to my attention (by this creating a splat in the RT tree
> >> too) this code:
> >>
> >> static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
> >> {
> >>unsigned long mask, val;
> >>unsigned long __maybe_unused flags;
> >>bool ret = false;
> >>
> >>/* Silence bogus lockdep warning */
> >> #if defined(CONFIG_LOCKDEP)
> >>local_irq_save(flags);
> >> #endif
> > >>spin_lock(&sb->map[index].swap_lock);
> >>
> >> Commit 58ab5e32e6f ("sbitmap: silence bogus lockdep IRQ warning")
> >> states the following:
> >>
> >> For this case, it's a false positive. The swap_lock is used from 
> >> process
> >> context only, when we swap the bits in the word and cleared mask. We
> >> also end up doing that when we are getting a driver tag, from the
> >> blk_mq_mark_tag_wait(), and from there we hold the waitqueue lock with
> >> IRQs disabled. However, this isn't from an actual IRQ, it's still
> >> process context.
> >>
> >> The thing is, lockdep doesn't define a lock as "irq-safe" based on it
> >> being taken under interrupts disabled or not. It detects when locks are
> >> used in actual interrupts. Further in that commit we have this:
> >>
> >>[  106.097386] fio/1043 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> >> [  106.098231] 4c43fa71
> >> (&(>map[i].swap_lock)->rlock){+.+.}, at: sbitmap_get+0xd5/0x22c
> >> [  106.099431]
> >> [  106.099431] and this task is already holding:
> >> [  106.100229] 7eec8b2f
> >> (&(>dispatch_wait_lock)->rlock){}, at:
> >> blk_mq_dispatch_rq_list+0x4c1/0xd7c
> >> [  106.101630] which would create a new lock dependency:
> >> [  106.102326]  (&(>dispatch_wait_lock)->rlock){} ->
> >> (&(>map[i].swap_lock)->rlock){+.+.}
> >>
> >> Saying that you are trying to take the swap_lock while holding the
> >> dispatch_wait_lock.
> >>
> >>
> >> [  106.103553] but this new dependency connects a SOFTIRQ-irq-safe lock:
> >> [  106.104580]  (&sbq->ws[i].wait){..-.}
> >>
> >> Which means that there's already a chain of:
> >>
> >>  sbq->ws[i].wait -> dispatch_wait_lock
> >>
> >> [  106.104582]
> >> [  106.104582] ... which became SOFTIRQ-irq-safe at:
> >> [  106.105751]   _raw_spin_lock_irqsave+0x4b/0x82
> >> [  106.106284]   __wake_up_common_lock+0x119/0x1b9
> >> [  106.106825]   sbitmap_queue_wake_up+0x33f/0x383
> >> [  106.107456]   sbitmap_queue_clear+0x4c/0x9a
> >> [  106.108046]   __blk_mq_free_request+0x188/0x1d3
> >> [  106.108581]   blk_mq_free_request+0x23b/0x26b
> >> [  106.109102]   scsi_end_request+0x345/0x5d7
> >> [  106.109587]   scsi_io_completion+0x4b5/0x8f0
> >> [  106.110099]   scsi_finish_command+0x412/0x456
> >> [  106.110615]   scsi_softirq_done+0x23f/0x29b
> >> [  106.15]   blk_done_softirq+0x2a7/0x2e6
> >> [  106.111608]   __do_softirq+0x360/0x6ad
> >> [  106.112062]   run_ksoftirqd+0x2f/0x5b
> >> [  106.112499]   smpboot_thread_fn+0x3a5/0x3db
> >> [  106.113000]   kthread+0x1d4/0x1e4
> >> [  106.113457]   ret_from_fork+0x3a/0x50
> >>
> >>
> >> We see that sbq->ws[i].wait was taken from a softirq context.
> > 
> > Actually, sbq->ws[i].wait is taken from softirq context only in the
> > single-queue case; see __blk_mq_complete_request(). With multiple queues,
> > sbq->ws[i].wait is taken from hardirq context.
> 
> That's a good point, but that's just the current implementation; we can't
> assume any of those relationships. Any completion can happen from
> softirq or hardirq. So the patch is inadequate.
> 
> > It sounds like the correct fix may be the following one, and the irqsave
> > cost should be fine given that sbitmap_deferred_clear is only triggered
> > when one word is exhausted.
> 
> Yes, the _bh() variant isn't going to cut it. Can you send this patch
> against Linus's master?

OK, I will post it soon.

Thanks,
Ming

