[PATCH] rpadlpar_io:Add MODULE_DESCRIPTION entries to kernel modules

2020-09-23 Thread Mamatha Inamdar
This patch adds a brief MODULE_DESCRIPTION to rpadlpar_io kernel modules
(descriptions taken from Kconfig file)

Signed-off-by: Mamatha Inamdar 
---
 drivers/pci/hotplug/rpadlpar_core.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/hotplug/rpadlpar_core.c 
b/drivers/pci/hotplug/rpadlpar_core.c
index f979b70..bac65ed 100644
--- a/drivers/pci/hotplug/rpadlpar_core.c
+++ b/drivers/pci/hotplug/rpadlpar_core.c
@@ -478,3 +478,4 @@ static void __exit rpadlpar_io_exit(void)
 module_init(rpadlpar_io_init);
 module_exit(rpadlpar_io_exit);
 MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("RPA Dynamic Logical Partitioning driver for I/O slots");



Re: [PATCH] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier

2020-09-23 Thread Srikar Dronamraju
* Daniel Axtens  [2020-09-24 12:35:05]:

> Hi Srikar,
> 
> 
> This looks a lot like commit d95fe371ecd2 ("cpufreq: powernv: Fix 
> frame-size-overflow in powernv_cpufreq_work_fn").
> 

Yes, very very similar.

> As with that patch, I have checked for matching puts/gets and that all
> uses of '&' check out.
> 
> I tried to look at the snowpatch tests: they apparently reported a
> checkpatch warning but the file has since disappeared so I can't see
> what it was. Running checkpatch locally:
> 
> $ scripts/checkpatch.pl -g HEAD -strict
> WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per 
> line)
> #15: 
> make[3]: *** [./scripts/Makefile.build:316: 
> drivers/cpufreq/powernv-cpufreq.o] Error 1
> 
> This is benign and you shouldn't wrap that line anyway.
> 
> On that basis:
> 
> Reviewed-by: Daniel Axtens 
> 

Thanks Daniel.

> Kind regards,
> Daniel
> 
-- 
Thanks and Regards
Srikar Dronamraju


Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed

2020-09-23 Thread Alexey Kardashevskiy



On 23/09/2020 17:06, Cédric Le Goater wrote:
> On 9/23/20 2:33 AM, Qian Cai wrote:
>> On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote:
>>> When a passthrough IO adapter is removed from a pseries machine using
>>> hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
>>> guest OS to clear all page table entries related to the adapter. If
>>> some are still present, the RTAS call which isolates the PCI slot
>>> returns error 9001 "valid outstanding translations" and the removal of
>>> the IO adapter fails. This is because when the PHBs are scanned, Linux
>>> maps automatically the INTx interrupts in the Linux interrupt number
>>> space but these are never removed.
>>>
>>> To solve this problem, we introduce a PPC platform specific
>>> pcibios_remove_bus() routine which clears all interrupt mappings when
>>> the bus is removed. This also clears the associated page table entries
>>> of the ESB pages when using XIVE.
>>>
>>> For this purpose, we record the logical interrupt numbers of the
>>> mapped interrupt under the PHB structure and let pcibios_remove_bus()
>>> do the clean up.
>>>
>>> Since some PCI adapters, like GPUs, use the "interrupt-map" property
>>> to describe interrupt mappings other than the legacy INTx interrupts,
>>> we can not restrict the size of the mapping array to PCI_NUM_INTX. The
>>> number of interrupt mappings is computed from the "interrupt-map"
>>> property and the mapping array is allocated accordingly.
>>>
>>> Cc: "Oliver O'Halloran" 
>>> Cc: Alexey Kardashevskiy 
>>> Signed-off-by: Cédric Le Goater 
>>
>> Some syscall fuzzing will trigger this on POWER9 NV where the traces pointed 
>> to
>> this patch.
>>
>> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config
> 
> OK. The patch is missing a NULL assignement after kfree() and that
> might be the issue. 
> 
> I did try PHB removal under PowerNV, so I would like to understand 
> how we managed to remove twice the PCI bus and possibly reproduce. 
> Any chance we could grab what the syscall fuzzer (syzkaller) did ? 



My guess would be it is doing this in parallel to provoke races.



-- 
Alexey


[PATCH] powerpc: switch 85xx defconfigs from legacy ide to libata

2020-09-23 Thread Christoph Hellwig
Switch the 85xx defconfigs from the soon to be removed legacy ide
driver to libata.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/configs/85xx/mpc85xx_cds_defconfig | 6 +++---
 arch/powerpc/configs/85xx/tqm8540_defconfig | 6 +++---
 arch/powerpc/configs/85xx/tqm8541_defconfig | 6 +++---
 arch/powerpc/configs/85xx/tqm8555_defconfig | 6 +++---
 arch/powerpc/configs/85xx/tqm8560_defconfig | 6 +++---
 5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/configs/85xx/mpc85xx_cds_defconfig 
b/arch/powerpc/configs/85xx/mpc85xx_cds_defconfig
index 0683d8c292a89b..cea72e85ed2614 100644
--- a/arch/powerpc/configs/85xx/mpc85xx_cds_defconfig
+++ b/arch/powerpc/configs/85xx/mpc85xx_cds_defconfig
@@ -29,9 +29,9 @@ CONFIG_SYN_COOKIES=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=32768
-CONFIG_IDE=y
-CONFIG_BLK_DEV_GENERIC=y
-CONFIG_BLK_DEV_VIA82CXXX=y
+CONFIG_ATA=y
+CONFIG_ATA_GENERIC=y
+CONFIG_PATA_VIA=y
 CONFIG_NETDEVICES=y
 CONFIG_GIANFAR=y
 CONFIG_E1000=y
diff --git a/arch/powerpc/configs/85xx/tqm8540_defconfig 
b/arch/powerpc/configs/85xx/tqm8540_defconfig
index 98982a0e82d804..bbf040aa1f9aa7 100644
--- a/arch/powerpc/configs/85xx/tqm8540_defconfig
+++ b/arch/powerpc/configs/85xx/tqm8540_defconfig
@@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=32768
-CONFIG_IDE=y
-CONFIG_BLK_DEV_GENERIC=y
-CONFIG_BLK_DEV_VIA82CXXX=y
+CONFIG_ATA=y
+CONFIG_ATA_GENERIC=y
+CONFIG_PATA_VIA=y
 CONFIG_NETDEVICES=y
 CONFIG_GIANFAR=y
 CONFIG_E100=y
diff --git a/arch/powerpc/configs/85xx/tqm8541_defconfig 
b/arch/powerpc/configs/85xx/tqm8541_defconfig
index a6e21db1dafe3a..523ad8dcfd9d08 100644
--- a/arch/powerpc/configs/85xx/tqm8541_defconfig
+++ b/arch/powerpc/configs/85xx/tqm8541_defconfig
@@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=32768
-CONFIG_IDE=y
-CONFIG_BLK_DEV_GENERIC=y
-CONFIG_BLK_DEV_VIA82CXXX=y
+CONFIG_ATA=y
+CONFIG_ATA_GENERIC=y
+CONFIG_PATA_VIA=y
 CONFIG_NETDEVICES=y
 CONFIG_GIANFAR=y
 CONFIG_E100=y
diff --git a/arch/powerpc/configs/85xx/tqm8555_defconfig 
b/arch/powerpc/configs/85xx/tqm8555_defconfig
index ca1de3979474ba..0032ce1e8c9cbf 100644
--- a/arch/powerpc/configs/85xx/tqm8555_defconfig
+++ b/arch/powerpc/configs/85xx/tqm8555_defconfig
@@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=32768
-CONFIG_IDE=y
-CONFIG_BLK_DEV_GENERIC=y
-CONFIG_BLK_DEV_VIA82CXXX=y
+CONFIG_ATA=y
+CONFIG_ATA_GENERIC=y
+CONFIG_PATA_VIA=y
 CONFIG_NETDEVICES=y
 CONFIG_GIANFAR=y
 CONFIG_E100=y
diff --git a/arch/powerpc/configs/85xx/tqm8560_defconfig 
b/arch/powerpc/configs/85xx/tqm8560_defconfig
index ca3b8c8ef30f9c..a80b971f7d6e0b 100644
--- a/arch/powerpc/configs/85xx/tqm8560_defconfig
+++ b/arch/powerpc/configs/85xx/tqm8560_defconfig
@@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=32768
-CONFIG_IDE=y
-CONFIG_BLK_DEV_GENERIC=y
-CONFIG_BLK_DEV_VIA82CXXX=y
+CONFIG_ATA=y
+CONFIG_ATA_GENERIC=y
+CONFIG_PATA_VIA=y
 CONFIG_NETDEVICES=y
 CONFIG_GIANFAR=y
 CONFIG_E100=y
-- 
2.28.0



Re: [PATCH -next v2] powerpc/perf: Fix symbol undeclared warning

2020-09-23 Thread Athira Rajeev



> On 23-Sep-2020, at 12:44 PM, Wang Wensheng  wrote:
> 
> Build kernel with `C=2`:
> arch/powerpc/perf/isa207-common.c:24:18: warning: symbol
> 'isa207_pmu_format_attr' was not declared. Should it be static?
> arch/powerpc/perf/power9-pmu.c:101:5: warning: symbol 'p9_dd21_bl_ev'
> was not declared. Should it be static?
> arch/powerpc/perf/power9-pmu.c:115:5: warning: symbol 'p9_dd22_bl_ev'
> was not declared. Should it be static?
> 
> Those symbols are used only in the files that define them so we declare
> them as static to fix the warnings.

Hi, 

Looks fine to me. 

Reviewed-by: Athira Rajeev 

Thanks
Athira
> 
> Signed-off-by: Wang Wensheng 
> ---
> arch/powerpc/perf/isa207-common.c | 2 +-
> arch/powerpc/perf/power9-pmu.c| 4 ++--
> 2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/perf/isa207-common.c 
> b/arch/powerpc/perf/isa207-common.c
> index 964437adec18..85dc860b265b 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -21,7 +21,7 @@ PMU_FORMAT_ATTR(thresh_stop,"config:32-35");
> PMU_FORMAT_ATTR(thresh_start, "config:36-39");
> PMU_FORMAT_ATTR(thresh_cmp,   "config:40-49");
> 
> -struct attribute *isa207_pmu_format_attr[] = {
> +static struct attribute *isa207_pmu_format_attr[] = {
>   _attr_event.attr,
>   _attr_pmcxsel.attr,
>   _attr_mark.attr,
> diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
> index 2a57e93a79dc..4a315fad1f99 100644
> --- a/arch/powerpc/perf/power9-pmu.c
> +++ b/arch/powerpc/perf/power9-pmu.c
> @@ -98,7 +98,7 @@ extern u64 PERF_REG_EXTENDED_MASK;
> /* PowerISA v2.07 format attribute structure*/
> extern struct attribute_group isa207_pmu_format_group;
> 
> -int p9_dd21_bl_ev[] = {
> +static int p9_dd21_bl_ev[] = {
>   PM_MRK_ST_DONE_L2,
>   PM_RADIX_PWC_L1_HIT,
>   PM_FLOP_CMPL,
> @@ -112,7 +112,7 @@ int p9_dd21_bl_ev[] = {
>   PM_DISP_HELD_SYNC_HOLD,
> };
> 
> -int p9_dd22_bl_ev[] = {
> +static int p9_dd22_bl_ev[] = {
>   PM_DTLB_MISS_16G,
>   PM_DERAT_MISS_2M,
>   PM_DTLB_MISS_2M,
> -- 
> 2.25.0
> 



Re: [PATCH V2] powerpc/perf: Exclude pmc5/6 from the irrelevant PMU group constraints

2020-09-23 Thread Athira Rajeev



> On 22-Sep-2020, at 4:16 PM, Paul A. Clarke  wrote:
> 
> Just one nit in a comment below...
> (and this is not worthy of tags like "reviewed-by" ;-)
> 
> On Mon, Sep 21, 2020 at 03:10:04AM -0400, Athira Rajeev wrote:
>> PMU counter support functions enforces event constraints for group of
>> events to check if all events in a group can be monitored. Incase of
>> event codes using PMC5 and PMC6 ( 500fa and 600f4 respectively ),
>> not all constraints are applicable, say the threshold or sample bits.
>> But current code includes pmc5 and pmc6 in some group constraints (like
>> IC_DC Qualifier bits) which is actually not applicable and hence results
>> in those events not getting counted when scheduled along with group of
>> other events. Patch fixes this by excluding PMC5/6 from constraints
>> which are not relevant for it.
>> 
>> Fixes: 7ffd948 ("powerpc/perf: factor out power8 pmu functions")
>> Signed-off-by: Athira Rajeev 
>> ---
> 
>> diff --git a/arch/powerpc/perf/isa207-common.c 
>> b/arch/powerpc/perf/isa207-common.c
>> index 964437a..12153da 100644
>> --- a/arch/powerpc/perf/isa207-common.c
>> +++ b/arch/powerpc/perf/isa207-common.c
>> @@ -288,6 +288,15 @@ int isa207_get_constraint(u64 event, unsigned long 
>> *maskp, unsigned long *valp)
>> 
>>  mask  |= CNST_PMC_MASK(pmc);
>>  value |= CNST_PMC_VAL(pmc);
>> +
>> +/*
>> + * PMC5 and PMC6 are used to count cycles and instructions
>> + * and these doesnot support most of the constraint bits.
> 
> s/doesnot/do not/

Hi Paul,

Thanks for checking the patch and sharing the comment. 

Athira
> 
>> + * Add a check to exclude PMC5/6 from most of the constraints
>> + * except for ebb/bhrb.
>> + */
>> +if (pmc >= 5)
>> +goto ebb_bhrb;
> 
> PC



Re: [PATCH] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier

2020-09-23 Thread Daniel Axtens
Hi Srikar,

> The patch avoids allocating cpufreq_policy on stack hence fixing frame
> size overflow in 'powernv_cpufreq_reboot_notifier'
>
> ./drivers/cpufreq/powernv-cpufreq.c: In function 
> _powernv_cpufreq_reboot_notifier_:
> ./drivers/cpufreq/powernv-cpufreq.c:906:1: error: the frame size of 2064 
> bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>  }
>  ^
> cc1: all warnings being treated as errors
> make[3]: *** [./scripts/Makefile.build:316: 
> drivers/cpufreq/powernv-cpufreq.o] Error 1
> make[2]: *** [./scripts/Makefile.build:556: drivers/cpufreq] Error 2
> make[1]: *** [./Makefile:1072: drivers] Error 2
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:157: sub-make] Error 2
>

This looks a lot like commit d95fe371ecd2 ("cpufreq: powernv: Fix 
frame-size-overflow in powernv_cpufreq_work_fn").

As with that patch, I have checked for matching puts/gets and that all
uses of '&' check out.

I tried to look at the snowpatch tests: they apparently reported a
checkpatch warning but the file has since disappeared so I can't see
what it was. Running checkpatch locally:

$ scripts/checkpatch.pl -g HEAD -strict
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per 
line)
#15: 
make[3]: *** [./scripts/Makefile.build:316: drivers/cpufreq/powernv-cpufreq.o] 
Error 1

This is benign and you shouldn't wrap that line anyway.

On that basis:

Reviewed-by: Daniel Axtens 

Kind regards,
Daniel

> Fixes: cf30af76 ("cpufreq: powernv: Set the cpus to nominal frequency during 
> reboot/kexec")
> Cc: Pratik Rajesh Sampat 
> Cc: Daniel Axtens 
> Cc: Michael Ellerman 
> Cc: linuxppc-dev 
> Signed-off-by: Srikar Dronamraju 
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c 
> b/drivers/cpufreq/powernv-cpufreq.c
> index a9af15e..e439b43 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -885,12 +885,15 @@ static int powernv_cpufreq_reboot_notifier(struct 
> notifier_block *nb,
>   unsigned long action, void *unused)
>  {
>   int cpu;
> - struct cpufreq_policy cpu_policy;
> + struct cpufreq_policy *cpu_policy;
>  
>   rebooting = true;
>   for_each_online_cpu(cpu) {
> - cpufreq_get_policy(_policy, cpu);
> - powernv_cpufreq_target_index(_policy, get_nominal_index());
> + cpu_policy = cpufreq_cpu_get(cpu);
> + if (!cpu_policy)
> + continue;
> + powernv_cpufreq_target_index(cpu_policy, get_nominal_index());
> + cpufreq_cpu_put(cpu_policy);
>   }
>  
>   return NOTIFY_DONE;
> -- 
> 1.8.3.1


[PATCH] powerpc: PPC_SECURE_BOOT should not require PowerNV

2020-09-23 Thread Daniel Axtens
In commit 61f879d97ce4 ("powerpc/pseries: Detect secure and trusted
boot state of the system.") we taught the kernel how to understand the
secure-boot parameters used by a pseries guest.

However, CONFIG_PPC_SECURE_BOOT still requires PowerNV. I didn't
catch this because pseries_le_defconfig includes support for
PowerNV and so everything still worked. Indeed, most configs will.
Nonetheless, technically PPC_SECURE_BOOT doesn't require PowerNV
any more.

The secure variables support (PPC_SECVAR_SYSFS) doesn't do anything
on pSeries yet, but I don't think it's worth adding a new condition -
at some stage we'll want to add a backend for pSeries anyway.

Fixes: 61f879d97ce4 ("powerpc/pseries: Detect secure and trusted boot state of 
the system.")
Cc: Nayna Jain 
Signed-off-by: Daniel Axtens 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4b33477dafb8..f645fa934853 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -983,7 +983,7 @@ config PPC_MEM_KEYS
 config PPC_SECURE_BOOT
prompt "Enable secure boot support"
bool
-   depends on PPC_POWERNV
+   depends on PPC_POWERNV || PPC_PSERIES
depends on IMA_ARCH_POLICY
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
help
-- 
2.25.1



RE: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread David Laight
From: Arnd Bergmann
> Sent: 23 September 2020 19:46
...
> Regardless of that, another advantage of having the SYSCALL_DECLAREx()
> would be the ability to include that header file from elsewhere with a 
> different
> macro definition to create a machine-readable version of the interface when
> combined with the syscall.tbl files. This could be used to create a user
> space stub for calling into the low-level syscall regardless of the
> libc interfaces,
> or for synchronizing the interfaces with strace, qemu-user, or anything that
> needs to deal with the low-level interface.

A similar 'trick' (that probably won't work here) is to pass
the name of a #define function as a parameter to another define.
Useful for defining constants and error strings together. eg:
#define TRAFFIC_LIGHTS(x) \
x(RED, 0, "Red") \
x(YELLOW, 1, "Yellow) \
x(GREEN, 2, "GREEN)

You can then do thing like:
#define x(token, value, string) token = value,
enum {TRAFFIC_LIGHTS(x) NUM_LIGHTS};
#undef x
#define x(token, value, string) [value] = string,
const char *colours[] = {TRAFFIC_LIGHTS(x)};
#undef x
to initialise constants and a name table that are always in sync.

It is also a good way to generate source lines that are over 1MB.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Steven Rostedt
On Wed, 23 Sep 2020 22:55:54 +0200
Thomas Gleixner  wrote:

> > Perhaps make migrate_disable() an anonymous local_lock()?
> >
> > This should lower the SHC in theory, if you can't have stacked migrate
> > disables on the same CPU.  
> 
> I'm pretty sure this ends up in locking hell pretty fast and aside of
> that it's not working for scenarios like:
> 
>  kmap_local();
>migrate_disable();
>...
> 
>  copy_from_user()
> -> #PF
>-> schedule()  
> 
> which brought us into that discussion in the first place. You would stop
> any other migrate disable user from running until the page fault is
> resolved...

Then scratch the idea of having anonymous local_lock() and just bring
local_lock in directly? Then have a kmap local lock, which would only
block those that need to do a kmap.

Now as for migration disabled nesting, at least now we would have
groupings of this, and perhaps the theorists can handle that. I mean,
how is this much different that having a bunch of tasks blocked on a
mutex with the owner is pinned on a CPU?

migrate_disable() is a BKL of pinning affinity. If we only have
local_lock() available (even on !RT), then it makes the blocking in
groups. At least this way you could grep for all the different
local_locks in the system and plug that into the algorithm for WCS,
just like one would with a bunch of mutexes.

-- Steve


Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-23 Thread Chris Packham

On 24/09/20 8:27 am, Heiner Kallweit wrote:
> On 04.09.2020 02:28, Chris Packham wrote:
>> The SPIE register contains counts for the TX FIFO so any time the irq
>> handler was invoked we would attempt to process the RX/TX fifos. Use the
>> SPIM value to mask the events so that we only process interrupts that
>> were expected.
>>
>> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
>> Implement soft interrupt replay in C").
>>
>> Signed-off-by: Chris Packham 
>> Cc: sta...@vger.kernel.org
>> ---
>>
>> Notes:
>>  I've tested this on a T2080RDB and a custom board using the T2081 SoC. 
>> With
>>  this change I don't see any spurious instances of the "Transfer done but
>>  SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" 
>> messages
>>  and the updates to spi flash are successful.
>>  
>>  I think this should go into the stable trees that contain 3282a3da25bd 
>> but I
>>  haven't added a Fixes: tag because I think 3282a3da25bd exposed the 
>> issue as
>>  opposed to causing it.
>>
>>   drivers/spi/spi-fsl-espi.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
>> index 7e7c92cafdbb..cb120b68c0e2 100644
>> --- a/drivers/spi/spi-fsl-espi.c
>> +++ b/drivers/spi/spi-fsl-espi.c
>> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, 
>> u32 events)
>>   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>>   {
>>  struct fsl_espi *espi = context_data;
>> -u32 events;
>> +u32 events, mask;
>>   
>>  spin_lock(>lock);
>>   
>>  /* Get interrupt events(tx/rx) */
>>  events = fsl_espi_read_reg(espi, ESPI_SPIE);
>> -if (!events) {
>> +mask = fsl_espi_read_reg(espi, ESPI_SPIM);
>> +if (!(events & mask)) {
>>  spin_unlock(>lock);
>>  return IRQ_NONE;
> Sorry, I was on vacation and therefore couldn't comment earlier.
> I'm fine with the change, just one thing could be improved IMO.
> If we skip an unneeded interrupt now, then returning IRQ_NONE
> causes reporting this interrupt as spurious. This isn't too nice
> as spurious interrupts typically are seen as a problem indicator.
> Therefore returning IRQ_HANDLED should be more appropriate.
> This would just require a comment in the code explaining why we
> do this, and why it can happen that we receive interrupts
> we're not interested in.
I'd be happy to send a follow-up to change IRQ_NONE to IRQ_HANDLED. I 
don't think the old code could have ever hit the IRQ_NONE (because event 
will always be non-zero) so it won't really be a change in behaviour. 
With the patch (that is now in spi/for-next) so far I do see a low 
number of spurious interrupts on the test setup where previously I would 
have seen failure to talk to the spi-flash.


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Thomas Gleixner
On Wed, Sep 23 2020 at 11:52, Steven Rostedt wrote:
> On Wed, 23 Sep 2020 10:40:32 +0200
> pet...@infradead.org wrote:
>
>> However, with migrate_disable() we can have each task preempted in a
>> migrate_disable() region, worse we can stack them all on the _same_ CPU
>> (super ridiculous odds, sure). And then we end up only able to run one
>> task, with the rest of the CPUs picking their nose.
>
> What if we just made migrate_disable() a local_lock() available for !RT?
>
> I mean make it a priority inheritance PER CPU lock.
>
> That is, no two tasks could do a migrate_disable() on the same CPU? If
> one task does a migrate_disable() and then gets preempted and the
> preempting task does a migrate_disable() on the same CPU, it will block
> and wait for the first task to do a migrate_enable().
>
> No two tasks on the same CPU could enter the migrate_disable() section
> simultaneously, just like no two tasks could enter a preempt_disable()
> section.
>
> In essence, we just allow local_lock() to be used for both RT and !RT.
>
> Perhaps make migrate_disable() an anonymous local_lock()?
>
> This should lower the SHC in theory, if you can't have stacked migrate
> disables on the same CPU.

I'm pretty sure this ends up in locking hell pretty fast and aside of
that it's not working for scenarios like:

 kmap_local();
   migrate_disable();
   ...

 copy_from_user()
-> #PF
   -> schedule()

which brought us into that discussion in the first place. You would stop
any other migrate disable user from running until the page fault is
resolved...

Thanks,

tglx


Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-23 Thread Heiner Kallweit
On 04.09.2020 02:28, Chris Packham wrote:
> The SPIE register contains counts for the TX FIFO so any time the irq
> handler was invoked we would attempt to process the RX/TX fifos. Use the
> SPIM value to mask the events so that we only process interrupts that
> were expected.
> 
> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
> Implement soft interrupt replay in C").
> 
> Signed-off-by: Chris Packham 
> Cc: sta...@vger.kernel.org
> ---
> 
> Notes:
> I've tested this on a T2080RDB and a custom board using the T2081 SoC. 
> With
> this change I don't see any spurious instances of the "Transfer done but
> SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" 
> messages
> and the updates to spi flash are successful.
> 
> I think this should go into the stable trees that contain 3282a3da25bd 
> but I
> haven't added a Fixes: tag because I think 3282a3da25bd exposed the issue 
> as
> opposed to causing it.
> 
>  drivers/spi/spi-fsl-espi.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
> index 7e7c92cafdbb..cb120b68c0e2 100644
> --- a/drivers/spi/spi-fsl-espi.c
> +++ b/drivers/spi/spi-fsl-espi.c
> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 
> events)
>  static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>  {
>   struct fsl_espi *espi = context_data;
> - u32 events;
> + u32 events, mask;
>  
>   spin_lock(>lock);
>  
>   /* Get interrupt events(tx/rx) */
>   events = fsl_espi_read_reg(espi, ESPI_SPIE);
> - if (!events) {
> + mask = fsl_espi_read_reg(espi, ESPI_SPIM);
> + if (!(events & mask)) {
>   spin_unlock(>lock);
>   return IRQ_NONE;

Sorry, I was on vacation and therefore couldn't comment earlier.
I'm fine with the change, just one thing could be improved IMO.
If we skip an unneeded interrupt now, then returning IRQ_NONE
causes reporting this interrupt as spurious. This isn't too nice
as spurious interrupts typically are seen as a problem indicator.
Therefore returning IRQ_HANDLED should be more appropriate.
This would just require a comment in the code explaining why we
do this, and why it can happen that we receive interrupts
we're not interested in.

>   }
> 



Re: [PATCH v6 04/11] PCI: designware-ep: Modify MSI and MSIX CAP way of finding

2020-09-23 Thread Bjorn Helgaas
s/MSIX/MSI-X/ (subject and below)

On Sat, Mar 14, 2020 at 11:30:31AM +0800, Xiaowei Bao wrote:
> Each PF of EP device should have it's own MSI or MSIX capabitily
> struct, so create a dw_pcie_ep_func struct and remove the msi_cap
> and msix_cap to this struct from dw_pcie_ep, and manage the PFs
> with a list.

s/capabitily/capability/

I know Lorenzo has already applied this, but for the future, or
in case there are other reasons to update this patch.

There are a bunch of unnecessary initializations below for future
cleanup.

> Signed-off-by: Xiaowei Bao 
> ---
> v3:
>  - This is a new patch, to fix the issue of MSI and MSIX CAP way of
>finding.
> v4:
>  - Correct some word of commit message.
> v5:
>  - No change.
> v6:
>  - Fix up the compile error.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 135 
> +---
>  drivers/pci/controller/dwc/pcie-designware.h|  18 +++-
>  2 files changed, 134 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c 
> b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 933bb89..fb915f2 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -19,6 +19,19 @@ void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
>   pci_epc_linkup(epc);
>  }
>  
> +struct dw_pcie_ep_func *
> +dw_pcie_ep_get_func_from_ep(struct dw_pcie_ep *ep, u8 func_no)
> +{
> + struct dw_pcie_ep_func *ep_func;
> +
> + list_for_each_entry(ep_func, >func_list, list) {
> + if (ep_func->func_no == func_no)
> + return ep_func;
> + }
> +
> + return NULL;
> +}
> +
>  static unsigned int dw_pcie_ep_func_select(struct dw_pcie_ep *ep, u8 func_no)
>  {
>   unsigned int func_offset = 0;
> @@ -59,6 +72,47 @@ void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum 
> pci_barno bar)
>   __dw_pcie_ep_reset_bar(pci, func_no, bar, 0);
>  }
>  
> +static u8 __dw_pcie_ep_find_next_cap(struct dw_pcie_ep *ep, u8 func_no,
> + u8 cap_ptr, u8 cap)
> +{
> + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + unsigned int func_offset = 0;

Unnecessary initialization.

> + u8 cap_id, next_cap_ptr;
> + u16 reg;
> +
> + if (!cap_ptr)
> + return 0;
> +
> + func_offset = dw_pcie_ep_func_select(ep, func_no);
> +
> + reg = dw_pcie_readw_dbi(pci, func_offset + cap_ptr);
> + cap_id = (reg & 0x00ff);
> +
> + if (cap_id > PCI_CAP_ID_MAX)
> + return 0;
> +
> + if (cap_id == cap)
> + return cap_ptr;
> +
> + next_cap_ptr = (reg & 0xff00) >> 8;
> + return __dw_pcie_ep_find_next_cap(ep, func_no, next_cap_ptr, cap);
> +}
> +
> +static u8 dw_pcie_ep_find_capability(struct dw_pcie_ep *ep, u8 func_no, u8 
> cap)
> +{
> + struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> + unsigned int func_offset = 0;

Unnecessary initialization.

> + u8 next_cap_ptr;
> + u16 reg;
> +
> + func_offset = dw_pcie_ep_func_select(ep, func_no);
> +
> + reg = dw_pcie_readw_dbi(pci, func_offset + PCI_CAPABILITY_LIST);
> + next_cap_ptr = (reg & 0x00ff);
> +
> + return __dw_pcie_ep_find_next_cap(ep, func_no, next_cap_ptr, cap);
> +}
> +
>  static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
>  struct pci_epf_header *hdr)
>  {
> @@ -246,13 +300,18 @@ static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 
> func_no)
>   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>   u32 val, reg;
>   unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> + struct dw_pcie_ep_func *ep_func;
>  
> - if (!ep->msi_cap)
> + ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> + if (!ep_func)
> + return -EINVAL;
> +
> + if (!ep_func->msi_cap)
>   return -EINVAL;
>  
>   func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> - reg = ep->msi_cap + func_offset + PCI_MSI_FLAGS;
> + reg = ep_func->msi_cap + func_offset + PCI_MSI_FLAGS;
>   val = dw_pcie_readw_dbi(pci, reg);
>   if (!(val & PCI_MSI_FLAGS_ENABLE))
>   return -EINVAL;
> @@ -268,13 +327,18 @@ static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 
> func_no, u8 interrupts)
>   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>   u32 val, reg;
>   unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> + struct dw_pcie_ep_func *ep_func;
> +
> + ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> + if (!ep_func)
> + return -EINVAL;
>  
> - if (!ep->msi_cap)
> + if (!ep_func->msi_cap)
>   return -EINVAL;
>  
>   func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> - reg = ep->msi_cap + func_offset + PCI_MSI_FLAGS;
> + reg = ep_func->msi_cap + func_offset + PCI_MSI_FLAGS;
>   val = dw_pcie_readw_dbi(pci, reg);
>   val &= ~PCI_MSI_FLAGS_QMASK;
>   val 

Re: [PATCH v6 0/3] Carry forward IMA measurement log on kexec on ARM64

2020-09-23 Thread Mimi Zohar
[Cc'ing Nayna Jain, linuxppc-dev@lists.ozlabs.org]

Hi Lakshmi,

On Tue, 2020-09-08 at 16:08 -0700, Lakshmi Ramasubramanian wrote:
> On kexec file load Integrity Measurement Architecture(IMA) subsystem
> may verify the IMA signature of the kernel and initramfs, and measure
> it. The command line parameters passed to the kernel in the kexec call
> may also be measured by IMA. A remote attestation service can verify
> the measurement through the IMA log and the TPM PCR data. This can be
> achieved only if the IMA measurement log is carried over from
> the current kernel to the next kernel across the kexec call.
> However in the current implementation the IMA measurement logs are not
> carried over on ARM64 platforms. Therefore a remote attestation service
> cannot verify the authenticity of the running kernel on ARM64 platforms
> when the kernel is updated through the kexec system call.
> 
> This patch series adds support for carrying forward the IMA measurement
> log on kexec on ARM64. powerpc already supports carrying forward
> the IMA measurement log on kexec.
> 
> This series refactors the platform independent code defined for powerpc
> such that it can be reused for ARM64 as well. A chosen node namely
> "linux,ima-kexec-buffer" is added to the DTB for ARM64 to hold
> the address and the size of the memory reserved to carry
> the IMA measurement log.
> 
> This patch series has been tested for ARM64 platform using QEMU.
> I would like help from the community for testing this change on powerpc.
> Thanks.

Getting access to and upgrading our group's Power system firmware took
time due to limited employee site access.  Confirmed, with these
patches applied, ima-evm-utils still properly verifies carrying the IMA
measurement list across kexec.

I plan on reviewing this patch set shortly.

thanks,

Mimi

> 
> This series is based on commit f4d51dffc6c0 ("Linux 5.9-rc4") in
> https://github.com/torvalds/linux "master" branch.
> 
> Changelog:
> 
> v6:
>   - Remove any existing FDT_PROP_IMA_KEXEC_BUFFER property in the device
> tree and also its corresponding memory reservation in the currently
> running kernel.
>   - Moved the function remove_ima_buffer() defined for powerpc to IMA
> and renamed the function to ima_remove_kexec_buffer(). Also, moved
> delete_fdt_mem_rsv() from powerpc to IMA.
> 
> v5:
>   - Merged get_addr_size_cells() and do_get_kexec_buffer() into a single
> function when moving the arch independent code from powerpc to IMA
>   - Reverted the change to use FDT functions in powerpc code and added
> back the original code in get_addr_size_cells() and
> do_get_kexec_buffer() for powerpc.
>   - Added fdt_add_mem_rsv() for ARM64 to reserve the memory for
> the IMA log buffer during kexec.
>   - Fixed the warning reported by kernel test bot for ARM64
> arch_ima_add_kexec_buffer() - moved this function to a new file
> namely arch/arm64/kernel/ima_kexec.c
> 
> v4:
>   - Submitting the patch series on behalf of the original author
> Prakhar Srivastava 
>   - Moved FDT_PROP_IMA_KEXEC_BUFFER ("linux,ima-kexec-buffer") to
> libfdt.h so that it can be shared by multiple platforms.
> 
> v3:
> Breakup patches further into separate patches.
>   - Refactoring non architecture specific code out of powerpc
>   - Update powerpc related code to use fdt functions
>   - Update IMA buffer read related code to use of functions
>   - Add support to store the memory information of the IMA
> measurement logs to be carried forward.
>   - Update the property strings to align with documented nodes
> https://github.com/devicetree-org/dt-schema/pull/46
> 
> v2:
>   Break patches into separate patches.
>   - Powerpc related Refactoring
>   - Updating the docuemntation for chosen node
>   - Updating arm64 to support IMA buffer pass
> 
> v1:
>   Refactoring carrying over IMA measuremnet logs over Kexec. This patch
> moves the non-architecture specific code out of powerpc and adds to
> security/ima.(Suggested by Thiago)
>   Add Documentation regarding the ima-kexec-buffer node in the chosen
> node documentation
> 
> v0:
>   Add a layer of abstraction to use the memory reserved by device tree
> for ima buffer pass.
>   Add support for ima buffer pass using reserved memory for arm64 kexec.
> Update the arch sepcific code path in kexec file load to store the
> ima buffer in the reserved memory. The same reserved memory is read
> on kexec or cold boot.
> 
> Lakshmi Ramasubramanian (3):
>   powerpc: Refactor kexec functions to move arch independent code to IMA
>   arm64: Store IMA log information in kimage used for kexec
>   arm64: Add IMA kexec buffer to DTB
> 
>  arch/arm64/Kconfig  |   1 +
>  arch/arm64/include/asm/ima.h|  18 
>  arch/arm64/include/asm/kexec.h  |   3 +
>  arch/arm64/kernel/Makefile  |   1 +
>  arch/arm64/kernel/ima_kexec.c   |  34 
>  

Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Arnd Bergmann
On Wed, Sep 23, 2020 at 9:48 PM Al Viro  wrote:
> FWIW, after playing with that for a while...  Do we really want the
> compat_sys_...() declarations to live in linux/compat.h?  Most of
> the users of that file don't want those; why not move them to
> linux/syscalls.h?

Sure, let's do that. The trend overall is to integrate the compat stuff
more closely into where the native implementation lives, so this
would just follow that trend.

I think with Christoph's latest patches, about half of them are
going away as well.

> Reason: there's a lot more users of linux/compat.h than those of
> linux/syscalls.h - it's pulled by everything in the networking stack,
> for starters...

Right, the network headers pull in almost everything else through
multiple indirect inclusions, anything we can do to reduce that
helps.

 Arnd


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 08:45:51PM +0200, Arnd Bergmann wrote:
> On Wed, Sep 23, 2020 at 6:38 PM Al Viro  wrote:
> >
> > I wonder if we should do something like
> >
> > SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >  unsigned long, vlen);
> > in syscalls.h instead, and not under that ifdef.
> >
> > Let it expand to declaration of sys_...() in generic case and, on x86, into
> > __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> > what SYSCALL_DEFINE ends up using.
> >
> > Similar macro would cover compat_sys_...() declarations.  That would
> > restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> > be terribly high - cpp would have more to chew through in syscalls.h,
> > but it shouldn't be all that costly.  Famous last words, of course...
> >
> > Does anybody see fundamental problems with that?
> 
> I've had some ideas along those lines in the past and I think it should work.
> 
> As a variation of this, the SYSCALL_DEFINEx() macros could go away
> entirely, leaving only the macro instantiations from the header to
> require that syntax. It would require first changing the remaining
> architectures to build the syscall table from C code instead of
> assembler though.
> 
> Regardless of that, another advantage of having the SYSCALL_DECLAREx()
> would be the ability to include that header file from elsewhere with a 
> different
> macro definition to create a machine-readable version of the interface when
> combined with the syscall.tbl files. This could be used to create a user
> space stub for calling into the low-level syscall regardless of the
> libc interfaces,
> or for synchronizing the interfaces with strace, qemu-user, or anything that
> needs to deal with the low-level interface.

FWIW, after playing with that for a while...  Do we really want the
compat_sys_...() declarations to live in linux/compat.h?  Most of
the users of that file don't want those; why not move them to
linux/syscalls.h?

Reason: there's a lot more users of linux/compat.h than those of
linux/syscalls.h - it's pulled by everything in the networking stack,
for starters...


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Arnd Bergmann
On Wed, Sep 23, 2020 at 6:38 PM Al Viro  wrote:
>
> I wonder if we should do something like
>
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>  unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
>
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
>
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
>
> Does anybody see fundamental problems with that?

I've had some ideas along those lines in the past and I think it should work.

As a variation of this, the SYSCALL_DEFINEx() macros could go away
entirely, leaving only the macro instantiations from the header to
require that syntax. It would require first changing the remaining
architectures to build the syscall table from C code instead of
assembler though.

Regardless of that, another advantage of having the SYSCALL_DECLAREx()
would be the ability to include that header file from elsewhere with a different
macro definition to create a machine-readable version of the interface when
combined with the syscall.tbl files. This could be used to create a user
space stub for calling into the low-level syscall regardless of the
libc interfaces,
or for synchronizing the interfaces with strace, qemu-user, or anything that
needs to deal with the low-level interface.

  Arnd


Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Christophe Leroy




Le 23/09/2020 à 18:08, Wolfram Sang a écrit :

On Wed, Sep 23, 2020 at 04:08:40PM +0200, nico.vi...@gmail.com wrote:

From: Nicolas VINCENT 

the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.

Signed-off-by: Nicolas VINCENT 


Thanks!

Is someone able to identify a Fixes: tag I could add?


I'd suggest

Fixes: 61045dbe9d8d ("i2c: Add support for I2C bus on Freescale 
CPM1/CPM2 controllers")


Christophe




---
  drivers/i2c/busses/i2c-cpm.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..24d584a1c9a7 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -65,6 +65,9 @@ struct i2c_ram {
charres1[4];/* Reserved */
ushort  rpbase; /* Relocation pointer */
charres2[2];/* Reserved */
+   /* The following elements are only for CPM2 */
+   charres3[4];/* Reserved */
+   uintsdmatmp;/* Internal */
  };
  
  #define I2COM_START	0x80

--
2.17.1



Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Christoph Hellwig
On Wed, Sep 23, 2020 at 06:05:27PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 05:38:31PM +0100, Al Viro wrote:
> > On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
> > 
> > > > That's a very good question.  But it does not just compile but actually
> > > > works.  Probably because all the syscall wrappers mean that we don't
> > > > actually generate the normal names.  I just tried this:
> > > > 
> > > > --- a/include/linux/syscalls.h
> > > > +++ b/include/linux/syscalls.h
> > > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t 
> > > > offset,
> > > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t 
> > > > count);
> > > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > > > size_t count);
> > > > -asmlinkage long sys_readv(unsigned long fd,
> > > > +asmlinkage long sys_readv(void *fd,
> > > > 
> > > > for fun, and the compiler doesn't care either..
> > > 
> > > Try to build it for sparc or ppc...
> > 
> > FWIW, declarations in syscalls.h used to serve 4 purposes:
> > 1) syscall table initializers needed symbols declared
> > 2) direct calls needed the same
> > 3) catching mismatches between the declarations and definitions
> > 4) centralized list of all syscalls
> > 
> > (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> > used for the remaining ones.
> > 
> > (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> > and s390.  On those 3 (1) is done otherwise (near the syscall table 
> > initializer)
> > and (3) is not done at all.
> > 
> > I wonder if we should do something like
> > 
> > SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >  unsigned long, vlen);
> > in syscalls.h instead, and not under that ifdef.
> > 
> > Let it expand to declaration of sys_...() in generic case and, on x86, into
> > __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> > what SYSCALL_DEFINE ends up using.
> > 
> > Similar macro would cover compat_sys_...() declarations.  That would
> > restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> > be terribly high - cpp would have more to chew through in syscalls.h,
> > but it shouldn't be all that costly.  Famous last words, of course...
> > 
> > Does anybody see fundamental problems with that?
> 
> Just to make it clear - I do not propose to fold that into this series;
> there we just need to keep those declarations in sync with fs/read_write.c

Agreed.  The above idea generally sounds sane to me.


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Brian Gerst
On Wed, Sep 23, 2020 at 12:39 PM Al Viro  wrote:
>
> On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
>
> > > That's a very good question.  But it does not just compile but actually
> > > works.  Probably because all the syscall wrappers mean that we don't
> > > actually generate the normal names.  I just tried this:
> > >
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t 
> > > offset,
> > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t 
> > > count);
> > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > > size_t count);
> > > -asmlinkage long sys_readv(unsigned long fd,
> > > +asmlinkage long sys_readv(void *fd,
> > >
> > > for fun, and the compiler doesn't care either..
> >
> > Try to build it for sparc or ppc...
>
> FWIW, declarations in syscalls.h used to serve 4 purposes:
> 1) syscall table initializers needed symbols declared
> 2) direct calls needed the same
> 3) catching mismatches between the declarations and definitions
> 4) centralized list of all syscalls
>
> (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> used for the remaining ones.
>
> (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> and s390.  On those 3 (1) is done otherwise (near the syscall table 
> initializer)
> and (3) is not done at all.
>
> I wonder if we should do something like
>
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>  unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
>
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
>
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
>
> Does anybody see fundamental problems with that?

I think this would be a good idea.  I have been working on a patchset
to clean up the conditional syscall handling (sys_ni.c), and conflicts
with the prototypes in syscalls.h have been getting in the way.
Having the prototypes use SYSCALL_DECLAREx(...) would solve that
issue.

--
Brian Gerst


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 05:38:31PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
> 
> > > That's a very good question.  But it does not just compile but actually
> > > works.  Probably because all the syscall wrappers mean that we don't
> > > actually generate the normal names.  I just tried this:
> > > 
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t 
> > > offset,
> > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t 
> > > count);
> > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > > size_t count);
> > > -asmlinkage long sys_readv(unsigned long fd,
> > > +asmlinkage long sys_readv(void *fd,
> > > 
> > > for fun, and the compiler doesn't care either..
> > 
> > Try to build it for sparc or ppc...
> 
> FWIW, declarations in syscalls.h used to serve 4 purposes:
>   1) syscall table initializers needed symbols declared
>   2) direct calls needed the same
>   3) catching mismatches between the declarations and definitions
>   4) centralized list of all syscalls
> 
> (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> used for the remaining ones.
> 
> (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> and s390.  On those 3 (1) is done otherwise (near the syscall table 
> initializer)
> and (3) is not done at all.
> 
> I wonder if we should do something like
> 
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
> 
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
> 
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
> 
> Does anybody see fundamental problems with that?

Just to make it clear - I do not propose to fold that into this series;
there we just need to keep those declarations in sync with fs/read_write.c


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:

> > That's a very good question.  But it does not just compile but actually
> > works.  Probably because all the syscall wrappers mean that we don't
> > actually generate the normal names.  I just tried this:
> > 
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
> >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
> >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > size_t count);
> > -asmlinkage long sys_readv(unsigned long fd,
> > +asmlinkage long sys_readv(void *fd,
> > 
> > for fun, and the compiler doesn't care either..
> 
> Try to build it for sparc or ppc...

FWIW, declarations in syscalls.h used to serve 4 purposes:
1) syscall table initializers needed symbols declared
2) direct calls needed the same
3) catching mismatches between the declarations and definitions
4) centralized list of all syscalls

(2) has been (thankfully) reduced for some time; in any case, ksys_... is
used for the remaining ones.

(1) and (3) are served by syscalls.h in architectures other than x86, arm64
and s390.  On those 3 (1) is done otherwise (near the syscall table initializer)
and (3) is not done at all.

I wonder if we should do something like

SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
 unsigned long, vlen);
in syscalls.h instead, and not under that ifdef.

Let it expand to declaration of sys_...() in generic case and, on x86, into
__do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
what SYSCALL_DEFINE ends up using.

Similar macro would cover compat_sys_...() declarations.  That would
restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
be terribly high - cpp would have more to chew through in syscalls.h,
but it shouldn't be all that costly.  Famous last words, of course...

Does anybody see fundamental problems with that?


Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Wolfram Sang
On Wed, Sep 23, 2020 at 04:08:40PM +0200, nico.vi...@gmail.com wrote:
> From: Nicolas VINCENT 
> 
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
> 
> Signed-off-by: Nicolas VINCENT 

Thanks!

Is someone able to identify a Fixes: tag I could add?

> ---
>  drivers/i2c/busses/i2c-cpm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..24d584a1c9a7 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -65,6 +65,9 @@ struct i2c_ram {
>   charres1[4];/* Reserved */
>   ushort  rpbase; /* Relocation pointer */
>   charres2[2];/* Reserved */
> + /* The following elements are only for CPM2 */
> + charres3[4];/* Reserved */
> + uintsdmatmp;/* Internal */
>  };
>  
>  #define I2COM_START  0x80
> -- 
> 2.17.1
> 


signature.asc
Description: PGP signature


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Steven Rostedt
On Wed, 23 Sep 2020 10:40:32 +0200
pet...@infradead.org wrote:

> However, with migrate_disable() we can have each task preempted in a
> migrate_disable() region, worse we can stack them all on the _same_ CPU
> (super ridiculous odds, sure). And then we end up only able to run one
> task, with the rest of the CPUs picking their nose.

What if we just made migrate_disable() a local_lock() available for !RT?

I mean make it a priority inheritance PER CPU lock.

That is, no two tasks could do a migrate_disable() on the same CPU? If
one task does a migrate_disable() and then gets preempted and the
preempting task does a migrate_disable() on the same CPU, it will block
and wait for the first task to do a migrate_enable().

No two tasks on the same CPU could enter the migrate_disable() section
simultaneously, just like no two tasks could enter a preempt_disable()
section.

In essence, we just allow local_lock() to be used for both RT and !RT.

Perhaps make migrate_disable() an anonymous local_lock()?

This should lower the SHC in theory, if you can't have stacked migrate
disables on the same CPU.

-- Steve


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 04:32:51PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 23, 2020 at 03:25:49PM +0100, Al Viro wrote:
> > On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
> > >  COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> > > - const struct compat_iovec __user *,vec,
> > > + const struct iovec __user *, vec,
> > 
> > Um...  Will it even compile?
> > 
> > >  #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
> > >  COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> > > - const struct compat_iovec __user *,vec,
> > > + const struct iovec __user *, vec,
> > 
> > Ditto.  Look into include/linux/compat.h and you'll see
> > 
> > asmlinkage long compat_sys_preadv64(unsigned long fd,
> > const struct compat_iovec __user *vec,
> > unsigned long vlen, loff_t pos);
> > 
> > How does that manage to avoid the compiler screaming bloody
> > murder?
> 
> That's a very good question.  But it does not just compile but actually
> works.  Probably because all the syscall wrappers mean that we don't
> actually generate the normal names.  I just tried this:
> 
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
>  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
>  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> size_t count);
> -asmlinkage long sys_readv(unsigned long fd,
> +asmlinkage long sys_readv(void *fd,
> 
> for fun, and the compiler doesn't care either..

Try to build it for sparc or ppc...


Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 02:38:24PM +, David Laight wrote:
> From: Al Viro
> > Sent: 23 September 2020 15:17
> > 
> > On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> > 
> > > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > + unsigned long nr_segs, unsigned long fast_segs,
> > 
> > Hmm...  For fast_segs unsigned long had always been ridiculous
> > (4G struct iovec on caller stack frame?), but that got me wondering about
> > nr_segs and I wish I'd thought of that when introducing import_iovec().
> > 
> > The thing is, import_iovec() takes unsigned int there.  Which is fine
> > (hell, the maximal value that can be accepted in 1024), except that
> > we do pass unsigned long syscall argument to it in some places.
> 
> It will make diddly-squit difference.
> The parameters end up in registers on most calling conventions.
> Plausibly you get an extra 'REX' byte on x86 for the 64bit value.
> What you want to avoid is explicit sign/zero extension and value
> masking after arithmetic.

Don't tell me what I want; your telepathic abilities are consistently sucky.

I am *NOT* talking about microoptimization here.  I have described
the behaviour change of syscall caused by commit 5 years ago.  Which is
generally considered a problem.  Then I asked whether that behaviour
change would fall under the "if nobody noticed, it's not a userland ABI
breakage" exception.

Could you show me the point where I have expressed concerns about
the quality of amd64 code generated for that thing, before or after
the change in question?


Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 03:16:54PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> 
> > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > +   unsigned long nr_segs, unsigned long fast_segs,
> 
> Hmm...  For fast_segs unsigned long had always been ridiculous
> (4G struct iovec on caller stack frame?), but that got me wondering about
> nr_segs and I wish I'd thought of that when introducing import_iovec().
> 
> The thing is, import_iovec() takes unsigned int there.  Which is fine
> (hell, the maximal value that can be accepted in 1024), except that
> we do pass unsigned long syscall argument to it in some places.
> 
> E.g. vfs_readv() quietly truncates vlen to 32 bits, and vlen can
> come unchanged through sys_readv() -> do_readv() -> vfs_readv().
> With unsigned long passed by syscall glue.
> 
> AFAICS, passing 4G+1 as the third argument to readv(2) on 64bit box
> will be quietly treated as 1 these days.  Which would be fine, except
> that before "switch {compat_,}do_readv_writev() to {compat_,}import_iovec()"
> it used to fail with -EINVAL.
> 
> Userland, BTW, describes readv(2) iovcnt as int; process_vm_readv(),
> OTOH, has these counts unsigned long from the userland POV...
> 
> I suppose we ought to switch import_iovec() to unsigned long for nr_segs ;-/
> Strictly speaking that had been a userland ABI change, even though nothing
> except regression tests checking for expected errors would've been likely
> to notice.  And it looks like no regression tests covered that one...
> 
> Linus, does that qualify for your "if no userland has noticed the change,
> it's not a breakage"?

Egads...  We have sys_readv() with unsigned long for file descriptor, since
1.3.31 when it had been introduced.  And originally it did comparison with
NR_OPEN right in sys_readv().  Then in 2.1.60 it had been switched to
fget(), which used to take unsigned long at that point.  And in 2.1.90pre1
it went unsigned int, so non-zero upper 32 bits in readv(2) first argument
ceased to cause EBADF...

Of course, libc had it as int fd all along.


RE: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec

2020-09-23 Thread David Laight
From: Al Viro
> Sent: 23 September 2020 15:17
> 
> On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> 
> > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > +   unsigned long nr_segs, unsigned long fast_segs,
> 
> Hmm...  For fast_segs unsigned long had always been ridiculous
> (4G struct iovec on caller stack frame?), but that got me wondering about
> nr_segs and I wish I'd thought of that when introducing import_iovec().
> 
> The thing is, import_iovec() takes unsigned int there.  Which is fine
> (hell, the maximal value that can be accepted in 1024), except that
> we do pass unsigned long syscall argument to it in some places.

It will make diddly-squit difference.
The parameters end up in registers on most calling conventions.
Plausibly you get an extra 'REX' byte on x86 for the 64bit value.
What you want to avoid is explicit sign/zero extension and value
masking after arithmetic.

On x86-64 the 'horrid' type is actually 'signed int'.
It often needs sign extending to 64bits (eg when being
used as an array subscript).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Thomas Gleixner
On Mon, Sep 21 2020 at 21:27, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 09:24, Linus Torvalds wrote:
>> On Mon, Sep 21, 2020 at 12:39 AM Thomas Gleixner  wrote:
>> Maybe we really *could* call this new kmap functionality something
>> like "kmap_percpu()" (or maybe "local" is good enough), and make it
>> act like your RT code does for spinlocks - not disable preemption, but
>> only disabling CPU migration.
>
> I"m all for it, but the scheduler people have opinions :)

I just took the latest version of migrate disable patches

  https://lore.kernel.org/r/20200921163557.234036...@infradead.org

removed the RT dependency on top of them and adopted the kmap stuff
(addressing the various comments while at it and unbreaking ARM).

I'm not going to post that until there is consensus about the general
availability of migrate disable, but for those who want to play with it
I've pushed the resulting combo out to:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git highmem 

For testing I've tweaked a few places to use the new _local() variants
and it survived testing so far and I've verified that there is actual
preemption which means zap/restore of the thread local kmaps.

Thanks,

tglx


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Christoph Hellwig
On Wed, Sep 23, 2020 at 03:25:49PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
> >  COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> > -   const struct compat_iovec __user *,vec,
> > +   const struct iovec __user *, vec,
> 
> Um...  Will it even compile?
> 
> >  #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
> >  COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> > -   const struct compat_iovec __user *,vec,
> > +   const struct iovec __user *, vec,
> 
> Ditto.  Look into include/linux/compat.h and you'll see
> 
> asmlinkage long compat_sys_preadv64(unsigned long fd,
> const struct compat_iovec __user *vec,
> unsigned long vlen, loff_t pos);
> 
> How does that manage to avoid the compiler screaming bloody
> murder?

That's a very good question.  But it does not just compile but actually
works.  Probably because all the syscall wrappers mean that we don't
actually generate the normal names.  I just tried this:

--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
 asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
 asmlinkage long sys_write(unsigned int fd, const char __user *buf,
size_t count);
-asmlinkage long sys_readv(unsigned long fd,
+asmlinkage long sys_readv(void *fd,

for fun, and the compiler doesn't care either..


Re: [PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
>  COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> - const struct compat_iovec __user *,vec,
> + const struct iovec __user *, vec,

Um...  Will it even compile?

>  #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
>  COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> - const struct compat_iovec __user *,vec,
> + const struct iovec __user *, vec,

Ditto.  Look into include/linux/compat.h and you'll see

asmlinkage long compat_sys_preadv64(unsigned long fd,
const struct compat_iovec __user *vec,
unsigned long vlen, loff_t pos);

How does that manage to avoid the compiler screaming bloody
murder?


Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:

> +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> + unsigned long nr_segs, unsigned long fast_segs,

Hmm...  For fast_segs unsigned long had always been ridiculous
(4G struct iovec on caller stack frame?), but that got me wondering about
nr_segs and I wish I'd thought of that when introducing import_iovec().

The thing is, import_iovec() takes unsigned int there.  Which is fine
(hell, the maximal value that can be accepted in 1024), except that
we do pass unsigned long syscall argument to it in some places.

E.g. vfs_readv() quietly truncates vlen to 32 bits, and vlen can
come unchanged through sys_readv() -> do_readv() -> vfs_readv().
With unsigned long passed by syscall glue.

AFAICS, passing 4G+1 as the third argument to readv(2) on 64bit box
will be quietly treated as 1 these days.  Which would be fine, except
that before "switch {compat_,}do_readv_writev() to {compat_,}import_iovec()"
it used to fail with -EINVAL.

Userland, BTW, describes readv(2) iovcnt as int; process_vm_readv(),
OTOH, has these counts unsigned long from the userland POV...

I suppose we ought to switch import_iovec() to unsigned long for nr_segs ;-/
Strictly speaking that had been a userland ABI change, even though nothing
except regression tests checking for expected errors would've been likely
to notice.  And it looks like no regression tests covered that one...

Linus, does that qualify for your "if no userland has noticed the change,
it's not a breakage"?


Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Jochen Friedrich

Acked-by: Jochen Friedrich 

Am 23.09.2020 um 16:08 schrieb nico.vi...@gmail.com:

From: Nicolas VINCENT 

the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.

Signed-off-by: Nicolas VINCENT 
---
  drivers/i2c/busses/i2c-cpm.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..24d584a1c9a7 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -65,6 +65,9 @@ struct i2c_ram {
charres1[4];/* Reserved */
ushort  rpbase; /* Relocation pointer */
charres2[2];/* Reserved */
+   /* The following elements are only for CPM2 */
+   charres3[4];/* Reserved */
+   uintsdmatmp;/* Internal */
  };
  
  #define I2COM_START	0x80




Re: [PATCH kernel] powerpc/dma: Fix dma_map_ops::get_required_mask

2020-09-23 Thread Christoph Hellwig
On Tue, Sep 22, 2020 at 12:26:18PM +1000, Alexey Kardashevskiy wrote:
> > Well, the original intent of dma_get_required_mask is to return the
> > mask that the driver then uses to figure out what to set, so what aacraid
> > does fits that use case. 
> 
> What was the original intent exactly? The driver asks for the minimum or
> maximum DMA mask the platform supports?
> 
> As for now, we (ppc64/powernv) can do:
> 1. bypass (==64bit)
> 2. a DMA window which used to be limited by 2GB but not anymore.
> 
> I can understand if the driver asked for required mask in expectation to
> receive "less or equal than 32bit" and "more than 32 bit" and choose.
> And this probably was the intent as at the time when the bug was
> introduced, the window was always smaller than 4GB.
> 
> But today the window is bigger than than (44 bits now, or a similar
> value, depends on max page order) so the returned mask is >32. Which
> still enables that DAC in aacraid but I suspect this is accidental.

I think for powernv returning 64-bit always would make a lot of sense.
AFAIK all of powernv is PCIe and not legacy PCI, so returning anything
less isn't going to help to optimize anything.

> > Of course that idea is pretty bogus for
> > PCIe devices.
> 
> Why? From the PHB side, there are windows. From the device side, there
> are many crippled devices, like, no GPU I saw in last years supported
> more than 48bit.

Yes, but dma_get_required_mask is misnamed - the mask is not required,
it is the optimal mask.  Even if the window is smaller we handle it
some way, usually by using swiotlb, or by iommu tricks in your case.

> > I suspect the right fix is to just not query dma_get_required_mask for
> > PCIe devices in aacraid (and other drivers that do something similar).
> 
> May be, if you write nice and big comment next to
> dma_get_required_mask() explaining exactly what it does, then I will
> realize I am getting this all wrong and we will move to fixing the
> drivers :)

Yes, it needs a comment or two, and probaby be renamed to
dma_get_optimal_dma_mask, and a cleanup of most users.  I've added it
to my ever growing TODO list, but I would not be unhappy if someone
else gives it a spin.


[PATCH v2] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread nico . vince
From: Nicolas VINCENT 

the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.

Signed-off-by: Nicolas VINCENT 
---
 drivers/i2c/busses/i2c-cpm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..24d584a1c9a7 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -65,6 +65,9 @@ struct i2c_ram {
charres1[4];/* Reserved */
ushort  rpbase; /* Relocation pointer */
charres2[2];/* Reserved */
+   /* The following elements are only for CPM2 */
+   charres3[4];/* Reserved */
+   uintsdmatmp;/* Internal */
 };
 
 #define I2COM_START0x80
-- 
2.17.1



Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Thomas Gleixner
On Wed, Sep 23 2020 at 10:40, peterz wrote:
> Right, so I'm concerned. migrate_disable() wrecks pretty much all
> Real-Time scheduler theory we have, and PREEMPRT_RT bringing it in is
> somewhat ironic.

It's even more ironic that the approach of PREEMPT_RT has been
'pragmatic ignorance of theory' from the very beginning and despite
violating all theories it still works. :)

> Yes, it allows breaking up non-preemptible regions of non-deterministic
> duration, and thereby both reduce and bound the scheduling latency, the
> cost for doing that is that the theory on CPU utilization/bandwidth go
> out the window.

I agree, that the theory goes out of the window, but does it matter in
practice? I've yet to see a report of migrate disable stacking being the
culprit of a missed deadline and I surely have stared at lots of reports
in the past 10+ years.

> To easily see this consider an SMP system with a number of tasks equal
> to the number of CPUs. On a regular (preempt_disable) kernel we can
> always run each task, by virtue of always having an idle CPU to take the
> task.
>
> However, with migrate_disable() we can have each task preempted in a
> migrate_disable() region, worse we can stack them all on the _same_ CPU
> (super ridiculous odds, sure). And then we end up only able to run one
> task, with the rest of the CPUs picking their nose.
>
> The end result is that, like with unbounded latency, we will still miss
> our deadline, simply because we got starved for CPU.

I'm well aware of that.

> Now, while we could (with a _lot_ of work) rework the kernel to not rely
> on the implicit per-cpu ness of things like spinlock_t, the moment we
> bring in basic primitives that rely on migrate_disable() we're stuck
> with it.

Right, but we are stuck with per CPUness and distangling that is just
infeasible IMO.

> The problem is; afaict there's been no research into this problem.

There is no research on a lot of things the kernel does :)

> There might be scheduling (read: balancing) schemes that can
> mitigate/solve this problem, or it might prove to be a 'hard' problem,
> I just don't know.

In practive balancing surely can take the number of preempted tasks
which are in a migrate disable section into account which would be just
another measure to work around the fact that the kernel is not adhering
to the theories. It never did that even w/o migrate disable.

> But once we start down this road, it's going to be hell to get rid of
> it.

Like most of the other things the kernel came up with to deal with the
oddities of modern hardware :)

> That's why I've been arguing (for many years) to strictly limit this to
> PREEMPT_RT and only as a gap-stop, not a fundamental primitive to build
> on.

I know, but short of rewriting the world, I'm not seing the faintest
plan to remove the stop gap. :)

As we discussed not long ago we have too many inconsistent preemption
models already. RT is adding yet another one. And that's worse than
introducing migrate disable as a general available facility.

IMO, reaching a point of consistency where our different preemption
models (NONE, VOLUNTARY, PREEMPT. RT) build on each other is far more
important.

For !RT migrate disable is far less of an danger than for RT kernels
because the amount of code which will use it is rather limited compared
to code which still will disable preemption implicit through spin and rw
locks.

On RT converting these locks to 'sleepable spinlocks' is just possible
because RT forces almost everything into task context and migrate
disable is just the obvious decomposition of preempt disable which
implicitely disables migration.

But that means that RT is by orders of magnitude more prone to run into
the scheduling trainwreck you are worried about. It just refuses to do
so at least with real world work loads.

I'm surely in favour of having solid theories behind implementation, but
at some point you just have to bite the bullet and chose pragmatism in
order to make progress.

Proliferating inconsistency is not real progress, as it is violating the
most fundamental engineering principles. That's by far more dangerous
than violating scheduling theories which are built on perfect models and
therefore enforce violation by practical implementations anyway.

Thanks,

tglx


Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

2020-09-23 Thread Al Viro
On Wed, Sep 23, 2020 at 11:01:34AM +0300, Pavel Begunkov wrote:

> > I'm not following why that would be considered a valid option,
> > as that clearly breaks existing users that update from a 32-bit
> > kernel to a 64-bit one.
> 
> Do you mean users who move 32-bit binaries (without recompiling) to a
> new x64 kernel? Does the kernel guarantees that to work?

Yes.

No further (printable) comments for now...


Re: [PATCH] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Christophe Leroy




Le 23/09/2020 à 14:55, Vincent Nicolas a écrit :





From: Christophe Leroy 
Sent: Wednesday, 23 September 2020 10:01
To: Vincent Nicolas; joc...@scram.de
Cc: linuxppc-dev@lists.ozlabs.org; linux-...@vger.kernel.org
Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure



Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :




From: Christophe Leroy 
Sent: Tuesday, 22 September 2020 14:38
To: Vincent Nicolas ; joc...@scram.de 

Cc: linuxppc-dev@lists.ozlabs.org ; 
linux-...@vger.kernel.org 
Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure



Le 22/09/2020 à 11:04, nico.vi...@gmail.com a écrit :

From: Nicolas VINCENT 

the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.

Signed-off-by: Nicolas VINCENT 
---
 drivers/i2c/busses/i2c-cpm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..c5700addbf65 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -64,7 +64,8 @@ struct i2c_ram {
  uinttxtmp;  /* Internal */
  charres1[4];/* Reserved */
  ushort  rpbase; /* Relocation pointer */
- charres2[2];/* Reserved */
+ charres2[6];/* Reserved */
+ uintsdmatmp;/* Internal */


On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)

Your change overlaps the miscellaneous area that contains CP Microcode
Revision Number, ref MPC885 Reference Manual §18.7.3

As far as I understand the mpc885 contains in the dts the 
compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the 
address of the i2c_ram structure (cpm1), or dynamically allocate it with 
cpm_muram_alloc (cpm2).
In the first case the structure will indeed overlaps with the miscellaneous 
section but since the sdmatmp is only used by cpm2 hardware it shall not be an 
issue.

Please, let me know if I am mistaken. If the patch cannot be accepted as is, I 
would gladly accept pointers on how to address this kind of issue.



Please use a mail client that properly sets the > in front of
original/answered text. Here your mailer has mixed you text and mine,
that's unusable on the long term.


I changed my configuration, please let me know if there are still issues


Yes it is OK now.

Christophe



Re: [PATCH] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Vincent Nicolas


> 
> From: Christophe Leroy 
> Sent: Wednesday, 23 September 2020 10:01
> To: Vincent Nicolas; joc...@scram.de
> Cc: linuxppc-dev@lists.ozlabs.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>
>
>
> Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :
>>
>>
>>
>> From: Christophe Leroy 
>> Sent: Tuesday, 22 September 2020 14:38
>> To: Vincent Nicolas ; joc...@scram.de 
>> 
>> Cc: linuxppc-dev@lists.ozlabs.org ; 
>> linux-...@vger.kernel.org 
>> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>>
>>
>>
>> Le 22/09/2020 à 11:04, nico.vi...@gmail.com a écrit :
>>> From: Nicolas VINCENT 
>>>
>>> the i2c_ram structure is missing the sdmatmp field mentionned in
>>> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
>>> hardware would write past the allocated memory done through
>>> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
>>> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
>>> field is only set during setup(), the first i2c transaction would work
>>> and the following would send data read from an arbitrary memory
>>> location.
>>>
>>> Signed-off-by: Nicolas VINCENT 
>>> ---
>>> drivers/i2c/busses/i2c-cpm.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
>>> index 1213e1932ccb..c5700addbf65 100644
>>> --- a/drivers/i2c/busses/i2c-cpm.c
>>> +++ b/drivers/i2c/busses/i2c-cpm.c
>>> @@ -64,7 +64,8 @@ struct i2c_ram {
>>>  uinttxtmp;  /* Internal */
>>>  charres1[4];/* Reserved */
>>>  ushort  rpbase; /* Relocation pointer */
>>> - charres2[2];/* Reserved */
>>> + charres2[6];/* Reserved */
>>> + uintsdmatmp;/* Internal */
>>
>> On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)
>>
>> Your change overlaps the miscellaneous area that contains CP Microcode
>> Revision Number, ref MPC885 Reference Manual §18.7.3
>>
>> As far as I understand the mpc885 contains in the dts the 
>> compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the 
>> address of the i2c_ram structure (cpm1), or dynamically allocate it with 
>> cpm_muram_alloc (cpm2).
>> In the first case the structure will indeed overlaps with the miscellaneous 
>> section but since the sdmatmp is only used by cpm2 hardware it shall not be 
>> an issue.
>>
>> Please, let me know if I am mistaken. If the patch cannot be accepted as is, 
>> I would gladly accept pointers on how to address this kind of issue.
>
>
> Please use a mail client that properly sets the > in front of
> original/answered text. Here your mailer has mixed you text and mine,
> that's unusable on the long term.

I changed my configuration, please let me know if there are still issues

>
>
> I think you are right on the fact that it doesn't seem to be an issue.
> Nevertheless, that's confusing.
>
> What I would suggest is to leave res2[2] as is, and add something like:
>
> /* The following elements are only for CPM2 */
> char res3[4];   /* Reserved */
> uint sdmatmp;   /* Internal */

I'll repost the patch like this then.

>
>
> Other solution (not sure that's the best solution thought) would be to
> do as in spi-fsl-cpm: use iic_t structure from asm/cpm1.h when
> CONFIG_CPM1 is selected and use iic_t from asm/cpm2.h when CONFIG_CPM2
> is selected, taking into account that CONFIG_CPM1 and CONFIG_CPM2 are
> mutually exclusive at the time being.

Unless someone argues for this solution I will go with the first one.

Thank again for your time and quick responses.

Nicolas.

Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread Thomas Gleixner
On Wed, Sep 23 2020 at 12:19, peterz wrote:
> On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
>> Alternatively this could of course be solved with per CPU page tables
>> which will come around some day anyway I fear.
>
> Previously (with PTI) we looked at making the entire kernel map per-CPU,
> and that takes a 2K copy on switch_mm() (or more general, the user part
> of whatever the top level directory is for architectures that have a
> shared kernel/user page-table setup in the first place).
>
> The idea was having a fixed per-cpu kernel page-table, share a bunch of
> (kernel) page-tables between all CPUs and then copy in the user part on
> switch.
>
> I've forgotten what the plan was for ASID/PCID in that scheme.
>
> For x86_64 we've been fearing the performance of that 2k copy, but I
> don't think we've ever actually bit the bullet and implemented it to see
> how bad it really is.

I actually did at some point and depending on the workload the overhead
was clearly measurable. And yes, it fell apart with PCID and I could not
come up with a scheme for it which did not suck horribly. So I burried
the patches in the poison cabinet.

Aside of that, we'd need to implement that for a eight other
architectures as well...

Thanks,

tglx



Re: [PATCH] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Vincent Nicolas



From: Christophe Leroy 
Sent: Tuesday, 22 September 2020 14:38
To: Vincent Nicolas ; joc...@scram.de 

Cc: linuxppc-dev@lists.ozlabs.org ; 
linux-...@vger.kernel.org 
Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure 
 


Le 22/09/2020 à 11:04, nico.vi...@gmail.com a écrit :
> From: Nicolas VINCENT 
> 
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
> 
> Signed-off-by: Nicolas VINCENT 
> ---
>   drivers/i2c/busses/i2c-cpm.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..c5700addbf65 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -64,7 +64,8 @@ struct i2c_ram {
>    uint    txtmp;  /* Internal */
>    char    res1[4];    /* Reserved */
>    ushort  rpbase; /* Relocation pointer */
> - char    res2[2];    /* Reserved */
> + char    res2[6];    /* Reserved */
> + uint    sdmatmp;    /* Internal */

On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)

Your change overlaps the miscellaneous area that contains CP Microcode 
Revision Number, ref MPC885 Reference Manual §18.7.3

As far as I understand the mpc885 contains in the dts the 
compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the 
address of the i2c_ram structure (cpm1), or dynamically allocate it with 
cpm_muram_alloc (cpm2).
In the first case the structure will indeed overlaps with the miscellaneous 
section but since the sdmatmp is only used by cpm2 hardware it shall not be an 
issue.

Please, let me know if I am mistaken. If the patch cannot be accepted as is, I 
would gladly accept pointers on how to address this kind of issue.

Thanks,
Nicolas.

>   };
>   
>   #define I2COM_START 0x80
> 


Christophe

Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread peterz
On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
> Alternatively this could of course be solved with per CPU page tables
> which will come around some day anyway I fear.

Previously (with PTI) we looked at making the entire kernel map per-CPU,
and that takes a 2K copy on switch_mm() (or more general, the user part
of whatever the top level directory is for architectures that have a
shared kernel/user page-table setup in the first place).

The idea was having a fixed per-cpu kernel page-table, share a bunch of
(kernel) page-tables between all CPUs and then copy in the user part on
switch.

I've forgotten what the plan was for ASID/PCID in that scheme.

For x86_64 we've been fearing the performance of that 2k copy, but I
don't think we've ever actually bit the bullet and implemented it to see
how bad it really is.


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-23 Thread peterz
On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 09:24, Linus Torvalds wrote:
> > On Mon, Sep 21, 2020 at 12:39 AM Thomas Gleixner  wrote:
> >>
> >> If a task is migrated to a different CPU then the mapping address will
> >> change which will explode in colourful ways.
> >
> > Right you are.
> >
> > Maybe we really *could* call this new kmap functionality something
> > like "kmap_percpu()" (or maybe "local" is good enough), and make it
> > act like your RT code does for spinlocks - not disable preemption, but
> > only disabling CPU migration.
> 
> I"m all for it, but the scheduler people have opinions :)

Right, so I'm concerned. migrate_disable() wrecks pretty much all
Real-Time scheduler theory we have, and PREEMPRT_RT bringing it in is
somewhat ironic.

Yes, it allows breaking up non-preemptible regions of non-deterministic
duration, and thereby both reduce and bound the scheduling latency, the
cost for doing that is that the theory on CPU utilization/bandwidth go
out the window.

To easily see this consider an SMP system with a number of tasks equal
to the number of CPUs. On a regular (preempt_disable) kernel we can
always run each task, by virtue of always having an idle CPU to take the
task.

However, with migrate_disable() we can have each task preempted in a
migrate_disable() region, worse we can stack them all on the _same_ CPU
(super ridiculous odds, sure). And then we end up only able to run one
task, with the rest of the CPUs picking their nose.

The end result is that, like with unbounded latency, we will still miss
our deadline, simply because we got starved for CPU.


Now, while we could (with a _lot_ of work) rework the kernel to not rely
on the implicit per-cpu ness of things like spinlock_t, the moment we
bring in basic primitives that rely on migrate_disable() we're stuck
with it.


The problem is; afaict there's been no research into this problem. There
might be scheduling (read: balancing) schemes that can mitigate/solve
this problem, or it might prove to be a 'hard' problem, I just don't
know. But once we start down this road, it's going to be hell to get rid
of it.

That's why I've been arguing (for many years) to strictly limit this to
PREEMPT_RT and only as a gap-stop, not a fundamental primitive to build
on.


Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

2020-09-23 Thread Pavel Begunkov
On 22/09/2020 12:01, Arnd Bergmann wrote:
> On Tue, Sep 22, 2020 at 9:59 AM Pavel Begunkov  wrote:
>> On 22/09/2020 10:23, Arnd Bergmann wrote:
>>> On Tue, Sep 22, 2020 at 8:32 AM Pavel Begunkov  
>>> wrote:
 On 22/09/2020 03:58, Andy Lutomirski wrote:
> On Mon, Sep 21, 2020 at 5:24 PM Pavel Begunkov  
> wrote:
> I may be looking at a different kernel than you, but aren't you
> preventing creating an io_uring regardless of whether SQPOLL is
> requested?

 I diffed a not-saved file on a sleepy head, thanks for noticing.
 As you said, there should be an SQPOLL check.

 ...
 if (ctx->compat && (p->flags & IORING_SETUP_SQPOLL))
 goto err;
>>>
>>> Wouldn't that mean that now 32-bit containers behave differently
>>> between compat and native execution?
>>>
>>> I think if you want to prevent 32-bit applications from using SQPOLL,
>>> it needs to be done the same way on both to be consistent:
>>
>> The intention was to disable only compat not native 32-bit.
> 
> I'm not following why that would be considered a valid option,
> as that clearly breaks existing users that update from a 32-bit
> kernel to a 64-bit one.

Do you mean users who move 32-bit binaries (without recompiling) to a
new x64 kernel? Does the kernel guarantees that to work? I'd personally
care more native-bitness apps.

> 
> Taking away the features from users that are still on 32-bit kernels
> already seems questionable to me, but being inconsistent
> about it seems much worse, in particular when the regression
> is on the upgrade path.

TBH, this won't fix that entirely (e.g. passing non-compat io_uring
to a compat process should yield the same problem). So, let's put
it aside for now until this bikeshedding would be relevant.

> 
>>> Can we expect all existing and future user space to have a sane
>>> fallback when IORING_SETUP_SQPOLL fails?
>>
>> SQPOLL has a few differences with non-SQPOLL modes, but it's easy
>> to convert between them. Anyway, SQPOLL is a privileged special
>> case that's here for performance/latency reasons, I don't think
>> there will be any non-accidental users of it.
> 
> Ok, so the behavior of 32-bit tasks would be the same as running
> the same application as unprivileged 64-bit tasks, with applications

Yes, something like that, but that's not automatic and in some
(hopefully rare) cases there may be pitfalls. That's in short,
I can expand the idea a bit if anyone would be interested.

> already having to implement that fallback, right?

Well, not everyone _have_ to implement such a fallback, e.g.
applications working only whilst privileged may use SQPOLL only.

-- 
Pavel Begunkov


Re: [PATCH] i2c: cpm: Fix i2c_ram structure

2020-09-23 Thread Christophe Leroy




Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :




From: Christophe Leroy 
Sent: Tuesday, 22 September 2020 14:38
To: Vincent Nicolas ; joc...@scram.de 

Cc: linuxppc-dev@lists.ozlabs.org ; 
linux-...@vger.kernel.org 
Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
  



Le 22/09/2020 à 11:04, nico.vi...@gmail.com a écrit :

From: Nicolas VINCENT 

the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.

Signed-off-by: Nicolas VINCENT 
---
    drivers/i2c/busses/i2c-cpm.c | 3 ++-
    1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..c5700addbf65 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -64,7 +64,8 @@ struct i2c_ram {
     uint    txtmp;  /* Internal */
     char    res1[4];    /* Reserved */
     ushort  rpbase; /* Relocation pointer */
- char    res2[2];    /* Reserved */
+ char    res2[6];    /* Reserved */
+ uint    sdmatmp;    /* Internal */


On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)

Your change overlaps the miscellaneous area that contains CP Microcode
Revision Number, ref MPC885 Reference Manual §18.7.3

As far as I understand the mpc885 contains in the dts the 
compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the 
address of the i2c_ram structure (cpm1), or dynamically allocate it with 
cpm_muram_alloc (cpm2).
In the first case the structure will indeed overlaps with the miscellaneous 
section but since the sdmatmp is only used by cpm2 hardware it shall not be an 
issue.

Please, let me know if I am mistaken. If the patch cannot be accepted as is, I 
would gladly accept pointers on how to address this kind of issue.



Please use a mail client that properly sets the > in front of 
original/answered text. Here your mailer has mixed you text and mine, 
that's unusable on the long term.



I think you are right on the fact that it doesn't seem to be an issue. 
Nevertheless, that's confusing.


What I would suggest is to leave res2[2] as is, and add something like:

/* The following elements are only for CPM2 */
char res3[4];   /* Reserved */
uint sdmatmp;   /* Internal */


Other solution (not sure that's the best solution thought) would be to 
do as in spi-fsl-cpm: use iic_t structure from asm/cpm1.h when 
CONFIG_CPM1 is selected and use iic_t from asm/cpm2.h when CONFIG_CPM2 
is selected, taking into account that CONFIG_CPM1 and CONFIG_CPM2 are 
mutually exclusive at the time being.


Christophe


[PATCH v3] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed

2020-09-23 Thread Cédric Le Goater
When a passthrough IO adapter is removed from a pseries machine using
hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
guest OS to clear all page table entries related to the adapter. If
some are still present, the RTAS call which isolates the PCI slot
returns error 9001 "valid outstanding translations" and the removal of
the IO adapter fails. This is because when the PHBs are scanned, Linux
maps automatically the INTx interrupts in the Linux interrupt number
space but these are never removed.

To solve this problem, we introduce a PPC platform specific
pcibios_remove_bus() routine which clears all interrupt mappings when
the bus is removed. This also clears the associated page table entries
of the ESB pages when using XIVE.

For this purpose, we record the logical interrupt numbers of the
mapped interrupt under the PHB structure and let pcibios_remove_bus()
do the clean up.

Since some PCI adapters, like GPUs, use the "interrupt-map" property
to describe interrupt mappings other than the legacy INTx interrupts,
we can not restrict the size of the mapping array to PCI_NUM_INTX. The
number of interrupt mappings is computed from the "interrupt-map"
property and the mapping array is allocated accordingly.

Cc: "Oliver O'Halloran" 
Cc: Alexey Kardashevskiy 
Reviewed-by: Alexey Kardashevskiy 
Signed-off-by: Cédric Le Goater 
---

 Changes in v3 :

 - NULLified 'irq_map' in pci_irq_map_dispose()
 
 arch/powerpc/include/asm/pci-bridge.h |   6 ++
 arch/powerpc/kernel/pci-common.c  | 115 ++
 2 files changed, 121 insertions(+)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index d2a2a14e56f9..d21e070352dc 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -48,6 +48,9 @@ struct pci_controller_ops {
 
 /*
  * Structure of a PCI controller (host bridge)
+ *
+ * @irq_count: number of interrupt mappings
+ * @irq_map: interrupt mappings
  */
 struct pci_controller {
struct pci_bus *bus;
@@ -127,6 +130,9 @@ struct pci_controller {
 
void *private_data;
struct npu *npu;
+
+   unsigned int irq_count;
+   unsigned int *irq_map;
 };
 
 /* These are used for config access before all the PCI probing
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index be108616a721..fb492de6902e 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -353,6 +353,116 @@ struct pci_controller *pci_find_controller_for_domain(int 
domain_nr)
return NULL;
 }
 
+/*
+ * Assumption is made on the interrupt parent. All interrupt-map
+ * entries are considered to have the same parent.
+ */
+static int pcibios_irq_map_count(struct pci_controller *phb)
+{
+   const __be32 *imap;
+   int imaplen;
+   struct device_node *parent;
+   u32 intsize, addrsize, parintsize, paraddrsize;
+
+   if (of_property_read_u32(phb->dn, "#interrupt-cells", ))
+   return 0;
+   if (of_property_read_u32(phb->dn, "#address-cells", ))
+   return 0;
+
+   imap = of_get_property(phb->dn, "interrupt-map", );
+   if (!imap) {
+   pr_debug("%pOF : no interrupt-map\n", phb->dn);
+   return 0;
+   }
+   imaplen /= sizeof(u32);
+   pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen);
+
+   if (imaplen < (addrsize + intsize + 1))
+   return 0;
+
+   imap += intsize + addrsize;
+   parent = of_find_node_by_phandle(be32_to_cpup(imap));
+   if (!parent) {
+   pr_debug("%pOF : no imap parent found !\n", phb->dn);
+   return 0;
+   }
+
+   if (of_property_read_u32(parent, "#interrupt-cells", )) {
+   pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn);
+   return 0;
+   }
+
+   if (of_property_read_u32(parent, "#address-cells", ))
+   paraddrsize = 0;
+
+   return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize);
+}
+
+static void pcibios_irq_map_init(struct pci_controller *phb)
+{
+   phb->irq_count = pcibios_irq_map_count(phb);
+   if (phb->irq_count < PCI_NUM_INTX)
+   phb->irq_count = PCI_NUM_INTX;
+
+   pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count);
+
+   phb->irq_map = kcalloc(phb->irq_count, sizeof(unsigned int),
+  GFP_KERNEL);
+}
+
+static void pci_irq_map_register(struct pci_dev *pdev, unsigned int virq)
+{
+   struct pci_controller *phb = pci_bus_to_host(pdev->bus);
+   int i;
+
+   if (!phb->irq_map)
+   return;
+
+   for (i = 0; i < phb->irq_count; i++) {
+   /*
+* Look for an empty or an equivalent slot, as INTx
+* interrupts can be shared between adapters.
+*/
+   if (phb->irq_map[i] == virq || !phb->irq_map[i]) {
+   

[PATCH -next v2] powerpc/perf: Fix symbol undeclared warning

2020-09-23 Thread Wang Wensheng
Build kernel with `C=2`:
arch/powerpc/perf/isa207-common.c:24:18: warning: symbol
'isa207_pmu_format_attr' was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:101:5: warning: symbol 'p9_dd21_bl_ev'
was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:115:5: warning: symbol 'p9_dd22_bl_ev'
was not declared. Should it be static?

Those symbols are used only in the files that define them so we declare
them as static to fix the warnings.

Signed-off-by: Wang Wensheng 
---
 arch/powerpc/perf/isa207-common.c | 2 +-
 arch/powerpc/perf/power9-pmu.c| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/perf/isa207-common.c 
b/arch/powerpc/perf/isa207-common.c
index 964437adec18..85dc860b265b 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -21,7 +21,7 @@ PMU_FORMAT_ATTR(thresh_stop,  "config:32-35");
 PMU_FORMAT_ATTR(thresh_start,  "config:36-39");
 PMU_FORMAT_ATTR(thresh_cmp,"config:40-49");
 
-struct attribute *isa207_pmu_format_attr[] = {
+static struct attribute *isa207_pmu_format_attr[] = {
_attr_event.attr,
_attr_pmcxsel.attr,
_attr_mark.attr,
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 2a57e93a79dc..4a315fad1f99 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -98,7 +98,7 @@ extern u64 PERF_REG_EXTENDED_MASK;
 /* PowerISA v2.07 format attribute structure*/
 extern struct attribute_group isa207_pmu_format_group;
 
-int p9_dd21_bl_ev[] = {
+static int p9_dd21_bl_ev[] = {
PM_MRK_ST_DONE_L2,
PM_RADIX_PWC_L1_HIT,
PM_FLOP_CMPL,
@@ -112,7 +112,7 @@ int p9_dd21_bl_ev[] = {
PM_DISP_HELD_SYNC_HOLD,
 };
 
-int p9_dd22_bl_ev[] = {
+static int p9_dd22_bl_ev[] = {
PM_DTLB_MISS_16G,
PM_DERAT_MISS_2M,
PM_DTLB_MISS_2M,
-- 
2.25.0



Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed

2020-09-23 Thread Cédric Le Goater
On 9/23/20 2:33 AM, Qian Cai wrote:
> On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote:
>> When a passthrough IO adapter is removed from a pseries machine using
>> hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
>> guest OS to clear all page table entries related to the adapter. If
>> some are still present, the RTAS call which isolates the PCI slot
>> returns error 9001 "valid outstanding translations" and the removal of
>> the IO adapter fails. This is because when the PHBs are scanned, Linux
>> maps automatically the INTx interrupts in the Linux interrupt number
>> space but these are never removed.
>>
>> To solve this problem, we introduce a PPC platform specific
>> pcibios_remove_bus() routine which clears all interrupt mappings when
>> the bus is removed. This also clears the associated page table entries
>> of the ESB pages when using XIVE.
>>
>> For this purpose, we record the logical interrupt numbers of the
>> mapped interrupt under the PHB structure and let pcibios_remove_bus()
>> do the clean up.
>>
>> Since some PCI adapters, like GPUs, use the "interrupt-map" property
>> to describe interrupt mappings other than the legacy INTx interrupts,
>> we can not restrict the size of the mapping array to PCI_NUM_INTX. The
>> number of interrupt mappings is computed from the "interrupt-map"
>> property and the mapping array is allocated accordingly.
>>
>> Cc: "Oliver O'Halloran" 
>> Cc: Alexey Kardashevskiy 
>> Signed-off-by: Cédric Le Goater 
> 
> Some syscall fuzzing will trigger this on POWER9 NV where the traces pointed 
> to
> this patch.
> 
> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config

OK. The patch is missing a NULL assignement after kfree() and that
might be the issue. 

I did try PHB removal under PowerNV, so I would like to understand 
how we managed to remove twice the PCI bus and possibly reproduce. 
Any chance we could grab what the syscall fuzzer (syzkaller) did ? 


Thanks,

C. 

> 
> [ 3574.564109][  T965] ata1.00: disabled
> [ 3574.580373][T151472] sd 0:0:0:0: [sdb] Synchronizing SCSI cache
> [ 3574.581180][T151472] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed: 
> Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.581226][T151472] sd 0:0:0:0: [sdb] Stopping disk
> [ 3574.581289][T151472] sd 0:0:0:0: [sdb] Start/Stop Unit failed: Result: 
> hostbyte=0x04 driverbyte=0x00
> [ 3574.611424][ T3019] Read-error on swap-device (254:1:849792)
> [ 3574.611685][ T3019] Read-error on swap-device (254:1:914944)
> [ 3574.611769][ T3019] Read-error on swap-device (254:1:915072)
> [ 3574.611838][ T3019] Read-error on swap-device (254:1:915200)
> [ 3574.611926][ T3019] Read-error on swap-device (254:1:915328)
> [ 3574.612268][ T3084] Read-error on swap-device (254:1:792576)
> [ 3574.612342][ T3084] Read-error on swap-device (254:1:792704)
> [ 3574.612757][ T2362] Read-error on swap-device (254:1:957440)
> [ 3574.612773][ T2905] Read-error on swap-device (254:1:784128)
> [ 3574.613015][ T2362] Read-error on swap-device (254:1:957568)
> [ 3574.613160][ T2905] Read-error on swap-device (254:1:784256)
> [ 3574.613241][ T2362] Read-error on swap-device (254:1:957696)
> [ 3574.613342][ T2362] Read-error on swap-device (254:1:957824)
> [ 3574.614448][ T3019] Core dump to |/usr/lib/systemd/systemd-coredump pipe 
> failed
> [ 3574.614663][ T3019] Read-error on swap-device (254:1:961536)
> [ 3574.675330][T151844] Read-error on swap-device (254:1:128)
> [ 3574.675515][T151844] Read-error on swap-device (254:1:256)
> [ 3574.675700][T151844] Read-error on swap-device (254:1:384)
> [ 3574.703570][  T971] ata2.00: disabled
> [ 3574.710393][T151472] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> [ 3574.710864][T151472] sd 1:0:0:0: [sda] Synchronize Cache(10) failed: 
> Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.710922][T151472] sd 1:0:0:0: [sda] Stopping disk
> [ 3574.711010][T151472] sd 1:0:0:0: [sda] Start/Stop Unit failed: Result: 
> hostbyte=0x04 driverbyte=0x00
> [ 3574.826569][  T674] dm-0: writeback error on inode 68507862, offset 65536, 
> sector 54281504
> [ 3575.117547][ T3366] dm-0: writeback error on inode 68507851, offset 0, 
> sector 54378880
> [ 3575.140104][T151472] pci 0004:03:00.0: Removing from iommu group 3
> [ 3575.141778][T151472] pci 0004:03 : [PE# fb] Releasing PE
> [ 3575.141965][T151472] pci 0004:03 : [PE# fb] Removing DMA window #0
> [ 3575.142452][T151472] pci 0004:03 : [PE# fb] Disabling 64-bit DMA bypass
> [ 3575.149369][T151472] pci_bus 0004:03: busn_res: [bus 03] is released
> [ 3575.150574][T152037] Read-error on swap-device (254:1:35584)
> [ 3575.150713][T152037] Read-error on swap-device (254:1:35712)
> [ 3575.152632][T152037] Read-error on swap-device (254:1:915584)
> [ 3575.152706][T151472] pci_bus 0004:04: busn_res: [bus 04-08] is released
> [ 3575.152983][T151472] 
> =
> [ 3575.153937][T151472] BUG kmalloc-16 (Not tainted): Object 

RE: [PATCH 1/5] Documentation: dt: binding: fsl: Add 'fsl,ippdexpcr1-alt-addr' property

2020-09-23 Thread Ran Wang
Hi Rob,

On Wednesday, September 23, 2020 10:33 AM, Rob Herring wrote:
> 
> On Wed, Sep 16, 2020 at 04:18:27PM +0800, Ran Wang wrote:
> > From: Biwen Li 
> >
> > The 'fsl,ippdexpcr1-alt-addr' property is used to handle an errata
> > A-008646 on LS1021A
> >
> > Signed-off-by: Biwen Li 
> > Signed-off-by: Ran Wang 
> > ---
> >  Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 19
> > +++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > index 5a33619..1be58a3 100644
> > --- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > +++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > @@ -34,6 +34,11 @@ Chassis Version  Example Chips
> >  Optional properties:
> >   - little-endian : RCPM register block is Little Endian. Without it RCPM
> > will be Big Endian (default case).
> > + - fsl,ippdexpcr1-alt-addr : The property is related to a hardware issue
> > +   on SoC LS1021A and only needed on SoC LS1021A.
> > +   Must include 2 entries:
> > +   The first entry must be a link to the SCFG device node.
> > +   The 2nd entry must be offset of register IPPDEXPCR1 in SCFG.
> 
> You don't need a DT change for this. You can find SCFG node by its compatible
> string and then the offset should be known given this issue is only on 1 SoC.

Did you mean that RCPM driver just to access IPPDEXPCR1 shadowed register in 
SCFG
directly without fetching it's offset info. from DT?

Regards,
Ran



[PATCH v2 5/5] arm: dts: ls1021a: fix rcpm failed to claim resource

2020-09-23 Thread Ran Wang
The range of dcfg reg is wrong, which overlap with other device,
such as rcpm. This issue causing rcpm driver failed to claim
reg resource when calling devm_ioremap_resource().

Signed-off-by: Ran Wang 
Acked-by: Li Yang 
---
Change in v2:
 - None

 arch/arm/boot/dts/ls1021a.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index e372630f..286c547 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -173,7 +173,7 @@
 
dcfg: dcfg@1ee {
compatible = "fsl,ls1021a-dcfg", "syscon";
-   reg = <0x0 0x1ee 0x0 0x1>;
+   reg = <0x0 0x1ee 0x0 0x1000>;
big-endian;
};
 
-- 
2.7.4



[PATCH v2 4/5] arm: dts: ls1021a: fix flextimer failed to wake system

2020-09-23 Thread Ran Wang
The data of property 'fsl,rcpm-wakeup' is not corrcet, which causing
RCPM driver incorrectly program register IPPDEXPCR1, then flextimer is
wrongly clock gated during system suspend, can't send interrupt to
wake.

Signed-off-by: Ran Wang 
Acked-by: Li Yang 
---
Change in v2:
 - None

 arch/arm/boot/dts/ls1021a.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 089fe86..e372630f 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -1014,7 +1014,7 @@
compatible = "fsl,ls1021a-ftm-alarm";
reg = <0x0 0x29d 0x0 0x1>;
reg-names = "ftm";
-   fsl,rcpm-wakeup = < 0x2 0x0>;
+   fsl,rcpm-wakeup = < 0x0 0x2000>;
interrupts = ;
big-endian;
};
-- 
2.7.4



[PATCH v2 3/5] arm: dts: ls1021a: enable RCPM workaround for erratum A-008646

2020-09-23 Thread Ran Wang
From: Biwen Li 

The patch fixes a bug that FlexTimer cannot
wakeup system in deep sleep.

Signed-off-by: Biwen Li 
Signed-off-by: Ran Wang 
---
Change in v2:
 - Change subject of commit message to be consistent with other related patches.

 arch/arm/boot/dts/ls1021a.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 827373e..089fe86 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -1007,6 +1007,7 @@
compatible = "fsl,ls1021a-rcpm", "fsl,qoriq-rcpm-2.1+";
reg = <0x0 0x1ee2140 0x0 0x8>;
#fsl,rcpm-wakeup-cells = <2>;
+   fsl,ippdexpcr1-alt-addr = < 0x51c>;
};
 
ftm_alarm0: timer0@29d {
-- 
2.7.4



[PATCH v2 1/5] Documentation: dt: binding: fsl: Add 'fsl, ippdexpcr1-alt-addr' property

2020-09-23 Thread Ran Wang
From: Biwen Li 

The 'fsl,ippdexpcr1-alt-addr' property is used to handle an errata A-008646
on LS1021A

Signed-off-by: Biwen Li 
Signed-off-by: Ran Wang 
---
Change in v2:
 - None

 Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt 
b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
index 5a33619..1be58a3 100644
--- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
@@ -34,6 +34,11 @@ Chassis Version  Example Chips
 Optional properties:
  - little-endian : RCPM register block is Little Endian. Without it RCPM
will be Big Endian (default case).
+ - fsl,ippdexpcr1-alt-addr : The property is related to a hardware issue
+   on SoC LS1021A and only needed on SoC LS1021A.
+   Must include 2 entries:
+   The first entry must be a link to the SCFG device node.
+   The 2nd entry must be offset of register IPPDEXPCR1 in SCFG.
 
 Example:
 The RCPM node for T4240:
@@ -43,6 +48,20 @@ The RCPM node for T4240:
#fsl,rcpm-wakeup-cells = <2>;
};
 
+The RCPM node for LS1021A:
+   rcpm: rcpm@1ee2140 {
+   compatible = "fsl,ls1021a-rcpm", "fsl,qoriq-rcpm-2.1+";
+   reg = <0x0 0x1ee2140 0x0 0x8>;
+   #fsl,rcpm-wakeup-cells = <2>;
+
+   /*
+* The second and third entry compose an alt offset
+* address for IPPDEXPCR1(SCFG_SPARECR8)
+*/
+   fsl,ippdexpcr1-alt-addr = < 0x51c>;
+   };
+
+
 * Freescale RCPM Wakeup Source Device Tree Bindings
 ---
 Required fsl,rcpm-wakeup property should be added to a device node if the 
device
-- 
2.7.4



[PATCH v2 2/5] soc: fsl: handle RCPM errata A-008646 on SoC LS1021A

2020-09-23 Thread Ran Wang
From: Biwen Li 

Hardware issue:
- Reading register RCPM_IPPDEXPCR1 always return zero, this causes
  system firmware could not get correct information and wrongly do
  clock gating for all wakeup source IP during system suspend. Then
  those IPs will never get chance to wake system.

Workaround:
- Duplicate register RCPM_IPPDEXPCR1's setting to register SCFG_SPARECR8
  to allow system firmware's psci method read it and do things accordingly.

Signed-off-by: Biwen Li 
Signed-off-by: Ran Wang 
---
Change in v2:
 - Update commit message to be more clear.
 - Replace device_property_read_u32_array() with 
syscon_regmap_lookup_by_phandle_args()
   to make code simpler.

 drivers/soc/fsl/rcpm.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
index a093dbe..6e3 100644
--- a/drivers/soc/fsl/rcpm.c
+++ b/drivers/soc/fsl/rcpm.c
@@ -2,7 +2,7 @@
 //
 // rcpm.c - Freescale QorIQ RCPM driver
 //
-// Copyright 2019 NXP
+// Copyright 2019-2020 NXP
 //
 // Author: Ran Wang 
 
@@ -13,6 +13,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #define RCPM_WAKEUP_CELL_MAX_SIZE  7
 
@@ -37,6 +40,9 @@ static int rcpm_pm_prepare(struct device *dev)
struct device_node  *np = dev->of_node;
u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1];
u32 setting[RCPM_WAKEUP_CELL_MAX_SIZE] = {0};
+   struct regmap *scfg_addr_regmap = NULL;
+   u32 reg_offset = 0;
+   u32 reg_value = 0;
 
rcpm = dev_get_drvdata(dev);
if (!rcpm)
@@ -90,6 +96,29 @@ static int rcpm_pm_prepare(struct device *dev)
tmp |= ioread32be(address);
iowrite32be(tmp, address);
}
+   /*
+* Workaround of errata A-008646 on SoC LS1021A:
+* There is a bug of register ippdexpcr1.
+* Reading configuration register RCPM_IPPDEXPCR1
+* always return zero. So save ippdexpcr1's value
+* to register SCFG_SPARECR8.And the value of
+* ippdexpcr1 will be read from SCFG_SPARECR8.
+*/
+   if (dev_of_node(dev) && (i == 1)) {
+   scfg_addr_regmap = 
syscon_regmap_lookup_by_phandle_args(np,
+   "fsl,ippdexpcr1-alt-addr",
+   1,
+   _offset);
+   if (!IS_ERR_OR_NULL(scfg_addr_regmap)) {
+   /* Update value on register SCFG_SPARECR8 */
+   regmap_read(scfg_addr_regmap,
+   reg_offset,
+   _value);
+   regmap_write(scfg_addr_regmap,
+   reg_offset,
+   tmp | reg_value);
+   }
+   }
}
 
return 0;
-- 
2.7.4



[PATCH 4/9] iov_iter: transparently handle compat iovecs in import_iovec

2020-09-23 Thread Christoph Hellwig
Use in compat_syscall to import either native or the compat iovecs, and
remove the now superflous compat_import_iovec.

This removes the need for special compat logic in most callers, and
the remaining ones can still be simplified by using __import_iovec
with a bool compat parameter.

Signed-off-by: Christoph Hellwig 
---
 block/scsi_ioctl.c | 12 ++--
 drivers/scsi/sg.c  |  9 +
 fs/aio.c   |  8 ++--
 fs/io_uring.c  | 20 
 fs/read_write.c|  6 --
 fs/splice.c|  2 +-
 include/linux/uio.h|  8 
 lib/iov_iter.c | 14 ++
 mm/process_vm_access.c |  3 ++-
 net/compat.c   |  4 ++--
 security/keys/compat.c |  5 ++---
 11 files changed, 26 insertions(+), 65 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index ef722f04f88a93..e08df86866ee5d 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -333,16 +333,8 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
struct iov_iter i;
struct iovec *iov = NULL;
 
-#ifdef CONFIG_COMPAT
-   if (in_compat_syscall())
-   ret = compat_import_iovec(rq_data_dir(rq),
-  hdr->dxferp, hdr->iovec_count,
-  0, , );
-   else
-#endif
-   ret = import_iovec(rq_data_dir(rq),
-  hdr->dxferp, hdr->iovec_count,
-  0, , );
+   ret = import_iovec(rq_data_dir(rq), hdr->dxferp,
+  hdr->iovec_count, 0, , );
if (ret < 0)
goto out_free_cdb;
 
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 20472aaaf630a4..bfa8d77322d732 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1820,14 +1820,7 @@ sg_start_req(Sg_request *srp, unsigned char *cmd)
struct iovec *iov = NULL;
struct iov_iter i;
 
-#ifdef CONFIG_COMPAT
-   if (in_compat_syscall())
-   res = compat_import_iovec(rw, hp->dxferp, iov_count,
- 0, , );
-   else
-#endif
-   res = import_iovec(rw, hp->dxferp, iov_count,
-  0, , );
+   res = import_iovec(rw, hp->dxferp, iov_count, 0, , );
if (res < 0)
return res;
 
diff --git a/fs/aio.c b/fs/aio.c
index d5ec303855669d..c45c20d875388c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1489,12 +1489,8 @@ static ssize_t aio_setup_rw(int rw, const struct iocb 
*iocb,
*iovec = NULL;
return ret;
}
-#ifdef CONFIG_COMPAT
-   if (compat)
-   return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec,
-   iter);
-#endif
-   return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter);
+
+   return __import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter, compat);
 }
 
 static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 3790c7fe9fee22..ba84ecea7cb1a4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2837,13 +2837,8 @@ static ssize_t __io_import_iovec(int rw, struct io_kiocb 
*req,
return ret;
}
 
-#ifdef CONFIG_COMPAT
-   if (req->ctx->compat)
-   return compat_import_iovec(rw, buf, sqe_len, UIO_FASTIOV,
-   iovec, iter);
-#endif
-
-   return import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter);
+   return __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter,
+ req->ctx->compat);
 }
 
 static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
@@ -4179,8 +4174,9 @@ static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
sr->len);
iomsg->iov = NULL;
} else {
-   ret = import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
-   >iov, >msg.msg_iter);
+   ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
+>iov, >msg.msg_iter,
+false);
if (ret > 0)
ret = 0;
}
@@ -4220,9 +4216,9 @@ static int __io_compat_recvmsg_copy_hdr(struct io_kiocb 
*req,
sr->len = iomsg->iov[0].iov_len;
iomsg->iov = NULL;
} else {
-   ret = compat_import_iovec(READ, uiov, len, UIO_FASTIOV,
-   >iov,
-   >msg.msg_iter);
+   ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
+  UIO_FASTIOV, >iov,
+  >msg.msg_iter, 

[PATCH 1/9] compat.h: fix a spelling error in

2020-09-23 Thread Christoph Hellwig
There is no compat_sys_readv64v2 syscall, only a compat_sys_preadv64v2
one.

Signed-off-by: Christoph Hellwig 
---
 include/linux/compat.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index b354ce58966e2d..654c1ec36671a4 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -812,7 +812,7 @@ asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
const struct compat_iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
-asmlinkage long  compat_sys_readv64v2(unsigned long fd,
+asmlinkage long  compat_sys_preadv64v2(unsigned long fd,
const struct compat_iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags);
 #endif
-- 
2.28.0



[PATCH 5/9] fs: remove various compat readv/writev helpers

2020-09-23 Thread Christoph Hellwig
Now that import_iovec handles compat iovecs as well, all the duplicated
code in the compat readv/writev helpers is not needed.  Remove them
and switch the compat syscall handlers to use the native helpers.

Signed-off-by: Christoph Hellwig 
---
 fs/read_write.c | 179 
 1 file changed, 30 insertions(+), 149 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 0a68037580b455..eab427b7cc0a3f 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1068,226 +1068,107 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const 
struct iovec __user *, vec,
return do_pwritev(fd, vec, vlen, pos, flags);
 }
 
+/*
+ * Various compat syscalls.  Note that they all pretend to take a native
+ * iovec - import_iovec will properly treat those as compat_iovecs based on
+ * in_compat_syscall().
+ */
 #ifdef CONFIG_COMPAT
-static size_t compat_readv(struct file *file,
-  const struct compat_iovec __user *vec,
-  unsigned long vlen, loff_t *pos, rwf_t flags)
-{
-   struct iovec iovstack[UIO_FASTIOV];
-   struct iovec *iov = iovstack;
-   struct iov_iter iter;
-   ssize_t ret;
-
-   ret = import_iovec(READ, (const struct iovec __user *)vec, vlen,
-  UIO_FASTIOV, , );
-   if (ret >= 0) {
-   ret = do_iter_read(file, , pos, flags);
-   kfree(iov);
-   }
-   if (ret > 0)
-   add_rchar(current, ret);
-   inc_syscr(current);
-   return ret;
-}
-
-static size_t do_compat_readv(compat_ulong_t fd,
-const struct compat_iovec __user *vec,
-compat_ulong_t vlen, rwf_t flags)
-{
-   struct fd f = fdget_pos(fd);
-   ssize_t ret;
-   loff_t pos;
-
-   if (!f.file)
-   return -EBADF;
-   pos = f.file->f_pos;
-   ret = compat_readv(f.file, vec, vlen, , flags);
-   if (ret >= 0)
-   f.file->f_pos = pos;
-   fdput_pos(f);
-   return ret;
-
-}
-
 COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
-   const struct compat_iovec __user *,vec,
+   const struct iovec __user *, vec,
compat_ulong_t, vlen)
 {
-   return do_compat_readv(fd, vec, vlen, 0);
-}
-
-static long do_compat_preadv64(unsigned long fd,
- const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, rwf_t flags)
-{
-   struct fd f;
-   ssize_t ret;
-
-   if (pos < 0)
-   return -EINVAL;
-   f = fdget(fd);
-   if (!f.file)
-   return -EBADF;
-   ret = -ESPIPE;
-   if (f.file->f_mode & FMODE_PREAD)
-   ret = compat_readv(f.file, vec, vlen, , flags);
-   fdput(f);
-   return ret;
+   return do_readv(fd, vec, vlen, 0);
 }
 
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
 COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
-   const struct compat_iovec __user *,vec,
+   const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos)
 {
-   return do_compat_preadv64(fd, vec, vlen, pos, 0);
+   return do_preadv(fd, vec, vlen, pos, 0);
 }
 #endif
 
 COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
-   const struct compat_iovec __user *,vec,
+   const struct iovec __user *, vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high)
 {
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
 
-   return do_compat_preadv64(fd, vec, vlen, pos, 0);
+   return do_preadv(fd, vec, vlen, pos, 0);
 }
 
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
 COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
-   const struct compat_iovec __user *,vec,
+   const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos, rwf_t, flags)
 {
if (pos == -1)
-   return do_compat_readv(fd, vec, vlen, flags);
-
-   return do_compat_preadv64(fd, vec, vlen, pos, flags);
+   return do_readv(fd, vec, vlen, flags);
+   return do_preadv(fd, vec, vlen, pos, flags);
 }
 #endif
 
 COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
-   const struct compat_iovec __user *,vec,
+   const struct iovec __user *, vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
rwf_t, flags)
 {
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
 
if (pos == -1)
-   return do_compat_readv(fd, vec, vlen, flags);
-
-   return do_compat_preadv64(fd, vec, vlen, pos, flags);
-}
-
-static size_t compat_writev(struct file *file,
-   const struct compat_iovec __user *vec,
-   unsigned long vlen, loff_t *pos, rwf_t flags)
-{
-   struct iovec iovstack[UIO_FASTIOV];
-   struct iovec *iov = iovstack;
-   struct iov_iter 

[PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c

2020-09-23 Thread Christoph Hellwig
From: David Laight 

This lets the compiler inline it into import_iovec() generating
much better code.

Signed-off-by: David Laight 
Signed-off-by: Christoph Hellwig 
---
 fs/read_write.c | 179 
 lib/iov_iter.c  | 176 +++
 2 files changed, 176 insertions(+), 179 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 5db58b8c78d0dd..e5e891a88442ef 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -752,185 +752,6 @@ static ssize_t do_loop_readv_writev(struct file *filp, 
struct iov_iter *iter,
return ret;
 }
 
-/**
- * rw_copy_check_uvector() - Copy an array of  iovec from userspace
- * into the kernel and check that it is valid.
- *
- * @type: One of %CHECK_IOVEC_ONLY, %READ, or %WRITE.
- * @uvector: Pointer to the userspace array.
- * @nr_segs: Number of elements in userspace array.
- * @fast_segs: Number of elements in @fast_pointer.
- * @fast_pointer: Pointer to (usually small on-stack) kernel array.
- * @ret_pointer: (output parameter) Pointer to a variable that will point to
- * either @fast_pointer, a newly allocated kernel array, or NULL,
- * depending on which array was used.
- *
- * This function copies an array of  iovec of @nr_segs from
- * userspace into the kernel and checks that each element is valid (e.g.
- * it does not point to a kernel address or cause overflow by being too
- * large, etc.).
- *
- * As an optimization, the caller may provide a pointer to a small
- * on-stack array in @fast_pointer, typically %UIO_FASTIOV elements long
- * (the size of this array, or 0 if unused, should be given in @fast_segs).
- *
- * @ret_pointer will always point to the array that was used, so the
- * caller must take care not to call kfree() on it e.g. in case the
- * @fast_pointer array was used and it was allocated on the stack.
- *
- * Return: The total number of bytes covered by the iovec array on success
- *   or a negative error code on error.
- */
-ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer,
- struct iovec **ret_pointer)
-{
-   unsigned long seg;
-   ssize_t ret;
-   struct iovec *iov = fast_pointer;
-
-   /*
-* SuS says "The readv() function *may* fail if the iovcnt argument
-* was less than or equal to 0, or greater than {IOV_MAX}.  Linux has
-* traditionally returned zero for zero segments, so...
-*/
-   if (nr_segs == 0) {
-   ret = 0;
-   goto out;
-   }
-
-   /*
-* First get the "struct iovec" from user memory and
-* verify all the pointers
-*/
-   if (nr_segs > UIO_MAXIOV) {
-   ret = -EINVAL;
-   goto out;
-   }
-   if (nr_segs > fast_segs) {
-   iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
-   if (iov == NULL) {
-   ret = -ENOMEM;
-   goto out;
-   }
-   }
-   if (copy_from_user(iov, uvector, nr_segs*sizeof(*uvector))) {
-   ret = -EFAULT;
-   goto out;
-   }
-
-   /*
-* According to the Single Unix Specification we should return EINVAL
-* if an element length is < 0 when cast to ssize_t or if the
-* total length would overflow the ssize_t return value of the
-* system call.
-*
-* Linux caps all read/write calls to MAX_RW_COUNT, and avoids the
-* overflow case.
-*/
-   ret = 0;
-   for (seg = 0; seg < nr_segs; seg++) {
-   void __user *buf = iov[seg].iov_base;
-   ssize_t len = (ssize_t)iov[seg].iov_len;
-
-   /* see if we we're about to use an invalid len or if
-* it's about to overflow ssize_t */
-   if (len < 0) {
-   ret = -EINVAL;
-   goto out;
-   }
-   if (type >= 0
-   && unlikely(!access_ok(buf, len))) {
-   ret = -EFAULT;
-   goto out;
-   }
-   if (len > MAX_RW_COUNT - ret) {
-   len = MAX_RW_COUNT - ret;
-   iov[seg].iov_len = len;
-   }
-   ret += len;
-   }
-out:
-   *ret_pointer = iov;
-   return ret;
-}
-
-#ifdef CONFIG_COMPAT
-ssize_t compat_rw_copy_check_uvector(int type,
-   const struct compat_iovec __user *uvector, unsigned long 
nr_segs,
-   unsigned long fast_segs, struct iovec *fast_pointer,
-   struct iovec **ret_pointer)
-{
-   compat_ssize_t tot_len;
-   struct iovec *iov = *ret_pointer = fast_pointer;
-   ssize_t ret = 0;
-   int seg;
-
-   /*
-* SuS says 

[PATCH 8/9] mm: remove compat_process_vm_{readv,writev}

2020-09-23 Thread Christoph Hellwig
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/include/asm/unistd32.h |  4 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  4 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  4 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  4 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  4 +-
 arch/s390/kernel/syscalls/syscall.tbl |  4 +-
 arch/sparc/kernel/syscalls/syscall.tbl|  4 +-
 arch/x86/entry/syscall_x32.c  |  2 +
 arch/x86/entry/syscalls/syscall_32.tbl|  4 +-
 arch/x86/entry/syscalls/syscall_64.tbl|  4 +-
 include/linux/compat.h|  8 ---
 include/uapi/asm-generic/unistd.h |  6 +-
 mm/process_vm_access.c| 69 ---
 tools/include/uapi/asm-generic/unistd.h   |  6 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   |  4 +-
 .../perf/arch/s390/entry/syscalls/syscall.tbl |  4 +-
 .../arch/x86/entry/syscalls/syscall_64.tbl|  4 +-
 17 files changed, 30 insertions(+), 109 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 11dfae3a8563bd..0c280a05f699bf 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -763,9 +763,9 @@ __SYSCALL(__NR_sendmmsg, compat_sys_sendmmsg)
 #define __NR_setns 375
 __SYSCALL(__NR_setns, sys_setns)
 #define __NR_process_vm_readv 376
-__SYSCALL(__NR_process_vm_readv, compat_sys_process_vm_readv)
+__SYSCALL(__NR_process_vm_readv, sys_process_vm_readv)
 #define __NR_process_vm_writev 377
-__SYSCALL(__NR_process_vm_writev, compat_sys_process_vm_writev)
+__SYSCALL(__NR_process_vm_writev, sys_process_vm_writev)
 #define __NR_kcmp 378
 __SYSCALL(__NR_kcmp, sys_kcmp)
 #define __NR_finit_module 379
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 5a39d4de0ac85b..0bc2e0fcf1ee56 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -317,8 +317,8 @@
 306n32 syncfs  sys_syncfs
 307n32 sendmmsgcompat_sys_sendmmsg
 308n32 setns   sys_setns
-309n32 process_vm_readvcompat_sys_process_vm_readv
-310n32 process_vm_writev   compat_sys_process_vm_writev
+309n32 process_vm_readvsys_process_vm_readv
+310n32 process_vm_writev   sys_process_vm_writev
 311n32 kcmpsys_kcmp
 312n32 finit_modulesys_finit_module
 313n32 sched_setattr   sys_sched_setattr
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 136efc6b8c5444..b408c13b934296 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -356,8 +356,8 @@
 342o32 syncfs  sys_syncfs
 343o32 sendmmsgsys_sendmmsg
compat_sys_sendmmsg
 344o32 setns   sys_setns
-345o32 process_vm_readvsys_process_vm_readv
compat_sys_process_vm_readv
-346o32 process_vm_writev   sys_process_vm_writev   
compat_sys_process_vm_writev
+345o32 process_vm_readvsys_process_vm_readv
+346o32 process_vm_writev   sys_process_vm_writev
 347o32 kcmpsys_kcmp
 348o32 finit_modulesys_finit_module
 349o32 sched_setattr   sys_sched_setattr
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index a9e184192caedd..2015a5124b78ad 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -372,8 +372,8 @@
 327common  syncfs  sys_syncfs
 328common  setns   sys_setns
 329common  sendmmsgsys_sendmmsg
compat_sys_sendmmsg
-330common  process_vm_readvsys_process_vm_readv
compat_sys_process_vm_readv
-331common  process_vm_writev   sys_process_vm_writev   
compat_sys_process_vm_writev
+330common  process_vm_readvsys_process_vm_readv
+331common  process_vm_writev   sys_process_vm_writev
 332common  kcmpsys_kcmp
 333common  finit_modulesys_finit_module
 334common  sched_setattr   sys_sched_setattr
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index 0d4985919ca34d..66a472aa635d3f 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ 

let import_iovec deal with compat_iovecs as well v3

2020-09-23 Thread Christoph Hellwig
Hi Al,

this series changes import_iovec to transparently deal with comat iovec
structures, and then cleanups up a lot of code dupliation.

Changes since v2:
 - revert the switch of the access process vm sysclls to iov_iter
 - refactor the import_iovec internals differently
 - switch aio to use __import_iovec

Changes since v1:
 - improve a commit message
 - drop a pointless unlikely
 - drop the PF_FORCE_COMPAT flag
 - add a few more cleanups (including two from David Laight)

Diffstat:
 arch/arm64/include/asm/unistd32.h  |   10 
 arch/mips/kernel/syscalls/syscall_n32.tbl  |   10 
 arch/mips/kernel/syscalls/syscall_o32.tbl  |   10 
 arch/parisc/kernel/syscalls/syscall.tbl|   10 
 arch/powerpc/kernel/syscalls/syscall.tbl   |   10 
 arch/s390/kernel/syscalls/syscall.tbl  |   10 
 arch/sparc/kernel/syscalls/syscall.tbl |   10 
 arch/x86/entry/syscall_x32.c   |5 
 arch/x86/entry/syscalls/syscall_32.tbl |   10 
 arch/x86/entry/syscalls/syscall_64.tbl |   10 
 block/scsi_ioctl.c |   12 
 drivers/scsi/sg.c  |9 
 fs/aio.c   |   38 --
 fs/io_uring.c  |   20 -
 fs/read_write.c|  362 +
 fs/splice.c|   57 ---
 include/linux/compat.h |   24 -
 include/linux/fs.h |   11 
 include/linux/uio.h|   10 
 include/uapi/asm-generic/unistd.h  |   12 
 lib/iov_iter.c |  161 +++--
 mm/process_vm_access.c |   85 
 net/compat.c   |4 
 security/keys/compat.c |   37 --
 security/keys/internal.h   |5 
 security/keys/keyctl.c |2 
 tools/include/uapi/asm-generic/unistd.h|   12 
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl |   10 
 tools/perf/arch/s390/entry/syscalls/syscall.tbl|   10 
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl  |   10 
 30 files changed, 280 insertions(+), 706 deletions(-)


[PATCH 9/9] security/keys: remove compat_keyctl_instantiate_key_iov

2020-09-23 Thread Christoph Hellwig
Now that import_iovec handles compat iovecs, the native version of
keyctl_instantiate_key_iov can be used for the compat case as well.

Signed-off-by: Christoph Hellwig 
---
 security/keys/compat.c   | 36 ++--
 security/keys/internal.h |  5 -
 security/keys/keyctl.c   |  2 +-
 3 files changed, 3 insertions(+), 40 deletions(-)

diff --git a/security/keys/compat.c b/security/keys/compat.c
index 7ae531db031cf8..1545efdca56227 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -11,38 +11,6 @@
 #include 
 #include "internal.h"
 
-/*
- * Instantiate a key with the specified compatibility multipart payload and
- * link the key into the destination keyring if one is given.
- *
- * The caller must have the appropriate instantiation permit set for this to
- * work (see keyctl_assume_authority).  No other permissions are required.
- *
- * If successful, 0 will be returned.
- */
-static long compat_keyctl_instantiate_key_iov(
-   key_serial_t id,
-   const struct compat_iovec __user *_payload_iov,
-   unsigned ioc,
-   key_serial_t ringid)
-{
-   struct iovec iovstack[UIO_FASTIOV], *iov = iovstack;
-   struct iov_iter from;
-   long ret;
-
-   if (!_payload_iov)
-   ioc = 0;
-
-   ret = import_iovec(WRITE, (const struct iovec __user *)_payload_iov,
-  ioc, ARRAY_SIZE(iovstack), , );
-   if (ret < 0)
-   return ret;
-
-   ret = keyctl_instantiate_key_common(id, , ringid);
-   kfree(iov);
-   return ret;
-}
-
 /*
  * The key control system call, 32-bit compatibility version for 64-bit archs
  */
@@ -113,8 +81,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
return keyctl_reject_key(arg2, arg3, arg4, arg5);
 
case KEYCTL_INSTANTIATE_IOV:
-   return compat_keyctl_instantiate_key_iov(
-   arg2, compat_ptr(arg3), arg4, arg5);
+   return keyctl_instantiate_key_iov(arg2, compat_ptr(arg3), arg4,
+ arg5);
 
case KEYCTL_INVALIDATE:
return keyctl_invalidate_key(arg2);
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 338a526cbfa516..9b9cf3b6fcbb4d 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -262,11 +262,6 @@ extern long keyctl_instantiate_key_iov(key_serial_t,
   const struct iovec __user *,
   unsigned, key_serial_t);
 extern long keyctl_invalidate_key(key_serial_t);
-
-struct iov_iter;
-extern long keyctl_instantiate_key_common(key_serial_t,
- struct iov_iter *,
- key_serial_t);
 extern long keyctl_restrict_keyring(key_serial_t id,
const char __user *_type,
const char __user *_restriction);
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 9febd37a168fd0..e26bbccda7ccee 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1164,7 +1164,7 @@ static int keyctl_change_reqkey_auth(struct key *key)
  *
  * If successful, 0 will be returned.
  */
-long keyctl_instantiate_key_common(key_serial_t id,
+static long keyctl_instantiate_key_common(key_serial_t id,
   struct iov_iter *from,
   key_serial_t ringid)
 {
-- 
2.28.0



[PATCH 7/9] fs: remove compat_sys_vmsplice

2020-09-23 Thread Christoph Hellwig
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/include/asm/unistd32.h |  2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  2 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  2 +-
 arch/s390/kernel/syscalls/syscall.tbl |  2 +-
 arch/sparc/kernel/syscalls/syscall.tbl|  2 +-
 arch/x86/entry/syscall_x32.c  |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl|  2 +-
 arch/x86/entry/syscalls/syscall_64.tbl|  2 +-
 fs/splice.c   | 57 +--
 include/linux/compat.h|  4 --
 include/uapi/asm-generic/unistd.h |  2 +-
 tools/include/uapi/asm-generic/unistd.h   |  2 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   |  2 +-
 .../perf/arch/s390/entry/syscalls/syscall.tbl |  2 +-
 .../arch/x86/entry/syscalls/syscall_64.tbl|  2 +-
 17 files changed, 28 insertions(+), 62 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 4a236493dca5b9..11dfae3a8563bd 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -697,7 +697,7 @@ __SYSCALL(__NR_sync_file_range2, 
compat_sys_aarch32_sync_file_range2)
 #define __NR_tee 342
 __SYSCALL(__NR_tee, sys_tee)
 #define __NR_vmsplice 343
-__SYSCALL(__NR_vmsplice, compat_sys_vmsplice)
+__SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages 344
 __SYSCALL(__NR_move_pages, compat_sys_move_pages)
 #define __NR_getcpu 345
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index c99a92646f8ee9..5a39d4de0ac85b 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -278,7 +278,7 @@
 267n32 splice  sys_splice
 268n32 sync_file_range sys_sync_file_range
 269n32 tee sys_tee
-270n32 vmsplicecompat_sys_vmsplice
+270n32 vmsplicesys_vmsplice
 271n32 move_pages  compat_sys_move_pages
 272n32 set_robust_list compat_sys_set_robust_list
 273n32 get_robust_list compat_sys_get_robust_list
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 075064d10661bf..136efc6b8c5444 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -318,7 +318,7 @@
 304o32 splice  sys_splice
 305o32 sync_file_range sys_sync_file_range 
sys32_sync_file_range
 306o32 tee sys_tee
-307o32 vmsplicesys_vmsplice
compat_sys_vmsplice
+307o32 vmsplicesys_vmsplice
 308o32 move_pages  sys_move_pages  
compat_sys_move_pages
 309o32 set_robust_list sys_set_robust_list 
compat_sys_set_robust_list
 310o32 get_robust_list sys_get_robust_list 
compat_sys_get_robust_list
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index 192abde0001d9d..a9e184192caedd 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -330,7 +330,7 @@
 29232  sync_file_range parisc_sync_file_range
 29264  sync_file_range sys_sync_file_range
 293common  tee sys_tee
-294common  vmsplicesys_vmsplice
compat_sys_vmsplice
+294common  vmsplicesys_vmsplice
 295common  move_pages  sys_move_pages  
compat_sys_move_pages
 296common  getcpu  sys_getcpu
 297common  epoll_pwait sys_epoll_pwait 
compat_sys_epoll_pwait
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index 6f1e2ecf0edad9..0d4985919ca34d 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -369,7 +369,7 @@
 282common  unshare sys_unshare
 283common  splice  sys_splice
 284common  tee sys_tee
-285common  vmsplicesys_vmsplice
compat_sys_vmsplice
+285common  vmsplicesys_vmsplice
 286common  openat  sys_openat  
compat_sys_openat
 287common  

[PATCH 6/9] fs: remove the compat readv/writev syscalls

2020-09-23 Thread Christoph Hellwig
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/include/asm/unistd32.h  |  4 ++--
 arch/mips/kernel/syscalls/syscall_n32.tbl  |  4 ++--
 arch/mips/kernel/syscalls/syscall_o32.tbl  |  4 ++--
 arch/parisc/kernel/syscalls/syscall.tbl|  4 ++--
 arch/powerpc/kernel/syscalls/syscall.tbl   |  4 ++--
 arch/s390/kernel/syscalls/syscall.tbl  |  4 ++--
 arch/sparc/kernel/syscalls/syscall.tbl |  4 ++--
 arch/x86/entry/syscall_x32.c   |  2 ++
 arch/x86/entry/syscalls/syscall_32.tbl |  4 ++--
 arch/x86/entry/syscalls/syscall_64.tbl |  4 ++--
 fs/read_write.c| 14 --
 include/linux/compat.h |  4 
 include/uapi/asm-generic/unistd.h  |  4 ++--
 tools/include/uapi/asm-generic/unistd.h|  4 ++--
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl |  4 ++--
 tools/perf/arch/s390/entry/syscalls/syscall.tbl|  4 ++--
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl  |  4 ++--
 17 files changed, 30 insertions(+), 46 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9d5..4a236493dca5b9 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -301,9 +301,9 @@ __SYSCALL(__NR_flock, sys_flock)
 #define __NR_msync 144
 __SYSCALL(__NR_msync, sys_msync)
 #define __NR_readv 145
-__SYSCALL(__NR_readv, compat_sys_readv)
+__SYSCALL(__NR_readv, sys_readv)
 #define __NR_writev 146
-__SYSCALL(__NR_writev, compat_sys_writev)
+__SYSCALL(__NR_writev, sys_writev)
 #define __NR_getsid 147
 __SYSCALL(__NR_getsid, sys_getsid)
 #define __NR_fdatasync 148
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index f9df9edb67a407..c99a92646f8ee9 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -25,8 +25,8 @@
 15 n32 ioctl   compat_sys_ioctl
 16 n32 pread64 sys_pread64
 17 n32 pwrite64sys_pwrite64
-18 n32 readv   compat_sys_readv
-19 n32 writev  compat_sys_writev
+18 n32 readv   sys_readv
+19 n32 writev  sys_writev
 20 n32 access  sys_access
 21 n32 pipesysm_pipe
 22 n32 _newselect  compat_sys_select
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 195b43cf27c848..075064d10661bf 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -156,8 +156,8 @@
 142o32 _newselect  sys_select  
compat_sys_select
 143o32 flock   sys_flock
 144o32 msync   sys_msync
-145o32 readv   sys_readv   
compat_sys_readv
-146o32 writev  sys_writev  
compat_sys_writev
+145o32 readv   sys_readv
+146o32 writev  sys_writev
 147o32 cacheflush  sys_cacheflush
 148o32 cachectlsys_cachectl
 149o32 sysmips __sys_sysmips
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index def64d221cd4fb..192abde0001d9d 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -159,8 +159,8 @@
 142common  _newselect  sys_select  
compat_sys_select
 143common  flock   sys_flock
 144common  msync   sys_msync
-145common  readv   sys_readv   
compat_sys_readv
-146common  writev  sys_writev  
compat_sys_writev
+145common  readv   sys_readv
+146common  writev  sys_writev
 147common  getsid  sys_getsid
 148common  fdatasync   sys_fdatasync
 149common  _sysctl sys_ni_syscall
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index c2d737ff2e7bec..6f1e2ecf0edad9 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -193,8 +193,8 @@
 142common  _newselect  sys_select  
compat_sys_select
 143common  

[PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec

2020-09-23 Thread Christoph Hellwig
Split rw_copy_check_uvector into two new helpers with more sensible
calling conventions:

 - iovec_from_user copies a iovec from userspace either into the provided
   stack buffer if it fits, or allocates a new buffer for it.  Returns
   the actually used iovec.  It also verifies that iov_len does fit a
   signed type, and handles compat iovecs if the compat flag is set.
 - __import_iovec consolidates the native and compat versions of
   import_iovec. It calls iovec_from_user, then validates each iovec
   actually points to user addresses, and ensures the total length
   doesn't overflow.

This has two major implications:

 - the access_process_vm case loses the total lenght checking, which
   wasn't required anyway, given that each call receives two iovecs
   for the local and remote side of the operation, and it verifies
   the total length on the local side already.
 - instead of a single loop there now are two loops over the iovecs.
   Given that the iovecs are cache hot this doesn't make a major
   difference

Signed-off-by: Christoph Hellwig 
---
 include/linux/compat.h |   6 -
 include/linux/fs.h |  13 --
 include/linux/uio.h|  12 +-
 lib/iov_iter.c | 300 -
 mm/process_vm_access.c |  34 +++--
 5 files changed, 138 insertions(+), 227 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 654c1ec36671a4..b930de791ff16b 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -451,12 +451,6 @@ extern long compat_arch_ptrace(struct task_struct *child, 
compat_long_t request,
 
 struct epoll_event;/* fortunately, this one is fixed-layout */
 
-extern ssize_t compat_rw_copy_check_uvector(int type,
-   const struct compat_iovec __user *uvector,
-   unsigned long nr_segs,
-   unsigned long fast_segs, struct iovec *fast_pointer,
-   struct iovec **ret_pointer);
-
 extern void __user *compat_alloc_user_space(unsigned long len);
 
 int compat_restore_altstack(const compat_stack_t __user *uss);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a082c..e69b45b6cc7b5f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -178,14 +178,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t 
offset,
 /* File supports async buffered reads */
 #define FMODE_BUF_RASYNC   ((__force fmode_t)0x4000)
 
-/*
- * Flag for rw_copy_check_uvector and compat_rw_copy_check_uvector
- * that indicates that they should check the contents of the iovec are
- * valid, but not check the memory that the iovec elements
- * points too.
- */
-#define CHECK_IOVEC_ONLY -1
-
 /*
  * Attribute flags.  These should be or-ed together to figure out what
  * has been changed!
@@ -1887,11 +1879,6 @@ static inline int call_mmap(struct file *file, struct 
vm_area_struct *vma)
return file->f_op->mmap(file, vma);
 }
 
-ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer,
- struct iovec **ret_pointer);
-
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 3835a8a8e9eae0..92c11fe41c6228 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -266,9 +266,15 @@ bool csum_and_copy_from_iter_full(void *addr, size_t 
bytes, __wsum *csum, struct
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
struct iov_iter *i);
 
-ssize_t import_iovec(int type, const struct iovec __user * uvector,
-unsigned nr_segs, unsigned fast_segs,
-struct iovec **iov, struct iov_iter *i);
+struct iovec *iovec_from_user(const struct iovec __user *uvector,
+   unsigned long nr_segs, unsigned long fast_segs,
+   struct iovec *fast_iov, bool compat);
+ssize_t import_iovec(int type, const struct iovec __user *uvec,
+unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+struct iov_iter *i);
+ssize_t __import_iovec(int type, const struct iovec __user *uvec,
+unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+struct iov_iter *i, bool compat);
 
 #ifdef CONFIG_COMPAT
 struct compat_iovec;
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index ccea9db3f72be8..d5d8afe31fca16 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1650,107 +1651,133 @@ const void *dup_iter(struct iov_iter *new, struct 
iov_iter *old, gfp_t flags)
 }
 EXPORT_SYMBOL(dup_iter);
 
-/**
- * rw_copy_check_uvector() - Copy an array of  iovec from userspace
- *