Re: [PATCH kernel] prom_init: Fetch flatten device tree from the system firmware

2019-05-02 Thread David Gibson
On Fri, May 03, 2019 at 10:10:57AM +1000, Stewart Smith wrote:
> David Gibson  writes:
> > On Wed, May 01, 2019 at 01:42:21PM +1000, Alexey Kardashevskiy wrote:
> >> At the moment, on a 256-CPU + 256-PCI-device guest, it takes the guest
> >> about 8.5 seconds to fetch the entire device tree via the client interface,
> >> as the DT is traversed twice - once for the strings blob and once for the
> >> struct blob. "getprop" is also quite slow, as SLOF stores properties in a
> >> linked list.
> >> 
> >> However, since [1], SLOF builds a flattened device tree (FDT) for another
> >> purpose, and [2] adds a new "fdt-fetch" client interface for the OS to
> >> fetch the FDT.
> >> 
> >> This patch tries the new method; if it is not supported, it falls back
> >> to the old method.
> >> 
> >> There is a change in the FDT layout - the old method produced
> >> (reserved map, strings, structs); the new one receives only strings and
> >> structs from the firmware and appends the final reserved map, so it is
> >> (fw reserved map, strings, structs, reserved map).
> >> This still produces the same unflattened device tree.
> >> 
> >> This merges the reserved map from the firmware into the kernel's reserved
> >> map. At the moment SLOF generates an empty reserved map, so this does not
> >> change the existing behaviour with regard to reservations.
> >> 
> >> Only v17 onward is supported, as that is the first version to provide
> >> dt_struct_size; this works out because "fdt-fetch" only produces v17 blobs.
> >> 
> >> If "fdt-fetch" is not available, the old method of fetching the DT is used.
> >> 
> >> [1] https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=e6fc84652c9c00
> >> [2] https://git.qemu.org/?p=SLOF.git;a=commit;h=ecda95906930b80
> >> 
> >> Signed-off-by: Alexey Kardashevskiy 
> >
> > Hrm.  I've gotta say I'm not terribly convinced that it's worth adding
> > a new interface we'll need to maintain to save 8s on a somewhat
> > contrived testcase.
> 
> 256 CPUs aren't that many anymore, though. Although I guess that many PCI
> devices is still a little uncommon.

Yeah, it was the PCI devices I was meaning, not the cpus.

> A 4-socket POWER8 or POWER9 can easily be that large, and a small test
> kernel/userspace will boot in ~2.5-4 seconds. So it's possible that
> the device tree fetch could be a surprisingly non-trivial percentage of
> boot time, at least on some machines.
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
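
The probe-and-fallback flow described in the commit message, as a minimal
sketch (all helper names here are hypothetical, for illustration only - this
is not the actual prom_init.c code):

static void __init prom_get_device_tree(void)
{
	/* New path: one "fdt-fetch" client-interface call returns a v17
	 * blob; the kernel then appends its own reserved map, giving
	 * (fw reserved map, strings, structs, reserved map). */
	if (prom_fdt_fetch() == 0) {
		prom_append_reserved_map();
		return;
	}
	/* Old path: traverse the DT twice via the client interface
	 * (strings blob, then struct blob), plus per-property "getprop". */
	prom_walk_device_tree();
}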


signature.asc
Description: PGP signature


Re: [EXT] Re: [PATCH V4] ASoC: fsl_esai: Add pm runtime function

2019-05-02 Thread Mark Brown
On Thu, May 02, 2019 at 09:13:58AM +, S.j. Wang wrote:

> I am checking, but I don't know why this patch failed on your side. I
> tried to apply this patch on for-5.1, for-5.2, for-linus and for-next;
> all were successful.  The git is
> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git.

> I can't reproduce your problem. Is there anything wrong with my procedure?

The error message I got was:

Applying: ASoC: fsl_esai: Add pm runtime function
error: patch failed: sound/soc/fsl/fsl_esai.c:9
error: sound/soc/fsl/fsl_esai.c: patch does not apply
Patch failed at 0001 ASoC: fsl_esai: Add pm runtime function

which is the header addition.  I can't spot any obvious issues from visually
looking at the patch; the only thing I can think of is some kind of
whitespace damage somewhere.


signature.asc
Description: PGP signature


Re: [PATCH v2 2/2] powerpc/mm: Warn if W+X pages found on boot

2019-05-02 Thread Russell Currey
On Fri, 2019-05-03 at 00:37 +, Joel Stanley wrote:
> On Thu, 2 May 2019 at 07:42, Russell Currey 
> wrote:
> > Implement code to walk all pages and warn if any are found to be both
> > writable and executable.  Depends on STRICT_KERNEL_RWX enabled, and is
> > behind the DEBUG_WX config option.
> > 
> > This only runs on boot and has no runtime performance implications.
> > 
> > Very heavily influenced (and in some cases copied verbatim) from the
> > ARM64 code written by Laura Abbott (thanks!), since our ptdump
> > infrastructure is similar.
> > 
> > Signed-off-by: Russell Currey 
> > ---
> > v2: A myriad of fixes and cleanups thanks to Christophe Leroy
> > 
> >  arch/powerpc/Kconfig.debug | 19 ++
> >  arch/powerpc/include/asm/pgtable.h |  6 +
> >  arch/powerpc/mm/pgtable_32.c   |  3 +++
> >  arch/powerpc/mm/pgtable_64.c   |  3 +++
> >  arch/powerpc/mm/ptdump/ptdump.c| 41 +-
> >  5 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/Kconfig.debug
> > b/arch/powerpc/Kconfig.debug
> > index 4e00cb0a5464..9e8bcddd8b8f 100644
> > --- a/arch/powerpc/Kconfig.debug
> > +++ b/arch/powerpc/Kconfig.debug
> > @@ -361,6 +361,25 @@ config PPC_PTDUMP
> > 
> >   If you are unsure, say N.
> > 
> > +config PPC_DEBUG_WX
> 
> The other architectures call this DEBUG_WX, in case you wanted to name
> it the same.

I did originally; I changed it since we have PPC_PTDUMP, but I don't
really care either way.  mpe can change it if he wants.

> 
> > +   bool "Warn on W+X mappings at boot"
> > +   select PPC_PTDUMP
> > +   help
> > + Generate a warning if any W+X mappings are found at boot.
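
A minimal sketch of the check itself, assuming ptdump-style page walking
(the callback name here is hypothetical, not the actual patch code):

static unsigned long wx_pages;

/* Called for each kernel mapping during the boot-time walk. */
static void note_prot_wx(unsigned long addr, pte_t pte)
{
	if (pte_write(pte) && pte_exec(pte)) {
		WARN_ONCE(1, "powerpc/mm: found W+X mapping at 0x%lx\n", addr);
		wx_pages++;
	}
}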



[PATCH v2 2/3] ibmvscsi: redo driver work thread to use enum action states

2019-05-02 Thread Tyrel Datwyler
From: Tyrel Datwyler 

The current implementation relies on two flags in the driver's private host
structure to signal the need for a host reset or to reenable the CRQ after an
LPAR migration. This patch does away with those flags and introduces a single
action flag and defined enums for the supported kthread work actions. Lastly,
the if/else logic is replaced with a switch statement.

Signed-off-by: Tyrel Datwyler 
---
Changes in v2:
release/grab host_lock around reset/reenable calls

 drivers/scsi/ibmvscsi/ibmvscsi.c | 61 ++--
 drivers/scsi/ibmvscsi/ibmvscsi.h |  9 +++--
 2 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 65fc8ca962c5..8df82c58e7b9 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -828,7 +828,7 @@ static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
	atomic_set(&hostdata->request_limit, 0);
 
purge_requests(hostdata, DID_ERROR);
-   hostdata->reset_crq = 1;
+   hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
	wake_up(&hostdata->work_wait_q);
 }
 
@@ -1797,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
/* We need to re-setup the interpartition connection */
dev_info(hostdata->dev, "Re-enabling adapter!\n");
hostdata->client_migrated = 1;
-   hostdata->reenable_crq = 1;
+   hostdata->action = IBMVSCSI_HOST_ACTION_REENABLE;
purge_requests(hostdata, DID_REQUEUE);
	wake_up(&hostdata->work_wait_q);
} else {
@@ -2116,48 +2116,71 @@ static unsigned long ibmvscsi_get_desired_dma(struct vio_dev *vdev)
 
 static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
 {
+   unsigned long flags;
int rc;
char *action = "reset";
 
-   if (hostdata->reset_crq) {
-   smp_rmb();
-   hostdata->reset_crq = 0;
-
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
+   switch (hostdata->action) {
+   case IBMVSCSI_HOST_ACTION_NONE:
+   break;
+   case IBMVSCSI_HOST_ACTION_RESET:
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
		rc = ibmvscsi_reset_crq_queue(&hostdata->queue, hostdata);
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
		if (!rc)
			rc = ibmvscsi_send_crq(hostdata, 0xC001000000000000LL, 0);
vio_enable_interrupts(to_vio_dev(hostdata->dev));
-   } else if (hostdata->reenable_crq) {
-   smp_rmb();
+   break;
+   case IBMVSCSI_HOST_ACTION_REENABLE:
action = "enable";
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
		rc = ibmvscsi_reenable_crq_queue(&hostdata->queue, hostdata);
-   hostdata->reenable_crq = 0;
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
		if (!rc)
			rc = ibmvscsi_send_crq(hostdata, 0xC001000000000000LL, 0);
-   } else
-   return;
+   break;
+   default:
+   break;
+   }
+
+   hostdata->action = IBMVSCSI_HOST_ACTION_NONE;
 
if (rc) {
		atomic_set(&hostdata->request_limit, -1);
dev_err(hostdata->dev, "error after %s\n", action);
}
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 
scsi_unblock_requests(hostdata->host);
 }
 
-static int ibmvscsi_work_to_do(struct ibmvscsi_host_data *hostdata)
+static int __ibmvscsi_work_to_do(struct ibmvscsi_host_data *hostdata)
 {
if (kthread_should_stop())
return 1;
-   else if (hostdata->reset_crq) {
-   smp_rmb();
-   return 1;
-   } else if (hostdata->reenable_crq) {
-   smp_rmb();
-   return 1;
+   switch (hostdata->action) {
+   case IBMVSCSI_HOST_ACTION_NONE:
+   return 0;
+   case IBMVSCSI_HOST_ACTION_RESET:
+   case IBMVSCSI_HOST_ACTION_REENABLE:
+   default:
+   break;
}
 
-   return 0;
+   return 1;
+}
+
+static int ibmvscsi_work_to_do(struct ibmvscsi_host_data *hostdata)
+{
+   unsigned long flags;
+   int rc;
+
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
+   rc = __ibmvscsi_work_to_do(hostdata);
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+
+   return rc;
 }
 
 static int ibmvscsi_work(void *data)
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.h b/drivers/scsi/ibmvscsi/ibmvscsi.h
index 3a7875575616..04bcbc832dc9 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.h
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.h
@@ -88,13 +88,18 @@ struct event_pool {
dma_addr_t iu_token;
 };
 
+enum ibmvscsi_host_action {
+   IBMVSCSI_HOST_ACTION_NONE = 0,
+  

[PATCH v2 3/3] ibmvscsi: fix tripping of blk_mq_run_hw_queue WARN_ON

2019-05-02 Thread Tyrel Datwyler
From: Tyrel Datwyler 

After a successful SRP login response we call scsi_unblock_requests() to
kick any pending I/Os. The callback that processes this SRP response happens
in a tasklet and therefore runs in softirq context. As a result, when blk-mq
is enabled it is no longer safe to call scsi_unblock_requests() from this
context. Doing so triggers the following WARN_ON splat in dmesg after a host
reset or CRQ reenablement.

WARNING: CPU: 0 PID: 0 at block/blk-mq.c:1375 __blk_mq_run_hw_queue+0x120/0x180
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc8 #4
NIP [c09771e0] __blk_mq_run_hw_queue+0x120/0x180
LR [c0977484] __blk_mq_delay_run_hw_queue+0x244/0x250
Call Trace:

__blk_mq_delay_run_hw_queue+0x244/0x250
blk_mq_run_hw_queue+0x8c/0x1c0
blk_mq_run_hw_queues+0x60/0x90
scsi_run_queue+0x1e4/0x3b0
scsi_run_host_queues+0x48/0x80
login_rsp+0xb0/0x100
ibmvscsi_handle_crq+0x30c/0x3e0
ibmvscsi_task+0x54/0xe0
tasklet_action_common.isra.3+0xc4/0x1a0
__do_softirq+0x174/0x3f4
irq_exit+0xf0/0x120
__do_irq+0xb0/0x210
call_do_irq+0x14/0x24
do_IRQ+0x9c/0x130
hardware_interrupt_common+0x14c/0x150

This patch fixes the issue by introducing a new host action for
unblocking the scsi requests in our separate work thread.

Signed-off-by: Tyrel Datwyler 
---
Changes in v2:
no change

 drivers/scsi/ibmvscsi/ibmvscsi.c | 5 -
 drivers/scsi/ibmvscsi/ibmvscsi.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 8df82c58e7b9..727c31dc11a0 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1179,7 +1179,8 @@ static void login_rsp(struct srp_event_struct *evt_struct)
	atomic_set(&hostdata->request_limit,
		   be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
/* If we had any pending I/Os, kick them */
-   scsi_unblock_requests(hostdata->host);
+   hostdata->action = IBMVSCSI_HOST_ACTION_UNBLOCK;
+	wake_up(&hostdata->work_wait_q);
 }
 
 /**
@@ -2123,6 +2124,7 @@ static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
spin_lock_irqsave(hostdata->host->host_lock, flags);
switch (hostdata->action) {
case IBMVSCSI_HOST_ACTION_NONE:
+   case IBMVSCSI_HOST_ACTION_UNBLOCK:
break;
case IBMVSCSI_HOST_ACTION_RESET:
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
@@ -2164,6 +2166,7 @@ static int __ibmvscsi_work_to_do(struct ibmvscsi_host_data *hostdata)
return 0;
case IBMVSCSI_HOST_ACTION_RESET:
case IBMVSCSI_HOST_ACTION_REENABLE:
+   case IBMVSCSI_HOST_ACTION_UNBLOCK:
default:
break;
}
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.h b/drivers/scsi/ibmvscsi/ibmvscsi.h
index 04bcbc832dc9..d9bf502334ba 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.h
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.h
@@ -92,6 +92,7 @@ enum ibmvscsi_host_action {
IBMVSCSI_HOST_ACTION_NONE = 0,
IBMVSCSI_HOST_ACTION_RESET,
IBMVSCSI_HOST_ACTION_REENABLE,
+   IBMVSCSI_HOST_ACTION_UNBLOCK,
 };
 
 /* all driver data associated with a host adapter */
-- 
2.18.1



[PATCH v2 1/3] ibmvscsi: Wire up host_reset() in the drivers scsi_host_template

2019-05-02 Thread Tyrel Datwyler
From: Tyrel Datwyler 

Wire up the host_reset function in our driver_template to allow a user
requested adapter reset via the host_reset sysfs attribute.

Example:

echo "adapter" > /sys/class/scsi_host/host0/host_reset

Signed-off-by: Tyrel Datwyler 
---
Changes in v2:
removed interrupt disabe/enable around reset

 drivers/scsi/ibmvscsi/ibmvscsi.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 8cec5230fe31..65fc8ca962c5 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -2050,6 +2050,16 @@ static struct device_attribute ibmvscsi_host_config = {
.show = show_host_config,
 };
 
+static int ibmvscsi_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+   struct ibmvscsi_host_data *hostdata = shost_priv(shost);
+
+   dev_info(hostdata->dev, "Initiating adapter reset!\n");
+   ibmvscsi_reset_host(hostdata);
+
+   return 0;
+}
+
 static struct device_attribute *ibmvscsi_attrs[] = {
	&ibmvscsi_host_vhost_loc,
	&ibmvscsi_host_vhost_name,
@@ -2076,6 +2086,7 @@ static struct scsi_host_template driver_template = {
.eh_host_reset_handler = ibmvscsi_eh_host_reset_handler,
.slave_configure = ibmvscsi_slave_configure,
.change_queue_depth = ibmvscsi_change_queue_depth,
+   .host_reset = ibmvscsi_host_reset,
.cmd_per_lun = IBMVSCSI_CMDS_PER_LUN_DEFAULT,
.can_queue = IBMVSCSI_MAX_REQUESTS_DEFAULT,
.this_id = -1,
-- 
2.18.1



Re: [PATCH v2 2/2] powerpc/mm: Warn if W+X pages found on boot

2019-05-02 Thread Joel Stanley
On Thu, 2 May 2019 at 07:42, Russell Currey  wrote:
>
> Implement code to walk all pages and warn if any are found to be both
> writable and executable.  Depends on STRICT_KERNEL_RWX enabled, and is
> behind the DEBUG_WX config option.
>
> This only runs on boot and has no runtime performance implications.
>
> Very heavily influenced (and in some cases copied verbatim) from the
> ARM64 code written by Laura Abbott (thanks!), since our ptdump
> infrastructure is similar.
>
> Signed-off-by: Russell Currey 
> ---
> v2: A myriad of fixes and cleanups thanks to Christophe Leroy
>
>  arch/powerpc/Kconfig.debug | 19 ++
>  arch/powerpc/include/asm/pgtable.h |  6 +
>  arch/powerpc/mm/pgtable_32.c   |  3 +++
>  arch/powerpc/mm/pgtable_64.c   |  3 +++
>  arch/powerpc/mm/ptdump/ptdump.c| 41 +-
>  5 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
> index 4e00cb0a5464..9e8bcddd8b8f 100644
> --- a/arch/powerpc/Kconfig.debug
> +++ b/arch/powerpc/Kconfig.debug
> @@ -361,6 +361,25 @@ config PPC_PTDUMP
>
>   If you are unsure, say N.
>
> +config PPC_DEBUG_WX

The other architectures call this DEBUG_WX, in case you wanted to name
it the same.

> +   bool "Warn on W+X mappings at boot"
> +   select PPC_PTDUMP
> +   help
> + Generate a warning if any W+X mappings are found at boot.


Re: [PATCH 1/3] ibmvscsi: Wire up host_reset() in the drivers scsi_host_template

2019-05-02 Thread Tyrel Datwyler
On 05/02/2019 02:50 PM, Brian King wrote:
> On 5/1/19 7:47 PM, Tyrel Datwyler wrote:
>> From: Tyrel Datwyler 
>>
>> Wire up the host_reset function in our driver_template to allow a user
>> requested adapter reset via the host_reset sysfs attribute.
>>
>> Example:
>>
>> echo "adapter" > /sys/class/scsi_host/host0/host_reset
>>
>> Signed-off-by: Tyrel Datwyler 
>> ---
>>  drivers/scsi/ibmvscsi/ibmvscsi.c | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
>> index 8cec5230fe31..1c37244f16a0 100644
>> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
>> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
>> @@ -2050,6 +2050,18 @@ static struct device_attribute ibmvscsi_host_config = {
>>  .show = show_host_config,
>>  };
>>  
>> +static int ibmvscsi_host_reset(struct Scsi_Host *shost, int reset_type)
>> +{
>> +struct ibmvscsi_host_data *hostdata = shost_priv(shost);
>> +
>> +vio_disable_interrupts(to_vio_dev(hostdata->dev));
>> +dev_info(hostdata->dev, "Initiating adapter reset!\n");
>> +ibmvscsi_reset_host(hostdata);
>> +vio_enable_interrupts(to_vio_dev(hostdata->dev));
> 
> Is it necessary to disable / enable interrupts around the call to
> ibmvscsi_reset_host?  I don't know why we'd need to do that before calling
> the reset, as we have other cases, like ibmvscsi_timeout, where we don't
> bother doing this. Also, at the end of the reset we look to be already
> enabling interrupts.

Yeah, I think you are right. My initial line of thought was that we have
interrupts disabled in handle_crq when we do a reset, but we clearly call
it with them enabled in the timeout case.

-Tyrel

> 
> Thanks,
> 
> Brian
> 



Re: [PATCH 2/3] ibmvscsi: redo driver work thread to use enum action states

2019-05-02 Thread Tyrel Datwyler
On 05/02/2019 02:43 PM, Brian King wrote:
> On 5/1/19 7:47 PM, Tyrel Datwyler wrote:
>> From: Tyrel Datwyler 
>>
>> The current implementation relies on two flags in the driver's private host
>> structure to signal the need for a host reset or to reenable the CRQ after an
>> LPAR migration. This patch does away with those flags and introduces a single
>> action flag and defined enums for the supported kthread work actions. Lastly,
>> the if/else logic is replaced with a switch statement.
>>
>> Signed-off-by: Tyrel Datwyler 
>> ---
>>  drivers/scsi/ibmvscsi/ibmvscsi.c | 57 +---
>>  drivers/scsi/ibmvscsi/ibmvscsi.h |  9 +++--
>>  2 files changed, 45 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
>> index 1c37244f16a0..683139e6c63f 100644
>> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
>> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
>> @@ -828,7 +828,7 @@ static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
>>  atomic_set(&hostdata->request_limit, 0);
>>  
>>  purge_requests(hostdata, DID_ERROR);
>> -hostdata->reset_crq = 1;
>> +hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
>>  wake_up(&hostdata->work_wait_q);
>>  }
>>  
>> @@ -1797,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
>>  /* We need to re-setup the interpartition connection */
>>  dev_info(hostdata->dev, "Re-enabling adapter!\n");
>>  hostdata->client_migrated = 1;
>> -hostdata->reenable_crq = 1;
>> +hostdata->action = IBMVSCSI_HOST_ACTION_REENABLE;
>>  purge_requests(hostdata, DID_REQUEUE);
>>  wake_up(&hostdata->work_wait_q);
>>  } else {
>> @@ -2118,26 +2118,32 @@ static unsigned long ibmvscsi_get_desired_dma(struct vio_dev *vdev)
>>  
>>  static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
>>  {
>> +unsigned long flags;
>>  int rc;
>>  char *action = "reset";
>>  
>> -if (hostdata->reset_crq) {
>> -smp_rmb();
>> -hostdata->reset_crq = 0;
>> -
>> +spin_lock_irqsave(hostdata->host->host_lock, flags);
>> +switch (hostdata->action) {
>> +case IBMVSCSI_HOST_ACTION_NONE:
>> +break;
>> +case IBMVSCSI_HOST_ACTION_RESET:
>>  rc = ibmvscsi_reset_crq_queue(&hostdata->queue, hostdata);
> 
> Looks like you are now calling ibmvscsi_reset_crq_queue with the host_lock
> held.  However, ibmvscsi_reset_crq_queue can call msleep.

Good catch. I remember thinking that needed to run lockless, but I clearly
failed to release and re-grab the lock around that call.

-Tyrel

> 
> This had been implemented as separate reset_crq and reenable_crq fields
> so that it could run lockless. I'm not opposed to changing this to a single
> field in general, we just need to be careful where we are adding locking.
> 
> Thanks,
> 
> Brian
> 



Re: [PATCH kernel] prom_init: Fetch flatten device tree from the system firmware

2019-05-02 Thread Stewart Smith
David Gibson  writes:
> On Wed, May 01, 2019 at 01:42:21PM +1000, Alexey Kardashevskiy wrote:
>> At the moment, on a 256-CPU + 256-PCI-device guest, it takes the guest
>> about 8.5 seconds to fetch the entire device tree via the client interface,
>> as the DT is traversed twice - once for the strings blob and once for the
>> struct blob. "getprop" is also quite slow, as SLOF stores properties in a
>> linked list.
>> 
>> However, since [1], SLOF builds a flattened device tree (FDT) for another
>> purpose, and [2] adds a new "fdt-fetch" client interface for the OS to
>> fetch the FDT.
>> 
>> This patch tries the new method; if it is not supported, it falls back
>> to the old method.
>> 
>> There is a change in the FDT layout - the old method produced
>> (reserved map, strings, structs); the new one receives only strings and
>> structs from the firmware and appends the final reserved map, so it is
>> (fw reserved map, strings, structs, reserved map).
>> This still produces the same unflattened device tree.
>> 
>> This merges the reserved map from the firmware into the kernel's reserved
>> map. At the moment SLOF generates an empty reserved map, so this does not
>> change the existing behaviour with regard to reservations.
>> 
>> Only v17 onward is supported, as that is the first version to provide
>> dt_struct_size; this works out because "fdt-fetch" only produces v17 blobs.
>> 
>> If "fdt-fetch" is not available, the old method of fetching the DT is used.
>> 
>> [1] https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=e6fc84652c9c00
>> [2] https://git.qemu.org/?p=SLOF.git;a=commit;h=ecda95906930b80
>> 
>> Signed-off-by: Alexey Kardashevskiy 
>
> Hrm.  I've gotta say I'm not terribly convinced that it's worth adding
> a new interface we'll need to maintain to save 8s on a somewhat
> contrived testcase.

256 CPUs aren't that many anymore, though. Although I guess that many PCI
devices is still a little uncommon.

A 4-socket POWER8 or POWER9 can easily be that large, and a small test
kernel/userspace will boot in ~2.5-4 seconds. So it's possible that
the device tree fetch could be a surprisingly non-trivial percentage of
boot time, at least on some machines.


-- 
Stewart Smith
OPAL Architect, IBM.



Re: Linux 5.1-rc5

2019-05-02 Thread Christoph Hellwig
On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> largely irrelevant, partly since even theoretically this whole issue
> needs a _lot_ of memory.

Adding the relevant people - while they might be irrelevant, at least
mips and sparc have some giant memory systems.  And I'd really like
to see the arch-specific GUP implementations go away for other
reasons, as we have a few issues to sort out with GUP usage now
(we just had discussions at LSF/MM), and the fewer implementations we
have to deal with, the better.


Re: [PATCH 1/3] ibmvscsi: Wire up host_reset() in the drivers scsi_host_template

2019-05-02 Thread Brian King
On 5/1/19 7:47 PM, Tyrel Datwyler wrote:
> From: Tyrel Datwyler 
> 
> Wire up the host_reset function in our driver_template to allow a user
> requested adapter reset via the host_reset sysfs attribute.
> 
> Example:
> 
> echo "adapter" > /sys/class/scsi_host/host0/host_reset
> 
> Signed-off-by: Tyrel Datwyler 
> ---
>  drivers/scsi/ibmvscsi/ibmvscsi.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 8cec5230fe31..1c37244f16a0 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -2050,6 +2050,18 @@ static struct device_attribute ibmvscsi_host_config = {
>   .show = show_host_config,
>  };
>  
> +static int ibmvscsi_host_reset(struct Scsi_Host *shost, int reset_type)
> +{
> + struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> +
> + vio_disable_interrupts(to_vio_dev(hostdata->dev));
> + dev_info(hostdata->dev, "Initiating adapter reset!\n");
> + ibmvscsi_reset_host(hostdata);
> + vio_enable_interrupts(to_vio_dev(hostdata->dev));

Is it necessary to disable / enable interrupts around the call to
ibmvscsi_reset_host?  I don't know why we'd need to do that before calling
the reset, as we have other cases, like ibmvscsi_timeout, where we don't
bother doing this. Also, at the end of the reset we look to be already
enabling interrupts.

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH 2/3] ibmvscsi: redo driver work thread to use enum action states

2019-05-02 Thread Brian King
On 5/1/19 7:47 PM, Tyrel Datwyler wrote:
> From: Tyrel Datwyler 
> 
> The current implementation relies on two flags in the driver's private host
> structure to signal the need for a host reset or to reenable the CRQ after an
> LPAR migration. This patch does away with those flags and introduces a single
> action flag and defined enums for the supported kthread work actions. Lastly,
> the if/else logic is replaced with a switch statement.
> 
> Signed-off-by: Tyrel Datwyler 
> ---
>  drivers/scsi/ibmvscsi/ibmvscsi.c | 57 +---
>  drivers/scsi/ibmvscsi/ibmvscsi.h |  9 +++--
>  2 files changed, 45 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 1c37244f16a0..683139e6c63f 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -828,7 +828,7 @@ static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
>   atomic_set(&hostdata->request_limit, 0);
>  
>   purge_requests(hostdata, DID_ERROR);
> - hostdata->reset_crq = 1;
> + hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
>   wake_up(&hostdata->work_wait_q);
>  }
>  
> @@ -1797,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
>   /* We need to re-setup the interpartition connection */
>   dev_info(hostdata->dev, "Re-enabling adapter!\n");
>   hostdata->client_migrated = 1;
> - hostdata->reenable_crq = 1;
> + hostdata->action = IBMVSCSI_HOST_ACTION_REENABLE;
>   purge_requests(hostdata, DID_REQUEUE);
>   wake_up(&hostdata->work_wait_q);
>   } else {
> @@ -2118,26 +2118,32 @@ static unsigned long ibmvscsi_get_desired_dma(struct vio_dev *vdev)
>  
>  static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
>  {
> + unsigned long flags;
>   int rc;
>   char *action = "reset";
>  
> - if (hostdata->reset_crq) {
> - smp_rmb();
> - hostdata->reset_crq = 0;
> -
> + spin_lock_irqsave(hostdata->host->host_lock, flags);
> + switch (hostdata->action) {
> + case IBMVSCSI_HOST_ACTION_NONE:
> + break;
> + case IBMVSCSI_HOST_ACTION_RESET:
>   rc = ibmvscsi_reset_crq_queue(&hostdata->queue, hostdata);

Looks like you are now calling ibmvscsi_reset_crq_queue with the host_lock held.
However, ibmvscsi_reset_crq_queue can call msleep.

This had been implemented as separate reset_crq and reenable_crq fields
so that it could run lockless. I'm not opposed to changing this to a single
field in general, we just need to be careful where we are adding locking.

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



[PATCH] Fix wrong message when RFI Flush is disable

2019-05-02 Thread Gustavo Walbon
From: "Gustavo L. F. Walbon" 

The issue was that sysfs showed a "Mitigation" message regardless of the
state of "RFI Flush", but it should show "Vulnerable" when RFI Flush is
disabled.

If the "L1D private" feature is enabled but "RFI Flush" is not, you are
still vulnerable to Meltdown attacks.

"RFI Flush" is the key feature for mitigating Meltdown, regardless of the
"L1D private" state.

SEC_FTR_L1D_THREAD_PRIV is a feature for Power9 only.

So the message should be as the truth table shows:

 CPU | L1D private | RFI Flush | sysfs
 ----|-------------|-----------|-----------------------------------------------
  P9 |    False    |   False   | Vulnerable
  P9 |    False    |   True    | Mitigation: RFI Flush
  P9 |    True     |   False   | Vulnerable: L1D private per thread
  P9 |    True     |   True    | Mitigation: RFI Flush, L1D private per thread
  P8 |    False    |   False   | Vulnerable
  P8 |    False    |   True    | Mitigation: RFI Flush

Output before this fix:
 # cat /sys/devices/system/cpu/vulnerabilities/meltdown
 Mitigation: RFI Flush, L1D private per thread
 # echo 0 > /sys/kernel/debug/powerpc/rfi_flush
 # cat /sys/devices/system/cpu/vulnerabilities/meltdown
 Mitigation: L1D private per thread

Output after fix:
 # cat /sys/devices/system/cpu/vulnerabilities/meltdown
 Mitigation: RFI Flush, L1D private per thread
 # echo 0 > /sys/kernel/debug/powerpc/rfi_flush
 # cat /sys/devices/system/cpu/vulnerabilities/meltdown
 Vulnerable: L1D private per thread

Link: https://github.com/linuxppc/issues/issues/243

Signed-off-by: Gustavo L. F. Walbon 
Signed-off-by: Mauro S. M. Rodrigues 
---
 arch/powerpc/kernel/security.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index b33bafb8fcea..e08b81ef43b8 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -130,26 +130,22 @@ ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
 
thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
 
-   if (rfi_flush || thread_priv) {
+   if (rfi_flush) {
		struct seq_buf s;
		seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-		seq_buf_printf(&s, "Mitigation: ");
-
-		if (rfi_flush)
-			seq_buf_printf(&s, "RFI Flush");
-
-		if (rfi_flush && thread_priv)
-			seq_buf_printf(&s, ", ");
-
+		seq_buf_printf(&s, "Mitigation: RFI Flush");
		if (thread_priv)
-			seq_buf_printf(&s, "L1D private per thread");
+			seq_buf_printf(&s, ", L1D private per thread");
 
		seq_buf_printf(&s, "\n");
 
return s.len;
}
 
+   if (thread_priv)
+   return sprintf(buf, "Vulnerable: L1D private per thread\n");
+
if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
!security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
return sprintf(buf, "Not affected\n");
-- 
2.19.1



Re: [PATCH 08/15] mips: switch to generic version of pte allocation

2019-05-02 Thread Paul Burton
Hi Mike,

On Thu, May 02, 2019 at 06:28:35PM +0300, Mike Rapoport wrote:
> MIPS allocates kernel PTE pages with
> 
>   __get_free_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER)
> 
> and user PTE pages with
> 
>   alloc_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER)

That bit isn't quite true - we don't use __GFP_ZERO in pte_alloc_one() &
instead call clear_highpage() on the allocated page. Not that I have a
problem with using __GFP_ZERO - it seems like the more optimal choice.
It just might be worth mentioning the change & expected equivalent
behavior.
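
For clarity, the before/after allocation styles being compared, roughly
(a sketch, not the patch itself; PTE_ORDER is 0 on MIPS, so it is a single
page either way):

	/* Old MIPS pte_alloc_one(): allocate, then zero explicitly. */
	pte = alloc_pages(GFP_KERNEL, PTE_ORDER);
	if (pte)
		clear_highpage(pte);

	/* Generic version: the allocator hands back pre-zeroed pages. */
	pte = alloc_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER);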

Otherwise:

Acked-by: Paul Burton 

Thanks,
Paul


Re: [PATCH 01/15] asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]

2019-05-02 Thread Paul Burton
Hi Mike,

On Thu, May 02, 2019 at 06:28:28PM +0300, Mike Rapoport wrote:
> +/**
> + * pte_free_kernel - free PTE-level user page table page
> + * @mm: the mm_struct of the current context
> + * @pte_page: the `struct page` representing the page table
> + */
> +static inline void pte_free(struct mm_struct *mm, struct page *pte_page)
> +{
> + pgtable_page_dtor(pte_page);
> + __free_page(pte_page);
> +}

Nit: the comment names the wrong function (s/pte_free_kernel/pte_free/).

Thanks,
Paul


Re: [EXT] Re: [PATCH V4] ASoC: fsl_esai: Add pm runtime function

2019-05-02 Thread Nicolin Chen
On Thu, May 02, 2019 at 09:13:58AM +, S.j. Wang wrote:
> > On Sun, Apr 28, 2019 at 02:24:54AM +, S.j. Wang wrote:
> > > Add pm runtime support and move clock handling there.
> > > Close the clocks at suspend to reduce the power consumption.
> > >
> > > fsl_esai_suspend is replaced by pm_runtime_force_suspend.
> > > fsl_esai_resume is replaced by pm_runtime_force_resume.
> > 
> > This doesn't apply against for-5.2 again.  Sorry about this, I think
> > this one is due to some mess-ups with my scripts which caused some
> > patches to be dropped for a while (and it's likely to be what happened
> > the last time as well).  Can you check and resend again please?  Like I
> > say, sorry about this, I think it's my mistake.
> 
> I am checking, but I don't know why this patch failed on your side. I
> tried to apply this patch on for-5.1, for-5.2, for-linus and for-next;
> all are

I just tried to apply it against top of trees of for-next and for-5.2
and both were fine on my side too.

> successful.  The git is
> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git.

Btw, this git link no longer works for me, not sure why:
# git remote add broonie git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
# git fetch broonie
fatal: remote error: access denied or repository not exported: /m/korg/pub/scm/linux/kernel/git/broonie/sound.git

It started to work after I changed "git://" to "https://" though...


Re: [PATCH 12/15] powerpc/nohash/64: switch to generic version of pte allocation

2019-05-02 Thread Christophe Leroy




On 02/05/2019 at 17:28, Mike Rapoport wrote:

The 64-bit book-E powerpc implements pte_alloc_one(),
pte_alloc_one_kernel(), pte_free_kernel() and pte_free() the same way as
the generic version.


Will soon be converted to the same as the 3 other PPC subarches, see
https://patchwork.ozlabs.org/patch/1091590/

Christophe



Switch it to the generic version that does exactly the same thing.

Signed-off-by: Mike Rapoport 
---
  arch/powerpc/include/asm/nohash/64/pgalloc.h | 35 ++--
  1 file changed, 2 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index 66d086f..bfb53a0 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -11,6 +11,8 @@
  #include 
  #include 
  
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */

+
  struct vmemmap_backing {
struct vmemmap_backing *list;
unsigned long phys;
@@ -92,39 +94,6 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
  }
  
-

-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *page;
-   pte_t *pte;
-
-   pte = (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT);
-   if (!pte)
-   return NULL;
-   page = virt_to_page(pte);
-   if (!pgtable_page_ctor(page)) {
-   __free_page(page);
-   return NULL;
-   }
-   return page;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-{
-   pgtable_page_dtor(ptepage);
-   __free_page(ptepage);
-}
-
  static inline void pgtable_free(void *table, int shift)
  {
if (!shift) {



[PATCH v2 1/3] powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h

2019-05-02 Thread Christophe Leroy
PPC_HA() PPC_HI() and PPC_LO() macros are nice macros. Move them
from module64.c to ppc-opcode.h in order to use them in other places.

Signed-off-by: Christophe Leroy 
---
v2: no change

 arch/powerpc/include/asm/ppc-opcode.h | 7 +++
 arch/powerpc/kernel/module_64.c   | 7 ---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 23f7ed796f38..c5ff44400d4d 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -412,6 +412,13 @@
 #define __PPC_SPR(r)	((((r) & 0x1f) << 16) | ((((r) >> 5) & 0x1f) << 11))
 #define __PPC_RC21 (0x1 << 10)
 
+/* Both low and high 16 bits are added as SIGNED additions, so if low
+   16 bits has high bit set, high 16 bits must be adjusted.  These
+   macros do that (stolen from binutils). */
+#define PPC_LO(v) ((v) & 0xffff)
+#define PPC_HI(v) (((v) >> 16) & 0xffff)
+#define PPC_HA(v) PPC_HI ((v) + 0x8000)
+
 /*
  * Only use the larx hint bit on 64bit CPUs. e500v1/v2 based CPUs will treat a
  * larx with EH set as an illegal instruction.
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 8661eea78503..c2e1b06253b8 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -400,13 +400,6 @@ static inline unsigned long my_r2(const Elf64_Shdr *sechdrs, struct module *me)
return (sechdrs[me->arch.toc_section].sh_addr & ~0xfful) + 0x8000;
 }
 
-/* Both low and high 16 bits are added as SIGNED additions, so if low
-   16 bits has high bit set, high 16 bits must be adjusted.  These
-   macros do that (stolen from binutils). */
-#define PPC_LO(v) ((v) & 0xffff)
-#define PPC_HI(v) (((v) >> 16) & 0xffff)
-#define PPC_HA(v) PPC_HI ((v) + 0x8000)
-
 /* Patch stub to reference function and correct r2 value. */
 static inline int create_stub(const Elf64_Shdr *sechdrs,
  struct ppc64_stub_entry *entry,
-- 
2.13.3
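
A worked example of why the @ha adjustment in these macros matters
(illustration only, not part of the patch): take val = 0x10008000.

	PPC_LO(val) = 0x8000, which addi sign-extends to -0x8000.
	With plain PPC_HI: lis 0x1000 then addi -0x8000 gives 0x0fff8000 (wrong).
	PPC_HA(val) = PPC_HI(val + 0x8000) = 0x1001,
	so lis 0x1001 then addi -0x8000 gives 0x10008000 (correct).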



[PATCH v2 2/3] powerpc/module32: Use symbolic instructions names.

2019-05-02 Thread Christophe Leroy
To increase readability/maintainability, replace hard coded
instructions values by symbolic names.

Signed-off-by: Christophe Leroy 
---
v2: Remove the ENTRY_JMP0 and ENTRY_JMP1 macros ; left real instructions as a 
comment.

 arch/powerpc/kernel/module_32.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index 88d83771f462..9cf20d6c 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -172,10 +172,12 @@ int module_frob_arch_sections(Elf32_Ehdr *hdr,
 
 static inline int entry_matches(struct ppc_plt_entry *entry, Elf32_Addr val)
 {
-	if (entry->jump[0] == 0x3d800000 + ((val + 0x8000) >> 16)
-	    && entry->jump[1] == 0x398c0000 + (val & 0xffff))
-		return 1;
-	return 0;
+   if (entry->jump[0] != (PPC_INST_ADDIS | __PPC_RT(R12) | PPC_HA(val)))
+   return 0;
+   if (entry->jump[1] != (PPC_INST_ADDI | __PPC_RT(R12) | __PPC_RA(R12) |
+  PPC_LO(val)))
+   return 0;
+   return 1;
 }
 
 /* Set up a trampoline in the PLT to bounce us to the distant function */
@@ -200,10 +202,16 @@ static uint32_t do_plt_call(void *location,
entry++;
}
 
-	entry->jump[0] = 0x3d800000+((val+0x8000)>>16);	/* lis r12,sym@ha */
-	entry->jump[1] = 0x398c0000 + (val&0xffff);	/* addi r12,r12,sym@l */
-	entry->jump[2] = 0x7d8903a6;			/* mtctr r12 */
-	entry->jump[3] = 0x4e800420;			/* bctr */
+   /*
+* lis r12, sym@ha
+* addi r12, r12, sym@l
+* mtctr r12
+* bctr
+*/
+   entry->jump[0] = PPC_INST_ADDIS | __PPC_RT(R12) | PPC_HA(val);
+	entry->jump[1] = PPC_INST_ADDI | __PPC_RT(R12) | __PPC_RA(R12) | PPC_LO(val);
+   entry->jump[2] = PPC_INST_MTCTR | __PPC_RS(R12);
+   entry->jump[3] = PPC_INST_BCTR;
 
pr_debug("Initialized plt for 0x%x at %p\n", val, entry);
return (uint32_t)entry;
-- 
2.13.3



[PATCH v2 3/3] powerpc/module64: Use symbolic instructions names.

2019-05-02 Thread Christophe Leroy
To increase readability/maintainability, replace hard coded
instructions values by symbolic names.

Signed-off-by: Christophe Leroy 
---
v2: rearranged comments ; fixed warning by adding () in an 'if' around X | Y

 arch/powerpc/kernel/module_64.c | 53 +++--
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index c2e1b06253b8..516c590c7a1f 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -133,20 +133,27 @@ struct ppc64_stub_entry
  * the stub, but it's significantly shorter to put these values at the
  * end of the stub code, and patch the stub address (32-bits relative
  * to the TOC ptr, r2) into the stub.
+ *
+ * addis   r11,r2, <high>
+ * addi    r11,r11, <low>
+ * std r2,R2_STACK_OFFSET(r1)
+ * ld  r12,32(r11)
+ * ld  r2,40(r11)
+ * mtctr   r12
+ * bctr
  */
-
 static u32 ppc64_stub_insns[] = {
-	0x3d620000,	/* addis   r11,r2, <high> */
-	0x396b0000,	/* addi    r11,r11, <low> */
+   PPC_INST_ADDIS | __PPC_RT(R11) | __PPC_RA(R2),
+   PPC_INST_ADDI | __PPC_RT(R11) | __PPC_RA(R11),
/* Save current r2 value in magic place on the stack. */
-	0xf8410000|R2_STACK_OFFSET,	/* std r2,R2_STACK_OFFSET(r1) */
-   0xe98b0020, /* ld  r12,32(r11) */
+   PPC_INST_STD | __PPC_RS(R2) | __PPC_RA(R1) | R2_STACK_OFFSET,
+   PPC_INST_LD | __PPC_RT(R12) | __PPC_RA(R11) | 32,
 #ifdef PPC64_ELF_ABI_v1
/* Set up new r2 from function descriptor */
-   0xe84b0028, /* ld  r2,40(r11) */
+   PPC_INST_LD | __PPC_RT(R2) | __PPC_RA(R11) | 40,
 #endif
-   0x7d8903a6, /* mtctr   r12 */
-   0x4e800420  /* bctr */
+   PPC_INST_MTCTR | __PPC_RS(R12),
+   PPC_INST_BCTR,
 };
 
 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -704,18 +711,21 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 *  ld r2, ...(r12)
 *  add r2, r2, r12
 */
-		if ((((uint32_t *)location)[0] & ~0xfffc)
-		    != 0xe84c0000)
+		if ((((uint32_t *)location)[0] & ~0xfffc) !=
+		    (PPC_INST_LD | __PPC_RT(R2) | __PPC_RA(R12)))
			break;
-		if (((uint32_t *)location)[1] != 0x7c426214)
+		if (((uint32_t *)location)[1] !=
+		    (PPC_INST_ADD | __PPC_RT(R2) | __PPC_RA(R2) | __PPC_RB(R12)))
break;
/*
 * If found, replace it with:
 *  addis r2, r12, (.TOC.-func)@ha
 *  addi r2, r12, (.TOC.-func)@l
 */
-		((uint32_t *)location)[0] = 0x3c4c0000 + PPC_HA(value);
-		((uint32_t *)location)[1] = 0x38420000 + PPC_LO(value);
+		((uint32_t *)location)[0] = PPC_INST_ADDIS | __PPC_RT(R2) |
+					    __PPC_RA(R12) | PPC_HA(value);
+		((uint32_t *)location)[1] = PPC_INST_ADDI | __PPC_RT(R2) |
+					    __PPC_RA(R12) | PPC_LO(value);
break;
 
case R_PPC64_REL16_HA:
@@ -769,12 +779,19 @@ static unsigned long create_ftrace_stub(const Elf64_Shdr *sechdrs,
 {
struct ppc64_stub_entry *entry;
unsigned int i, num_stubs;
+   /*
+	 * ld      r12,PACATOC(r13)
+	 * addis   r12,r12, <high>
+	 * addi    r12,r12, <low>
+	 * mtctr   r12
+	 * bctr
+	 */
static u32 stub_insns[] = {
-		0xe98d0000 | PACATOC,	/* ld      r12,PACATOC(r13) */
-		0x3d8c0000,		/* addis   r12,r12, <high> */
-		0x398c0000,		/* addi    r12,r12, <low> */
-   0x7d8903a6, /* mtctr   r12  */
-   0x4e800420, /* bctr */
+   PPC_INST_LD | __PPC_RT(R12) | __PPC_RA(R13) | PACATOC,
+   PPC_INST_ADDIS | __PPC_RT(R12) | __PPC_RA(R12),
+   PPC_INST_ADDI | __PPC_RT(R12) | __PPC_RA(R12),
+   PPC_INST_MTCTR | __PPC_RS(R12),
+   PPC_INST_BCTR,
};
long reladdr;
 
-- 
2.13.3



Re: [PATCH] memblock: make keeping memblock memory opt-in rather than opt-out

2019-05-02 Thread Michael Ellerman
Mike Rapoport  writes:
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 2d0be82..39877b9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -143,6 +143,7 @@ config PPC
>   select ARCH_HAS_UBSAN_SANITIZE_ALL
>   select ARCH_HAS_ZONE_DEVICE if PPC_BOOK3S_64
>   select ARCH_HAVE_NMI_SAFE_CMPXCHG
> + select ARCH_KEEP_MEMBLOCK

Acked-by: Michael Ellerman  (powerpc)

cheers


[PATCH 15/15] unicore32: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
Replace __get_free_page() and alloc_pages() calls with the generic
__pte_alloc_one_kernel() and __pte_alloc_one().

There is no functional change for the kernel PTE allocation.

The difference for the user PTEs is that clean_pte_table() is now
called after pgtable_page_ctor(), and that __GFP_ACCOUNT is added to the
GFP flags.

The pte_free() and pte_free_kernel() versions are identical to the generic
ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/unicore32/include/asm/pgalloc.h | 36 
 1 file changed, 8 insertions(+), 28 deletions(-)

diff --git a/arch/unicore32/include/asm/pgalloc.h b/arch/unicore32/include/asm/pgalloc.h
index 7cceabe..dd09af6 100644
--- a/arch/unicore32/include/asm/pgalloc.h
+++ b/arch/unicore32/include/asm/pgalloc.h
@@ -17,6 +17,10 @@
 #include 
 #include 
 
+#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
+#define __HAVE_ARCH_PTE_ALLOC_ONE
+#include <asm-generic/pgalloc.h>
+
 #define check_pgt_cache()  do { } while (0)
 
 #define _PAGE_USER_TABLE   (PMD_TYPE_TABLE | PMD_PRESENT)
@@ -28,17 +32,14 @@ extern void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd);
 #define pgd_alloc(mm)  get_pgd_slow(mm)
 #define pgd_free(mm, pgd)  free_pgd_slow(mm, pgd)
 
-#define PGALLOC_GFP	(GFP_KERNEL | __GFP_ZERO)
-
 /*
  * Allocate one PTE table.
  */
 static inline pte_t *
 pte_alloc_one_kernel(struct mm_struct *mm)
 {
-   pte_t *pte;
+   pte_t *pte = __pte_alloc_one_kernel(mm);
 
-   pte = (pte_t *)__get_free_page(PGALLOC_GFP);
if (pte)
clean_dcache_area(pte, PTRS_PER_PTE * sizeof(pte_t));
 
@@ -50,35 +51,14 @@ pte_alloc_one(struct mm_struct *mm)
 {
struct page *pte;
 
-   pte = alloc_pages(PGALLOC_GFP, 0);
+   pte = __pte_alloc_one(mm, GFP_PGTABLE_USER);
if (!pte)
return NULL;
-   if (!PageHighMem(pte)) {
-   void *page = page_address(pte);
-   clean_dcache_area(page, PTRS_PER_PTE * sizeof(pte_t));
-   }
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   }
-
+   if (!PageHighMem(pte))
+   clean_pte_table(page_address(pte));
return pte;
 }
 
-/*
- * Free one PTE table.
- */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   if (pte)
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 static inline void __pmd_populate(pmd_t *pmdp, unsigned long pmdval)
 {
set_pmd(pmdp, __pmd(pmdval));
-- 
2.7.4



[PATCH 13/15] riscv: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The only difference between the generic and RISC-V implementation of PTE
allocation is the usage of __GFP_RETRY_MAYFAIL for both kernel and user
PTEs and the absence of __GFP_ACCOUNT for the user PTEs.

The conversion to the generic version removes the __GFP_RETRY_MAYFAIL and
ensures that __GFP_ACCOUNT is used for the user PTE allocations.

The pte_free() and pte_free_kernel() versions are identical to the generic
ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/riscv/include/asm/pgalloc.h | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 94043cf..48f28bb 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -18,6 +18,8 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 static inline void pmd_populate_kernel(struct mm_struct *mm,
pmd_t *pmd, pte_t *pte)
 {
@@ -82,33 +84,6 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_page(
-   GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO);
-}
-
-static inline struct page *pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO);
-   if (likely(pte != NULL))
-   pgtable_page_ctor(pte);
-   return pte;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 #define __pte_free_tlb(tlb, pte, buf)   \
 do {\
pgtable_page_dtor(pte); \
-- 
2.7.4



[PATCH 14/15] um: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
um allocates PTE pages with __get_free_page() and uses
GFP_KERNEL | __GFP_ZERO for the allocations.

Switch it to the generic version that does exactly the same thing for the
kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The pte_free() and pte_free_kernel() versions are identical to the generic
ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/um/include/asm/pgalloc.h | 16 ++--
 arch/um/kernel/mem.c  | 22 --
 2 files changed, 2 insertions(+), 36 deletions(-)

diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h
index 99eb568..d7b282e 100644
--- a/arch/um/include/asm/pgalloc.h
+++ b/arch/um/include/asm/pgalloc.h
@@ -10,6 +10,8 @@
 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 #define pmd_populate_kernel(mm, pmd, pte) \
set_pmd(pmd, __pmd(_PAGE_TABLE + (unsigned long) __pa(pte)))
 
@@ -25,20 +27,6 @@
 extern pgd_t *pgd_alloc(struct mm_struct *);
 extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 
-extern pte_t *pte_alloc_one_kernel(struct mm_struct *);
-extern pgtable_t pte_alloc_one(struct mm_struct *);
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long) pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 #define __pte_free_tlb(tlb,pte, address)   \
 do {   \
pgtable_page_dtor(pte); \
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 99aa11b..2280374 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -215,28 +215,6 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
free_page((unsigned long) pgd);
 }
 
-pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   pte_t *pte;
-
-   pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-   return pte;
-}
-
-pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_page(GFP_KERNEL|__GFP_ZERO);
-   if (!pte)
-   return NULL;
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
-}
-
 #ifdef CONFIG_3_LEVEL_PGTABLES
 pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
 {
-- 
2.7.4



[PATCH 01/15] asm-generic, x86: introduce generic pte_{alloc, free}_one[_kernel]

2019-05-02 Thread Mike Rapoport
Most architectures have identical or very similar implementation of
pte_alloc_one_kernel(), pte_alloc_one(), pte_free_kernel() and pte_free().

Add a generic implementation that can be reused across architectures and
enable its use on x86.

The generic implementation uses

GFP_KERNEL | __GFP_ZERO

for the kernel page tables and

GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT

for the user page tables.

The "base" functions for PTE allocation, namely __pte_alloc_one_kernel()
and __pte_alloc_one() are intended for the architectures that require
additional actions after actual memory allocation or must use non-default
GFP flags.

x86 is switched to use generic pte_alloc_one_kernel(), pte_free_kernel() and
pte_free().

x86 still implements pte_alloc_one() to allow run-time control of GFP flags
required for "userpte" command line option.

Signed-off-by: Mike Rapoport 
---
 arch/x86/include/asm/pgalloc.h |  19 ++--
 arch/x86/mm/pgtable.c  |  33 -
 include/asm-generic/pgalloc.h  | 107 +++--
 3 files changed, 115 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index a281e61..29aa785 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -6,6 +6,9 @@
 #include   /* for struct page */
 #include 
 
+#define __HAVE_ARCH_PTE_ALLOC_ONE
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 static inline int  __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
 
 #ifdef CONFIG_PARAVIRT_XXL
@@ -47,24 +50,8 @@ extern gfp_t __userpte_alloc_gfp;
 extern pgd_t *pgd_alloc(struct mm_struct *);
 extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 
-extern pte_t *pte_alloc_one_kernel(struct mm_struct *);
 extern pgtable_t pte_alloc_one(struct mm_struct *);
 
-/* Should really implement gc for free page table pages. This could be
-   done with a reference count in struct page. */
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   BUG_ON((unsigned long)pte & (PAGE_SIZE-1));
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, struct page *pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 extern void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte);
 
 static inline void __pte_free_tlb(struct mmu_gather *tlb, struct page *pte,
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7bd0170..aaca89b 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -13,33 +13,17 @@ phys_addr_t physical_mask __ro_after_init = (1ULL << __PHYSICAL_MASK_SHIFT) - 1;
 EXPORT_SYMBOL(physical_mask);
 #endif
 
-#define PGALLOC_GFP (GFP_KERNEL_ACCOUNT | __GFP_ZERO)
-
 #ifdef CONFIG_HIGHPTE
-#define PGALLOC_USER_GFP __GFP_HIGHMEM
+#define PGTABLE_HIGHMEM __GFP_HIGHMEM
 #else
-#define PGALLOC_USER_GFP 0
+#define PGTABLE_HIGHMEM 0
 #endif
 
-gfp_t __userpte_alloc_gfp = PGALLOC_GFP | PGALLOC_USER_GFP;
-
-pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_page(PGALLOC_GFP & ~__GFP_ACCOUNT);
-}
+gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
 pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
-   struct page *pte;
-
-   pte = alloc_pages(__userpte_alloc_gfp, 0);
-   if (!pte)
-   return NULL;
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
+   return __pte_alloc_one(mm, __userpte_alloc_gfp);
 }
 
 static int __init setup_userpte(char *arg)
@@ -235,7 +219,7 @@ static int preallocate_pmds(struct mm_struct *mm, pmd_t *pmds[], int count)
 {
int i;
bool failed = false;
-   gfp_t gfp = PGALLOC_GFP;
+   gfp_t gfp = GFP_PGTABLE_USER;
 
	if (mm == &init_mm)
gfp &= ~__GFP_ACCOUNT;
@@ -401,14 +385,14 @@ static inline pgd_t *_pgd_alloc(void)
 * We allocate one page for pgd.
 */
if (!SHARED_KERNEL_PMD)
-   return (pgd_t *)__get_free_pages(PGALLOC_GFP,
+   return (pgd_t *)__get_free_pages(GFP_PGTABLE_USER,
 PGD_ALLOCATION_ORDER);
 
/*
 * Now PAE kernel is not running as a Xen domain. We can allocate
 * a 32-byte slab for pgd to save memory space.
 */
-   return kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+   return kmem_cache_alloc(pgd_cache, GFP_PGTABLE_USER);
 }
 
 static inline void _pgd_free(pgd_t *pgd)
@@ -422,7 +406,8 @@ static inline void _pgd_free(pgd_t *pgd)
 
 static inline pgd_t *_pgd_alloc(void)
 {
-   return (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);
+   return (pgd_t *)__get_free_pages(GFP_PGTABLE_USER,
+PGD_ALLOCATION_ORDER);
 }
 
 static inline void _pgd_free(pgd_t *pgd)
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 
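
Based on the description above, the generic helpers introduced by this patch
have roughly the following shape (a sketch consistent with the callers shown
in the other patches of this series, not the verbatim hunk):

#define GFP_PGTABLE_KERNEL	(GFP_KERNEL | __GFP_ZERO)
#define GFP_PGTABLE_USER	(GFP_PGTABLE_KERNEL | __GFP_ACCOUNT)

/* "Base" kernel PTE allocation; architectures needing extra work wrap it. */
static inline pte_t *__pte_alloc_one_kernel(struct mm_struct *mm)
{
	return (pte_t *)__get_free_page(GFP_PGTABLE_KERNEL);
}

/* "Base" user PTE allocation, taking GFP flags so architectures can
 * override the defaults (e.g. x86's "userpte" command line option). */
static inline pgtable_t __pte_alloc_one(struct mm_struct *mm, gfp_t gfp)
{
	struct page *pte;

	pte = alloc_page(gfp);
	if (!pte)
		return NULL;
	if (!pgtable_page_ctor(pte)) {
		__free_page(pte);
		return NULL;
	}
	return pte;
}

#ifndef __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
{
	return __pte_alloc_one_kernel(mm);
}
#endif

#ifndef __HAVE_ARCH_PTE_ALLOC_ONE
static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
{
	return __pte_alloc_one(mm, GFP_PGTABLE_USER);
}
#endif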

[PATCH 12/15] powerpc/nohash/64: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The 64-bit book-E powerpc implements pte_alloc_one(),
pte_alloc_one_kernel(), pte_free_kernel() and pte_free() the same way as
the generic version.

Switch it to the generic version that does exactly the same thing.

Signed-off-by: Mike Rapoport 
---
 arch/powerpc/include/asm/nohash/64/pgalloc.h | 35 ++--
 1 file changed, 2 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index 66d086f..bfb53a0 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -11,6 +11,8 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 struct vmemmap_backing {
struct vmemmap_backing *list;
unsigned long phys;
@@ -92,39 +94,6 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
 }
 
-
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *page;
-   pte_t *pte;
-
-   pte = (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT);
-   if (!pte)
-   return NULL;
-   page = virt_to_page(pte);
-   if (!pgtable_page_ctor(page)) {
-   __free_page(page);
-   return NULL;
-   }
-   return page;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-{
-   pgtable_page_dtor(ptepage);
-   __free_page(ptepage);
-}
-
 static inline void pgtable_free(void *table, int shift)
 {
if (!shift) {
-- 
2.7.4



[PATCH 11/15] parisc: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
parisc allocates PTE pages with __get_free_page() and uses
GFP_KERNEL | __GFP_ZERO for the allocations.

Switch it to the generic version that does exactly the same thing for the
kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The pte_free_kernel() and pte_free() versions are identical to the
generic ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/parisc/include/asm/pgalloc.h | 33 ++---
 1 file changed, 2 insertions(+), 31 deletions(-)

diff --git a/arch/parisc/include/asm/pgalloc.h b/arch/parisc/include/asm/pgalloc.h
index d05c678c..265ec42 100644
--- a/arch/parisc/include/asm/pgalloc.h
+++ b/arch/parisc/include/asm/pgalloc.h
@@ -10,6 +10,8 @@
 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 /* Allocate the top level pgd (page directory)
  *
  * Here (for 64 bit kernels) we implement a Hybrid L2/L3 scheme: we
@@ -121,37 +123,6 @@ pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, 
pte_t *pte)
pmd_populate_kernel(mm, pmd, page_address(pte_page))
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
-static inline pgtable_t
-pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *page = alloc_page(GFP_KERNEL|__GFP_ZERO);
-   if (!page)
-   return NULL;
-   if (!pgtable_page_ctor(page)) {
-   __free_page(page);
-   return NULL;
-   }
-   return page;
-}
-
-static inline pte_t *
-pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   pte_t *pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-   return pte;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, struct page *pte)
-{
-   pgtable_page_dtor(pte);
-   pte_free_kernel(mm, page_address(pte));
-}
-
 #define check_pgt_cache()  do { } while (0)
 
 #endif
-- 
2.7.4



[PATCH 02/15] alpha: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
alpha allocates PTE pages with __get_free_page() and uses
GFP_KERNEL | __GFP_ZERO for the allocations.

Switch it to the generic version that does exactly the same thing for the
kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The alpha pte_free() and pte_free_kernel() versions are identical to the
generic ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/alpha/include/asm/pgalloc.h | 40 +++-
 1 file changed, 3 insertions(+), 37 deletions(-)

diff --git a/arch/alpha/include/asm/pgalloc.h b/arch/alpha/include/asm/pgalloc.h
index 02f9f91..71ded3b 100644
--- a/arch/alpha/include/asm/pgalloc.h
+++ b/arch/alpha/include/asm/pgalloc.h
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 /*  
  * Allocate and free page tables. The xxx_kernel() versions are
  * used to allocate a kernel page table - this turns on ASN bits
@@ -41,7 +43,7 @@ pgd_free(struct mm_struct *mm, pgd_t *pgd)
 static inline pmd_t *
 pmd_alloc_one(struct mm_struct *mm, unsigned long address)
 {
-   pmd_t *ret = (pmd_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+   pmd_t *ret = (pmd_t *)__get_free_page(GFP_PGTABLE_USER);
return ret;
 }
 
@@ -51,42 +53,6 @@ pmd_free(struct mm_struct *mm, pmd_t *pmd)
free_page((unsigned long)pmd);
 }
 
-static inline pte_t *
-pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   pte_t *pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-   return pte;
-}
-
-static inline void
-pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
-static inline pgtable_t
-pte_alloc_one(struct mm_struct *mm)
-{
-   pte_t *pte = pte_alloc_one_kernel(mm);
-   struct page *page;
-
-   if (!pte)
-   return NULL;
-   page = virt_to_page(pte);
-   if (!pgtable_page_ctor(page)) {
-   __free_page(page);
-   return NULL;
-   }
-   return page;
-}
-
-static inline void
-pte_free(struct mm_struct *mm, pgtable_t page)
-{
-   pgtable_page_dtor(page);
-   __free_page(page);
-}
-
 #define check_pgt_cache()  do { } while (0)
 
 #endif /* _ALPHA_PGALLOC_H */
-- 
2.7.4



[PATCH 10/15] nios2: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
nios2 allocates kernel PTE pages with

__get_free_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER);

and user page tables with

pte = alloc_pages(GFP_KERNEL, PTE_ORDER);
if (pte)
clear_highpage(pte);

The PTE_ORDER is hardwired to zero, which makes the nios2 implementation
almost identical to the generic one.

Switch nios2 to the generic version that does exactly the same thing for
the kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The pte_free_kernel() and pte_free() versions on nios2 are identical to the
generic ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/nios2/include/asm/pgalloc.h | 37 ++---
 1 file changed, 2 insertions(+), 35 deletions(-)

diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h
index 3a149ea..4bc8cf7 100644
--- a/arch/nios2/include/asm/pgalloc.h
+++ b/arch/nios2/include/asm/pgalloc.h
@@ -12,6 +12,8 @@
 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
pte_t *pte)
 {
@@ -37,41 +39,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
free_pages((unsigned long)pgd, PGD_ORDER);
 }
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   pte_t *pte;
-
-   pte = (pte_t *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, PTE_ORDER);
-
-   return pte;
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_pages(GFP_KERNEL, PTE_ORDER);
-   if (pte) {
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   clear_highpage(pte);
-   }
-   return pte;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_pages((unsigned long)pte, PTE_ORDER);
-}
-
-static inline void pte_free(struct mm_struct *mm, struct page *pte)
-{
-   pgtable_page_dtor(pte);
-   __free_pages(pte, PTE_ORDER);
-}
-
 #define __pte_free_tlb(tlb, pte, addr) \
do {\
pgtable_page_dtor(pte); \
-- 
2.7.4



[PATCH 09/15] nds32: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The nds32 implementation of pte_alloc_one_kernel() differs from the generic
one in its use of the __GFP_RETRY_MAYFAIL flag, which is removed by the
conversion.

The nds32 version of pte_alloc_one() missed the call to pgtable_page_ctor()
and also used __GFP_RETRY_MAYFAIL. Switching it to the generic
__pte_alloc_one() for the PTE page allocation ensures that the page table
constructor is run and that the user page tables are allocated with
__GFP_ACCOUNT.

The conversion to the generic version of pte_free_kernel() removes the NULL
check for pte.

The pte_free() version on nds32 is identical to the generic one and can be
simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/nds32/include/asm/pgalloc.h | 31 ---
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/arch/nds32/include/asm/pgalloc.h b/arch/nds32/include/asm/pgalloc.h
index 3c5fee5..954696c 100644
--- a/arch/nds32/include/asm/pgalloc.h
+++ b/arch/nds32/include/asm/pgalloc.h
@@ -9,6 +9,9 @@
 #include 
 #include 
 
+#define __HAVE_ARCH_PTE_ALLOC_ONE
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 /*
  * Since we have only two-level page tables, these are trivial
  */
@@ -22,22 +25,11 @@ extern void pgd_free(struct mm_struct *mm, pgd_t * pgd);
 
 #define check_pgt_cache()  do { } while (0)
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   pte_t *pte;
-
-   pte =
-   (pte_t *) __get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL |
- __GFP_ZERO);
-
-   return pte;
-}
-
 static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
pgtable_t pte;
 
-   pte = alloc_pages(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO, 0);
+   pte = __pte_alloc_one(mm, GFP_PGTABLE_USER);
if (pte)
cpu_dcache_wb_page((unsigned long)page_address(pte));
 
@@ -45,21 +37,6 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
 }
 
 /*
- * Free one PTE table.
- */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t * pte)
-{
-   if (pte) {
-   free_page((unsigned long)pte);
-   }
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   __free_page(pte);
-}
-
-/*
  * Populate the pmdp entry with a pointer to the pte.  This pmd is part
  * of the mm address space.
  *
-- 
2.7.4



[PATCH 08/15] mips: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
MIPS allocates kernel PTE pages with

__get_free_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER)

and user PTE pages with

alloc_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER)

The PTE_ORDER is hardwired to zero, which makes the MIPS implementation
almost identical to the generic one.

Switch MIPS to the generic version that does exactly the same thing for the
kernel page tables and adds __GFP_ACCOUNT for the user PTEs.

The pte_free_kernel() and pte_free() versions on mips are identical to the
generic ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/mips/include/asm/pgalloc.h | 33 ++---
 1 file changed, 2 insertions(+), 31 deletions(-)

diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index 27808d9..aa16b85 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -13,6 +13,8 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
pte_t *pte)
 {
@@ -50,37 +52,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
free_pages((unsigned long)pgd, PGD_ORDER);
 }
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, PTE_ORDER);
-}
-
-static inline struct page *pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_pages(GFP_KERNEL, PTE_ORDER);
-   if (!pte)
-   return NULL;
-   clear_highpage(pte);
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_pages((unsigned long)pte, PTE_ORDER);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_pages(pte, PTE_ORDER);
-}
-
 #define __pte_free_tlb(tlb,pte,address)\
 do {   \
pgtable_page_dtor(pte); \
-- 
2.7.4



[PATCH 07/15] m68k: sun3: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The sun3 MMU variant of m68k uses GFP_KERNEL to allocate a PTE page and
then memset(0) or clear_highpage() to clear it.

This is equivalent to allocating the page with GFP_KERNEL | __GFP_ZERO,
which allows replacing the sun3 implementations of pte_alloc_one() and
pte_alloc_one_kernel() with the generic ones.

The pte_free() and pte_free_kernel() versions are identical to the generic
ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/m68k/include/asm/sun3_pgalloc.h | 41 ++--
 1 file changed, 2 insertions(+), 39 deletions(-)

diff --git a/arch/m68k/include/asm/sun3_pgalloc.h 
b/arch/m68k/include/asm/sun3_pgalloc.h
index 1456c5e..1a8ddbd 100644
--- a/arch/m68k/include/asm/sun3_pgalloc.h
+++ b/arch/m68k/include/asm/sun3_pgalloc.h
@@ -13,55 +13,18 @@
 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 extern const char bad_pmd_string[];
 
 #define pmd_alloc_one(mm,address)   ({ BUG(); ((pmd_t *)2); })
 
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-free_page((unsigned long) pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t page)
-{
-   pgtable_page_dtor(page);
-__free_page(page);
-}
-
 #define __pte_free_tlb(tlb,pte,addr)   \
 do {   \
pgtable_page_dtor(pte); \
tlb_remove_page((tlb), pte);\
 } while (0)
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   unsigned long page = __get_free_page(GFP_KERNEL);
-
-   if (!page)
-   return NULL;
-
-   memset((void *)page, 0, PAGE_SIZE);
-   return (pte_t *) (page);
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
-{
-struct page *page = alloc_pages(GFP_KERNEL, 0);
-
-   if (page == NULL)
-   return NULL;
-
-   clear_highpage(page);
-   if (!pgtable_page_ctor(page)) {
-   __free_page(page);
-   return NULL;
-   }
-   return page;
-
-}
-
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t 
*pte)
 {
pmd_val(*pmd) = __pa((unsigned long)pte);
-- 
2.7.4



[PATCH 06/15] hexagon: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The hexagon implementations of pte_alloc_one(), pte_alloc_one_kernel(),
pte_free_kernel() and pte_free() are identical to the generic ones except
for the lack of __GFP_ACCOUNT for the user PTE allocations.

Switch hexagon to use generic version of these functions.

Signed-off-by: Mike Rapoport 
---
 arch/hexagon/include/asm/pgalloc.h | 34 ++
 1 file changed, 2 insertions(+), 32 deletions(-)

diff --git a/arch/hexagon/include/asm/pgalloc.h 
b/arch/hexagon/include/asm/pgalloc.h
index d361838..7661a26 100644
--- a/arch/hexagon/include/asm/pgalloc.h
+++ b/arch/hexagon/include/asm/pgalloc.h
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 #define check_pgt_cache() do {} while (0)
 
 extern unsigned long long kmap_generation;
@@ -59,38 +61,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
free_page((unsigned long) pgd);
 }
 
-static inline struct page *pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_page(GFP_KERNEL | __GFP_ZERO);
-   if (!pte)
-   return NULL;
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
-}
-
-/* _kernel variant gets to use a different allocator */
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   gfp_t flags =  GFP_KERNEL | __GFP_ZERO;
-   return (pte_t *) __get_free_page(flags);
-}
-
-static inline void pte_free(struct mm_struct *mm, struct page *pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_page((unsigned long)pte);
-}
-
 static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
pgtable_t pte)
 {
-- 
2.7.4



[PATCH 05/15] csky: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The csky implementations of pte_alloc_one(), pte_free_kernel() and pte_free()
are identical to the generic ones except for the lack of __GFP_ACCOUNT for
the user PTE allocation.

Switch csky to use generic version of these functions.

The csky implementation of pte_alloc_one_kernel() is not replaced because
it does not clear the allocated page but rather sets each PTE in it to a
non-zero value.
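
(For reference, the retained csky pte_alloc_one_kernel() looks roughly like
this -- every entry is pre-set to _PAGE_GLOBAL instead of zeroed, which is
why the zeroing generic helper does not fit. A sketch, not the verbatim csky
code:)

static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
{
	pte_t *pte;
	unsigned long i;

	pte = (pte_t *) __get_free_page(GFP_KERNEL);
	if (!pte)
		return NULL;

	/* mark every PTE global rather than clearing the page */
	for (i = 0; i < PAGE_SIZE / sizeof(pte_t); i++)
		(pte + i)->pte_low = _PAGE_GLOBAL;

	return pte;
}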

The pte_free_kernel() and pte_free() versions on csky are identical to the
generic ones and can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/csky/include/asm/pgalloc.h | 30 +++---
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
index d213bb4..98c571670 100644
--- a/arch/csky/include/asm/pgalloc.h
+++ b/arch/csky/include/asm/pgalloc.h
@@ -8,6 +8,9 @@
 #include 
 #include 
 
+#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
pte_t *pte)
 {
@@ -39,33 +42,6 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct 
*mm)
return pte;
 }
 
-static inline struct page *pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_pages(GFP_KERNEL | __GFP_ZERO, 0);
-   if (!pte)
-   return NULL;
-
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-
-   return pte;
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   free_pages((unsigned long)pte, PTE_ORDER);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_pages(pte, PTE_ORDER);
-}
-
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
free_pages((unsigned long)pgd, PGD_ORDER);
-- 
2.7.4



[PATCH 04/15] arm64: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
The PTE allocations in arm64 are identical to the generic ones modulo the
GFP flags.

Using the generic pte_alloc_one() functions ensures that the user page
tables are allocated with __GFP_ACCOUNT set.

The arm64 definition of PGALLOC_GFP is removed and replaced with
GFP_PGTABLE_USER for p[gum]d_alloc_one() and for KVM memory cache.

The mappings created with create_pgd_mapping() are now using
GFP_PGTABLE_KERNEL.

The conversion to the generic version of pte_free_kernel() removes the NULL
check for pte.

The pte_free() version on arm64 is identical to the generic one and
can be simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/arm64/include/asm/pgalloc.h | 43 
 arch/arm64/mm/mmu.c  |  2 +-
 arch/arm64/mm/pgd.c  |  4 ++--
 virt/kvm/arm/mmu.c   |  2 +-
 4 files changed, 8 insertions(+), 43 deletions(-)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 52fa47c..3293b8b 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -24,16 +24,17 @@
 #include 
 #include 
 
+#include <asm-generic/pgalloc.h>	/* for pte_{alloc,free}_one */
+
 #define check_pgt_cache()  do { } while (0)
 
-#define PGALLOC_GFP(GFP_KERNEL | __GFP_ZERO)
 #define PGD_SIZE   (PTRS_PER_PGD * sizeof(pgd_t))
 
 #if CONFIG_PGTABLE_LEVELS > 2
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-   return (pmd_t *)__get_free_page(PGALLOC_GFP);
+   return (pmd_t *)__get_free_page(GFP_PGTABLE_USER);
 }
 
 static inline void pmd_free(struct mm_struct *mm, pmd_t *pmdp)
@@ -62,7 +63,7 @@ static inline void __pud_populate(pud_t *pudp, phys_addr_t 
pmdp, pudval_t prot)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-   return (pud_t *)__get_free_page(PGALLOC_GFP);
+   return (pud_t *)__get_free_page(GFP_PGTABLE_USER);
 }
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pudp)
@@ -90,42 +91,6 @@ static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t 
pudp, pgdval_t prot)
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
 extern void pgd_free(struct mm_struct *mm, pgd_t *pgdp);
 
-static inline pte_t *
-pte_alloc_one_kernel(struct mm_struct *mm)
-{
-   return (pte_t *)__get_free_page(PGALLOC_GFP);
-}
-
-static inline pgtable_t
-pte_alloc_one(struct mm_struct *mm)
-{
-   struct page *pte;
-
-   pte = alloc_pages(PGALLOC_GFP, 0);
-   if (!pte)
-   return NULL;
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
-   return pte;
-}
-
-/*
- * Free a PTE table.
- */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *ptep)
-{
-   if (ptep)
-   free_page((unsigned long)ptep);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t ptep,
  pmdval_t prot)
 {
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e97f018..d5178c5 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -373,7 +373,7 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t 
phys,
 
 static phys_addr_t pgd_pgtable_alloc(void)
 {
-   void *ptr = (void *)__get_free_page(PGALLOC_GFP);
+   void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL);
if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
BUG();
 
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 289f911..2ef1a53 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -31,9 +31,9 @@ static struct kmem_cache *pgd_cache __ro_after_init;
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
if (PGD_SIZE == PAGE_SIZE)
-   return (pgd_t *)__get_free_page(PGALLOC_GFP);
+   return (pgd_t *)__get_free_page(GFP_PGTABLE_USER);
else
-   return kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+   return kmem_cache_alloc(pgd_cache, GFP_PGTABLE_USER);
 }
 
 void pgd_free(struct mm_struct *mm, pgd_t *pgd)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..9f6f638 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -141,7 +141,7 @@ static int mmu_topup_memory_cache(struct 
kvm_mmu_memory_cache *cache,
if (cache->nobjs >= min)
return 0;
while (cache->nobjs < max) {
-   page = (void *)__get_free_page(PGALLOC_GFP);
+   page = (void *)__get_free_page(GFP_PGTABLE_USER);
if (!page)
return -ENOMEM;
cache->objects[cache->nobjs++] = page;
-- 
2.7.4



[PATCH 03/15] arm: switch to generic version of pte allocation

2019-05-02 Thread Mike Rapoport
Replace __get_free_page() and alloc_pages() calls with the generic
__pte_alloc_one_kernel() and __pte_alloc_one().

There is no functional change for the kernel PTE allocation.

The differences for the user PTEs are that clean_pte_table() is now called
after pgtable_page_ctor() and that __GFP_ACCOUNT is added to the GFP flags.

The conversion to the generic version of pte_free_kernel() removes the NULL
check for pte.

The pte_free() version on arm is identical to the generic one and can be
simply dropped.

Signed-off-by: Mike Rapoport 
---
 arch/arm/include/asm/pgalloc.h | 41 +
 arch/arm/mm/mmu.c  |  2 +-
 2 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index 17ab72f..13c5a9d 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -57,8 +57,6 @@ static inline void pud_populate(struct mm_struct *mm, pud_t 
*pud, pmd_t *pmd)
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
 extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 
-#define PGALLOC_GFP(GFP_KERNEL | __GFP_ZERO)
-
 static inline void clean_pte_table(pte_t *pte)
 {
clean_dcache_area(pte + PTE_HWTABLE_PTRS, PTE_HWTABLE_SIZE);
@@ -80,54 +78,41 @@ static inline void clean_pte_table(pte_t *pte)
  *  |  h/w pt 1  |
  *  ++
  */
+
+#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
+#define __HAVE_ARCH_PTE_ALLOC_ONE
+#include <asm-generic/pgalloc.h>
+
 static inline pte_t *
 pte_alloc_one_kernel(struct mm_struct *mm)
 {
-   pte_t *pte;
+   pte_t *pte = __pte_alloc_one_kernel(mm);
 
-   pte = (pte_t *)__get_free_page(PGALLOC_GFP);
if (pte)
clean_pte_table(pte);
 
return pte;
 }
 
+#ifdef CONFIG_HIGHPTE
+#define PGTABLE_HIGHMEM __GFP_HIGHMEM
+#else
+#define PGTABLE_HIGHMEM 0
+#endif
+
 static inline pgtable_t
 pte_alloc_one(struct mm_struct *mm)
 {
struct page *pte;
 
-#ifdef CONFIG_HIGHPTE
-   pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
-#else
-   pte = alloc_pages(PGALLOC_GFP, 0);
-#endif
+   pte = __pte_alloc_one(mm, GFP_PGTABLE_USER | PGTABLE_HIGHMEM);
if (!pte)
return NULL;
if (!PageHighMem(pte))
clean_pte_table(page_address(pte));
-   if (!pgtable_page_ctor(pte)) {
-   __free_page(pte);
-   return NULL;
-   }
return pte;
 }
 
-/*
- * Free one PTE table.
- */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-   if (pte)
-   free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
-{
-   pgtable_page_dtor(pte);
-   __free_page(pte);
-}
-
 static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
  pmdval_t prot)
 {
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index f3ce341..e8e0382 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -732,7 +732,7 @@ static void __init *early_alloc(unsigned long sz)
 
 static void *__init late_alloc(unsigned long sz)
 {
-   void *ptr = (void *)__get_free_pages(PGALLOC_GFP, get_order(sz));
+   void *ptr = (void *)__get_free_pages(GFP_PGTABLE_KERNEL, get_order(sz));
 
if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
BUG();
-- 
2.7.4



[PATCH 00/15] introduce generic pte_{alloc,free}_one[_kernel]

2019-05-02 Thread Mike Rapoport
Hi,

I've tried to trim down the recipients list, but it's still quite long, so
sorry for the spam.

Many architectures have similar, if not identical implementation of
pte_alloc_one_kernel(), pte_alloc_one(), pte_free_kernel() and pte_free().

A while ago Anshuman suggested introducing a common definition of
GFP_PGTABLE [1], and during the discussion it was suggested to rather
consolidate the allocators.

These patches introduce generic version of PTE allocation and free and
enable their use on several architectures.

The conversion introduces some changes for some of the architectures.
Here's the executive summary; the details are described in each patch.

* Most architectures do not set __GFP_ACCOUNT for the user page tables.
Switching to the generic functions is "spreading that goodness to all other
architectures".
* arm, arm64 and unicore32 used to check if the pte is not NULL before
freeing its memory in pte_free_kernel(). It's dropped during the
conversion as it seems superfluous.
* x86 used to BUG_ON() if the pte was not page aligned during
pte_free_kernel(); the generic version simply frees the memory without any
checks.

This set only performs the straightforward conversion, the architectures
with different logic in pte_alloc_one() and pte_alloc_one_kernel() are not
touched, as well as architectures that have custom page table allocators.

[1] 
https://lore.kernel.org/lkml/1547619692-7946-1-git-send-email-anshuman.khand...@arm.com
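
For architectures that do need a twist on top of the generic allocation
(the nds32, csky and arm patches in this series), the pattern is to define
the __HAVE_ARCH_* guard and wrap the __ helper. A minimal sketch, where
arch_fixup_pte_page() is a made-up stand-in for things like arm's
clean_pte_table() or nds32's dcache writeback:

#define __HAVE_ARCH_PTE_ALLOC_ONE
#include <asm-generic/pgalloc.h>	/* for __pte_alloc_one */

static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
{
	pgtable_t pte = __pte_alloc_one(mm, GFP_PGTABLE_USER);

	if (pte)
		arch_fixup_pte_page(page_address(pte));	/* hypothetical hook */

	return pte;
}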


Mike Rapoport (15):
  asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
  alpha: switch to generic version of pte allocation
  arm: switch to generic version of pte allocation
  arm64: switch to generic version of pte allocation
  csky: switch to generic version of pte allocation
  hexagon: switch to generic version of pte allocation
  m68k: sun3: switch to generic version of pte allocation
  mips: switch to generic version of pte allocation
  nds32: switch to generic version of pte allocation
  nios2: switch to generic version of pte allocation
  parisc: switch to generic version of pte allocation
  powerpc/nohash/64: switch to generic version of pte allocation
  riscv: switch to generic version of pte allocation
  um: switch to generic version of pte allocation
  unicore32: switch to generic version of pte allocation

 arch/alpha/include/asm/pgalloc.h |  40 +-
 arch/arm/include/asm/pgalloc.h   |  41 --
 arch/arm/mm/mmu.c|   2 +-
 arch/arm64/include/asm/pgalloc.h |  43 +--
 arch/arm64/mm/mmu.c  |   2 +-
 arch/arm64/mm/pgd.c  |   4 +-
 arch/csky/include/asm/pgalloc.h  |  30 +---
 arch/hexagon/include/asm/pgalloc.h   |  34 +
 arch/m68k/include/asm/sun3_pgalloc.h |  41 +-
 arch/mips/include/asm/pgalloc.h  |  33 +
 arch/nds32/include/asm/pgalloc.h |  31 +---
 arch/nios2/include/asm/pgalloc.h |  37 +
 arch/parisc/include/asm/pgalloc.h|  33 +
 arch/powerpc/include/asm/nohash/64/pgalloc.h |  35 +
 arch/riscv/include/asm/pgalloc.h |  29 +---
 arch/um/include/asm/pgalloc.h|  16 +---
 arch/um/kernel/mem.c |  22 --
 arch/unicore32/include/asm/pgalloc.h |  36 ++---
 arch/x86/include/asm/pgalloc.h   |  19 +
 arch/x86/mm/pgtable.c|  33 +++--
 include/asm-generic/pgalloc.h| 107 ++-
 virt/kvm/arm/mmu.c   |   2 +-
 22 files changed, 171 insertions(+), 499 deletions(-)

-- 
2.7.4



Re: [RESEND PATCH v3 05/11] mtd: rawnand: vf610_nfc: add initializer to avoid -Wmaybe-uninitialized

2019-05-02 Thread Miquel Raynal
Hi Masahiro,

Masahiro Yamada  wrote on Tue, 23 Apr
2019 12:49:53 +0900:

> This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
> place. We need to eliminate potential issues beforehand.
> 
> Kbuild test robot has never reported -Wmaybe-uninitialized warning
> for this probably because vf610_nfc_run() is inlined by the x86
> compiler's inlining heuristic.
> 
> If CONFIG_OPTIMIZE_INLINING is enabled for a different architecture
> and vf610_nfc_run() is not inlined, the following warning is reported:
> 
> drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
> drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
>vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
>^~~
> nfc->regs + NFC_MAIN_AREA(0) + offset,
> ~~
> trfr_sz, !nfc->data_access);
> ~~~

IMHO this patch has no dependencies with this series.
Would you mind sending it alone with the proper Fixes tag?

> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
> Changes in v3: None
> Changes in v2:
>   - split into a separate patch
> 
>  drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/raw/vf610_nfc.c 
> b/drivers/mtd/nand/raw/vf610_nfc.c
> index a662ca1970e5..19792d725ec2 100644
> --- a/drivers/mtd/nand/raw/vf610_nfc.c
> +++ b/drivers/mtd/nand/raw/vf610_nfc.c
> @@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
>  {
>   const struct nand_op_instr *instr;
>   struct vf610_nfc *nfc = chip_to_nfc(chip);
> - int op_id = -1, trfr_sz = 0, offset;
> + int op_id = -1, trfr_sz = 0, offset = 0;
>   u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
>   bool force8bit = false;
>  

Thanks,
Miquèl


Re: Linux 5.1-rc5

2019-05-02 Thread Martin Schwidefsky
On Thu, 2 May 2019 16:31:10 +0200
Greg KH  wrote:

> On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote:
> > On Thu, 2 May 2019 14:21:28 +0200
> > Greg KH  wrote:
> >   
> > > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:  
> > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  
> > > > wrote:
> > > > >
> > > > > Can we please have the page refcount overflow fixes out on the list
> > > > > for review, even if it is after the fact?
> > > > 
> > > > They were actually on a list for review long before the fact, but it
> > > > was the security mailing list. The issue actually got discussed back
> > > > in January along with early versions of the patches, but then we
> > > > dropped the ball because it just wasn't on anybody's radar and it got
> > > > resurrected late March. Willy wrote a rather bigger patch-series, and
> > > > review of that is what then resulted in those commits. So they may
> > > > look recent, but that's just because the original patches got
> > > > seriously edited down and rewritten.
> > > > 
> > > > That said, powerpc and s390 should at least look at maybe adding a
> > > > check for the page ref in their gup paths too. Powerpc has the special
> > > > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > > > was actually hoping the s390 guys would look at using the generic gup
> > > > code.
> > > > 
> > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> > > > largely irrelevant, partly since even theoretically this whole issue
> > > > needs a _lot_ of memory.
> > > > 
> > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> > > > (page ref overflow)"). You may or may not really care.
> > > 
> > > I've now queued these patches up for the next round of stable releases,
> > > as some people seem to care about these.
> > > 
> > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for
> > > these changes, am I just missing them and should also queue up a few
> > > more to handle this issue on those platforms?  
> > 
> > I fixed that with a different approach. The following two patches are
> > queued for the next merge window:
> > 
> > d1874a0c2805 "s390/mm: make the pxd_offset functions more robust"
> > 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code"
> > 
> > With these two s390 now uses the generic gup code in mm/gup.c  
> 
> Nice!  Do you want me to queue those up for the stable backports once
> they hit a public -rc release?

Yes please!

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH][next] KVM: PPC: Book3S HV: XIVE: fix spelling mistake "acessing" -> "accessing"

2019-05-02 Thread Mukesh Ojha



On 5/2/2019 3:53 PM, Colin King wrote:

From: Colin Ian King 

There is a spelling mistake in a pr_err message, fix it.

Signed-off-by: Colin Ian King 

Reviewed-by: Mukesh Ojha 

Cheers,
-Mukesh



---
  arch/powerpc/kvm/book3s_xive_native.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_xive_native.c 
b/arch/powerpc/kvm/book3s_xive_native.c
index 5e14df1a4403..6a8e698c4b6e 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -235,7 +235,7 @@ static vm_fault_t xive_native_esb_fault(struct vm_fault 
*vmf)
	arch_spin_unlock(&sb->lock);
  
  	if (WARN_ON(!page)) {

-   pr_err("%s: acessing invalid ESB page for source %lx !\n",
+   pr_err("%s: accessing invalid ESB page for source %lx !\n",
   __func__, irq);
return VM_FAULT_SIGBUS;
}


Re: Linux 5.1-rc5

2019-05-02 Thread Greg KH
On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote:
> On Thu, 2 May 2019 14:21:28 +0200
> Greg KH  wrote:
> 
> > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  
> > > wrote:  
> > > >
> > > > Can we please have the page refcount overflow fixes out on the list
> > > > for review, even if it is after the fact?  
> > > 
> > > They were actually on a list for review long before the fact, but it
> > > was the security mailing list. The issue actually got discussed back
> > > in January along with early versions of the patches, but then we
> > > dropped the ball because it just wasn't on anybody's radar and it got
> > > resurrected late March. Willy wrote a rather bigger patch-series, and
> > > review of that is what then resulted in those commits. So they may
> > > look recent, but that's just because the original patches got
> > > seriously edited down and rewritten.
> > > 
> > > That said, powerpc and s390 should at least look at maybe adding a
> > > check for the page ref in their gup paths too. Powerpc has the special
> > > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > > was actually hoping the s390 guys would look at using the generic gup
> > > code.
> > > 
> > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> > > largely irrelevant, partly since even theoretically this whole issue
> > > needs a _lot_ of memory.
> > > 
> > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> > > (page ref overflow)"). You may or may not really care.  
> > 
> > I've now queued these patches up for the next round of stable releases,
> > as some people seem to care about these.
> > 
> > I didn't see any follow-on patches for s390 or ppc64 hit the tree for
> > these changes, am I just missing them and should also queue up a few
> > more to handle this issue on those platforms?
> 
> I fixed that with a different approach. The following two patches are
> queued for the next merge window:
> 
> d1874a0c2805 "s390/mm: make the pxd_offset functions more robust"
> 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code"
> 
> With these two s390 now uses the generic gup code in mm/gup.c

Nice!  Do you want me to queue those up for the stable backports once
they hit a public -rc release?

thanks,

greg k-h


[PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-02 Thread Michael Ellerman
The mpc85xx EDAC code can be configured as a module but then fails to
build because it uses two unexported symbols:

  ERROR: ".pci_find_hose_for_OF_device" [drivers/edac/mpc85xx_edac_mod.ko] 
undefined!
  ERROR: ".early_find_capability" [drivers/edac/mpc85xx_edac_mod.ko] undefined!

We don't want to export those symbols just for this driver, so make
the driver only configurable as a built-in.

This seems to have been broken since at least commit c92132f59806
("edac/85xx: Add PCIe error interrupt edac support") (Nov 2013).

Signed-off-by: Michael Ellerman 
---
 drivers/edac/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 47eb4d13ed5f..6317519f9d88 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -263,7 +263,7 @@ config EDAC_PND2
  micro-server but may appear on others in the future.
 
 config EDAC_MPC85XX
-   tristate "Freescale MPC83xx / MPC85xx"
+   bool "Freescale MPC83xx / MPC85xx"
depends on FSL_SOC
help
  Support for error detection and correction on the Freescale
-- 
2.20.1



Re: Linux 5.1-rc5

2019-05-02 Thread Martin Schwidefsky
On Thu, 2 May 2019 14:21:28 +0200
Greg KH  wrote:

> On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  
> > wrote:  
> > >
> > > Can we please have the page refcount overflow fixes out on the list
> > > for review, even if it is after the fact?  
> > 
> > They were actually on a list for review long before the fact, but it
> > was the security mailing list. The issue actually got discussed back
> > in January along with early versions of the patches, but then we
> > dropped the ball because it just wasn't on anybody's radar and it got
> > resurrected late March. Willy wrote a rather bigger patch-series, and
> > review of that is what then resulted in those commits. So they may
> > look recent, but that's just because the original patches got
> > seriously edited down and rewritten.
> > 
> > That said, powerpc and s390 should at least look at maybe adding a
> > check for the page ref in their gup paths too. Powerpc has the special
> > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > was actually hoping the s390 guys would look at using the generic gup
> > code.
> > 
> > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> > largely irrelevant, partly since even theoretically this whole issue
> > needs a _lot_ of memory.
> > 
> > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> > (page ref overflow)"). You may or may not really care.  
> 
> I've now queued these patches up for the next round of stable releases,
> as some people seem to care about these.
> 
> I didn't see any follow-on patches for s390 or ppc64 hit the tree for
> these changes, am I just missing them and should also queue up a few
> more to handle this issue on those platforms?

I fixed that with a different approach. The following two patches are
queued for the next merge window:

d1874a0c2805 "s390/mm: make the pxd_offset functions more robust"
1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code"

With these two s390 now uses the generic gup code in mm/gup.c

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH v2 2/2] powerpc/mm: Warn if W+X pages found on boot

2019-05-02 Thread Michael Ellerman
Russell Currey  writes:

> Implement code to walk all pages and warn if any are found to be both
> writable and executable.  Depends on STRICT_KERNEL_RWX enabled, and is
> behind the DEBUG_WX config option.
>
> This only runs on boot and has no runtime performance implications.
>
> Very heavily influenced (and in some cases copied verbatim) from the
> ARM64 code written by Laura Abbott (thanks!), since our ptdump
> infrastructure is similar.
>
> Signed-off-by: Russell Currey 
> ---
> v2: A myriad of fixes and cleanups thanks to Christophe Leroy
>
>  arch/powerpc/Kconfig.debug | 19 ++
>  arch/powerpc/include/asm/pgtable.h |  6 +
>  arch/powerpc/mm/pgtable_32.c   |  3 +++
>  arch/powerpc/mm/pgtable_64.c   |  3 +++
>  arch/powerpc/mm/ptdump/ptdump.c| 41 +-
>  5 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
> index 4e00cb0a5464..9e8bcddd8b8f 100644
> --- a/arch/powerpc/Kconfig.debug
> +++ b/arch/powerpc/Kconfig.debug
> @@ -361,6 +361,25 @@ config PPC_PTDUMP
>  
> If you are unsure, say N.
>  
> +config PPC_DEBUG_WX
> + bool "Warn on W+X mappings at boot"
> + select PPC_PTDUMP

That should be depends not select, I'll fix it up.

cheers


Re: remove asm-generic/ptrace.h

2019-05-02 Thread Oleg Nesterov
On 05/01, Christoph Hellwig wrote:
>
> Hi all,
>
> asm-generic/ptrace.h is a little weird in that it doesn't actually
> implement any functionality, but it provided multiple layers of macros
> that just implement trivial inline functions.  We implement those
> directly in the few architectures and be off with a much simpler
> design.

Oh, thanks, I was always confused by these macros ;)

Oleg.



Re: [PATCH 1/5] arm64: don't use asm-generic/ptrace.h

2019-05-02 Thread Catalin Marinas
On Wed, May 01, 2019 at 01:39:39PM -0400, Christoph Hellwig wrote:
> Doing the indirection through macros for the regs accessors just
> makes them harder to read, so implement the helpers directly.
> 
> Note that only the helpers actually used are implemented now.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: Catalin Marinas 


Re: [PATCH kernel] prom_init: Fetch flatten device tree from the system firmware

2019-05-02 Thread David Gibson
On Wed, May 01, 2019 at 01:42:21PM +1000, Alexey Kardashevskiy wrote:
> At the moment, on 256CPU + 256 PCI devices guest, it takes the guest
> about 8.5sec to fetch the entire device tree via the client interface
> as the DT is traversed twice - for strings blob and for struct blob.
> Also, "getprop" is quite slow too as SLOF stores properties in a linked
> list.
> 
> However, since [1] SLOF builds flattened device tree (FDT) for another
> purpose. [2] adds a new "fdt-fetch" client interface for the OS to fetch
> the FDT.
> 
> This tries the new method; if not supported, this falls back to
> the old method.
> 
> There is a change in the FDT layout - the old method produced
> (reserved map, strings, structs), the new one receives only strings and
> structs from the firmware and adds the final reserved map to the end,
> so it is (fw reserved map, strings, structs, reserved map).
> This still produces the same unflattened device tree.
> 
> This merges the reserved map from the firmware into the kernel's reserved
> map. At the moment SLOF generates an empty reserved map so this does not
> change the existing behaviour in regard of reservations.
> 
> This supports only v17 onward as only that version provides dt_struct_size
> which works as "fdt-fetch" only produces v17 blobs.
> 
> If "fdt-fetch" is not available, the old method of fetching the DT is used.
> 
> [1] https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=e6fc84652c9c00
> [2] https://git.qemu.org/?p=SLOF.git;a=commit;h=ecda95906930b80
> 
> Signed-off-by: Alexey Kardashevskiy 

Hrm.  I've gotta say I'm not terribly convinced that it's worth adding
a new interface we'll need to maintain to save 8s on a somewhat
contrived testcase.

> ---
>  arch/powerpc/kernel/prom_init.c | 43 +
>  1 file changed, 43 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index f33ff4163a51..72e7a602b68e 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -2457,6 +2457,48 @@ static void __init flatten_device_tree(void)
>   prom_panic("Can't allocate initial device-tree chunk\n");
>   mem_end = mem_start + room;
>  
> + hdr = (void *) mem_start;
> + if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start,
> + room - sizeof(mem_reserve_map)) &&
> + hdr->version >= 17) {
> + u32 size;
> + struct mem_map_entry *fwrmap;
> +
> + /* Fixup the boot cpuid */
> + hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
> +
> + /*
> +  * Store the struct and strings addresses, mostly
> +  * for consistency, only dt_header_start actually matters later.
> +  */
> + dt_header_start = mem_start;
> + dt_string_start = mem_start + be32_to_cpu(hdr->off_dt_strings);
> + dt_string_end = dt_string_start +
> + be32_to_cpu(hdr->dt_strings_size);
> + dt_struct_start = mem_start + be32_to_cpu(hdr->off_dt_struct);
> + dt_struct_end = dt_struct_start +
> + be32_to_cpu(hdr->dt_struct_size);
> +
> + /*
> +  * Calculate the reserved map location (which we put
> +  * at the blob end) and update total size.
> +  */
> + fwrmap = (void *)(mem_start + be32_to_cpu(hdr->off_mem_rsvmap));
> + hdr->off_mem_rsvmap = hdr->totalsize;
> + size = be32_to_cpu(hdr->totalsize);
> + hdr->totalsize = cpu_to_be32(size + sizeof(mem_reserve_map));
> +
> + /* Merge reserved map from firmware to ours */
> + for ( ; fwrmap->size; ++fwrmap)
> + reserve_mem(be64_to_cpu(fwrmap->base),
> + be64_to_cpu(fwrmap->size));
> +
> + rsvmap = (u64 *)(mem_start + size);
> +
> + prom_debug("Fetched DTB: %d bytes to @%lx\n", size, mem_start);
> + goto finalize_exit;
> + }
> +
>   /* Get root of tree */
>   root = call_prom("peer", 1, 1, (phandle)0);
>   if (root == (phandle)0)
> @@ -2504,6 +2546,7 @@ static void __init flatten_device_tree(void)
>   /* Version 16 is not backward compatible */
>   hdr->last_comp_version = cpu_to_be32(0x10);
>  
> +finalize_exit:
>   /* Copy the reserve map in */
>   memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Joakim Tjernlund
On Thu, 2019-05-02 at 12:58 +, Laurentiu Tudor wrote:
> 
> > -Original Message-
> > From: Joakim Tjernlund 
> > Sent: Thursday, May 2, 2019 1:37 PM
> > 
> > On Thu, 2019-05-02 at 09:05 +, Laurentiu Tudor wrote:
> > > Hi Joakim,
> > > 
> > > > -Original Message-
> > > > From: Joakim Tjernlund 
> > > > Sent: Saturday, April 27, 2019 8:11 PM
> > > > 
> > > > On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > > > > From: Laurentiu Tudor 
> > > > > 
> > > > > Fix issue with the entry indexing in the sg frame cleanup code being
> > > > > off-by-1. This problem showed up when doing some basic iperf tests
> > and
> > > > > manifested in traffic coming to a halt.
> > > > > 
> > > > > Signed-off-by: Laurentiu Tudor 
> > > > > Acked-by: Madalin Bucur 
> > > > 
> > > > Wasn't this a stable candidate too?
> > > 
> > > Yes, it is. I forgot to add the cc:stable tag, sorry about that.
> > 
> > Then this is a bug fix that should go directly to linus/stable.
> > 
> > I note that
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y
> 
> Not sure I understand ... I don't see the patch in the link.

Sorry, I copied the wrong link:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y=0aafea5d4b22fe9403e89d82e02597e4493d5d0f

> 
> > is in 4.19 but not in 4.14 , is it not appropriate for 4.14?
> 
> I think it makes sense to go in both stable trees.
> 
> ---
> Best Regards, Laurentiu
> 
> > > > > ---
> > > > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > index daede7272768..40420edc9ce6 100644
> > > > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > @@ -1663,7 +1663,7 @@ static struct sk_buff
> > *dpaa_cleanup_tx_fd(const
> > > > struct dpaa_priv *priv,
> > > > >  qm_sg_entry_get_len(&sgt[0]),
> > dma_dir);
> > > > > /* remaining pages were mapped with
> > skb_frag_dma_map()
> > > > */
> > > > > -   for (i = 1; i < nr_frags; i++) {
> > > > > +   for (i = 1; i <= nr_frags; i++) {
> > > > > WARN_ON(qm_sg_entry_is_ext(&sgt[i]));
> > > > > 
> > > > > dma_unmap_page(dev, qm_sg_addr(&sgt[i]),
> > > > > --
> > > > > 2.17.1
> > > > > 



Re: [PATCH 1/2] x86, numa: always initialize all possible nodes

2019-05-02 Thread Michal Hocko
On Wed 01-05-19 15:12:32, Barret Rhoden wrote:
[...]
> A more elegant solution may be to avoid registering with sysfs during early
> boot, or something else entirely.  But I figured I'd ask for help at this
> point.  =)

Thanks for the report and an excellent analysis! This is really helpful.
I will think about this some more but I am traveling this week. It seems
really awkward to register a sysfs file for an empty range. That looks
like a bug to me.

-- 
Michal Hocko
SUSE Labs


RE: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Laurentiu Tudor


> -Original Message-
> From: Joakim Tjernlund 
> Sent: Thursday, May 2, 2019 1:37 PM
> 
> On Thu, 2019-05-02 at 09:05 +, Laurentiu Tudor wrote:
> > Hi Joakim,
> >
> > > -Original Message-
> > > From: Joakim Tjernlund 
> > > Sent: Saturday, April 27, 2019 8:11 PM
> > >
> > > On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > > > From: Laurentiu Tudor 
> > > >
> > > > Fix issue with the entry indexing in the sg frame cleanup code being
> > > > off-by-1. This problem showed up when doing some basic iperf tests
> and
> > > > manifested in traffic coming to a halt.
> > > >
> > > > Signed-off-by: Laurentiu Tudor 
> > > > Acked-by: Madalin Bucur 
> > >
> > > Wasn't this a stable candidate too?
> >
> > Yes, it is. I forgot to add the cc:stable tag, sorry about that.
> 
> Then this is a bug fix that should go directly to linus/stable.
> 
> I note that
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y

Not sure I understand ... I don't see the patch in the link.

> is in 4.19 but not in 4.14 , is it not appropriate for 4.14?

I think it makes sense to go in both stable trees.
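
For context, sgt[0] covers the linear part of the skb and the nr_frags
fragments sit in entries 1..nr_frags, so the unmap loop has to be
inclusive. Roughly (assuming the usual dpaa SG layout):

	/* entry 0, the skb linear part, was already unmapped above */
	for (i = 1; i <= nr_frags; i++)		/* was: i < nr_frags */
		dma_unmap_page(dev, qm_sg_addr(&sgt[i]),
			       qm_sg_entry_get_len(&sgt[i]), dma_dir);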

---
Best Regards, Laurentiu

> >
> > > > ---
> > > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > index daede7272768..40420edc9ce6 100644
> > > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > @@ -1663,7 +1663,7 @@ static struct sk_buff
> *dpaa_cleanup_tx_fd(const
> > > struct dpaa_priv *priv,
> > > >  qm_sg_entry_get_len(&sgt[0]),
> dma_dir);
> > > >
> > > > /* remaining pages were mapped with
> skb_frag_dma_map()
> > > */
> > > > -   for (i = 1; i < nr_frags; i++) {
> > > > +   for (i = 1; i <= nr_frags; i++) {
> > > > WARN_ON(qm_sg_entry_is_ext(&sgt[i]));
> > > >
> > > > dma_unmap_page(dev, qm_sg_addr(&sgt[i]),
> > > > --
> > > > 2.17.1
> > > >


Re: Linux 5.1-rc5

2019-05-02 Thread Greg KH
On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  wrote:
> >
> > Can we please have the page refcount overflow fixes out on the list
> > for review, even if it is after the fact?
> 
> They were actually on a list for review long before the fact, but it
> was the security mailing list. The issue actually got discussed back
> in January along with early versions of the patches, but then we
> dropped the ball because it just wasn't on anybody's radar and it got
> resurrected late March. Willy wrote a rather bigger patch-series, and
> review of that is what then resulted in those commits. So they may
> look recent, but that's just because the original patches got
> seriously edited down and rewritten.
> 
> That said, powerpc and s390 should at least look at maybe adding a
> check for the page ref in their gup paths too. Powerpc has the special
> gup_hugepte() case, and s390 has its own version of gup entirely. I
> was actually hoping the s390 guys would look at using the generic gup
> code.
> 
> I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> largely irrelevant, partly since even theoretically this whole issue
> needs a _lot_ of memory.
> 
> Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> (page ref overflow)"). You may or may not really care.

I've now queued these patches up for the next round of stable releases,
as some people seem to care about these.

I didn't see any follow-on patches for s390 or ppc64 hit the tree for
these changes, am I just missing them and should also queue up a few
more to handle this issue on those platforms?

thanks,

greg k-h


Re: [PATCH v2 01/17] powerpc/mm: Don't BUG() in hugepd_page()

2019-05-02 Thread Christophe Leroy




On 02/05/2019 at 14:02, Michael Ellerman wrote:

Christophe Leroy  writes:

Use VM_BUG_ON() instead of BUG_ON(), as those BUG_ON()
are not there to catch runtime errors but to catch errors
during development cycle only.


I've dropped this one and the next, because I don't like VM_BUG_ON().

Why not? Because it's contradictory. It's a condition that's so
important that we should BUG, but only if the kernel has been built
specially for debugging.

I don't really buy the development cycle distinction, it's not like we
have a rigorous test suite that we run and then we declare everything's
gold and ship a product. We often don't find bugs until they're hit in
the wild.

For example the recent corruption Joel discovered with STRICT_KERNEL_RWX
could have been caught by a BUG_ON() to check we weren't patching kernel
text in radix__change_memory_range(), but he wouldn't have been using
CONFIG_DEBUG_VM. (See 8adddf349fda)

I know Aneesh disagrees with me on this, so maybe you two can convince
me otherwise.



I have no strong opinion about this. In v1, I replaced them with a
WARN_ON(), and Aneesh suggested going with VM_BUG_ON() instead.


My main purpose was to reduce the number of BUG()/BUG_ON() calls and I
thought those were good candidates, but if you prefer keeping the BUG(),
that's ok for me. Or maybe you preferred the v1 alternatives (series at
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=98170)?


Christophe


Re: [PATCH v2 01/17] powerpc/mm: Don't BUG() in hugepd_page()

2019-05-02 Thread Michael Ellerman
Christophe Leroy  writes:
> Use VM_BUG_ON() instead of BUG_ON(), as those BUG_ON()
> are not there to catch runtime errors but to catch errors
> during development cycle only.

I've dropped this one and the next, because I don't like VM_BUG_ON().

Why not? Because it's contradictory. It's a condition that's so
important that we should BUG, but only if the kernel has been built
specially for debugging.
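
For reference, VM_BUG_ON() in include/linux/mmdebug.h boils down to roughly
this -- the check disappears unless CONFIG_DEBUG_VM is set:

#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond)		BUG_ON(cond)
#else
#define VM_BUG_ON(cond)		BUILD_BUG_ON_INVALID(cond)
#endif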

I don't really buy the development cycle distinction, it's not like we
have a rigorous test suite that we run and then we declare everything's
gold and ship a product. We often don't find bugs until they're hit in
the wild.

For example the recent corruption Joel discovered with STRICT_KERNEL_RWX
could have been caught by a BUG_ON() to check we weren't patching kernel
text in radix__change_memory_range(), but he wouldn't have been using
CONFIG_DEBUG_VM. (See 8adddf349fda)

I know Aneesh disagrees with me on this, so maybe you two can convince
me otherwise.

cheers

> diff --git a/arch/powerpc/include/asm/hugetlb.h 
> b/arch/powerpc/include/asm/hugetlb.h
> index 8d40565ad0c3..7f1867e428c0 100644
> --- a/arch/powerpc/include/asm/hugetlb.h
> +++ b/arch/powerpc/include/asm/hugetlb.h
> @@ -14,7 +14,7 @@
>   */
>  static inline pte_t *hugepd_page(hugepd_t hpd)
>  {
> - BUG_ON(!hugepd_ok(hpd));
> + VM_BUG_ON(!hugepd_ok(hpd));
>   /*
>* We have only four bits to encode, MMU page size
>*/
> @@ -42,7 +42,7 @@ static inline void flush_hugetlb_page(struct vm_area_struct 
> *vma,
>  
>  static inline pte_t *hugepd_page(hugepd_t hpd)
>  {
> - BUG_ON(!hugepd_ok(hpd));
> + VM_BUG_ON(!hugepd_ok(hpd));
>  #ifdef CONFIG_PPC_8xx
>   return (pte_t *)__va(hpd_val(hpd) & ~HUGEPD_SHIFT_MASK);
>  #else
> -- 
> 2.13.3


Re: [PATCH v1 3/4] powerpc/mm: Move book3s32 specifics in subdirectory mm/book3s64

2019-05-02 Thread Christophe Leroy




On 02/05/2019 at 13:32, Michael Ellerman wrote:

Christophe Leroy  writes:


Several files in arch/powerpc/mm are only for book3S32. This patch
creates a subdirectory for them.

Signed-off-by: Christophe Leroy 
---
  arch/powerpc/mm/Makefile| 3 +--
  arch/powerpc/mm/book3s32/Makefile   | 6 ++
  arch/powerpc/mm/{ => book3s32}/hash_low_32.S| 0
  arch/powerpc/mm/{ => book3s32}/mmu_context_hash32.c | 0
  arch/powerpc/mm/{ => book3s32}/ppc_mmu_32.c | 0
  arch/powerpc/mm/{ => book3s32}/tlb_hash32.c | 0
  6 files changed, 7 insertions(+), 2 deletions(-)
  create mode 100644 arch/powerpc/mm/book3s32/Makefile
  rename arch/powerpc/mm/{ => book3s32}/hash_low_32.S (100%)
  rename arch/powerpc/mm/{ => book3s32}/mmu_context_hash32.c (100%)
  rename arch/powerpc/mm/{ => book3s32}/ppc_mmu_32.c (100%)
  rename arch/powerpc/mm/{ => book3s32}/tlb_hash32.c (100%)


I shortened them to:

   arch/powerpc/mm/{hash_low_32.S => book3s32/hash_low.S}
   arch/powerpc/mm/{ppc_mmu_32.c => book3s32/mmu.c}


To be consistent with what you did in nohash/ dir, shouldn't we rename 
the above 'ppc.c' or 'ppc_32.c' instead of 'mmu.c' ?


Christophe


   arch/powerpc/mm/{mmu_context_hash32.c => book3s32/mmu_context.c}
   arch/powerpc/mm/{tlb_hash32.c => book3s32/tlb.c}

cheers



Re: [PATCH v1 4/4] powerpc/mm: Move nohash specifics in subdirectory mm/nohash

2019-05-02 Thread Michael Ellerman
Christophe Leroy  writes:

> Many files in arch/powerpc/mm are only for nohash. This patch
> creates a subdirectory for them.
>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/Makefile  | 17 +
>  arch/powerpc/mm/{ => nohash}/40x_mmu.c|  0
>  arch/powerpc/mm/{ => nohash}/44x_mmu.c|  0
>  arch/powerpc/mm/{ => nohash}/8xx_mmu.c|  0
>  arch/powerpc/mm/nohash/Makefile   | 21 +
>  arch/powerpc/mm/{ => nohash}/fsl_booke_mmu.c  |  0
>  arch/powerpc/mm/{ => nohash}/hugetlbpage-book3e.c |  0
>  arch/powerpc/mm/{ => nohash}/mmu_context_nohash.c |  0
>  arch/powerpc/mm/{ => nohash}/pgtable-book3e.c |  0
>  arch/powerpc/mm/{ => nohash}/tlb_low_64e.S|  0
>  arch/powerpc/mm/{ => nohash}/tlb_nohash.c |  0
>  arch/powerpc/mm/{ => nohash}/tlb_nohash_low.S |  0
>  12 files changed, 22 insertions(+), 16 deletions(-)
>  rename arch/powerpc/mm/{ => nohash}/40x_mmu.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/44x_mmu.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/8xx_mmu.c (100%)
>  create mode 100644 arch/powerpc/mm/nohash/Makefile
>  rename arch/powerpc/mm/{ => nohash}/fsl_booke_mmu.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/hugetlbpage-book3e.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/mmu_context_nohash.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/pgtable-book3e.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/tlb_low_64e.S (100%)
>  rename arch/powerpc/mm/{ => nohash}/tlb_nohash.c (100%)
>  rename arch/powerpc/mm/{ => nohash}/tlb_nohash_low.S (100%)

I went with:

  arch/powerpc/mm/{40x_mmu.c => nohash/40x.c}
  arch/powerpc/mm/{44x_mmu.c => nohash/44x.c}
  arch/powerpc/mm/{8xx_mmu.c => nohash/8xx.c}
  arch/powerpc/mm/{hugetlbpage-book3e.c => nohash/book3e_hugetlbpage.c}
  arch/powerpc/mm/{pgtable-book3e.c => nohash/book3e_pgtable.c}
  arch/powerpc/mm/{fsl_booke_mmu.c => nohash/fsl_booke.c}
  arch/powerpc/mm/{mmu_context_nohash.c => nohash/mmu_context.c}
  arch/powerpc/mm/{tlb_nohash.c => nohash/tlb.c}
  arch/powerpc/mm/{tlb_nohash_low.S => nohash/tlb_low.S}
  arch/powerpc/mm/{ => nohash}/tlb_low_64e.S

cheers


Re: [PATCH v1 3/4] powerpc/mm: Move book3s32 specifics in subdirectory mm/book3s32

2019-05-02 Thread Michael Ellerman
Christophe Leroy  writes:

> Several files in arch/powerpc/mm are only for book3S32. This patch
> creates a subdirectory for them.
>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/Makefile| 3 +--
>  arch/powerpc/mm/book3s32/Makefile   | 6 ++
>  arch/powerpc/mm/{ => book3s32}/hash_low_32.S| 0
>  arch/powerpc/mm/{ => book3s32}/mmu_context_hash32.c | 0
>  arch/powerpc/mm/{ => book3s32}/ppc_mmu_32.c | 0
>  arch/powerpc/mm/{ => book3s32}/tlb_hash32.c | 0
>  6 files changed, 7 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/mm/book3s32/Makefile
>  rename arch/powerpc/mm/{ => book3s32}/hash_low_32.S (100%)
>  rename arch/powerpc/mm/{ => book3s32}/mmu_context_hash32.c (100%)
>  rename arch/powerpc/mm/{ => book3s32}/ppc_mmu_32.c (100%)
>  rename arch/powerpc/mm/{ => book3s32}/tlb_hash32.c (100%)

I shortened them to:

  arch/powerpc/mm/{hash_low_32.S => book3s32/hash_low.S}
  arch/powerpc/mm/{ppc_mmu_32.c => book3s32/mmu.c}
  arch/powerpc/mm/{mmu_context_hash32.c => book3s32/mmu_context.c}
  arch/powerpc/mm/{tlb_hash32.c => book3s32/tlb.c}

cheers


Re: [PATCH v1 2/4] powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64

2019-05-02 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 02/05/2019 à 09:11, Michael Ellerman a écrit :
>> Christophe Leroy  writes:
>> 
>>> Many files in arch/powerpc/mm are only for book3S64. This patch
>>> creates a subdirectory for them.
>>>
>>> Signed-off-by: Christophe Leroy 
>>> ---
>>>   arch/powerpc/mm/Makefile   | 25 +++
>>>   arch/powerpc/mm/book3s64/Makefile  | 28 ++
>>>   arch/powerpc/mm/{ => book3s64}/hash64_4k.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/hash64_64k.c|  0
>>>   arch/powerpc/mm/{ => book3s64}/hash_native_64.c|  0
>>>   arch/powerpc/mm/{ => book3s64}/hash_utils_64.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c   |  0
>>>   .../powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c |  0
>>>   .../mm/{ => book3s64}/mmu_context_book3s64.c   |  0
>>>   arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c  |  0
>>>   arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c|  0
>>>   arch/powerpc/mm/{ => book3s64}/pgtable-radix.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/pkeys.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/slb.c   |  0
>>>   arch/powerpc/mm/{ => book3s64}/subpage-prot.c  |  0
>>>   arch/powerpc/mm/{ => book3s64}/tlb-radix.c |  0
>>>   arch/powerpc/mm/{ => book3s64}/tlb_hash64.c|  0
>>>   arch/powerpc/mm/{ => book3s64}/vphn.c  |  0
>>>   arch/powerpc/mm/{ => book3s64}/vphn.h  |  0
>>>   arch/powerpc/mm/numa.c |  2 +-
>>>   22 files changed, 32 insertions(+), 23 deletions(-)
>>>   create mode 100644 arch/powerpc/mm/book3s64/Makefile
>>>   rename arch/powerpc/mm/{ => book3s64}/hash64_4k.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hash64_64k.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hash_native_64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hash_utils_64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/mmu_context_book3s64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/pgtable-radix.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/pkeys.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/slb.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/subpage-prot.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/tlb-radix.c (100%)
>>>   rename arch/powerpc/mm/{ => book3s64}/tlb_hash64.c (100%)
>> 
>> Do you mind if I take this but rework the destination names in the process?
>
> I don't mind, I think it's a good idea.
>
>> 
>> I don't like having eg. book3s64/pgtable-book3s64.c
>> 
>> And some of the other names could use a bit of cleanup too.
>> 
>> What about:
>> 
>>   arch/powerpc/mm/{hash64_4k.c => book3s64/hash_4k.c}
>>   arch/powerpc/mm/{hash64_64k.c => book3s64/hash_64k.c}
>>   arch/powerpc/mm/{hugepage-hash64.c => book3s64/hash_hugepage.c}
>>   arch/powerpc/mm/{hugetlbpage-hash64.c => book3s64/hash_hugetlbpage.c}
>>   arch/powerpc/mm/{hash_native_64.c => book3s64/hash_native.c}
>>   arch/powerpc/mm/{pgtable-hash64.c => book3s64/hash_pgtable.c}
>>   arch/powerpc/mm/{tlb_hash64.c => book3s64/hash_tlb.c}
>>   arch/powerpc/mm/{hash_utils_64.c => book3s64/hash_utils.c}
>>   arch/powerpc/mm/{mmu_context_iommu.c => book3s64/iommu_api.c}
>>   arch/powerpc/mm/{mmu_context_book3s64.c => book3s64/mmu_context.c}
>>   arch/powerpc/mm/{pgtable-book3s64.c => book3s64/pgtable.c}
>>   arch/powerpc/mm/{hugetlbpage-radix.c => book3s64/radix_hugetlbpage.c}
>>   arch/powerpc/mm/{pgtable-radix.c => book3s64/radix_pgtable.c}
>>   arch/powerpc/mm/{tlb-radix.c => book3s64/radix_tlb.c}
>
> Looks good

Thanks. I'll do something similar for 32-bit & nohash.

cheers


Re: [PATCH] crypto: caam/jr - Remove extra memory barrier during job ring dequeue

2019-05-02 Thread Horia Geanta
On 5/1/2019 8:49 AM, Michael Ellerman wrote:
> Vakul Garg wrote:
>> In function caam_jr_dequeue(), a full memory barrier is used before
>> writing response job ring's register to signal removal of the completed
>> job. Therefore for writing the register, we do not need another write
>> memory barrier. Hence it is removed by replacing the call to wr_reg32()
>> with a newly defined function wr_reg32_relaxed().
>>
>> Signed-off-by: Vakul Garg 
>> ---
>>  drivers/crypto/caam/jr.c   | 2 +-
>>  drivers/crypto/caam/regs.h | 8 
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
>> index 4e9b3fca5627..2ce6d7d2ad72 100644
>> --- a/drivers/crypto/caam/jr.c
>> +++ b/drivers/crypto/caam/jr.c
>> @@ -266,7 +266,7 @@ static void caam_jr_dequeue(unsigned long devarg)
>>  mb();
>>  
>>  /* set done */
>> -wr_reg32(&jrp->rregs->outring_rmvd, 1);
>> +wr_reg32_relaxed(&jrp->rregs->outring_rmvd, 1);
>>  
>>  jrp->out_ring_read_index = (jrp->out_ring_read_index + 1) &
>> (JOBR_DEPTH - 1);
>> diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
>> index 3cd0822ea819..9e912c722e33 100644
>> --- a/drivers/crypto/caam/regs.h
>> +++ b/drivers/crypto/caam/regs.h
>> @@ -96,6 +96,14 @@ cpu_to_caam(16)
>>  cpu_to_caam(32)
>>  cpu_to_caam(64)
>>  
>> +static inline void wr_reg32_relaxed(void __iomem *reg, u32 data)
>> +{
>> +	if (caam_little_end)
>> +		writel_relaxed(data, reg);
>> +	else
>> +		writel_relaxed(cpu_to_be32(data), reg);
>> +}
When both the core (PPC) and the crypto engine (caam) are big endian, data
ends up being swapped once - which is incorrect:
writel_relaxed -> writel -> __do_writel -> out_le32 -> swap
cpu_to_be32(data) -> data (no swap)

>> +
>>  static inline void wr_reg32(void __iomem *reg, u32 data)
>>  {
>>  if (caam_little_end)
> 
> This crashes on my p5020ds. Did you test on powerpc?
> 
> # first bad commit: [bbfcac5ff5f26aafa51935a62eb86b6eacfe8a49] crypto: 
> caam/jr - Remove extra memory barrier during job ring dequeue

Thanks for the report Michael.

Any hint what would be the proper approach here - to have relaxed I/O accessors
that would work both for ARM and PPC, and avoid ifdeffery etc.?

For non-relaxed version, we used iowriteXX and iowriteXXbe - which work fine on
ARM and PPC, covering all the endianness combinations (core + crypto engine):

static inline void wr_reg32(void __iomem *reg, u32 data)
{
	if (caam_little_end)
		iowrite32(data, reg);
	else
		iowrite32be(data, reg);
}
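
One possible direction - a sketch only, not tested on either architecture -
is to do the byte swap explicitly and use __raw_writel(), which performs a
native-endian store with no implied barrier on both ARM and PPC, avoiding
the double swap described above:

static inline void wr_reg32_relaxed(void __iomem *reg, u32 data)
{
	/* Swap explicitly; __raw_writel() never swaps on its own. */
	if (caam_little_end)
		__raw_writel((__force u32)cpu_to_le32(data), reg);
	else
		__raw_writel((__force u32)cpu_to_be32(data), reg);
}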

Thanks,
Horia


Re: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Joakim Tjernlund
On Thu, 2019-05-02 at 09:05 +, Laurentiu Tudor wrote:
> Hi Joakim,
> 
> > -Original Message-
> > From: Joakim Tjernlund 
> > Sent: Saturday, April 27, 2019 8:11 PM
> > 
> > On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > > From: Laurentiu Tudor 
> > > 
> > > Fix issue with the entry indexing in the sg frame cleanup code being
> > > off-by-1. This problem showed up when doing some basic iperf tests and
> > > manifested in traffic coming to a halt.
> > > 
> > > Signed-off-by: Laurentiu Tudor 
> > > Acked-by: Madalin Bucur 
> > 
> > Wasn't this a stable candidate too?
> 
> Yes, it is. I forgot to add the cc:stable tag, sorry about that.

Then this is a bug fix that should go directly to linus/stable.

I note that 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y
is in 4.19 but not in 4.14; is it not appropriate for 4.14?

 Jocke
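
For readers following along, the layout that makes the "<=" bound
necessary (a sketch inferred from the patch, not taken from the driver
source):

/*
 * dpaa TX scatter-gather table for a frame with nr_frags fragments:
 *
 *   sgt[0]                  linear skb data -> dma_unmap_single()
 *   sgt[1]..sgt[nr_frags]   page fragments  -> dma_unmap_page()
 *
 * nr_frags + 1 entries in total, so the cleanup loop has to run
 * i = 1..nr_frags inclusive.
 */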

> 
> ---
> Best Regards, Laurentiu
> 
> > > ---
> > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > index daede7272768..40420edc9ce6 100644
> > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > @@ -1663,7 +1663,7 @@ static struct sk_buff *dpaa_cleanup_tx_fd(const struct dpaa_priv *priv,
> > >  qm_sg_entry_get_len(&sgt[0]), dma_dir);
> > > 
> > > /* remaining pages were mapped with skb_frag_dma_map() */
> > > -   for (i = 1; i < nr_frags; i++) {
> > > +   for (i = 1; i <= nr_frags; i++) {
> > > WARN_ON(qm_sg_entry_is_ext(&sgt[i]));
> > > 
> > > dma_unmap_page(dev, qm_sg_addr(&sgt[i]),
> > > --
> > > 2.17.1
> > > 


[PATCH][next] KVM: PPC: Book3S HV: XIVE: fix spelling mistake "acessing" -> "accessing"

2019-05-02 Thread Colin King
From: Colin Ian King 

There is a spelling mistake in a pr_err message, fix it.

Signed-off-by: Colin Ian King 
---
 arch/powerpc/kvm/book3s_xive_native.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 5e14df1a4403..6a8e698c4b6e 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -235,7 +235,7 @@ static vm_fault_t xive_native_esb_fault(struct vm_fault *vmf)
	arch_spin_unlock(&sb->lock);
 
if (WARN_ON(!page)) {
-   pr_err("%s: acessing invalid ESB page for source %lx !\n",
+   pr_err("%s: accessing invalid ESB page for source %lx !\n",
   __func__, irq);
return VM_FAULT_SIGBUS;
}
-- 
2.20.1



RE: [EXT] Re: [PATCH V4] ASoC: fsl_esai: Add pm runtime function

2019-05-02 Thread S.j. Wang
Hi Mark

> On Sun, Apr 28, 2019 at 02:24:54AM +, S.j. Wang wrote:
> > Add pm runtime support and move clock handling there.
> > Close the clocks at suspend to reduce the power consumption.
> >
> > fsl_esai_suspend is replaced by pm_runtime_force_suspend.
> > fsl_esai_resume is replaced by pm_runtime_force_resume.
> 
> This doesn't apply against for-5.2 again.  Sorry about this, I think this one 
> is
> due to some messups with my scripts which caused some patches to be
> dropped for a while (and it's likely to be what happened the last time as
> well).  Can you check and resend again please?  Like I say sorry about this, I
> think it's my mistake.

I am checking, but I don't know why this patch fails on your side. I
tried to apply it on for-5.1, for-5.2, for-linus and for-next; all
applied successfully. The git tree is
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git.

I can't reproduce your problem. Am I doing something wrong?
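
One way to rule out whitespace damage, assuming the patch is saved
locally as a file (the filename here is illustrative):

git apply --check --whitespace=error-all 0001-ASoC-fsl_esai-Add-pm-runtime-function.patch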

Best regards
Wang shengjiu



RE: [PATCH v2 7/9] dpaa_eth: fix iova handling for contiguous frames

2019-05-02 Thread Laurentiu Tudor



> -Original Message-
> From: Christoph Hellwig 
> Sent: Saturday, April 27, 2019 7:46 PM
> 
> On Sat, Apr 27, 2019 at 10:10:29AM +0300, laurentiu.tu...@nxp.com wrote:
> > From: Laurentiu Tudor 
> >
> > The driver relies on the no longer valid assumption that dma addresses
> > (iovas) are identical to physical addresses and uses phys_to_virt() to
> > make iova -> vaddr conversions. Fix this by adding a function that does
> > proper iova -> phys conversions using the iommu api and update the code
> > to use it.
> > Also, a dma_unmap_single() call had to be moved further down the code
> > because iova -> vaddr conversions were required before the unmap.
> > For now only the contiguous frame case is handled and the SG case is
> > split in a following patch.
> > While at it, clean-up a redundant dpaa_bpid2pool() and pass the bp
> > as parameter.
> 
> Err, this is broken.  A driver using the DMA API has no business
> call IOMMU APIs.  Just save the _virtual_ address used for the mapping
> away and use that again.  We should not go through crazy gymnastics
> like this.

I think that, due to the particularities of this hardware, we don't have a way
of saving the VA, but I'll let the colleagues maintaining this driver comment
more on why we need to do this.
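
For illustration, the bookkeeping pattern being suggested - the struct and
function names here are hypothetical, not from the driver:

/* Hypothetical per-buffer bookkeeping: remember the VA at map time so
 * no iova -> vaddr conversion is needed on the unmap path. */
struct dpaa_tx_buf {
	void		*vaddr;		/* saved before mapping */
	dma_addr_t	dma;
	size_t		len;
};

static int dpaa_map_buf(struct device *dev, struct dpaa_tx_buf *buf,
			void *vaddr, size_t len)
{
	buf->vaddr = vaddr;
	buf->len = len;
	buf->dma = dma_map_single(dev, vaddr, len, DMA_TO_DEVICE);
	return dma_mapping_error(dev, buf->dma);
}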

---
Best Regards, Laurentiu


RE: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Laurentiu Tudor
Hi Joakim,

> -Original Message-
> From: Joakim Tjernlund 
> Sent: Saturday, April 27, 2019 8:11 PM
> 
> On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > From: Laurentiu Tudor 
> >
> > Fix issue with the entry indexing in the sg frame cleanup code being
> > off-by-1. This problem showed up when doing some basic iperf tests and
> > manifested in traffic coming to a halt.
> >
> > Signed-off-by: Laurentiu Tudor 
> > Acked-by: Madalin Bucur 
> 
> Wasn't this a stable candidate too?

Yes, it is. I forgot to add the cc:stable tag, sorry about that.

---
Best Regards, Laurentiu
 
> > ---
> >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > index daede7272768..40420edc9ce6 100644
> > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > @@ -1663,7 +1663,7 @@ static struct sk_buff *dpaa_cleanup_tx_fd(const struct dpaa_priv *priv,
> >  qm_sg_entry_get_len(&sgt[0]), dma_dir);
> >
> > /* remaining pages were mapped with skb_frag_dma_map() */
> > -   for (i = 1; i < nr_frags; i++) {
> > +   for (i = 1; i <= nr_frags; i++) {
> > WARN_ON(qm_sg_entry_is_ext(&sgt[i]));
> >
> > dma_unmap_page(dev, qm_sg_addr(&sgt[i]),
> > --
> > 2.17.1
> >



[PATCH v2 2/2] powerpc/mm: Warn if W+X pages found on boot

2019-05-02 Thread Russell Currey
Implement code to walk all pages and warn if any are found to be both
writable and executable.  Depends on STRICT_KERNEL_RWX being enabled,
and sits behind the PPC_DEBUG_WX config option.

This only runs on boot and has no runtime performance implications.

Very heavily influenced (and in some cases copied verbatim) from the
ARM64 code written by Laura Abbott (thanks!), since our ptdump
infrastructure is similar.

Signed-off-by: Russell Currey 
---
v2: A myriad of fixes and cleanups thanks to Christophe Leroy

 arch/powerpc/Kconfig.debug | 19 ++
 arch/powerpc/include/asm/pgtable.h |  6 +
 arch/powerpc/mm/pgtable_32.c   |  3 +++
 arch/powerpc/mm/pgtable_64.c   |  3 +++
 arch/powerpc/mm/ptdump/ptdump.c| 41 +-
 5 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 4e00cb0a5464..9e8bcddd8b8f 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -361,6 +361,25 @@ config PPC_PTDUMP
 
  If you are unsure, say N.
 
+config PPC_DEBUG_WX
+   bool "Warn on W+X mappings at boot"
+   select PPC_PTDUMP
+   help
+ Generate a warning if any W+X mappings are found at boot.
+
+ This is useful for discovering cases where the kernel is leaving
+ W+X mappings after applying NX, as such mappings are a security risk.
+
+ Note that even if the check fails, your kernel is possibly
+ still fine: W+X mappings are not a security hole in
+ themselves, but they make the exploitation of other
+ unfixed kernel bugs easier.
+
+ There is no runtime or memory usage effect of this option
+ once the kernel has booted up - it's a one time check.
+
+ If in doubt, say "Y".
+
 config PPC_FAST_ENDIAN_SWITCH
bool "Deprecated fast endian-switch syscall"
 depends on DEBUG_KERNEL && PPC_BOOK3S_64
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 505550fb2935..50c0d06fac2f 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -108,6 +108,12 @@ void mark_initmem_nx(void);
 static inline void mark_initmem_nx(void) { }
 #endif
 
+#ifdef CONFIG_PPC_DEBUG_WX
+void ptdump_check_wx(void);
+#else
+static inline void ptdump_check_wx(void) { }
+#endif
+
 /*
  * When used, PTE_FRAG_NR is defined in subarch pgtable.h
  * so we are sure it is included when arriving here.
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 6e56a6240bfa..6f919779ee06 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -384,6 +384,9 @@ void mark_rodata_ro(void)
   PFN_DOWN((unsigned long)__start_rodata);
 
change_page_attr(page, numpages, PAGE_KERNEL_RO);
+
+   // mark_initmem_nx() should have already run by now
+   ptdump_check_wx();
 }
 #endif
 
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index fb1375c07e8c..bfa18453625e 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -328,6 +328,9 @@ void mark_rodata_ro(void)
radix__mark_rodata_ro();
else
hash__mark_rodata_ro();
+
+   // mark_initmem_nx() should have already run by now
+   ptdump_check_wx();
 }
 
 void mark_initmem_nx(void)
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index a4a132f92810..e69b53a8a841 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -31,7 +31,7 @@
 #include "ptdump.h"
 
 #ifdef CONFIG_PPC32
-#define KERN_VIRT_START	0
+#define KERN_VIRT_START	PAGE_OFFSET
 #endif
 
 /*
@@ -68,6 +68,8 @@ struct pg_state {
unsigned long last_pa;
unsigned int level;
u64 current_flags;
+   bool check_wx;
+   unsigned long wx_pages;
 };
 
 struct addr_marker {
@@ -177,6 +179,20 @@ static void dump_addr(struct pg_state *st, unsigned long addr)
 
 }
 
+static void note_prot_wx(struct pg_state *st, unsigned long addr)
+{
+   if (!st->check_wx)
+   return;
+
+   if (!((st->current_flags & pgprot_val(PAGE_KERNEL_X)) == pgprot_val(PAGE_KERNEL_X)))
+   return;
+
+   WARN_ONCE(1, "powerpc/mm: Found insecure W+X mapping at address %p/%pS\n",
+ (void *)st->start_address, (void *)st->start_address);
+
+   st->wx_pages += (addr - st->start_address) / PAGE_SIZE;
+}
+
 static void note_page(struct pg_state *st, unsigned long addr,
   unsigned int level, u64 val)
 {
@@ -206,6 +222,7 @@ static void note_page(struct pg_state *st, unsigned long addr,
 
/* Check the PTE flags */
if (st->current_flags) {
+   note_prot_wx(st, addr);
dump_addr(st, addr);
 
/* Dump all the flags */
@@ -378,6 +395,28 @@ static void 
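
Restated for clarity, the predicate note_prot_wx() applies above (names as
in the patch): a range is reported only when it carries every bit of
PAGE_KERNEL_X, i.e. it is simultaneously writable and executable.

	bool is_wx = (st->current_flags & pgprot_val(PAGE_KERNEL_X)) ==
		     pgprot_val(PAGE_KERNEL_X);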

[PATCH v2 1/2] powerpc/mm/ptdump: Wrap seq_printf() to handle NULL pointers

2019-05-02 Thread Russell Currey
Lovingly borrowed from the arch/arm64 ptdump code.

This doesn't seem to be an issue in practice, but is necessary for my
upcoming commit.

Signed-off-by: Russell Currey 
---
v2: Fix putc to actually putc thanks to Christophe Leroy

 arch/powerpc/mm/ptdump/ptdump.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 37138428ab55..a4a132f92810 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -104,6 +104,18 @@ static struct addr_marker address_markers[] = {
{ -1,   NULL },
 };
 
+#define pt_dump_seq_printf(m, fmt, args...)\
+({ \
+   if (m)  \
+   seq_printf(m, fmt, ##args); \
+})
+
+#define pt_dump_seq_putc(m, c) \
+({ \
+   if (m)  \
+   seq_putc(m, c); \
+})
+
 static void dump_flag_info(struct pg_state *st, const struct flag_info
*flag, u64 pte, int num)
 {
@@ -121,19 +133,19 @@ static void dump_flag_info(struct pg_state *st, const struct flag_info
val = pte & flag->val;
if (flag->shift)
val = val >> flag->shift;
-   seq_printf(st->seq, "  %s:%llx", flag->set, val);
+   pt_dump_seq_printf(st->seq, "  %s:%llx", flag->set, 
val);
} else {
if ((pte & flag->mask) == flag->val)
s = flag->set;
else
s = flag->clear;
if (s)
-   seq_printf(st->seq, "  %s", s);
+   pt_dump_seq_printf(st->seq, "  %s", s);
}
st->current_flags &= ~flag->mask;
}
if (st->current_flags != 0)
-   seq_printf(st->seq, "  unknown flags:%llx", st->current_flags);
+   pt_dump_seq_printf(st->seq, "  unknown flags:%llx", 
st->current_flags);
 }
 
 static void dump_addr(struct pg_state *st, unsigned long addr)
@@ -148,12 +160,12 @@ static void dump_addr(struct pg_state *st, unsigned long addr)
 #define REG"0x%08lx"
 #endif
 
-   seq_printf(st->seq, REG "-" REG " ", st->start_address, addr - 1);
+   pt_dump_seq_printf(st->seq, REG "-" REG " ", st->start_address, addr - 1);
	if (st->start_pa == st->last_pa && st->start_address + PAGE_SIZE != addr) {
-   seq_printf(st->seq, "[" REG "]", st->start_pa);
+   pt_dump_seq_printf(st->seq, "[" REG "]", st->start_pa);
delta = PAGE_SIZE >> 10;
} else {
-   seq_printf(st->seq, " " REG " ", st->start_pa);
+   pt_dump_seq_printf(st->seq, " " REG " ", st->start_pa);
delta = (addr - st->start_address) >> 10;
}
/* Work out what appropriate unit to use */
@@ -161,7 +173,7 @@ static void dump_addr(struct pg_state *st, unsigned long addr)
delta >>= 10;
unit++;
}
-   seq_printf(st->seq, "%9lu%c", delta, *unit);
+   pt_dump_seq_printf(st->seq, "%9lu%c", delta, *unit);
 
 }
 
@@ -178,7 +190,7 @@ static void note_page(struct pg_state *st, unsigned long addr,
st->start_address = addr;
st->start_pa = pa;
st->last_pa = pa;
-   seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
+   pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
/*
 * Dump the section of virtual memory when:
 *   - the PTE flags from one entry to the next differs.
@@ -202,7 +214,7 @@ static void note_page(struct pg_state *st, unsigned long addr,
  st->current_flags,
  pg_level[st->level].num);
 
-   seq_putc(st->seq, '\n');
+   pt_dump_seq_putc(st->seq, '\n');
}
 
/*
@@ -211,7 +223,7 @@ static void note_page(struct pg_state *st, unsigned long addr,
 */
while (addr >= st->marker[1].start_address) {
st->marker++;
-   seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
+   pt_dump_seq_printf(st->seq, "---[ %s ]---\n", 
st->marker->name);
}
st->start_address = addr;
st->start_pa = pa;
-- 
2.21.0
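
A sketch of how the follow-up W+X checker is expected to use this - the
walker runs with no seq_file at all, so the wrappers must tolerate NULL
(field names as in these two patches):

	struct pg_state st = {
		.seq = NULL,		/* pt_dump_seq_*() become no-ops */
		.marker = address_markers,
		.check_wx = true,	/* from the W+X patch above */
	};

	walk_pagetables(&st);		/* counts W+X pages, prints nothing */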



Re: [PATCH 2/3] powerpc/module32: Use symbolic instructions names.

2019-05-02 Thread Christophe Leroy




Le 29/04/2019 à 13:54, Segher Boessenkool a écrit :

On Mon, Apr 29, 2019 at 10:43:27AM +, Christophe Leroy wrote:

To increase readability/maintainability, replace hard-coded
instruction values with symbolic names.



+   /* lis r12,sym@ha */
+#define ENTRY_JMP0(sym)	(PPC_INST_ADDIS | __PPC_RT(R12) | PPC_HA(sym))
+   /* addi r12,r12,sym@l */
+#define ENTRY_JMP1(sym)	(PPC_INST_ADDI | __PPC_RT(R12) | __PPC_RA(R12) | PPC_LO(sym))


Those aren't "jump" instructions though, as the name suggests...  And you
only have names for the first two of the four insns.  ("2" and "3" were
still available ;-) )


Well, the idea was to convey that they define the jump destination. 
Anyway, since they are used only once, let's put them in directly.
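
For reference, the arithmetic the two halves encode: @ha rounds the high
part up when the low 16 bits would be negative as a signed value, so the
addi (which sign-extends its immediate) lands back on val:

	u32 hi = (val + 0x8000) >> 16;	/* sym@ha */
	u32 lo = val & 0xffff;		/* sym@l  */
	/* (hi << 16) + (s16)lo == val, whatever the sign of lo */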




-   entry->jump[0] = 0x3d800000+((val+0x8000)>>16); /* lis r12,sym@ha */
-   entry->jump[1] = 0x398c0000 + (val&0xffff); /* addi r12,r12,sym@l*/
-   entry->jump[2] = 0x7d8903a6;/* mtctr r12 */
-   entry->jump[3] = 0x4e800420; /* bctr */
+   entry->jump[0] = ENTRY_JMP0(val);
+   entry->jump[1] = ENTRY_JMP1(val);
+   entry->jump[2] = PPC_INST_MTCTR | __PPC_RS(R12);
+   entry->jump[3] = PPC_INST_BCTR;


Deleting the comment here is not an improvement imo.


Ok, I'll leave them in as I did for module64

Christophe




Segher



[PATCH] MAINTAINERS: Update cxl/ocxl email address

2019-05-02 Thread Andrew Donnellan
Use my @linux.ibm.com email to avoid a layer of redirection.

Signed-off-by: Andrew Donnellan 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5c38f21aee78..386e2336fe7e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4293,7 +4293,7 @@ F:	drivers/net/ethernet/chelsio/cxgb4vf/
 
 CXL (IBM Coherent Accelerator Processor Interface CAPI) DRIVER
 M: Frederic Barrat 
-M: Andrew Donnellan 
+M: Andrew Donnellan 
 L: linuxppc-dev@lists.ozlabs.org
 S: Supported
 F: arch/powerpc/platforms/powernv/pci-cxl.c
@@ -11173,7 +11173,7 @@ F:  tools/objtool/
 
 OCXL (Open Coherent Accelerator Processor Interface OpenCAPI) DRIVER
 M: Frederic Barrat 
-M: Andrew Donnellan 
+M: Andrew Donnellan 
 L: linuxppc-dev@lists.ozlabs.org
 S: Supported
 F: arch/powerpc/platforms/powernv/ocxl.c
-- 
2.20.1



Re: [PATCH v1 2/4] powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64

2019-05-02 Thread Christophe Leroy




Le 02/05/2019 à 09:11, Michael Ellerman a écrit :

Christophe Leroy  writes:


Many files in arch/powerpc/mm are only for book3S64. This patch
creates a subdirectory for them.

Signed-off-by: Christophe Leroy 
---
  arch/powerpc/mm/Makefile   | 25 +++
  arch/powerpc/mm/book3s64/Makefile  | 28 ++
  arch/powerpc/mm/{ => book3s64}/hash64_4k.c |  0
  arch/powerpc/mm/{ => book3s64}/hash64_64k.c|  0
  arch/powerpc/mm/{ => book3s64}/hash_native_64.c|  0
  arch/powerpc/mm/{ => book3s64}/hash_utils_64.c |  0
  arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c   |  0
  .../powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c |  0
  arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c |  0
  .../mm/{ => book3s64}/mmu_context_book3s64.c   |  0
  arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c |  0
  arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c  |  0
  arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c|  0
  arch/powerpc/mm/{ => book3s64}/pgtable-radix.c |  0
  arch/powerpc/mm/{ => book3s64}/pkeys.c |  0
  arch/powerpc/mm/{ => book3s64}/slb.c   |  0
  arch/powerpc/mm/{ => book3s64}/subpage-prot.c  |  0
  arch/powerpc/mm/{ => book3s64}/tlb-radix.c |  0
  arch/powerpc/mm/{ => book3s64}/tlb_hash64.c|  0
  arch/powerpc/mm/{ => book3s64}/vphn.c  |  0
  arch/powerpc/mm/{ => book3s64}/vphn.h  |  0
  arch/powerpc/mm/numa.c |  2 +-
  22 files changed, 32 insertions(+), 23 deletions(-)
  create mode 100644 arch/powerpc/mm/book3s64/Makefile
  rename arch/powerpc/mm/{ => book3s64}/hash64_4k.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hash64_64k.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hash_native_64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hash_utils_64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/mmu_context_book3s64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/pgtable-radix.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/pkeys.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/slb.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/subpage-prot.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/tlb-radix.c (100%)
  rename arch/powerpc/mm/{ => book3s64}/tlb_hash64.c (100%)


Do you mind if I take this but rework the destination names in the process?


I don't mind, I think it's a good idea.



I don't like having eg. book3s64/pgtable-book3s64.c

And some of the other names could use a bit of cleanup too.

What about:

  arch/powerpc/mm/{hash64_4k.c => book3s64/hash_4k.c}
  arch/powerpc/mm/{hash64_64k.c => book3s64/hash_64k.c}
  arch/powerpc/mm/{hugepage-hash64.c => book3s64/hash_hugepage.c}
  arch/powerpc/mm/{hugetlbpage-hash64.c => book3s64/hash_hugetlbpage.c}
  arch/powerpc/mm/{hash_native_64.c => book3s64/hash_native.c}
  arch/powerpc/mm/{pgtable-hash64.c => book3s64/hash_pgtable.c}
  arch/powerpc/mm/{tlb_hash64.c => book3s64/hash_tlb.c}
  arch/powerpc/mm/{hash_utils_64.c => book3s64/hash_utils.c}
  arch/powerpc/mm/{mmu_context_iommu.c => book3s64/iommu_api.c}
  arch/powerpc/mm/{mmu_context_book3s64.c => book3s64/mmu_context.c}
  arch/powerpc/mm/{pgtable-book3s64.c => book3s64/pgtable.c}
  arch/powerpc/mm/{hugetlbpage-radix.c => book3s64/radix_hugetlbpage.c}
  arch/powerpc/mm/{pgtable-radix.c => book3s64/radix_pgtable.c}
  arch/powerpc/mm/{tlb-radix.c => book3s64/radix_tlb.c}


Looks good

Christophe


Re: [PATCH v1 2/4] powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64

2019-05-02 Thread Michael Ellerman
Christophe Leroy  writes:

> Many files in arch/powerpc/mm are only for book3S64. This patch
> creates a subdirectory for them.
>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/Makefile   | 25 +++
>  arch/powerpc/mm/book3s64/Makefile  | 28 ++
>  arch/powerpc/mm/{ => book3s64}/hash64_4k.c |  0
>  arch/powerpc/mm/{ => book3s64}/hash64_64k.c|  0
>  arch/powerpc/mm/{ => book3s64}/hash_native_64.c|  0
>  arch/powerpc/mm/{ => book3s64}/hash_utils_64.c |  0
>  arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c   |  0
>  .../powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c |  0
>  arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c |  0
>  .../mm/{ => book3s64}/mmu_context_book3s64.c   |  0
>  arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c |  0
>  arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c  |  0
>  arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c|  0
>  arch/powerpc/mm/{ => book3s64}/pgtable-radix.c |  0
>  arch/powerpc/mm/{ => book3s64}/pkeys.c |  0
>  arch/powerpc/mm/{ => book3s64}/slb.c   |  0
>  arch/powerpc/mm/{ => book3s64}/subpage-prot.c  |  0
>  arch/powerpc/mm/{ => book3s64}/tlb-radix.c |  0
>  arch/powerpc/mm/{ => book3s64}/tlb_hash64.c|  0
>  arch/powerpc/mm/{ => book3s64}/vphn.c  |  0
>  arch/powerpc/mm/{ => book3s64}/vphn.h  |  0
>  arch/powerpc/mm/numa.c |  2 +-
>  22 files changed, 32 insertions(+), 23 deletions(-)
>  create mode 100644 arch/powerpc/mm/book3s64/Makefile
>  rename arch/powerpc/mm/{ => book3s64}/hash64_4k.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hash64_64k.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hash_native_64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hash_utils_64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hugepage-hash64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-hash64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/hugetlbpage-radix.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/mmu_context_book3s64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/mmu_context_iommu.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/pgtable-book3s64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/pgtable-hash64.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/pgtable-radix.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/pkeys.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/slb.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/subpage-prot.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/tlb-radix.c (100%)
>  rename arch/powerpc/mm/{ => book3s64}/tlb_hash64.c (100%)

Do you mind if I take this but rework the destination names in the process?

I don't like having eg. book3s64/pgtable-book3s64.c

And some of the other names could use a bit of cleanup too.

What about:

 arch/powerpc/mm/{hash64_4k.c => book3s64/hash_4k.c}
 arch/powerpc/mm/{hash64_64k.c => book3s64/hash_64k.c}
 arch/powerpc/mm/{hugepage-hash64.c => book3s64/hash_hugepage.c}
 arch/powerpc/mm/{hugetlbpage-hash64.c => book3s64/hash_hugetlbpage.c}
 arch/powerpc/mm/{hash_native_64.c => book3s64/hash_native.c}
 arch/powerpc/mm/{pgtable-hash64.c => book3s64/hash_pgtable.c}
 arch/powerpc/mm/{tlb_hash64.c => book3s64/hash_tlb.c}
 arch/powerpc/mm/{hash_utils_64.c => book3s64/hash_utils.c}
 arch/powerpc/mm/{mmu_context_iommu.c => book3s64/iommu_api.c}
 arch/powerpc/mm/{mmu_context_book3s64.c => book3s64/mmu_context.c}
 arch/powerpc/mm/{pgtable-book3s64.c => book3s64/pgtable.c}
 arch/powerpc/mm/{hugetlbpage-radix.c => book3s64/radix_hugetlbpage.c}
 arch/powerpc/mm/{pgtable-radix.c => book3s64/radix_pgtable.c}
 arch/powerpc/mm/{tlb-radix.c => book3s64/radix_tlb.c}

cheers


Re: [PATCHv2] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-05-02 Thread Pingfan Liu
On Thu, Apr 25, 2019 at 4:20 PM Pingfan Liu  wrote:
>
> On Wed, Apr 24, 2019 at 4:31 PM Matthias Brugger  wrote:
> >
> >
> [...]
> > > @@ -139,6 +141,8 @@ static int __init parse_crashkernel_simple(char *cmdline,
> > >   pr_warn("crashkernel: unrecognized char: %c\n", *cur);
> > >   return -EINVAL;
> > >   }
> > > + if (*crash_size == 0)
> > > + return -EINVAL;
> >
> > This covers the case where I pass an argument like "crashkernel=0M" ?
> > Can't we fix that by using kstrtoull() in memparse and check if the return 
> > value
> > is < 0? In that case we could return without updating the retptr and we 
> > will be
> > fine.
After working on this for a while, I realized that it cannot be done
this way. "0M" causes kstrtoull() to return -EINVAL, but that is caused
by the "M", not the "0". If "0" is passed to kstrtoull(), it returns 0
on success.

> >
> It seems that kstrtoull() treats "0M" as an invalid parameter, while
> simple_strtoull() does not.
>
That was careless reading of the code on my part. I also tested with a
valid value, "256M", using kstrtoull(), and it returned -EINVAL too.

So I think there is no way to distinguish 0 from a positive value
inside this basic parsing function. Am I missing anything?
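
A minimal illustration of the difference (sketch, kernel context assumed):

	unsigned long long v;
	int ret;

	ret = kstrtoull("256M", 0, &v);	/* -EINVAL: trailing 'M' rejected */
	v = memparse("256M", NULL);	/* 268435456: suffix handled via
					 * simple_strtoull() internally */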

Thanks and regards,
Pingfan