Re: [PATCH kernel 3/3] powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains

2022-07-18 Thread Alexey Kardashevskiy




On 19/07/2022 04:09, Jason Gunthorpe wrote:

On Thu, Jul 14, 2022 at 06:18:22PM +1000, Alexey Kardashevskiy wrote:


+/*
+ * A simple iommu_ops to allow less cruft in generic VFIO code.
+ */
+static bool spapr_tce_iommu_capable(enum iommu_cap cap)
+{
+   switch (cap) {
+   case IOMMU_CAP_CACHE_COHERENCY:


I would add a remark here that it is because vfio is going to use
SPAPR mode but still checks that the iommu driver support coherency -
with out that detail it looks very strange to have caps without
implementing unmanaged domains


+static struct iommu_domain *spapr_tce_iommu_domain_alloc(unsigned int type)
+{
+   struct iommu_domain *dom;
+
+   if (type != IOMMU_DOMAIN_BLOCKED)
+   return NULL;
+
+   dom = kzalloc(sizeof(*dom), GFP_KERNEL);
+   if (!dom)
+   return NULL;
+
+   dom->geometry.aperture_start = 0;
+   dom->geometry.aperture_end = ~0ULL;
+   dom->geometry.force_aperture = true;


A blocked domain doesn't really have an aperture, all DMA is rejected,
so I think these can just be deleted and left at zero.

Generally I'm suggesting drivers just use a static singleton instance
for the blocked domain instead of the allocation like this, but that
is a very minor nit.


+static struct iommu_device *spapr_tce_iommu_probe_device(struct device *dev)
+{
+   struct pci_dev *pdev;
+   struct pci_controller *hose;
+
+   /* Weirdly iommu_device_register() assigns the same ops to all buses */
+   if (!dev_is_pci(dev))
+   return ERR_PTR(-EPERM);


Less "weirdly", more by design. The iommu driver should check if the
given struct device is supported or not, it isn't really a bus
specific operation.


+static struct iommu_group *spapr_tce_iommu_device_group(struct device *dev)
+{
+   struct pci_controller *hose;
+   struct pci_dev *pdev;
+
+   /* Weirdly iommu_device_register() assigns the same ops to all buses */
+   if (!dev_is_pci(dev))
+   return ERR_PTR(-EPERM);


This doesn't need repeating, if probe_device() fails then this will
never be called.


+static int spapr_tce_iommu_attach_dev(struct iommu_domain *dom,
+ struct device *dev)
+{
+   struct iommu_group *grp = iommu_group_get(dev);
+   struct iommu_table_group *table_group;
+   int ret = -EINVAL;
+
+   if (!grp)
+   return -ENODEV;
+
+   table_group = iommu_group_get_iommudata(grp);
+
+   if (dom->type == IOMMU_DOMAIN_BLOCKED)
+   ret = table_group->ops->take_ownership(table_group);


Ideally there shouldn't be dom->type checks like this.


The blocking domain should have its own iommu_domain_ops that only
process the blocking operation. Ie call this like
spapr_tce_iommu_blocking_attach_dev()

Instead of having a "default_domain_ops" leave it NULL and create a
spapr_tce_blocking_domain_ops with these two functions and assign it
to domain->ops when creating. Then it is really clear these functions
are only called for the DOMAIN_BLOCKED type and you don't need to
check it.


+static void spapr_tce_iommu_detach_dev(struct iommu_domain *dom,
+  struct device *dev)
+{
+   struct iommu_group *grp = iommu_group_get(dev);
+   struct iommu_table_group *table_group;
+
+   table_group = iommu_group_get_iommudata(grp);
+   WARN_ON(dom->type != IOMMU_DOMAIN_BLOCKED);
+   table_group->ops->release_ownership(table_group);
+}


Ditto


+struct iommu_group *pSeries_pci_device_group(struct pci_controller *hose,
+struct pci_dev *pdev)
+{
+   struct device_node *pdn, *dn = pdev->dev.of_node;
+   struct iommu_group *grp;
+   struct pci_dn *pci;
+
+   pdn = pci_dma_find(dn, NULL);
+   if (!pdn || !PCI_DN(pdn))
+   return ERR_PTR(-ENODEV);
+
+   pci = PCI_DN(pdn);
+   if (!pci->table_group)
+   return ERR_PTR(-ENODEV);
+
+   grp = pci->table_group->group;
+   if (!grp)
+   return ERR_PTR(-ENODEV);
+
+   return iommu_group_ref_get(grp);


Not for this series, but this is kind of backwards, the driver
specific data (ie the table_group) should be in
iommu_group_get_iommudata()...



It is there but here we are getting from a device to a group - a device 
is not added to a group yet when iommu_probe_device() works and tries 
adding a device via iommu_group_get_for_dev().






diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c 
b/drivers/vfio/vfio_iommu_spapr_tce.c
index 8a65ea61744c..3b53b466e49b 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -1152,8 +1152,6 @@ static void tce_iommu_release_ownership(struct 
tce_container *container,
for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i)
if (container->tables[i])
table_group->ops->unset_window(table_group, i);
-
-   

Re: [PATCH 1/2] powerpc/pseries: define driver for Platform KeyStore

2022-07-18 Thread Gregory Joyce
Comments inline.
Reviewed-by: Greg Joyce 


From: Nayna Jain 
Sent: Tuesday, July 12, 2022 7:59 PM
To: linuxppc-dev@lists.ozlabs.org 
Cc: Michael Ellerman ; Benjamin Herrenschmidt 
; Paul Mackerras ; George Wilson 
; Gregory Joyce ; Nayna Jain 

Subject: [PATCH 1/2] powerpc/pseries: define driver for Platform KeyStore

PowerVM provides an isolated Platform Keystore(PKS) storage allocation
for each LPAR with individually managed access controls to store
sensitive information securely. It provides a new set of hypervisor
calls for Linux kernel to access PKS storage.

Define PLPKS driver using H_CALL interface to access PKS storage.

Signed-off-by: Nayna Jain 
---


+
+static int construct_auth(u8 consumer, struct plpks_auth **auth)
+{
+   pr_debug("max password size is %u\n", config->maxpwsize);

Are the pr_debugs in this function still needed?

+
+   if (!auth || consumer > 3)
+   return -EINVAL;
+
+   *auth = kmalloc(struct_size(*auth, password, config->maxpwsize),
+   GFP_KERNEL);
+   if (!*auth)
+   return -ENOMEM;
+
+   (*auth)->version = 1;
+   (*auth)->consumer = consumer;
+   (*auth)->rsvd0 = 0;
+   (*auth)->rsvd1 = 0;
+   if (consumer == PKS_FW_OWNER || consumer == PKS_BOOTLOADER_OWNER) {
+   pr_debug("consumer is bootloader or firmware\n");
+   (*auth)->passwordlength = 0;
+   return 0;
+   }
+
+   (*auth)->passwordlength = (__force __be16)ospasswordlength;
+
+   memcpy((*auth)->password, ospassword,
+  flex_array_size(*auth, password,
+  (__force u16)((*auth)->passwordlength)));
+   (*auth)->passwordlength = cpu_to_be16((__force 
u16)((*auth)->passwordlength));
+
+   return 0;
+}
+
+/**
+ * Label is combination of label attributes + name.
+ * Label attributes are used internally by kernel and not exposed to the user.
+ */
+static int construct_label(char *component, u8 varos, u8 *name, u16 namelen, 
u8 **label)
+{
+   int varlen;
+   int len = 0;
+   int llen = 0;
+   int i;
+   int rc = 0;
+   u8 labellength = MAX_LABEL_ATTR_SIZE;
+
+   if (!label)
+   return -EINVAL;
+
+   varlen = namelen + sizeof(struct label_attr);
+   *label = kzalloc(varlen, GFP_KERNEL);
+
+   if (!*label)
+   return -ENOMEM;
+
+   if (component) {
+   len = strlen(component);
+   memcpy(*label, component, len);
+   }
+   llen = len;
+

I guess the 8, 1, and 5 are field lengths. Could they be a define or sizeof?

+   if (component)
+   len = 8 - strlen(component);
+   else
+   len = 8;
+
+   memset(*label + llen, 0, len);
+   llen = llen + len;
+
+   ((*label)[llen]) = 0;
+   llen = llen + 1;
+
+   memcpy(*label + llen, , 1);
+   llen = llen + 1;
+
+   memcpy(*label + llen, , 1);
+   llen = llen + 1;
+
+   memset(*label + llen, 0, 5);
+   llen = llen + 5;
+
+   memcpy(*label + llen, name, namelen);
+   llen = llen + namelen;
+
+   for (i = 0; i < llen; i++)
+   pr_debug("%c", (*label)[i]);
+
+   rc = llen;
+   return rc;
+}
+
+static int _plpks_get_config(void)
+{
+   unsigned long retbuf[PLPAR_HCALL_BUFSIZE] = {0};
+   int rc;
+   size_t size = sizeof(struct plpks_config);
+
+   config = kzalloc(size, GFP_KERNEL);
+   if (!config)
+   return -ENOMEM;
+
+   rc = plpar_hcall(H_PKS_GET_CONFIG,
+retbuf,
+virt_to_phys(config),
+size);
+
+   if (rc != H_SUCCESS)

Free config before returning the error.

+   return pseries_status_to_err(rc);
+
+   config->rsvd0 = be32_to_cpu((__force __be32)config->rsvd0);
+   config->maxpwsize = be16_to_cpu((__force __be16)config->maxpwsize);
+   config->maxobjlabelsize = be16_to_cpu((__force 
__be16)config->maxobjlabelsize);
+   config->maxobjsize = be16_to_cpu((__force __be16)config->maxobjsize);
+   config->totalsize = be32_to_cpu((__force __be32)config->totalsize);
+   config->usedspace = be32_to_cpu((__force __be32)config->usedspace);
+   config->supportedpolicies = be32_to_cpu((__force 
__be32)config->supportedpolicies);
+   config->rsvd1 = be64_to_cpu((__force __be64)config->rsvd1);
+
+   configset = true;
+
+   return 0;
+}
+
+static int plpks_confirm_object_flushed(u8 *label, u16 labellen)
+{
+   unsigned long retbuf[PLPAR_HCALL_BUFSIZE] = {0};
+   int rc;
+   u64 timeout = 0;
+   struct plpks_auth *auth;
+   u8 status;
+   int i;
+

Deleted pr_debugs? I guess this is a general question for all pr_debugs in this 
file.

+   rc = construct_auth(PKS_OS_OWNER, );
+   if (rc)
+   return rc;
+
+   for (i = 0; i < labellen; i++)
+   pr_debug("%02x ", label[i]);
+
+   do {
+   

Re: [PATCH 1/2] powerpc/pseries: define driver for Platform KeyStore

2022-07-18 Thread Gregory Joyce
Tested-by: Greg Joyce 


From: Nayna Jain 
Sent: Tuesday, July 12, 2022 7:59 PM
To: linuxppc-dev@lists.ozlabs.org 
Cc: Michael Ellerman ; Benjamin Herrenschmidt 
; Paul Mackerras ; George Wilson 
; Gregory Joyce ; Nayna Jain 

Subject: [PATCH 1/2] powerpc/pseries: define driver for Platform KeyStore

PowerVM provides an isolated Platform Keystore(PKS) storage allocation
for each LPAR with individually managed access controls to store
sensitive information securely. It provides a new set of hypervisor
calls for Linux kernel to access PKS storage.

Define PLPKS driver using H_CALL interface to access PKS storage.

Tested-by: Greg Joyce 
Signed-off-by: Nayna Jain 





Re: mainline build failure of powerpc allmodconfig for prom_init_check

2022-07-18 Thread Linus Torvalds
On Mon, Jul 18, 2022 at 3:12 PM Segher Boessenkool
 wrote:
>
> Assembler language is unforgiving.  It isn't easy to write, and most
> mistakes will not be diagnosed.  If the assmbler language makes it
> easier to read the code, that makes it more likely correct code will be
> written, and that correct code will be written in less time.

What's your argument? That it's unforgiving, so it has to be
unreadable and easy to make mistakes in too?

You can get the order of operands wrong, and it will still build -
just to completely the wrong thing.

> > Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> > odd), but nothing *quite* as broken as "simple constants have entirely
> > different meanings depending on the exact instruction and argument
> > position".
>
> What is broken about that?  It makes everything very consistent, and
> very readable.  Sigils are just nasty, and having the register names the
> same as valid symbol names is also problematic.

Oh, I agree that sigils are good to make the type clear. So '%r4' is
better than 'r4' because the latter could be ambiguous and you could
have a symbol 'r4'.

But just '4' is plain garbage.  There's no "could be ambiguous" about it.

Type checking matters. Not just in C. In asm too.

The reason '$0' is odd because it's *just* a sigil and a number.

Which certainly is not unusual in itself, but it reads like it's a
number to me. Not just because of x86 gas ('$' means 'immediate'), but
Z80 ('$' means HEX), or at least 'Nth argument' (shell).

And yeah, alpha got it from MIPS, afaik.

And presumably MIPS got it from "we're hacking up the simplest
assembler we can".

> > The human-written asm files have those #define's in headers just to
> > make things slightly more legible, because apparently the assembler
> > doesn't even *accept* the sane names.
>
> That was true a long time ago.  And the "#define r0 0" thing caused
> quite a few bugs itself btw.

Those #define's still exist. Look it up.

And yes, they are horrible, and they are wrong, and they shouldn't exist.

   Linus


Re: mainline build failure of powerpc allmodconfig for prom_init_check

2022-07-18 Thread Segher Boessenkool
On Mon, Jul 18, 2022 at 12:06:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman  wrote:
> > > li 4,254 #,
> >
> > Here we load 254 into r4, which is the 2nd parameter to memset (c).
> 
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

This is compiler output.  Compiler output is mainly meant for the
assembler to produce object code from.  It isn't meant to be readable
(and e.g. -fverbose-asm didn't help much here, that's the "#," ;-) ).

The mnemonic determines what the operands mean.  It is much easier to
read and write "li 4,254" than "li r4,254" or "li %r4,254", all of which
are valid.  You can also write "li 3+1,2*127", but not with the other
forms (this is useful if you use assembler macros, which are way more
powerful and appropriate than abusing the C preprocessor, when writing
assembler code).

It matters more if you have three or four or five or six operands to an
assembler instruction, all the extra line noise makes things illegible.

The "%r4" variant hails from winnt.  It is a bit problematic in inline
assembler, because you need to escape the % in extended inline asm, but
not in basic inline asm.  It also is pure line noise to read.

The "r4" variant is problematic if you have symbols named the same.
When you use the -mregnames assembler option it is taken to mean the
register; you can write "(r6)" to mean the symbol.  (There also are "sp"
and "rtos" and "xer" and whatnot, not just "r4").

> I don't understand why the powerpc assembler is so messed up, and uses
> random integer constants for register "names".

360 was the same.  370 was the same.  390 is the same.  801 was the
same.  RIOS (aka POWER) was the same.  So yes, PowerPC inherited it, I
don't know how much thought was put into this, don't change a winning
team etc.

> And it gets even worse, when you start mixing FP, vector and integer "names".

It is clear from the mnemonic what the operands are: some register, an
immediate, a constant, etc.  An expression (which can include object
symbols) can be any of those.

Assembler language is unforgiving.  It isn't easy to write, and most
mistakes will not be diagnosed.  If the assmbler language makes it
easier to read the code, that makes it more likely correct code will be
written, and that correct code will be written in less time.

> I've seen many bad assemblers (in fact, I have *written* a couple of
> bad assemblers myself), but I have never seen anything quite that
> broken on any other architecture.
> 
> Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> odd), but nothing *quite* as broken as "simple constants have entirely
> different meanings depending on the exact instruction and argument
> position".

What is broken about that?  It makes everything very consistent, and
very readable.  Sigils are just nasty, and having the register names the
same as valid symbol names is also problematic.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

s390 has the same syntax, and even inherited the GAS code for this from
the ppc port.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

That was true a long time ago.  And the "#define r0 0" thing caused
quite a few bugs itself btw.

> So it's not even a "the compiler
> generates this abbreviated illegible mess". It's literally that the
> assembler is so horrid.

The disassembler has shown "r4" etc. by default since ages.  The
assembler needs -mregnames to accept it; enabling this by default would
be a compatibility break, not acceptable.

> Why do people put up with that?

Why are people misinformed?

Is there anything in particular in the documentation we could improve?


Hope this helps,


Segher


Re: mainline build failure of powerpc allmodconfig for prom_init_check

2022-07-18 Thread Linus Torvalds
On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman  wrote:
>
> > li 4,254 #,
>
> Here we load 254 into r4, which is the 2nd parameter to memset (c).

I love how even powerpc people know that "4" is bogus, and have to
make it clear that it means "r4".

I don't understand why the powerpc assembler is so messed up, and uses
random integer constants for register "names".

And it gets even worse, when you start mixing FP, vector and integer "names".

I've seen many bad assemblers (in fact, I have *written* a couple of
bad assemblers myself), but I have never seen anything quite that
broken on any other architecture.

Oddities, yes ("$" as a prefix for register? Alpha asm is also very
odd), but nothing *quite* as broken as "simple constants have entirely
different meanings depending on the exact instruction and argument
position".

It's not even an IBM thing. S390 uses perfectly sane register syntax,
and calls things '%r4" etc.

The human-written asm files have those #define's in headers just to
make things slightly more legible, because apparently the assembler
doesn't even *accept* the sane names. So it's not even a "the compiler
generates this abbreviated illegible mess". It's literally that the
assembler is so horrid.

Why do people put up with that?

   Linus


Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

2022-07-18 Thread Linus Torvalds
On Mon, Jul 18, 2022 at 6:44 AM Michael Ellerman  wrote:
>
> With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
> causes the compiler to emit memset calls to initialise on-stack
> variables with a pattern.

Ahh, and that explains why "volatile" made no difference. That did
seem very odd.

Thanks for figuring it out,

   Linus


Re: [PATCH kernel 3/3] powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains

2022-07-18 Thread Jason Gunthorpe
On Thu, Jul 14, 2022 at 06:18:22PM +1000, Alexey Kardashevskiy wrote:

> +/*
> + * A simple iommu_ops to allow less cruft in generic VFIO code.
> + */
> +static bool spapr_tce_iommu_capable(enum iommu_cap cap)
> +{
> + switch (cap) {
> + case IOMMU_CAP_CACHE_COHERENCY:

I would add a remark here that it is because vfio is going to use
SPAPR mode but still checks that the iommu driver support coherency -
with out that detail it looks very strange to have caps without
implementing unmanaged domains

> +static struct iommu_domain *spapr_tce_iommu_domain_alloc(unsigned int type)
> +{
> + struct iommu_domain *dom;
> +
> + if (type != IOMMU_DOMAIN_BLOCKED)
> + return NULL;
> +
> + dom = kzalloc(sizeof(*dom), GFP_KERNEL);
> + if (!dom)
> + return NULL;
> +
> + dom->geometry.aperture_start = 0;
> + dom->geometry.aperture_end = ~0ULL;
> + dom->geometry.force_aperture = true;

A blocked domain doesn't really have an aperture, all DMA is rejected,
so I think these can just be deleted and left at zero.

Generally I'm suggesting drivers just use a static singleton instance
for the blocked domain instead of the allocation like this, but that
is a very minor nit.

> +static struct iommu_device *spapr_tce_iommu_probe_device(struct device *dev)
> +{
> + struct pci_dev *pdev;
> + struct pci_controller *hose;
> +
> + /* Weirdly iommu_device_register() assigns the same ops to all buses */
> + if (!dev_is_pci(dev))
> + return ERR_PTR(-EPERM);

Less "weirdly", more by design. The iommu driver should check if the
given struct device is supported or not, it isn't really a bus
specific operation.

> +static struct iommu_group *spapr_tce_iommu_device_group(struct device *dev)
> +{
> + struct pci_controller *hose;
> + struct pci_dev *pdev;
> +
> + /* Weirdly iommu_device_register() assigns the same ops to all buses */
> + if (!dev_is_pci(dev))
> + return ERR_PTR(-EPERM);

This doesn't need repeating, if probe_device() fails then this will
never be called.

> +static int spapr_tce_iommu_attach_dev(struct iommu_domain *dom,
> +   struct device *dev)
> +{
> + struct iommu_group *grp = iommu_group_get(dev);
> + struct iommu_table_group *table_group;
> + int ret = -EINVAL;
> +
> + if (!grp)
> + return -ENODEV;
> +
> + table_group = iommu_group_get_iommudata(grp);
> +
> + if (dom->type == IOMMU_DOMAIN_BLOCKED)
> + ret = table_group->ops->take_ownership(table_group);

Ideally there shouldn't be dom->type checks like this.


The blocking domain should have its own iommu_domain_ops that only
process the blocking operation. Ie call this like
spapr_tce_iommu_blocking_attach_dev()

Instead of having a "default_domain_ops" leave it NULL and create a
spapr_tce_blocking_domain_ops with these two functions and assign it
to domain->ops when creating. Then it is really clear these functions
are only called for the DOMAIN_BLOCKED type and you don't need to
check it.

> +static void spapr_tce_iommu_detach_dev(struct iommu_domain *dom,
> +struct device *dev)
> +{
> + struct iommu_group *grp = iommu_group_get(dev);
> + struct iommu_table_group *table_group;
> +
> + table_group = iommu_group_get_iommudata(grp);
> + WARN_ON(dom->type != IOMMU_DOMAIN_BLOCKED);
> + table_group->ops->release_ownership(table_group);
> +}

Ditto

> +struct iommu_group *pSeries_pci_device_group(struct pci_controller *hose,
> +  struct pci_dev *pdev)
> +{
> + struct device_node *pdn, *dn = pdev->dev.of_node;
> + struct iommu_group *grp;
> + struct pci_dn *pci;
> +
> + pdn = pci_dma_find(dn, NULL);
> + if (!pdn || !PCI_DN(pdn))
> + return ERR_PTR(-ENODEV);
> +
> + pci = PCI_DN(pdn);
> + if (!pci->table_group)
> + return ERR_PTR(-ENODEV);
> +
> + grp = pci->table_group->group;
> + if (!grp)
> + return ERR_PTR(-ENODEV);
> +
> + return iommu_group_ref_get(grp);

Not for this series, but this is kind of backwards, the driver
specific data (ie the table_group) should be in
iommu_group_get_iommudata()...

> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c 
> b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 8a65ea61744c..3b53b466e49b 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -1152,8 +1152,6 @@ static void tce_iommu_release_ownership(struct 
> tce_container *container,
>   for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i)
>   if (container->tables[i])
>   table_group->ops->unset_window(table_group, i);
> -
> - table_group->ops->release_ownership(table_group);
>  }
>  
>  static long tce_iommu_take_ownership(struct tce_container *container,
> @@ -1161,10 +1159,6 @@ static long tce_iommu_take_ownership(struct 
> tce_container 

Re: [PATCH kernel 2/3] powerpc/pci_64: Init pcibios subsys a bit later

2022-07-18 Thread Jason Gunthorpe
On Thu, Jul 14, 2022 at 06:18:21PM +1000, Alexey Kardashevskiy wrote:
> The following patches are going to add dependency/use of iommu_ops which
> is initialized in subsys_initcall as well.
> 
> This moves pciobios_init() to the next initcall level.
> 
> This should not cause behavioral change.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kernel/pci_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH kernel 1/3] powerpc/iommu: Add "borrowing" iommu_table_group_ops

2022-07-18 Thread Jason Gunthorpe
On Thu, Jul 14, 2022 at 06:18:20PM +1000, Alexey Kardashevskiy wrote:
> PPC64 IOMMU API defines iommu_table_group_ops which handles DMA windows
> for PEs: control the ownership, create/set/unset a table the hardware
> for dynamic DMA windows (DDW). VFIO uses the API to implement support
> on POWER.
> 
> So far only PowerNV IODA2 (POWER8 and newer machines) implemented this and 
> other cases (POWER7 or nested KVM) did not and instead reused
> existing iommu_table structs. This means 1) no DDW 2) ownership transfer
> is done directly in the VFIO SPAPR TCE driver.
> 
> Soon POWER is going to get its own iommu_ops and ownership control is
> going to move there. This implements spapr_tce_table_group_ops which
> borrows iommu_table tables. The upside is that VFIO needs to know less
> about POWER.
> 
> The new ops returns the existing table from create_table() and
> only checks if the same window is already set. This is only going to work
> if the default DMA window starts table_group.tce32_start and as big as
> pe->table_group.tce32_size (not the case for IODA2+ PowerNV).
> 
> This changes iommu_table_group_ops::take_ownership() to return an error
> if borrowing a table failed.
> 
> This should not cause any visible change in behavior for PowerNV.
> pSeries was not that well tested/supported anyway.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/include/asm/iommu.h  |  6 +-
>  arch/powerpc/kernel/iommu.c   | 98 ++-
>  arch/powerpc/platforms/powernv/pci-ioda.c |  6 +-
>  arch/powerpc/platforms/pseries/iommu.c|  3 +
>  drivers/vfio/vfio_iommu_spapr_tce.c   | 94 --
>  5 files changed, 121 insertions(+), 86 deletions(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

2022-07-18 Thread Sudip Mukherjee
On Mon, Jul 18, 2022 at 2:44 PM Michael Ellerman  wrote:
>
> With GCC 12 allmodconfig prom_init fails to build:
>
>   Error: External symbol 'memset' referenced from prom_init.c
>   make[2]: *** [arch/powerpc/kernel/Makefile:204: 
> arch/powerpc/kernel/prom_init_check] Error 1
>



>
> Reported-by: Sudip Mukherjee 
> Signed-off-by: Michael Ellerman 

And, this has fixed the build failure.
Thanks Michael.


-- 
Regards
Sudip


Re: mainline build failure of powerpc allmodconfig for prom_init_check

2022-07-18 Thread Segher Boessenkool
On Mon, Jul 18, 2022 at 01:52:38PM +1000, Michael Ellerman wrote:
> Segher Boessenkool  writes:
> > Can't we simply have a small simple implementation of these functions in
> > arch/powerpc/boot/?  This stuff is not performance-critical, and this is
> > not the first time we hit these problems.
> 
> prom_init.c isn't in arch/powerpc/boot :)

Ah duh :-)

> It's linked into the kernel proper, but we want it to behave like a
> pre-boot environment (because not all boot paths run it) which is why we
> restrict what symbols it can call.
> 
> We could have a prom_memset() etc. but we'd need to do some tricks to
> rewrite references to memset() to prom_memset() before linking.

You can do it in its linker script?


Segher


[PATCH] powerpc/ps3: Fix comment typo

2022-07-18 Thread Jason Wang
The double `when' is duplicated in line 1069, remove one.

Signed-off-by: Jason Wang 
---
 drivers/ps3/ps3-lpm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ps3/ps3-lpm.c b/drivers/ps3/ps3-lpm.c
index 65512b6cc6fd..200ad8751860 100644
--- a/drivers/ps3/ps3-lpm.c
+++ b/drivers/ps3/ps3-lpm.c
@@ -1066,7 +1066,7 @@ EXPORT_SYMBOL_GPL(ps3_disable_pm_interrupts);
  *  instance, specified by one of enum ps3_lpm_tb_type.
  * @tb_cache: Optional user supplied buffer to use as the trace buffer cache.
  *  If NULL, the driver will allocate and manage an internal buffer.
- *  Unused when when @tb_type is PS3_LPM_TB_TYPE_NONE.
+ *  Unused when @tb_type is PS3_LPM_TB_TYPE_NONE.
  * @tb_cache_size: The size in bytes of the user supplied @tb_cache buffer.
  *  Unused when @tb_cache is NULL or @tb_type is PS3_LPM_TB_TYPE_NONE.
  */
-- 
2.35.1



[PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

2022-07-18 Thread Michael Ellerman
With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: 
arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f91f0f29a566..c8cf924bf9c0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom_init.o += -fno-stack-protector
 CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
 CFLAGS_prom_init.o += -ffreestanding
+CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-- 
2.35.3



Re: [PATCH v5 0/2] powerpc rng cleanups

2022-07-18 Thread Jason A. Donenfeld
Hey again,

On Tue, Jul 12, 2022 at 01:24:46AM +0200, Jason A. Donenfeld wrote:
> These are two small cleanups for -next. This v5 rebases on the latest
> git master, as some whitespace was added that made v4 no longer apply.
> 
> Jason A. Donenfeld (2):
>   powerpc/powernv: rename remaining rng powernv_ functions to pnv_
>   powerpc/kvm: don't crash on missing rng, and use darn
> 
>  arch/powerpc/include/asm/archrandom.h |  7 +--
>  arch/powerpc/kvm/book3s_hv_builtin.c  |  7 +--
>  arch/powerpc/platforms/powernv/rng.c  | 66 ++-
>  drivers/char/hw_random/powernv-rng.c  |  2 +-
>  4 files changed, 30 insertions(+), 52 deletions(-)

I think v5 has reached a completion point. Could you queue these up in
some PPC tree for 5.20?

Thanks,
Jason


Re: [PATCH v5] random: remove CONFIG_ARCH_RANDOM

2022-07-18 Thread Michael Ellerman
"Jason A. Donenfeld"  writes:
> When RDRAND was introduced, there was much discussion on whether it
> should be trusted and how the kernel should handle that. Initially, two
> mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
> "nordrand", a boot-time switch.
...
>
>  arch/arm/include/asm/archrandom.h |  2 ++
>  arch/arm64/Kconfig|  8 --
>  arch/arm64/include/asm/archrandom.h   | 10 
>  arch/arm64/kernel/cpufeature.c|  2 --
>  arch/powerpc/Kconfig  |  3 ---
>  arch/powerpc/include/asm/archrandom.h |  3 ---
>  arch/powerpc/include/asm/machdep.h|  2 --
>  arch/powerpc/platforms/microwatt/Kconfig  |  1 -
>  arch/powerpc/platforms/powernv/Kconfig|  1 -
>  arch/powerpc/platforms/pseries/Kconfig|  1 -

Acked-by: Michael Ellerman  (powerpc)

cheers


Re: BUG xfs_buf while running tests/xfs/435 (next-20220715)

2022-07-18 Thread Sachin Sant


> Fix it by removing the xfs_buf_init/terminate wrappers that just
> allocate and destroy the xfs_buf slab, and move them to the same
> place that all the other slab caches are set up and destroyed.
> 
> Reported-by: Sachin Sant 
> Fixes: 298f34224506 ("xfs: lockless buffer lookup")
> Signed-off-by: Dave Chinner 
> ---

Thanks. The patch fixes the reported problem for me.

Tested-by: Sachin Sant 

- Sachin


[PATCH] powerpc/sysdev: Fix comment typo

2022-07-18 Thread Jason Wang
The double `is' is duplicated in line 110, remove one.

Signed-off-by: Jason Wang 
---
 arch/powerpc/sysdev/cpm2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/cpm2.c b/arch/powerpc/sysdev/cpm2.c
index 3f130312b6e9..915f4d3991c3 100644
--- a/arch/powerpc/sysdev/cpm2.c
+++ b/arch/powerpc/sysdev/cpm2.c
@@ -107,7 +107,7 @@ EXPORT_SYMBOL(cpm_command);
  * memory mapped space.
  * The baud rate clock is the system clock divided by something.
  * It was set up long ago during the initial boot phase and is
- * is given to us.
+ * given to us.
  * Baud rate clocks are zero-based in the driver code (as that maps
  * to port numbers).  Documentation uses 1-based numbering.
  */
-- 
2.35.1



Memory problem with Delock SATA3/USB3 controller board

2022-07-18 Thread Roland
Hello!

I tested in my AmigaOne X5000 (QuorIQ 5020/5040) a combined Sata3/USB3 pcie 
controller board from Delock 
(https://www.delock.de/produkt/89389/merkmale.html?setLanguage=en). The ASM 
1042, ASM 1061 and ASM 1182 chipsets are used on this board.

The Sata controller seems to work without issues, except that with the 
speedtest of 'Disks' tool the read speed reach only 200MB/s (with a 6G SSD 
disk), whereas it is normally around 400Mb/s with other Sata3 controllers.

Instead, there are serious problems with the USB controller if the whole RAM of 
my machine (4 or 8 GB) is in use:

- If a USB drive is connected to the controller before booting, the size of 
that drive is interpreted totally wrong (e.g. 250 GB -> 1GB) and the disk is 
not usable. As a consequence, loading of a root filesystem from the USB disk is 
not possible

- If a USB drive is conneted only after booting has finished , the size is 
correct and reading of the disk is possible. But writing to the disk fails if 
larger filesizes (several hundreds of MBs) are used (message: "Error splicing 
file: Input/output error"). The drive is then also unmounted and cannot be 
remounted before rebooting. Also copying smaller files can sometimes lock the 
drive, so that it isn't possible to write on it more files.

If I limit the RAM size to 3.5 GB (using a "MEM=3500M" bootarg) the symptoms 
described above dissappear and USB disks work normally. One problem is still 
present, though: if the machine is set to boot from an internal Sata disk and 
if a second SATA disk is connected to the USB port (using an USB3-Sata3 
adapter) before booting, booting halts to a "start job" trying to load the 
kernel modules (present in lib/modules), adding continuously the time. The 
second disk does not have any Linux os components, only a couple of NTFS 
partitions with trivial data files! Also, connecting a simple USB memory stick 
does _not_ cause this issue.

So, is there some sort of memory allocation problem with the USB driver of this 
card? Is it possible to fix it? I had long time ago similar type of problems 
with some legacy pci boards, but they were fixed in kernel updates
I'm using currently Kernel 5.10.124, but the problem is seen also with later 
kernels, like 5.15 and 5.19 betas.

Here is some info how the controllers are seen:

1000:08:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host 
Controller (prog-if 30 [XHCI])
Subsystem: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller
Flags: bus master, fast devsel, latency 0, IRQ 19
Memory at c2010 (64-bit, non-prefetchable) [size=32K]
Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [68] MSI-X: Enable+ Count=8 Masked-
Capabilities: [78] Power Management version 3
Capabilities: [80] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Kernel driver in use: xhci_hcd

1000:09:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA 
Controller (rev 01) (prog-if 01 [AHCI 1.0])
Subsystem: ASMedia Technology Inc. ASM1062 Serial ATA Controller
Flags: bus master, fast devsel, latency 0, IRQ 47
I/O ports at 3000 [size=8]
I/O ports at 3008 [size=4]
I/O ports at 3010 [size=8]
I/O ports at 3018 [size=4]
I/O ports at 3020 [size=32]
Memory at c2020 (32-bit, non-prefetchable) [size=512]
Expansion ROM at c2021 [virtual] [disabled] [size=64K]
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [78] Power Management version 3
Capabilities: [80] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Kernel driver in use: ahci

Regards,

Roland



[PATCH] KVM: PPC: Book3S HV: Fix comment typo

2022-07-18 Thread Jason Wang
The double `that' is duplicated in line 1604, remove one.

Signed-off-by: Jason Wang 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 514fd45c1994..73c6db20cd8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1601,7 +1601,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
  * is valid, it is written to the HPT as if an H_ENTER with the
  * exact flag set was done.  When the invalid count is non-zero
  * in the header written to the stream, the kernel will make
- * sure that that many HPTEs are invalid, and invalidate them
+ * sure that many HPTEs are invalid, and invalidate them
  * if not.
  */
 
-- 
2.35.1



[PATCH] Merge: Fix comment typo

2022-07-18 Thread Jason Wang
The double `that' is duplicated in line 1604, remove one.

Signed-off-by: Jason Wang 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 514fd45c1994..73c6db20cd8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1601,7 +1601,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
  * is valid, it is written to the HPT as if an H_ENTER with the
  * exact flag set was done.  When the invalid count is non-zero
  * in the header written to the stream, the kernel will make
- * sure that that many HPTEs are invalid, and invalidate them
+ * sure that many HPTEs are invalid, and invalidate them
  * if not.
  */
 
-- 
2.35.1



[PATCH] powerpc: Fix all occurences of duplicate words

2022-07-18 Thread Michael Ellerman
Since commit 87c78b612f4f ("powerpc: Fix all occurences of "the the"")
fixed "the the", there's now a steady stream of patches fixing other
duplicate words.

Just fix them all at once, to save the overhead of dealing with
individual patches for each case.

This leaves a few cases of "that that", which in some contexts is
correct.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/crypto/aes-spe-glue.c   | 2 +-
 arch/powerpc/kernel/btext.c  | 2 +-
 arch/powerpc/kernel/eeh_driver.c | 2 +-
 arch/powerpc/kernel/exceptions-64s.S | 2 +-
 arch/powerpc/kernel/pci-common.c | 2 +-
 arch/powerpc/kernel/pci_dn.c | 2 +-
 arch/powerpc/kernel/ptrace/ptrace-vsx.c  | 2 +-
 arch/powerpc/kernel/trace/ftrace.c   | 6 +++---
 arch/powerpc/kernel/watchdog.c   | 2 +-
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 arch/powerpc/mm/book3s64/hash_utils.c| 2 +-
 arch/powerpc/platforms/4xx/cpm.c | 2 +-
 arch/powerpc/platforms/powernv/vas-fault.c   | 2 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c | 2 +-
 arch/powerpc/platforms/pseries/papr_scm.c| 2 +-
 15 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/crypto/aes-spe-glue.c 
b/arch/powerpc/crypto/aes-spe-glue.c
index e8dfe9fb0266..efab78a3a8f6 100644
--- a/arch/powerpc/crypto/aes-spe-glue.c
+++ b/arch/powerpc/crypto/aes-spe-glue.c
@@ -28,7 +28,7 @@
  * instructions per clock cycle using one 32/64 bit unit (SU1) and one 32
  * bit unit (SU2). One of these can be a memory access that is executed via
  * a single load and store unit (LSU). XTS-AES-256 takes ~780 operations per
- * 16 byte block block or 25 cycles per byte. Thus 768 bytes of input data
+ * 16 byte block or 25 cycles per byte. Thus 768 bytes of input data
  * will need an estimated maximum of 20,000 cycles. Headroom for cache misses
  * included. Even with the low end model clocked at 667 MHz this equals to a
  * critical time window of less than 30us. The value has been chosen to
diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c
index 8f69bb07e500..2769889219bf 100644
--- a/arch/powerpc/kernel/btext.c
+++ b/arch/powerpc/kernel/btext.c
@@ -73,7 +73,7 @@ static inline void rmci_maybe_off(void)
  * the display during identify_machine() and MMU_Init()
  *
  * The display is mapped to virtual address 0xD000, rather
- * than 1:1, because some some CHRP machines put the frame buffer
+ * than 1:1, because some CHRP machines put the frame buffer
  * in the region starting at 0xC000 (PAGE_OFFSET).
  * This mapping is temporary and will disappear as soon as the
  * setup done by MMU_Init() is applied.
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 260273e56431..f279295179bd 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -750,7 +750,7 @@ static void eeh_pe_cleanup(struct eeh_pe *pe)
  * @pdev: pci_dev to check
  *
  * This function may return a false positive if we can't determine the slot's
- * presence state. This might happen for for PCIe slots if the PE containing
+ * presence state. This might happen for PCIe slots if the PE containing
  * the upstream bridge is also frozen, or the bridge is part of the same PE
  * as the device.
  *
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index b66dd6f775a4..3d0dc133a9ae 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2779,7 +2779,7 @@ EXC_COMMON_BEGIN(soft_nmi_common)
 
 /*
  * An interrupt came in while soft-disabled. We set paca->irq_happened, then:
- * - If it was a decrementer interrupt, we bump the dec to max and and return.
+ * - If it was a decrementer interrupt, we bump the dec to max and return.
  * - If it was a doorbell we return immediately since doorbells are edge
  *   triggered and won't automatically refire.
  * - If it was a HMI we return immediately since we handled it in realmode
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index c87999289752..cbb912ee92fa 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1088,7 +1088,7 @@ void pcibios_fixup_bus(struct pci_bus *bus)
 */
pci_read_bridge_bases(bus);
 
-   /* Now fixup the bus bus */
+   /* Now fixup the bus */
pcibios_setup_bus_self(bus);
 }
 EXPORT_SYMBOL(pcibios_fixup_bus);
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index 938ab8838ab5..7a35fc25a304 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -259,7 +259,7 @@ void remove_sriov_vf_pdns(struct pci_dev *pdev)
if (edev) {
/*
 * We allocate pci_dn's for the totalvfs count,
-* but only only the vfs that were activated
+   

Re: [PATCH v2 1/2] asm-generic: Remove pci.h copying remaining code to x86

2022-07-18 Thread Arnd Bergmann
On Mon, Jul 18, 2022 at 6:33 AM Christoph Hellwig  wrote:
>
> On Sun, Jul 17, 2022 at 12:34:52PM +0900, Stafford Horne wrote:
> > The generic pci.h header now only provides a definition of
> > pci_get_legacy_ide_irq which is used by architectures that support PNP.
> > Of the architectures that use asm-generic/pci.h this is only x86.
>
> Please move this into a separate header, ike legacy-ide.h.  It doens't
> have anyting to do with actual PCI support.

It looks like asm/libata-portmap.h is meant to have this information already,
and this is what libata uses, while drivers/ide used the
pci_get_legacy_ide_irq()
function for the same purpose.

Only ia64 and powerpc have interesting definitions of both, and they
return the same thing, so I think this is sufficient to remove the last caller:

diff --git a/drivers/pnp/resource.c b/drivers/pnp/resource.c
index 2fa0f7d55259..d7a6250589d6 100644
--- a/drivers/pnp/resource.c
+++ b/drivers/pnp/resource.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 

@@ -322,8 +322,8 @@ static int pci_dev_uses_irq(struct pnp_dev *pnp,
struct pci_dev *pci,
 * treat the compatibility IRQs as busy.
 */
if ((progif & 0x5) != 0x5)
-   if (pci_get_legacy_ide_irq(pci, 0) == irq ||
-   pci_get_legacy_ide_irq(pci, 1) == irq) {
+   if (ATA_PRIMARY_IRQ(pci) == irq ||
+   ATA_SECONDARY_IRQ(pci) == irq) {
pnp_dbg(>dev, "  legacy IDE device %s "
"using irq %d\n", pci_name(pci), irq);
return 1;

This is fine on the architectures that currently return an error from
pci_get_legacy_ide_irq() but will change to returning 15/14 instead,
because they do not support ISA devices, so pci_dev_uses_irq()
will never be called either.

Arnd


Re: [PATCH v3 1/2] asm-generic: Remove pci.h copying remaining code to x86

2022-07-18 Thread Geert Uytterhoeven
On Mon, Jul 18, 2022 at 2:41 AM Stafford Horne  wrote:
> The generic pci.h header now only provides a definition of
> pci_get_legacy_ide_irq which is used by architectures that support PNP.
> Of the architectures that use asm-generic/pci.h this is only x86.
>
> This patch removes the old pci.h in order to make room for a new
> pci.h to be used by arm64, riscv, openrisc, etc.
>
> The existing code in pci.h is moved out to x86.  On other architectures
> we clean up any outstanding references.
>
> Suggested-by: Arnd Bergmann 
> Link: 
> https://lore.kernel.org/lkml/CAK8P3a0JmPeczfmMBE__vn=Jbvf=nkbpvazcycyv40pzncj...@mail.gmail.com/
> Signed-off-by: Stafford Horne 
> ---
> Since v2:
>  - Remove pci_get_legacy_ide_irq in m68k
> Since v1:
>  - Remove pci_get_legacy_ide_irq for most architectures as its not needed.

>  arch/m68k/include/asm/pci.h|  2 --

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: BUG xfs_buf while running tests/xfs/435 (next-20220715)

2022-07-18 Thread Dave Chinner
On Mon, Jul 18, 2022 at 12:01:53PM +0530, Sachin Sant wrote:
> While running xfstests (specifically xfs/435) on a IBM Power server booted 
> with
> 5.19.0-rc6-next-20220715 following warnings are seen:
> 
> 
> [  110.954136] XFS (sdb2): Unmounting Filesystem
> [  110.968860] XFS (sdb1): Unmounting Filesystem
> [  111.115807] 
> =
> [  111.115817] BUG xfs_buf (Tainted: GE ): Objects remaining 
> in xfs_buf on __kmem_cache_shutdown()
> [  111.115824] 
> -
> [  111.115824] 
> [  111.115828] Slab 0x74cdc09a objects=170 used=1 
> fp=0x5f24a5e1 
> flags=0x1380200(slab|node=1|zone=0|lastcpupid=0x7)
> [  111.115840] CPU: 26 PID: 4704 Comm: modprobe Tainted: GE  
> 5.19.0-rc6-next-20220715 #3
> [  111.115849] Call Trace:
> [  111.115852] [c0002985b9a0] [c0830bec] dump_stack_lvl+0x70/0xa4 
> (unreliable)
> [  111.115867] [c0002985b9e0] [c04ef6f8] slab_err+0xd8/0xf0
> [  111.115877] [c0002985bad0] [c04f6cbc] 
> __kmem_cache_shutdown+0x1fc/0x560
> [  111.115884] [c0002985bbf0] [c04534c8] 
> kmem_cache_destroy+0xa8/0x1f0
> [  111.115893] [c0002985bc80] [c0080ccf30e4] 
> xfs_buf_terminate+0x2c/0x48 [xfs]
> [  111.115977] [c0002985bca0] [c0080cd6f55c] exit_xfs_fs+0x90/0x20b34 
> [xfs]
> [  111.116045] [c0002985bcd0] [c023b7e0] 
> sys_delete_module+0x1e0/0x3c0
> [  111.116053] [c0002985bdb0] [c003302c] 
> system_call_exception+0x17c/0x350
> [  111.116062] [c0002985be10] [c000c53c] 
> system_call_common+0xec/0x270
> [  111.116070] --- interrupt: c00 at 0x7fff8c158b88
> [  111.116075] NIP:  7fff8c158b88 LR: 00013adb0398 CTR: 
> 
> [  111.116080] REGS: c0002985be80 TRAP: 0c00   Tainted: GE
>(5.19.0-rc6-next-20220715)
> [  111.116086] MSR:  8280f033   
> CR: 24008282  XER: 
> [  111.116103] IRQMASK: 0 
> [  111.116103] GPR00: 0081 7e17dff0 7fff8c227300 
> 01003f2f0c18 
> [  111.116103] GPR04: 0800 000a 1999 
>  
> [  111.116103] GPR08: 7fff8c1b7830   
>  
> [  111.116103] GPR12:  7fff8c72ca50 00013adba650 
> 00013adba648 
> [  111.116103] GPR16:  0001  
> 00013adba428 
> [  111.116103] GPR20: 00013ade0068  7e17f948 
> 01003f2f02a0 
> [  111.116103] GPR24:  7e17f948 01003f2f0c18 
>  
> [  111.116103] GPR28:  01003f2f0bb0 01003f2f0c18 
> 01003f2f0bb0 
> [  111.116162] NIP [7fff8c158b88] 0x7fff8c158b88
> [  111.116166] LR [00013adb0398] 0x13adb0398
> [  111.116170] --- interrupt: c00
> [  111.116173] Disabling lock debugging due to kernel taint
> [  111.116184] Object 0x7e079655 @offset=18816
> [  111.116189] 
> =
> [  111.116193] BUG xfs_buf (Tainted: GB   E ): Objects remaining 
> in xfs_buf on __kmem_cache_shutdown()
> [  111.116198] 
> -
> [  111.116198] 
> [  111.116202] Slab 0x8186f78a objects=170 used=12 
> fp=0x8233ac7d 
> flags=0x1380200(slab|node=1|zone=0|lastcpupid=0x7)
> [  111.116210] CPU: 26 PID: 4704 Comm: modprobe Tainted: GB   E  
> 5.19.0-rc6-next-20220715 #3
> [  111.116216] Call Trace:
> [  111.116218] [c0002985b9a0] [c0830bec] dump_stack_lvl+0x70/0xa4 
> (unreliable)
> [  111.116227] [c0002985b9e0] [c04ef6f8] slab_err+0xd8/0xf0
> [  111.116234] [c0002985bad0] [c04f6cbc] 
> __kmem_cache_shutdown+0x1fc/0x560
> [  111.116241] [c0002985bbf0] [c04534c8] 
> kmem_cache_destroy+0xa8/0x1f0
> [  111.116248] [c0002985bc80] [c0080ccf30e4] 
> xfs_buf_terminate+0x2c/0x48 [xfs]
> [  111.116312] [c0002985bca0] [c0080cd6f55c] exit_xfs_fs+0x90/0x20b34 
> [xfs]
> [  111.116379] [c0002985bcd0] [c023b7e0] 
> sys_delete_module+0x1e0/0x3c0
> [  111.116386] [c0002985bdb0] [c003302c] 
> system_call_exception+0x17c/0x350
> [  111.116392] [c0002985be10] [c000c53c] 
> system_call_common+0xec/0x270
> [  111.116400] --- interrupt: c00 at 0x7fff8c158b88
> [  111.116404] NIP:  7fff8c158b88 LR: 00013adb0398 CTR: 
> 
> [  111.116409] REGS: c0002985be80 TRAP: 0c00   Tainted: GB   E
>(5.19.0-rc6-next-20220715)
> [  111.116414] MSR:  8280f033   
> CR: 24008282  XER: 
> [  111.116430] IRQMASK: 0 
> [  111.116430] GPR00: 0081 7e17dff0 7fff8c227300 
> 01003f2f0c18 
> [  

[PATCH] powerpc/pseries/vas: Fix comment typo

2022-07-18 Thread Jason Wang
The double `the' in line 807 is duplicated, remove one.

Signed-off-by: Jason Wang 
---
 arch/powerpc/platforms/pseries/vas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 91e7eda0606c..7e6e6dd2e33e 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -804,7 +804,7 @@ int vas_reconfig_capabilties(u8 type, int new_nr_creds)
 * The total number of available credits may be decreased or
 * increased with DLPAR operation. Means some windows have to be
 * closed / reopened. Hold the vas_pseries_mutex so that the
-* the user space can not open new windows.
+* user space can not open new windows.
 */
if (old_nr_creds <  new_nr_creds) {
/*
-- 
2.35.1



RE: mainline build failure of powerpc allmodconfig for prom_init_check

2022-07-18 Thread David Laight
From: Michael Ellerman
> Sent: 18 July 2022 05:41
...
> So we're memsetting all of args to 254, not zero.
> 
> That's happening because allmodconfig with gcc 12 enables
> CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I can't help feeling it would be better if that generated
a call to a memset64() function.
Saving loads of tests at the top of the function,
and (most of?) the constant expansion to 64bit.

Although and explicit 'stack clear' function would be better
for the kernel - since it would give the option of patching
it away at startup.

I really can't help feeling that initialising on-stack
arrays will kill performance.
While kernel stack frames have to be relatively small,
in userspace very large on-stack arrays can be allocated
(and correctly bound checked) knowing that the cost is
minimal (maybe a TLB miss).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH] KVM: PPC: Fix comment typo

2022-07-18 Thread Jason Wang
The double `that' in line 1604 is duplicated, removed one.

Signed-off-by: Jason Wang 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 514fd45c1994..73c6db20cd8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1601,7 +1601,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
  * is valid, it is written to the HPT as if an H_ENTER with the
  * exact flag set was done.  When the invalid count is non-zero
  * in the header written to the stream, the kernel will make
- * sure that that many HPTEs are invalid, and invalidate them
+ * sure that many HPTEs are invalid, and invalidate them
  * if not.
  */
 
-- 
2.35.1



Re: [PATCH v2] random: handle archrandom in plural words

2022-07-18 Thread Gabriel Paubert
On Mon, Jul 18, 2022 at 04:31:11PM +1000, Michael Ellerman wrote:
> "Jason A. Donenfeld"  writes:
> > The archrandom interface was originally designed for x86, which supplies
> > RDRAND/RDSEED for receiving random words into registers, resulting in
> > one function to generate an int and another to generate a long. However,
> > other architectures don't follow this.
> >
> > On arm64, the SMCCC TRNG interface can return between 1 and 3 words. On
> > s390, the CPACF TRNG interface can return between 1 and 32 words for the
> > same cost as for one word. On UML, the os_getrandom() interface can return
> > arbitrary amounts.
> >
> > So change the api signature to take a "words" parameter designating the
> > maximum number of words requested, and then return the number of words
> > generated.
> 
> On powerpc a word is 32-bits and a doubleword is 64-bits (at least
> according to the ISA). I think that's also true on other 64-bit
> architectures.

IIRC, this is (or was) not the case on Alpha, where word was defined as
16 bits. All assembly mnemonics had w for 16 bits, l for 32 bits, and q
for 64 bits.

Blame the PDP-11...

Gabriel

> 
> You could avoid any confusion by defining the API in terms of "longs"
> rather than "words".
> 
> But that's just a comment, see what others think.
> 
> >  arch/powerpc/include/asm/archrandom.h |  30 ++--
> >  arch/powerpc/kvm/book3s_hv.c  |   2 +-
> 
> Acked-by: Michael Ellerman  (powerpc)
> 
> cheers
 



BUG xfs_buf while running tests/xfs/435 (next-20220715)

2022-07-18 Thread Sachin Sant
While running xfstests (specifically xfs/435) on a IBM Power server booted with
5.19.0-rc6-next-20220715 following warnings are seen:


[  110.954136] XFS (sdb2): Unmounting Filesystem
[  110.968860] XFS (sdb1): Unmounting Filesystem
[  111.115807] 
=
[  111.115817] BUG xfs_buf (Tainted: GE ): Objects remaining in 
xfs_buf on __kmem_cache_shutdown()
[  111.115824] 
-
[  111.115824] 
[  111.115828] Slab 0x74cdc09a objects=170 used=1 fp=0x5f24a5e1 
flags=0x1380200(slab|node=1|zone=0|lastcpupid=0x7)
[  111.115840] CPU: 26 PID: 4704 Comm: modprobe Tainted: GE  
5.19.0-rc6-next-20220715 #3
[  111.115849] Call Trace:
[  111.115852] [c0002985b9a0] [c0830bec] dump_stack_lvl+0x70/0xa4 
(unreliable)
[  111.115867] [c0002985b9e0] [c04ef6f8] slab_err+0xd8/0xf0
[  111.115877] [c0002985bad0] [c04f6cbc] 
__kmem_cache_shutdown+0x1fc/0x560
[  111.115884] [c0002985bbf0] [c04534c8] 
kmem_cache_destroy+0xa8/0x1f0
[  111.115893] [c0002985bc80] [c0080ccf30e4] 
xfs_buf_terminate+0x2c/0x48 [xfs]
[  111.115977] [c0002985bca0] [c0080cd6f55c] exit_xfs_fs+0x90/0x20b34 
[xfs]
[  111.116045] [c0002985bcd0] [c023b7e0] 
sys_delete_module+0x1e0/0x3c0
[  111.116053] [c0002985bdb0] [c003302c] 
system_call_exception+0x17c/0x350
[  111.116062] [c0002985be10] [c000c53c] 
system_call_common+0xec/0x270
[  111.116070] --- interrupt: c00 at 0x7fff8c158b88
[  111.116075] NIP:  7fff8c158b88 LR: 00013adb0398 CTR: 
[  111.116080] REGS: c0002985be80 TRAP: 0c00   Tainted: GE  
 (5.19.0-rc6-next-20220715)
[  111.116086] MSR:  8280f033   CR: 
24008282  XER: 
[  111.116103] IRQMASK: 0 
[  111.116103] GPR00: 0081 7e17dff0 7fff8c227300 
01003f2f0c18 
[  111.116103] GPR04: 0800 000a 1999 
 
[  111.116103] GPR08: 7fff8c1b7830   
 
[  111.116103] GPR12:  7fff8c72ca50 00013adba650 
00013adba648 
[  111.116103] GPR16:  0001  
00013adba428 
[  111.116103] GPR20: 00013ade0068  7e17f948 
01003f2f02a0 
[  111.116103] GPR24:  7e17f948 01003f2f0c18 
 
[  111.116103] GPR28:  01003f2f0bb0 01003f2f0c18 
01003f2f0bb0 
[  111.116162] NIP [7fff8c158b88] 0x7fff8c158b88
[  111.116166] LR [00013adb0398] 0x13adb0398
[  111.116170] --- interrupt: c00
[  111.116173] Disabling lock debugging due to kernel taint
[  111.116184] Object 0x7e079655 @offset=18816
[  111.116189] 
=
[  111.116193] BUG xfs_buf (Tainted: GB   E ): Objects remaining in 
xfs_buf on __kmem_cache_shutdown()
[  111.116198] 
-
[  111.116198] 
[  111.116202] Slab 0x8186f78a objects=170 used=12 
fp=0x8233ac7d 
flags=0x1380200(slab|node=1|zone=0|lastcpupid=0x7)
[  111.116210] CPU: 26 PID: 4704 Comm: modprobe Tainted: GB   E  
5.19.0-rc6-next-20220715 #3
[  111.116216] Call Trace:
[  111.116218] [c0002985b9a0] [c0830bec] dump_stack_lvl+0x70/0xa4 
(unreliable)
[  111.116227] [c0002985b9e0] [c04ef6f8] slab_err+0xd8/0xf0
[  111.116234] [c0002985bad0] [c04f6cbc] 
__kmem_cache_shutdown+0x1fc/0x560
[  111.116241] [c0002985bbf0] [c04534c8] 
kmem_cache_destroy+0xa8/0x1f0
[  111.116248] [c0002985bc80] [c0080ccf30e4] 
xfs_buf_terminate+0x2c/0x48 [xfs]
[  111.116312] [c0002985bca0] [c0080cd6f55c] exit_xfs_fs+0x90/0x20b34 
[xfs]
[  111.116379] [c0002985bcd0] [c023b7e0] 
sys_delete_module+0x1e0/0x3c0
[  111.116386] [c0002985bdb0] [c003302c] 
system_call_exception+0x17c/0x350
[  111.116392] [c0002985be10] [c000c53c] 
system_call_common+0xec/0x270
[  111.116400] --- interrupt: c00 at 0x7fff8c158b88
[  111.116404] NIP:  7fff8c158b88 LR: 00013adb0398 CTR: 
[  111.116409] REGS: c0002985be80 TRAP: 0c00   Tainted: GB   E  
 (5.19.0-rc6-next-20220715)
[  111.116414] MSR:  8280f033   CR: 
24008282  XER: 
[  111.116430] IRQMASK: 0 
[  111.116430] GPR00: 0081 7e17dff0 7fff8c227300 
01003f2f0c18 
[  111.116430] GPR04: 0800 000a 1999 
 
[  111.116430] GPR08: 7fff8c1b7830   
 
[  111.116430] GPR12:  7fff8c72ca50 00013adba650 
00013adba648 
[  

Re: [PATCH v2] random: handle archrandom in plural words

2022-07-18 Thread Michael Ellerman
"Jason A. Donenfeld"  writes:
> The archrandom interface was originally designed for x86, which supplies
> RDRAND/RDSEED for receiving random words into registers, resulting in
> one function to generate an int and another to generate a long. However,
> other architectures don't follow this.
>
> On arm64, the SMCCC TRNG interface can return between 1 and 3 words. On
> s390, the CPACF TRNG interface can return between 1 and 32 words for the
> same cost as for one word. On UML, the os_getrandom() interface can return
> arbitrary amounts.
>
> So change the api signature to take a "words" parameter designating the
> maximum number of words requested, and then return the number of words
> generated.

On powerpc a word is 32-bits and a doubleword is 64-bits (at least
according to the ISA). I think that's also true on other 64-bit
architectures.

You could avoid any confusion by defining the API in terms of "longs"
rather than "words".

But that's just a comment, see what others think.

>  arch/powerpc/include/asm/archrandom.h |  30 ++--
>  arch/powerpc/kvm/book3s_hv.c  |   2 +-

Acked-by: Michael Ellerman  (powerpc)

cheers