Re: [PATCH] arch/powerpc/Kconfig: fix typo in select statement

2015-12-11 Thread Valentin Rothberg
Hi Daniel,

thank you for the links and the explanation.

On Dec 11 '15 19:13, Daniel Axtens wrote:
> Thanks Valentin.
> 
> Ironically I first noticed the spelling mistake in the original UBSAN
> patchset[1], but wanted it to work on powerpc straight away so I went
> with the mis-spelled version. It's since been corrected upstream[2]; and
> I was going to spin a v2 of my original patch with this fix, especially
> since I tagged the original patch as an RFC only. Regardless, thank you:
> this should keep -mm and next working in the mean time!

I haven't seen the fix, sorry for the noise then.

> > Fixes: 257e4ee82dbd ("powerpc: enable UBSAN support")
> Is this a SHA from linux-next?

Yes, it's Linux next.  Retrospectively, that was a bad idea :(

Kind regards,
 Valentin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Geert Uytterhoeven
On Wed, Dec 9, 2015 at 2:21 PM, Petr Mladek  wrote:
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -866,6 +866,28 @@ config LOG_CPU_MAX_BUF_SHIFT
>  13 =>   8 KB for each CPU
>  12 =>   4 KB for each CPU
>
> +config NMI_LOG_BUF_SHIFT
> +   int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
> +   range 10 21
> +   default 13
> +   depends on PRINTK && HAVE_NMI

Symbol NMI_LOG_BUF_SHIFT does not exist if its dependencies are not met.

> +   help
> + Select the size of a per-CPU buffer where NMI messages are temporary
> + stored. They are copied to the main log buffer in a safe context
> + to avoid a deadlock. The value defines the size as a power of 2.
> +
> + NMI messages are rare and limited. The largest one is when
> + a backtrace is printed. It usually fits into 4KB. Select
> + 8KB if you want to be on the safe side.
> +
> + Examples:
> +17 => 128 KB for each CPU
> +16 =>  64 KB for each CPU
> +15 =>  32 KB for each CPU
> +14 =>  16 KB for each CPU
> +13 =>   8 KB for each CPU
> +12 =>   4 KB for each CPU
> +
>  #
>  # Architectures with an unreliable sched_clock() should select this:
>  #
> diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c
> index 5465230b75ec..78c07d441b4e 100644
> --- a/kernel/printk/nmi.c
> +++ b/kernel/printk/nmi.c
> @@ -41,7 +41,8 @@ DEFINE_PER_CPU(printk_func_t, printk_func) = 
> vprintk_default;
>  static int printk_nmi_irq_ready;
>  atomic_t nmi_message_lost;
>
> -#define NMI_LOG_BUF_LEN (4096 - sizeof(atomic_t) - sizeof(struct irq_work))
> +#define NMI_LOG_BUF_LEN ((1 << CONFIG_NMI_LOG_BUF_SHIFT) - \
> +sizeof(atomic_t) - sizeof(struct irq_work))

kernel/printk/nmi.c:50:24: error: 'CONFIG_NMI_LOG_BUF_SHIFT'
undeclared here (not in a function)

E.g. efm32_defconfig
http://kisskb.ellerman.id.au/kisskb/buildresult/12565754/

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH 0/3] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table on PPC64 platform

2015-12-11 Thread Yongji Xie
Current vfio-pci implementation disallows to mmap
sub-page(size < PAGE_SIZE) MMIO BARs and MSI-X table. This is because
sub-page BARs' mmio page may be shared with other BARs and MSI-X table
should not be accessed directly from the guest for security reasons.

But these would cause some performance issues for mmio accesses in guest
when vfio passthrough sub-page BARs or BARs containing MSI-X table on
PPC64 platform. This is because PAGE_SIZE is 64KB by default on PPC64
platform and the big page may easily hit the sub-page MMIO
BARs' unmmapping and cause the unmmaping of the mmio page which
MSI-X table locate in, which lead to mmio emulation in host.

For sub-page MMIO BARs' unmmapping, this patch set enforces all MMIO
BARs to be page aligned on PPC64 platform so that sub-page BAR's mmio
page would not be shared with other BARs. Then we can mmap sub-page
MMIO BARs in vfio-pci driver if all MMIO BARs are page aligned.

For MSI-X table's unmmapping, we think MSI-X table is safe to access
directly from the guest with EEH mechanism enabled which can ensure that
a given pci device can only shoot the MSIs assigned for its PE. So
we add support for mmapping MSI-X table in vfio-pci driver if EEH is
supported.

With this patch set applied, we can get almost 100% improvement on
performance for mmio accesses when we passthrough sub-page BARs in
our test.

The last two patches in the patch set can be used by qemu to:
- Add support for a VFIO-PCI ioctl to indicate that platform
  support all PCI BARs are page aligned.
- Add support for a VFIO-PCI ioctl to indicate that platform
  support mmapping MSI-X table.

Yongji Xie (3):
  powerpc/pci: Enforce all MMIO BARs to be page aligned
  vfio-pci: Allow to mmap sub-page MMIO BARs if all MMIO BARs are page aligned
  vfio-pci: Allow to mmap MSI-X table if EEH is supported

 arch/powerpc/kernel/pci-common.c|   10 +-
 drivers/vfio/pci/vfio_pci.c |   15 +--
 drivers/vfio/pci/vfio_pci_private.h |   10 ++
 include/uapi/linux/vfio.h   |4 
 4 files changed, 36 insertions(+), 3 deletions(-)

-- 
1.7.9.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Geert Uytterhoeven
On Fri, Dec 11, 2015 at 1:41 PM, Petr Mladek  wrote:
> On Fri 2015-12-11 12:10:02, Geert Uytterhoeven wrote:
>> On Wed, Dec 9, 2015 at 2:21 PM, Petr Mladek  wrote:
>> > --- a/init/Kconfig
>> > +++ b/init/Kconfig
>> > @@ -866,6 +866,28 @@ config LOG_CPU_MAX_BUF_SHIFT
>> >  13 =>   8 KB for each CPU
>> >  12 =>   4 KB for each CPU
>> >
>> > +config NMI_LOG_BUF_SHIFT
>> > +   int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
>> > +   range 10 21
>> > +   default 13
>> > +   depends on PRINTK && HAVE_NMI
>>
>> Symbol NMI_LOG_BUF_SHIFT does not exist if its dependencies are not met.
>
> Åh, the NMI buffer is enabled on arm via NEED_PRINTK_NMI.
>
> The buffer is compiled when CONFIG_PRINTK_NMI is defined. I am going
> to fix it the following way:
>
>
> diff --git a/init/Kconfig b/init/Kconfig
> index efcff25a112d..61cfd96a3c96 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -870,7 +870,7 @@ config NMI_LOG_BUF_SHIFT
> int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
> range 10 21
> default 13
> -   depends on PRINTK && HAVE_NMI
> +   depends on PRINTK_NMI
> help
>   Select the size of a per-CPU buffer where NMI messages are temporary
>   stored. They are copied to the main log buffer in a safe context

Makes sense, as kernel/printk/nmi.c is compiled if PRINTK_NMI is set.

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/6] cxlflash: Fix to avoid virtual LUN failover failure

2015-12-11 Thread Manoj Kumar

On 12/10/2015 4:53 PM, Uma Krishnan wrote:

From: "Matthew R. Ochs" 
To remedy this scenario, provide feedback back to the application on
virtual LUN creation as to which ports the LUN may be accessed. LUN's
spanning both ports are candidates for a retry in a presence of an I/O
failure.

Signed-off-by: Matthew R. Ochs 


Acked-by: Manoj Kumar 



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Petr Mladek
On Fri 2015-12-11 12:10:02, Geert Uytterhoeven wrote:
> On Wed, Dec 9, 2015 at 2:21 PM, Petr Mladek  wrote:
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -866,6 +866,28 @@ config LOG_CPU_MAX_BUF_SHIFT
> >  13 =>   8 KB for each CPU
> >  12 =>   4 KB for each CPU
> >
> > +config NMI_LOG_BUF_SHIFT
> > +   int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
> > +   range 10 21
> > +   default 13
> > +   depends on PRINTK && HAVE_NMI
> 
> Symbol NMI_LOG_BUF_SHIFT does not exist if its dependencies are not met.

Åh, the NMI buffer is enabled on arm via NEED_PRINTK_NMI.

The buffer is compiled when CONFIG_PRINTK_NMI is defined. I am going
to fix it the following way:


diff --git a/init/Kconfig b/init/Kconfig
index efcff25a112d..61cfd96a3c96 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -870,7 +870,7 @@ config NMI_LOG_BUF_SHIFT
int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
range 10 21
default 13
-   depends on PRINTK && HAVE_NMI
+   depends on PRINTK_NMI
help
  Select the size of a per-CPU buffer where NMI messages are temporary
  stored. They are copied to the main log buffer in a safe context


Thanks a lot for report,
Petr
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH 1/3] powerpc/pci: Enforce all MMIO BARs to be page aligned

2015-12-11 Thread Yongji Xie
PAGE_SIZE is 64KB by default on PPC64 platform. When vfio
passthrough a pci device of which MMIO BARs are smaller than
64KB(PAGE_SIZE), guest would not handle the mmio accesses
to the BARs which leads to mmio emulations in host.

This is because vfio would not allow to passthrough one
BAR's mmio page which may be shared with other BARs.

To solve this performance issue, this patch enforces the
alignment of all MMIO BARs allocations to be at least PAGE_SIZE
on PPC64 platform because we have enough address space, so
that one BAR's mmio page would not be shared with other BARs.

Signed-off-by: Yongji Xie 
---
 arch/powerpc/kernel/pci-common.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..6989e0f 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1074,6 +1074,11 @@ static int skip_isa_ioresource_align(struct pci_dev *dev)
  * bits, so it's ok to allocate at, say, 0x2800-0x28ff,
  * but we want to try to avoid allocating at 0x2900-0x2bff
  * which might have be mirrored at 0x0100-0x03ff..
+ *
+ * And for PPC64, we enforce the alignment of all MMIO BARs
+ * allocations to be at least PAGE_SIZE(64KB). This would be
+ * helpful to improve performance when we passthrough
+ * a PCI device of which BARs are smaller than PAGE_SIZE
  */
 resource_size_t pcibios_align_resource(void *data, const struct resource *res,
resource_size_t size, resource_size_t align)
@@ -1087,7 +1092,10 @@ resource_size_t pcibios_align_resource(void *data, const 
struct resource *res,
if (start & 0x300)
start = (start + 0x3ff) & ~0x3ff;
}
-
+#ifdef CONFIG_PPC64
+   if (res->flags & IORESOURCE_MEM)
+   start = PAGE_ALIGN(start);
+#endif
return start;
 }
 EXPORT_SYMBOL(pcibios_align_resource);
-- 
1.7.9.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Arnd Bergmann
On Friday 11 December 2015 13:41:59 Petr Mladek wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index efcff25a112d..61cfd96a3c96 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -870,7 +870,7 @@ config NMI_LOG_BUF_SHIFT
> int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
> range 10 21
> default 13
> -   depends on PRINTK && HAVE_NMI
> +   depends on PRINTK_NMI
> help
>   Select the size of a per-CPU buffer where NMI messages are temporary
>   stored. They are copied to the main log buffer in a safe context
> 
> 

Acked-by: Arnd Bergmann 

I found this on linux-next as well today and came to the same conclusion.

Arnd
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Russell King - ARM Linux
On Fri, Dec 11, 2015 at 02:57:25PM -0800, Andrew Morton wrote:
> This is a bit messy.  NEED_PRINTK_NMI is an added-on hack for one
> particular arm variant.  From the changelog:
> 
>"One exception is arm where the deferred printing is used for
> printing backtraces even without NMI.  For this purpose, we define
> NEED_PRINTK_NMI Kconfig flag.  The alternative printk_func is
> explicitly set when IPI_CPU_BACKTRACE is handled."
> 
> 
> - why does arm needs deferred printing for backtraces?
> 
> - why is this specific to CONFIG_CPU_V7M?
> 
> - can this Kconfig logic be cleaned up a bit?

I think this comes purely from this attempt to apply another round of
cleanups to the nmi backtrace work I did.

As I explained when I did that work, the vast majority of ARM platforms
are unable to trigger anything like a NMI - the FIQ is something that's
generally a property of the secure monitor, and is not accessible to
Linux.  However, there are platforms where it is accessible.

The work to add the FIQ-based variant never happened (I've no idea what
happened to that part, Daniel seems to have lost interest in working on
it.)  So, what we have is the IRQ-based variant merged in mainline, which
would be the fallback for the "FIQ not available" cases, and I carry a
local hack in my tree which provides the FIQ-based version - but if it
were to trigger, it takes out all interrupts (hence why I've not merged
my hack.)

I think the reason that the FIQ-based variant has never really happened
is that hooking into the interrupt controller code to clear down the FIQ
creates such a horrid layering violation, and also a locking mess that
I suspect it's just been given up with.

However, I've found my "hack" useful - it's turned a number of totally
undebuggable hangs (where one CPU silently hangs leaving the others
running with no way to find out where the hung CPU is) into something
that can be debugged.

Now, when we end up triggering the IRQ-based variant, we could already
be in a situation where IRQs are off for the local CPU, so the IRQ is
never delivered.  Others decided that it wasn't acceptable to wait 10sec
for the local CPU to time out, and (iirc) we'd also loose the local CPUs
backtrace in certain situations.

I'm personally happy with the existing code, and I've been wondering why
there's this effort to apply further cleanups - to me, the changelogs
don't seem to make that much sense, unless we want to start using
printk() extensively in NMI functions - using the generic nmi backtrace
code surely gets us something that works across all architectures...

I've been assuming that I've missed something, which is why I've not
said anything on that point until now.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Andrew Morton
On Fri, 11 Dec 2015 13:41:59 +0100 Petr Mladek  wrote:

> On Fri 2015-12-11 12:10:02, Geert Uytterhoeven wrote:
> > On Wed, Dec 9, 2015 at 2:21 PM, Petr Mladek  wrote:
> > > --- a/init/Kconfig
> > > +++ b/init/Kconfig
> > > @@ -866,6 +866,28 @@ config LOG_CPU_MAX_BUF_SHIFT
> > >  13 =>   8 KB for each CPU
> > >  12 =>   4 KB for each CPU
> > >
> > > +config NMI_LOG_BUF_SHIFT
> > > +   int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
> > > +   range 10 21
> > > +   default 13
> > > +   depends on PRINTK && HAVE_NMI
> > 
> > Symbol NMI_LOG_BUF_SHIFT does not exist if its dependencies are not met.
> 
> __h, the NMI buffer is enabled on arm via NEED_PRINTK_NMI.
> 
> The buffer is compiled when CONFIG_PRINTK_NMI is defined. I am going
> to fix it the following way:
> 
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index efcff25a112d..61cfd96a3c96 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -870,7 +870,7 @@ config NMI_LOG_BUF_SHIFT
>   int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
>   range 10 21
>   default 13
> - depends on PRINTK && HAVE_NMI
> + depends on PRINTK_NMI
>   help
> Select the size of a per-CPU buffer where NMI messages are temporary
> stored. They are copied to the main log buffer in a safe context

I'm wondering why we're building kernel/printk/nmi.o if HAVE_NMI is not
set.

obj-$(CONFIG_PRINTK_NMI)+= nmi.o

and

config PRINTK_NMI
def_bool y
depends on PRINTK
depends on HAVE_NMI || NEED_PRINTK_NMI

This is a bit messy.  NEED_PRINTK_NMI is an added-on hack for one
particular arm variant.  From the changelog:

   "One exception is arm where the deferred printing is used for
printing backtraces even without NMI.  For this purpose, we define
NEED_PRINTK_NMI Kconfig flag.  The alternative printk_func is
explicitly set when IPI_CPU_BACKTRACE is handled."


- why does arm needs deferred printing for backtraces?

- why is this specific to CONFIG_CPU_V7M?

- can this Kconfig logic be cleaned up a bit?

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-11 Thread Andrew Morton
On Fri, 11 Dec 2015 23:21:13 + Russell King - ARM Linux 
 wrote:

> On Fri, Dec 11, 2015 at 02:57:25PM -0800, Andrew Morton wrote:
> > This is a bit messy.  NEED_PRINTK_NMI is an added-on hack for one
> > particular arm variant.  From the changelog:
> > 
> >"One exception is arm where the deferred printing is used for
> > printing backtraces even without NMI.  For this purpose, we define
> > NEED_PRINTK_NMI Kconfig flag.  The alternative printk_func is
> > explicitly set when IPI_CPU_BACKTRACE is handled."
> > 
> > 
> > - why does arm needs deferred printing for backtraces?
> > 
> > - why is this specific to CONFIG_CPU_V7M?
> > 
> > - can this Kconfig logic be cleaned up a bit?
> 
> I think this comes purely from this attempt to apply another round of
> cleanups to the nmi backtrace work I did.
> 
> As I explained when I did that work, the vast majority of ARM platforms
> are unable to trigger anything like a NMI - the FIQ is something that's
> generally a property of the secure monitor, and is not accessible to
> Linux.  However, there are platforms where it is accessible.

OK, thanks.  So "not needed at present, might be needed in the future,
useful for out-of-tree debug code"?

> I'm personally happy with the existing code, and I've been wondering why
> there's this effort to apply further cleanups - to me, the changelogs
> don't seem to make that much sense, unless we want to start using
> printk() extensively in NMI functions - using the generic nmi backtrace
> code surely gets us something that works across all architectures...

Yes, I was scratching my head over that.  The patchset takes an nmi-safe
all-cpu-backtrace and generalises that into an nmi-safe printk.  That
*sounds* like a good thing to do but yes, some additional justification
would be helpful.  What real-world value does this patchset really
bring to real-world users?

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH net-next v2] Driver for IBM System i/p VNIC protocol

2015-12-11 Thread David Miller
From: Thomas Falcon 
Date: Tue,  8 Dec 2015 11:52:19 -0600

> +static long h_reg_sub_crq(unsigned long unit_address, unsigned long token,
> +   unsigned long length, unsigned long *number,
> +   unsigned long *irq)
> +{
> + long rc;
> + unsigned long retbuf[PLPAR_HCALL_BUFSIZE];

Please declare local variables from longest to shortest line, otherwise
known as "reverse christmas tree" order.

Audit this in your entire driver.

> + pool->rx_buff = kcalloc(pool->size, sizeof(struct ibmvnic_rx_buff),
> + GFP_KERNEL);

Allocation failures not checked until much later in this function, where
several other resources have been allocated meanwhile.  That doesn't
make any sense at all.

> + adapter->closing = 1;

Please use type 'bool' and values 'true' and 'false' for boolean
values.

Audit this in your entire driver.

> + if (ip_hdr(skb)->version == 4)
> + tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV4;
> + else if (ip_hdr(skb)->version == 6)
> + tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV6;
> +

You cannot dereference the protocol header of the SKB without
first checking the skb->protocol value, otherwise you're looking
at garbage.

> +static int ibmvnic_set_mac(struct net_device *netdev, void *p)
> +{
> + struct ibmvnic_adapter *adapter = netdev_priv(netdev);
> + struct sockaddr *addr = p;
> + union ibmvnic_crq crq;
> +
> + if (!is_valid_ether_addr(addr->sa_data))
> + return -EADDRNOTAVAIL;
> +
> + memset(, 0, sizeof(crq));
> + crq.change_mac_addr.first = IBMVNIC_CRQ_CMD;
> + crq.change_mac_addr.cmd = CHANGE_MAC_ADDR;
> + ether_addr_copy(_mac_addr.mac_addr[0], addr->sa_data);
> + ibmvnic_send_crq(adapter, );
> +
> + return 0;
> +}

You are responsible for copying the new MAC address into dev->dev_addr
on success.

> +static int ibmvnic_mii_ioctl(struct net_device *netdev, struct ifreq *ifr,
> +  int cmd)
> +{
 ...
> + return 0;
> +}
> +
> +static int ibmvnic_ioctl(struct net_device *netdev, struct ifreq *ifr, int 
> cmd)
> +{
> + switch (cmd) {
> + case SIOCGMIIPHY:
> + case SIOCGMIIREG:
> + case SIOCSMIIREG:
> + return ibmvnic_mii_ioctl(netdev, ifr, cmd);
> + default:
> + return -EOPNOTSUPP;
> + }
> +}

This really doesn't make any sense.  Please just delete this.  You
don't support MII reads or writes because they logically don't make
sense on this device.

> +static struct net_device_stats *ibmvnic_get_stats(struct net_device *dev)
> +{
> + struct ibmvnic_adapter *adapter = netdev_priv(dev);
> +
> + /* only return the current stats */
> + return >net_stats;
> +}

The default method does this for you as long as you properly use
net_device's embedded stats, therefore you don't need to provide this
at all.

That's all I have any energy for, and as you can see nobody else wants
to even try to review this driver.

It's going to take a lot of respins and time before this driver is
ready for upstream inclusion.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: crypto/nx842: Ignore queue overflow informative error

2015-12-11 Thread Haren Myneni
On 12/07/2015 11:34 AM, Dan Streetman wrote:
> On Sun, Dec 6, 2015 at 2:46 AM, Haren Myneni  wrote:
>>
>> NX842 coprocessor sets bit 3 if queue is overflow. It is just for
>> information to the user. So the driver prints this informative message
>> and ignores it.
>>
>> Signed-off-by: Haren Myneni 
>>
>> diff --git a/arch/powerpc/include/asm/icswx.h 
>> b/arch/powerpc/include/asm/icswx.h
>> index 9f8402b..d1a2a2d 100644
>> --- a/arch/powerpc/include/asm/icswx.h
>> +++ b/arch/powerpc/include/asm/icswx.h
>> @@ -164,6 +164,7 @@ struct coprocessor_request_block {
>>  #define ICSWX_INITIATED(0x8)
>>  #define ICSWX_BUSY (0x4)
>>  #define ICSWX_REJECTED (0x2)
>> +#define ICSWX_BIT3 (0x1)   /* undefined or set from XERSO. */
> 
> Since this isn't defined by the icswx rfc workbook, it probably
> shouldn't go here, it would make more sense to put it into nx-842.h
> and call it something like "ICSWX_NX_QUEUE_OVERFLOW" or similar
> NX-specific meaningful name.

This bit is defined in icswx RFC. Hence I think we should define this in 
icswx.h.

"Bit 3 of CR0 is undefined or set from XERSO."

Please ignore this patch. Talking to HW team, whenever gets floating point 
overflow from FPU, XER[S0] will be set and it stays until other FPU operation 
is executed. It is typical behaviour on powerpc. ixswx RFC says coprocessor can 
set this XER[S0] to bit 3 and NX is doing this. I think it should have ignored 
this bit.
 
"An implementation is permitted to set bit 3 of CR0 from XERSO."

So,the issue is not queue overflow problem, but NX is copying XER[S0] which is 
no use and nothing to do with compression. We need to ignore this bit since it 
can be set with other valuable return status. I will repost new patch with the 
proper description. 

Thanks
Haren 
 

> 
>>
>>  static inline int icswx(__be32 ccw, struct coprocessor_request_block *crb)
>>  {
>> diff --git a/drivers/crypto/nx/nx-842-powernv.c 
>> b/drivers/crypto/nx/nx-842-powernv.c
>> index 9ef51fa..321b8e8 100644
>> --- a/drivers/crypto/nx/nx-842-powernv.c
>> +++ b/drivers/crypto/nx/nx-842-powernv.c
>> @@ -442,6 +442,15 @@ static int nx842_powernv_function(const unsigned char 
>> *in, unsigned int inlen,
>>  (unsigned int)ccw,
>>  (unsigned int)be32_to_cpu(crb->ccw));
>>
>> +   /*
>> +* NX842 coprocessor uses 3rd bit to report queue overflow which is
>> +* not an error, just for information to user. So, ignore this bit.
>> +*/
> 
> a meaningfully named bit define means you don't need to explain it
> with a comment :-)
> 
> However, I suggest that you do explain *why* a queue overflow isn't an
> error - either here or (probably better) at the #define of the bit -
> because that isn't obvious.
> 
>> +   if (ret & ICSWX_BIT3) {
>> +   pr_info_ratelimited("842 coprocessor queue overflow\n");
> 
> if it's not an error, should this be pr_debug_ratelimited instead?
> What is an end user expected to do if they see this msg in the log?
> 
>> +   ret &= ~ICSWX_BIT3;
>> +   }
>> +
>> switch (ret) {
>> case ICSWX_INITIATED:
>> ret = wait_for_csb(wmem, csb);
>>
>>
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3] EDAC, mpc85xx: Make mpc85xx-pci-edac a platform device

2015-12-11 Thread Borislav Petkov
On Thu, Dec 10, 2015 at 01:07:12PM -0600, Scott Wood wrote:
> Originally the mpc85xx-pci-edac driver bound directly to the PCI
> controller node.
> 
> Commit 905e75c46dba5f30 ("powerpc/fsl-pci: Unify pci/pcie
> initialization code") turned the PCI controller code into a platform
> device.  Since we can't have two drivers binding to the same device,
> the edac code was changed to be called into as a library-style
> submodule.  However, this doesn't work if the edac driver is built as a
> module.
> 
> Commit 8d8fcba6d1eab ("EDAC: Rip out the edac_subsys reference
> counting") exposed another problem with this approach --
> mpc85xx_pci_err_probe() was being called in the same early boot phase
> that the PCI controller is initialized, rather than in the
> device_initcall phase that the EDAC layer expects.  This caused a crash
> on boot.
> 
> To fix this, the PCI controller code now creates a child platform
> device specifically for EDAC, which the mpc85xx-pci-edac driver binds
> to.
> 
> Signed-off-by: Scott Wood 
> Cc: Jia Hongtao 
> Cc: Borislav Petkov 
> Cc: Johannes Thumshirn 
> Cc: Michael Ellerman 
> ---
> v3: Fix build with CONFIG_PCI disabled
> 
>  arch/powerpc/sysdev/fsl_pci.c | 28 +++-
>  arch/powerpc/sysdev/fsl_pci.h |  9 -
>  drivers/edac/mpc85xx_edac.c   | 38 +-
>  include/linux/fsl/edac.h  |  8 
>  4 files changed, 68 insertions(+), 15 deletions(-)
>  create mode 100644 include/linux/fsl/edac.h

Applied and rebased the tree so that this fix comes before the
rip-out-reference counting patch so that there's no hole during
bisection.

Thanks guys.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Crypto/nx842: Ignore invalid XER[S0] return error

2015-12-11 Thread Haren Myneni

NX842 coprocessor sets 3rd bit in CR register with XER[S0] which is
nothing to do with NX request. On powerpc, XER[S0] will be set if
overflow in FPU and stays until another floating point operation is
executed. Since this bit can be set with other valuable return status,
ignore this XER[S0] value.

One of other bits (INITIATED, BUSY or REJECTED) will be returned for
any given NX request.

Signed-off-by: Haren Myneni 

diff --git a/arch/powerpc/include/asm/icswx.h b/arch/powerpc/include/asm/icswx.h
index 9f8402b..27e588f 100644
--- a/arch/powerpc/include/asm/icswx.h
+++ b/arch/powerpc/include/asm/icswx.h
@@ -164,6 +164,7 @@ struct coprocessor_request_block {
 #define ICSWX_INITIATED(0x8)
 #define ICSWX_BUSY (0x4)
 #define ICSWX_REJECTED (0x2)
+#define ICSWX_XERS0(0x1)   /* undefined or set from XERSO. */
 
 static inline int icswx(__be32 ccw, struct coprocessor_request_block *crb)
 {
diff --git a/drivers/crypto/nx/nx-842-powernv.c 
b/drivers/crypto/nx/nx-842-powernv.c
index 9ef51fa..6bc33ae 100644
--- a/drivers/crypto/nx/nx-842-powernv.c
+++ b/drivers/crypto/nx/nx-842-powernv.c
@@ -442,6 +442,16 @@ static int nx842_powernv_function(const unsigned char *in, 
unsigned int inlen,
 (unsigned int)ccw,
 (unsigned int)be32_to_cpu(crb->ccw));
 
+   /*
+* NX842 coprocessor sets 3rd bit in CR register with XER[S0].
+* Setting XER[S0] happens if overflow in FPU and stays until
+* other floating operation is executed. XER[S0] value is nothing
+* to NX and no use to user. Since this bit can be set with other
+* return values, ignore this error.
+*/
+   if (ret & ICSWX_XERS0)
+   ret &= ~ICSWX_XERS0;
+
switch (ret) {
case ICSWX_INITIATED:
ret = wait_for_csb(wmem, csb);
@@ -454,10 +464,6 @@ static int nx842_powernv_function(const unsigned char *in, 
unsigned int inlen,
pr_err_ratelimited("ICSWX rejected\n");
ret = -EPROTO;
break;
-   default:
-   pr_err_ratelimited("Invalid ICSWX return code %x\n", ret);
-   ret = -EPROTO;
-   break;
}
 
if (!ret)



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/2] tracing: introduce TRACE_EVENT_FN_COND macro

2015-12-11 Thread Steven Rostedt
On Mon, 7 Dec 2015 17:42:42 +0300
Denis Kirjanov  wrote:

> On 11/24/15, Denis Kirjanov  wrote:
> > TRACE_EVENT_FN can't be used in some circumstances
> > like invoking trace functions from offlined CPU due
> > to RCU usage.
> >
> > This patch adds the TRACE_EVENT_FN_COND macro
> > to make such trace points conditional.
> >
> > Signed-off-by: Denis Kirjanov   
> 
> Hi Steven,
> are you going to take this series through your tree?
> 

I haven't gotten an ack for the powerpc side yet. I can take the
tracing patch, but I'm not pulling in powerpc without an ack from one
of the maintainers. Maybe I got it already, but I don't see it in my
INBOX in this thread.

-- Steve
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] arch/powerpc/Kconfig: fix typo in select statement

2015-12-11 Thread Daniel Axtens
Thanks Valentin.

Ironically I first noticed the spelling mistake in the original UBSAN
patchset[1], but wanted it to work on powerpc straight away so I went
with the mis-spelled version. It's since been corrected upstream[2]; and
I was going to spin a v2 of my original patch with this fix, especially
since I tagged the original patch as an RFC only. Regardless, thank you:
this should keep -mm and next working in the mean time!

> Fixes: 257e4ee82dbd ("powerpc: enable UBSAN support")
Is this a SHA from linux-next?

Regards,
Daniel

[1]: https://lkml.org/lkml/2015/12/9/1008
[2]: https://lkml.org/lkml/2015/12/10/422


> Signed-off-by: Valentin Rothberg 
> ---
> Detected with ./scripts/checkkconfigsymbols.py
>
>  arch/powerpc/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b0420a8c458e..4de86d145f17 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -157,7 +157,7 @@ config PPC
>   select EDAC_ATOMIC_SCRUB
>   select ARCH_HAS_DMA_SET_COHERENT_MASK
>   select HAVE_ARCH_SECCOMP_FILTER
> - select ARCH_HAS_UBSAN_SANTIZE_ALL
> + select ARCH_HAS_UBSAN_SANITIZE_ALL
>  
>  config GENERIC_CSUM
>   def_bool CPU_LITTLE_ENDIAN
> -- 
> 2.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3] EDAC, mpc85xx: Make mpc85xx-pci-edac a platform device

2015-12-11 Thread Johannes Thumshirn
On Thu, Dec 10, 2015 at 01:07:12PM -0600, Scott Wood wrote:
> Originally the mpc85xx-pci-edac driver bound directly to the PCI
> controller node.
> 
> Commit 905e75c46dba5f30 ("powerpc/fsl-pci: Unify pci/pcie
> initialization code") turned the PCI controller code into a platform
> device.  Since we can't have two drivers binding to the same device,
> the edac code was changed to be called into as a library-style
> submodule.  However, this doesn't work if the edac driver is built as a
> module.
> 
> Commit 8d8fcba6d1eab ("EDAC: Rip out the edac_subsys reference
> counting") exposed another problem with this approach --
> mpc85xx_pci_err_probe() was being called in the same early boot phase
> that the PCI controller is initialized, rather than in the
> device_initcall phase that the EDAC layer expects.  This caused a crash
> on boot.
> 
> To fix this, the PCI controller code now creates a child platform
> device specifically for EDAC, which the mpc85xx-pci-edac driver binds
> to.
> 
> Signed-off-by: Scott Wood 
> Cc: Jia Hongtao 
> Cc: Borislav Petkov 
> Cc: Johannes Thumshirn 
> Cc: Michael Ellerman 
> ---
> v3: Fix build with CONFIG_PCI disabled
> 
>  arch/powerpc/sysdev/fsl_pci.c | 28 +++-
>  arch/powerpc/sysdev/fsl_pci.h |  9 -
>  drivers/edac/mpc85xx_edac.c   | 38 +-
>  include/linux/fsl/edac.h  |  8 
>  4 files changed, 68 insertions(+), 15 deletions(-)
>  create mode 100644 include/linux/fsl/edac.h
> 
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index 610f472..a1ac80b 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -21,10 +21,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1255,6 +1257,25 @@ void fsl_pcibios_fixup_phb(struct pci_controller *phb)
>  #endif
>  }
>  
> +static int add_err_dev(struct platform_device *pdev)
> +{
> + struct platform_device *errdev;
> + struct mpc85xx_edac_pci_plat_data pd = {
> + .of_node = pdev->dev.of_node
> + };
> +
> + errdev = platform_device_register_resndata(>dev,
> +"mpc85xx-pci-edac",
> +PLATFORM_DEVID_AUTO,
> +pdev->resource,
> +pdev->num_resources,
> +, sizeof(pd));
> + if (IS_ERR(errdev))
> + return PTR_ERR(errdev);
> +
> + return 0;
> +}
> +
>  static int fsl_pci_probe(struct platform_device *pdev)
>  {
>   struct device_node *node;
> @@ -1262,8 +1283,13 @@ static int fsl_pci_probe(struct platform_device *pdev)
>  
>   node = pdev->dev.of_node;
>   ret = fsl_add_bridge(pdev, fsl_pci_primary == node);
> + if (ret)
> + return ret;
>  
> - mpc85xx_pci_err_probe(pdev);
> + ret = add_err_dev(pdev);
> + if (ret)
> + dev_err(>dev, "couldn't register error device: %d\n",
> + ret);
>  
>   return 0;
>  }
> diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
> index c1cec77..1515885 100644
> --- a/arch/powerpc/sysdev/fsl_pci.h
> +++ b/arch/powerpc/sysdev/fsl_pci.h
> @@ -130,15 +130,6 @@ void fsl_pci_assign_primary(void);
>  static inline void fsl_pci_assign_primary(void) {}
>  #endif
>  
> -#ifdef CONFIG_EDAC_MPC85XX
> -int mpc85xx_pci_err_probe(struct platform_device *op);
> -#else
> -static inline int mpc85xx_pci_err_probe(struct platform_device *op)
> -{
> - return -ENOTSUPP;
> -}
> -#endif
> -
>  #ifdef CONFIG_FSL_PCI
>  extern int fsl_pci_mcheck_exception(struct pt_regs *);
>  #else
> diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
> index 3eab063..b7139c1 100644
> --- a/drivers/edac/mpc85xx_edac.c
> +++ b/drivers/edac/mpc85xx_edac.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -238,10 +239,12 @@ static irqreturn_t mpc85xx_pci_isr(int irq, void 
> *dev_id)
>   return IRQ_HANDLED;
>  }
>  
> -int mpc85xx_pci_err_probe(struct platform_device *op)
> +static int mpc85xx_pci_err_probe(struct platform_device *op)
>  {
>   struct edac_pci_ctl_info *pci;
>   struct mpc85xx_pci_pdata *pdata;
> + struct mpc85xx_edac_pci_plat_data *plat_data;
> + struct device_node *of_node;
>   struct resource r;
>   int res = 0;
>  
> @@ -266,7 +269,15 @@ int mpc85xx_pci_err_probe(struct platform_device *op)
>   pdata->name = "mpc85xx_pci_err";
>   pdata->irq = NO_IRQ;
>  
> - if (mpc85xx_pcie_find_capability(op->dev.of_node) > 0)
> + plat_data = op->dev.platform_data;
> + if (!plat_data) {
> + 

[RFC PATCH 3/3] vfio-pci: Allow to mmap MSI-X table if EEH is supported

2015-12-11 Thread Yongji Xie
Current vfio-pci implementation disallows to mmap MSI-X table in
case that user get to touch this directly.

However, EEH mechanism could ensure that a given pci device
can only shoot the MSIs assigned for its PE and guest kernel also
would not write to MSI-X table in pci_enable_msix() because
para-virtualization on PPC64 platform. So MSI-X table is safe to
access directly from the guest with EEH mechanism enabled.

This patch adds support for this case and allow to mmap MSI-X
table if EEH is supported on PPC64 platform.

And we also add a VFIO_DEVICE_FLAGS_PCI_MSIX_MMAP flag to notify
userspace that it's safe to mmap MSI-X table.

Signed-off-by: Yongji Xie 
---
 drivers/vfio/pci/vfio_pci.c |5 -
 drivers/vfio/pci/vfio_pci_private.h |5 +
 include/uapi/linux/vfio.h   |2 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index dbcad99..85d9980 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -446,6 +446,9 @@ static long vfio_pci_ioctl(void *device_data,
if (vfio_pci_bar_page_aligned())
info.flags |= VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED;
 
+   if (vfio_msix_table_mmap_enabled())
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_MSIX_MMAP;
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
@@ -871,7 +874,7 @@ static int vfio_pci_mmap(void *device_data, struct 
vm_area_struct *vma)
if (phys_len < PAGE_SIZE || req_start + req_len > phys_len)
return -EINVAL;
 
-   if (index == vdev->msix_bar) {
+   if (index == vdev->msix_bar && !vfio_msix_table_mmap_enabled()) {
/*
 * Disallow mmaps overlapping the MSI-X table; users don't
 * get to touch this directly.  We could find somewhere
diff --git a/drivers/vfio/pci/vfio_pci_private.h 
b/drivers/vfio/pci/vfio_pci_private.h
index 319352a..835619e 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -74,6 +74,11 @@ static inline bool vfio_pci_bar_page_aligned(void)
return IS_ENABLED(CONFIG_PPC64);
 }
 
+static inline bool vfio_msix_table_mmap_enabled(void)
+{
+   return IS_ENABLED(CONFIG_EEH);
+}
+
 extern void vfio_pci_intx_mask(struct vfio_pci_device *vdev);
 extern void vfio_pci_intx_unmask(struct vfio_pci_device *vdev);
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 1fc8066..289e662 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -173,6 +173,8 @@ struct vfio_device_info {
 #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)   /* vfio-amba device */
 /* Platform support all PCI MMIO BARs to be page aligned */
 #define VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED (1 << 4)
+/* Platform support mmapping PCI MSI-X vector table */
+#define VFIO_DEVICE_FLAGS_PCI_MSIX_MMAP(1 << 5)
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
-- 
1.7.9.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH 2/3] vfio-pci: Allow to mmap sub-page MMIO BARs if all MMIO BARs are page aligned

2015-12-11 Thread Yongji Xie
Current vfio-pci implementation disallows to mmap
sub-page(size < PAGE_SIZE) MMIO BARs because these BARs' mmio page
may be shared with other BARs.

But we should allow to mmap these sub-page MMIO BARs if all MMIO BARs
are page aligned which leads the BARs' mmio page would not be shared
with other BARs.

This patch adds support for this case and we also add a
VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED flag to notify userspace that
platform supports all MMIO BARs to be page aligned.

Signed-off-by: Yongji Xie 
---
 drivers/vfio/pci/vfio_pci.c |   10 +-
 drivers/vfio/pci/vfio_pci_private.h |5 +
 include/uapi/linux/vfio.h   |2 ++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 32b88bd..dbcad99 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -443,6 +443,9 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev->reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   if (vfio_pci_bar_page_aligned())
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED;
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
@@ -479,7 +482,8 @@ static long vfio_pci_ioctl(void *device_data,
 VFIO_REGION_INFO_FLAG_WRITE;
if (IS_ENABLED(CONFIG_VFIO_PCI_MMAP) &&
pci_resource_flags(pdev, info.index) &
-   IORESOURCE_MEM && info.size >= PAGE_SIZE)
+   IORESOURCE_MEM && (info.size >= PAGE_SIZE ||
+   vfio_pci_bar_page_aligned()))
info.flags |= VFIO_REGION_INFO_FLAG_MMAP;
break;
case VFIO_PCI_ROM_REGION_INDEX:
@@ -855,6 +859,10 @@ static int vfio_pci_mmap(void *device_data, struct 
vm_area_struct *vma)
return -EINVAL;
 
phys_len = pci_resource_len(pdev, index);
+
+   if (vfio_pci_bar_page_aligned())
+   phys_len = PAGE_ALIGN(phys_len);
+
req_len = vma->vm_end - vma->vm_start;
pgoff = vma->vm_pgoff &
((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
diff --git a/drivers/vfio/pci/vfio_pci_private.h 
b/drivers/vfio/pci/vfio_pci_private.h
index 0e7394f..319352a 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -69,6 +69,11 @@ struct vfio_pci_device {
 #define is_irq_none(vdev) (!(is_intx(vdev) || is_msi(vdev) || is_msix(vdev)))
 #define irq_is(vdev, type) (vdev->irq_type == type)
 
+static inline bool vfio_pci_bar_page_aligned(void)
+{
+   return IS_ENABLED(CONFIG_PPC64);
+}
+
 extern void vfio_pci_intx_mask(struct vfio_pci_device *vdev);
 extern void vfio_pci_intx_unmask(struct vfio_pci_device *vdev);
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 751b69f..1fc8066 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -171,6 +171,8 @@ struct vfio_device_info {
 #define VFIO_DEVICE_FLAGS_PCI  (1 << 1)/* vfio-pci device */
 #define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)/* vfio-platform device */
 #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)   /* vfio-amba device */
+/* Platform support all PCI MMIO BARs to be page aligned */
+#define VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED (1 << 4)
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
-- 
1.7.9.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev