Re: [PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states

2016-06-14 Thread Benjamin Herrenschmidt

Re: [PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states

2016-06-14 Thread Benjamin Herrenschmidt
On Tue, 2016-06-14 at 16:17 +0530, Shreyas B Prabhu wrote:
> I ignored adding this check because this is part of initcall and we are
> unlikely to run out of memory at this state. But I'll add the check in
> next version.

Why do you malloc the u64 array and not the string pointer array?
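
The point being discussed is that allocation results should be checked even on an init path, and that parallel per-state arrays should be handled consistently. A minimal userspace sketch of that pattern follows. The names `struct idle_tables`, `idle_tables_alloc` and `idle_tables_free` are hypothetical, and `calloc` stands in for the kernel allocator; this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for per-state tables; not taken from the patch. */
struct idle_tables {
    uint64_t *values;      /* one 64-bit value per idle state */
    const char **names;    /* one name pointer per idle state */
    size_t nr;
};

/* Allocate both tables; returns 0 on success, -1 if either allocation fails. */
static int idle_tables_alloc(struct idle_tables *t, size_t nr)
{
    assert(t != NULL && nr > 0);

    t->values = calloc(nr, sizeof(*t->values));
    t->names = calloc(nr, sizeof(*t->names));
    if (!t->values || !t->names) {
        /* free(NULL) is a no-op, so a partial failure is cleaned up safely. */
        free(t->values);
        free(t->names);
        t->values = NULL;
        t->names = NULL;
        t->nr = 0;
        return -1;
    }
    t->nr = nr;
    return 0;
}

/* Release both tables and reset the container. */
static void idle_tables_free(struct idle_tables *t)
{
    free(t->values);
    free(t->names);
    t->values = NULL;
    t->names = NULL;
    t->nr = 0;
}
```

Allocating both arrays the same way keeps the error path in one place, which is presumably the motivation behind the question about why only one of the two is allocated dynamically.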

Re: [PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states

2016-06-13 Thread Benjamin Herrenschmidt
On Wed, 2016-06-08 at 11:54 -0500, Shreyas B. Prabhu wrote:
>  /*
>   * States for dedicated partition case.
>   */
> @@ -167,6 +183,8 @@ static int powernv_add_idle_states(void)
>   int nr_idle_states = 1; /* Snooze */
>   int dt_idle_states;
>   u32 *latency_ns, *residency_ns,

Re: [PATCH v5 1/6] qspinlock: powerpc support qspinlock

2016-06-06 Thread Benjamin Herrenschmidt
On Mon, 2016-06-06 at 17:59 +0200, Peter Zijlstra wrote: > On Fri, Jun 03, 2016 at 02:33:47PM +1000, Benjamin Herrenschmidt wrote: > > > >  - For the above, can you show (or describe) where the qspinlock > >    improves things compared to our current locks. > So cu

Re: [PATCH v5 08/11] powerpc/powernv: Add platform support for stop instruction

2016-06-05 Thread Benjamin Herrenschmidt
On Thu, 2016-06-02 at 07:38 -0500, Shreyas B. Prabhu wrote: > @@ -61,8 +72,13 @@ save_sprs_to_stack: > * Note all register i.e per-core, per-subcore or per-thread is saved > * here since any thread in the core might wake up first > */ > +BEGIN_FTR_SECTION > +   mfspr 

Re: [PATCH v5 1/6] qspinlock: powerpc support qspinlock

2016-06-02 Thread Benjamin Herrenschmidt
On Fri, 2016-06-03 at 12:10 +0800, xinhui wrote: > On 2016年06月03日 09:32, Benjamin Herrenschmidt wrote: > > On Fri, 2016-06-03 at 11:32 +1000, Benjamin Herrenschmidt wrote: > >> On Thu, 2016-06-02 at 17:22 +0800, Pan Xinhui wrote: > >>> > >>> Base code to e

Re: [PATCH v5 1/6] qspinlock: powerpc support qspinlock

2016-06-02 Thread Benjamin Herrenschmidt
On Fri, 2016-06-03 at 11:32 +1000, Benjamin Herrenschmidt wrote: > On Thu, 2016-06-02 at 17:22 +0800, Pan Xinhui wrote: > > > > Base code to enable qspinlock on powerpc. this patch add some > > #ifdef > > here and there. Although there is no paravirt related code, we

Re: [PATCH v5 1/6] qspinlock: powerpc support qspinlock

2016-06-02 Thread Benjamin Herrenschmidt
On Thu, 2016-06-02 at 17:22 +0800, Pan Xinhui wrote:
> Base code to enable qspinlock on powerpc. this patch add some #ifdef
> here and there. Although there is no paravirt related code, we can
> successfully build a qspinlock kernel after apply this patch.

This is missing the IO_SYNC stuff ... It

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-12 Thread Benjamin Herrenschmidt
On Thu, 2016-05-12 at 11:58 +0200, Christian Lamparter wrote: > > > http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-ref-big-little-endian-herrenschmidt.odp > > > > but there are at least two more twists that you completely missed here: > > > > - Some architectures

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-09 Thread Benjamin Herrenschmidt
On Mon, 2016-05-09 at 21:06 +0200, Christian Lamparter via Linuxppc-dev wrote: > > I ran into the following issues: > - gadget.c uses ioread32_rep [0] & iowrite32_rep [1]. >   This is interesting because both of these functions actually use >   the __raw_io* on powerpc.

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-09 Thread Benjamin Herrenschmidt
On Mon, 2016-05-09 at 17:08 +0200, Arnd Bergmann wrote:
> Unfortunately, I don't see any way this could be done in MIPS specific
> code: There is typically a byteswap between the internal bus and the PCI
> bus on big-endian MIPS systems, so the PCI MMIO ends up being little-endian,

Ugh ... not

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-09 Thread Benjamin Herrenschmidt
On Mon, 2016-05-09 at 13:39 +0300, Felipe Balbi wrote:
> and patch all drivers similarly? Shouldn't arch/mips itself deal with
> it and hide it from drivers ?

Not sure what you mean, but we never had "endian neutral" accessors. It would be a bit of an endeavour and we already have so many

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-09 Thread Benjamin Herrenschmidt
On Mon, 2016-05-09 at 12:36 +0200, Arnd Bergmann wrote:
> I think we can simply make this set of accessors architecture-dependent
> (MIPS vs. the rest of the world) to revert ARM and PowerPC back to
> the working version.

Or use writel_be which mips seems to support... Really, make it a BE
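
The suggestion here is to use an accessor with a fixed byte order instead of one whose result depends on the architecture. A userspace sketch of what fixed-order accessors do is shown below. The helper names `store32_le`, `store32_be` and `load32_be` are hypothetical; the register is modelled as a plain byte array, and none of this is the kernel's `writel`/`writel_be` implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value least-significant byte first, whatever the CPU is. */
static void store32_le(uint8_t *reg, uint32_t v)
{
    assert(reg != 0);
    reg[0] = (uint8_t)(v & 0xff);
    reg[1] = (uint8_t)((v >> 8) & 0xff);
    reg[2] = (uint8_t)((v >> 16) & 0xff);
    reg[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Store a 32-bit value most-significant byte first, whatever the CPU is. */
static void store32_be(uint8_t *reg, uint32_t v)
{
    assert(reg != 0);
    reg[0] = (uint8_t)((v >> 24) & 0xff);
    reg[1] = (uint8_t)((v >> 16) & 0xff);
    reg[2] = (uint8_t)((v >> 8) & 0xff);
    reg[3] = (uint8_t)(v & 0xff);
}

/* Read back a 32-bit value that the device defines as big-endian. */
static uint32_t load32_be(const uint8_t *reg)
{
    assert(reg != 0);
    return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
           ((uint32_t)reg[2] << 8) | (uint32_t)reg[3];
}
```

Because the byte order is spelled out with shifts, the result is the same on big-endian and little-endian hosts, which is the property a driver wants when the device register layout is fixed.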

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-08 Thread Benjamin Herrenschmidt
On Sun, 2016-05-08 at 13:44 +0200, Christian Lamparter wrote: > On Sunday, May 08, 2016 08:40:55 PM Benjamin Herrenschmidt wrote: > > > > On Sun, 2016-05-08 at 00:54 +0200, Christian Lamparter via Linuxppc-dev  > > wrote: > > > > > > I've been looking i

Re: [PATCH v11 04/60] sparc/PCI: Use correct offset for bus address to resource

2016-05-05 Thread Benjamin Herrenschmidt
On Thu, 2016-05-05 at 08:53 -0700, Yinghai Lu wrote: > For powerpc io port, we still need extra offset from resource address > to final address. > > resource_size_t offset = > ((resource_size_t)vma->vm_pgoff) << PAGE_SHIFT; > > +    if (mmap_state == pci_mmap_io) { > +   

Re: [PATCH v11 04/60] sparc/PCI: Use correct offset for bus address to resource

2016-05-03 Thread Benjamin Herrenschmidt
On Tue, 2016-05-03 at 15:52 -0700, Yinghai Lu wrote: > BenH and DavidM, > Are you ok to let /proc/bus/pci/devices to expose resource value > instead of > BAR value? > powerpc already expose MMIO as resource value, but still keep IO as > BAR value? > > Or can we just dump /proc/bus/pci support

Re: [RFC/PATCH] of: of_find_node_by_name - stop dropping reference to 'from' node

2016-04-21 Thread Benjamin Herrenschmidt
On Thu, 2016-04-21 at 15:35 -0700, Frank Rowand wrote:
> No.  It is correct for of_find_by_name() to call of_node_put() for
> the from argument.  The callers should be fixed.

I would argue that if everybody makes the same mistake then our interface is wrong. In that case I wrote it so I think I
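
The convention being debated is that the lookup returns its result with a reference held and drops the reference on `from` on the caller's behalf. A small userspace model of that convention is sketched below. `struct node`, `node_get`, `node_put` and `find_by_name` are hypothetical stand-ins, not the kernel's device-tree API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference-counted node in a singly linked list. */
struct node {
    const char *name;
    int refcount;
    struct node *next;
};

/* Take a reference; a NULL node is accepted and ignored. */
static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

/* Drop a reference; a NULL node is accepted and ignored. */
static void node_put(struct node *n)
{
    if (n) {
        assert(n->refcount > 0);
        n->refcount--;
    }
}

/*
 * Find the next node called `name` after `from`, or starting at `head`
 * when `from` is NULL. The result is returned with a reference held and
 * the reference on `from` is dropped, mirroring the debated convention.
 */
static struct node *find_by_name(struct node *head, struct node *from,
                                 const char *name)
{
    struct node *n;

    for (n = from ? from->next : head; n; n = n->next)
        if (strcmp(n->name, name) == 0)
            break;
    node_get(n);
    node_put(from);
    return n;
}
```

With this convention an iteration loop that feeds each result back in as `from` stays balanced, but a caller that passes a node it still intends to use has to take its own reference first. That asymmetry is presumably the mistake the message says callers keep making.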

Re: [PATCH v11 44/60] PCI: Add alt_size ressource allocation support

2016-04-08 Thread Benjamin Herrenschmidt
On Thu, 2016-04-07 at 17:56 -0700, Linus Torvalds wrote: > Maybe the kernel should just accept the smaller alignment. If the > minimum alignment we use is bigger than necessary, then we're just > wrong about it, and perhaps we should just use the smaller alignment > that the bios used. > > So

Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

2016-03-19 Thread Benjamin Herrenschmidt
On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
>  static int nr_chips;
> +static DEFINE_PER_CPU(unsigned int, chip_id);
>
>  /*
>   * Note: The set of pstates consists of contiguous integers, the
> @@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void *data)

Re: [RFC PATCH v2 3/3] vfio-pci: Allow to mmap MSI-X table if EEH is supported

2016-01-04 Thread Benjamin Herrenschmidt
On Mon, 2016-01-04 at 14:07 -0700, Alex Williamson wrote: > On Thu, 2015-12-31 at 16:50 +0800, Yongji Xie wrote: > > Current vfio-pci implementation disallows to mmap MSI-X > > table in case that user get to touch this directly. > > > > However, EEH mechanism can ensure that a given pci device >

Re: [RFC PATCH 3/3] vfio-pci: Allow to mmap MSI-X table if EEH is supported

2015-12-17 Thread Benjamin Herrenschmidt
On Thu, 2015-12-17 at 14:41 -0700, Alex Williamson wrote: > > So I think it is safe to mmap/passthrough MSI-X table on PPC64 > > platform. > > And I'm not sure whether other architectures can ensure these two  > > points.  > > There is another consideration, which is the API exposed to the user.

Re: [PATCH 2/2] powerpc: tracing: don't trace hcalls on offline CPUs

2015-12-07 Thread Benjamin Herrenschmidt
On Mon, 2015-12-07 at 15:52 -0500, Steven Rostedt wrote:
> > + TP_CONDITION(cpu_online(smp_processor_id())),
> > +

This should probably be some kind of __raw version though, hcalls can be called in contexts where the debug stuff in smp_processor_id() isn't safe (or preempt enabled).

Cheers,

Re: [PATCH v3 0/3] virtio DMA API core stuff

2015-11-19 Thread Benjamin Herrenschmidt
On Thu, 2015-11-19 at 23:38 +, David Woodhouse wrote: > > I understand that POWER and other platforms don't currently have a > clean way to indicate that certain device don't have translation. And I > understand that we may end up with a *quirk* which ensures that the DMA > API does the right

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 20:46 -0800, Andy Lutomirski wrote:
> Me neither.  At least it wouldn't be a regression, but it's still
> crappy.
>
> I think that arm is fine, at least.  I was unable to find an arm QEMU
> config that has any problems with my patches.

Ok, give me a few days for my headache

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 15:44 -0800, Andy Lutomirski wrote:
> > What about partition <-> partition virtio such as what we could do on
> > PAPR systems. That would have the weak barrier bit.
>
> Is it partition <-> partition, bypassing IOMMU?

No.

> I think I'd settle for just something that

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 10:54 -0800, Andy Lutomirski wrote: >  > Does that work on powerpc on existing kernels? > > Anyway, here's another crazy idea: make the quirk assume that the > IOMMU is bypasses if and only if the weak barriers bit is set on > systems that are missing the new DT binding.

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 14:43 +0200, Michael S. Tsirkin wrote: > But not virtio-pci I think - that's broken for that usecase since we use > weaker barriers than required for real IO, as these have measureable > overhead.  We could have a feature "is a real PCI device", > that's completely

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 11:27 +0100, Joerg Roedel wrote: > > You have the same problem when real PCIe devices appear that speak > virtio. I think the only real (still not very nice) solution is to add a > quirk to powerpc platform code that sets noop dma-ops for the existing > virtio

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Tue, 2015-11-10 at 10:45 +0100, Knut Omang wrote:
> Can something be done by means of PCIe capabilities?
> ATS (Address Translation Support) seems like a natural choice?

Euh no... ATS is something else completely

Cheers,
Ben.

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-10 Thread Benjamin Herrenschmidt
On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote: > > We could do it the other way around: on powerpc, if a PCI device is in > that range and doesn't have the "bypass" property at all, then it's > assumed to bypass the IOMMU.  This means that everything that > currently works continues

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-09 Thread Benjamin Herrenschmidt
On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote: > > Which leaves the special case of Xen, where even preexisting devices > don't bypass the IOMMU.  Can we keep this specific to powerpc and > sparc?  On x86, this problem is basically nonexistent, since the IOMMU > is properly

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-09 Thread Benjamin Herrenschmidt
On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote:
> /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
> static const struct pci_device_id virtio_pci_id_table[] = {
>     { PCI_DEVICE(0x1af4, PCI_ANY_ID) },
>     { 0 }
> };
>
> Can we match on that range?
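
The quoted table matches any device ID under vendor 0x1af4, while the comment names the range 0x1000 through 0x10FF. A sketch of what matching on that range could look like is given below. The macro names and `is_virtio_pci_id` are hypothetical; only the numeric values come from the quoted code, and this is not an actual platform quirk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_VENDOR_ID  0x1af4u  /* vendor ID from the quoted table */
#define VIRTIO_DEV_FIRST  0x1000u  /* first device ID named in the comment */
#define VIRTIO_DEV_LAST   0x10ffu  /* last device ID named in the comment */

/* True if the vendor/device pair falls in the range named in the comment. */
static bool is_virtio_pci_id(uint16_t vendor, uint16_t device)
{
    return vendor == VIRTIO_VENDOR_ID &&
           device >= VIRTIO_DEV_FIRST && device <= VIRTIO_DEV_LAST;
}
```

A check like this can run from platform code without the virtio driver being loaded, since it only needs the IDs read from PCI config space.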

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-09 Thread Benjamin Herrenschmidt
On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote: > The problem here is that in some of the problematic cases the virtio > driver may not even be loaded.  If someone runs an L1 guest with an > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then > *boom* L1 crashes.  (Same

Re: [PATCH v4 0/6] virtio core DMA API conversion

2015-11-09 Thread Benjamin Herrenschmidt
So ... I've finally tried to sort that out for powerpc and I can't find a way to make that work that isn't a complete pile of stinking shit. I'm very tempted to go back to my original idea: virtio itself should indicate its "bypassing ability" via the virtio config space or some other bit (like

Re: [PATCH] powerpc: allow cross-compilation of ppc64 kernel

2015-11-06 Thread Benjamin Herrenschmidt
On Fri, 2015-11-06 at 15:09 -0600, Scott Wood wrote: > On Thu, 2015-11-05 at 12:47 +0100, Laurent Vivier wrote: > > When I try to cross compile a ppc64 kernel, it generally > > fails on the VDSO stage. This is true for powerpc64 cross- > > compiler, but also when I try to build a ppc64le kernel >

Re: [GIT] Networking

2015-11-02 Thread Benjamin Herrenschmidt
On Mon, 2015-11-02 at 13:30 -0800, Andy Lutomirski wrote:
> I'll stop making inane arguments if you stop bashing arguments I
> didn't make. :) I said the helpers were useful for multiplication (by
> which I meant both signed and unsigned) and, to a lesser extent, for
> signed addition and
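
The helpers under discussion are not shown in this excerpt. Assuming they are overflow-checking arithmetic helpers, a sketch of the idea using the GCC/Clang builtins `__builtin_mul_overflow` and `__builtin_add_overflow` is given below. The wrapper names `checked_mul_size` and `checked_add_int` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true and stores n * size in *out if the product fits in size_t. */
static bool checked_mul_size(size_t n, size_t size, size_t *out)
{
    assert(out != NULL);
    return !__builtin_mul_overflow(n, size, out);
}

/* Returns true and stores a + b in *out if the sum fits in int. */
static bool checked_add_int(int a, int b, int *out)
{
    assert(out != NULL);
    return !__builtin_add_overflow(a, b, out);
}
```

Multiplication is the case where a hand-written check is easiest to get wrong, which fits the quoted remark that helpers are most useful there.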

Re: [PATCH v8 00/61] PCI: Resource allocation cleanup for v4.4

2015-10-31 Thread Benjamin Herrenschmidt
On Sat, 2015-10-31 at 14:51 -0400, David Miller wrote: > This is the way OF seems to work. > > It maps all of the ROMs essentially to the same address range, but > only enables one at a time as it inspects the ROMs and builds the > device tree during power-on. > > Then it makes sure all of them

Re: [PATCH v3 0/3] virtio DMA API core stuff

2015-10-28 Thread Benjamin Herrenschmidt
On Wed, 2015-10-28 at 16:40 +0900, Christian Borntraeger wrote: > We have discussed that at kernel summit. I will try to implement a dummy > dma_ops for > s390 that does 1:1 mapping and Ben will look into doing some quirk to handle > "old" > code in addition to also make it possible to mark

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Benjamin Herrenschmidt
On Tue, 2015-10-27 at 19:30 -0700, Nishanth Aravamudan wrote: > On 28.10.2015 [11:20:05 +0900], Benjamin Herrenschmidt wrote: > > On Tue, 2015-10-27 at 18:54 -0700, Nishanth Aravamudan wrote: > > > > > > In "bypass" mode, what TCE size is used? Is it gua

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Benjamin Herrenschmidt
On Tue, 2015-10-27 at 18:54 -0700, Nishanth Aravamudan wrote:
> In "bypass" mode, what TCE size is used? Is it guaranteed to be 4K?

None :-) The TCEs are completely bypassed. You get a N:M linear mapping of all memory starting at 1<<59 PCI side.

> Seems like this would be a different platform
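
The reply describes bypass mode as a linear window on the PCI side starting at 1<<59, with no TCE lookup involved. A sketch of that address arithmetic follows, reading the message as "bus address = memory address offset by 1<<59". The names `BYPASS_BASE`, `phys_to_bypass_bus` and `bypass_bus_to_phys` are hypothetical, and this is not the platform's actual DMA code.

```c
#include <assert.h>
#include <stdint.h>

#define BYPASS_BASE (UINT64_C(1) << 59)  /* window start named in the message */

/* Translate a memory address to its bus address in the bypass window. */
static uint64_t phys_to_bypass_bus(uint64_t phys)
{
    assert(phys < BYPASS_BASE);
    return phys | BYPASS_BASE;  /* same as phys + BYPASS_BASE for such phys */
}

/* Translate a bypass-window bus address back to a memory address. */
static uint64_t bypass_bus_to_phys(uint64_t bus)
{
    assert((bus & BYPASS_BASE) != 0);
    return bus & ~BYPASS_BASE;
}
```

Since the translation is a fixed offset, there is no per-page table entry and therefore no TCE page size to report, which matches the "None" answer above.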

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-27 Thread Benjamin Herrenschmidt
On Wed, 2015-10-28 at 10:43 +1100, Julian Calaby wrote:
> Hi Nishanth,
> You'll be CCing the maintainers of each architecture on the patches to
> add the functions, so if they do have specific requirements, I'm sure
> they'll let you know or provide patches.

That sort of across-all-arch change

Re: [PATCH v7 03/60] sparc/PCI: Unify pci_register_region()

2015-10-21 Thread Benjamin Herrenschmidt
On Wed, 2015-10-21 at 18:27 -0700, David Miller wrote: > From: Yinghai Lu > Date: Wed, 21 Oct 2015 11:16:53 -0700 > > > otherwise we need to compare res with pbm->mem_space or pbm > ->mem64_space > > to get direct parent for request_resource_conflict() calling in > >

Re: [PATCH] fbcon: initialize blink interval before calling fb_set_par

2015-10-20 Thread Benjamin Herrenschmidt
On Fri, 2015-10-09 at 15:08 +, Scot Doyle wrote: > Since commit 27a4c827c34ac4256a190cc9d24607f953c1c459 > fbcon: use the cursor blink interval provided by vt > > a PPC64LE kernel fails to boot when fbcon_add_cursor_timer uses an > uninitialized ops->cur_blink_jiffies. Prevent by

Re: [PATCH 3/3] powerpc/mm: Add page soft dirty tracking

2015-10-17 Thread Benjamin Herrenschmidt
On Sat, 2015-10-17 at 17:49 +0530, Aneesh Kumar K.V wrote: > This will break after > https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135298.html > > > A good option is to drop this patch from the series and let Andrew take > the first two patches. You can send an updated version of

Re: [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking

2015-10-16 Thread Benjamin Herrenschmidt
On Fri, 2015-10-16 at 14:11 -0700, Andrew Morton wrote: > I grabbed these patches, but they're more a ppc thing than a core > kernel thing. I can merge them into 4.3 with suitable acks or drop > them if they turn up in the powerpc tree. Or something else? I'm happy for you to keep the generic

Re: [PATCH] powerpc: on crash, kexec'ed kernel needs all CPUs are online

2015-10-16 Thread Benjamin Herrenschmidt
On Fri, 2015-10-16 at 09:57 +0200, Laurent Vivier wrote: > For me the problem is: as these CPUs are offline, I guess the core has > been switched to 1 thread per core, so the CPUs (1 to 7 for core 0) > don't exist anymore, how can we return them to OPAL ? Another option is to make the new kernel

Re: [PATCH] powerpc: on crash, kexec'ed kernel needs all CPUs are online

2015-10-16 Thread Benjamin Herrenschmidt
On Fri, 2015-10-16 at 09:48 +0200, Laurent Vivier wrote: > > Yes, we know :) > > On the crash, as the CPUs are offline, kernel doesn't call > opal_return_cpu(), so for OPAL all these CPU are always in the > kernel. Hrm and they may even be in winkle state, so basically off... waking them up

Re: [PATCH 1/3] mm: clearing pte in clear_soft_dirty()

2015-10-16 Thread Benjamin Herrenschmidt
On Fri, 2015-10-16 at 14:07 +0200, Laurent Dufour wrote: > As mentioned in the commit 56eecdb912b5 ("mm: Use > ptep/pmdp_set_numa() > for updating _PAGE_NUMA bit"), architecture like ppc64 doesn't do > tlb flush in set_pte/pmd functions. > > So when dealing with existing pte in clear_soft_dirty,

Re: [PATCH] fbcon: initialize blink interval before calling fb_set_par

2015-10-09 Thread Benjamin Herrenschmidt
On Fri, 2015-10-09 at 15:08 +, Scot Doyle wrote: > Since commit 27a4c827c34ac4256a190cc9d24607f953c1c459 > fbcon: use the cursor blink interval provided by vt > > a PPC64LE kernel fails to boot when fbcon_add_cursor_timer uses an > uninitialized ops->cur_blink_jiffies. Prevent by

Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Benjamin Herrenschmidt
On Fri, 2015-10-02 at 14:04 -0700, Nishanth Aravamudan wrote: > Right, I did start with your advice and tried that approach, but it > turned out I was wrong about the actual issue at the time. The problem > for NVMe isn't actually the starting address alignment (which it can > handle not being

Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Benjamin Herrenschmidt
On Fri, 2015-10-02 at 13:09 -0700, Nishanth Aravamudan wrote: > 1) add a generic dma_get_page_shift implementation that just returns > PAGE_SHIFT So you chose to return the granularity of the iommu to the driver rather than providing a way for the driver to request a specific alignment for DMA
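
For readers skimming the thread, the quoted step 1 ("a generic dma_get_page_shift implementation that just returns PAGE_SHIFT") could be sketched roughly as below. This is a userspace illustration under assumptions, not the code from the series: struct device here is a stand-in, the iommu_page_shift field is invented for the example, and PAGE_SHIFT is fixed at 12 purely for demonstration.

```c
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumption: 4K CPU pages, for illustration only */

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
    unsigned int iommu_page_shift; /* invented field; 0 means "no override" */
};

/* Generic fallback: report the CPU page size as the DMA granularity. */
static inline unsigned int generic_dma_get_page_shift(const struct device *dev)
{
    (void)dev; /* unused in the generic case */
    return PAGE_SHIFT;
}

/* Hypothetical platform override: prefer the IOMMU page size if one is set. */
static inline unsigned int dma_get_page_shift(const struct device *dev)
{
    if (dev && dev->iommu_page_shift)
        return dev->iommu_page_shift;
    return generic_dma_get_page_shift(dev);
}
```

The reply questions exactly this design: the driver is told the IOMMU granularity and has to adapt, instead of asking the DMA layer for the alignment it needs.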
