Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
are very simply implemented, but requiring callers to pass certain
parameters like count and align, and taking a boolean parameter to
check __GFP_NOWARN in the allocation flags. So every function call
duplicates similar work:
/*
[ Per discussion at v1, we decide to add two new functions and start
replacing callers one by one. For this series, it only touches the
dma-direct part. And instead of merging two PATCHes, I still keep
them separate so that we may easily revert PATCH-2 if anything bad
happens as last time
The addresses within a single page are always contiguous, so it's
not so necessary to always allocate one single page from CMA area.
Since the CMA area has a limited predefined size of space, it may
run out of space in heavy use cases, where there might be quite a
lot CMA pages being allocated for
On Thu, May 23, 2019 at 08:59:30PM -0600, dann frazier wrote:
> > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > > index b2a87905846d..21f39a6cb04f 100644
> > > --- a/kernel/dma/contiguous.c
> > > +++ b/kernel/dma/contiguous.c
> > > @@ -214,6 +214,54 @@ bool
On Thu, May 23, 2019 at 7:52 PM dann frazier wrote:
>
> On Mon, May 6, 2019 at 4:35 PM Nicolin Chen wrote:
> >
> > Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> > are very simply implemented, but requiring callers to pass certain
> > parameters like count and align, and
On Mon, May 6, 2019 at 4:35 PM Nicolin Chen wrote:
>
> Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> are very simply implemented, but requiring callers to pass certain
> parameters like count and align, and taking a boolean parameter to
> check __GFP_NOWARN in the
Keep the HPET-based hardlockup detector disabled unless explicitly enabled
via a command-line argument. If such parameter is not given, the
initialization of the hpet-based hardlockup detector fails and the NMI
watchdog will fallback to use the perf-based implementation.
Given that
The HPET-based hardlockup detector relies on the TSC to determine if an
observed NMI interrupt was originated by HPET timer. Hence, this detector
can no longer be used with an unstable TSC.
In such case, permanently stop the HPET-based hardlockup detector and
start the perf-based detector.
A recent change introduced a new member to struct irq_cfg to specify the
delivery mode of an interrupt. Supporting the configuration of the
delivery mode would require adding a third argument to prepare_irte().
Instead, simply take a pointer to a irq_cfg data structure as a the only
argument.
Hi,
This is the third attempt to demonstrate the implementation of a
hardlockup detector driven by the High-Precision Event Timer. This
version provides a few but important updates with respect the previous
version (please refer to the Changes since v3 section). The three initial
implementations
The generic hardlockup detector is based on perf. It also provides a set
of weak stubs that CPU architectures can override. Add a shim hardlockup
detector for x86 that selects between perf and hpet implementations.
Specifically, this shim implementation is needed for the HPET-based
hardlockup
This is the initial implementation of a hardlockup detector driven by an
HPET timer. This initial implementation includes functions to control the
timer via its registers. It also requests such timer, installs an NMI
interrupt handler and performs the initial configuration of the timer.
The
The only direct method to determine whether an HPET timer caused an
interrupt is to read the Interrupt Status register. Unfortunately,
reading HPET registers is slow and, therefore, it is not recommended to
read them while in NMI context. Furthermore, status is not available if
the interrupt is
Prepare hardlockup_panic_setup() to handle a comma-separated list of
options. This is needed to pass options to specific implementations of the
hardlockup detector.
Cc: "H. Peter Anvin"
Cc: Ashok Raj
Cc: Andi Kleen
Cc: Tony Luck
Cc: Peter Zijlstra
Cc: Clemens Ladisch
Cc: Arnd Bergmann
Cc:
When interrupt remapping is enabled in the system, the MSI interrupt
message must follow a special format the IOMMU can understand. Hence,
utilize the functionality provided by the IOMMU driver for such purpose.
The first step is to determine whether interrupt remapping is enabled
by looking for
When interrupt remapping is enabled, MSI interrupt messages must follow a
special format that the IOMMU can understand. Hence, when the HPET hard
lockup detector is used with interrupt remapping, it must also follow this
special format.
The IOMMU, given the information about a particular
When there are more than one implementation of the NMI watchdog, there may
be situations in which switching from one to another is needed (e.g., if
the time-stamp counter becomes unstable, the HPET-based NMI watchdog can
no longer be used.
The perf-based implementation of the hardlockup detector
Each CPU should be monitored for hardlockups every watchdog_thresh seconds.
Since all the CPUs in the system are monitored by the same timer and the
timer interrupt is rotated among the monitored CPUs, the timer must expire
every watchdog_thresh/N seconds; where N is the number of monitored CPUs.
The current default implementation of the hardlockup detector assumes that
it is implemented using perf events. However, the hardlockup detector can
be driven by other sources of non-maskable interrupts (e.g., a properly
configured timer).
Group and wrap in #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
Until now, the delivery mode of MSI interrupts is set to the default
mode set in the APIC driver. However, there are no restrictions in hardware
to configure each interrupt with a different delivery mode. Specifying the
delivery mode per interrupt is useful when one is interested in changing
the
HPET timer 2 will be used to drive the HPET-based hardlockup detector.
Reserve such timer to ensure it cannot be used by user space programs or
for clock events.
When looking for MSI-capable timers for clock events, skip timer 2 if
the HPET hardlockup detector is selected.
Cc: "H. Peter Anvin"
The procedure to detect hardlockups is independent of the underlying
mechanism that generates the non-maskable interrupt used to drive the
detector. Thus, it can be put in a separate, generic function. In this
manner, it can be invoked by various implementations of the NMI watchdog.
For this
Instead of setting the timer period directly in hpet_set_periodic(), add a
new helper function hpet_set_comparator() that only sets the accumulator
and comparator. hpet_set_periodic() will only prepare the timer for
periodic mode and leave the expiration programming to
hpet_set_comparator().
This
It is easier to compute the expiration times of an HPET timer by using
its frequency (i.e., the number of times it ticks in a second) than its
period, as given in the capabilities register.
In addition to the HPET char driver, the HPET-based hardlockup detector
will also need to know the timer's
Until now, the delivery mode of APIC interrupts is set to the default
mode set in the APIC driver. However, there are no restrictions in hardware
to configure each interrupt with a different delivery mode. Specifying the
delivery mode per interrupt is useful when one is interested in changing
the
In order to allow hpet_writel() to be used by other components (e.g.,
the HPET-based hardlockup detector) expose it in the HPET header file.
No empty definition is needed if CONFIG_HPET is not selected as all
existing callers select such config symbol.
Cc: "H. Peter Anvin"
Cc: Ashok Raj
Cc:
Implement the initial configuration of the timer to be used by the
hardlockup detector. Return a data structure with a description of the
timer; this information is subsequently used by the hardlockup detector.
Only provide the timer if it supports Front Side Bus interrupt delivery.
This
Add a NMI_WATCHDOG as a new category of NMI handler. This new category
is to be used with the HPET-based hardlockup detector. This detector
does not have a direct way of checking if the HPET timer is the source of
the NMI. Instead it indirectly estimate it using the time-stamp counter.
Therefore,
This is another resend as there has been no feedback since v4.
Seems Jon has been MIA this past cycle so hopefully he appears on the
list soon.
I've addressed the feedback so far and rebased on the latest kernel
and would like this to be considered for merging this cycle.
The only outstanding
When the ntb_msi_test module is available, the test code will trigger
each of the interrupts and ensure the corresponding occurrences files
gets incremented.
Signed-off-by: Logan Gunthorpe
Cc: Jon Mason
Cc: Dave Jiang
Cc: Allen Hubbe
---
tools/testing/selftests/ntb/ntb_test.sh | 54
Introduce a tool to test NTB MSI interrupts similar to the other
NTB test tools. This tool creates a debugfs directory for each
NTB device with the following files:
port
irqX_occurrences
peerX/port
peerX/count
peerX/trigger
The 'port' file tells the user the local port number and the
Introduce the module parameter 'use_msi' which, when set, uses
MSI interrupts instead of doorbells for each queue pair (QP). The
parameter is only available if NTB MSI support is configured into
the kernel. We also require there to be more than one memory window
(MW) so that an extra one is
The NTB MSI library allows passing MSI interrupts across a memory
window. This offers similar functionality to doorbells or messages
except will often have much better latency and the client can
potentially use significantly more remote interrupts than typical hardware
provides for doorbells.
This patch introduces the "Logical Port Number" which is similar to the
"Port Number" in that it enumerates the ports in the system.
The original (or Physical) "Port Number" can be any number used by the
hardware to uniquely identify a port in the system. The "Logical Port
Number" enumerates all
The kbuild system does not support having multiple source files in
a module if one of those source files has the same name as the module.
Therefore, we must rename ntb.c to core.c, while the module remains
ntb.ko.
This is similar to the way the nvme modules are structured.
Signed-off-by: Logan
For NTB devices, we want to be able to trigger MSI interrupts
through a memory window. In these cases we may want to use
more interrupts than the NTB PCI device has available in its MSI-X
table.
We allow for this by creating a new 'virtual' interrupt. These
interrupts are allocated as usual but
Seeing the we want to use more interrupts in the NTB MSI code
we need to be able allocate more (sometimes virtual) interrupts
in the switchtec driver. Therefore add a module parameter to
request to allocate additional interrupts.
This puts virtually no limit on the number of MSI interrupts
Add a blurb in Documentation/ntb.txt to describe the ntb_msi_test tool's
debugfs interface. Similar to the (out of date) ntb_tool description.
Signed-off-by: Logan Gunthorpe
---
Documentation/ntb.txt | 27 +++
1 file changed, 27 insertions(+)
diff --git
When using multi-ports each port uses resources (dbs, msgs, mws, etc)
on every other port. Creating a mapping for these resources such that
each port has a corresponding resource on every other port is a bit
tricky.
Introduce the ntb_peer_resource_idx() function for this purpose.
It returns the
On Tue, May 21, 2019 at 07:18:32PM +0100, Robin Murphy wrote:
> On 21/05/2019 17:13, Jordan Crouse wrote:
> >Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2.
> >If split pagetables are enabled, create a pagetable for TTBR1 and set
> >up the sign extension bit so that all
On 23/05/2019 19:06, Jean-Philippe Brucker wrote:
From: Jacob Pan
Traditionally, device specific faults are detected and handled within
their own device drivers. When IOMMU is enabled, faults such as DMA
related transactions are detected by IOMMU. There is no generic
reporting mechanism to
On 23/05/2019 19:06, Jean-Philippe Brucker wrote:
From: Jacob Pan
Device faults detected by IOMMU can be reported outside the IOMMU
subsystem for further processing. This patch introduces
a generic device fault data structure.
The fault can be either an unrecoverable fault or a page request,
From: Jacob Pan
Device faults detected by IOMMU can be reported outside the IOMMU
subsystem for further processing. This patch introduces
a generic device fault data structure.
The fault can be either an unrecoverable fault or a page request,
also referred to as a recoverable fault.
We only
Some IOMMU hardware features, for example PCI PRI and Arm SMMU Stall,
enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page
Requests and Stall events through the new fault reporting API. The
consumer of the fault can be either an I/O page fault handler in the host,
or a guest
From: Jacob Pan
Traditionally, device specific faults are detected and handled within
their own device drivers. When IOMMU is enabled, faults such as DMA
related transactions are detected by IOMMU. There is no generic
reporting mechanism to report faults back to the in-kernel device
driver or
Allow device drivers and VFIO to get notifications on IOMMU translation
fault, and to handle recoverable faults (PCI PRI). These four patches
are relatively mature since they are required by three different series,
and have been under discussion for a while:
* Nested translation support for
From: Jacob Pan
DMA faults can be detected by IOMMU at device level. Adding a pointer
to struct device allows IOMMU subsystem to report relevant faults
back to the device driver for further handling.
For direct assigned device (or user space drivers), guest OS holds
responsibility to handle and
On 23/05/2019 17:43, Christoph Hellwig wrote:
On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote:
Don't we have DMA_BIDIRECTIONAL for such case?
Not sure if it was intended for that case, but it definitively should
do the right thing for swiotlb, and it should also do the right
On 5/23/2019 7:43 PM, Christoph Hellwig wrote:
> On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote:
>> Don't we have DMA_BIDIRECTIONAL for such case?
>
> Not sure if it was intended for that case, but it definitively should
> do the right thing for swiotlb, and it should also do
On Thu, May 23, 2019 at 07:35:07AM +0200, Marek Szyprowski wrote:
> Don't we have DMA_BIDIRECTIONAL for such case?
Not sure if it was intended for that case, but it definitively should
do the right thing for swiotlb, and it should also do the right thing
in terms of cache maintainance.
> Maybe
On 5/23/2019 8:35 AM, Marek Szyprowski wrote:
> Hi Robin,
>
> On 2019-05-22 15:55, Robin Murphy wrote:
>> On 22/05/2019 14:34, Christoph Hellwig wrote:
>>> On Wed, May 22, 2019 at 02:25:38PM +0100, Robin Murphy wrote:
Sure, but that should be irrelevant since the effective problem here
On Thu, 23 May 2019 14:25:25 +0200
Pierre Morel wrote:
> We define a new device region in vfio.h to be able to
> get the ZPCI CLP information by reading this region from
> userland.
>
> We create a new file, vfio_zdev.h to define the structure
> of the new region we defined in vfio.h
>
>
On Thu, 23 May 2019 09:14:07 +0200
Auger Eric wrote:
> Hi Jacob,
>
> On 5/22/19 9:42 PM, Jacob Pan wrote:
> > On Tue, 21 May 2019 11:55:55 +0200
> > Auger Eric wrote:
> >
> >> Hi Jacob,
> >>
> >> On 5/4/19 12:32 AM, Jacob Pan wrote:
> >>> Sometimes, IOASID allocation must be handled by
On Thu, 23 May 2019 08:44:57 +
"Liu, Yi L" wrote:
> Hi Alex,
>
> Sorry to disturb you. Do you want to review on this version or review a
> rebased version? :-) If rebase version is better, I can try to do it asap.
Hi Yi,
Perhaps you missed my comments on 1/3:
On Thu, May 23, 2019 at 4:11 PM Robin Murphy wrote:
>
> On 2019-05-16 10:30 am, Vivek Gautam wrote:
> > Few Qualcomm platforms such as, sdm845 have an additional outer
> > cache called as System cache, aka. Last level cache (LLC) that
> > allows non-coherent devices to upgrade to using caching.
>
On 23/05/2019 08:00, Christoph Hellwig wrote:
Hi Robin and Joerg,
I think we are finally ready for the generic dma-iommu series. I have
various DMA API changes pending, and Tom has patches ready to convert
the AMD and Intel iommu drivers over to it. I'd love to have this
in a stable branch
For the generic implementation of VFIO PCI we need to retrieve
the hardware configuration for the PCI functions and the
PCI function groups.
We modify the internal function using CLP Query PCI function and
CLP query PCI function group so that they can be called from
outside the S390 architecture
We define a new configuration entry for VFIO/PCI, VFIO_PCI_ZDEV
to configure access to a zPCI region dedicated for retrieving
zPCI features.
When the VFIO_PCI_ZDEV feature is configured we initialize
a new device region, VFIO_REGION_SUBTYPE_ZDEV_CLP, to hold
the information from the ZPCI device
We define a new device region in vfio.h to be able to
get the ZPCI CLP information by reading this region from
userland.
We create a new file, vfio_zdev.h to define the structure
of the new region we defined in vfio.h
Signed-off-by: Pierre Morel
---
include/uapi/linux/vfio.h | 4
We define a new configuration entry for VFIO/PCI, VFIO_PCI_ZDEV
When the VFIO_PCI_ZDEV feature is configured we initialize
a new device region, VFIO_REGION_SUBTYPE_ZDEV_CLP, to hold
the information from the ZPCI device the userland needs to
give to a guest driving the zPCI function.
On 2019-05-16 10:30 am, Vivek Gautam wrote:
Few Qualcomm platforms such as, sdm845 have an additional outer
cache called as System cache, aka. Last level cache (LLC) that
allows non-coherent devices to upgrade to using caching.
This cache sits right before the DDR, and is tightly coupled
with
Hi Alex,
Sorry to disturb you. Do you want to review on this version or review a rebased
version? :-) If rebase version is better, I can try to do it asap.
Thanks,
Yi Liu
> -Original Message-
> From: Liu, Yi L
> Sent: Tuesday, April 23, 2019 8:15 PM
> To: alex.william...@redhat.com;
Hi Robin,
On Thu, May 16, 2019 at 3:00 PM Vivek Gautam
wrote:
>
> Few Qualcomm platforms such as, sdm845 have an additional outer
> cache called as System cache, aka. Last level cache (LLC) that
> allows non-coherent devices to upgrade to using caching.
> This cache sits right before the DDR,
Hi Jacob,
On 5/22/19 9:42 PM, Jacob Pan wrote:
> On Tue, 21 May 2019 11:55:55 +0200
> Auger Eric wrote:
>
>> Hi Jacob,
>>
>> On 5/4/19 12:32 AM, Jacob Pan wrote:
>>> Sometimes, IOASID allocation must be handled by platform specific
>>> code. The use cases are guest vIOMMU and pvIOMMU where
From: Robin Murphy
Most of the callers don't care, and the couple that do already have the
domain to hand for other reasons are in slow paths where the (trivial)
overhead of a repeated lookup will be utterly immaterial.
Signed-off-by: Robin Murphy
Signed-off-by: Christoph Hellwig
---
For entirely dma coherent architectures there is no requirement to ever
remap dma coherent allocation. Move all the remap and pool code under
IS_ENABLED() checks and drop the Kconfig dependency.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/Kconfig | 1 -
Signed-off-by: Christoph Hellwig
Acked-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 13 +
include/linux/dma-iommu.h | 13 +
2 files changed, 2 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index
Inline __iommu_dma_mmap and __iommu_dma_mmap_pfn into the main function,
and use the fact that __iommu_dma_get_pages return NULL for remapped
contigous allocations to simplify the code flow a bit.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 60
Signed-off-by: Christoph Hellwig
Acked-by: Robin Murphy
Reviewed-by: Mukesh Ojha
Acked-by: Catalin Marinas
---
arch/arm64/mm/dma-mapping.c | 15 +--
1 file changed, 1 insertion(+), 14 deletions(-)
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index
From: Robin Murphy
Since we duplicate the find_vm_area() logic a few times in places where
we only care aboute the pages, factor out a helper to abstract it.
Signed-off-by: Robin Murphy
[hch: don't warn when not finding a region, as we'll rely on that later]
Signed-off-by: Christoph Hellwig
From: Robin Murphy
Most importantly clear up the size / iosize confusion. Also rename addr
to cpu_addr to match the surrounding code and make the intention a little
more clear.
Signed-off-by: Robin Murphy
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig
---
From: Robin Murphy
Most of it can double up to serve the failure cleanup path for
iommu_dma_alloc().
Signed-off-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
With most of the previous functionality now elsewhere a lot of the
headers included in this file are not needed.
Signed-off-by: Christoph Hellwig
Acked-by: Catalin Marinas
---
arch/arm64/mm/dma-mapping.c | 10 --
1 file changed, 10 deletions(-)
diff --git a/arch/arm64/mm/dma-mapping.c
Inline __iommu_dma_get_sgtable_page into the main function, and use the
fact that __iommu_dma_get_pages return NULL for remapped contigous
allocations to simplify the code flow a bit.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 45
Move the call to dma_common_pages_remap into __iommu_dma_alloc and
rename it to iommu_dma_alloc_remap. This creates a self-contained
helper for remapped pages allocation and mapping.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 54
We only have a single caller of this function left, so open code it there.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 21 ++---
1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c
From: Robin Murphy
Shuffle around the self-contained atomic and non-contiguous cases to
return early and get out of the way of the CMA case that we're about to
work on next.
Signed-off-by: Robin Murphy
[hch: slight changes to the code flow]
Signed-off-by: Christoph Hellwig
---
From: Robin Murphy
The freeing logic was made particularly horrible by part of it being
opaque to the arch wrapper, which led to a lot of convoluted repetition
to ensure each path did everything in the right order. Now that it's
all private, we can pick apart and consolidate the
All the logic in iommu_dma_alloc that deals with page allocation from
the CMA or page allocators can be split into a self-contained helper,
and we can than map the result of that or the atomic pool allocation
with the iommu later. This also allows reusing __iommu_dma_free to
tear down the
From: Robin Murphy
Always remapping CMA allocations was largely a bodge to keep the freeing
logic manageable when it was split between here and an arch wrapper. Now
that it's all together and streamlined, we can relax that limitation.
Signed-off-by: Robin Murphy
Signed-off-by: Christoph
Instead of having a separate code path for the non-blocking alloc_pages
and CMA allocations paths merge them into one. There is a slight
behavior change here in that we try the page allocator if CMA fails.
This matches what dma-direct and other iommu drivers do and will be
needed to use the
From: Robin Murphy
The remaining internal callsites don't care about having prototypes
compatible with the relevant dma_map_ops callbacks, so the extra
level of indirection just wastes space and complictaes things.
Signed-off-by: Robin Murphy
Signed-off-by: Christoph Hellwig
---
Hi Robin and Joerg,
I think we are finally ready for the generic dma-iommu series. I have
various DMA API changes pending, and Tom has patches ready to convert
the AMD and Intel iommu drivers over to it. I'd love to have this
in a stable branch shared between the dma-mapping and iommu trees
We now have a arch_dma_prep_coherent architecture hook that is used
for the generic DMA remap allocator, and we should use the same
interface for the dma-iommu code.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
Acked-by: Catalin Marinas
---
arch/arm64/mm/dma-mapping.c | 8
Moving this function up to its unmap counterpart helps to keep related
code together for the following changes.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 46 +++
1 file changed, 23 insertions(+), 23
arch_dma_prep_coherent can handle physically contiguous ranges larger
than PAGE_SIZE just fine, which means we don't need a page-based
iterator.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
drivers/iommu/dma-iommu.c | 14 +-
1 file changed, 5 insertions(+), 9
There is nothing really arm64 specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and keep a lot of symbols
self-contained. Note the implementation does depend on the
DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
DMA_IOMMU support depend on it, but this
No need for a __KERNEL__ guard outside uapi and add a missing comment
describing the #else cpp statement. Last but not least include
instead of the asm version, which is frowned upon.
Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
include/linux/dma-iommu.h | 6 ++
1 file
88 matches
Mail list logo