> On Apr 6, 2020, at 10:12 PM, Qian Cai <[email protected]> wrote: > > fetch_pte() could race with increase_address_space() because it held no > lock from iommu_unmap_page(). On the CPU that runs fetch_pte() it could > see a stale domain->pt_root and a new increased domain->mode from > increase_address_space(). As the result, it could trigger invalid > accesses later on. Fix it by using a pair of smp_[w|r]mb in those > places.
After further testing, the change along is insufficient. What I am chasing right now is the swap device will go offline after heavy memory pressure below. The symptom is similar to what we have in the commit, 754265bcab78 (“iommu/amd: Fix race in increase_address_space()”) Apparently, it is no possible to take the domain->lock in fetch_pte() because it could sleep. Thoughts? [ 7292.727377][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd80000 flags=0x0010] [ 7292.740571][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd81000 flags=0x0010] [ 7292.753268][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd81900 flags=0x0010] [ 7292.766078][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd81d00 flags=0x0010] [ 7292.778447][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd82000 flags=0x0010] [ 7292.790724][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd82400 flags=0x0010] [ 7292.803195][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd82800 flags=0x0010] [ 7292.815664][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd82c00 flags=0x0010] [ 7292.828089][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd83000 flags=0x0010] [ 7292.840468][ T814] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffffffffffd83400 flags=0x0010] [ 7292.852795][ T814] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xffffffffffd83800 flags=0x0010] [ 7292.864566][ T814] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xffffffffffd83c00 flags=0x0010] [ 7326.583908][ C8] smartpqi 0000:23:00.0: controller is offline: status code 0x14803 [ 7326.592386][ C8] smartpqi 0000:23:00.0: controller offline [ 7326.663728][ C66] sd 0:1:0:0: [sda] tag#467 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=6s [ 7326.664321][ T2738] smartpqi 0000:23:00.0: resetting scsi 0:1:0:0 [ 7326.673849][ C66] sd 0:1:0:0: [sda] tag#467 CDB: opcode=0x28 28 00 02 ee 2e 60 00 00 08 00 [ 7326.680118][ T2738] smartpqi 0000:23:00.0: reset of scsi 0:1:0:0: FAILED [ 7326.688612][ C66] blk_update_request: I/O error, dev sda, sector 49163872 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 7326.695456][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.706632][ C66] Read-error on swap-device (254:1:45833824) [ 7326.714030][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.723382][T45523] sd 0:1:0:0: rejecting I/O to offline device [ 7326.727352][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.727379][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.727442][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery > > kernel BUG at drivers/iommu/amd_iommu.c:1704! > BUG_ON(unmapped && !is_power_of_2(unmapped)); > Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 > 07/10/2019 > RIP: 0010:amd_iommu_unmap+0x1b2/0x1d0 > Call Trace: > <IRQ> > __iommu_unmap+0x106/0x320 > iommu_unmap_fast+0xe/0x10 > __iommu_dma_unmap+0xdc/0x1a0 > iommu_dma_unmap_sg+0xae/0xd0 > scsi_dma_unmap+0xe7/0x150 > pqi_raid_io_complete+0x37/0x60 [smartpqi] > pqi_irq_handler+0x1fc/0x13f0 [smartpqi] > __handle_irq_event_percpu+0x78/0x4f0 > handle_irq_event_percpu+0x70/0x100 > handle_irq_event+0x5a/0x8b > handle_edge_irq+0x10c/0x370 > do_IRQ+0x9e/0x1e0 > common_interrupt+0xf/0xf > </IRQ> > > Signed-off-by: Qian Cai <[email protected]> > --- > drivers/iommu/amd_iommu.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 20cce366e951..22328a23335f 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -1434,6 +1434,11 @@ static bool increase_address_space(struct > protection_domain *domain, > *pte = PM_LEVEL_PDE(domain->mode, > iommu_virt_to_phys(domain->pt_root)); > domain->pt_root = pte; > + /* > + * Make sure fetch_pte() will see the new domain->pt_root before it > + * snapshots domain->mode. > + */ > + smp_wmb(); > domain->mode += 1; > > ret = true; > @@ -1460,6 +1465,8 @@ static u64 *alloc_pte(struct protection_domain *domain, > *updated = increase_address_space(domain, address, gfp) || > *updated; > > level = domain->mode - 1; > + /* To pair with smp_wmb() in increase_address_space(). */ > + smp_rmb(); > pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; > address = PAGE_SIZE_ALIGN(address, page_size); > end_lvl = PAGE_SIZE_LEVEL(page_size); > @@ -1545,6 +1552,8 @@ static u64 *fetch_pte(struct protection_domain *domain, > return NULL; > > level = domain->mode - 1; > + /* To pair with smp_wmb() in increase_address_space(). */ > + smp_rmb(); > pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; > *page_size = PTE_LEVEL_PAGE_SIZE(level); > > -- > 2.21.0 (Apple Git-122.2) > _______________________________________________ iommu mailing list [email protected] https://lists.linuxfoundation.org/mailman/listinfo/iommu
