Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-05-03 Thread Qian Cai
> On May 3, 2020, at 2:39 PM, Joerg Roedel wrote: > > Can I add your Tested-by when I send them to the mailing list tomorrow? Sure. Feel free to add, Tested-by: Qian Cai

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-05-03 Thread Joerg Roedel
Hi Qian, On Sun, May 03, 2020 at 09:04:03AM -0400, Qian Cai wrote: > > On Apr 29, 2020, at 7:20 AM, Joerg Roedel wrote: > > Can you please test this branch: > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=amd-iommu-fixes > > > > It has the previous fix in

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-05-03 Thread Qian Cai
> On Apr 29, 2020, at 7:20 AM, Joerg Roedel wrote: > > On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: >> No dice. There could be some other races. For example, > > Can you please test this branch: > > >

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-29 Thread Qian Cai
> On Apr 29, 2020, at 7:20 AM, Joerg Roedel wrote: > > On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: >> No dice. There could be some other races. For example, > > Can you please test this branch: > > >

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-29 Thread Joerg Roedel
On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: > No dice. There could be some other races. For example, Can you please test this branch: https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=amd-iommu-fixes It has the previous fix in it and a couple more to

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-29 Thread Joerg Roedel
Hi Qian, On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: > > No dice. There could be some other races. For example, Okay, I think I know what is happening. The increase_address_space() function increases the address space, but does not update the DTE and does not flush the old DTE

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-20 Thread Qian Cai
> On Apr 18, 2020, at 2:34 PM, Joerg Roedel wrote: > > On Sat, Apr 18, 2020 at 09:01:35AM -0400, Qian Cai wrote: >> Hard to tell without testing further. I’ll leave that optimization in >> the future, and focus on fixing those races first. > > Yeah right, we should fix the existing races

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-19 Thread Qian Cai
> On Apr 18, 2020, at 2:34 PM, Joerg Roedel wrote: > > On Sat, Apr 18, 2020 at 09:01:35AM -0400, Qian Cai wrote: >> Hard to tell without testing further. I’ll leave that optimization in >> the future, and focus on fixing those races first. > > Yeah right, we should fix the existing races

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-18 Thread Joerg Roedel
On Sat, Apr 18, 2020 at 09:01:35AM -0400, Qian Cai wrote: > Hard to tell without testing further. I’ll leave that optimization in > the future, and focus on fixing those races first. Yeah right, we should fix the existing races first before introducing new ones ;) Btw, THANKS A LOT for tracking

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-18 Thread Qian Cai
> On Apr 18, 2020, at 8:10 AM, Joerg Roedel wrote: > > Yes, your patch still looks racy. You need to atomically read > domain->pt_root to a stack variable and derive the pt_root pointer and > the mode from that variable instead of domain->pt_root directly. If you > read the domain->pt_root
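
A minimal user-space sketch of the approach Joerg describes above: read the combined root value once into a stack variable, then derive both the page-table pointer and the mode from that single snapshot. This is not the kernel patch. It uses C11 atomics instead of kernel primitives, and the names `MODE_MASK`, `struct pt_snapshot`, `domain_set_root()` and `domain_get_root()`, as well as the packing of the mode into the low three bits of the pointer, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: the root table is at least 8-byte aligned, so the
 * low three bits of its address are free to carry the paging mode. */
#define MODE_MASK ((uintptr_t)0x7)

struct domain {
    _Atomic uintptr_t pt_root;   /* root pointer and mode in one word */
};

/* Hypothetical helper type: a consistent (root, mode) pair. */
struct pt_snapshot {
    uint64_t *root;
    unsigned int mode;
};

/* Writer side: publish the new root and mode with a single store, so
 * a reader can never observe one without the other. */
static void domain_set_root(struct domain *d, uint64_t *root,
                            unsigned int mode)
{
    uintptr_t v = (uintptr_t)root;

    assert((v & MODE_MASK) == 0 && mode <= MODE_MASK);
    atomic_store_explicit(&d->pt_root, v | mode, memory_order_release);
}

/* Reader side: one atomic load into a stack variable; both fields are
 * derived from that local copy, never from d->pt_root a second time. */
static struct pt_snapshot domain_get_root(struct domain *d)
{
    uintptr_t v = atomic_load_explicit(&d->pt_root, memory_order_acquire);
    struct pt_snapshot s = {
        .root = (uint64_t *)(v & ~MODE_MASK),
        .mode = (unsigned int)(v & MODE_MASK),
    };

    return s;
}
```

A lookup routine in the style of fetch_pte() would call `domain_get_root()` once at the top and walk the table using only the returned pair. Reading the shared field twice would reopen the window in which a concurrent update could leave the pointer and the mode out of step.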

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-18 Thread Joerg Roedel
On Thu, Apr 16, 2020 at 09:42:41PM -0400, Qian Cai wrote: > So, this is still not enough; it would still trigger the storage driver going offline under memory pressure, just a bit later. It looks to me that fetch_pte() could still be racy? Yes, your patch still looks racy. You need to

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-16 Thread Qian Cai
> On Apr 13, 2020, at 9:36 PM, Qian Cai wrote: > > > >> On Apr 8, 2020, at 10:19 AM, Joerg Roedel wrote: >> >> Hi Qian, >> >> On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: >>> After further testing, the change alone is insufficient. What I am chasing >>> right now is the

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-13 Thread Qian Cai
> On Apr 8, 2020, at 10:19 AM, Joerg Roedel wrote: > > Hi Qian, > > On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: >> After further testing, the change alone is insufficient. What I am chasing >> right now is the swap device will go offline after heavy memory pressure below.

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-08 Thread Joerg Roedel
Hi Qian, On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: > After further testing, the change alone is insufficient. What I am chasing right now is the swap device will go offline after heavy memory pressure below. The symptom is similar to what we have in the commit, > >

Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-07 Thread Qian Cai
> On Apr 6, 2020, at 10:12 PM, Qian Cai wrote: > > fetch_pte() could race with increase_address_space() because it held no > lock from iommu_unmap_page(). On the CPU that runs fetch_pte() it could > see a stale domain->pt_root and a new increased domain->mode from > increase_address_space().

[RFC PATCH] iommu/amd: fix a race in fetch_pte()

2020-04-06 Thread Qian Cai
fetch_pte() could race with increase_address_space() because it held no lock from iommu_unmap_page(). On the CPU that runs fetch_pte() it could see a stale domain->pt_root and a new increased domain->mode from increase_address_space(). As a result, it could trigger invalid accesses later on. Fix