----- Original Message -----
> From: "Robin Murphy" <robin.mur...@arm.com>
> To: "Aaron Sierra" <asie...@xes-inc.com>, "Joerg Roedel" <j...@8bytes.org>
> Cc: "iommu" <iommu@lists.linux-foundation.org>, "Nate Watterson" 
> <nwatt...@codeaurora.org>
> Sent: Thursday, May 11, 2017 12:39:01 PM
> Subject: Re: [PATCH] iova: allow allocation of lowest domain PFN

Robin,

> Hi Aaron,
> 
> On 11/05/17 14:36, Aaron Sierra wrote:
>> I found it was impossible to allocate all of the address space
>> defined by an IOVA domain. For example, a trivial domain with 2 pages
>> would cover PFNs 0 (start) and 1 (limit), but attempting to allocate
>> more than one total PFN would fail because the start PFN could never be
>> allocated.
> 
> Does this only happen when start_pfn is 0? The recent fix in
> 5016bdb796b3 implies that allocating the entire address space works
> fine, albeit with a non-zero start_pfn. I was about to say that for all
> current callers, start_pfn is always at least 1 anyway, but I see the
> Nvidia DRM stuff in next is making that no longer true, oh well.

No, this limitation exists regardless of the start_pfn value. I've tested
0, 1, then powers-of-two up to half of the limit_pfn.

The problem with the test case laid out in 5016bdb796b3 is that start_pfn
is 1 and limit_pfn is odd (as usual), so the number of available PFNs
(limit_pfn - start_pfn + 1) is odd. Allocating two at a time therefore
can't possibly cover all of the address space.

>> This adds a function to prevent PFN limit calculations from dipping
>> "below" the start PFN in __alloc_and_insert_iova_range() and
>> __get_cached_rbnode(). It also alters the PFN validity checks in
>> __alloc_and_insert_iova_range() to anticipate possible PFN rollover.
>> These combine to allow every PFN within the IOVA domain to be available
>> for allocation.
>> 
>> Signed-off-by: Aaron Sierra <asie...@xes-inc.com>
>> ---
>>  drivers/iommu/iova.c | 43 +++++++++++++++++++++++++++++++------------
>>  1 file changed, 31 insertions(+), 12 deletions(-)
>> 
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 5c88ba7..9d92005b 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -32,6 +32,18 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
>>  static void init_iova_rcaches(struct iova_domain *iovad);
>>  static void free_iova_rcaches(struct iova_domain *iovad);
>>  
>> +static inline unsigned long
>> +bounded_next_pfn(struct iova_domain *iovad, struct iova *iova)
>> +{
>> +    unsigned long next_pfn = iova->pfn_lo - 1;
>> +
>> +    if (!iova->pfn_lo)
>> +            return iovad->start_pfn;
>> +
>> +    /* make sure the PFN doesn't rollover */
>> +    return (next_pfn > iova->pfn_lo) ? iovad->start_pfn : next_pfn;
> 
> Why this check? Substituting the definition of next_pfn:
> 
>       ((iova->pfn_lo - 1) > iova->pfn_lo)
>
> which is only true if iova->pfn_lo == 0, in which case we never get this
> far anyway.

Good point. With the zero case handled up front, the subtraction can't
underflow, so the rollover check is redundant and can be dropped.
 
> I'm also not sure about returning start_pfn instead of 0 in the
> underflow case, since reserved entries may exist in the tree such that
> pfn_lo < start_pfn, and I'm not convinced that returning a "next" PFN
> which is actually "previous" to the lowest entry won't cause things to
> go horrendously wrong.

I didn't realize that reservations could occur below the defined IOVA
domain lower limit. Is that really intended behavior? I'd expected that if
you wanted to reserve 0, for example, you'd define a domain starting at
zero, then reserve that PFN before making the domain generally available.

[ snip ]

>>  
>> -    if (!curr) {
>> -            if (size_aligned)
>> -                    pad_size = iova_get_pad_size(size, limit_pfn);
>> -            if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
> 
> Wouldn't it be sufficient to just change this to
> 
>       if ((iovad->start_pfn + size + pad_size - 1) > limit_pfn)

That may be. My original solution was equivalent, but factored differently
(i.e. it compared against limit_pfn + 1). I couldn't fully wrap my head
around the logic and was worried about missing corner cases. If that is
indeed a complete fix, could the refactoring below still be beneficial?

-Aaron S.

> ?
> 
> Since a size of 0 is invalid, that would appear to be sufficiently
> resistant to underflow.
> 
> Robin.
> 
>> -                    spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
>> -                    return -ENOMEM;
>> -            }
>> -    }
>> +    if (size_aligned)
>> +            pad_size = iova_get_pad_size(size, limit_pfn);
>>  
>>      /* pfn_lo will point to size aligned address if size_aligned is set */
>> -    new->pfn_lo = limit_pfn - (size + pad_size) + 1;
>> -    new->pfn_hi = new->pfn_lo + size - 1;
>> +    pfn_lo = limit_pfn - (size + pad_size) + 1;
>> +    pfn_hi = pfn_lo + size - 1;
>> +
>> +    /*
>> +     * We're working with unsigned values, so we have to be careful about
>> +     * how we detect a PFN "below" the lowest PFN possible; zero.
>> +     */
>> +    if (pfn_lo < iovad->start_pfn || pfn_lo > limit_pfn) {
>> +            spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
>> +            return -ENOMEM;
>> +    }
>> +
>> +    new->pfn_lo = pfn_lo;
>> +    new->pfn_hi = pfn_hi;
>>  
>>      /* If we have 'prev', it's a valid place to start the insertion. */
>>      iova_insert_rbtree(&iovad->rbroot, new, prev);
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu