Re: [RFC PATCH] iommu/iova: Add a best-fit algorithm

Robin Murphy Thu, 20 Feb 2020 10:43:15 -0800

On 20/02/2020 6:38 am, [email protected] wrote:

On 2020-02-17 08:03, Robin Murphy wrote:

On 14/02/2020 11:06 pm, Isaac J. Manjarres wrote:

From: Liam Mark <[email protected]>


Using the best-fit algorithm, instead of the first-fit
algorithm, may reduce fragmentation when allocating
IOVAs.


What kind of pathological allocation patterns make that a serious
problem? Is there any scope for simply changing the order of things in
the callers? Do these drivers also run under other DMA API backends
(e.g. 32-bit Arm)?

The usecases where the IOVA space has been fragmented havenon-deterministic allocationpatterns, and thus, it's not feasible to change the allocation order toavoid fragmenting

the IOVA space.

What about combining smaller buffers into larger individual allocations;any scope for that sort of thing? Certainly if you're consistentlyallocating small things less than PAGE_SIZE then DMA pools would beuseful to avoid wanton memory wastage in general.

 From what we've observed, the usecases involve allocations of two types of

buffers: one type of buffer between 1 KB to 4 MB in size, and anothertype of

buffer between 1 KB to 400 MB in size.

The pathological scenarios seem to arise when there are

many (100+) randomly distributed non-power of two allocations, which insome cases leaves

behind holes of up to 100+ MB in the IOVA space.

Here are some examples that show the state of the IOVA space under whichfailure to

allocate an IOVA was observed:

Instance 1:
     Currently mapped total size : ~1.3GB
     Free space available : ~2GB
     Map for ~162MB fails.
         Max contiguous space available : < 162MB

Instance 2:
     Currently mapped total size : ~950MB
     Free space available : ~2.3GB
     Map for ~320MB fails.
     Max contiguous space available : ~189MB

Instance 3:
     Currently mapped total size : ~1.2GB
     Free space available : ~2.7GB
     Map for ~162MB fails.
     Max contiguous space available : <162MB

We are still in the process of collecting data with the best-fitalgorithm enabled

to provide some numbers to show that it results in less IOVA space
fragmentation.

Thanks for those examples, and I'd definitely like to see thecomparative figures. To dig a bit further, at the point where thingsstart failing, where are the cached nodes pointing? IIRC there is stilla pathological condition where empty space between limit_pfn andcached32_node gets 'lost' if nothing in between is freed, so the biggerthe range of allocation sizes, the worse the effect, e.g.:


(considering an empty domain, pfn 0 *not* reserved, 32-bit limit_pfn)

        alloc 4K, succeeds, cached32_node now at 4G-4K
        alloc 2G, succeeds, cached32_node now at 0
        alloc 4K, fails despite almost 2G contiguous free space within limit_pfn

(and max32_alloc_size==1 now fast-forwards *any* further allocationattempt to failure)

If you're falling foul of this case (I was never sure how realistic aproblem it would be in practice), there are at least a couple of muchless invasive tweaks I can think of that would be worth exploring.

To answer your question about whether if this driver run under other DMAAPI backends:
yes, such as 32 bit ARM.


OK, that's what I suspected :)

AFAICS arch/arm's __alloc_iova() is also a first-fit algorithm, so ifyou get better behaviour there then it would suggest that this aspectisn't really the most important issue. Certainly, the fact that the"best fit" logic here also happens to ignore the cached nodes does startdrawing me back to the point above.


Robin.
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH] iommu/iova: Add a best-fit algorithm

Reply via email to