On Wed, Feb 28, 2018 at 7:15 PM, Tian, Kevin <[email protected]> wrote:
>> From: David Woodhouse
>> Sent: Tuesday, February 27, 2018 3:48 PM
>>
>> On Mon, 2018-02-26 at 15:01 -0800, Alexander Duyck wrote:
>> > I am interested in adding a new memory mapping option that
>> > establishes
>> > one identity-mapped region for all DMA_TO_DEVICE mappings and
>> creates
>> > a new dynamic mapping for any DMA_FROM_DEVICE and
>> DMA_BIDIRECTIONAL
>> > mappings. My thought is it should allow for a compromise between
>> > security and performance (in the case of networking) in that many of
>> > the server NIC drivers these days are running with mostly pinned or
>> > reused pages for Rx. By using an identity mapping for the Tx packets
>
> Rx packets?

Yes, Rx packets. In the case of drivers like ixgbevf the driver has a
page reuse mechanism in place that allows it to reuse the same page
multiple times for each Rx buffer. As a result we often end up
processing millions of packets without having to allocate or
map/unmap a page along the way. That means we normally don't have to
invalidate or create any new mappings in the IOMMU for those packets,
which greatly reduces the overhead.

>> > we should be able to significantly cut down on the IOMMU overhead for
>> > the device. The other advantage if this works is that we could use
>> > this to possibly do something like dirty page tracking in the case of
>> > a emulated version of the IOMMU.
>
> I didn't quite get this part. dirty-page tracking on emulated IOMMU
> doesn't rely on the changes that you are proposing - just monitor
> invalidation requests and then count all changed pages as dirty.
> Of course it would be overkill since even RO mapping changes are
> also counted. In that manner your proposal is still an optimization
> instead of a must to enable dirty-page tracking, correct?

Yes this is meant to be an optimization. One of the theoretical issues
with trying to use the IOMMU for dirty page tracking is that we are
currently having to update a paravirtual IOMMU to create and
invalidate mappings for every packet. With that being the case, the
argument could be made that we might as well just be using a
paravirtual network interface, since many of the benefits of SR-IOV
are likely lost.

I feel that the design of drivers like ixgbevf already solves half the
issue since we almost never map/unmap Rx buffers. If we identity
mapped the Tx packets then that is the other half of the issue solved.
Essentially what you end up with is that the SR-IOV device, in this
case a VF running under something like ixgbevf, has a means to
provide the hypervisor with a list of pages that the device is
writing to. Those pages could then either be excluded from the live
migration, or at least tracked so that when the device is halted or
invalidates the mapping we can mark the pages as dirty and migrate
them.

The added bonus with all this is that we could probably use it as a
half-way point between the current iommu=pt solution that is often
used for performance and the standard setup, which gives you a bit
more in the way of security. With this we would probably see just
about the same throughput as iommu=pt for most NICs, but with the
added security of a device that is unable to write anywhere we
haven't explicitly permitted it to.

>> >
>> > I was originally thinking I could get away with just reusing the
>> > identity mapping code but it looks like that would end up merging
>> > everything into one domain if I am understanding correctly. Do I have
>> > that right?
>> >
>> > Would I be correct in assuming that I will need to have a separate
>> > domain per device, each domain containing the 1 TO_DEVICE identity
>> > mapped region, and then whatever other mappings are needed to handle
>> > the FROM and BIDIRECTIONAL mappings?
>>
>> In the normal model where we explicitly map every RX and TX buffer, you
>> have a domain per device anyway; that's not a new requirement for your
>> model.
>>
>> It sounds like an interesting idea; I agree that it's a reasonable
>> compromise between security and performance. The device can *read* all
>> of memory, but it can't write anywhere that isn't explicitly mapped.
>>
>> In addition, we're mapping buffers for RX some time in advance of them
>> being needed (replenishing the tail of the RX ring), where the latency
>> of the map operation hopefully shouldn't be quite so much of an issue,
>> while packets for TX can go straight to the device with no latency.
>> Overall, I think it might work really well.
>>
>> You don't want the existing identity mapping code; that will give you a
>> RW mapping which you don't want — you really do want read-only or this
>> whole exercise is pointless, right? And you're right, it would have put
>> the domain into the single identity domain.
>>
>> You could probably start by mocking this up with the IOMMU API. Create
>> a domain with the 1:1 read-only mapping of all memory, add your device
>> to it, and then do your writeable mappings on top (at IOVAs higher than
>> the top of physical memory). That's probably a quick way to assess
>> performance and prove the concept (although you don't get deferred
>> unmap of RX packets that way, which might mess things up a bit).
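As a kernel-side sketch, the mock-up you describe could start out
something like this. Not compilable as-is: error handling is elided,
the exact iommu_map() prototype varies across kernel versions, and
top_of_ram, off and rx_page are placeholders.

```c
struct iommu_domain *dom = iommu_domain_alloc(&pci_bus_type);

/* 1:1 read-only map of all of physical memory. */
for (phys_addr_t p = 0; p < top_of_ram; p += SZ_2M)
	iommu_map(dom, p, p, SZ_2M, IOMMU_READ);

iommu_attach_device(dom, &pdev->dev);

/* Writeable Rx mappings go in above the top of RAM. */
iommu_map(dom, top_of_ram + off, page_to_phys(rx_page),
	  PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);
```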
>
> Curious question. Do you think above could be a final solution or just
> a proof-of-concept? If the latter, will the cleaner option be to allow the
> device binding to multiple domains e.g. RO & RW which are selected
> based on DMA mapping direction (not sure whether it's an intrusive
> change to iommu driver which today likely assumes single binding)?
>
>>
>> When we expose this through the DMA API, I'd quite like this *not* to
>> be Intel-specific. It could reasonably live in a higher layer and be
>> usable with all kinds of IOMMU implementations.
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu
