On Thu, Nov 30, 2017 at 05:54:35PM +0800, Liu, Yi L wrote:
> On Thu, Nov 30, 2017 at 05:11:55PM +0800, Peter Xu wrote:
> > On Thu, Nov 30, 2017 at 01:22:38PM +0800, Liu, Yi L wrote:
> > > On Tue, Nov 14, 2017 at 06:13:50PM -0500, prasad.singamse...@oracle.com
> > > wrote:
> > > > From: Prasad Singamsetty <prasad.singamse...@oracle.com>
> > > >
> > > > The current implementation of Intel IOMMU code only supports 39 bits
> > > > iova address width. This patch provides a new parameter (x-aw-bits)
> > > > for intel-iommu to extend its address width to 48 bits but keeping the
> > > > default the same (39 bits). The reason for not changing the default
> > > > is to avoid potential compatibility problems with live migration of
> > > > intel-iommu enabled QEMU guest. The only valid values for 'x-aw-bits'
> > > > parameter are 39 and 48.
> > > >
> > > > After enabling larger address width (48), we should be able to map
> > > > larger iova addresses in the guest. For example, a QEMU guest that
> > > > is configured with large memory ( >=1TB ). To check whether 48 bits
> > >
> > > I didn't quite get your point here. Address width limits the iova range,
> > > but it doesn't limit the guest memory range. e.g. you can use 39 bit iova
> > > address to access a guest physical address larger than (2^39 - 1) as long
> > > as the guest 2nd level page table is well programmed. Only one exception,
> > > if you requires a continuous iova range(e.g. 2^39), it would be an issue.
> > > Not sure if this is the major motivation of your patch? However, I'm not
> > > against extend the address width to be 48 bits. Just need to make it clear
> > > here.
> >
> > One thing I can think of is the identity mapping. Say, when iommu=pt
> > is set in guest, meanwhile when PT capability is not supported from
> > hardware (here I mean, the emulated hardware, of course), guest kernel
> > will create one identity mapping to emulate the PT mode.
>
> True.
>
> > Current linux kernel's identity mapping should be a static 48 bits
> > mapping (it must cover the whole memory range of guest), so if we
>
> I suppose guest memory range depends on the AW reported by CPUID? Not sure
> if it is constantly 48 bits.

Please refer to si_domain_init() and DEFAULT_DOMAIN_ADDRESS_WIDTH in the
guest kernel's Intel IOMMU driver.

>
> > provide a 39 bits vIOMMU to the guest, AFAIU we'll fail at device
> > attaching to that identity domain of every single device that is
> > protected by that 39 bits vIOMMU (kernel will find that domain gaw is
> > bigger than vIOMMU supported gaw of that device).
>
> Yeah, this is a good argue. As it is 1:1 mapping, the translated address
> is limited all the same.
>
> > I do see no good fix for that, except boost the supported gaw to
> > bigger ones.
>
> How about defaultly expose cap.PT bit? In that case, there will no 1:1
> mapping in guest side. Translation is skipped. So the IOMMU AW won't
> limit the addressing.

PT has already been on by default since the day it was added. :)

--
Peter Xu
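
To put rough numbers on the identity-mapping argument above, here is a
minimal sketch of the width comparison. It is only an illustration: the
function names are hypothetical and this is neither the kernel's nor
QEMU's actual code. It assumes, as stated in the thread, that the guest's
identity domain uses a static 48-bit width and that attaching a device
fails when the domain width exceeds what the vIOMMU supports.

```python
# Illustrative sketch only; the helper names are hypothetical.

# Width of the guest's static identity domain, as described in the thread.
DEFAULT_DOMAIN_ADDRESS_WIDTH = 48


def max_iova(aw_bits: int) -> int:
    """Highest address reachable with an address width of aw_bits."""
    return (1 << aw_bits) - 1


def can_attach(domain_gaw: int, iommu_gaw: int) -> bool:
    """Assumed rule: attach succeeds only if the IOMMU covers the domain width."""
    return domain_gaw <= iommu_gaw


for aw in (39, 48):
    ok = can_attach(DEFAULT_DOMAIN_ADDRESS_WIDTH, aw)
    size_gib = (1 << aw) >> 30
    print(f"x-aw-bits={aw}: covers {size_gib} GiB, identity-domain attach ok: {ok}")
```

Under these assumptions a 39-bit vIOMMU covers 512 GiB, so a 1:1 mapping
cannot reach the top of a guest with 1TB or more of memory, and the
48-bit identity domain does not fit it either.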