On Thu, Nov 14, 2013 at 05:42:26PM +0200, Avi Kivity wrote: > On 11/14/2013 05:37 PM, Michael S. Tsirkin wrote: > >On Thu, Nov 14, 2013 at 04:56:43PM +0200, Avi Kivity wrote: > >>On 11/14/2013 04:40 PM, Michael S. Tsirkin wrote: > >>>On Thu, Nov 14, 2013 at 08:54:11AM +0000, Avi Kivity wrote: > >>>>Michael S. Tsirkin <mst <at> redhat.com> writes: > >>>> > >>>>>At the moment, memory radix tree is already variable width, but it can > >>>>>only skip the low bits of address. > >>>>> > >>>>>This is efficient if we have huge memory regions but inefficient if we > >>>>>are only using a tiny portion of the address space. > >>>>> > >>>>>After we have built up the map, detect > >>>>>configurations where a single L2 entry is valid. > >>>>> > >>>>>We then speed up the lookup by skipping one or more levels. > >>>>>In case any levels were skipped, we might end up in a valid section > >>>>>instead of erroring out. We handle this by checking that > >>>>>the address is in range of the resulting section. > >>>>> > >>>>I think this is overkill. It can be done in a simpler way as follows: > >>>> > >>>> > >>>>phys_page_find(RadixTree* tr, hwaddr index, ...) > >>>>{ > >>>> if (index & rt->invalid_index_mask) { > >>>> // not found > >>>> } > >>>> lp = rt->root; > >>>> for (i = rt->nb_levels - 1; i >= 0 && !lp.is_leaf; --i) { > >>>> ... > >>>> > >>>>This exploits the fact the lower portion of the address space is always > >>>>filled, at least in the cases that matter to us. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>Basically skip unused high bits? > >>>Sure. > >>>In fact I think both optimizations can be combined. > >>Not much value in combining them -- the variable level tree check > >>will be dominated by the level skip logic. > >> > >>IMO however skipping intermediate levels will be too rare to justify > >>the complexity and the doubling of the page table size -- it can > >>only happen in iommu setups that place memory in very high > >>addresses. These ought to be rare. > >> > >Well maybe not very high address, but you can have a device with > >a 64 bit bar and this will add back levels (though it would not > >be slower than it is currently). > > > >I agree the simplicity is attractive. > > > >However I really would like some logic that can handle > 1 leaf > >somehow. > > > >Specifically both tricks break if we add a full 64 bit io region > >in order to return -1 for reads on master abort. > > That can be achieved by directing a missed lookup to a catch-all region. > > It would be best to do this completely internally to the memory > code, without API changes.
You mean e.g. add a catch-all region to the address space?