Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-16 Thread Roger Pau Monné
On Fri, May 16, 2025 at 10:02:22AM +0200, Jan Beulich wrote:
> On 16.05.2025 09:48, Roger Pau Monné wrote:
> > Overall, I have the impression hvm_set_mem_pinned_cacheattr() should
> > only be used while building a domain, and hence the flush can likely
> > be skipped unconditionally, regardless of the type changes.
> 
> See my patch submission, which had this remark:
> 
> "Is it really sensible to add/remove ranges once the guest is already
>  running? (If it is, limiting the scope of the flush would be nice, but
>  would require knowing dirtiness for the domain wrt the caches, which
>  currently we don't track.)"

Well, I had an extra patch that attempts to track vCPU dirtiness wrt
the caches.

> 
> As apparently we both agree, why don't we reject requests post-creation
> then, and drop the flush? The one thing I'm uncertain about is whether
> the DM would actually have carried out the operation strictly ahead of
> the domain being first un-paused.

I've looked at QEMU, and I cannot convince myself the operation
couldn't be used against live domains; it's part of a memory listener
hook.

What is also concerning is that this seems to be completely ignored
when using AMD NPT: I can only find references to
hvm_get_mem_pinned_cacheattr() in EPT and shadow code.

Thanks, Roger.



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-16 Thread Jan Beulich
On 16.05.2025 09:48, Roger Pau Monné wrote:
> Overall, I have the impression hvm_set_mem_pinned_cacheattr() should
> only be used while building a domain, and hence the flush can likely
> be skipped unconditionally, regardless of the type changes.

See my patch submission, which had this remark:

"Is it really sensible to add/remove ranges once the guest is already
 running? (If it is, limiting the scope of the flush would be nice, but
 would require knowing dirtiness for the domain wrt the caches, which
 currently we don't track.)"

As apparently we both agree, why don't we reject requests post-creation
then, and drop the flush? The one thing I'm uncertain about is whether
the DM would actually have carried out the operation strictly ahead of
the domain being first un-paused.

Jan



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-16 Thread Roger Pau Monné
On Fri, May 16, 2025 at 09:00:42AM +0200, Jan Beulich wrote:
> On 15.05.2025 12:22, Roger Pau Monné wrote:
> > On Mon, May 12, 2025 at 05:04:56PM +0200, Jan Beulich wrote:
> >> On 06.05.2025 10:31, Roger Pau Monne wrote:
> >>> The current logic partially open-codes memory_type_changed(), but doesn't
> >>> check whether the type change or the cache flush is actually needed.
> >>> Instead switch to using memory_type_changed(), at the possible cost of
> >>> no longer issuing cache flushes exclusively when limiting cacheability.
> >>>
> >>> However using memory_type_changed() has the benefit of limiting
> >>> recalculations and cache flushes to strictly only when it's meaningful due
> >>> to the guest configuration, like having devices or IO regions assigned.
> >>>
> >>> Signed-off-by: Roger Pau Monné 
> >>
> >> Hmm, I'm not convinced this is a win. This operation isn't normally used on
> >> a running guest, aiui.
> >>
> >> What's more, this heavily conflicts with a patch posted (again) over 2
> >> years ago [1]. Unless there's something severely wrong with that (and
> >> unless your patch would make that old one unnecessary), I'm again of the
> >> opinion that that one should go in first. It is becoming increasingly
> >> noticeable that it's unhelpful if posted patches aren't being looked at.
> > 
> > I'm happy for your change to go in first, but I think that
> > memory_type_changed() should be adjusted to contain the extra checks
> > that you add in your patch, and then hvm_set_mem_pinned_cacheattr()
> > should be switched to use memory_type_changed() rather than
> > open-coding it.
> 
> Maybe, but see my other reply to your patch.
> 
> >  For one, it would miss the adjustment done to
> > memory_type_changed() to only flush the cache when there's a
> > meaningful change to the p2m (iow: p2m_memory_type_changed() is not a
> > no-op).
> 
> That could also be mirrored here, if there were (remaining) reasons to avoid
> use of memory_type_changed() for this function.

Indeed, but that's just more open-coding of memory_type_changed() in the
context of hvm_set_mem_pinned_cacheattr().  I don't know exactly how this
function is used; it seems like it should mostly be called at domain
build time, or maybe when hotplugging a device, so not very often.

Given how complicated cache management is, I would prefer the logic to
be limited to a few places (like memory_type_changed()), instead of
having open-coded implementations of the same logic in multiple places.

Thanks, Roger.



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-16 Thread Roger Pau Monné
On Fri, May 16, 2025 at 08:58:39AM +0200, Jan Beulich wrote:
> On 06.05.2025 10:31, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/mtrr.c
> > +++ b/xen/arch/x86/hvm/mtrr.c
> > @@ -605,22 +605,8 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
> >  
> >  type = range->type;
> >  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
> > -p2m_memory_type_changed(d);
> > -switch ( type )
> > -{
> > -case X86_MT_UCM:
> > -/*
> > - * For EPT we can also avoid the flush in this case;
> > - * see epte_get_entry_emt().
> > - */
> > -if ( hap_enabled(d) && cpu_has_vmx )
> > -case X86_MT_UC:
> > -break;
> > -/* fall through */
> > -default:
> > -flush_all(FLUSH_CACHE);
> > -break;
> > -}
> > +memory_type_changed(d);
> > +
> >  return 0;
> >  }
> 
> Hmm, so your consideration to avoid the "goto" in my patch was perhaps this
> part of your change, where you expect the "return 0" could then stay here.
Maybe. However, your removing the switch() means we'd then also flush in
> cases where currently (or with my change) we avoid doing so.

Oh, yes, just replied to your previous email with this suggestion.

I don't have a strong opinion, but I don't think the fine-grained
flush avoidance is really required.  What's more, if we want to call
memory_type_changed() we must do so for any changes, as the call to
p2m_memory_type_changed() must be done unconditionally regardless of
whether the specific MTRR type change could allow us to avoid the
flush.

Overall, I have the impression hvm_set_mem_pinned_cacheattr() should
only be used while building a domain, and hence the flush can likely
be skipped unconditionally, regardless of the type changes.

Thanks, Roger.



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-16 Thread Jan Beulich
On 15.05.2025 12:22, Roger Pau Monné wrote:
> On Mon, May 12, 2025 at 05:04:56PM +0200, Jan Beulich wrote:
>> On 06.05.2025 10:31, Roger Pau Monne wrote:
>>> The current logic partially open-codes memory_type_changed(), but doesn't
>>> check whether the type change or the cache flush is actually needed.
>>> Instead switch to using memory_type_changed(), at the possible cost of
>>> no longer issuing cache flushes exclusively when limiting cacheability.
>>>
>>> However using memory_type_changed() has the benefit of limiting
>>> recalculations and cache flushes to strictly only when it's meaningful due
>>> to the guest configuration, like having devices or IO regions assigned.
>>>
>>> Signed-off-by: Roger Pau Monné 
>>
>> Hmm, I'm not convinced this is a win. This operation isn't normally used on
>> a running guest, aiui.
>>
>> What's more, this heavily conflicts with a patch posted (again) over 2 years
>> ago [1]. Unless there's something severely wrong with that (and unless your
>> patch would make that old one unnecessary), I'm again of the opinion that
>> that one should go in first. It is becoming increasingly noticeable that it's
>> unhelpful if posted patches aren't being looked at.
> 
> I'm happy for your change to go in first, but I think that
> memory_type_changed() should be adjusted to contain the extra checks
> that you add in your patch, and then hvm_set_mem_pinned_cacheattr()
> should be switched to use memory_type_changed() rather than
> open-coding it.

Maybe, but see my other reply to your patch.

>  For one, it would miss the adjustment done to
> memory_type_changed() to only flush the cache when there's a
> meaningful change to the p2m (iow: p2m_memory_type_changed() is not a
> no-op).

That could also be mirrored here, if there were (remaining) reasons to avoid
use of memory_type_changed() for this function.

Jan



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-15 Thread Jan Beulich
On 06.05.2025 10:31, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -605,22 +605,8 @@ int hvm_set_mem_pinned_cacheattr(struct domain *d, uint64_t gfn_start,
>  
>  type = range->type;
>  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
> -p2m_memory_type_changed(d);
> -switch ( type )
> -{
> -case X86_MT_UCM:
> -/*
> - * For EPT we can also avoid the flush in this case;
> - * see epte_get_entry_emt().
> - */
> -if ( hap_enabled(d) && cpu_has_vmx )
> -case X86_MT_UC:
> -break;
> -/* fall through */
> -default:
> -flush_all(FLUSH_CACHE);
> -break;
> -}
> +memory_type_changed(d);
> +
>  return 0;
>  }

Hmm, so your consideration to avoid the "goto" in my patch was perhaps this
part of your change, where you expect the "return 0" could then stay here.
Maybe. However, your removing the switch() means we'd then also flush in
cases where currently (or with my change) we avoid doing so.

Jan



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-15 Thread Roger Pau Monné
On Mon, May 12, 2025 at 05:04:56PM +0200, Jan Beulich wrote:
> On 06.05.2025 10:31, Roger Pau Monne wrote:
> > The current logic partially open-codes memory_type_changed(), but doesn't
> > check whether the type change or the cache flush is actually needed.
> > Instead switch to using memory_type_changed(), at the possible cost of
> > no longer issuing cache flushes exclusively when limiting cacheability.
> > 
> > However using memory_type_changed() has the benefit of limiting
> > recalculations and cache flushes to strictly only when it's meaningful due
> > to the guest configuration, like having devices or IO regions assigned.
> > 
> > Signed-off-by: Roger Pau Monné 
> 
> Hmm, I'm not convinced this is a win. This operation isn't normally used on
> a running guest, aiui.
> 
> What's more, this heavily conflicts with a patch posted (again) over 2 years
> ago [1]. Unless there's something severely wrong with that (and unless your
> patch would make that old one unnecessary), I'm again of the opinion that
> that one should go in first. It is becoming increasingly noticeable that it's
> unhelpful if posted patches aren't being looked at.

I'm happy for your change to go in first, but I think that
memory_type_changed() should be adjusted to contain the extra checks
that you add in your patch, and then hvm_set_mem_pinned_cacheattr()
should be switched to use memory_type_changed() rather than
open-coding it.  For one, it would miss the adjustment done to
memory_type_changed() to only flush the cache when there's a
meaningful change to the p2m (iow: p2m_memory_type_changed() is not a
no-op).

Thanks, Roger.



Re: [PATCH 5/9] x86/mtrr: use memory_type_changed() in hvm_set_mem_pinned_cacheattr()

2025-05-12 Thread Jan Beulich
On 06.05.2025 10:31, Roger Pau Monne wrote:
> The current logic partially open-codes memory_type_changed(), but doesn't
> check whether the type change or the cache flush is actually needed.
> Instead switch to using memory_type_changed(), at the possible cost of
> no longer issuing cache flushes exclusively when limiting cacheability.
> 
> However using memory_type_changed() has the benefit of limiting
> recalculations and cache flushes to strictly only when it's meaningful due
> to the guest configuration, like having devices or IO regions assigned.
> 
> Signed-off-by: Roger Pau Monné 

Hmm, I'm not convinced this is a win. This operation isn't normally used on
a running guest, aiui.

What's more, this heavily conflicts with a patch posted (again) over 2 years
ago [1]. Unless there's something severely wrong with that (and unless your
patch would make that old one unnecessary), I'm again of the opinion that
that one should go in first. It is becoming increasingly noticeable that it's
unhelpful if posted patches aren't being looked at.

Jan

[1] https://lists.xen.org/archives/html/xen-devel/2023-03/msg01551.html