Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/19/18 4:17 PM, Razvan Cojocaru wrote: > On 11/19/18 5:58 PM, George Dunlap wrote: >> On 11/17/18 6:58 PM, Razvan Cojocaru wrote: >>> On 11/16/18 9:50 PM, Razvan Cojocaru wrote: On 11/16/18 7:59 PM, George Dunlap wrote: > On the other hand, we want the logdirty rangesets to actually match the > host's rangesets; using altp2m->max_mapped_pfn for this is clearly > wrong. The easiest fix would be just to explicitly use the host's > max_mapped_pfn when calculating the clipping. A more complete fix would > involve calculating two different ranges -- a "rangeset" range and a > "invalidate" range, the second of which would be clipped on altp2ms by > {min,max}_remapped_gfn. > > Something like the attached (compile-tested only). I'm partial to > having both patches applied, but I'd be open to arguments that we should > only use the first. Thanks! I haven't yet been able to think in depth about the logic, but I did manage to apply them. Just applying the first one allows me to set p2m->max_mapped_pfn = 0; without the ASSERT() crashing the hypervisor, and everything appears to work well. With both patches applies, the display remains frozen (things appear to behave - externally - in the same way as before the series). >>> >>> Right, I know why the second patch keeps the display frozen. It's >>> because for altp2ms (where it matters most), the patch basically does >>> invalidate_start = max(invalidate_start, p2m->min_remapped_gfn) and >>> invalidate_end = min(invalidate_end, p2m->max_remapped_gfn). >>> >>> However, as previously requested, I've now made p2m->max_remapped_gfn >>> begin life as 0 for altp2ms, and p2m->min_remapped_gfn is initialized to >>> INVALID_GFN, which is decimal 18446744073709551615. So we get >>> invalidate_end: 0, invalidate_start: 18446744073709551615, >>> invalidate_end < invalidate_start, resulting in nothing being done for >>> altp2ms, which is functionally back to square one. >> >> But this doesn't explain why my reasoning was wrong; and it's always >> dangerous to use a system whose behavior you don't really understand, >> even if it seems to work. >> >> It turns out I'd made a mistake in saying that altp2m->max_mapped_pfn == >> alt2m->max_remapped_gfn. The first is the highest gfn present in the >> altp2m, either copied from the hostp2m or changed; the second is the >> highest value changed (via p2m_altp2m_change_gfn()). >> >> What about using the attached patch instead? > > The attached patch does work, thanks! Shall I append them both to the > end of the series and send out V7? Yes, please. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/19/18 5:58 PM, George Dunlap wrote: > On 11/17/18 6:58 PM, Razvan Cojocaru wrote: >> On 11/16/18 9:50 PM, Razvan Cojocaru wrote: >>> On 11/16/18 7:59 PM, George Dunlap wrote: On the other hand, we want the logdirty rangesets to actually match the host's rangesets; using altp2m->max_mapped_pfn for this is clearly wrong. The easiest fix would be just to explicitly use the host's max_mapped_pfn when calculating the clipping. A more complete fix would involve calculating two different ranges -- a "rangeset" range and a "invalidate" range, the second of which would be clipped on altp2ms by {min,max}_remapped_gfn. Something like the attached (compile-tested only). I'm partial to having both patches applied, but I'd be open to arguments that we should only use the first. >>> >>> Thanks! I haven't yet been able to think in depth about the logic, but I >>> did manage to apply them. Just applying the first one allows me to set >>> p2m->max_mapped_pfn = 0; without the ASSERT() crashing the hypervisor, >>> and everything appears to work well. >>> >>> With both patches applies, the display remains frozen (things appear to >>> behave - externally - in the same way as before the series). >> >> Right, I know why the second patch keeps the display frozen. It's >> because for altp2ms (where it matters most), the patch basically does >> invalidate_start = max(invalidate_start, p2m->min_remapped_gfn) and >> invalidate_end = min(invalidate_end, p2m->max_remapped_gfn). >> >> However, as previously requested, I've now made p2m->max_remapped_gfn >> begin life as 0 for altp2ms, and p2m->min_remapped_gfn is initialized to >> INVALID_GFN, which is decimal 18446744073709551615. So we get >> invalidate_end: 0, invalidate_start: 18446744073709551615, >> invalidate_end < invalidate_start, resulting in nothing being done for >> altp2ms, which is functionally back to square one. > > But this doesn't explain why my reasoning was wrong; and it's always > dangerous to use a system whose behavior you don't really understand, > even if it seems to work. > > It turns out I'd made a mistake in saying that altp2m->max_mapped_pfn == > alt2m->max_remapped_gfn. The first is the highest gfn present in the > altp2m, either copied from the hostp2m or changed; the second is the > highest value changed (via p2m_altp2m_change_gfn()). > > What about using the attached patch instead? The attached patch does work, thanks! Shall I append them both to the end of the series and send out V7? Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/17/18 6:58 PM, Razvan Cojocaru wrote: > On 11/16/18 9:50 PM, Razvan Cojocaru wrote: >> On 11/16/18 7:59 PM, George Dunlap wrote: >>> On the other hand, we want the logdirty rangesets to actually match the >>> host's rangesets; using altp2m->max_mapped_pfn for this is clearly >>> wrong. The easiest fix would be just to explicitly use the host's >>> max_mapped_pfn when calculating the clipping. A more complete fix would >>> involve calculating two different ranges -- a "rangeset" range and a >>> "invalidate" range, the second of which would be clipped on altp2ms by >>> {min,max}_remapped_gfn. >>> >>> Something like the attached (compile-tested only). I'm partial to >>> having both patches applied, but I'd be open to arguments that we should >>> only use the first. >> >> Thanks! I haven't yet been able to think in depth about the logic, but I >> did manage to apply them. Just applying the first one allows me to set >> p2m->max_mapped_pfn = 0; without the ASSERT() crashing the hypervisor, >> and everything appears to work well. >> >> With both patches applies, the display remains frozen (things appear to >> behave - externally - in the same way as before the series). > > Right, I know why the second patch keeps the display frozen. It's > because for altp2ms (where it matters most), the patch basically does > invalidate_start = max(invalidate_start, p2m->min_remapped_gfn) and > invalidate_end = min(invalidate_end, p2m->max_remapped_gfn). > > However, as previously requested, I've now made p2m->max_remapped_gfn > begin life as 0 for altp2ms, and p2m->min_remapped_gfn is initialized to > INVALID_GFN, which is decimal 18446744073709551615. So we get > invalidate_end: 0, invalidate_start: 18446744073709551615, > invalidate_end < invalidate_start, resulting in nothing being done for > altp2ms, which is functionally back to square one. But this doesn't explain why my reasoning was wrong; and it's always dangerous to use a system whose behavior you don't really understand, even if it seems to work. It turns out I'd made a mistake in saying that altp2m->max_mapped_pfn == alt2m->max_remapped_gfn. The first is the highest gfn present in the altp2m, either copied from the hostp2m or changed; the second is the highest value changed (via p2m_altp2m_change_gfn()). What about using the attached patch instead? -George From f4c72ecb95cfa5d597b8e28e99e681cfeaa32199 Mon Sep 17 00:00:00 2001 From: George Dunlap Date: Fri, 16 Nov 2018 16:28:25 + Subject: [PATCH 2/2] p2m: change_range_type: Only invalidate mapped gfns change_range_type() invalidates gfn ranges to lazily change the type of a range of gfns, and also modifies the logdirty rangesets of that p2m. At the moment, it clips both down by the hostp2m. While this will result in correct behavior, it's not entirely efficient, since invalidated entries outside that range will, on fault, simply be modified back to "empty" before faulting normally again. Separate out the calculation of the two ranges. Keep using the hostp2m's max_mapped_pfn to clip the logdirty ranges, but use the current p2m's max_mapped_pfn to further clip the invalidation range for alternate p2ms. Signed-off-by: George Dunlap --- xen/arch/x86/mm/p2m.c | 61 +-- 1 file changed, 42 insertions(+), 19 deletions(-) diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index 6d764d1e22..b97e138452 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1043,41 +1043,64 @@ static void change_type_range(struct p2m_domain *p2m, p2m_type_t ot, p2m_type_t nt) { unsigned long rangeset_start, rangeset_end; +unsigned long invalidate_start, invalidate_end; struct domain *d = p2m->domain; unsigned long host_max_pfn = p2m_get_hostp2m(d)->max_mapped_pfn; +unsigned long max_pfn = p2m->max_mapped_pfn; int rc = 0; -rangeset_start = start; -rangeset_end = end - 1; +/* + * If we have an altp2m, the logdirty rangeset range needs to + * match that of the hostp2m, but for efficiency, we want to clip + * down the the invalidation range according to the mapped values + * in the altp2m. Keep track of and clip the ranges separately. + */ +rangeset_start = invalidate_start = start; +rangeset_end = invalidate_end = end - 1; -/* Always clip the rangeset down to the host p2m */ +/* Clip down to the host p2m */ if ( unlikely(rangeset_end > host_max_pfn) ) -rangeset_end = host_max_pfn; +rangeset_end = invalidate_end = host_max_pfn; /* If the requested range is out of scope, return doing nothing */ if ( rangeset_start > rangeset_end ) return; +if ( p2m_is_altp2m(p2m) ) +invalidate_end = min(invalidate_end, max_pfn); + p2m->defer_nested_flush = 1; /* - * If all valid gfns are in the invalidation range, just do a - * global type change.
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/19/18 2:42 PM, Jan Beulich wrote: On 16.11.18 at 16:52, wrote: >> On 11/16/18 3:05 PM, George Dunlap wrote: >>> Now take change_type_range. The global effect of change_type_range >>> should be that reads of the p2m which happen afterwards should have the >>> new, changed value. >>> >>> In case A, change_type_range will write invalid entries up to >>> max_remapped_pfn, leaving the range between max_remapped_pfn and >>> hostp2m->max_mapped_pfn invalid. When a gfn in this range is read, an >>> EPT fault will happen, p2m_altp2m_lazy_copy() will be called, and the >>> new (correct) value copied from the hostp2m. >>> >>> In case B, change_type_range will write invalid entries up until >>> hostp2m->max_mapped_pfn. When a gfn in this range is accessed, a >>> MISCONFIG fault will happen, and the correct value will be calculated in >>> resolve_misconfig. >> >> Or, no: resolve_misconfig() will read the current type in the altp2m, >> which will be p2m_invalid; p2m_recalc_type() will then return >> p2m_invalid, and then set the entry to a "plain" invalid without the >> reserved bit set. Then the fault will happen again, taking the normal >> HAP fault path (calling p2m_altp2m_lazy_copy()), at which point... >> >>> And at this point, I realize that my previous analysis was probably >>> wrong, because at this point altp2m->max_remapped_gfn will be wrong: >>> entries above max_remapped_gfn will have become valid without going >>> through p2m_altp2m_lazy_copy(). >> >> ...altp2m->max_remapped_gfn will be set appropriately. >> >> I think. :-/ > > And I think I can follow your analysis now, which in turn means I > think your two patches are going to help, but for a proper review > I'd prefer them to be inline in a mail, rather than attachments. Please see my review of the second patch, which won't work with my series. Also, the patches George has sent are applicable after my series has been applied - would you and George like me to append his first patch to the end of my series, or would we be better off if he sends his patch first for the current code, and I rebase my series on that? Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
>>> On 16.11.18 at 16:52, wrote: > On 11/16/18 3:05 PM, George Dunlap wrote: >> Now take change_type_range. The global effect of change_type_range >> should be that reads of the p2m which happen afterwards should have the >> new, changed value. >> >> In case A, change_type_range will write invalid entries up to >> max_remapped_pfn, leaving the range between max_remapped_pfn and >> hostp2m->max_mapped_pfn invalid. When a gfn in this range is read, an >> EPT fault will happen, p2m_altp2m_lazy_copy() will be called, and the >> new (correct) value copied from the hostp2m. >> >> In case B, change_type_range will write invalid entries up until >> hostp2m->max_mapped_pfn. When a gfn in this range is accessed, a >> MISCONFIG fault will happen, and the correct value will be calculated in >> resolve_misconfig. > > Or, no: resolve_misconfig() will read the current type in the altp2m, > which will be p2m_invalid; p2m_recalc_type() will then return > p2m_invalid, and then set the entry to a "plain" invalid without the > reserved bit set. Then the fault will happen again, taking the normal > HAP fault path (calling p2m_altp2m_lazy_copy()), at which point... > >> And at this point, I realize that my previous analysis was probably >> wrong, because at this point altp2m->max_remapped_gfn will be wrong: >> entries above max_remapped_gfn will have become valid without going >> through p2m_altp2m_lazy_copy(). > > ...altp2m->max_remapped_gfn will be set appropriately. > > I think. :-/ And I think I can follow your analysis now, which in turn means I think your two patches are going to help, but for a proper review I'd prefer them to be inline in a mail, rather than attachments. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 9:50 PM, Razvan Cojocaru wrote: > On 11/16/18 7:59 PM, George Dunlap wrote: >> On the other hand, we want the logdirty rangesets to actually match the >> host's rangesets; using altp2m->max_mapped_pfn for this is clearly >> wrong. The easiest fix would be just to explicitly use the host's >> max_mapped_pfn when calculating the clipping. A more complete fix would >> involve calculating two different ranges -- a "rangeset" range and a >> "invalidate" range, the second of which would be clipped on altp2ms by >> {min,max}_remapped_gfn. >> >> Something like the attached (compile-tested only). I'm partial to >> having both patches applied, but I'd be open to arguments that we should >> only use the first. > > Thanks! I haven't yet been able to think in depth about the logic, but I > did manage to apply them. Just applying the first one allows me to set > p2m->max_mapped_pfn = 0; without the ASSERT() crashing the hypervisor, > and everything appears to work well. > > With both patches applies, the display remains frozen (things appear to > behave - externally - in the same way as before the series). Right, I know why the second patch keeps the display frozen. It's because for altp2ms (where it matters most), the patch basically does invalidate_start = max(invalidate_start, p2m->min_remapped_gfn) and invalidate_end = min(invalidate_end, p2m->max_remapped_gfn). However, as previously requested, I've now made p2m->max_remapped_gfn begin life as 0 for altp2ms, and p2m->min_remapped_gfn is initialized to INVALID_GFN, which is decimal 18446744073709551615. So we get invalidate_end: 0, invalidate_start: 18446744073709551615, invalidate_end < invalidate_start, resulting in nothing being done for altp2ms, which is functionally back to square one. In light of this analysis, I suppose I should now also disregard the recommendation to clip first_gfn and max_nr to fit within p2m->min_remapped_gfn and p2m->max_remapped_gfn for altp2ms in finish_type_change(), for obvious reasons - unless you prefer a different strategy. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 7:59 PM, George Dunlap wrote: > On 11/16/18 2:10 PM, Razvan Cojocaru wrote: >> On 11/16/18 2:03 PM, George Dunlap wrote: >>> The code is definitely complicated enough, though, that I may have >>> missed something, which is why I asked Razvan if there was a reason he >>> changed it. >>> >>> For the purposes of this patch, I propose having p2m_altp2m_init_ept() >>> set max_mapped_pfn to 0 (if that works), and leaving "get rid of >>> max_remapped_pfn" for a future clean-up series. >> >> I've retraced my previous analysis and re-ran some tests, and I now >> remember (sorry it took a while) why the p2m->max_mapped_pfn = >> hostp2m->max_mapped_pfn was both necessary and not accidental. >> >> Let's say we set it to 0 in p2m_altp2m_init_ept(). Then, >> hap_track_dirty_vram() calls p2m_change_type_range(), which calls the >> newly added change_type_range(). >> >> Change_type_range() looks like this: >> >> static void change_type_range(struct p2m_domain *p2m, >> unsigned long start, unsigned long end, >> p2m_type_t ot, p2m_type_t nt) >> { >> unsigned long gfn = start; >> struct domain *d = p2m->domain; >> int rc = 0; >> >> p2m->defer_nested_flush = 1; >> >> if ( unlikely(end > p2m->max_mapped_pfn) ) >> { >> if ( !gfn ) >> { >> p2m->change_entry_type_global(p2m, ot, nt); >> gfn = end; >> } >> end = p2m->max_mapped_pfn + 1; >> } >> if ( gfn < end ) >> rc = p2m->change_entry_type_range(p2m, ot, nt, gfn, end - 1); >> if ( rc ) >> { >> printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from >> %d to %d\n", >>rc, d->domain_id, start, end - 1, ot, nt); >> domain_crash(d); >> } >> >> switch ( nt ) >> { >> case p2m_ram_rw: >> if ( ot == p2m_ram_logdirty ) >> rc = rangeset_remove_range(p2m->logdirty_ranges, start, end >> - 1); >> break; >> case p2m_ram_logdirty: >> if ( ot == p2m_ram_rw ) >> rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1); >> break; >> default: >> break; >> } >> if ( rc ) >> { >> printk(XENLOG_G_ERR "Error %d manipulating Dom%d's log-dirty >> ranges\n", >>rc, d->domain_id); >> domain_crash(d); >> } >> >> p2m->defer_nested_flush = 0; >> if ( nestedhvm_enabled(d) ) >> p2m_flush_nestedp2m(d); >> } >> >> If we set p2m->max_mapped_pfn to 0, we're guaranteed to run into the if >> ( unlikely(end > p2m->max_mapped_pfn) ) body, where end = >> p2m->max_mapped_pfn + 1; will make end 1. >> >> Then, we will crash the hypervisor in rangeset_add_range(), where >> there's an ASSERT() stating that start <= end. > > Ah, right, this was the original crash that you ran into several months > ago, which flagged up the whole logdirty range synchronization issue. > > But that's partly a logic hole in change_entry_type_range(), which > assumes that start < p2m->max_mapped_pfn. It would be better to fix > that than to work around it by changing the meaning of max_mapped_pfn. > > On the other hand, we want the logdirty rangesets to actually match the > host's rangesets; using altp2m->max_mapped_pfn for this is clearly > wrong. The easiest fix would be just to explicitly use the host's > max_mapped_pfn when calculating the clipping. A more complete fix would > involve calculating two different ranges -- a "rangeset" range and a > "invalidate" range, the second of which would be clipped on altp2ms by > {min,max}_remapped_gfn. > > Something like the attached (compile-tested only). I'm partial to > having both patches applied, but I'd be open to arguments that we should > only use the first. Thanks! I haven't yet been able to think in depth about the logic, but I did manage to apply them. Just applying the first one allows me to set p2m->max_mapped_pfn = 0; without the ASSERT() crashing the hypervisor, and everything appears to work well. With both patches applies, the display remains frozen (things appear to behave - externally - in the same way as before the series). Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 2:10 PM, Razvan Cojocaru wrote: > On 11/16/18 2:03 PM, George Dunlap wrote: >> The code is definitely complicated enough, though, that I may have >> missed something, which is why I asked Razvan if there was a reason he >> changed it. >> >> For the purposes of this patch, I propose having p2m_altp2m_init_ept() >> set max_mapped_pfn to 0 (if that works), and leaving "get rid of >> max_remapped_pfn" for a future clean-up series. > > I've retraced my previous analysis and re-ran some tests, and I now > remember (sorry it took a while) why the p2m->max_mapped_pfn = > hostp2m->max_mapped_pfn was both necessary and not accidental. > > Let's say we set it to 0 in p2m_altp2m_init_ept(). Then, > hap_track_dirty_vram() calls p2m_change_type_range(), which calls the > newly added change_type_range(). > > Change_type_range() looks like this: > > static void change_type_range(struct p2m_domain *p2m, > unsigned long start, unsigned long end, > p2m_type_t ot, p2m_type_t nt) > { > unsigned long gfn = start; > struct domain *d = p2m->domain; > int rc = 0; > > p2m->defer_nested_flush = 1; > > if ( unlikely(end > p2m->max_mapped_pfn) ) > { > if ( !gfn ) > { > p2m->change_entry_type_global(p2m, ot, nt); > gfn = end; > } > end = p2m->max_mapped_pfn + 1; > } > if ( gfn < end ) > rc = p2m->change_entry_type_range(p2m, ot, nt, gfn, end - 1); > if ( rc ) > { > printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from > %d to %d\n", >rc, d->domain_id, start, end - 1, ot, nt); > domain_crash(d); > } > > switch ( nt ) > { > case p2m_ram_rw: > if ( ot == p2m_ram_logdirty ) > rc = rangeset_remove_range(p2m->logdirty_ranges, start, end > - 1); > break; > case p2m_ram_logdirty: > if ( ot == p2m_ram_rw ) > rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1); > break; > default: > break; > } > if ( rc ) > { > printk(XENLOG_G_ERR "Error %d manipulating Dom%d's log-dirty > ranges\n", >rc, d->domain_id); > domain_crash(d); > } > > p2m->defer_nested_flush = 0; > if ( nestedhvm_enabled(d) ) > p2m_flush_nestedp2m(d); > } > > If we set p2m->max_mapped_pfn to 0, we're guaranteed to run into the if > ( unlikely(end > p2m->max_mapped_pfn) ) body, where end = > p2m->max_mapped_pfn + 1; will make end 1. > > Then, we will crash the hypervisor in rangeset_add_range(), where > there's an ASSERT() stating that start <= end. Ah, right, this was the original crash that you ran into several months ago, which flagged up the whole logdirty range synchronization issue. But that's partly a logic hole in change_entry_type_range(), which assumes that start < p2m->max_mapped_pfn. It would be better to fix that than to work around it by changing the meaning of max_mapped_pfn. On the other hand, we want the logdirty rangesets to actually match the host's rangesets; using altp2m->max_mapped_pfn for this is clearly wrong. The easiest fix would be just to explicitly use the host's max_mapped_pfn when calculating the clipping. A more complete fix would involve calculating two different ranges -- a "rangeset" range and a "invalidate" range, the second of which would be clipped on altp2ms by {min,max}_remapped_gfn. Something like the attached (compile-tested only). I'm partial to having both patches applied, but I'd be open to arguments that we should only use the first. -George From d92bd123f92d66aef394735a6d836fd104f01867 Mon Sep 17 00:00:00 2001 From: George Dunlap Date: Fri, 16 Nov 2018 17:17:48 + Subject: [PATCH 1/2] p2m: Always use hostp2m when clipping rangesets The logdirty rangesets of the altp2ms need to be kept in sync with the hostp2m. This means when iterating through the altp2ms, we need to use the host p2m to clip the rangeset, not the indiviual altp2m's value. This change also: - Documents that the end is non-inclusive - Calculates an "inclusive" value for the end once, rather than open-coding the modification, and (worse) back-modifying updates so that the calculation ends up correct - Clarifies the logic deciding whether to call change_entry_type_global() or change_entry_type_range() - Handles the case where start >= hostp2m->max_mapped_pfn Signed-off-by: George Dunlap --- RFC: Wasn't sure what the best thing was to do if start >= host_max_pfn. We silently clip the logdirty rangeset to max_mapped_pfn, and the chosen behavior seems consistent with that. But it seems like such a request would almost certainly be a bug somewhere that people might like to find out about. --- xen/arch/x86/mm/p2m.c | 46 +++ 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/mm/p2m.c
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 3:05 PM, George Dunlap wrote: > Now take change_type_range. The global effect of change_type_range > should be that reads of the p2m which happen afterwards should have the > new, changed value. > > In case A, change_type_range will write invalid entries up to > max_remapped_pfn, leaving the range between max_remapped_pfn and > hostp2m->max_mapped_pfn invalid. When a gfn in this range is read, an > EPT fault will happen, p2m_altp2m_lazy_copy() will be called, and the > new (correct) value copied from the hostp2m. > > In case B, change_type_range will write invalid entries up until > hostp2m->max_mapped_pfn. When a gfn in this range is accessed, a > MISCONFIG fault will happen, and the correct value will be calculated in > resolve_misconfig. Or, no: resolve_misconfig() will read the current type in the altp2m, which will be p2m_invalid; p2m_recalc_type() will then return p2m_invalid, and then set the entry to a "plain" invalid without the reserved bit set. Then the fault will happen again, taking the normal HAP fault path (calling p2m_altp2m_lazy_copy()), at which point... > And at this point, I realize that my previous analysis was probably > wrong, because at this point altp2m->max_remapped_gfn will be wrong: > entries above max_remapped_gfn will have become valid without going > through p2m_altp2m_lazy_copy(). ...altp2m->max_remapped_gfn will be set appropriately. I think. :-/ -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 12:31 PM, Jan Beulich wrote: On 16.11.18 at 13:03, wrote: >> So the basic question is, does "max_mapped_pfn" mean, "Maximum pfn _for >> the domain_", or "Maximum pfn _for this p2m_". When the element was >> added to the p2m struct those were the same thing. Which do the various >> use cases expect it to be, and which would be the most robust going forward? > > So with the field getting updated right in ept_set_entry(), as long as > no copying of entries exists which does not go through that function, > I then agree that it shouldn't really matter whether the field gets > copied when setting up a new altp2m. > > However, fair parts of your further response are confusing to me, > rather than clarifying. That's for one because you talk about the > max_remapped_gfn field, but you never mention its > min_remapped_gfn sibling. The only place I could find where > current code consumes these two is p2m_altp2m_propagate_change(). > This suggests to me that both fields really only exist for optimization > purposes. > > Furthermore I in particular ... > >> I spent a bunch of time going through the code yesterday, and I'm pretty >> sure that as long as the value in the p2m is one or the other, there >> will be at the moment no _correctness_ issues. It looked to me like in >> the case where altp2m->max_mapped_pfn > altp2m->max_remapped_gfn, then >> the p2m machinery would do a certain amount of unnecessary work, but >> that's all. >> >> It also looked to me like before this patch, the value mostly ends up >> being "maximum pfn ever mapped in this p2m (even across altp2m >> desroys)". That's because when the altp2m is allocated, it starts as 0; >> every time an entry is propagated from the hostp2m to the altp2m, >> ept_set_entry() updates max_mapped_pfn; but nothing sets it back to zero. >> >> Also, hostp2m->max_mapped_pfn is never decreased, only increased. >> >> So either before or after this patch, altp2m->max_remapped_gfn <= >> altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn. >> >> In the vast majority of cases, max_mapped_pfn is explicitly being read >> from the hostp2m. >> >> Below are the cases where it may be read from an altp2m: >> >> - ept_get_entry(), ept_walk_tables(): If ==max_remapped_gfn, it will >> return INVALID_MFN early. If higher than max_remapped_gfn, it falls >> back to walking the altp2m's ept tables, which will give you the same >> answer, just a bit more slowly. >> >> - ept_dump_p2m_tables(): If ==max_remapped_gfn, it will dump only up to >> the current map; if >max_remapped_gfn, it will dump a number of >> unnecessary INVALID_MFN entries, but no more entries than the hostp2m. >> >> - p2m.c:change_type_range(): If ==max_remapped_gfn, it will only >> invalidate entries in the altp2m as high have been currently remapped. >> If >max_remapped_gfn, it will write invalid ept entries that *haven't* >> yet been copied over. But I don't think either one should cause a >> correctness issue: either way, accessing a gfn > max_remapped_gfn will >> cause a fault, at which point either the correct value will be copied >> from the hostp2m (perhaps going through resolve_misconfig() on the >> hostp2m in the process) or the correct value will be calculated via >> resolve_misconfig(). > > ... cannot identify any of the three cases above where I understand > you say a max_mapped_pfn == max_remapped_gfn comparison > happens. But as you say - the code is complicated enough, so I may > easily overlook something. Sorry, it seems I took too many shortcuts explaining things. :-) I was using max_remapped_gfn as a shorthand for, "the highest gfn mapped in the altp2m" (since that's what it will be equal to). Not that an actual comparison will happen there, but we're considering what will happen, based on various values of altp2m->max_mapped_pfn, when a gfn that is higher than the highest *remapped* gfn is encountered. So in the situation we're considering, the following are always true: - gfn > altp2m->max_remapped_gfn - altp2m->max_remapped_gfn <= altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn And we're comparing the results in the following cases: A: altp2m->max_mapped_pfn == alt2m->max_remapped_gfn B: altp2m->max_mapped_pfn > altp2m->max_remapped_gfn (Perhaps == hostp2m->max_mapped_pfn, perhaps something in between). Take ept_get_entry(). At this level, it's below all the mem_access / mem_sharing / mem_paging / altp2m abstractions. Apart from resolve_misconfig and pod, it should return the actual contents of the particular p2m it's given. In the case of an altp2m, any entries above what's currently been remapped should return empty. In case A, it will do this, because the first conditional will find that gfn > altp2m->max_mapped_pfn. In case B, it will also do this, because although it passes the first conditional, when it actually reads the table it will find an empty entry and return that. Both results are correct, but A is a tiny bit faster. Now take
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 2:03 PM, George Dunlap wrote: > The code is definitely complicated enough, though, that I may have > missed something, which is why I asked Razvan if there was a reason he > changed it. > > For the purposes of this patch, I propose having p2m_altp2m_init_ept() > set max_mapped_pfn to 0 (if that works), and leaving "get rid of > max_remapped_pfn" for a future clean-up series. I've retraced my previous analysis and re-ran some tests, and I now remember (sorry it took a while) why the p2m->max_mapped_pfn = hostp2m->max_mapped_pfn was both necessary and not accidental. Let's say we set it to 0 in p2m_altp2m_init_ept(). Then, hap_track_dirty_vram() calls p2m_change_type_range(), which calls the newly added change_type_range(). Change_type_range() looks like this: static void change_type_range(struct p2m_domain *p2m, unsigned long start, unsigned long end, p2m_type_t ot, p2m_type_t nt) { unsigned long gfn = start; struct domain *d = p2m->domain; int rc = 0; p2m->defer_nested_flush = 1; if ( unlikely(end > p2m->max_mapped_pfn) ) { if ( !gfn ) { p2m->change_entry_type_global(p2m, ot, nt); gfn = end; } end = p2m->max_mapped_pfn + 1; } if ( gfn < end ) rc = p2m->change_entry_type_range(p2m, ot, nt, gfn, end - 1); if ( rc ) { printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from %d to %d\n", rc, d->domain_id, start, end - 1, ot, nt); domain_crash(d); } switch ( nt ) { case p2m_ram_rw: if ( ot == p2m_ram_logdirty ) rc = rangeset_remove_range(p2m->logdirty_ranges, start, end - 1); break; case p2m_ram_logdirty: if ( ot == p2m_ram_rw ) rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1); break; default: break; } if ( rc ) { printk(XENLOG_G_ERR "Error %d manipulating Dom%d's log-dirty ranges\n", rc, d->domain_id); domain_crash(d); } p2m->defer_nested_flush = 0; if ( nestedhvm_enabled(d) ) p2m_flush_nestedp2m(d); } If we set p2m->max_mapped_pfn to 0, we're guaranteed to run into the if ( unlikely(end > p2m->max_mapped_pfn) ) body, where end = p2m->max_mapped_pfn + 1; will make end 1. Then, we will crash the hypervisor in rangeset_add_range(), where there's an ASSERT() stating that start <= end. So my reasoning was that, since setting p2m->max_mapped_pfn = hostp2m->max_mapped_pfn is both harmless (as both your and Jan's analyses appear to confirm) and makes this code correct, that was the way to go. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
>>> On 16.11.18 at 13:03, wrote: > So the basic question is, does "max_mapped_pfn" mean, "Maximum pfn _for > the domain_", or "Maximum pfn _for this p2m_". When the element was > added to the p2m struct those were the same thing. Which do the various > use cases expect it to be, and which would be the most robust going forward? So with the field getting updated right in ept_set_entry(), as long as no copying of entries exists which does not go through that function, I then agree that it shouldn't really matter whether the field gets copied when setting up a new altp2m. However, fair parts of your further response are confusing to me, rather than clarifying. That's for one because you talk about the max_remapped_gfn field, but you never mention its min_remapped_gfn sibling. The only place I could find where current code consumes these two is p2m_altp2m_propagate_change(). This suggests to me that both fields really only exist for optimization purposes. Furthermore I in particular ... > I spent a bunch of time going through the code yesterday, and I'm pretty > sure that as long as the value in the p2m is one or the other, there > will be at the moment no _correctness_ issues. It looked to me like in > the case where altp2m->max_mapped_pfn > altp2m->max_remapped_gfn, then > the p2m machinery would do a certain amount of unnecessary work, but > that's all. > > It also looked to me like before this patch, the value mostly ends up > being "maximum pfn ever mapped in this p2m (even across altp2m > desroys)". That's because when the altp2m is allocated, it starts as 0; > every time an entry is propagated from the hostp2m to the altp2m, > ept_set_entry() updates max_mapped_pfn; but nothing sets it back to zero. > > Also, hostp2m->max_mapped_pfn is never decreased, only increased. > > So either before or after this patch, altp2m->max_remapped_gfn <= > altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn. > > In the vast majority of cases, max_mapped_pfn is explicitly being read > from the hostp2m. > > Below are the cases where it may be read from an altp2m: > > - ept_get_entry(), ept_walk_tables(): If ==max_remapped_gfn, it will > return INVALID_MFN early. If higher than max_remapped_gfn, it falls > back to walking the altp2m's ept tables, which will give you the same > answer, just a bit more slowly. > > - ept_dump_p2m_tables(): If ==max_remapped_gfn, it will dump only up to > the current map; if >max_remapped_gfn, it will dump a number of > unnecessary INVALID_MFN entries, but no more entries than the hostp2m. > > - p2m.c:change_type_range(): If ==max_remapped_gfn, it will only > invalidate entries in the altp2m as high have been currently remapped. > If >max_remapped_gfn, it will write invalid ept entries that *haven't* > yet been copied over. But I don't think either one should cause a > correctness issue: either way, accessing a gfn > max_remapped_gfn will > cause a fault, at which point either the correct value will be copied > from the hostp2m (perhaps going through resolve_misconfig() on the > hostp2m in the process) or the correct value will be calculated via > resolve_misconfig(). ... cannot identify any of the three cases above where I understand you say a max_mapped_pfn == max_remapped_gfn comparison happens. But as you say - the code is complicated enough, so I may easily overlook something. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 2:03 PM, George Dunlap wrote: > On 11/16/18 10:08 AM, Jan Beulich wrote: > On 15.11.18 at 18:54, wrote: >>> On 11/15/18 7:34 PM, George Dunlap wrote: > On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru > wrote: > @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, > unsigned int i) > struct p2m_domain *hostp2m = p2m_get_hostp2m(d); > struct ept_data *ept; > > +p2m->max_mapped_pfn = hostp2m->max_mapped_pfn; Why we need to copy this value? I’ve just spent the afternoon tracing around the source code, and while I’m >>> pretty sure it won’t hurt, I’m also not sure why it’s necessary. >>> >>> I think I did that because I assumed that it would matter for >>> ept_get_entry() and misconfigurations, when: >>> >>> 954 /* This pfn is higher than the highest the p2m map currently >>> holds */ >>> 955 if ( gfn > p2m->max_mapped_pfn ) >> >> The question is whether under any condition such checks require that >> the accumulated value be in sync between the host and the various >> alt p2m-s. >> >>> 956 { >>> 957 for ( i = ept->wl; i > 0; --i ) >>> 958 if ( (gfn & ~((1UL << (i * EPT_TABLE_ORDER)) - 1)) > >>> 959 p2m->max_mapped_pfn ) >>> 960 break; >>> 961 goto out; >>> 962 } >>> >>> but I am not certain it is required either. I can certainly remove this >>> assignment and see if anything breaks. >> >> I've seen you mention such or alike a couple of times now while >> discussing this series, and I have to admit that I'm a little frightened: >> Making a change just based on the observation that nothing breaks >> is setting us up for breakage in some corner case. That is - seeing >> something break is a good indication that a change was wrong, but >> not seeing any breakage doesn't alone justify a change. >> >> In the particular case here I think whether the copying of the field >> is needed depends on what else is being copied (I'm sorry, I'm not >> that familiar with altp2m): If mappings get cloned from the host p2m, >> then this value ought to be cloned too. For any mappings that >> might be kept in sync between p2m-s, I think this field also would >> need to be updated in sync. > > So the basic question is, does "max_mapped_pfn" mean, "Maximum pfn _for > the domain_", or "Maximum pfn _for this p2m_". When the element was > added to the p2m struct those were the same thing. Which do the various > use cases expect it to be, and which would be the most robust going forward? > > I spent a bunch of time going through the code yesterday, and I'm pretty > sure that as long as the value in the p2m is one or the other, there > will be at the moment no _correctness_ issues. It looked to me like in > the case where altp2m->max_mapped_pfn > altp2m->max_remapped_gfn, then > the p2m machinery would do a certain amount of unnecessary work, but > that's all. > > It also looked to me like before this patch, the value mostly ends up > being "maximum pfn ever mapped in this p2m (even across altp2m > desroys)". That's because when the altp2m is allocated, it starts as 0; > every time an entry is propagated from the hostp2m to the altp2m, > ept_set_entry() updates max_mapped_pfn; but nothing sets it back to zero. > > Also, hostp2m->max_mapped_pfn is never decreased, only increased. > > So either before or after this patch, altp2m->max_remapped_gfn <= > altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn. > > In the vast majority of cases, max_mapped_pfn is explicitly being read > from the hostp2m. > > Below are the cases where it may be read from an altp2m: > > - ept_get_entry(), ept_walk_tables(): If ==max_remapped_gfn, it will > return INVALID_MFN early. If higher than max_remapped_gfn, it falls > back to walking the altp2m's ept tables, which will give you the same > answer, just a bit more slowly. > > - ept_dump_p2m_tables(): If ==max_remapped_gfn, it will dump only up to > the current map; if >max_remapped_gfn, it will dump a number of > unnecessary INVALID_MFN entries, but no more entries than the hostp2m. > > - p2m.c:change_type_range(): If ==max_remapped_gfn, it will only > invalidate entries in the altp2m as high have been currently remapped. > If >max_remapped_gfn, it will write invalid ept entries that *haven't* > yet been copied over. But I don't think either one should cause a > correctness issue: either way, accessing a gfn > max_remapped_gfn will > cause a fault, at which point either the correct value will be copied > from the hostp2m (perhaps going through resolve_misconfig() on the > hostp2m in the process) or the correct value will be calculated via > resolve_misconfig(). > > So, it seemed from my investigation like it would be more useful if we > got rid of altp2m->max_remapped_gfn, and used atlp2m->max_mapped_pfn > instead. If people want to know the maximum gfn mapped by the domain as > a whole, they should just use
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/15/18 8:51 PM, Razvan Cojocaru wrote: > On 11/15/18 10:26 PM, George Dunlap wrote: >> >> >>> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru >>> wrote: >>> >>> When an new altp2m view is created very early on guest boot, the >>> display will freeze (although the guest will run normally). This >>> may also happen on resizing the display. The reason is the way >>> Xen currently (mis)handles logdirty VGA: it intentionally >>> misconfigures VGA pages so that they will fault. >>> >>> The problem is that it only does this in the host p2m. Once we >>> switch to a new altp2m, the misconfigured entries will no longer >>> fault, so the display will not be updated. >>> >>> This patch: >>> * updates ept_handle_misconfig() to use the active altp2m instead >>> of the hostp2m; >>> * modifies p2m_change_entry_type_global(), p2m_memory_type_changed >>> and p2m_change_type_range() to propagate their changes to all >>> valid altp2ms. >>> >>> With the introduction of altp2m fields in p2m_memory_type_changed() >>> the whole function has been put under CONFIG_HVM. >> >> Sorry — actually, I think there’s yet another issue lurking here: >> p2m_finish_type_change(). I’ll poke around a bit more tomorrow. > > OK, I'll wait for your recommendation before working on the last patch > of the series. It looks like you'll want to do the same thing to p2m_finish_type_change() that you did for p2m_change_type_range(): create a finish_type_change() function which loops over the appropriate range calling p2m->recalc(), and then have p2m_finish_type_change() first call this on the hostp2m, then one-by-one on all valid altp2ms. In finish_type_change() you'll want to clip first_gfn and max_nr to fit within p2m->min_remapped_gfn and p2m->max_remapped_gfn for altp2ms. I think that should do the trick. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 10:08 AM, Jan Beulich wrote: On 15.11.18 at 18:54, wrote: >> On 11/15/18 7:34 PM, George Dunlap wrote: On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru wrote: @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned int i) struct p2m_domain *hostp2m = p2m_get_hostp2m(d); struct ept_data *ept; +p2m->max_mapped_pfn = hostp2m->max_mapped_pfn; >>> >>> Why we need to copy this value? >>> >>> I’ve just spent the afternoon tracing around the source code, and while I’m >> pretty sure it won’t hurt, I’m also not sure why it’s necessary. >> >> I think I did that because I assumed that it would matter for >> ept_get_entry() and misconfigurations, when: >> >> 954 /* This pfn is higher than the highest the p2m map currently >> holds */ >> 955 if ( gfn > p2m->max_mapped_pfn ) > > The question is whether under any condition such checks require that > the accumulated value be in sync between the host and the various > alt p2m-s. > >> 956 { >> 957 for ( i = ept->wl; i > 0; --i ) >> 958 if ( (gfn & ~((1UL << (i * EPT_TABLE_ORDER)) - 1)) > >> 959 p2m->max_mapped_pfn ) >> 960 break; >> 961 goto out; >> 962 } >> >> but I am not certain it is required either. I can certainly remove this >> assignment and see if anything breaks. > > I've seen you mention such or alike a couple of times now while > discussing this series, and I have to admit that I'm a little frightened: > Making a change just based on the observation that nothing breaks > is setting us up for breakage in some corner case. That is - seeing > something break is a good indication that a change was wrong, but > not seeing any breakage doesn't alone justify a change. > > In the particular case here I think whether the copying of the field > is needed depends on what else is being copied (I'm sorry, I'm not > that familiar with altp2m): If mappings get cloned from the host p2m, > then this value ought to be cloned too. For any mappings that > might be kept in sync between p2m-s, I think this field also would > need to be updated in sync. So the basic question is, does "max_mapped_pfn" mean, "Maximum pfn _for the domain_", or "Maximum pfn _for this p2m_". When the element was added to the p2m struct those were the same thing. Which do the various use cases expect it to be, and which would be the most robust going forward? I spent a bunch of time going through the code yesterday, and I'm pretty sure that as long as the value in the p2m is one or the other, there will be at the moment no _correctness_ issues. It looked to me like in the case where altp2m->max_mapped_pfn > altp2m->max_remapped_gfn, then the p2m machinery would do a certain amount of unnecessary work, but that's all. It also looked to me like before this patch, the value mostly ends up being "maximum pfn ever mapped in this p2m (even across altp2m desroys)". That's because when the altp2m is allocated, it starts as 0; every time an entry is propagated from the hostp2m to the altp2m, ept_set_entry() updates max_mapped_pfn; but nothing sets it back to zero. Also, hostp2m->max_mapped_pfn is never decreased, only increased. So either before or after this patch, altp2m->max_remapped_gfn <= altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn. In the vast majority of cases, max_mapped_pfn is explicitly being read from the hostp2m. Below are the cases where it may be read from an altp2m: - ept_get_entry(), ept_walk_tables(): If ==max_remapped_gfn, it will return INVALID_MFN early. If higher than max_remapped_gfn, it falls back to walking the altp2m's ept tables, which will give you the same answer, just a bit more slowly. - ept_dump_p2m_tables(): If ==max_remapped_gfn, it will dump only up to the current map; if >max_remapped_gfn, it will dump a number of unnecessary INVALID_MFN entries, but no more entries than the hostp2m. - p2m.c:change_type_range(): If ==max_remapped_gfn, it will only invalidate entries in the altp2m as high have been currently remapped. If >max_remapped_gfn, it will write invalid ept entries that *haven't* yet been copied over. But I don't think either one should cause a correctness issue: either way, accessing a gfn > max_remapped_gfn will cause a fault, at which point either the correct value will be copied from the hostp2m (perhaps going through resolve_misconfig() on the hostp2m in the process) or the correct value will be calculated via resolve_misconfig(). So, it seemed from my investigation like it would be more useful if we got rid of altp2m->max_remapped_gfn, and used atlp2m->max_mapped_pfn instead. If people want to know the maximum gfn mapped by the domain as a whole, they should just use hostp2m->max_mapped_pfn. The code is definitely complicated enough, though, that I may have missed something, which is why I asked Razvan if there was a reason he changed it. For the purposes of this
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/16/18 12:08 PM, Jan Beulich wrote: > I've seen you mention such or alike a couple of times now while > discussing this series, and I have to admit that I'm a little frightened: > Making a change just based on the observation that nothing breaks > is setting us up for breakage in some corner case. That is - seeing > something break is a good indication that a change was wrong, but > not seeing any breakage doesn't alone justify a change. > > In the particular case here I think whether the copying of the field > is needed depends on what else is being copied (I'm sorry, I'm not > that familiar with altp2m): If mappings get cloned from the host p2m, > then this value ought to be cloned too. For any mappings that > might be kept in sync between p2m-s, I think this field also would > need to be updated in sync. I agree, I was merely offering to test whether any breakage occurs if not syncing the field, not suggesting that that's a comprehensive analysis or a justification for change, in case that offers us more insight. AFAIK, EPT restrictions are always propagated from the hostp2m to all active altp2ms. I've tested this at some point in the past and it has proven true - it's supposedly an altp2m feature although it can also be a bit of a headache: if we would have been able to use the hostp2m independently then we wouldn't have had this display freeze problem, since we could have used the hostp2m as the default (restricted) EPT view. As things stand now, we need to create a new view when starting introspection and switch to that immediately, so changes to it will not propagate into other views (in other words, so we can use it independently of all other altp2ms). Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
>>> On 15.11.18 at 18:54, wrote: > On 11/15/18 7:34 PM, George Dunlap wrote: >>> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru >>> wrote: >>> @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned >>> int i) >>> struct p2m_domain *hostp2m = p2m_get_hostp2m(d); >>> struct ept_data *ept; >>> >>> +p2m->max_mapped_pfn = hostp2m->max_mapped_pfn; >> >> Why we need to copy this value? >> >> I’ve just spent the afternoon tracing around the source code, and while I’m > pretty sure it won’t hurt, I’m also not sure why it’s necessary. > > I think I did that because I assumed that it would matter for > ept_get_entry() and misconfigurations, when: > > 954 /* This pfn is higher than the highest the p2m map currently > holds */ > 955 if ( gfn > p2m->max_mapped_pfn ) The question is whether under any condition such checks require that the accumulated value be in sync between the host and the various alt p2m-s. > 956 { > 957 for ( i = ept->wl; i > 0; --i ) > 958 if ( (gfn & ~((1UL << (i * EPT_TABLE_ORDER)) - 1)) > > 959 p2m->max_mapped_pfn ) > 960 break; > 961 goto out; > 962 } > > but I am not certain it is required either. I can certainly remove this > assignment and see if anything breaks. I've seen you mention such or alike a couple of times now while discussing this series, and I have to admit that I'm a little frightened: Making a change just based on the observation that nothing breaks is setting us up for breakage in some corner case. That is - seeing something break is a good indication that a change was wrong, but not seeing any breakage doesn't alone justify a change. In the particular case here I think whether the copying of the field is needed depends on what else is being copied (I'm sorry, I'm not that familiar with altp2m): If mappings get cloned from the host p2m, then this value ought to be cloned too. For any mappings that might be kept in sync between p2m-s, I think this field also would need to be updated in sync. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/15/18 10:26 PM, George Dunlap wrote: > > >> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru >> wrote: >> >> When an new altp2m view is created very early on guest boot, the >> display will freeze (although the guest will run normally). This >> may also happen on resizing the display. The reason is the way >> Xen currently (mis)handles logdirty VGA: it intentionally >> misconfigures VGA pages so that they will fault. >> >> The problem is that it only does this in the host p2m. Once we >> switch to a new altp2m, the misconfigured entries will no longer >> fault, so the display will not be updated. >> >> This patch: >> * updates ept_handle_misconfig() to use the active altp2m instead >> of the hostp2m; >> * modifies p2m_change_entry_type_global(), p2m_memory_type_changed >> and p2m_change_type_range() to propagate their changes to all >> valid altp2ms. >> >> With the introduction of altp2m fields in p2m_memory_type_changed() >> the whole function has been put under CONFIG_HVM. > > Sorry — actually, I think there’s yet another issue lurking here: > p2m_finish_type_change(). I’ll poke around a bit more tomorrow. OK, I'll wait for your recommendation before working on the last patch of the series. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru > wrote: > > When an new altp2m view is created very early on guest boot, the > display will freeze (although the guest will run normally). This > may also happen on resizing the display. The reason is the way > Xen currently (mis)handles logdirty VGA: it intentionally > misconfigures VGA pages so that they will fault. > > The problem is that it only does this in the host p2m. Once we > switch to a new altp2m, the misconfigured entries will no longer > fault, so the display will not be updated. > > This patch: > * updates ept_handle_misconfig() to use the active altp2m instead > of the hostp2m; > * modifies p2m_change_entry_type_global(), p2m_memory_type_changed > and p2m_change_type_range() to propagate their changes to all > valid altp2ms. > > With the introduction of altp2m fields in p2m_memory_type_changed() > the whole function has been put under CONFIG_HVM. Sorry — actually, I think there’s yet another issue lurking here: p2m_finish_type_change(). I’ll poke around a bit more tomorrow. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
On 11/15/18 7:34 PM, George Dunlap wrote: > > >> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru >> wrote: >> >> When an new altp2m view is created very early on guest boot, the >> display will freeze (although the guest will run normally). This >> may also happen on resizing the display. The reason is the way >> Xen currently (mis)handles logdirty VGA: it intentionally >> misconfigures VGA pages so that they will fault. >> >> The problem is that it only does this in the host p2m. Once we >> switch to a new altp2m, the misconfigured entries will no longer >> fault, so the display will not be updated. >> >> This patch: >> * updates ept_handle_misconfig() to use the active altp2m instead >> of the hostp2m; >> * modifies p2m_change_entry_type_global(), p2m_memory_type_changed >> and p2m_change_type_range() to propagate their changes to all >> valid altp2ms. >> >> With the introduction of altp2m fields in p2m_memory_type_changed() >> the whole function has been put under CONFIG_HVM. >> >> Signed-off-by: Razvan Cojocaru >> Suggested-by: George Dunlap >> Reviewed-by: Kevin Tian > > Just one more question... > >> >> --- >> CC: Jun Nakajima >> CC: Kevin Tian >> CC: George Dunlap >> CC: Jan Beulich >> CC: Andrew Cooper >> CC: Wei Liu >> >> --- >> Changes since V5: >> - Added Kevin's Reviewed-by. >> - Added a note on p2m_memory_type_changed() being under CONFIG_HVM. >> --- >> xen/arch/x86/mm/p2m-ept.c | 8 >> xen/arch/x86/mm/p2m-pt.c | 8 >> xen/arch/x86/mm/p2m.c | 115 >> ++ >> xen/include/asm-x86/p2m.h | 6 +-- >> 4 files changed, 114 insertions(+), 23 deletions(-) >> >> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c >> index fabcd06..e6fa85f 100644 >> --- a/xen/arch/x86/mm/p2m-ept.c >> +++ b/xen/arch/x86/mm/p2m-ept.c >> @@ -657,6 +657,9 @@ bool_t ept_handle_misconfig(uint64_t gpa) >> bool_t spurious; >> int rc; >> >> +if ( altp2m_active(curr->domain) ) >> +p2m = p2m_get_altp2m(curr); >> + >> p2m_lock(p2m); >> >> spurious = curr->arch.hvm.vmx.ept_spurious_misconfig; >> @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned >> int i) >> struct p2m_domain *hostp2m = p2m_get_hostp2m(d); >> struct ept_data *ept; >> >> +p2m->max_mapped_pfn = hostp2m->max_mapped_pfn; > > Why we need to copy this value? > > I’ve just spent the afternoon tracing around the source code, and while I’m > pretty sure it won’t hurt, I’m also not sure why it’s necessary. I think I did that because I assumed that it would matter for ept_get_entry() and misconfigurations, when: 954 /* This pfn is higher than the highest the p2m map currently holds */ 955 if ( gfn > p2m->max_mapped_pfn ) 956 { 957 for ( i = ept->wl; i > 0; --i ) 958 if ( (gfn & ~((1UL << (i * EPT_TABLE_ORDER)) - 1)) > 959 p2m->max_mapped_pfn ) 960 break; 961 goto out; 962 } but I am not certain it is required either. I can certainly remove this assignment and see if anything breaks. > Everything else looks good! Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early
> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru > wrote: > > When an new altp2m view is created very early on guest boot, the > display will freeze (although the guest will run normally). This > may also happen on resizing the display. The reason is the way > Xen currently (mis)handles logdirty VGA: it intentionally > misconfigures VGA pages so that they will fault. > > The problem is that it only does this in the host p2m. Once we > switch to a new altp2m, the misconfigured entries will no longer > fault, so the display will not be updated. > > This patch: > * updates ept_handle_misconfig() to use the active altp2m instead > of the hostp2m; > * modifies p2m_change_entry_type_global(), p2m_memory_type_changed > and p2m_change_type_range() to propagate their changes to all > valid altp2ms. > > With the introduction of altp2m fields in p2m_memory_type_changed() > the whole function has been put under CONFIG_HVM. > > Signed-off-by: Razvan Cojocaru > Suggested-by: George Dunlap > Reviewed-by: Kevin Tian Just one more question... > > --- > CC: Jun Nakajima > CC: Kevin Tian > CC: George Dunlap > CC: Jan Beulich > CC: Andrew Cooper > CC: Wei Liu > > --- > Changes since V5: > - Added Kevin's Reviewed-by. > - Added a note on p2m_memory_type_changed() being under CONFIG_HVM. > --- > xen/arch/x86/mm/p2m-ept.c | 8 > xen/arch/x86/mm/p2m-pt.c | 8 > xen/arch/x86/mm/p2m.c | 115 ++ > xen/include/asm-x86/p2m.h | 6 +-- > 4 files changed, 114 insertions(+), 23 deletions(-) > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > index fabcd06..e6fa85f 100644 > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -657,6 +657,9 @@ bool_t ept_handle_misconfig(uint64_t gpa) > bool_t spurious; > int rc; > > +if ( altp2m_active(curr->domain) ) > +p2m = p2m_get_altp2m(curr); > + > p2m_lock(p2m); > > spurious = curr->arch.hvm.vmx.ept_spurious_misconfig; > @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned > int i) > struct p2m_domain *hostp2m = p2m_get_hostp2m(d); > struct ept_data *ept; > > +p2m->max_mapped_pfn = hostp2m->max_mapped_pfn; Why we need to copy this value? I’ve just spent the afternoon tracing around the source code, and while I’m pretty sure it won’t hurt, I’m also not sure why it’s necessary. Everything else looks good! -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel