Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 2 November, 2016 12:46
> To: Andrei Vlad LUTAS <vlu...@bitdefender.com>
> Cc: rcojoc...@bitdefender.com; andrew.coop...@citrix.com; xen-
> de...@lists.xenproject.org; ta...@tklengyel.com
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
>
> >>> On 02.11.16 at 11:22, <vlu...@bitdefender.com> wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 2 November, 2016 12:05
> >> >>> On 02.11.16 at 10:53, <vlu...@bitdefender.com> wrote:
> >> > The decision whether further events are needed or not is NOT made
> >> > based on the contents of the missing page. Let us assume we have
> >> > the MODULE structure, that contains a "name" and an "address". The
> >> > MODULE is inserted in the modules list via a write, which triggers
> >> > an EPT violation, which is handled by HVI. The HVI sees that "name"
> >> > is the module it was waiting for (for example, ntdll, kernel32, or
> >> > whatever), and decides that it doesn't want to intercept other
> >> > modules being loaded, so it removes the write hook from the list.
> >> > Furthermore, it sees that "address" points to a swapped-out page,
> >> > so it injects a #PF, to swap it in.
> >>
> >> So what's the #PF needed for then, if the introspection engine
> >> doesn't need looking at the page? Once again - I think it would have
> >> helped quite a bit if a _complete_ picture had been drawn from the very
> >> beginning of this thread.
> >
> > Who said the introspection logic doesn't need to inspect the page?
> > That's why we inject the #PF. Because we need to further inspect the
> > page.
>
> Looks like I've drawn a wrong conclusion then - sorry.
>
> > But the
> > decision to inspect the missing page or the decision that further
> > module events are relevant or not is not related in any way to the
> > contents of the missing page. The contents of the missing page need to be
> > inspected for other reasons.
>
> And the disabling of (in your example) module load monitoring could then be
> done at that point, rather than right away?

We could theoretically do even better than that - for example, inject an INT3 
(0xCC) instruction at that point, and make sure the VCPU doesn't advance until 
we get to inject our #PF. But even this requires some modifications, because 
right now, we cannot know whether the injection will succeed.

>
> > In my opinion, the complete picture _was_ drawn from the beginning, it
> > just seems that we need to zoom in more. You also have to understand
> > that the HVI itself is closed-source and there's a point beyond which
> > we simply can't give any more info.
>
> I'm sure you understand that with partial / insufficient info it then may be
> impossible for anyone here to give advice which is actually helpful to you.

Absolutely. I didn't mean to sound obstructive or anything, it's just that we 
simply cannot disclose some of the details, as much as I would want 
(intellectual property, etc.). We already provided and we will continue 
providing as much info as we can. After all, we, above all, want this fixed, as 
it impacts our product.

>
> Jan
>
>
> 
> This email was scanned by Bitdefender

Best regards,
Andrei.


This email was scanned by Bitdefender

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 2 November, 2016 12:05
> To: Andrei Vlad LUTAS <vlu...@bitdefender.com>
> Cc: rcojoc...@bitdefender.com; andrew.coop...@citrix.com; xen-
> de...@lists.xenproject.org; ta...@tklengyel.com
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
> 
> >>> On 02.11.16 at 10:53, <vlu...@bitdefender.com> wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 2 November, 2016 11:38
> >> >>> On 02.11.16 at 10:13, <vlu...@bitdefender.com> wrote:
> >> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> >> Sent: 2 November, 2016 10:50
> >> >> >>> On 01.11.16 at 23:17, <vlu...@bitdefender.com> wrote:
> >> >> > We don't really care when and how the #PF is handled. We don't
> >> >> > care if the page is paged out at some random point. What we do
> >> >> > know is that at a certain point in the future, the page will be
> >> >> > swapped in; how do we know when? The OS will write the guest
> >> >> > page tables, at which point we can inspect the physical page
> >> >> > itself (so you can see here why we don't care about the page
> >> >> > being swapped out sometime in the future). So we really _can_ lift
> >> >> > any restriction we want at that point.
> >> >>
> >> >> Hmm, I'm having difficulty seeing the supposedly broken flow of events
> >> >> here: Earlier it was said that #PF injection would be a result of
> >> >> EPT event processing. Here you say that the lifting of the
> >> >> restrictions would be a result of seeing the guest modify its page
> >> >> tables (which would in turn be a result of the #PF actually having
> >> >> arrived in the guest). So if (with this, and as you say
> >> >> above) you don't care when the #PF gets handled, where's the
> >> >> original problem?
> >> >
> >> > That's not what I wanted to say, sorry if it was unclear. What I'm
> >> > trying to say is that the decision to inject a #PF can be made when
> >> > handling an EPT violation - the accessed page need not be related
> >> > in any way to the page for which we decide to inject the #PF. For
> >> > example, we intercept writes in a list that describes the loaded
> >> > module. Whenever a new module is loaded, an entry would be inserted
> >> > into that list, and that would generate an EPT write violation.
> >> > Now, the introspection logic will be able to analyze what module
> >> > was loaded and where, and it may find out that the module headers
> >> > (which are needed by the protection logic) are not present in
> >> > memory - therefore, it would inject a #PF in order to force the OS
> >> > to swap in said headers. On the other hand, the HVI logic may also
> >> > decide that it doesn't need to watch for modules loading anymore
> >> > (for example, all the interesting modules were loaded), so it will
> >> > remove the write hook from the list of loaded modules.
> >> > These two (injection of the #PF and the removal of the EPT write
> >> > protection) would be done in the same event handler, so we can't
> >> > rely on the event being re-generated in this case. Hopefully this
> >> > example makes it more clear.
> >>
> >> If the decision whether further events are needed depends on data in
> >> a page not present in memory, how can that decision be taken before
> >> the injected #PF has actually been handled? I'm still not seeing a
> >> flow of events where there is a problem. Furthermore, I don't think
> >> it would do much harm if you kept the watch in place slightly longer?
> >
> > The decision whether further events are needed or not is NOT made
> > based on the contents of the missing page. Let us assume we have the
> > MODULE structure, that contains a "name" and an "address". The MODULE
> > is inserted in the modules list via a write, which triggers an EPT
> > violation, which is handled by HVI. The HVI sees that "name" is the
> > module it was waiting for (for example, ntdll, kernel32, or whatever),
> > and decides that it doesn't want to intercept other modules being
> > loaded, so it removes the write hook from the list. Furthermore, it
> > sees that "address" points to a swapped-out page, so i

Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 2 November, 2016 11:38
> To: rcojoc...@bitdefender.com; Andrei Vlad LUTAS
> <vlu...@bitdefender.com>
> Cc: andrew.coop...@citrix.com; xen-de...@lists.xenproject.org;
> ta...@tklengyel.com
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
> 
> >>> On 02.11.16 at 10:13, <vlu...@bitdefender.com> wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 2 November, 2016 10:50
> >> >>> On 01.11.16 at 23:17, <vlu...@bitdefender.com> wrote:
> >> > We don't really care when and how the #PF is handled. We don't care
> >> > if the page is paged out at some random point. What we do know is
> >> > that at a certain point in the future, the page will be swapped in;
> >> > how do we know when? The OS will write the guest page tables, at
> >> > which point we can inspect the physical page itself (so you can see
> >> > here why we don't care about the page being swapped out sometime in
> >> > the future). So we really _can_ lift any restriction we want at that 
> >> > point.
> >>
> >> Hmm, I'm having difficulty seeing the supposedly broken flow of events
> >> here: Earlier it was said that #PF injection would be a result of EPT
> >> event processing. Here you say that the lifting of the restrictions
> >> would be a result of seeing the guest modify its page tables (which
> >> would in turn be a result of the #PF actually having arrived in the
> >> guest). So if (with this, and as you say
> >> above) you don't care when the #PF gets handled, where's the original
> >> problem?
> >
> > That's not what I wanted to say, sorry if it was unclear. What I'm
> > trying to say is that the decision to inject a #PF can be made when
> > handling an EPT violation - the accessed page need not be related in
> > any way to the page for which we decide to inject the #PF. For
> > example, we intercept writes in a list that describes the loaded
> > module. Whenever a new module is loaded, an entry would be inserted
> > into that list, and that would generate an EPT write violation. Now,
> > the introspection logic will be able to analyze what module was loaded
> > and where, and it may find out that the module headers (which are
> > needed by the protection logic) are not present in memory - therefore,
> > it would inject a #PF in order to force the OS to swap in said
> > headers. On the other hand, the HVI logic may also decide that it
> > doesn't need to watch for modules loading anymore (for example, all the
> > interesting modules were loaded), so it will remove the write hook from
> > the list of loaded modules.
> > These two (injection of the #PF and the removal of the EPT write
> > protection) would be done in the same event handler, so we can't rely
> > on the event being re-generated in this case. Hopefully this example
> > makes it more clear.
> 
> If the decision whether further events are needed depends on data in a
> page not present in memory, how can that decision be taken before the
> injected #PF has actually been handled? I'm still not seeing a flow of events
> where there is a problem. Furthermore, I don't think it would do much harm
> if you kept the watch in place slightly longer?

The decision whether further events are needed or not is NOT made based on the 
contents of the missing page. Let us assume we have the MODULE structure, that 
contains a "name" and an "address". The MODULE is inserted in the modules list 
via a write, which triggers an EPT violation, which is handled by HVI. The HVI 
sees that "name" is the module it was waiting for (for example, ntdll, 
kernel32, or whatever), and decides that it doesn't want to intercept other 
modules being loaded, so it removes the write hook from the list. Furthermore, 
it sees that "address" points to a swapped-out page, so it injects a #PF, to 
swap it in. 
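
The flow described above can be sketched in Python as a small model. This is only an illustration under stated assumptions: `Module`, `Introspector` and all field names are hypothetical, and nothing here reflects the actual (closed-source) HVI logic or any Xen API. It shows that the decision to drop the write hook depends only on "name", while the #PF request depends only on whether the page behind "address" is present.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    # Hypothetical stand-in for the guest MODULE structure.
    name: str
    address: int


@dataclass
class Introspector:
    # Hypothetical model of the introspection state; not HVI code.
    wanted: set                      # module names still being waited for
    present_pages: set               # page-aligned addresses currently in memory
    write_hook_active: bool = True   # EPT write hook on the module list
    pending_pf: list = field(default_factory=list)  # addresses to fault in

    def on_ept_write_violation(self, module: Module) -> None:
        """Handle a write that inserts `module` into the hooked list."""
        if module.name not in self.wanted:
            return
        self.wanted.discard(module.name)
        # Decision 1: based on the name only, stop watching the list.
        if not self.wanted:
            self.write_hook_active = False
        # Decision 2: independent of decision 1, request a #PF if the
        # page holding the module headers is swapped out.
        page = module.address & ~0xFFF
        if page not in self.present_pages:
            self.pending_pf.append(module.address)
```

Both decisions are taken in the same handler, so if the requested #PF is silently dropped there is no later EPT event on the module list that would cause it to be requested again.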

> 
> >> The fact that {vmx,svm}_inject_trap() combine the new exception with
> >> an already injected one (and blindly discard events other than hw
> >> exceptions), otoh, looks like indeed wants to be controllable by the
> >> caller: When the event comes from the outside (the hypercall), it
> >> would clearly seem better to simply tell the caller that no injection
> >> happened and the event needs to be kept pending. The main question
> >> then is how to make certain injection gets retried at the right point
> >> in time (read: once the other interrupt handler IRETs back to original
> c

Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 2 November, 2016 11:31
> To: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Andrei Vlad LUTAS <vlu...@bitdefender.com>;
> andrew.coop...@citrix.com; xen-de...@lists.xenproject.org;
> ta...@tklengyel.com
> Subject: Re: [Xen-devel] xc_hvm_inject_trap() races
>
> >>> On 02.11.16 at 10:11, <rcojoc...@bitdefender.com> wrote:
> > On 11/02/2016 11:05 AM, Jan Beulich wrote:
> >>>>> On 02.11.16 at 09:57, <rcojoc...@bitdefender.com> wrote:
> >>> On 11/02/2016 10:49 AM, Jan Beulich wrote:
> >>>> The fact that {vmx,svm}_inject_trap() combine the new exception
> >>>> with an already injected one (and blindly discard events other than
> >>>> hw exceptions), otoh, looks like indeed wants to be controllable by
> >>>> the caller: When the event comes from the outside (the hypercall),
> >>>> it would clearly seem better to simply tell the caller that no
> >>>> injection happened and the event needs to be kept pending.
> >>>
> >>> However this is not possible with the current design, since all
> >>> xc_hvm_inject_trap() really does is set the info to be used at
> >>> hvm_do_resume() time. So at the time xc_hvm_inject_trap() returns,
> >>> it's not yet possible to know if the injection will succeed or not
> >>> (assuming we discard it when it would collide with another).
> >>
> >> That's my point - it shouldn't get discarded, but remain latched for
> >> a future invocation of hvm_do_resume(). Making
> >> hvm_inject_trap() have a suitable parameter (and a return value)
> >> would be the easy part of the change here. The difficult part would
> >> be to make sure the injection context is the right one.
> >
> > Should I then bring this patch back?
> >
> > https://lists.xen.org/archives/html/xen-devel/2014-07/msg02927.html
> >
> > It has been rejected at the time on the grounds that
> > xc_hvm_inject_trap() is sufficient.
>
> I don't think it would deal with all possible situations, the more that it's
> (already by its title) #PF specific. I think the named difficult part would
> need to be solved in the hypervisor alone, without further external
> information.
>
> Jan

With some HVI re-engineering I think I may be able to make everything work if 
the existing API returned an error whenever it cannot inject the #PF. Would 
that be possible?
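
A minimal sketch of how a caller might use such an error, written in Python as a toy model. Everything here is an assumption for illustration: `VCpu`, `InjectionBusy`, `inject_trap` and `try_inject_pf` are hypothetical names, and the model pretends the collision can be detected synchronously, which (as noted earlier in the thread) is not how xc_hvm_inject_trap() currently behaves.

```python
PF_VECTOR = 14  # #PF exception vector on x86


class InjectionBusy(Exception):
    """Hypothetical error: another event is already pending on the vCPU."""


class VCpu:
    # Toy model of a vCPU that can hold a single pending event.
    def __init__(self):
        self.pending_event = None

    def inject_trap(self, vector, cr2=None):
        # Refuse instead of silently overwriting or combining events.
        if self.pending_event is not None:
            raise InjectionBusy(f"vector {self.pending_event[0]} pending")
        self.pending_event = (vector, cr2)

    def enter_guest(self):
        # Deliver (and clear) whatever was pending.
        delivered, self.pending_event = self.pending_event, None
        return delivered


def try_inject_pf(vcpu, cr2, deferred):
    """Try to inject a #PF; remember the address if it must be retried."""
    try:
        vcpu.inject_trap(PF_VECTOR, cr2)
        return True
    except InjectionBusy:
        deferred.append(cr2)
        return False
```

With a failure indication like this, the introspection side could keep its restrictions in place until a retry succeeds, rather than assuming the #PF was delivered.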

>
>

Best regards,
Andrei.




Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS

> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 2 November, 2016 10:50
> To: rcojoc...@bitdefender.com; Andrei Vlad LUTAS
> <vlu...@bitdefender.com>
> Cc: andrew.coop...@citrix.com; xen-de...@lists.xenproject.org;
> ta...@tklengyel.com
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
>
> >>> On 01.11.16 at 23:17, <vlu...@bitdefender.com> wrote:
> > From: Jan Beulich [mailto:jbeul...@suse.com]
> > Sent: 1 November, 2016 18:40
> >>>> Andrei Vlad LUTAS <vlu...@bitdefender.com> 11/01/16 5:13 PM >>>
> >>>First of all, to answer your original question: the injection
> >>>decision is made when the introspection logic needs to inspect a page
> >>>that is not present in the physical memory. We don't really care if
> >>>the current instruction triggers multiple faults or not (and here I'm
> >>>not sure what you mean by that - multiple exceptions, or multiple EPT
> >>>violations - but the answer is still the same), and removing the page
> >>>restrictions after the #PF injection is introspection specific logic
> >>>- the address for which we inject the #PF doesn't have to be related
> >>>in any way to the current instruction.
> >
> >>Ah, that's this non-architectural behavior again.
> >
> > I don't think the HVI #PF injection internals or how the #PF is
> > handled by the OS are relevant here. We are using an existing API that
> > seems to not work quite correctly under certain circumstances and we
> > were curious if any of you can shed some light in this regard, and
> > maybe point us to the right direction for cooking up a fix.
> >
> >>What if the OS doesn't fully carry out the page-in, relying on the #PF to
> >>retrigger once the insn for which it got reported has been restarted?
> >
> > Can you be more specific?
>
> Well, perhaps with the answer you gave further down that's not that
> relevant anymore, but consider a #PF handler which handles just the top
> most not-present page table level each time it gets invoked. I.e.
> for a not-present L4 entry it would take 4 re-invocations of the same original
> instruction to resolve all 4 levels.

I see what you're referring to. As I explained to Andrew in a previous mail - 
the #PF injection logic is indeed OS specific, and in our particular case 
(since VM introspection already has to handle a lot of OS specific stuff), we 
don't have to deal with such a behavior on the supported operating systems. 
Anyway, the example you provided would involve significant added performance 
penalty and I don't see why an OS would do that (nor have I heard of any doing 
it), but I understand your concern.
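
To make the concern concrete, here is a toy Python model of the hypothetical handler Jan describes, one that resolves only the topmost not-present page table level per invocation. The function names and the list-of-booleans representation are assumptions made for illustration; no real OS is being modelled.

```python
LEVELS = 4  # L4, L3, L2, L1 on x86-64 with 4-level paging


def handle_pf_one_level(present):
    """Hypothetical #PF handler: fix only the topmost not-present level."""
    for level in range(LEVELS):
        if not present[level]:
            present[level] = True
            return


def faults_until_access_succeeds(present):
    """Count how many #PFs the same instruction takes before it completes."""
    present = list(present)  # do not modify the caller's list
    faults = 0
    while not all(present):
        faults += 1          # the access faults and the handler runs once
        handle_pf_one_level(present)
    return faults
```

On such an OS a single injected #PF would resolve only one level, because no instruction is actually re-executing the access to trigger the remaining faults.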

>
> >> Or what if the page gets paged out again before the insn actually
> >> gets to execute (e.g. because a re-schedule happened inside the guest
> >> on the way out of the #PF handler)? All of this suggests that you
> >> really can't lift any restrictions _before_ seeing what you need to see.
> >
> > We don't really care when and how the #PF is handled. We don't care if
> > the page is paged out at some random point. What we do know is that at
> > a certain point in the future, the page will be swapped in; how do we
> > know when? The OS will write the guest page tables, at which point we
> > can inspect the physical page itself (so you can see here why we don't
> > care about the page being swapped out sometime in the future). So we
> > really _can_ lift any restriction we want at that point.
>
> Hmm, I'm having difficulty seeing the supposedly broken flow of events
> here: Earlier it was said that #PF injection would be a result of EPT event
> processing. Here you say that the lifting of the restrictions would be a
> result of seeing the guest modify its page tables (which would in turn be a
> result of the #PF actually having arrived in the guest). So if (with this,
> and as you say above) you don't care when the #PF gets handled, where's the
> original problem?

That's not what I wanted to say, sorry if it was unclear. What I'm trying to 
say is that the decision to inject a #PF can be made when handling an EPT 
violation - the accessed page need not be related in any way to the page for 
which we decide to inject the #PF. For example, we intercept writes in a list 
that describes the loaded module. Whenever a new module is loaded, an entry 
would be inserted into that list, and that would generate an EPT write 
violation. Now, the introspection logic will be able to analyze what module was 
loaded and where, and it may find out that the module headers (which are needed 
by the 

Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-02 Thread Andrei Vlad LUTAS

> -Original Message-
> From: Andrew Cooper [mailto:am...@hermes.cam.ac.uk] On Behalf Of
> Andrew Cooper
> Sent: 2 November, 2016 00:55
> To: Andrei Vlad LUTAS <vlu...@bitdefender.com>; Jan Beulich
> <jbeul...@suse.com>; rcojoc...@bitdefender.com
> Cc: xen-de...@lists.xenproject.org; ta...@tklengyel.com
> Subject: Re: [Xen-devel] xc_hvm_inject_trap() races
>
> On 01/11/2016 22:17, Andrei Vlad LUTAS wrote:
> >>> First of all, to answer your original question: the injection
> >>> decision is made when the introspection logic needs to inspect a
> >>> page that is not present in the physical memory. We don't really
> >>> care if the current instruction triggers multiple faults or not (and
> >>> here I'm not sure what you mean by that - multiple exceptions, or
> >>> multiple EPT violations - but the answer is still the same), and
> >>> removing the page restrictions after the #PF injection is
> >>> introspection specific logic - the address for which we inject the #PF
> >>> doesn't have to be related in any way to the current instruction.
> >> Ah, that's this non-architectural behavior again.
> > I don't think the HVI #PF injection internals or how the #PF is handled by
> > the OS are relevant here. We are using an existing API that seems to not
> > work quite correctly under certain circumstances and we were curious if any
> > of you can shed some light in this regard, and maybe point us to the right
> > direction for cooking up a fix.
>
> Just because there is an API like this, doesn't necessarily mean it ever
> worked.  This one clearly doesn't, and it got introduced before we as a
> community took a rather harder stance towards code review.

I totally understand that.

>
> Architecturally speaking, faults are always raised as a direct consequence of
> the current state.  Therefore, having an introspection agent interposing on
> the instruction stream and causing faults as a side effect of EPT
> permissions/etc, is quite natural and in line with architectural expectation.
>
> You also have a second usecase for this API, which is to trick Windows into
> paging in a frame you care about looking at.  Overall, using this #PF method
> to get Windows to page content back in is clearly the rational way of making
> that happen, but it is very definitely a non-architectural usecase; if Windows
> were to double check the instruction stream on getting this pagefault, it
> would get very confused, as the pagefault it received has nothing to do with
> the code pointed at in the exception frame.

This is indeed OS specific. After all, the entire concept of "VM introspection" 
deals with OS specifics and internals. Before adding support to a new 
build/version of the kernel, we make sure everything works fine. If something 
would not work properly, we would adjust our logic. As of now, every operating 
system that we've tested with has a consistent behavior regarding handling the 
injected #PF: it doesn't care what instruction triggered it, it doesn't inspect 
it, etc., it just swaps back in the page pointed by CR2.

>
> It is quite likely that these different usecases might have different
> solutions.
> IMO, the former should probably be controlled by responses in the
> vm_event ring, but this latter issue probably shouldn't.
>
> When it comes to injecting exceptions, there are some restrictions which
> limit what can legitimately be done.  We can only inject a single thing at
> once.
> Stacking a #PF on top of a plain interrupt could be resolved by leaving the
> interrupt pending in the vLAPIC and injecting the #PF instead.  Stacking a #PF
> on top of a different fault is going to cause
> hvm_combine_exceptions() to turn it into something more nasty.  OTOH,
> deferring the #PF by even a single instruction could result in it being sent
> in an unsafe context, which should also be avoided.

If I understand (at least some portions of) the Xen code well, the real issue 
appears when the VM exit handler sees that the IDT_VECTORING_INFO is valid and 
copies it back in the VM_ENTRY_INTR_INFO in order to re-inject the event. We 
may then overwrite that with the #PF event. I'm not sure what other code paths 
cause event injections, but this one seemed obvious. In our test hypervisor, 
the #PF injection takes precedence - we leave all interrupts pending in the 
virtual LAPIC and we postpone any event re-injection.
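
For reference, the collision can be illustrated with a small Python sketch that encodes and decodes the VM-entry interruption-information layout (vector in bits 7:0, type in bits 10:8, deliver-error-code in bit 11, valid in bit 31). Only INTR_INFO_VALID_MASK is named in this thread; the other constant names and all function names are assumptions chosen for the example, and this is not Xen code.

```python
INTR_INFO_VECTOR_MASK = 0xFF
INTR_INFO_INTR_TYPE_MASK = 0x700
INTR_INFO_DELIVER_CODE_MASK = 0x800
INTR_INFO_VALID_MASK = 0x80000000

TYPE_EXT_INTERRUPT = 0   # external interrupt
TYPE_HW_EXCEPTION = 3    # hardware exception, e.g. #PF
PF_VECTOR = 14


def encode_event(vector, event_type, has_error_code=False):
    """Build an interruption-information value with the valid bit set."""
    info = INTR_INFO_VALID_MASK
    info |= vector & INTR_INFO_VECTOR_MASK
    info |= (event_type << 8) & INTR_INFO_INTR_TYPE_MASK
    if has_error_code:
        info |= INTR_INFO_DELIVER_CODE_MASK
    return info


def decode_event(info):
    """Split an interruption-information value into its fields."""
    return {
        "valid": bool(info & INTR_INFO_VALID_MASK),
        "vector": info & INTR_INFO_VECTOR_MASK,
        "type": (info & INTR_INFO_INTR_TYPE_MASK) >> 8,
        "has_error_code": bool(info & INTR_INFO_DELIVER_CODE_MASK),
    }


def inject_pf_if_free(entry_info):
    """Return (new_entry_info, injected); never overwrite a valid event."""
    if entry_info & INTR_INFO_VALID_MASK:
        return entry_info, False
    return encode_event(PF_VECTOR, TYPE_HW_EXCEPTION, True), True
```

The check in `inject_pf_if_free` is one possible policy (refuse and report); the test hypervisor described above instead gives the #PF precedence and postpones the other event.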

>
> What hard properties do you need for this usecase, and are there any
> properties that can afford to be flexible?

Ideally, knowing that once we called that API, the #PF will get injected. At 
the minimum, I think that knowing whether the #PF was injected or not is 
mandatory, in order to know what to do next.

>
> ~Andrew
>

Best regards,
Andrei.




Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-01 Thread Andrei Vlad LUTAS
-Original Message-
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: 1 November, 2016 18:40
To: rcojoc...@bitdefender.com; Andrei Vlad LUTAS <vlu...@bitdefender.com>
Cc: andrew.coop...@citrix.com; xen-de...@lists.xenproject.org; 
ta...@tklengyel.com
Subject: Re: RE: [Xen-devel] xc_hvm_inject_trap() races

>>> Andrei Vlad LUTAS <vlu...@bitdefender.com> 11/01/16 5:13 PM >>>

>First of all, please don't top post.

>>First of all, to answer your original question: the injection decision
>>is made when the introspection logic needs to inspect a page that is
>>not present in the physical memory. We don't really care if the current
>>instruction triggers multiple faults or not (and here I'm not sure what
>>you mean by that - multiple exceptions, or multiple EPT violations -
>>but the answer is still the same), and removing the page restrictions
>>after the #PF injection is introspection specific logic - the address
>>for which we inject the #PF doesn't have to be related in any way to the 
>>current instruction.

>Ah, that's this non-architectural behavior again.

I don't think the HVI #PF injection internals or how the #PF is handled by the 
OS are relevant here. We are using an existing API that seems to not work quite 
correctly under certain circumstances and we were curious if any of you can shed 
some light in this regard, and maybe point us to the right direction for 
cooking up a fix.

>What if the OS doesn't fully carry out the page-in, relying on the #PF to 
>retrigger once the insn for which it got reported has been restarted?

Can you be more specific?

> Or what if the page gets paged out again before the insn actually gets to 
> execute (e.g. because a re-schedule happened inside the guest on the way out 
> of the #PF handler)? All of this suggests that you really can't lift any 
> restrictions _before_ seeing what you need to see.

We don't really care when and how the #PF is handled. We don't care if the page 
is paged out at some random point. What we do know is that at a certain point 
in the future, the page will be swapped in; how do we know when? The OS will 
write the guest page tables, at which point we can inspect the physical page 
itself (so you can see here why we don't care about the page being swapped out 
sometime in the future). So we really _can_ lift any restriction we want at 
that point.
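
The "wait for the page table write" idea can be sketched in Python as follows. This is a simplified model under stated assumptions: `SwapInWatcher` and its methods are hypothetical names, a single leaf PTE is watched, and only the x86-64 present bit (bit 0) and physical-address bits (51:12) are considered.

```python
PTE_PRESENT = 1 << 0
PTE_ADDR_MASK = 0x000FFFFFFFFFF000  # bits 51:12 of an x86-64 PTE


class SwapInWatcher:
    # Hypothetical helper: tracks PTEs whose pages we are waiting for.
    def __init__(self):
        self.waiting = {}  # guest-physical address of PTE -> callback

    def watch(self, pte_gpa, callback):
        """Register interest in the page mapped by the PTE at `pte_gpa`."""
        self.waiting[pte_gpa] = callback

    def on_pte_write(self, pte_gpa, new_value):
        """Called for each intercepted write to a watched page table."""
        callback = self.waiting.get(pte_gpa)
        if callback is None or not (new_value & PTE_PRESENT):
            return False  # not ours, or the page is still not present
        del self.waiting[pte_gpa]
        # The page is now backed by physical memory: inspect it there.
        callback(new_value & PTE_ADDR_MASK)
        return True
```

Because the inspection happens on the physical page at the moment the entry becomes present, it does not matter if the guest pages it out again later.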

>>Assuming that we wouldn't remove the restrictions and we would rely on
>>re-generating the event - that is not acceptable: first of all because
>>the instruction would normally be emulated anyway before re-entering
>>the guest,

>How would that be a problem?

I thought it was obvious without further clarification: how can we expect the 
exact same event to be generated, if the instruction that triggered it in the 
first place was emulated or single stepped?

>>and secondly because that is not a normal CPU behavior

>This really is the main problem here, afaict.

Best regards,
Andrei.





Re: [Xen-devel] xc_hvm_inject_trap() races

2016-11-01 Thread Andrei Vlad LUTAS
Hello,

First of all, to answer your original question: the injection decision is made 
when the introspection logic needs to inspect a page that is not present in the 
physical memory. We don't really care if the current instruction triggers 
multiple faults or not (and here I'm not sure what you mean by that - multiple 
exceptions, or multiple EPT violations - but the answer is still the same), and 
removing the page restrictions after the #PF injection is introspection 
specific logic - the address for which we inject the #PF doesn't have to be 
related in any way to the current instruction. Assuming that we wouldn't remove 
the restrictions and we would rely on re-generating the event - that is not 
acceptable: first of all because the instruction would normally be emulated 
anyway before re-entering the guest, and secondly because that is not a normal 
CPU behavior (and it would break internal introspection logic) - once an 
instruction triggered an exit, it should be emulated or single-stepped.

Best regards,
Andrei.
 
-Original Message-
From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan 
Beulich
Sent: 1 November, 2016 17:54
To: rcojoc...@bitdefender.com
Cc: andrew.coop...@citrix.com; ta...@tklengyel.com; 
xen-de...@lists.xenproject.org
Subject: Re: [Xen-devel] xc_hvm_inject_trap() races

>>> Razvan Cojocaru  11/01/16 11:53 AM >>>
>On 11/01/2016 12:30 PM, Jan Beulich wrote:
> Razvan Cojocaru  11/01/16 10:04 AM >>>
>>> We've stumbled across the following scenario: we're injecting a #PF 
>>> to try to bring a swapped page back, but Xen already has a pending 
>>> interrupt, and the two collide.
>>>
>>> I've logged what happens in hvm_do_resume() at the point of 
>>> injection, and stumbled across this:
>>>
>>> (XEN) [  252.878389] vector: 14, type: 3, error_code: 0,
>>> VM_ENTRY_INTR_INFO: 0x80e1
>>>
>>> VM_ENTRY_INTR_INFO does have INTR_INFO_VALID_MASK set here.
>> 
>> So a first question I have is this: What are the criteria that made 
>> your application decide it needs to inject a trap? Obviously there 
>> must have been some kind of event in the guest that triggered this. 
>> Hence the question is whether this same event wouldn't re-trigger at 
>> the end of the hardware interrupt (or could be made re-trigger reasonably 
>> easily).
>> Because in the end what you're trying to do here is something that's 
>> architecturally impossible: There can't be a (non-nested) exception 
>> once an external interrupt has been accepted (i.e. any subsequent 
>> exception can only be related to the delivery of that interrupt 
>> vector, not to the code which was running when the interrupt was signaled).
>
>Unfortunately there are two main reasons why relying on the conditions 
>for injecting the page fault repeating is problematic:
>
>1. We'd need to be able to differentiate between a failed run (where 
>injection doesn't work) and a successful run, with no real possibility 
>to know the difference for sure. So we don't know how to adapt the 
>application's internal state based on some events: is the event the 
>"final" one, or just a duplicate? xc_hvm_inject_trap() does not tell us 
>(as indeed it cannot know) if the injection succeeded, and there's no 
>other way to know.
>
>2. More importantly (although working around 1. is far from trivial), 
>the event may not be repeatable. As an example, #PF injection can occur 
>as part of handling an EPT event, but during handling the event the 
>introspection engine can decide to lift the restrictions on said page 
>after injecting the #PF. So the application relied on the #PF being 
>delivered, and with the restrictions lifted from the page there will be 
>no further EPT events for that page, therefore the main condition for 
>triggering the #PF is lost forever.

Isn't this a problem you also have under other circumstances, e.g.
multiple faults occurring for a single instruction? Which gets us to the fact 
that you didn't answer at all the initial question I did raise. Apart from that 
I'm also not really understanding the model you describe:
What good does injecting #PF alongside lifting restrictions do? I'd normally 
expect just one of the two to occur to deal with whatever caused the original 
event.

Jan
