Re: [Xen-devel] [PATCH v3 for-next 4/4] xen: Convert __page_to_mfn and __mfn_to_page to use typesafe MFN

2017-11-01 Thread Razvan Cojocaru
On 11/01/2017 04:03 PM, Julien Grall wrote:
> Most of the users of page_to_mfn and mfn_to_page are either overriding
> the macros to make them work with mfn_t or use mfn_x/_mfn because the
> rest of the function use mfn_t.
> 
> So make __page_to_mfn and __mfn_to_page return mfn_t by default.
> 
> Only reasonable clean-ups are done in this patch because it is
> already quite big. So some of the files now override page_to_mfn and
> mfn_to_page to avoid using mfn_t.
> 
> Lastly, domain_page_to_mfn is also converted to use mfn_t given that
> most of the callers are now switched to _mfn(domain_page_to_mfn(...)).
> 
> Signed-off-by: Julien Grall <julien.gr...@linaro.org>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/mm: Supresses vm_events caused by page-walks

2017-10-30 Thread Razvan Cojocaru
On 10/30/2017 07:07 PM, Tamas K Lengyel wrote:
> On Mon, Oct 30, 2017 at 11:01 AM, Razvan Cojocaru
> <rcojoc...@bitdefender.com> wrote:
>> On 10/30/2017 06:39 PM, Tamas K Lengyel wrote:
>>> On Mon, Oct 30, 2017 at 10:24 AM, Razvan Cojocaru
>>> <rcojoc...@bitdefender.com> wrote:
>>>> On 30.10.2017 18:01, Tamas K Lengyel wrote:
>>>>> On Mon, Oct 30, 2017 at 4:32 AM, Alexandru Isaila
>>>>> <aisa...@bitdefender.com> wrote:
>>>>>> This patch is adding a way to enable/disable nested pagefault
>>>>>> events. It introduces the xc_monitor_nested_pagefault function
>>>>>> and adds the nested_pagefault_disabled in the monitor structure.
>>>>>> This is needed by the introspection so it will only get gla
>>>>>> faults and not get spammed with other faults.
>>>>>> In p2m_set_ad_bits the v->arch.sse_pg_dirty.eip and
>>>>>> v->arch.sse_pg_dirty.gla are used to mark that this is the
>>>>>> second time a fault occurs and the dirty bit is set.
>>>>>
>>>>> Could you describe under what conditions do you get these other faults?
>>>>
>>>> Hey Tamas, the whole story is at page 8 of this document:
>>>>
>>>> https://www.researchgate.net/publication/281835515_Proposed_Processor_Extensions_for_Significant_Speedup_of_Hypervisor_Memory_Introspection
>>>
>>> Hi Razvan,
>>> thanks but I'm not sure that doc addresses my question. You
>>> effectively filter out npfec_kind_in_gpt and npfec_kind_unknown in
>>> this patch. The first, npfec_kind_in_gpt should only happen if you
>>> have restricted access to the gpt with ept and the processor couldn't
>>> walk the table. But if you don't want to get events of these types
>>> then why not simply not restrict access the gpt to begin with? And as
>>> for npfec_kind_unknown, I don't think that gets generated under any
>>> situation. So hence my question, what is your setup that makes this
>>> patch necessary?
>>
>> On the npfec_kind_unknown case, indeed, we were wondering when that
>> might possibly occur when discussing this patch - it's probably reserved
>> for the future?
>>
>> On why our introspection engine decides to restrict access to those
>> specific pages, I am not intimate with its inner workings, and not sure
>> how much could be disclosed here in any case. Is it not a worthwhile
>> (and otherwise harmless) tool to be able to switch A/D bits-triggered
>> EPT faults anyway, for introspection purposes?
> 
> It changes the default behavior of mem_access events so I just wanted
> to get some background on when that is really required. Technically
> there is no reason why we couldn't do that filtering in Xen. I think
> it might be better to flip the filter the other way though so the
> default behavior remains as is (ie. change the option to enable
> filtering instead of enabling monitoring).

Wait, it shouldn't change the default behaviour at all. If nobody calls
that function, all the EPT event kinds should be sent out - the new
monitor flag is a "disable" flag for non-GLA event (the so-called
"nested page fault" events).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/mm: Supresses vm_events caused by page-walks

2017-10-30 Thread Razvan Cojocaru
On 10/30/2017 06:39 PM, Tamas K Lengyel wrote:
> On Mon, Oct 30, 2017 at 10:24 AM, Razvan Cojocaru
> <rcojoc...@bitdefender.com> wrote:
>> On 30.10.2017 18:01, Tamas K Lengyel wrote:
>>> On Mon, Oct 30, 2017 at 4:32 AM, Alexandru Isaila
>>> <aisa...@bitdefender.com> wrote:
>>>> This patch is adding a way to enable/disable nested pagefault
>>>> events. It introduces the xc_monitor_nested_pagefault function
>>>> and adds the nested_pagefault_disabled in the monitor structure.
>>>> This is needed by the introspection so it will only get gla
>>>> faults and not get spammed with other faults.
>>>> In p2m_set_ad_bits the v->arch.sse_pg_dirty.eip and
>>>> v->arch.sse_pg_dirty.gla are used to mark that this is the
>>>> second time a fault occurs and the dirty bit is set.
>>>
>>> Could you describe under what conditions do you get these other faults?
>>
>> Hey Tamas, the whole story is at page 8 of this document:
>>
>> https://www.researchgate.net/publication/281835515_Proposed_Processor_Extensions_for_Significant_Speedup_of_Hypervisor_Memory_Introspection
> 
> Hi Razvan,
> thanks but I'm not sure that doc addresses my question. You
> effectively filter out npfec_kind_in_gpt and npfec_kind_unknown in
> this patch. The first, npfec_kind_in_gpt should only happen if you
> have restricted access to the gpt with ept and the processor couldn't
> walk the table. But if you don't want to get events of these types
> then why not simply not restrict access the gpt to begin with? And as
> for npfec_kind_unknown, I don't think that gets generated under any
> situation. So hence my question, what is your setup that makes this
> patch necessary?

On the npfec_kind_unknown case, indeed, we were wondering when that
might possibly occur when discussing this patch - it's probably reserved
for the future?

On why our introspection engine decides to restrict access to those
specific pages, I am not intimate with its inner workings, and not sure
how much could be disclosed here in any case. Is it not a worthwhile
(and otherwise harmless) tool to be able to switch A/D bits-triggered
EPT faults anyway, for introspection purposes?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/mm: Supresses vm_events caused by page-walks

2017-10-30 Thread Razvan Cojocaru
On 30.10.2017 18:01, Tamas K Lengyel wrote:
> On Mon, Oct 30, 2017 at 4:32 AM, Alexandru Isaila
>  wrote:
>> This patch is adding a way to enable/disable nested pagefault
>> events. It introduces the xc_monitor_nested_pagefault function
>> and adds the nested_pagefault_disabled in the monitor structure.
>> This is needed by the introspection so it will only get gla
>> faults and not get spammed with other faults.
>> In p2m_set_ad_bits the v->arch.sse_pg_dirty.eip and
>> v->arch.sse_pg_dirty.gla are used to mark that this is the
>> second time a fault occurs and the dirty bit is set.
> 
> Could you describe under what conditions do you get these other faults?

Hey Tamas, the whole story is at page 8 of this document:

https://www.researchgate.net/publication/281835515_Proposed_Processor_Extensions_for_Significant_Speedup_of_Hypervisor_Memory_Introspection


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-23 Thread Razvan Cojocaru
On 23.10.2017 11:41, Jan Beulich wrote:
>>>> On 23.10.17 at 10:34, <rcojoc...@bitdefender.com> wrote:
> 
>>
>> On 23.10.2017 11:10, Jan Beulich wrote:
>>>>>> On 20.10.17 at 18:32, <rcojoc...@bitdefender.com> wrote:
>>>> On 10/20/2017 07:15 PM, Wei Liu wrote:
>>>>> On Mon, Oct 16, 2017 at 08:07:41PM +0300, Petre Pircalabu wrote:
>>>>>> From: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>>>>
>>>>>> For the default EPT view we have xc_set_mem_access_multi(), which
>>>>>> is able to set an array of pages to an array of access rights with
>>>>>> a single hypercall. However, this functionality was lacking for the
>>>>>> altp2m subsystem, which could only set page restrictions for one
>>>>>> page at a time. This patch addresses the gap.
>>>>>>
>>>>>> HVMOP_altp2m_set_mem_access_multi has been added as a HVMOP (as opposed 
>>>>>> to a
>>>>>> DOMCTL) for consistency with its HVMOP_altp2m_set_mem_access counterpart 
>> (and
>>>>>> hence with the original altp2m design, where domains are allowed - with 
>>>>>> the
>>>>>> proper altp2m access rights - to alter these settings), in the absence 
>>>>>> of an
>>>>>> official position on the issue from the original altp2m designers.
>>>>>>
>>>>>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>>>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>>>>>
>>>>>
>>>>> The title is a bit misleading -- this patch actually contains changes to
>>>>> hypervisor as well.
>>>>
>>>> Sorry, I have assumed that the hypervisor changes are implied. We're
>>>> happy to change it. Would "x86/altp2m: Added
>>>> xc_altp2m_set_mem_access_multi() and hypervisor support" be better?
>>>
>>> But please not again "Added" - we've had this discussion before.
>>> The title is supposed to tell what a patch does, not what the state
>>> of the code is after it was applied.
>>
>> Will do, how does "{xen,libxc}/altp2m: support for setting restrictions
>> for an array of pages" sound?
> 
> The text is fine, but I'm not sure the {xen,libxc} part of the prefix
> is really very useful.

I was hoping to address Wei's comment with it - 'xen' would stand for
the hypervisor part, and 'libxc' for the toolstack part. However, you're
right: for one, the 'x86' part was useful, and then the problem before
was not so much that it didn't explicitly specify 'xen', but that it
implied that the changes have more to do with libxc (because it
mentioned xc_altp2m_set_mem_access_multi()).

"x86/altp2m: support for setting restrictions for an array of pages" it
is then. :) Sorry for causing confusion!


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-23 Thread Razvan Cojocaru


On 23.10.2017 11:10, Jan Beulich wrote:
>>>> On 20.10.17 at 18:32, <rcojoc...@bitdefender.com> wrote:
>> On 10/20/2017 07:15 PM, Wei Liu wrote:
>>> On Mon, Oct 16, 2017 at 08:07:41PM +0300, Petre Pircalabu wrote:
>>>> From: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>>
>>>> For the default EPT view we have xc_set_mem_access_multi(), which
>>>> is able to set an array of pages to an array of access rights with
>>>> a single hypercall. However, this functionality was lacking for the
>>>> altp2m subsystem, which could only set page restrictions for one
>>>> page at a time. This patch addresses the gap.
>>>>
>>>> HVMOP_altp2m_set_mem_access_multi has been added as a HVMOP (as opposed to 
>>>> a
>>>> DOMCTL) for consistency with its HVMOP_altp2m_set_mem_access counterpart 
>>>> (and
>>>> hence with the original altp2m design, where domains are allowed - with the
>>>> proper altp2m access rights - to alter these settings), in the absence of 
>>>> an
>>>> official position on the issue from the original altp2m designers.
>>>>
>>>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>>>
>>>
>>> The title is a bit misleading -- this patch actually contains changes to
>>> hypervisor as well.
>>
>> Sorry, I have assumed that the hypervisor changes are implied. We're
>> happy to change it. Would "x86/altp2m: Added
>> xc_altp2m_set_mem_access_multi() and hypervisor support" be better?
> 
> But please not again "Added" - we've had this discussion before.
> The title is supposed to tell what a patch does, not what the state
> of the code is after it was applied.

Will do, how does "{xen,libxc}/altp2m: support for setting restrictions
for an array of pages" sound?

We'll change the title as soon as we have comments to address for a new
version.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-20 Thread Razvan Cojocaru
On 10/20/2017 07:39 PM, Wei Liu wrote:
> On Fri, Oct 20, 2017 at 07:32:50PM +0300, Razvan Cojocaru wrote:
>> On 10/20/2017 07:15 PM, Wei Liu wrote:
>>> On Mon, Oct 16, 2017 at 08:07:41PM +0300, Petre Pircalabu wrote:
>>>> From: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>>
>>>> For the default EPT view we have xc_set_mem_access_multi(), which
>>>> is able to set an array of pages to an array of access rights with
>>>> a single hypercall. However, this functionality was lacking for the
>>>> altp2m subsystem, which could only set page restrictions for one
>>>> page at a time. This patch addresses the gap.
>>>>
>>>> HVMOP_altp2m_set_mem_access_multi has been added as a HVMOP (as opposed to 
>>>> a
>>>> DOMCTL) for consistency with its HVMOP_altp2m_set_mem_access counterpart 
>>>> (and
>>>> hence with the original altp2m design, where domains are allowed - with the
>>>> proper altp2m access rights - to alter these settings), in the absence of 
>>>> an
>>>> official position on the issue from the original altp2m designers.
>>>>
>>>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>>>
>>>
>>> The title is a bit misleading -- this patch actually contains changes to
>>> hypervisor as well.
>>
>> Sorry, I have assumed that the hypervisor changes are implied.
> 
> And to expound my thought on this -- change in hypervisor is not
> implied. We have had cases in which only toolstack change was needed
> because hypervisor code was already there. Getting the title correct
> will help reviewers identify patches they need to review.

Got it. We'll change the title in the next iteration.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-20 Thread Razvan Cojocaru
On 10/20/2017 07:15 PM, Wei Liu wrote:
> On Mon, Oct 16, 2017 at 08:07:41PM +0300, Petre Pircalabu wrote:
>> From: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>
>> For the default EPT view we have xc_set_mem_access_multi(), which
>> is able to set an array of pages to an array of access rights with
>> a single hypercall. However, this functionality was lacking for the
>> altp2m subsystem, which could only set page restrictions for one
>> page at a time. This patch addresses the gap.
>>
>> HVMOP_altp2m_set_mem_access_multi has been added as a HVMOP (as opposed to a
>> DOMCTL) for consistency with its HVMOP_altp2m_set_mem_access counterpart (and
>> hence with the original altp2m design, where domains are allowed - with the
>> proper altp2m access rights - to alter these settings), in the absence of an
>> official position on the issue from the original altp2m designers.
>>
>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>
> 
> The title is a bit misleading -- this patch actually contains changes to
> hypervisor as well.

Sorry, I have assumed that the hypervisor changes are implied. We're
happy to change it. Would "x86/altp2m: Added
xc_altp2m_set_mem_access_multi() and hypervisor support" be better?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 00/14] Intel EPT-Based Sub-page Write Protection Support.

2017-10-20 Thread Razvan Cojocaru
On 20.10.2017 11:37, Yi Zhang wrote:
> On 2017-10-19 at 12:07:44 +0300, Razvan Cojocaru wrote:
>> On 19.10.2017 11:04, Zhang Yi wrote:
>>> From: Zhang Yi Z <yi.z.zh...@linux.intel.com>
>>>
>>> Hi All,
>>>
>>> Here is a patch-series which adding EPT-Based Sub-page Write Protection 
>>> Support. You can get It's software developer manuals from:
>>>
>>> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
>>
>> Has this been tested in any way with altp2m also enabled?
> 
> Thanks for your review Razvan. Haven't test it on altp2m mode, We plan
> to add the altp2m compatibility in next part of this feature.

Fair enough. Thanks for adding this feature!

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 00/14] Intel EPT-Based Sub-page Write Protection Support.

2017-10-20 Thread Razvan Cojocaru
On 20.10.2017 11:37, Yi Zhang wrote:
> On 2017-10-19 at 12:07:44 +0300, Razvan Cojocaru wrote:
>> On 19.10.2017 11:04, Zhang Yi wrote:
>>> From: Zhang Yi Z <yi.z.zh...@linux.intel.com>
>>>
>>> Hi All,
>>>
>>> Here is a patch-series which adding EPT-Based Sub-page Write Protection 
>>> Support. You can get It's software developer manuals from:
>>>
>>> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
>>
>> Has this been tested in any way with altp2m also enabled?
> 
> Thanks for your review Razvan. Haven't test it on altp2m mode, We plan
> to add the altp2m compatibility in next part of this feature.

Fair enough. Thanks for adding this feature!

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 00/14] Intel EPT-Based Sub-page Write Protection Support.

2017-10-19 Thread Razvan Cojocaru
On 19.10.2017 11:04, Zhang Yi wrote:
> From: Zhang Yi Z 
> 
> Hi All,
> 
> Here is a patch-series which adding EPT-Based Sub-page Write Protection 
> Support. You can get It's software developer manuals from:
> 
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Has this been tested in any way with altp2m also enabled?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 14/14] xen: tools: Added xen-subpage tool.

2017-10-19 Thread Razvan Cojocaru
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#define DPRINTF(a, b...) fprintf(stderr, a, ## b)
> +#define ERROR(a, b...) fprintf(stderr, a "\n", ## b)
> +#define PERROR(a, b...) fprintf(stderr, a ": %s\n", ## b, strerror(errno))
> +
> +void usage(char* progname)
> +{
> +fprintf(stderr, "Usage: %s [-m]  get|set [gfn] [bit_map]", 
> progname);
> +
> +fprintf(stderr,
> +"\n"
> +"set - set gfn bitmap.\n"
> +"\n"
> +"-m requires this program to run\n");
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +domid_t domain_id;
> +xc_interface *xch;
> +xen_pfn_t gfn = 0;
> +uint32_t access = 0;
> +int required = 0;
> +int rc = 0;
> +
> +char* progname = argv[0];
> +argv++;
> +argc--;
> +
> +if ( argc == 5 && argv[0][0] == '-' )
> +{
> +if ( !strcmp(argv[0], "-m") )
> +required = 1;
> +else
> +{
> +usage(progname);
> +return -1;
> +}
> +argv++;
> +argc--;
> +}
> +
> +if ( argc != 4 )
> +{
> +usage(progname);
> +return -1;
> +}
> +
> +domain_id = atoi(argv[0]);
> +argv++;
> +argc--;
> +
> +if ( !strcmp(argv[0], "set") )
> +{
> +gfn = strtoul(argv[1], 0, 0);
> +access = strtoul(argv[2], 0, 0);
> +DPRINTF("set subpage gfn:0x%lx -- map:0x%x\n", gfn, access);
> +xch = xc_interface_open(NULL, NULL, 0);
> +if ( !xch )
> +{
> +ERROR("get interface error\n");
> +return -1;
> +}
> +xc_mem_set_subpage(xch, domain_id, gfn, access);
> +xc_interface_close(xch);
> +}
> +else
> +{
> +usage(argv[0]);
> +return -1;
> +}
> +
> +return rc;
> +}

As far as I understand, this example just calls the new hypercall and exits.

Should there be another vm_event-subscribed application that will be
affected by the changes? If so, doesn't this rather belong in the
xen-access.c test?

Also, no explanation is given in comments in the source code or the
displayed help for useful values of the access parameter, and what it
stands for.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 13/14] xen: tools: Introduce the set-subpage into xenctrl

2017-10-19 Thread Razvan Cojocaru
On 19.10.2017 11:15, Zhang Yi wrote:
> From: Zhang Yi Z 
> 
> Introduce the Xen Hypercall HVMOP_set_subpage into Xenctl.
> The API is defined as flowing.
> 
> int xc_mem_set_subpage(xc_interface *handle, domid_t domid,
>xen_pfn_t gfn, uint32_t access);
> 
> Signed-off-by: Zhang, Yi Z 
> ---
>  tools/libxc/include/xenctrl.h |  2 ++
>  tools/libxc/xc_mem_paging.c   | 20 
>  2 files changed, 22 insertions(+)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index bde8313..a13f2c7 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1956,6 +1956,8 @@ int xc_mem_paging_evict(xc_interface *xch, domid_t 
> domain_id, uint64_t gfn);
>  int xc_mem_paging_prep(xc_interface *xch, domid_t domain_id, uint64_t gfn);
>  int xc_mem_paging_load(xc_interface *xch, domid_t domain_id,
> uint64_t gfn, void *buffer);
> +int xc_mem_set_subpage(xc_interface *handle, domid_t domid,
> + xen_pfn_t gfn, uint32_t access);

Would you consider a small comment here explaining at least the access
parameter?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86/hvm: Add MSR old value

2017-10-13 Thread Razvan Cojocaru
On 10/13/2017 07:26 PM, Tamas K Lengyel wrote:
> On Fri, Oct 13, 2017 at 9:50 AM, Jan Beulich  wrote:
> On 13.10.17 at 14:50,  wrote:
>>> This patch adds the old value param and the onchangeonly option
>>> to the VM_EVENT_REASON_MOV_TO_MSR event.
>>>
>>> The param was added to the vm_event_mov_to_msr struct and to the
>>> hvm_monitor_msr function. Finally I've changed the bool_t param
>>> to a bool for the hvm_msr_write_intercept function.
>>>
>>> Signed-off-by: Alexandru Isaila 
>>> Acked-by: Tamas K Lengyel 
>>
>> I think this should have been dropped with a bug having been fixed
>> (for only cosmetic changes it would be fine to keep). For the bits
>> where it's applicable
>> Acked-by: Jan Beulich 
> 
> Indeed. It's fine for now, my ack still stands.

That was my mistake - I thought that the fix was so obvious that Tamas
couldn't possibly object, and advised Alexandru to keep the ack. We'll
be more careful next time.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/hvm: Add MSR old value

2017-10-13 Thread Razvan Cojocaru

On 13.10.2017 13:29, Jan Beulich wrote:

+__set_bit(index + sizeof(struct monitor_msr_bitmap), bitmap);


I think you miss "* 8" here - a bit position plus sizeof() doesn't
produce any useful value.

But what's worse - having read till the end of the patch I don't
see you change any allocation, yet you clearly need to double
the space now that you need two bits per MSR.


We did this:

diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
index e59f1f5..a3046c6 100644
--- a/xen/arch/x86/monitor.c
+++ b/xen/arch/x86/monitor.c
@@ -25,7 +25,7 @@
 int arch_monitor_init_domain(struct domain *d)
 {
 if ( !d->arch.monitor.msr_bitmap )
-d->arch.monitor.msr_bitmap = xzalloc(struct monitor_msr_bitmap);
+d->arch.monitor.msr_bitmap = xzalloc_array(struct 
monitor_msr_bitmap, 2);


 if ( !d->arch.monitor.msr_bitmap )
 return -ENOMEM;
@@ -67,7 +67,7 @@ static unsigned long *monitor_bitmap_for_msr(const 
struct domain *d, u32 *msr)

 }
 }

I.e., we are now allocating an array of size 2 of struct 
monitor_msr_bitmaps with xzalloc_array().



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] X86EMUL_CMPXCHG_FAILED

2017-10-11 Thread Razvan Cojocaru

Hello,

Now that "x86/hvm: implement hvmemul_write() using real mappings" is in, 
we can start working again on implementing hvmemul_cmpxchg() with a real 
CMPXCHG, and finally fix the SMP emulation race upstream.


However, in order to do that we would need X86EMUL_CMPXCHG_FAILED which 
has been dropped by Andrew here:


https://patchwork.kernel.org/patch/9449339/

and re-contributed by Jan here:

https://patchwork.kernel.org/patch/9651613/

(the patch is a combination of Jan's patch and my fumbling with CMPXCHG).

However, I remember Jan saying that his patch is no longer the way to go 
here. How should we proceed?



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-10 Thread Razvan Cojocaru
On 10/10/2017 05:24 PM, Jan Beulich wrote:
 On 10.10.17 at 16:13,  wrote:
>> On Ma, 2017-10-10 at 06:28 -0600, Jan Beulich wrote:> > 
>
>>
>> a.u.set_mem_access_multi.pfn_list,
 +  a.u.set_mem_access_multi.acc
 ess_list,
 +  a.u.set_mem_access_multi.nr,
 +  a.u.set_mem_access_multi.opa
 que,
 +  MEMOP_CMD_MASK,
>>> This not being a memop, re-using this value looks arbitrary.
>>> Unless it absolutely has to be this value, please either add a brief
>>> comment saying this just happens to fit your need or use a literal
>>> constant.
>> I will add a comment to clarify the reason for using MEMOP_CMD_MASK:
>> e.g.
>> /* MEMOP_CMD_MASK is used here to mirror the way
>> p2m_set_mem_access_multi() is called for the
>> XENMEM_access_op_set_access_multi case. */
> 
> But that's neither brief nor an actual reason (if at all it is a cosmetic
> one). A wider or more narrow mask would do, afaict.

Revisiting the p2m_set_mem_access_multi() code, the mask is used here:

447 /* Check for continuation if it's not the last iteration. */
448 if ( nr > ++start && !(start & mask) &&
hypercall_preempt_check() )
449 {
450 rc = start;
451 break;
452 }

If I'm reading the code correctly, MEMOP_CMD_MASK happens to be
0011, which allows an interruption in the processing loop
every 64th time we go through it.

Now, it does indeed make more sense to see MEMOP_CMD_MASK being used
with XENMEM_access_op_set_access_multi than it does with altp2m (where
we're not dealing with memops). However, indeed almost any mask would do
here (0x1f, for example, or 0x7f, or something else entirely).

The reason I've initially picked MEMOP_CMD_MASK for this patch is that
it had seemed like a reasonable default (currenly made use of with
XENMEM_access_op_set_access_multi). I'm quite likely missing something,
but I don't know what criteria we should use for picking a good value
here, and how to express it. And if we go with the tried and true
MEMOP_CMD_MASK (because it seems to work well), is it not appropriate to
express that intent by using its name in the code?

Should we just use it's value directly (i.e. 0x3f)?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-09 Thread Razvan Cojocaru

On 09.10.2017 10:23, Jan Beulich wrote:

On 06.10.17 at 18:07,  wrote:

On 10/06/2017 06:34 PM, Jan Beulich wrote:

On 05.10.17 at 17:42,  wrote:

@@ -4451,6 +4453,7 @@ static int do_altp2m_op(
  case HVMOP_altp2m_destroy_p2m:
  case HVMOP_altp2m_switch_p2m:
  case HVMOP_altp2m_set_mem_access:
+case HVMOP_altp2m_set_mem_access_multi:


Was it agreed that this, just like others (many wrongly, I think) is
supposed to be invokable by the affected domain itself? Its non-
altp2m counterpart is a DM_PRIV operation. If the one here is
meant to be different, I think the commit message should say why.


In the absence of an answer from the designers of altp2m, we've chosen
to remain consistent with HVMOP_altp2m_set_mem_access - since that is
allowed to be invoked by the domain itself, this operation is also
allowed to do that.

Back in March, I've sent a DOMCTL version:

https://patchwork.kernel.org/patch/9633615/

and a HVMOP version (minus the compat part):

https://patchwork.kernel.org/patch/9612799/

It has been discussed, and an authoritative answer on the design of this
was sought out, but despite several kind reminders during this time, it
never came. At this point, the least modification to the initial design
appears to be to keep the new operation as a HVMOP. This is an important
optimization, and the waiting period for objections has surely been
reasonable.


Okay, this is (sort of) fine, but especially when there was no
feedback the decision taken (and its reason) should be recorded
in the commit message. As stated above as well as earlier, I
strongly think that the altp2m permissions are too lax right now
(and hence the patch here widens the problem, but I can agree
that making it match set_mem_access is not unreasonable).


Of course. We'll update the commit message. Thanks for the review!


Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-10-06 Thread Razvan Cojocaru
On 10/06/2017 06:34 PM, Jan Beulich wrote:
 On 05.10.17 at 17:42,  wrote:
>> @@ -4451,6 +4453,7 @@ static int do_altp2m_op(
>>  case HVMOP_altp2m_destroy_p2m:
>>  case HVMOP_altp2m_switch_p2m:
>>  case HVMOP_altp2m_set_mem_access:
>> +case HVMOP_altp2m_set_mem_access_multi:
> 
> Was it agreed that this, just like others (many wrongly, I think) is
> supposed to be invokable by the affected domain itself? Its non-
> altp2m counterpart is a DM_PRIV operation. If the one here is
> meant to be different, I think the commit message should say why.

In the absence of an answer from the designers of altp2m, we've chosen
to remain consistent with HVMOP_altp2m_set_mem_access - since that is
allowed to be invoked by the domain itself, this operation is also
allowed to do that.

Back in March, I've sent a DOMCTL version:

https://patchwork.kernel.org/patch/9633615/

and a HVMOP version (minus the compat part):

https://patchwork.kernel.org/patch/9612799/

It has been discussed, and an authoritative answer on the design of this
was sought out, but despite several kind reminders during this time, it
never came. At this point, the least modification to the initial design
appears to be to keep the new operation as a HVMOP. This is an important
optimization, and the waiting period for objections has surely been
reasonable.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 9/9] xen: Convert __page_to_mfn and __mfn_to_page to use typesafe MFN

2017-10-05 Thread Razvan Cojocaru
On 10/05/2017 08:42 PM, Julien Grall wrote:
> Most of the users of page_to_mfn and mfn_to_page are either overriding
> the macros to make them work with mfn_t or use mfn_x/_mfn because the
> rest of the function use mfn_t.
> 
> So make __page_to_mfn and __mfn_to_page return mfn_t by default.
> 
> Only reasonable clean-ups are done in this patch because it is
> already quite big. So some of the files now override page_to_mfn and
> mfn_to_page to avoid using mfn_t.
> 
> Lastly, domain_page_to_mfn is also converted to use mfn_t given that
> most of the callers are now switched to _mfn(domain_page_to_mfn(...)).
> 
> Signed-off-by: Julien Grall <julien.gr...@linaro.org>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] xen: Convert __page_to_mfn and __mfn_to_page to use typesafe MFN

2017-10-04 Thread Razvan Cojocaru
On 10/04/2017 09:15 PM, Julien Grall wrote:
> Most of the users of page_to_mfn and mfn_to_page are either overriding
> the macros to make them work with mfn_t or use mfn_x/_mfn because the
> rest of the function use mfn_t.
> 
> So make __page_to_mfn and __mfn_to_page return mfn_t by default.
> 
> Only reasonable clean-ups are done in this patch because it is
> already quite big. So some of the files now override page_to_mfn and
> mfn_to_page to avoid using mfn_t.
> 
> Signed-off-by: Julien Grall <julien.gr...@linaro.org>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/16] xen/x86: mem_access: Fix mis-indented line

2017-09-21 Thread Razvan Cojocaru
On 09/21/2017 03:40 PM, Julien Grall wrote:
> Signed-off-by: Julien Grall <julien.gr...@arm.com>
> 
> ---
> 
> Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Tamas K Lengyel <ta...@tklengyel.com>
> Cc: George Dunlap <george.dun...@eu.citrix.com>
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/domctl: Don't pause the whole domain if only getting vcpu state

2017-09-18 Thread Razvan Cojocaru
On 09/18/2017 06:35 PM, Jan Beulich wrote:
 On 12.09.17 at 15:53,  wrote:
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -625,6 +625,26 @@ long arch_do_domctl(
>>   !is_hvm_domain(d) )
>>  break;
>>  
>> +if ( domctl->u.hvmcontext_partial.type == HVM_SAVE_CODE(CPU) &&
>> + domctl->u.hvmcontext_partial.instance < d->max_vcpus )
> 
> I have to admit that I'm not in favor of such special casing, even
> less so without any code comment saying why this is so special.
> What if someone else wanted some other piece of vCPU state
> without pausing the entire domain? Wouldn't it be possible to
> generalize this to cover all such state elements?

There's no reason why all the other cases where this would the possible
shouldn't be optimized. What has made this one stand out for us is that
we're using it a lot with introspection, and the optimization counts.

But judging by the code reorganization (the addition of
hvm_save_one_cpu_ctxt()), the changes would need to be done on a
one-by-one case anyway (different queries may require different ways of
chaging the code).



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/15] xen/x86: p2m: Use typesafe gfn for the P2M callbacks get_entry and set_entry

2017-09-13 Thread Razvan Cojocaru
On 09/13/2017 09:27 PM, Julien Grall wrote:
> Hi Andrew,
> 
> On 09/13/2017 07:22 PM, Andrew Cooper wrote:
>> On 13/09/2017 18:59, Julien Grall wrote:
>>> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
>>> index 0e63d6ed11..57878b1886 100644
>>> --- a/xen/arch/x86/mm/p2m-pt.c
>>> +++ b/xen/arch/x86/mm/p2m-pt.c
>>> @@ -479,12 +479,13 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
>>>     /* Returns: 0 for success, -errno for failure */
>>>   static int
>>> -p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>> +p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_t, mfn_t mfn,
>>>    unsigned int page_order, p2m_type_t p2mt,
>>> p2m_access_t p2ma,
>>>    int sve)
>>>   {
>>>   /* XXX -- this might be able to be faster iff current->domain
>>> == d */
>>>   void *table;
>>> +    unsigned long gfn = gfn_x(gfn_t);
>>>   unsigned long i, gfn_remainder = gfn;
>>>   l1_pgentry_t *p2m_entry, entry_content;
>>>   /* Intermediate table to free if we're replacing it with a
>>> superpage. */
>>> @@ -731,11 +732,12 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>>> unsigned long gfn, mfn_t mfn,
>>>   }
>>>     static mfn_t
>>> -p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
>>> +p2m_pt_get_entry(struct p2m_domain *p2m, gfn_t gfn_t,
>>>    p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
>>>    unsigned int *page_order, bool_t *sve)
>>>   {
>>>   mfn_t mfn;
>>> +    unsigned long gfn = gfn_x(gfn_t);
>>
>> These two are rather risky, because you shadow the gfn_t type with a
>> variable named gfn_t.  I know its ugly, but how about just gfn_ ?
> 
> I can do that. Hopefully in the future both functions will be fully
> typesafe, so gfn_ will completely disappear.

Ah, I had proposed gfn_l for unsigned long gfns - it seems to be the
unspoken convention in the rest of the code, so I think it fits nicely.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/15] xen/x86: p2m: Use typesafe gfn for the P2M callbacks get_entry and set_entry

2017-09-13 Thread Razvan Cojocaru
On 09/13/2017 08:59 PM, Julien Grall wrote:
> Signed-off-by: Julien Grall <julien.gr...@arm.com>
> 
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Tamas K Lengyel <ta...@tklengyel.com>
> Cc: George Dunlap <george.dun...@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakaj...@intel.com>
> Cc: Kevin Tian <kevin.t...@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c|  2 +-
>  xen/arch/x86/mm/mem_access.c  | 19 +--
>  xen/arch/x86/mm/mem_sharing.c |  4 +--
>  xen/arch/x86/mm/p2m-ept.c |  6 ++--
>  xen/arch/x86/mm/p2m-pod.c | 15 +
>  xen/arch/x86/mm/p2m-pt.c  |  6 ++--
>  xen/arch/x86/mm/p2m.c | 77 
> +++
>  xen/include/asm-x86/p2m.h |  4 +--
>  8 files changed, 73 insertions(+), 60 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 6cb903def5..53be8c093c 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1787,7 +1787,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
> long gla,
>  {
>  bool_t sve;
>  
> -p2m->get_entry(p2m, gfn, , , 0, NULL, );
> +p2m->get_entry(p2m, _gfn(gfn), , , 0, NULL, );
>  
>  if ( !sve && altp2m_vcpu_emulate_ve(curr) )
>  {
> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
> index 9211fc0abe..948e377e69 100644
> --- a/xen/arch/x86/mm/mem_access.c
> +++ b/xen/arch/x86/mm/mem_access.c
> @@ -66,7 +66,7 @@ static int _p2m_get_mem_access(struct p2m_domain *p2m, 
> gfn_t gfn,
>  }
>  
>  gfn_lock(p2m, gfn, 0);
> -mfn = p2m->get_entry(p2m, gfn_x(gfn), , , 0, NULL, NULL);
> +mfn = p2m->get_entry(p2m, gfn, , , 0, NULL, NULL);
>  gfn_unlock(p2m, gfn, 0);
>  
>  if ( mfn_eq(mfn, INVALID_MFN) )
> @@ -142,7 +142,7 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>vm_event_request_t **req_ptr)
>  {
>  struct vcpu *v = current;
> -unsigned long gfn = gpa >> PAGE_SHIFT;
> +gfn_t gfn = gaddr_to_gfn(gpa);
>  struct domain *d = v->domain;
>  struct p2m_domain *p2m = NULL;
>  mfn_t mfn;
> @@ -215,7 +215,7 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>  *req_ptr = req;
>  
>  req->reason = VM_EVENT_REASON_MEM_ACCESS;
> -req->u.mem_access.gfn = gfn;
> +req->u.mem_access.gfn = gfn_x(gfn);
>  req->u.mem_access.offset = gpa & ((1 << PAGE_SHIFT) - 1);
>  if ( npfec.gla_valid )
>  {
> @@ -247,7 +247,7 @@ int p2m_set_altp2m_mem_access(struct domain *d, struct 
> p2m_domain *hp2m,
>  unsigned long gfn_l = gfn_x(gfn);

I'd drop gfn_l completely here and just use gfn_x(gfn) in the two places
below where it's used. It would get rid of a line of code in this
function. But I wouldn't hold the patch over this, so if nobody else has
comments it's fine as is.

>  int rc;
>  
> -mfn = ap2m->get_entry(ap2m, gfn_l, , _a, 0, NULL, NULL);
> +mfn = ap2m->get_entry(ap2m, gfn, , _a, 0, NULL, NULL);
>  
>  /* Check host p2m if no valid entry in alternate */
>  if ( !mfn_valid(mfn) )
> @@ -264,16 +264,16 @@ int p2m_set_altp2m_mem_access(struct domain *d, struct 
> p2m_domain *hp2m,
>  if ( page_order != PAGE_ORDER_4K )
>  {
>  unsigned long mask = ~((1UL << page_order) - 1);
> -unsigned long gfn2_l = gfn_l & mask;
> +gfn_t gfn2 = _gfn(gfn_l & mask);
>  mfn_t mfn2 = _mfn(mfn_x(mfn) & mask);
>  
> -rc = ap2m->set_entry(ap2m, gfn2_l, mfn2, page_order, t, old_a, 
> 1);
> +rc = ap2m->set_entry(ap2m, gfn2, mfn2, page_order, t, old_a, 1);
>  if ( rc )
>  return rc;
>  }
>  }
>  
> -return ap2m->set_entry(ap2m, gfn_l, mfn, PAGE_ORDER_4K, t, a,
> +return ap2m->set_entry(ap2m, gfn, mfn, PAGE_ORDER_4K, t, a,
>   (current->domain != d));

If you need to send V2, could you please also indent the last line so
that '(' is under the 'a' in 'ap2m'? You don't have to, but it'd be a
coding style fix coming in on top of this patch.

>  }
>  
> @@ -295,10 +295,9 @@ static int set_mem_access(struct domain *d, struct 
> p2m_domain *p2m,
>  mfn_t mfn;
>  p2m_access_t _a;
>  p2m_type_t t;
> -unsigned long gfn_l = gfn_x(gfn);
>  
> -mfn = p2m->get_entry(p2m, gfn_l, , &_a, 0, 

Re: [Xen-devel] [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles

2017-09-12 Thread Razvan Cojocaru
On 09/12/2017 06:08 PM, Jan Beulich wrote:
> By virtue of the struct xen_domctl container structure, most of them
> are really just cluttering the name space.
> 
> While doing so,
> - convert an enum typed (pt_irq_type_t) structure field to a fixed
>   width type,
> - make x86's paging_domctl() and descendants take a properly typed
>   handle,
> - add const in a few places.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] mem_access: switch to plain bool

2017-09-11 Thread Razvan Cojocaru

On 11.09.2017 14:16, Wei Liu wrote:

Signed-off-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
Cc: Tamas K Lengyel <ta...@tklengyel.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Julien Grall <julien.gr...@arm.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
  xen/arch/arm/mem_access.c|  4 ++--
  xen/arch/x86/mm/mem_access.c | 16 
  xen/include/asm-arm/mem_access.h |  8 
  xen/include/asm-x86/mem_access.h |  8 
  4 files changed, 18 insertions(+), 18 deletions(-)


Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>

Thanks for doing this!


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] monitor: switch to plain bool

2017-09-08 Thread Razvan Cojocaru

On 08.09.2017 16:44, Wei Liu wrote:

Signed-off-by: Wei Liu<wei.l...@citrix.com>
---
Cc: Razvan Cojocaru<rcojoc...@bitdefender.com>
Cc: Tamas K Lengyel<ta...@tklengyel.com>
---
  xen/arch/arm/monitor.c|  4 ++--
  xen/arch/x86/hvm/monitor.c| 10 +-
  xen/common/monitor.c  |  8 
  xen/include/asm-x86/hvm/monitor.h |  6 +++---
  xen/include/xen/monitor.h |  2 +-
  5 files changed, 15 insertions(+), 15 deletions(-)


Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 4/4] vm_event: Move vm_event_toggle_singlestep to

2017-09-05 Thread Razvan Cojocaru
On 09/05/2017 11:57 AM, Sergej Proskurin wrote:
> In this commit we move the declaration of the function
> vm_event_toggle_singlestep from  to  and
> implement the associated functionality on ARM.
> 
> Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 1/4] arm/monitor: Introduce monitoring of single-step events

2017-09-05 Thread Razvan Cojocaru
On 09/05/2017 11:57 AM, Sergej Proskurin wrote:
> In this commit, we extend the capabilities of the monitor to allow
> tracing of single-step events on ARM.
> 
> Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/hvm: Expose MSR_SHADOW_GS_BASE

2017-09-01 Thread Razvan Cojocaru

On 01.09.2017 13:55, Andrew Cooper wrote:

On 01/09/17 11:44, Alexandru Isaila wrote:

diff --git a/xen/include/public/arch-x86/hvm/save.h 
b/xen/include/public/arch-x86/hvm/save.h
index fd7bf3f..e6e8e87 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -134,6 +134,8 @@ struct hvm_hw_cpu {
  /* msr for em64t */
  uint64_t shadow_gs;
  
+uint64_t shadow_gs_base;

+


You presumably haven't tried migrating across this boundary?  (Things
will explode rather impressively when you try to restore LSTAR into SMASK.)

What's wrong with the shadow_gs in context here?


Nothing, we've missed that, while grepping xen/ for MSR_SHADOW_GS_BASE 
and 0xc102. My fault, should have checked more thoroughly. Apologies.



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 39/39] arm/xen-access: Add test of xc_altp2m_change_gfn

2017-08-30 Thread Razvan Cojocaru
On 08/30/2017 09:32 PM, Sergej Proskurin wrote:
> This commit extends xen-access by a simple test of the functionality
> provided by "xc_altp2m_change_gfn". The idea is to dynamically remap a
> trapping gfn to another mfn, which holds the same content as the
> original mfn. On success, the guest will continue to run. Subsequent
> altp2m access violations will trap into Xen and be forced by xen-access
> to switch to the default view (altp2m[0]) as before. The introduced test
> can be invoked by providing the argument "altp2m_remap".
> 
> Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>
> ---
> Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Tamas K Lengyel <ta...@tklengyel.com>
> Cc: Ian Jackson <ian.jack...@eu.citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>
> ---
> v3: Cosmetic fixes in "xenaccess_copy_gfn" and "xenaccess_change_gfn".
> 
> Added munmap in "copy_gfn" in the second error case.
> 
> Added option "altp2m_remap" selecting the altp2m-remap test.
> 
> v4: Dropped the COMPAT API for mapping foreign memory. Instead, we use the
> stable library xenforeignmemory.
> 
> Dropped the use of xc_domain_increase_reservation_exact as we do not
> need to increase the domain's physical memory. Otherwise, remapping
> a page via altp2m would become visible to the guest itself. As long
> as we have additional shadow-memory for the guest domain, we do not
> need to reserve any additional memory.
> ---
>  tools/tests/xen-access/Makefile |   2 +-
>  tools/tests/xen-access/xen-access.c | 182 
> +++-
>  2 files changed, 179 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/tests/xen-access/Makefile b/tools/tests/xen-access/Makefile
> index e11f639ccf..ab195e233f 100644
> --- a/tools/tests/xen-access/Makefile
> +++ b/tools/tests/xen-access/Makefile
> @@ -26,6 +26,6 @@ clean:
>  distclean: clean
>  
>  xen-access: xen-access.o Makefile
> - $(CC) -o $@ $< $(LDFLAGS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) 
> $(LDLIBS_libxenevtchn)
> + $(CC) -o $@ $< $(LDFLAGS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) 
> $(LDLIBS_libxenevtchn) $(LDLIBS_libxenforeignmemory)
>  
>  -include $(DEPS)
> diff --git a/tools/tests/xen-access/xen-access.c 
> b/tools/tests/xen-access/xen-access.c
> index 481337cacd..f9b9fb6bbf 100644
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -41,6 +41,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #if defined(__arm__) || defined(__aarch64__)
>  #include 
> @@ -49,6 +50,8 @@
>  #define START_PFN 0ULL
>  #endif
>  
> +#define INVALID_GFN ~(0UL)
> +
>  #define DPRINTF(a, b...) fprintf(stderr, a, ## b)
>  #define ERROR(a, b...) fprintf(stderr, a "\n", ## b)
>  #define PERROR(a, b...) fprintf(stderr, a ": %s\n", ## b, strerror(errno))
> @@ -76,12 +79,19 @@ typedef struct vm_event {
>  typedef struct xenaccess {
>  xc_interface *xc_handle;
>  
> +xenforeignmemory_handle *fmem;
> +
>  xen_pfn_t max_gpfn;
>  
>  vm_event_t vm_event;
> +
> +unsigned int ap2m_idx;
> +xen_pfn_t gfn_old;
> +xen_pfn_t gfn_new;
>  } xenaccess_t;
>  
>  static int interrupted;
> +static int gfn_changed = 0;
>  bool evtchn_bind = 0, evtchn_open = 0, mem_access_enable = 0;
>  
>  static void close_handler(int sig)
> @@ -89,6 +99,104 @@ static void close_handler(int sig)
>  interrupted = sig;
>  }
>  
> +static int xenaccess_copy_gfn(xenaccess_t *xenaccess,
> +  domid_t domain_id,
> +  xen_pfn_t dst_gfn,
> +  xen_pfn_t src_gfn)
> +{
> +void *src_vaddr = NULL;
> +void *dst_vaddr = NULL;
> +
> +src_vaddr = xenforeignmemory_map(xenaccess->fmem, domain_id, PROT_READ,
> + 1, _gfn, NULL);
> +if ( src_vaddr == NULL )
> +return -1;
> +
> +dst_vaddr = xenforeignmemory_map(xenaccess->fmem, domain_id, PROT_WRITE,
> + 1, _gfn, NULL> +if ( dst_vaddr 
> == NULL )
> +{
> +munmap(src_vaddr, XC_PAGE_SIZE);
> +return -1;
> +}
> +
> +memcpy(dst_vaddr, src_vaddr, XC_PAGE_SIZE);
> +
> +xenforeignmemory_unmap(xenaccess->fmem, src_vaddr, 1);
> +xenforeignmemory_unmap(xenaccess->fmem, dst_vaddr, 1);
> +
> +return 0;
> +}
> +
> +/*
> + * This function allocates and populates a page in the guest's physmap that 
> is
> + * sub

Re: [Xen-devel] [PATCH v4 29/39] x86/altp2m: Move altp2m_check to altp2m.c

2017-08-30 Thread Razvan Cojocaru
On 08/30/2017 09:32 PM, Sergej Proskurin wrote:
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index 42e6f09029..66f1d83d84 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Any reason why this include has not happened alphabetically (it belongs
to the  group)?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen-access: Correct default value of write-to-CR4 switch

2017-08-30 Thread Razvan Cojocaru
On 08/30/2017 02:19 PM, Sergej Proskurin wrote:
> The current implementation configures the test environment to always
> trap on writes to the CR4 control register, even on ARM. This leads to
> issues as calling xc_monitor_write_ctrlreg on ARM with VM_EVENT_X86_CR4
> will always fail.
> 
> Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5] common/vm_event: Initialize vm_event lists on domain creation

2017-08-30 Thread Razvan Cojocaru
On 08/29/2017 08:41 PM, Wei Liu wrote:
 -xfree(d->vm_event);
 +#ifdef CONFIG_HAS_MEM_PAGING
 +xfree(d->vm_event_paging);
 +#endif
 +xfree(d->vm_event_monitor);
>>> Why do you unconditionally xfree these vm_event_monitor while you don't
>>> unconditionally allocate them?
>>>
>>> Not that this is strictly a bug but this deviates from the original
>>> behaviour.
>> Isn't xfree NULL safe?
> Yes, it is.

That was my suggestion: since xfree() can handle NULL I thought the
patch would have minimal changes and be easier to review if we omit
unnecessary lines.


Thanks,
Razvan



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5] x86/hvm: Allow guest_request vm_events coming from userspace

2017-08-16 Thread Razvan Cojocaru
On 08/16/2017 02:16 AM, Tamas K Lengyel wrote:
> On Tue, Aug 15, 2017 at 2:06 AM, Jan Beulich  wrote:
> On 14.08.17 at 17:53,  wrote:
>>> On Tue, Aug 8, 2017 at 2:27 AM, Alexandru Isaila  
>>> wrote:
 --- a/xen/arch/x86/hvm/hypercall.c
 +++ b/xen/arch/x86/hvm/hypercall.c
 @@ -155,6 +155,11 @@ int hvm_hypercall(struct cpu_user_regs *regs)
  /* Fallthrough to permission check. */
  case 4:
  case 2:
 +if ( currd->arch.monitor.guest_request_userspace_enabled &&
 +eax == __HYPERVISOR_hvm_op &&
 +(mode == 8 ? regs->rdi : regs->ebx) == 
 HVMOP_guest_request_vm_event )
 +break;
 +
>>>
>>> So the CPL check happens after the monitor check, which means this
>>> will trigger regardless if the hypercall is coming from userspace or
>>> kernelspace. Since the monitor option specifically says userspace,
>>> this should probably get moved into the block where CPL was checked.
>>
>> What difference would this make? For CPL0 the hypercall is
>> permitted anyway, and for CPL > 0 we specifically want to bypass
>> the CPL check. Or are you saying you want to restrict the new
>> check to just CPL3?
>>
> 
> Yes, according to the name of this monitor option this should only
> trigger a vm_event when the hypercall is coming from CPL3. However,
> the way it is implemented right now I see that this monitor option
> actually requires the other one to be enabled too. By itself this
> monitor option will not work. So I would also like to ask that the
> check in xen/common/monitor.c, if ( d->monitor.guest_request_enabled
> ), to be extended to be: if ( d->monitor.guest_request_enabled ||
> d->monitor.guest_request_userspace_enabled )

The option does not trigger anything. Its job is to allow guest requests
coming from userspace (via VMCALLs). And not to _only_ allow these for
userspace, but to allow them coming from userspace _as_well_.

The current version of the patch, if I've not missed something, does not
require d->monitor.guest_request_enabled to be true to work (the options
can be toggled independently).

The new function is meant to be called at any time, independent of
enabling / disabling the guest request vm_event (i.e. it only controls
its behaviour once it's enabled). So guest_request_userspace_enabled
should not be used as synonym for guest_request_enabled.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/hvm: Allow guest_request vm_events coming from userspace

2017-08-05 Thread Razvan Cojocaru
On 08/05/2017 04:32 AM, Tamas K Lengyel wrote:
> 
> 
> On Fri, Aug 4, 2017 at 5:32 AM, Alexandru Isaila
> > wrote:
> 
> In some introspection usecases, an in-guest agent needs to communicate
> with the external introspection agent.  An existing mechanism is
> HVMOP_guest_request_vm_event, but this is restricted to kernel usecases
> like all other hypercalls.
> 
> Introduce a mechanism whereby the introspection agent can whitelist the
> use of HVMOP_guest_request_vm_event directly from userspace.
> 
> Signed-off-by: Alexandru Isaila  >
> 
> ---
> Changes since V3:
> - Changed commit message
> - Added new lines
> - Indent the maximum space on the defines
> - Chaned the name of the define/function name/struct member
>   from vmcall to event
> ---
>  tools/libxc/include/xenctrl.h |  1 +
>  tools/libxc/xc_monitor.c  | 14 ++
>  xen/arch/x86/hvm/hypercall.c  |  5 +
>  xen/common/monitor.c  | 14 ++
>  xen/include/public/domctl.h   | 21 +++--
>  xen/include/xen/sched.h   |  5 +++--
>  6 files changed, 48 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h
> b/tools/libxc/include/xenctrl.h
> index bde8313..90a056f 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2022,6 +2022,7 @@ int xc_monitor_descriptor_access(xc_interface
> *xch, domid_t domain_id,
>   bool enable);
>  int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id,
>   bool enable, bool sync);
> +int xc_allow_guest_userspace_event(xc_interface *xch, domid_t
> domain_id, bool enable);
>  int xc_monitor_debug_exceptions(xc_interface *xch, domid_t domain_id,
>  bool enable, bool sync);
>  int xc_monitor_cpuid(xc_interface *xch, domid_t domain_id, bool
> enable);
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> index b44ce93..6064c39 100644
> --- a/tools/libxc/xc_monitor.c
> +++ b/tools/libxc/xc_monitor.c
> @@ -161,6 +161,20 @@ int xc_monitor_guest_request(xc_interface *xch,
> domid_t domain_id, bool enable,
>  return do_domctl(xch, );
>  }
> 
> +int xc_allow_guest_userspace_event(xc_interface *xch, domid_t
> domain_id, bool enable)
> 
> 
> This function should be prefixed with "xc_monitor_" like all the rest of
> the functions here.
That one was my suggestion, as I thought xc_monitor_-prefixed functions
are meant to toggle monitoring somehow, whereas this function only
toggles userspace use of guest request VMCALLs.

But re-adding the prefix is fine - since toggling it only influences in
the end how a vm_event is sent, it is after all monitor-related.


Thanks for the review,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-08-03 Thread Razvan Cojocaru
On 03/09/2017 06:56 PM, Jan Beulich wrote:
 On 09.03.17 at 10:38,  wrote:
>> @@ -4535,6 +4536,30 @@ static int do_altp2m_op(
>>  a.u.set_mem_access.view);
>>  break;
>>  
>> +case HVMOP_altp2m_set_mem_access_multi:
>> +if ( a.u.set_mem_access_multi.pad ||
>> + a.u.set_mem_access_multi.opaque >= a.u.set_mem_access_multi.nr 
>> )
>> +{
>> +rc = -EINVAL;
>> +break;
>> +}
>> +rc = p2m_set_mem_access_multi(d, a.u.set_mem_access_multi.pfn_list,
>> +  a.u.set_mem_access_multi.access_list,
>> +  a.u.set_mem_access_multi.nr,
>> +  a.u.set_mem_access_multi.opaque,
>> +  MEMOP_CMD_MASK,
>> +  a.u.set_mem_access_multi.view);
>> +if ( rc > 0 )
>> +{
>> +a.u.set_mem_access_multi.opaque = rc;
>> +if ( __copy_to_guest(arg, , 1) )
>> +rc = -EFAULT;
>> +else
>> +rc = hypercall_create_continuation(__HYPERVISOR_hvm_op, 
>> "lh",
>> +   HVMOP_altp2m, arg);
>> +}
>> +break;
> 
> Okay, so this is a hvmop, in which case I'm fine with the continuation
> model used.
> 
> However - is this interface supposed to be usable by a guest on itself?
> Arguably the same question would apply to some of the other sub-ops
> too, but anyway.
> 
>> --- a/xen/include/public/hvm/hvm_op.h
>> +++ b/xen/include/public/hvm/hvm_op.h
>> @@ -231,6 +231,23 @@ struct xen_hvm_altp2m_set_mem_access {
>>  typedef struct xen_hvm_altp2m_set_mem_access 
>> xen_hvm_altp2m_set_mem_access_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
>>  
>> +struct xen_hvm_altp2m_set_mem_access_multi {
>> +/* view */
>> +uint16_t view;
>> +uint16_t pad;
>> +/* Number of pages */
>> +uint32_t nr;
>> +/* Used for continuation purposes */
>> +uint64_t opaque;
>> +/* List of pfns to set access for */
>> +XEN_GUEST_HANDLE(const_uint64) pfn_list;
>> +/* Corresponding list of access settings for pfn_list */
>> +XEN_GUEST_HANDLE(const_uint8) access_list;
> 
> I'm afraid these handles aren't going to work for a 32-bit guest. Note
> how no other hvmop uses handles.

OK, so at this point, since Ravi has not expressed a preference, but
Andrew has said that these should be accessible from the guest as well,
I assume we should move forward with this as a HVMOP, addressing the
comment above regarding compatibility. (Just to help the discussion,
here is the original patch: https://patchwork.kernel.org/patch/9612799/).

I do prefer the DOMCTL version, but of course the operation will not be
available for the guest then and we may just as well make it a HVMOP
from the beginnig.

If there are any objections, please let me know.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86/hvm: Allow guest_request vm_events coming from userspace

2017-08-02 Thread Razvan Cojocaru



On 01.08.2017 19:07, Tamas K Lengyel wrote:

On Tue, Aug 1, 2017 at 4:30 AM, Andrew Cooper  wrote:

On 01/08/17 10:46, Alexandru Isaila wrote:

Allow guest userspace code to request that a vm_event be sent out
via VMCALL. This functionality seems to be handy for a number of
Xen developers, as stated on the mailing list (thread "[Xen-devel]
HVMOP_guest_request_vm_event only works from guest in ring0").
This is a use case in communication between a userspace application
in the guest and the introspection application in dom0.

Signed-off-by: Alexandru Isaila 


This issue has been argued several times before, and while I am in
favour of the change, there is a legitimate argument that it breaks one
of our security boundaries.

One intermediate option comes to mind however.

Could we introduce a new monitor op which permits the use of
HVMOP_guest_request_vm_event from userspace?  This way, it requires a
positive action on behalf of the introspection agent to relax the CPL
check, rather than having the CPL check unconditionally relaxed.


I agree, it would be required to gate this on a monitor option that is
disabled by default.


That's the way we'll move forward. About that: would you prefer a new 
libxc function that toggles only the userspace part, or that we add a 
bool parameter to the existing xc_monitor_guest_request()?



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6] x86/monitor: Notify monitor if an emulation fails.

2017-08-01 Thread Razvan Cojocaru
On 07/25/2017 08:40 PM, Razvan Cojocaru wrote:
> On 07/18/2017 01:20 PM, Razvan Cojocaru wrote:
>> On 07/18/2017 01:09 PM, Andrew Cooper wrote:
>>> On 18/07/17 10:37, Petre Pircalabu wrote:
>>>> If case of a vm_event with the emulate_flags set, if the instruction
>>>> cannot be emulated, the monitor should be notified instead of directly
>>>> injecting a hw exception.
>>>> This behavior can be used to re-execute an instruction not supported by
>>>> the emulator using the real processor (e.g. altp2m) instead of just
>>>> crashing.
>>>>
>>>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>>
>>> There are many situations which end up failing an emulation with
>>> UNHANDLEABLE.
>>>
>>> What exact scenario are you looking to catch?  Is it just instructions
>>> which aren't implemented in the emulator?
>>
>> Instructions that are not implemented in the emulator are our main use
>> case for this, yes. In which case, we'd like a chance to be able to
>> single-step them (using altp2m), so that the guest will continue to run
>> even with an incomplete emulator.
>>
>> We don't care about instructions that would have failed to run in both
>> scenarios (emulated or single-stepped). I'm not sure if there are other
>> cases in which an instruction, although supported by the emulator, would
>> fail emulation but pass single-stepping.
> 
> Is there further action required on our part the get this patch
> commited? FWIW, while not ideal, from our perspective trying to
> single-step an instruction that was UNHANDLEABLE when attempting to
> emulate it is acceptable (even if UNHANDLEABLE doesn't mean that the
> instruction is not supported by the emulator). At worst it won't run
> when single-stepped either, and that'll be that.
Since this seems to be blocked as-is, I propose transforming this patch
into a series, with one patch adding a new return code specifically for
unsupported instructions (X86_EMUL_UNIMPLEMENTED or
X86_EMUL_UNSUPPORTED?), and this patch sending the vm_event out only for
that. (In which case the event's name should probably change as well to
reflect the name of the new error code.)

Thoughts?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6] x86/monitor: Notify monitor if an emulation fails.

2017-07-25 Thread Razvan Cojocaru
On 07/18/2017 01:20 PM, Razvan Cojocaru wrote:
> On 07/18/2017 01:09 PM, Andrew Cooper wrote:
>> On 18/07/17 10:37, Petre Pircalabu wrote:
>>> If case of a vm_event with the emulate_flags set, if the instruction
>>> cannot be emulated, the monitor should be notified instead of directly
>>> injecting a hw exception.
>>> This behavior can be used to re-execute an instruction not supported by
>>> the emulator using the real processor (e.g. altp2m) instead of just
>>> crashing.
>>>
>>> Signed-off-by: Petre Pircalabu <ppircal...@bitdefender.com>
>>
>> There are many situations which end up failing an emulation with
>> UNHANDLEABLE.
>>
>> What exact scenario are you looking to catch?  Is it just instructions
>> which aren't implemented in the emulator?
> 
> Instructions that are not implemented in the emulator are our main use
> case for this, yes. In which case, we'd like a chance to be able to
> single-step them (using altp2m), so that the guest will continue to run
> even with an incomplete emulator.
> 
> We don't care about instructions that would have failed to run in both
> scenarios (emulated or single-stepped). I'm not sure if there are other
> cases in which an instruction, although supported by the emulator, would
> fail emulation but pass single-stepping.

Is there further action required on our part the get this patch
commited? FWIW, while not ideal, from our perspective trying to
single-step an instruction that was UNHANDLEABLE when attempting to
emulate it is acceptable (even if UNHANDLEABLE doesn't mean that the
instruction is not supported by the emulator). At worst it won't run
when single-stepped either, and that'll be that.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Question about hvm_monitor_interrupt

2017-07-21 Thread Razvan Cojocaru
On 07/22/2017 12:33 AM, Tamas K Lengyel wrote:
> Hey Razvan,

Hello,

> the vm_event that is being generated by doing
> VM_EVENT_FLAG_GET_NEXT_INTERRUPT sends almost all required information
> about the interrupt to the listener to allow it to get reinjected,
> except the instruction length. If the listener wants to reinject the
> interrupt to the guest via xc_hvm_inject_trap the instruction length
> is something needing to be specified. So shouldn't that information be
> included in the vm_event?

We only care about requesting guest page faults (TRAP_page_fault), so
that we may be able to inspect things like swapped-out pages, so for
that purpose the instruction length is not necessary. Having said that,
there's nothing against adding the instruction length to the vm_event if
you need it.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/2] x86/mm: Change default value for suppress #VE in set_mem_access()

2017-07-20 Thread Razvan Cojocaru
On 07/20/2017 09:52 PM, Tamas K Lengyel wrote:
> On Thu, Jul 20, 2017 at 12:25 PM, Razvan Cojocaru
> <rcojoc...@bitdefender.com> wrote:
>> On 07/20/2017 07:46 PM, Tamas K Lengyel wrote:
>>> On Thu, Jul 20, 2017 at 10:43 AM, George Dunlap
>>> <george.dun...@eu.citrix.com> wrote:
>>>> On Wed, Jul 19, 2017 at 7:24 PM, Tamas K Lengyel <ta...@tklengyel.com> 
>>>> wrote:
>>>>>> I think the issue would be whether to allow a domain to set/clear the
>>>>>> suppress #VE bit for its pages by calling the new HVMOP on itself.
>>>>>
>>>>> This problem is not limited to setting the SVE bit. It also applies to
>>>>> swapping altp2m views. Pretty much all altp2m HVMOPs can be issued
>>>>> from a user-space program without any way to check whether that
>>>>> process is allowed to do that or not. If you don't think it is safe
>>>>> for a domain to set SVE, the none of the altp2m ops are safe for the
>>>>> domain to issue on itself. If we could say ensure only the kernel can
>>>>> issue the hvmops, that would be OK. But that's not possible at the
>>>>> moment AFAICT.
>>>>
>>>> Wait, is that right?  I think we normally restrict hypercalls to only
>>>> being made from the guest kernel, don't we?
>>>>
>>>
>>> If that's the case then it's good to know (can you point me where that
>>> restriction is done?) I was just referring to the fact that
>>> technically a userspace program can issue VMCALL.
>>
>> It is the case AFAIK. We had to do this trick to allow guest request
>> hypercalls coming from guest userspace:
>>
>> https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-x86-hvm-Allow-guest_request-vm_events-coming-from-us.patch
>>
>> By default they can only come from the guest kernel.
> 
> Makes sense. Any reason not to have that patch upstreamed?

It's been rejected: https://patchwork.kernel.org/patch/9264837/

We need that because we inject a cleanup tool in the guest once a threat
is detected, and we want to be able to know what it is doing. So every
once in a while, the in-guest tool will do a VMCALL that ends up being
an event letting the dom0 application know what it's been up to (a
communication protocol if you will).

We'd be more than happy to retry upstreaming it if I've managed to
explain the need for it better today. :)


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/2] x86/mm: Change default value for suppress #VE in set_mem_access()

2017-07-20 Thread Razvan Cojocaru
On 07/20/2017 07:46 PM, Tamas K Lengyel wrote:
> On Thu, Jul 20, 2017 at 10:43 AM, George Dunlap
>  wrote:
>> On Wed, Jul 19, 2017 at 7:24 PM, Tamas K Lengyel  wrote:
 I think the issue would be whether to allow a domain to set/clear the
 suppress #VE bit for its pages by calling the new HVMOP on itself.
>>>
>>> This problem is not limited to setting the SVE bit. It also applies to
>>> swapping altp2m views. Pretty much all altp2m HVMOPs can be issued
>>> from a user-space program without any way to check whether that
>>> process is allowed to do that or not. If you don't think it is safe
>>> for a domain to set SVE, the none of the altp2m ops are safe for the
>>> domain to issue on itself. If we could say ensure only the kernel can
>>> issue the hvmops, that would be OK. But that's not possible at the
>>> moment AFAICT.
>>
>> Wait, is that right?  I think we normally restrict hypercalls to only
>> being made from the guest kernel, don't we?
>>
> 
> If that's the case then it's good to know (can you point me where that
> restriction is done?) I was just referring to the fact that
> technically a userspace program can issue VMCALL.

It is the case AFAIK. We had to do this trick to allow guest request
hypercalls coming from guest userspace:

https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-x86-hvm-Allow-guest_request-vm_events-coming-from-us.patch

By default they can only come from the guest kernel.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6] x86/monitor: Notify monitor if an emulation fails.

2017-07-18 Thread Razvan Cojocaru
On 07/18/2017 01:09 PM, Andrew Cooper wrote:
> On 18/07/17 10:37, Petre Pircalabu wrote:
>> If case of a vm_event with the emulate_flags set, if the instruction
>> cannot be emulated, the monitor should be notified instead of directly
>> injecting a hw exception.
>> This behavior can be used to re-execute an instruction not supported by
>> the emulator using the real processor (e.g. altp2m) instead of just
>> crashing.
>>
>> Signed-off-by: Petre Pircalabu 
> 
> There are many situations which end up failing an emulation with
> UNHANDLEABLE.
> 
> What exact scenario are you looking to catch?  Is it just instructions
> which aren't implemented in the emulator?

Instructions that are not implemented in the emulator are our main use
case for this, yes. In which case, we'd like a chance to be able to
single-step them (using altp2m), so that the guest will continue to run
even with an incomplete emulator.

We don't care about instructions that would have failed to run in both
scenarios (emulated or single-stepped). I'm not sure if there are other
cases in which an instruction, although supported by the emulator, would
fail emulation but pass single-stepping.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/monitor: Notify monitor if an emulation fails.

2017-07-12 Thread Razvan Cojocaru
On 07/12/2017 08:21 PM, Petre Pircalabu wrote:
> If case of a vm_event with the emulate_flags set, if the instruction
> cannot be emulated, the monitor should be notified instead of directly
> injecting a hw exception.
> This behavior can be used to re-execute an instruction not supported by
> the emulator using the real processor (e.g. altp2m) instead of just
> crashing.
> 
> Signed-off-by: Petre Pircalabu 
> 
> ---
> Changed since v1:
>   * Removed the emulation kind check when calling hvm_inject_hw_exception
> 
> Changed since v2:
>   * Removed a file added by mistake
> 
> Changed since v3:
>   * Removed extra stray line
>   * Added the _enabled suffix to the emul_unhandleable monitor option

I think for these mechanical-only changes you could have kept Wei's ack.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 10/18] x86/monitor.c: use plain bool

2017-06-30 Thread Razvan Cojocaru
On 06/30/2017 08:45 PM, Andrew Cooper wrote:
> On 30/06/17 18:01, Wei Liu wrote:
>> Signed-off-by: Wei Liu <wei.l...@citrix.com>
> 
> This file falls under introspection maintainership, so CC'ing them (not
> that this change in controversial).
> 
> Reviewed-by: Andrew Cooper <andrew.coop...@citrix.com>

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-27 Thread Razvan Cojocaru
On 06/27/2017 02:45 PM, Jan Beulich wrote:
>>>> Razvan Cojocaru <rcojoc...@bitdefender.com> 06/27/17 1:38 PM >>>
>> On 06/27/2017 02:26 PM, Jan Beulich wrote:
>>>>>> Razvan Cojocaru <rcojoc...@bitdefender.com> 06/27/17 10:32 AM >>>
>>>> On 06/27/2017 09:21 AM, Jan Beulich wrote:
>>>>>>>> Andrew Cooper <andrew.coop...@citrix.com> 06/26/17 5:11 PM >>>
>>>>>> There is also a difference in timing; vm_event_init_domain() is called
>>>>>> when vm_event is started on the domain, not when the domain is
>>>>>> constructed.  IMO, the two should happen at the same time to be
>>>>>> consistent.  I'm not fussed at which point, as it would be fine (in
>>>>>> principle) for d->vm_event to be NULL in most cases.
>>>>>
>>>>> I very much support that last aspect - there's shouldn't be any memory
>>>>> allocated here if the domain isn't going to make use of VM events.
>>>>
>>>> Unfortunately it's not as simple as that.
>>>>
>>>> While I didn't write that code, it would seem that on domain creation
>>>> that data is being allocated because it holds information about 3
>>>> different vm_event subsystems - monitor, paging, and memshare.
>>>
>>> But it can't be that difficult to make it work the suggested way?
>>
>> We can lazy-allocate that the first time any one of the three gets
>> initialized (monitor, paging, share), and update all the code that
>> checks d->vm_event->member to check that d->vm_event is not NULL before
>> that.
>>
>> Any objections?
> 
> None here.

That's actually much more involved than I thought: almost every vm_event
function assumes that for a created domain d->vm_event is not NULL, and
takes a struct vm_event_domain *ved parameter to differentiate between
monitor, mem paging and mem sharing. So while this technically is not a
hard thing to fix at all, it would touch a lot of ARM and x86 code and
be quite an overhaul of vm_event.

My immediate reaction was to add small helper functions such as:

 42 static struct vm_event_domain *vm_event_get_ved(
 43 struct domain *d,
 44 enum vm_event_domain_type domain_type)
 45 {
 46 if ( !d->vm_event )
 47 return NULL;
 48
 49 switch ( domain_type )
 50 {
 51 #ifdef CONFIG_HAS_MEM_PAGING
 52 case VM_EVENT_DOMAIN_TYPE_MEM_PAGING:
 53 return >vm_event->paging;
 54 #endif
 55 case VM_EVENT_DOMAIN_TYPE_MONITOR:
 56 return >vm_event->monitor;
 57 #ifdef CONFIG_HAS_MEM_SHARING
 58 case VM_EVENT_DOMAIN_TYPE_MEM_SHARING:
 59 return >vm_event->share;
 60 #endif
 61 default:
 62 return NULL;
 63 }
 64 }

and try to get all places to use that line of thinking, but there's
quite a lot of them.

Tamas, any thoughts on this before going deeper?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-27 Thread Razvan Cojocaru
On 06/27/2017 02:26 PM, Jan Beulich wrote:
>>>> Razvan Cojocaru <rcojoc...@bitdefender.com> 06/27/17 10:32 AM >>>
>> On 06/27/2017 09:21 AM, Jan Beulich wrote:
>>>>>> Andrew Cooper <andrew.coop...@citrix.com> 06/26/17 5:11 PM >>>
>>>> There is also a difference in timing; vm_event_init_domain() is called
>>>> when vm_event is started on the domain, not when the domain is
>>>> constructed.  IMO, the two should happen at the same time to be
>>>> consistent.  I'm not fussed at which point, as it would be fine (in
>>>> principle) for d->vm_event to be NULL in most cases.
>>>
>>> I very much support that last aspect - there's shouldn't be any memory
>>> allocated here if the domain isn't going to make use of VM events.
>>
>> Unfortunately it's not as simple as that.
>>
>> While I didn't write that code, it would seem that on domain creation
>> that data is being allocated because it holds information about 3
>> different vm_event subsystems - monitor, paging, and memshare.
> 
> But it can't be that difficult to make it work the suggested way?

We can lazy-allocate that the first time any one of the three gets
initialized (monitor, paging, share), and update all the code that
checks d->vm_event->member to check that d->vm_event is not NULL before
that.

Any objections?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [For 4.9] Updating https://wiki.xenproject.org/wiki/Xen_Project_Release_Features to reflect support status of new features

2017-06-27 Thread Razvan Cojocaru
Hello,

> - Security > Alternative 2pm : Supported – I think we should split this
> out – it is currently implicitly covered under "Virtual Machine
> Introspection"

I agree that altp2m deserves its own space. While we're interested in
it, our current solution makes no use of it, so it's certainly possible
to do introspection without any help from altp2m.

From my, albeit limited, experience, altp2m probably fits under "Tech
preview".

If we subtract altp2m from the general "Virtual Machine Introspection"
category, I'd say that the upstream support is between "Tech Preview"
and "Supported". I'll explain.

The interface is largely stable, but we may need to add optimizations
which might change it - so while we don't want this to be happening a
lot, it will happen.

There's also functional stability with the XenServer patches (which
bypass the emulator (single-step) when it can't emulate, and has a
version of the old smp_lock() emulator race-condition patch).
Unfortunately, upstream Xen still has race condition issues when
emulating LOCKed instructions in SMP scenarios - this will be addressed
ASAP with a proper CMPXCHG patch that currently depends on a patch from
Andrew and will require rework of a patch from Jan that adds a specific
emulation return code for CMPXCHG failures.

There's also an issue with #UD injections which is covered by the
emulator bypass patch in XenServer but can cause IPIs to get lost with
the upstream code - that will also require some more thinking.

So there are still (emulation-related) quirks upstream. We'd very much
like to get them ironed out.

I hope this answers the question.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-27 Thread Razvan Cojocaru
On 06/27/2017 09:21 AM, Jan Beulich wrote:
 Andrew Cooper  06/26/17 5:11 PM >>>
>> There is also a difference in timing; vm_event_init_domain() is called
>> when vm_event is started on the domain, not when the domain is
>> constructed.  IMO, the two should happen at the same time to be
>> consistent.  I'm not fussed at which point, as it would be fine (in
>> principle) for d->vm_event to be NULL in most cases.
> 
> I very much support that last aspect - there's shouldn't be any memory
> allocated here if the domain isn't going to make use of VM events.

Unfortunately it's not as simple as that.

While I didn't write that code, it would seem that on domain creation
that data is being allocated because it holds information about 3
different vm_event subsystems - monitor, paging, and memshare.

vm_event_init_domain() is then not being called when vm_event generally
is started on the domain, but when a specific vm_event part is being
initialized (monitor usually, but could be paging, etc.).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-26 Thread Razvan Cojocaru
On 06/26/2017 03:14 PM, Andrew Cooper wrote:
> Razvan: I'd reword this to not mention livepatching.  Simply having
> list_empty() working is a good enough reason alone.

Fair enough, I'll change the patch description as soon as we hear from
Tamas, so that I might address as many comments as possible.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-26 Thread Razvan Cojocaru
On 06/26/2017 02:39 PM, Konrad Rzeszutek Wilk wrote:
> On June 26, 2017 5:48:17 AM EDT, Razvan Cojocaru <rcojoc...@bitdefender.com> 
> wrote:
>> Pending livepatch code wants to check if the vm_event wait queues
>> are active, and this is made harder by the fact that they were
> 
> 
> Hmm, it wants to? Is there an missing patch that hasn't been posted for this?
> 
> If you mean to post this _before_ the other one please adjust the commit 
> description.

Yes, I think that patch hasn't been posted yet (Andrew is / was working
on it). I thought "pending" was appropriate in the description, but I
can change the text (after seeing if there are more comments on the code).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] common/vm_event: Initialize vm_event lists on domain creation

2017-06-26 Thread Razvan Cojocaru
Pending livepatch code wants to check if the vm_event wait queues
are active, and this is made harder by the fact that they were
previously only initialized some time after the domain was created,
in vm_event_enable(). This patch initializes the lists immediately
after xzalloc()ating the vm_event memory, in domain_create(), in
the newly added init_domain_vm_event() function.

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
---
 xen/common/domain.c|  5 ++---
 xen/common/vm_event.c  | 23 ---
 xen/include/xen/vm_event.h |  2 ++
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index b22aacc..89a8f1d 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -362,11 +362,10 @@ struct domain *domain_create(domid_t domid, unsigned int 
domcr_flags,
 
 poolid = 0;
 
-err = -ENOMEM;
-d->vm_event = xzalloc(struct vm_event_per_domain);
-if ( !d->vm_event )
+if ( (err = init_domain_vm_event(d)) != 0 )
 goto fail;
 
+err = -ENOMEM;
 d->pbuf = xzalloc_array(char, DOMAIN_PBUF_SIZE);
 if ( !d->pbuf )
 goto fail;
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 9291db6..294ddd7 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -39,6 +39,26 @@
 #define vm_event_ring_lock(_ved)   spin_lock(&(_ved)->ring_lock)
 #define vm_event_ring_unlock(_ved) spin_unlock(&(_ved)->ring_lock)
 
+int init_domain_vm_event(struct domain *d)
+{
+d->vm_event = xzalloc(struct vm_event_per_domain);
+
+if ( !d->vm_event )
+return -ENOMEM;
+
+#ifdef CONFIG_HAS_MEM_PAGING
+init_waitqueue_head(>vm_event->paging.wq);
+#endif
+
+init_waitqueue_head(>vm_event->monitor.wq);
+
+#ifdef CONFIG_HAS_MEM_SHARING
+init_waitqueue_head(>vm_event->share.wq);
+#endif
+
+return 0;
+}
+
 static int vm_event_enable(
 struct domain *d,
 xen_domctl_vm_event_op_t *vec,
@@ -93,9 +113,6 @@ static int vm_event_enable(
 /* Save the pause flag for this particular ring. */
 ved->pause_flag = pause_flag;
 
-/* Initialize the last-chance wait queue. */
-init_waitqueue_head(>wq);
-
 vm_event_ring_unlock(ved);
 return 0;
 
diff --git a/xen/include/xen/vm_event.h b/xen/include/xen/vm_event.h
index 2fb3951..482243e 100644
--- a/xen/include/xen/vm_event.h
+++ b/xen/include/xen/vm_event.h
@@ -80,6 +80,8 @@ void vm_event_set_registers(struct vcpu *v, 
vm_event_response_t *rsp);
 
 void vm_event_monitor_next_interrupt(struct vcpu *v);
 
+int init_domain_vm_event(struct domain *d);
+
 #endif /* __VM_EVENT_H__ */
 
 /*
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V3] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
Fixed an issue where the maximum index allowed (31) goes beyond the
actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
Coverity-ID: 1412966

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.coop...@citrix.com>

---
Changes since V2:
 - Removed stale comment.
 - Indentation.
 - Added Reviewed-by.
---
 xen/arch/x86/monitor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
index bedf13c..764195a 100644
--- a/xen/arch/x86/monitor.c
+++ b/xen/arch/x86/monitor.c
@@ -132,8 +132,8 @@ int arch_monitor_domctl_event(struct domain *d,
 unsigned int ctrlreg_bitmask;
 bool_t old_status;
 
-/* sanity check: avoid left-shift undefined behavior */
-if ( unlikely(mop->u.mov_to_cr.index > 31) )
+if ( unlikely(mop->u.mov_to_cr.index >=
+  ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) )
 return -EINVAL;
 
 if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) )
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
(Re-sent with CCs preserved).

On 06/21/2017 07:06 PM, Jan Beulich wrote:
 On 21.06.17 at 16:56,  wrote:
>> --- a/xen/arch/x86/monitor.c
>> +++ b/xen/arch/x86/monitor.c
>> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d,
>>  bool_t old_status;
>>  
>>  /* sanity check: avoid left-shift undefined behavior */
>> -if ( unlikely(mop->u.mov_to_cr.index > 31) )
>> +if ( unlikely(mop->u.mov_to_cr.index >=
>> + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) )
> 
> Indentation.

Right, that should have matched the end of the "unlikely(" above. I'll
modify it, remove the comment Wei commented on and submit V3.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
On 06/21/2017 07:06 PM, Jan Beulich wrote:
 On 21.06.17 at 16:56,  wrote:
>> --- a/xen/arch/x86/monitor.c
>> +++ b/xen/arch/x86/monitor.c
>> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d,
>>  bool_t old_status;
>>  
>>  /* sanity check: avoid left-shift undefined behavior */
>> -if ( unlikely(mop->u.mov_to_cr.index > 31) )
>> +if ( unlikely(mop->u.mov_to_cr.index >=
>> + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) )
> 
> Indentation.

Right, that should have matched the end of the "unlikely(" above. I'll
modify it, remove the comment Wei commented on and submit V3.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
On 06/21/2017 06:10 PM, Wei Liu wrote:
> On Wed, Jun 21, 2017 at 05:56:02PM +0300, Razvan Cojocaru wrote:
>> Fixed an issue where the maximum index allowed (31) goes beyond the
>> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
>> Coverity-ID: 1412966
>>
>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>
>> ---
>> Changes since V1:
>>  - Changed '3' to 'ARRAY_SIZE(...)'.
>> ---
>>  xen/arch/x86/monitor.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
>> index bedf13c..af68a79 100644
>> --- a/xen/arch/x86/monitor.c
>> +++ b/xen/arch/x86/monitor.c
>> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d,
>>  bool_t old_status;
>>  
>>  /* sanity check: avoid left-shift undefined behavior */
> 
> This comment should be deleted now.

It technically continues to be correct, but if you'd like I can send V3
- otherwise (and if it's not too much hassle) it can be deleted on
commit. I'm happy to accomodate either scenario.


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
Fixed an issue where the maximum index allowed (31) goes beyond the
actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
Coverity-ID: 1412966

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>

---
Changes since V1:
 - Changed '3' to 'ARRAY_SIZE(...)'.
---
 xen/arch/x86/monitor.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
index bedf13c..af68a79 100644
--- a/xen/arch/x86/monitor.c
+++ b/xen/arch/x86/monitor.c
@@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d,
 bool_t old_status;
 
 /* sanity check: avoid left-shift undefined behavior */
-if ( unlikely(mop->u.mov_to_cr.index > 31) )
+if ( unlikely(mop->u.mov_to_cr.index >=
+ ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) )
 return -EINVAL;
 
 if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) )
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
On 06/21/2017 05:33 PM, Andrew Cooper wrote:
> On 21/06/17 15:28, Razvan Cojocaru wrote:
>> Fixed an issue where the maximum index allowed (31) goes beyond the
>> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
>> Coverity-ID: 1412966
>>
>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>> ---
>>  xen/arch/x86/monitor.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
>> index bedf13c..4620b15 100644
>> --- a/xen/arch/x86/monitor.c
>> +++ b/xen/arch/x86/monitor.c
>> @@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d,
>>  bool_t old_status;
>>  
>>  /* sanity check: avoid left-shift undefined behavior */
>> -if ( unlikely(mop->u.mov_to_cr.index > 31) )
>> +if ( unlikely(mop->u.mov_to_cr.index > 3) )
> 
>> = ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)
> > ?

Yes, that'd be the right way to do it. :)

V2 coming up in a second.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)

2017-06-21 Thread Razvan Cojocaru
Fixed an issue where the maximum index allowed (31) goes beyond the
actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
Coverity-ID: 1412966

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
---
 xen/arch/x86/monitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
index bedf13c..4620b15 100644
--- a/xen/arch/x86/monitor.c
+++ b/xen/arch/x86/monitor.c
@@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d,
 bool_t old_status;
 
 /* sanity check: avoid left-shift undefined behavior */
-if ( unlikely(mop->u.mov_to_cr.index > 31) )
+if ( unlikely(mop->u.mov_to_cr.index > 3) )
 return -EINVAL;
 
 if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) )
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/2] x86/monitor: add masking support for write_ctrlreg events

2017-06-21 Thread Razvan Cojocaru
On 06/21/2017 04:58 PM, Wei Liu wrote:
> On Mon, Jun 19, 2017 at 03:24:38PM +0300, Petre Pircalabu wrote:
>> Add support for filtering out the write_ctrlreg monitor events if they
>> are generated only by changing certains bits.
>> A new parameter (bitmask) was added to the xc_monitor_write_ctrlreg
>> function in order to mask the event generation if the changed bits are
>> set.
>>
>> Signed-off-by: Petre Pircalabu 
>> Acked-by: Tamas K Lengyel 
> 
> Coverity isn't happy with this patch.
> 
> It seems to me there is indeed a risk to overrun the buffer (4 in size) 
> because
> the caller can specify index up to 31.
> 
> ** CID 1412966:  Memory - corruptions  (OVERRUN)  
>   
>   
> /xen/arch/x86/monitor.c: 162 in arch_monitor_domctl_event()   
>   
>   
>   
>   
>   
>   
>   
>   
> 
>   
> *** CID 1412966:  Memory - corruptions  (OVERRUN) 
>   
>   
> /xen/arch/x86/monitor.c: 162 in arch_monitor_domctl_event()   
>   
>   
> 156 ad->monitor.write_ctrlreg_onchangeonly |= 
> ctrlreg_bitmask;  
>   
> 157 else  
>   
>   
> 158 ad->monitor.write_ctrlreg_onchangeonly &= 
> ~ctrlreg_bitmask; 
>   
> 159   
>   
>   
> 160 if ( requested_status )   
>   
>   
> 161 { 
>   
>   
 CID 1412966:  Memory - corruptions  (OVERRUN)  

 
 Overrunning array "ad->monitor.write_ctrlreg_mask" of 4 8-byte 
 elements at element index 31 (byte offset 248) using index 
 "mop->u.mov_to_cr.index"
> (which evaluates to 31).  
>   
>   
> 162 ad->monitor.write_ctrlreg_mask[mop->u.mov_to_cr.index] = 
> mop->u.mov_to_cr.bitmask; 
>
> 163 ad->monitor.write_ctrlreg_enabled |= ctrlreg_bitmask; 
>   
>   
> 164 } 
>   
>   
> 165 else  
>   
>   
> 166 { 
>   
>   
> 167 ad->monitor.write_ctrlreg_mask[mop->u.mov_to_cr.index] = 
> 0;  

I vaguely remember that 31 was introduced simply as a "reserved"
precaution - we can probably safely please Coverity by simply patching
that code to not go over 3 as an index.

To Petre's credit, he did notice and propose that we change this value
but I've suggested that we keep the check as-is for the future. My bad. :)


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/2] xen-access: write_ctrlreg_c4 test

2017-06-16 Thread Razvan Cojocaru
On 06/16/2017 10:20 PM, Petre Pircalabu wrote:
> Add test for write_ctrlreg event handling.
> 
> Signed-off-by: Petre Pircalabu 
> ---
>  tools/tests/xen-access/xen-access.c | 53 
> -
>  1 file changed, 52 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/tests/xen-access/xen-access.c 
> b/tools/tests/xen-access/xen-access.c
> index 238011e..bbf5047 100644
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -57,6 +57,13 @@
>  #define X86_TRAP_DEBUG  1
>  #define X86_TRAP_INT3   3
>  
> +/* From xen/include/asm-x86/x86-defns.h */
> +#define X86_CR4_PGE0x0080 /* enable global pages */
> +
> +#ifndef ARRAY_SIZE
> +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
> +#endif
> +
>  typedef struct vm_event {
>  domid_t domain_id;
>  xenevtchn_handle *xce_handle;
> @@ -314,6 +321,24 @@ static void get_request(vm_event_t *vm_event, 
> vm_event_request_t *req)
>  }
>  
>  /*
> + * X86 control register names
> + */
> +static const char* get_x86_ctrl_reg_name(uint32_t index)
> +{
> +static const char* names[] = {
> +[VM_EVENT_X86_CR0]  = "CR0",
> +[VM_EVENT_X86_CR3]  = "CR3",
> +[VM_EVENT_X86_CR4]  = "CR4",
> +[VM_EVENT_X86_XCR0] = "XCR0",
> +};
> +
> +if ( index > ARRAY_SIZE(names) || names[index] == NULL )

I think this probably wants to be index >= ARRAY_SIZE(names).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-29 Thread Razvan Cojocaru
On 05/26/17 18:38, Jan Beulich wrote:
 On 26.05.17 at 16:37,  wrote:
>> On 05/26/17 17:29, Jan Beulich wrote:
>> On 25.05.17 at 11:40,  wrote:
 I've noticed that, with pages marked NX and vm_event emulation, we can
 end up emulating an ud2, for which hvm_emulate_one() returns
 X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
>>>
>>> Could you explain what would lead to emulation of UD2?
>>
>> If you mean in which cases does our engine mark pages NX, I'll have to
>> ask and get back to you. If you mean why generally would an UD2 end up
>> being the instruction where RIP causes an execute violation fault, I'll
>> have to check.
> 
> The question was more for the latter, as I don't understand what
> good could come from executing UD2 intentionally, unless the
> entity doing so knows there is an emulator around to do something
> sensible with it.

I owe you an answer here: I've spoken to my introspection engine
colleague Andrei, and they purposefully put an UD2 there to terminate a
malicious process (i.e. the exception is wanted).

I've found this problem while stress-testing Xen 4.9 verifying another
patch, using our in-house user-mode test applications, which simulate
this sort of malicious behaviour.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-29 Thread Razvan Cojocaru
On 05/29/17 15:11, Andrew Cooper wrote:
> On 29/05/2017 12:46, Razvan Cojocaru wrote:
>> On 05/29/17 14:05, Jan Beulich wrote:
>>>>>> On 29.05.17 at 11:20, <rcojoc...@bitdefender.com> wrote:
>>>> On 05/26/17 18:11, Andrew Cooper wrote:
>>>>> On 26/05/17 15:29, Jan Beulich wrote:
>>>>>>>>> On 25.05.17 at 11:40, <rcojoc...@bitdefender.com> wrote:
>>>>>>> I've noticed that, with pages marked NX and vm_event emulation, we can
>>>>>>> end up emulating an ud2, for which hvm_emulate_one() returns
>>>>>>> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
>>>>>> Could you explain what would lead to emulation of UD2?
>>>>>>
>>>>>>> This, in turn, causes a hvm_inject_event() call in the context of
>>>>>>> hvm_do_resume(), which can, if there's already a pending event there,
>>>>>>> cause a 101 BSOD (timer-related, if I understand correctly) or loss of
>>>>>>> input (mouse frozen, keyboard unresponsive).
>>>>>>>
>>>>>>> After much trial and error, I've been able to confirm this by leaving a
>>>>>>> guest on for almost a full day with this change:
>>>>>>>
>>>>>>>  case X86EMUL_EXCEPTION:
>>>>>>> -hvm_inject_event();
>>>>>>> +if ( !hvm_event_pending(current) )
>>>>>>> +hvm_inject_event();
>>>>>>>
>>>>>>> and checking that there's been no BSOD or loss of input.
>>>>>>>
>>>>>>> However, just losing the event here, while fine to prove that this is
>>>>>>> indeed the problem, is not OK. But I'm not sure what an elegant / robust
>>>>>>> way of fixing this is.
>>>>>> Much depends on what the other event is: If it's an interrupt, I'd
>>>>>> assume there to be an ordering problem (interrupts shouldn't be
>>>>>> injected when there is a pending exception, their delivery instead
>>>>>> should be attempted on the first instruction of the exception
>>>>>> handler [if interrupts remain on] or whenever interrupts get
>>>>>> re-enabled).
>>>>> I suspect it is an ordering issue, and something has processed and
>>>>> interrupt before the emulation occurs as part of the vm_event reply 
>>>>> happens.
>>>>>
>>>>> The interrupt ordering spec indicates that external interrupts take
>>>>> precedent over faults raised from executing an instruction, on the basis
>>>>> that once the interrupt handler returns, the instruction will generate
>>>>> the same fault again.  However, its not obvious how this is intended to
>>>>> interact with interrupt windows and vmexits.  I expect we can get away
>>>>> with ensuring that external interrupts are the final thing considered
>>>>> for injection on the return-to-guest path.
>>>>>
>>>>> It might be an idea to leave an assert in vmx_inject_event() that an
>>>>> event is not already pending, but in the short term, this probably also
>>>>> wants debugging by trying to identify what sequence of actions is
>>>>> leading us to inject two events in this case (if indeed this is what is
>>>>> happening).
>>>> With some patience, I've been able to catch the problem: "(XEN)
>>>> vmx_inject_event(3, 14) but 0, 225 pending".
>>>>
>>>>  63 /*
>>>>  64  * x86 event types. This enumeration is valid for:
>>>>  65  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
>>>>  66  *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
>>>>  67  */
>>>>  68 enum x86_event_type {
>>>>  69 X86_EVENTTYPE_EXT_INTR, /* External interrupt */
>>>>  70 X86_EVENTTYPE_NMI = 2,  /* NMI */
>>>>  71 X86_EVENTTYPE_HW_EXCEPTION, /* Hardware exception */
>>>>  72 X86_EVENTTYPE_SW_INTERRUPT, /* Software interrupt (CD nn) */
>>>>  73 X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>>>>  74 X86_EVENTTYPE_SW_EXCEPTION, /* INT3 (CC), INTO (CE) */
>>>>  75 };
>>>>
>>>> so an X86_EVENTTYPE_EXT_INTR is pending when we're trying to inject an
>>>> X86_EVENTTYPE_HW_EXCEPTION, as a result of the 

Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-29 Thread Razvan Cojocaru
On 05/29/17 14:05, Jan Beulich wrote:
 On 29.05.17 at 11:20,  wrote:
>> On 05/26/17 18:11, Andrew Cooper wrote:
>>> On 26/05/17 15:29, Jan Beulich wrote:
>>> On 25.05.17 at 11:40,  wrote:
> I've noticed that, with pages marked NX and vm_event emulation, we can
> end up emulating an ud2, for which hvm_emulate_one() returns
> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
 Could you explain what would lead to emulation of UD2?

> This, in turn, causes a hvm_inject_event() call in the context of
> hvm_do_resume(), which can, if there's already a pending event there,
> cause a 101 BSOD (timer-related, if I understand correctly) or loss of
> input (mouse frozen, keyboard unresponsive).
>
> After much trial and error, I've been able to confirm this by leaving a
> guest on for almost a full day with this change:
>
>  case X86EMUL_EXCEPTION:
> -hvm_inject_event();
> +if ( !hvm_event_pending(current) )
> +hvm_inject_event();
>
> and checking that there's been no BSOD or loss of input.
>
> However, just losing the event here, while fine to prove that this is
> indeed the problem, is not OK. But I'm not sure what an elegant / robust
> way of fixing this is.
 Much depends on what the other event is: If it's an interrupt, I'd
 assume there to be an ordering problem (interrupts shouldn't be
 injected when there is a pending exception, their delivery instead
 should be attempted on the first instruction of the exception
 handler [if interrupts remain on] or whenever interrupts get
 re-enabled).
>>>
>>> I suspect it is an ordering issue, and something has processed and
>>> interrupt before the emulation occurs as part of the vm_event reply happens.
>>>
>>> The interrupt ordering spec indicates that external interrupts take
>>> precedent over faults raised from executing an instruction, on the basis
>>> that once the interrupt handler returns, the instruction will generate
>>> the same fault again.  However, its not obvious how this is intended to
>>> interact with interrupt windows and vmexits.  I expect we can get away
>>> with ensuring that external interrupts are the final thing considered
>>> for injection on the return-to-guest path.
>>>
>>> It might be an idea to leave an assert in vmx_inject_event() that an
>>> event is not already pending, but in the short term, this probably also
>>> wants debugging by trying to identify what sequence of actions is
>>> leading us to inject two events in this case (if indeed this is what is
>>> happening).
>>
>> With some patience, I've been able to catch the problem: "(XEN)
>> vmx_inject_event(3, 14) but 0, 225 pending".
>>
>>  63 /*
>>  64  * x86 event types. This enumeration is valid for:
>>  65  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
>>  66  *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
>>  67  */
>>  68 enum x86_event_type {
>>  69 X86_EVENTTYPE_EXT_INTR, /* External interrupt */
>>  70 X86_EVENTTYPE_NMI = 2,  /* NMI */
>>  71 X86_EVENTTYPE_HW_EXCEPTION, /* Hardware exception */
>>  72 X86_EVENTTYPE_SW_INTERRUPT, /* Software interrupt (CD nn) */
>>  73 X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>>  74 X86_EVENTTYPE_SW_EXCEPTION, /* INT3 (CC), INTO (CE) */
>>  75 };
>>
>> so an X86_EVENTTYPE_EXT_INTR is pending when we're trying to inject an
>> X86_EVENTTYPE_HW_EXCEPTION, as a result of the code quoted above.
> 
> So this confirms our suspicion, but doesn't move us closer to a
> solution. The question after all is why an external interrupt is
> being delivered prior to or while emulating. As Andrew did
> explain, proper behavior would be to check for external
> interrupts and don't enter emulation if one is pending, or don't
> check for external interrupts until the _next_ instruction
> boundary. Correct architectural behavior will result either way;
> the second variant merely must not continuously defer interrupts
> (i.e. there need to be instruction boundaries at which hardware
> of software do check for them).
> 
> I'm not that familiar with the sequence of steps when dealing
> with emulation requests from an introspection agent, so I would
> hope you could go through those code paths to see where
> external interrupts are being checked for. Or wait - isn't your
> problem that you invoke emulation out of hvm_do_resume() (via
> hvm_vm_event_do_resume() -> hvm_emulate_one_vm_event()),
> which happens after all other "normal" processing of a VM exit?
> Perhaps emulation should be skipped there if an event is already
> pending injection, as emulation not having started means we still
> are on an instruction boundary?

Indeed, we are emulating after all the vmexit processing has already
happened, in the hvm_do_resume() path you've mentioned.

Not emulating if an event 

Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-29 Thread Razvan Cojocaru
On 05/26/17 18:11, Andrew Cooper wrote:
> On 26/05/17 15:29, Jan Beulich wrote:
> On 25.05.17 at 11:40,  wrote:
>>> I've noticed that, with pages marked NX and vm_event emulation, we can
>>> end up emulating an ud2, for which hvm_emulate_one() returns
>>> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
>> Could you explain what would lead to emulation of UD2?
>>
>>> This, in turn, causes a hvm_inject_event() call in the context of
>>> hvm_do_resume(), which can, if there's already a pending event there,
>>> cause a 101 BSOD (timer-related, if I understand correctly) or loss of
>>> input (mouse frozen, keyboard unresponsive).
>>>
>>> After much trial and error, I've been able to confirm this by leaving a
>>> guest on for almost a full day with this change:
>>>
>>>  case X86EMUL_EXCEPTION:
>>> -hvm_inject_event();
>>> +if ( !hvm_event_pending(current) )
>>> +hvm_inject_event();
>>>
>>> and checking that there's been no BSOD or loss of input.
>>>
>>> However, just losing the event here, while fine to prove that this is
>>> indeed the problem, is not OK. But I'm not sure what an elegant / robust
>>> way of fixing this is.
>> Much depends on what the other event is: If it's an interrupt, I'd
>> assume there to be an ordering problem (interrupts shouldn't be
>> injected when there is a pending exception, their delivery instead
>> should be attempted on the first instruction of the exception
>> handler [if interrupts remain on] or whenever interrupts get
>> re-enabled).
> 
> I suspect it is an ordering issue, and something has processed and
> interrupt before the emulation occurs as part of the vm_event reply happens.
> 
> The interrupt ordering spec indicates that external interrupts take
> precedent over faults raised from executing an instruction, on the basis
> that once the interrupt handler returns, the instruction will generate
> the same fault again.  However, its not obvious how this is intended to
> interact with interrupt windows and vmexits.  I expect we can get away
> with ensuring that external interrupts are the final thing considered
> for injection on the return-to-guest path.
> 
> It might be an idea to leave an assert in vmx_inject_event() that an
> event is not already pending, but in the short term, this probably also
> wants debugging by trying to identify what sequence of actions is
> leading us to inject two events in this case (if indeed this is what is
> happening).

With some patience, I've been able to catch the problem: "(XEN)
vmx_inject_event(3, 14) but 0, 225 pending".

 63 /*
 64  * x86 event types. This enumeration is valid for:
 65  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
 66  *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
 67  */
 68 enum x86_event_type {
 69 X86_EVENTTYPE_EXT_INTR, /* External interrupt */
 70 X86_EVENTTYPE_NMI = 2,  /* NMI */
 71 X86_EVENTTYPE_HW_EXCEPTION, /* Hardware exception */
 72 X86_EVENTTYPE_SW_INTERRUPT, /* Software interrupt (CD nn) */
 73 X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
 74 X86_EVENTTYPE_SW_EXCEPTION, /* INT3 (CC), INTO (CE) */
 75 };

so an X86_EVENTTYPE_EXT_INTR is pending when we're trying to inject an
X86_EVENTTYPE_HW_EXCEPTION, as a result of the code quoted above.

This is the patch that has produced the above output:

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c8ef18a..3b88eca 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1807,6 +1807,12 @@ void vmx_inject_nmi(void)
X86_EVENT_NO_EC);
 }

+static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
+{
+info->cr2 = v->arch.hvm_vcpu.guest_cr[2];
+return hvm_funcs.get_pending_event(v, info);
+}
+
 /*
  * Generate a virtual event in the guest.
  * NOTES:
@@ -1821,6 +1827,15 @@ static void vmx_inject_event(const struct
x86_event *event)
 struct vcpu *curr = current;
 struct x86_event _event = *event;

+if ( hvm_event_pending(current) )
+{
+   struct x86_event ev;
+   hvm_get_pending_event(current, );
+
+   printk("vmx_inject_event(%d, %d) but %d, %d pending\n",
+  _event.type, _event.vector, ev.type, ev.vector);
+}
+
 switch ( _event.vector | -(_event.type == X86_EVENTTYPE_SW_INTERRUPT) )
 {
 case TRAP_debug:


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-26 Thread Razvan Cojocaru
On 05/26/17 17:29, Jan Beulich wrote:
 On 25.05.17 at 11:40,  wrote:
>> I've noticed that, with pages marked NX and vm_event emulation, we can
>> end up emulating an ud2, for which hvm_emulate_one() returns
>> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
> 
> Could you explain what would lead to emulation of UD2?

If you mean in which cases does our engine mark pages NX, I'll have to
ask and get back to you. If you mean why generally would an UD2 end up
being the instruction where RIP causes an execute violation fault, I'll
have to check.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

2017-05-25 Thread Razvan Cojocaru
Hello,

I've noticed that, with pages marked NX and vm_event emulation, we can
end up emulating an ud2, for which hvm_emulate_one() returns
X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().

This, in turn, causes a hvm_inject_event() call in the context of
hvm_do_resume(), which can, if there's already a pending event there,
cause a 101 BSOD (timer-related, if I understand correctly) or loss of
input (mouse frozen, keyboard unresponsive).

After much trial and error, I've been able to confirm this by leaving a
guest on for almost a full day with this change:

 case X86EMUL_EXCEPTION:
-hvm_inject_event();
+if ( !hvm_event_pending(current) )
+hvm_inject_event();

and checking that there's been no BSOD or loss of input.

However, just losing the event here, while fine to prove that this is
indeed the problem, is not OK. But I'm not sure what an elegant / robust
way of fixing this is.

Suggestions are (as always) greatly appreciated.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-08 Thread Razvan Cojocaru
On 05/08/17 15:44, Jan Beulich wrote:
 On 04.05.17 at 11:00,  wrote:
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -404,6 +404,7 @@ S:   Supported
>>  F:  tools/tests/xen-access
>>  F:  xen/arch/*/monitor.c
>>  F:  xen/arch/*/vm_event.c
>> +F:  xen/arch/*/hvm/vm_event.c
>>  F:  xen/arch/arm/mem_access.c
>>  F:  xen/arch/x86/mm/mem_access.c
>>  F:  xen/arch/x86/hvm/monitor.c
>> @@ -413,6 +414,7 @@ F:   xen/common/vm_event.c
>>  F:  xen/include/*/mem_access.h
>>  F:  xen/include/*/monitor.h
>>  F:  xen/include/*/vm_event.h
>> +F:  xen/include/*/hvm/vm_event.h
>>  F:  xen/include/asm-x86/hvm/monitor.h
> 
> Btw., I've noticed only now that these additions would better be
> x86-specific, just like their hvm/monitor.[ch] counterparts. I intend
> to adjust this while committing.

Fair enough.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-04 Thread Razvan Cojocaru
On 05/04/17 17:57, Tamas K Lengyel wrote:
> On Thu, May 4, 2017 at 5:22 AM, Jan Beulich <jbeul...@suse.com> wrote:
>>>>> On 04.05.17 at 11:14, <rcojoc...@bitdefender.com> wrote:
>>> On 05/04/17 12:11, Jan Beulich wrote:
>>>>>>> On 04.05.17 at 11:00, <rcojoc...@bitdefender.com> wrote:
>>>>> Created arch/x86/hvm/vm_event.c and include/asm-x86/hvm/vm_event.h,
>>>>> where HVM-specific vm_event-related code will live. This cleans up
>>>>> hvm_do_resume() and ensures that the vm_event maintainers are
>>>>> responsible for changes to that code.
>>>>>
>>>>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>>>> Acked-by: Tamas K Lengyel <ta...@tklengyel.com>
>>>>
>>>> Acked-by: Jan Beulich <jbeul...@suse.com>
>>>> albeit I wonder ...
>>>>
>>>>> +void hvm_vm_event_do_resume(struct vcpu *v)
>>>>> +{
>>>>> +struct monitor_write_data *w;
>>>>> +
>>>>> +if ( likely(!v->arch.vm_event) )
>>>>> +return;
>>>>
>>>> ... whether this now wouldn't better be an ASSERT().
>>>
>>> I have no objections (can this be done on commit or should I re-send V4?).
>>
>> Let's first see what Tamas thinks. If he agrees, I see not problem
>> doing the adjustment while committing.
> 
> I'm not quite sure how converting that to an ASSERT would work. It
> looks fine to me as is tbh.

I think Jan means that, since currently the only caller is
hvm_do_resume() where there's already that check now (to avoid the
call), we could here simply replace the if() with
ASSERT(v->arch.vm_event). I could be wrong. :)


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-04 Thread Razvan Cojocaru
On 05/04/17 12:11, Jan Beulich wrote:
>>>> On 04.05.17 at 11:00, <rcojoc...@bitdefender.com> wrote:
>> Created arch/x86/hvm/vm_event.c and include/asm-x86/hvm/vm_event.h,
>> where HVM-specific vm_event-related code will live. This cleans up
>> hvm_do_resume() and ensures that the vm_event maintainers are
>> responsible for changes to that code.
>>
>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>> Acked-by: Tamas K Lengyel <ta...@tklengyel.com>
> 
> Acked-by: Jan Beulich <jbeul...@suse.com>
> albeit I wonder ...
> 
>> +void hvm_vm_event_do_resume(struct vcpu *v)
>> +{
>> +struct monitor_write_data *w;
>> +
>> +if ( likely(!v->arch.vm_event) )
>> +return;
> 
> ... whether this now wouldn't better be an ASSERT().

I have no objections (can this be done on commit or should I re-send V4?).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V3 0/2] Fix vm_event resume path race condition

2017-05-04 Thread Razvan Cojocaru
This small series first creates hvm/vm_event.{h,c}, in order to bring
under vm_event maintainership the code that has previously lived in
hvm_do_resume(), and then fixes a __context_switch()-related race
condition when attempting to set registers via vm_event reply.

[PATCH V3 1/2] x86/vm_event: added hvm/vm_event.{h,c}
[PATCH V3 2/2] x86/vm_event: fix race between __context_switch() and


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V3 1/2] x86/vm_event: added hvm/vm_event.{h,c}

2017-05-04 Thread Razvan Cojocaru
Created arch/x86/hvm/vm_event.c and include/asm-x86/hvm/vm_event.h,
where HVM-specific vm_event-related code will live. This cleans up
hvm_do_resume() and ensures that the vm_event maintainers are
responsible for changes to that code.

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Acked-by: Tamas K Lengyel <ta...@tklengyel.com>

---
Changes since V2:
 - Added if ( unlikely(v->arch.vm_event) ) before the hvm_do_resume()
   call.
 - Added unlikely()s in hvm_vm_event_do_resume() for the register
   write cases.
 - Moved the copyright lines over from hvm.c.
---
 MAINTAINERS|   2 +
 xen/arch/x86/hvm/Makefile  |   1 +
 xen/arch/x86/hvm/hvm.c |  63 +--
 xen/arch/x86/hvm/vm_event.c| 103 +
 xen/include/asm-x86/hvm/vm_event.h |  34 
 5 files changed, 142 insertions(+), 61 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vm_event.c
 create mode 100644 xen/include/asm-x86/hvm/vm_event.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c345a50..10863bc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -404,6 +404,7 @@ S:  Supported
 F: tools/tests/xen-access
 F: xen/arch/*/monitor.c
 F: xen/arch/*/vm_event.c
+F: xen/arch/*/hvm/vm_event.c
 F: xen/arch/arm/mem_access.c
 F: xen/arch/x86/mm/mem_access.c
 F: xen/arch/x86/hvm/monitor.c
@@ -413,6 +414,7 @@ F:  xen/common/vm_event.c
 F: xen/include/*/mem_access.h
 F: xen/include/*/monitor.h
 F: xen/include/*/vm_event.h
+F: xen/include/*/hvm/vm_event.h
 F: xen/include/asm-x86/hvm/monitor.h
 
 VTPM
diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 0a3d0f4..02f0add 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -24,6 +24,7 @@ obj-y += stdvga.o
 obj-y += vioapic.o
 obj-y += viridian.o
 obj-y += vlapic.o
+obj-y += vm_event.o
 obj-y += vmsi.o
 obj-y += vpic.o
 obj-y += vpt.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a441955..81691e2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -34,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -61,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -484,66 +484,7 @@ void hvm_do_resume(struct vcpu *v)
 return;
 
 if ( unlikely(v->arch.vm_event) )
-{
-struct monitor_write_data *w = >arch.vm_event->write_data;
-
-if ( unlikely(v->arch.vm_event->emulate_flags) )
-{
-enum emul_kind kind = EMUL_KIND_NORMAL;
-
-/*
- * Please observ the order here to match the flag descriptions
- * provided in public/vm_event.h
- */
-if ( v->arch.vm_event->emulate_flags &
- VM_EVENT_FLAG_SET_EMUL_READ_DATA )
-kind = EMUL_KIND_SET_CONTEXT_DATA;
-else if ( v->arch.vm_event->emulate_flags &
-  VM_EVENT_FLAG_EMULATE_NOWRITE )
-kind = EMUL_KIND_NOWRITE;
-else if ( v->arch.vm_event->emulate_flags &
-  VM_EVENT_FLAG_SET_EMUL_INSN_DATA )
-kind = EMUL_KIND_SET_CONTEXT_INSN;
-
-hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
- X86_EVENT_NO_EC);
-
-v->arch.vm_event->emulate_flags = 0;
-}
-
-if ( w->do_write.msr )
-{
-if ( hvm_msr_write_intercept(w->msr, w->value, 0) ==
- X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.msr = 0;
-}
-
-if ( w->do_write.cr0 )
-{
-if ( hvm_set_cr0(w->cr0, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr0 = 0;
-}
-
-if ( w->do_write.cr4 )
-{
-if ( hvm_set_cr4(w->cr4, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr4 = 0;
-}
-
-if ( w->do_write.cr3 )
-{
-if ( hvm_set_cr3(w->cr3, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr3 = 0;
-}
-}
+hvm_vm_event_do_resume(v);
 
 /* Inject pending hw/sw event */
 if ( v->arch.hvm_vcpu.inject_event.vector >= 0 )
diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
new file mode 100644
index 000..1934556
--- /dev/null
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -0,0 +1,103 @@
+/*
+ * arch/x86/hvm/vm_event.c
+ *
+ * HVM vm_event handling routines
+ *
+ * Copyright (c) 2004, Intel Corporation.
+ * Copyright (c) 2005, International Business Machines Corporation.
+ * Copyright (c) 2008, Citrix Systems, Inc.
+ *
+ *

[Xen-devel] [PATCH V3 2/2] x86/vm_event: fix race between __context_switch() and vm_event_resume()

2017-05-04 Thread Razvan Cojocaru
The introspection agent can reply to a vm_event faster than
vmx_vmexit_handler() can complete in some cases, where it is then
not safe for vm_event_set_registers() to modify v->arch.user_regs.
In the test scenario, we were stepping over an INT3 breakpoint by
setting RIP += 1. The quick reply tended to complete before the VCPU
triggering the introspection event had properly paused and been
descheduled. If the reply occurs before __context_switch() happens,
__context_switch() clobbers the reply by overwriting
v->arch.user_regs from the stack. If we don't pass through
__context_switch() (due to switching to the idle vCPU), reply data
wouldn't be picked up when switching back straight to the original
vCPU.

This patch ensures that vm_event_resume() code only sets per-VCPU
data to be used for the actual setting of registers later in
hvm_do_resume() (similar to the model used to control setting of CRs
and MSRs).

The patch additionally removes the sync_vcpu_execstate(v) call from
vm_event_resume(), which is no longer necessary, which removes the
associated broadcast TLB flush (read: performance improvement).

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
Acked-by: Tamas K Lengyel <ta...@tklengyel.com>

---
Changes since V2:
 - Corrected commit message.
---
 xen/arch/x86/hvm/vm_event.c| 35 +++
 xen/arch/x86/vm_event.c| 22 ++
 xen/common/vm_event.c  | 17 ++---
 xen/include/asm-x86/vm_event.h |  2 ++
 4 files changed, 49 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
index 1934556..8eeb210 100644
--- a/xen/arch/x86/hvm/vm_event.c
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -25,6 +25,39 @@
 #include 
 #include 
 
+static void hvm_vm_event_set_registers(const struct vcpu *v)
+{
+ASSERT(v == current);
+
+if ( unlikely(v->arch.vm_event->set_gprs) )
+{
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+regs->rax = v->arch.vm_event->gprs.rax;
+regs->rbx = v->arch.vm_event->gprs.rbx;
+regs->rcx = v->arch.vm_event->gprs.rcx;
+regs->rdx = v->arch.vm_event->gprs.rdx;
+regs->rsp = v->arch.vm_event->gprs.rsp;
+regs->rbp = v->arch.vm_event->gprs.rbp;
+regs->rsi = v->arch.vm_event->gprs.rsi;
+regs->rdi = v->arch.vm_event->gprs.rdi;
+
+regs->r8 = v->arch.vm_event->gprs.r8;
+regs->r9 = v->arch.vm_event->gprs.r9;
+regs->r10 = v->arch.vm_event->gprs.r10;
+regs->r11 = v->arch.vm_event->gprs.r11;
+regs->r12 = v->arch.vm_event->gprs.r12;
+regs->r13 = v->arch.vm_event->gprs.r13;
+regs->r14 = v->arch.vm_event->gprs.r14;
+regs->r15 = v->arch.vm_event->gprs.r15;
+
+regs->rflags = v->arch.vm_event->gprs.rflags;
+regs->rip = v->arch.vm_event->gprs.rip;
+
+v->arch.vm_event->set_gprs = false;
+}
+}
+
 void hvm_vm_event_do_resume(struct vcpu *v)
 {
 struct monitor_write_data *w;
@@ -32,6 +65,8 @@ void hvm_vm_event_do_resume(struct vcpu *v)
 if ( likely(!v->arch.vm_event) )
 return;
 
+hvm_vm_event_set_registers(v);
+
 w = >arch.vm_event->write_data;
 
 if ( unlikely(v->arch.vm_event->emulate_flags) )
diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 502ccc2..a6ea42c 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -113,26 +113,8 @@ void vm_event_set_registers(struct vcpu *v, 
vm_event_response_t *rsp)
 {
 ASSERT(atomic_read(>vm_event_pause_count));
 
-v->arch.user_regs.rax = rsp->data.regs.x86.rax;
-v->arch.user_regs.rbx = rsp->data.regs.x86.rbx;
-v->arch.user_regs.rcx = rsp->data.regs.x86.rcx;
-v->arch.user_regs.rdx = rsp->data.regs.x86.rdx;
-v->arch.user_regs.rsp = rsp->data.regs.x86.rsp;
-v->arch.user_regs.rbp = rsp->data.regs.x86.rbp;
-v->arch.user_regs.rsi = rsp->data.regs.x86.rsi;
-v->arch.user_regs.rdi = rsp->data.regs.x86.rdi;
-
-v->arch.user_regs.r8 = rsp->data.regs.x86.r8;
-v->arch.user_regs.r9 = rsp->data.regs.x86.r9;
-v->arch.user_regs.r10 = rsp->data.regs.x86.r10;
-v->arch.user_regs.r11 = rsp->data.regs.x86.r11;
-v->arch.user_regs.r12 = rsp->data.regs.x86.r12;
-v->arch.user_regs.r13 = rsp->data.regs.x86.r13;
-v->arch.user_regs.r14 = rsp->data.regs.x86.r14;
-v->arch.user_regs.r15 = rsp->data.regs.x86.r15;
-
-v->arch.user_regs.rflags = rsp->data.regs.x86.rflags;
-v->arch.user_regs.rip = rsp->data.regs.x86.rip;
+v->arch.vm_event->g

Re: [Xen-devel] [PATCH V2 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-03 Thread Razvan Cojocaru
On 05/03/2017 11:32 PM, Tamas K Lengyel wrote:
> On Wed, May 3, 2017 at 4:16 PM, Razvan Cojocaru
> <rcojoc...@bitdefender.com> wrote:
>> On 05/03/2017 11:05 PM, Tamas K Lengyel wrote:
>>>
>>>
>>> On Wed, May 3, 2017 at 6:48 AM, Jan Beulich <jbeul...@suse.com
>>> <mailto:jbeul...@suse.com>> wrote:
>>>
>>> >>> On 03.05.17 at 12:37, <rcojoc...@bitdefender.com 
>>> <mailto:rcojoc...@bitdefender.com>> wrote:
>>> > On 05/03/17 12:51, Jan Beulich wrote:
>>> >>>>> On 03.05.17 at 11:10, <rcojoc...@bitdefender.com 
>>> <mailto:rcojoc...@bitdefender.com>> wrote:
>>> >>> --- /dev/null
>>> >>> +++ b/xen/arch/x86/hvm/vm_event.c
>>> >>> @@ -0,0 +1,101 @@
>>> >>> +/*
>>> >>> + * arch/x86/hvm/vm_event.c
>>> >>> + *
>>> >>> + * HVM vm_event handling routines
>>> >>> + *
>>> >>> + * Copyright (c) 2017 Razvan Cojocaru (rcojoc...@bitdefender.com 
>>> <mailto:rcojoc...@bitdefender.com>)
>>> >>
>>> >> I'm notoriously bad when it comes to copyrights, but you just
>>> >> moving code makes me wonder whether this is appropriate.
>>> >
>>> > To be honest I quite agree with you, and in the beginning I just meant
>>> > to have no Copyright line in there at all - but I remembered a
>>> > discussion a while back where a patch was I believe rejected because 
>>> it
>>> > lacked one. So I've just copied Tamas' file (vm_event.c) and only
>>> > changed the copyright line because I didn't really know what else to 
>>> put
>>> > there.
>>> >
>>> > I'm quite happy to remove it altogether. Will that do?
>>>
>>> Afaic - sure. But as said, I'm quite bad at such things ...
>>>
>>>
>>> Since this is just code-movement from hvm.c to a separate file I would
>>> say it should retain the copyright lines from hvm.c. Other then that it
>>> looks good to me.
>>
>> Actually the funny part about that is that while this is indeed only
>> moved code, I have written all of that code in the first place, so I've
>> moved my own code. :)
> 
> Well, I've also added like two lines to it with
> VM_EVENT_FLAG_SET_EMUL_INS_DATA ;)

Right, my bad. :)

>> But I have no problem with either removing the copyright line altogether
>> or using the lines in hvm.c as suggested.
> 
> Yeap, I'm fine with either route but the safe bet is to just use the
> one from hvm.c (and add yourself to it if you feel like).

Alright, I'll just copy it from hvm.c.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-03 Thread Razvan Cojocaru
On 05/03/2017 11:05 PM, Tamas K Lengyel wrote:
> 
> 
> On Wed, May 3, 2017 at 6:48 AM, Jan Beulich <jbeul...@suse.com
> <mailto:jbeul...@suse.com>> wrote:
> 
> >>> On 03.05.17 at 12:37, <rcojoc...@bitdefender.com 
> <mailto:rcojoc...@bitdefender.com>> wrote:
> > On 05/03/17 12:51, Jan Beulich wrote:
> >>>>> On 03.05.17 at 11:10, <rcojoc...@bitdefender.com 
> <mailto:rcojoc...@bitdefender.com>> wrote:
> >>> --- /dev/null
> >>> +++ b/xen/arch/x86/hvm/vm_event.c
> >>> @@ -0,0 +1,101 @@
> >>> +/*
> >>> + * arch/x86/hvm/vm_event.c
> >>> + *
> >>> + * HVM vm_event handling routines
> >>> + *
> >>> + * Copyright (c) 2017 Razvan Cojocaru (rcojoc...@bitdefender.com 
> <mailto:rcojoc...@bitdefender.com>)
> >>
> >> I'm notoriously bad when it comes to copyrights, but you just
> >> moving code makes me wonder whether this is appropriate.
> >
> > To be honest I quite agree with you, and in the beginning I just meant
> > to have no Copyright line in there at all - but I remembered a
> > discussion a while back where a patch was I believe rejected because it
> > lacked one. So I've just copied Tamas' file (vm_event.c) and only
> > changed the copyright line because I didn't really know what else to put
> > there.
> >
> > I'm quite happy to remove it altogether. Will that do?
> 
> Afaic - sure. But as said, I'm quite bad at such things ...
> 
> 
> Since this is just code-movement from hvm.c to a separate file I would
> say it should retain the copyright lines from hvm.c. Other then that it
> looks good to me.

Actually the funny part about that is that while this is indeed only
moved code, I have written all of that code in the first place, so I've
moved my own code. :)

But I have no problem with either removing the copyright line altogether
or using the lines in hvm.c as suggested.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] hvm: fix hypervisor crash in hvm_save_one()

2017-05-03 Thread Razvan Cojocaru
On 05/03/17 15:22, Jan Beulich wrote:
> hvm_save_cpu_ctxt() returns success without writing any data into
> hvm_domain_context_t when all VCPUs are offline. This can then crash
> the hypervisor (with FATAL PAGE FAULT) in hvm_save_one() via the
> "off < (ctxt.cur - sizeof(*desc))" for() test, where ctxt.cur remains 0,
> causing an underflow which leads the hypervisor to go off the end of the
> ctxt buffer.
> 
> This has been broken since Xen 4.4 (c/s e019c606f59).
> It has happened in practice with an HVM Linux VM (Debian 8) queried around
> shutdown:
> 
> (XEN) hvm.c:1595:d3v0 All CPUs offline -- powering off.
> (XEN) [ Xen-4.9-rc  x86_64  debug=y   Not tainted ]
> (XEN) CPU:5
> (XEN) RIP:e008:[] hvm_save_one+0x145/0x1fd
> (XEN) RFLAGS: 00010286   CONTEXT: hypervisor (d0v2)
> (XEN) rax: 830492cbb445   rbx:    rcx: 83039343b400
> (XEN) rdx: ff88004d   rsi: fff8   rdi: 
> (XEN) rbp: 8304103e7c88   rsp: 8304103e7c48   r8:  0001
> (XEN) r9:  deadbeefdeadf00d   r10:    r11: 0282
> (XEN) r12: 7f43a3b14004   r13: fffe   r14: 
> (XEN) r15: 830400c41000   cr0: 80050033   cr4: 001526e0
> (XEN) cr3: 000402e13000   cr2: 830492cbb447
> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
> (XEN) Xen code around  (hvm_save_one+0x145/0x1fd):
> (XEN)  00 00 48 01 c8 83 c2 08 <66> 39 58 02 75 64 eb 08 48 89 c8 ba 08 00 00 
> 00
> (XEN) Xen stack trace from rsp=8304103e7c48:
> (XEN)0410 83039343b400 8304103e7c70 8304103e7da8
> (XEN)830400c41000 7f43a3b13004 8304103b7000 ffea
> (XEN)8304103e7d48 82d0802683d4 8300d19fd000 82d0802320d8
> (XEN)830400c41000  8304103e7cd8 82d08026ff3d
> (XEN) 8300d19fd000 8304103e7cf8 82d080232142
> (XEN) 8300d19fd000 8304103e7d28 82d080207051
> (XEN)8304103e7d18 830400c41000 0202 830400c41000
> (XEN) 7f43a3b13004  deadbeefdeadf00d
> (XEN)8304103e7e68 82d080206c47 0700 830410375bd0
> (XEN)0296 830410375c78 830410375c80 0003
> (XEN)8304103e7e68 8304103b67c0 8304103b7000 8304103b67c0
> (XEN)000d0037 0003 0002 7f43a3b14004
> (XEN)7ffd5d925590  0001 
> (XEN)ea8f8000  7ffd 
> (XEN)7f43a276f557  ea8f8000 
> (XEN)7ffd5d9255e0 7f43a23280b2 7ffd5d926058 8304103e7f18
> (XEN)8300d19fe000 0024 82d0802053e5 deadbeefdeadf00d
> (XEN)8304103e7f08 82d080351565 01003fff 7f43a3b13004
> (XEN)deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d
> (XEN)8800781425c0 88007ce94300 8304103e7ed8 82d0802719ec
> (XEN) Xen call trace:
> (XEN)[] hvm_save_one+0x145/0x1fd
> (XEN)[] arch_do_domctl+0xa7a/0x259f
> (XEN)[] do_domctl+0x1862/0x1b7b
> (XEN)[] pv_hypercall+0x1ef/0x42c
> (XEN)[] entry.o#test_all_events+0/0x30
> (XEN)
> (XEN) Pagetable walk from 830492cbb447:
> (XEN)  L4[0x106] = dbc36063 
> (XEN)  L3[0x012] =  
> (XEN)
> (XEN) 
> (XEN) Panic on CPU 5:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=]
> (XEN) Faulting linear address: 830492cbb447
> (XEN) 
> 
> At the same time pave the way for having zero-length records.
> 
> Inspired by an earlier patch from Andrew and Razvan.
> 
> Reported-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Diagnosed-by: Andrew Cooper <andrew.coop...@citrix.com>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> Acked-by: Tim Deegan <t...@xen.org>
> Release-Acked-by: Julien Grall <julien.gr...@arm.com>

Tested-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-03 Thread Razvan Cojocaru
On 05/03/17 12:30, Jan Beulich wrote:
 On 03.05.17 at 11:21,  wrote:
>> At 10:15 +0100 on 03 May (1493806508), Tim Deegan wrote:
>>> At 00:31 -0600 on 03 May (1493771502), Jan Beulich wrote:
 +else if ( ctxt.cur > sizeof(*desc) )
  {
  uint32_t off;
 -const struct hvm_save_descriptor *desc;
  
 -rv = -ENOENT;
  for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += 
 desc->length )
>>
>> It occurs to me that as well as underflowing, this test is off by one.
>> It ought to be "off + sizeof(*desc) <= ctxt.cur" to allow for a
>> zero-length record.  AFAIK we don't actually have any of those, so
>> it's academic, but we might want to represent the presence of some
>> feature without having any feature-specific state to save.
> 
> Good point; I already have two follow-up patches, one of which I
> think this adjustment would easily fit into.

Should I re-send the original patch with the updated comment then (thus
also being able to keep Andrew's Signed-off-by), and if so, is it
alright to keep Julien's Release-Acked-by?

Or will you use this later patch (presumably with your Signed-off-by),
in which case I should test it?

Things got rather confusing (apologies for my own part in the confusion).


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2 2/2] x86/vm_event: fix race between __context_switch() and vm_event_resume()

2017-05-03 Thread Razvan Cojocaru
On 05/03/17 13:01, Jan Beulich wrote:
 On 03.05.17 at 11:10,  wrote:
>> The introspection agent can reply to a vm_event faster than
>> vmx_vmexit_handler() can complete in some cases, where it is then
>> not safe for vm_event_set_registers() to modify v->arch.user_regs.
>> In the test scenario, we were stepping over an INT3 breakpoint by
>> setting RIP += 1. The quick reply tended to complete before the VCPU
>> triggering the introspection event had properly paused and been
>> descheduled. If the reply occurs before __context_switch() happens,
>> __context_switch() clobbers the reply by overwriting
>> v->arch.user_regs from the stack. If the reply occurs after
>> __context_switch(), we don't pass through __context_switch() when
>> transitioning to idle.
> 
> This last sentence still looks to be wrong (and even self-contradictory).
> The reply can't occur after __context_switch() if we don't make it there
> in the first place. How about "If we don't pass through
> __context_switch() (due to switching to the idle vCPU), reply data
> wouldn't be picked up when switching back straight to the original
> vCPU"?

Quite right, it's very convoluted. I'll update the comment.


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2 1/2] x86/vm_event: added hvm/vm_event.{h, c}

2017-05-03 Thread Razvan Cojocaru
On 05/03/17 12:51, Jan Beulich wrote:
>>>> On 03.05.17 at 11:10, <rcojoc...@bitdefender.com> wrote:
>> @@ -483,67 +483,7 @@ void hvm_do_resume(struct vcpu *v)
>>  if ( !handle_hvm_io_completion(v) )
>>  return;
>>  
>> -if ( unlikely(v->arch.vm_event) )
>> -{
>> -struct monitor_write_data *w = >arch.vm_event->write_data;
>> -
>> -if ( unlikely(v->arch.vm_event->emulate_flags) )
>> -{
>> -enum emul_kind kind = EMUL_KIND_NORMAL;
>> -
>> -/*
>> - * Please observ the order here to match the flag descriptions
>> - * provided in public/vm_event.h
>> - */
>> -if ( v->arch.vm_event->emulate_flags &
>> - VM_EVENT_FLAG_SET_EMUL_READ_DATA )
>> -kind = EMUL_KIND_SET_CONTEXT_DATA;
>> -else if ( v->arch.vm_event->emulate_flags &
>> -  VM_EVENT_FLAG_EMULATE_NOWRITE )
>> -kind = EMUL_KIND_NOWRITE;
>> -else if ( v->arch.vm_event->emulate_flags &
>> -  VM_EVENT_FLAG_SET_EMUL_INSN_DATA )
>> -kind = EMUL_KIND_SET_CONTEXT_INSN;
>> -
>> -hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
>> - X86_EVENT_NO_EC);
>> -
>> -v->arch.vm_event->emulate_flags = 0;
>> -}
>> -
>> -if ( w->do_write.msr )
>> -{
>> -if ( hvm_msr_write_intercept(w->msr, w->value, 0) ==
>> - X86EMUL_EXCEPTION )
>> -hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> -
>> -w->do_write.msr = 0;
>> -}
>> -
>> -if ( w->do_write.cr0 )
>> -{
>> -if ( hvm_set_cr0(w->cr0, 0) == X86EMUL_EXCEPTION )
>> -hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> -
>> -w->do_write.cr0 = 0;
>> -}
>> -
>> -if ( w->do_write.cr4 )
>> -{
>> -if ( hvm_set_cr4(w->cr4, 0) == X86EMUL_EXCEPTION )
>> -hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> -
>> -w->do_write.cr4 = 0;
>> -}
>> -
>> -if ( w->do_write.cr3 )
>> -{
>> -if ( hvm_set_cr3(w->cr3, 0) == X86EMUL_EXCEPTION )
>> -hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> -
>> -    w->do_write.cr3 = 0;
>> -}
>> -}
>> +hvm_vm_event_do_resume(v);
> 
> As indicated before, I think we want to keep
> 
> if ( unlikely(v->arch.vm_event) )
> 
> here of in an inline wrapper, to avoid the actual function call in the
> common case.

Will do.

>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/vm_event.c
>> @@ -0,0 +1,101 @@
>> +/*
>> + * arch/x86/hvm/vm_event.c
>> + *
>> + * HVM vm_event handling routines
>> + *
>> + * Copyright (c) 2017 Razvan Cojocaru (rcojoc...@bitdefender.com)
> 
> I'm notoriously bad when it comes to copyrights, but you just
> moving code makes me wonder whether this is appropriate.

To be honest I quite agree with you, and in the beginning I just meant
to have no Copyright line in there at all - but I remembered a
discussion a while back where a patch was I believe rejected because it
lacked one. So I've just copied Tamas' file (vm_event.c) and only
changed the copyright line because I didn't really know what else to put
there.

I'm quite happy to remove it altogether. Will that do?

>> +void hvm_vm_event_do_resume(struct vcpu *v)
>> +{
>> +struct monitor_write_data *w;
>> +
>> +if ( likely(!v->arch.vm_event) )
>> +return;
>> +
>> +w = >arch.vm_event->write_data;
>> +
>> +if ( unlikely(v->arch.vm_event->emulate_flags) )
>> +{
>> +enum emul_kind kind = EMUL_KIND_NORMAL;
>> +
>> +/*
>> + * Please observe the order here to match the flag descriptions
>> + * provided in public/vm_event.h
>> + */
>> +if ( v->arch.vm_event->emulate_flags &
>> + VM_EVENT_FLAG_SET_EMUL_READ_DATA )
>> +kind = EMUL_KIND_SET_CONTEXT_DATA;
>> +else if ( v->arch.vm_event->emulate_flags &
>> +  VM_EVENT_FLAG_EMULATE_NOWRITE )
>> +kind = EMUL_KIND_NOWRITE;
>> +else if ( v->arch.vm_event-&

Re: [Xen-devel] [PATCH V2] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-03 Thread Razvan Cojocaru
On 05/03/17 12:15, Tim Deegan wrote:
> At 00:31 -0600 on 03 May (1493771502), Jan Beulich wrote:
>> Hmm, with both of you being of that opinion, I've taken another
>> look. I think I see now why you think that way (this being data
>> from an internal producer, overflow/underflow are not a primary
>> concern), so I'll withdraw my objection to the original patch (i.e.
>> I agree taking it with the v2 description). However, an alternative
>> would be
>>
>> --- unstable.orig/xen/common/hvm/save.c
>> +++ unstable/xen/common/hvm/save.c
>> @@ -79,14 +79,15 @@ size_t hvm_save_size(struct domain *d)
>>  int hvm_save_one(struct domain *d, uint16_t typecode, uint16_t instance, 
>>   XEN_GUEST_HANDLE_64(uint8) handle)
>>  {
>> -int rv = 0;
>> +int rv = -ENOENT;
>>  size_t sz = 0;
>>  struct vcpu *v;
>>  hvm_domain_context_t ctxt = { 0, };
>> +const struct hvm_save_descriptor *desc;
>>  
>>  if ( d->is_dying 
>>   || typecode > HVM_SAVE_CODE_MAX 
>> - || hvm_sr_handlers[typecode].size < sizeof(struct 
>> hvm_save_descriptor)
>> + || hvm_sr_handlers[typecode].size < sizeof(*desc)
>>   || hvm_sr_handlers[typecode].save == NULL )
>>  return -EINVAL;
>>  
>> @@ -107,12 +108,10 @@ int hvm_save_one(struct domain *d, uint1
>> d->domain_id, typecode);
>>  rv = -EFAULT;
>>  }
>> -else
>> +else if ( ctxt.cur > sizeof(*desc) )
>>  {
>>  uint32_t off;
>> -const struct hvm_save_descriptor *desc;
>>  
>> -rv = -ENOENT;
>>  for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += 
>> desc->length )
>>  {
>>  desc = (void *)(ctxt.data + off);
>> @@ -122,7 +121,8 @@ int hvm_save_one(struct domain *d, uint1
>>  {
>>  uint32_t copy_length = desc->length;
>>  
>> -if ( off + copy_length > ctxt.cur )
>> +if ( ctxt.cur < copy_length ||
>> + off > ctxt.cur - copy_length )
>>  copy_length = ctxt.cur - off;
>>  rv = 0;
>>  if ( copy_to_guest(handle, ctxt.data + off, copy_length) )
>>
>> taking care of overflow/underflow (now consistently) as well, plus
>> avoiding the (imo ugly) goto without making the code harder to
>> read. Thoughts?
> 
> Any of these three patches is fine by me.

I feel the same. If Andrew prefers this version I'm happy to test it,
otherwise re-sending the first patch with the corrected description is
the fastest path.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V2 2/2] x86/vm_event: fix race between __context_switch() and vm_event_resume()

2017-05-03 Thread Razvan Cojocaru
The introspection agent can reply to a vm_event faster than
vmx_vmexit_handler() can complete in some cases, where it is then
not safe for vm_event_set_registers() to modify v->arch.user_regs.
In the test scenario, we were stepping over an INT3 breakpoint by
setting RIP += 1. The quick reply tended to complete before the VCPU
triggering the introspection event had properly paused and been
descheduled. If the reply occurs before __context_switch() happens,
__context_switch() clobbers the reply by overwriting
v->arch.user_regs from the stack. If the reply occurs after
__context_switch(), we don't pass through __context_switch() when
transitioning to idle.

This patch ensures that vm_event_resume() code only sets per-VCPU
data to be used for the actual setting of registers later in
hvm_do_resume() (similar to the model used to control setting of CRs
and MSRs).

The patch additionally removes the sync_vcpu_execstate(v) call from
vm_event_resume(), which is no longer necessary, which removes the
associated broadcast TLB flush (read: performance improvement).

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/vm_event.c| 35 +++
 xen/arch/x86/vm_event.c| 22 ++
 xen/common/vm_event.c  | 17 ++---
 xen/include/asm-x86/vm_event.h |  2 ++
 4 files changed, 49 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
index b35ee12..bfb95a2 100644
--- a/xen/arch/x86/hvm/vm_event.c
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -23,6 +23,39 @@
 #include 
 #include 
 
+static void hvm_vm_event_set_registers(const struct vcpu *v)
+{
+ASSERT(v == current);
+
+if ( unlikely(v->arch.vm_event->set_gprs) )
+{
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+regs->rax = v->arch.vm_event->gprs.rax;
+regs->rbx = v->arch.vm_event->gprs.rbx;
+regs->rcx = v->arch.vm_event->gprs.rcx;
+regs->rdx = v->arch.vm_event->gprs.rdx;
+regs->rsp = v->arch.vm_event->gprs.rsp;
+regs->rbp = v->arch.vm_event->gprs.rbp;
+regs->rsi = v->arch.vm_event->gprs.rsi;
+regs->rdi = v->arch.vm_event->gprs.rdi;
+
+regs->r8 = v->arch.vm_event->gprs.r8;
+regs->r9 = v->arch.vm_event->gprs.r9;
+regs->r10 = v->arch.vm_event->gprs.r10;
+regs->r11 = v->arch.vm_event->gprs.r11;
+regs->r12 = v->arch.vm_event->gprs.r12;
+regs->r13 = v->arch.vm_event->gprs.r13;
+regs->r14 = v->arch.vm_event->gprs.r14;
+regs->r15 = v->arch.vm_event->gprs.r15;
+
+regs->rflags = v->arch.vm_event->gprs.rflags;
+regs->rip = v->arch.vm_event->gprs.rip;
+
+v->arch.vm_event->set_gprs = false;
+}
+}
+
 void hvm_vm_event_do_resume(struct vcpu *v)
 {
 struct monitor_write_data *w;
@@ -30,6 +63,8 @@ void hvm_vm_event_do_resume(struct vcpu *v)
 if ( likely(!v->arch.vm_event) )
 return;
 
+hvm_vm_event_set_registers(v);
+
 w = >arch.vm_event->write_data;
 
 if ( unlikely(v->arch.vm_event->emulate_flags) )
diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 502ccc2..a6ea42c 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -113,26 +113,8 @@ void vm_event_set_registers(struct vcpu *v, 
vm_event_response_t *rsp)
 {
 ASSERT(atomic_read(>vm_event_pause_count));
 
-v->arch.user_regs.rax = rsp->data.regs.x86.rax;
-v->arch.user_regs.rbx = rsp->data.regs.x86.rbx;
-v->arch.user_regs.rcx = rsp->data.regs.x86.rcx;
-v->arch.user_regs.rdx = rsp->data.regs.x86.rdx;
-v->arch.user_regs.rsp = rsp->data.regs.x86.rsp;
-v->arch.user_regs.rbp = rsp->data.regs.x86.rbp;
-v->arch.user_regs.rsi = rsp->data.regs.x86.rsi;
-v->arch.user_regs.rdi = rsp->data.regs.x86.rdi;
-
-v->arch.user_regs.r8 = rsp->data.regs.x86.r8;
-v->arch.user_regs.r9 = rsp->data.regs.x86.r9;
-v->arch.user_regs.r10 = rsp->data.regs.x86.r10;
-v->arch.user_regs.r11 = rsp->data.regs.x86.r11;
-v->arch.user_regs.r12 = rsp->data.regs.x86.r12;
-v->arch.user_regs.r13 = rsp->data.regs.x86.r13;
-v->arch.user_regs.r14 = rsp->data.regs.x86.r14;
-v->arch.user_regs.r15 = rsp->data.regs.x86.r15;
-
-v->arch.user_regs.rflags = rsp->data.regs.x86.rflags;
-v->arch.user_regs.rip = rsp->data.regs.x86.rip;
+v->arch.vm_event->gprs = rsp->data.regs.x86;
+v->arch.vm_event->set_gprs = true;
 }
 
 void vm_event_monitor_next_interrupt(struct vcpu *v)
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 0fe9a53..9291db

[Xen-devel] [PATCH V2 1/2] x86/vm_event: added hvm/vm_event.{h,c}

2017-05-03 Thread Razvan Cojocaru
Created arch/x86/hvm/vm_event.c and include/asm-x86/hvm/vm_event.h,
where HVM-specific vm_event-related code will live. This cleans up
hvm_do_resume() and ensures that the vm_event maintainers are
responsible for changes to that code.

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
---
 MAINTAINERS|   2 +
 xen/arch/x86/hvm/Makefile  |   1 +
 xen/arch/x86/hvm/hvm.c |  64 +--
 xen/arch/x86/hvm/vm_event.c| 101 +
 xen/include/asm-x86/hvm/vm_event.h |  34 +
 5 files changed, 140 insertions(+), 62 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vm_event.c
 create mode 100644 xen/include/asm-x86/hvm/vm_event.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c345a50..10863bc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -404,6 +404,7 @@ S:  Supported
 F: tools/tests/xen-access
 F: xen/arch/*/monitor.c
 F: xen/arch/*/vm_event.c
+F: xen/arch/*/hvm/vm_event.c
 F: xen/arch/arm/mem_access.c
 F: xen/arch/x86/mm/mem_access.c
 F: xen/arch/x86/hvm/monitor.c
@@ -413,6 +414,7 @@ F:  xen/common/vm_event.c
 F: xen/include/*/mem_access.h
 F: xen/include/*/monitor.h
 F: xen/include/*/vm_event.h
+F: xen/include/*/hvm/vm_event.h
 F: xen/include/asm-x86/hvm/monitor.h
 
 VTPM
diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 0a3d0f4..02f0add 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -24,6 +24,7 @@ obj-y += stdvga.o
 obj-y += vioapic.o
 obj-y += viridian.o
 obj-y += vlapic.o
+obj-y += vm_event.o
 obj-y += vmsi.o
 obj-y += vpic.o
 obj-y += vpt.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a441955..f010394 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -34,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -61,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -483,67 +483,7 @@ void hvm_do_resume(struct vcpu *v)
 if ( !handle_hvm_io_completion(v) )
 return;
 
-if ( unlikely(v->arch.vm_event) )
-{
-struct monitor_write_data *w = >arch.vm_event->write_data;
-
-if ( unlikely(v->arch.vm_event->emulate_flags) )
-{
-enum emul_kind kind = EMUL_KIND_NORMAL;
-
-/*
- * Please observ the order here to match the flag descriptions
- * provided in public/vm_event.h
- */
-if ( v->arch.vm_event->emulate_flags &
- VM_EVENT_FLAG_SET_EMUL_READ_DATA )
-kind = EMUL_KIND_SET_CONTEXT_DATA;
-else if ( v->arch.vm_event->emulate_flags &
-  VM_EVENT_FLAG_EMULATE_NOWRITE )
-kind = EMUL_KIND_NOWRITE;
-else if ( v->arch.vm_event->emulate_flags &
-  VM_EVENT_FLAG_SET_EMUL_INSN_DATA )
-kind = EMUL_KIND_SET_CONTEXT_INSN;
-
-hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
- X86_EVENT_NO_EC);
-
-v->arch.vm_event->emulate_flags = 0;
-}
-
-if ( w->do_write.msr )
-{
-if ( hvm_msr_write_intercept(w->msr, w->value, 0) ==
- X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.msr = 0;
-}
-
-if ( w->do_write.cr0 )
-{
-if ( hvm_set_cr0(w->cr0, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr0 = 0;
-}
-
-if ( w->do_write.cr4 )
-{
-if ( hvm_set_cr4(w->cr4, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr4 = 0;
-}
-
-if ( w->do_write.cr3 )
-{
-if ( hvm_set_cr3(w->cr3, 0) == X86EMUL_EXCEPTION )
-hvm_inject_hw_exception(TRAP_gp_fault, 0);
-
-w->do_write.cr3 = 0;
-}
-}
+hvm_vm_event_do_resume(v);
 
 /* Inject pending hw/sw event */
 if ( v->arch.hvm_vcpu.inject_event.vector >= 0 )
diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
new file mode 100644
index 000..b35ee12
--- /dev/null
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -0,0 +1,101 @@
+/*
+ * arch/x86/hvm/vm_event.c
+ *
+ * HVM vm_event handling routines
+ *
+ * Copyright (c) 2017 Razvan Cojocaru (rcojoc...@bitdefender.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHAN

[Xen-devel] [PATCH V2 0/2] Fix vm_event resume path race condition

2017-05-03 Thread Razvan Cojocaru
This small series first creates hvm/vm_event.{h,c}, in order to bring
under vm_event maintainership the code that has previously lived in
hvm_do_resume(), and then fixes a __context_switch()-related race
condition when attempting to set registers via vm_event reply.

Previously this has been a single patch, "x86/vm_event: fix race
between vmx_vmexit_handler() and vm_event_resume()".

[PATCH V2 1/2] x86/vm_event: added hvm/vm_event.{h,c}
[PATCH V2 2/2] x86/vm_event: fix race between __context_switch() and


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
On 05/02/2017 07:11 PM, Andrew Cooper wrote:
> On 02/05/17 17:02, Tim Deegan wrote:
>> At 18:21 +0300 on 02 May (1493749307), Razvan Cojocaru wrote:
>>> hvm_save_cpu_ctxt() returns success without writing any data into
>>> hvm_domain_context_t when all VCPUs are offline. This can then crash
>>> the hypervisor (with FATAL PAGE FAULT) in hvm_save_one() via the
>>> "off < (ctxt.cur - sizeof(*desc))" for() test, where ctxt.cur remains 0,
>>> causing an underflow which leads the hypervisor to go off the end of the
>>> ctxt buffer.
>> [...]
>>> Reported-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>>> Tested-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>> I actually preferred the first patch
> 
> As did I.  Seeing as there is no more of my code in it, you should
> probably drop my SoB, but this can be fixed up on commit if there are no
> other issues.

Hah, I've just replied that you should be the author. :)
I am fine with however you prefer this to go.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
On 05/02/2017 06:41 PM, Jan Beulich wrote:
>>>> On 02.05.17 at 17:21, <rcojoc...@bitdefender.com> wrote:
>> hvm_save_cpu_ctxt() returns success without writing any data into
>> hvm_domain_context_t when all VCPUs are offline. This can then crash
>> the hypervisor (with FATAL PAGE FAULT) in hvm_save_one() via the
>> "off < (ctxt.cur - sizeof(*desc))" for() test, where ctxt.cur remains 0,
>> causing an underflow which leads the hypervisor to go off the end of the
>> ctxt buffer.
>>
>> This has been broken since Xen 4.4 (c/s e019c606f59).
>> It has happened in practice with an HVM Linux VM (Debian 8) queried around
>> shutdown:
>>
>> (XEN) hvm.c:1595:d3v0 All CPUs offline -- powering off.
>> (XEN) [ Xen-4.9-rc  x86_64  debug=y   Not tainted ]
>> (XEN) CPU:5
>> (XEN) RIP:e008:[] hvm_save_one+0x145/0x1fd
>> (XEN) RFLAGS: 00010286   CONTEXT: hypervisor (d0v2)
>> (XEN) rax: 830492cbb445   rbx:    rcx: 83039343b400
>> (XEN) rdx: ff88004d   rsi: fff8   rdi: 
>> (XEN) rbp: 8304103e7c88   rsp: 8304103e7c48   r8:  0001
>> (XEN) r9:  deadbeefdeadf00d   r10:    r11: 0282
>> (XEN) r12: 7f43a3b14004   r13: fffe   r14: 
>> (XEN) r15: 830400c41000   cr0: 80050033   cr4: 001526e0
>> (XEN) cr3: 000402e13000   cr2: 830492cbb447
>> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
>> (XEN) Xen code around  (hvm_save_one+0x145/0x1fd):
>> (XEN)  00 00 48 01 c8 83 c2 08 <66> 39 58 02 75 64 eb 08 48 89 c8 ba 08 00 
>> 00 00
>> (XEN) Xen stack trace from rsp=8304103e7c48:
>> (XEN)0410 83039343b400 8304103e7c70 8304103e7da8
>> (XEN)830400c41000 7f43a3b13004 8304103b7000 ffea
>> (XEN)8304103e7d48 82d0802683d4 8300d19fd000 82d0802320d8
>> (XEN)830400c41000  8304103e7cd8 82d08026ff3d
>> (XEN) 8300d19fd000 8304103e7cf8 82d080232142
>> (XEN) 8300d19fd000 8304103e7d28 82d080207051
>> (XEN)8304103e7d18 830400c41000 0202 830400c41000
>> (XEN) 7f43a3b13004  deadbeefdeadf00d
>> (XEN)8304103e7e68 82d080206c47 0700 830410375bd0
>> (XEN)0296 830410375c78 830410375c80 0003
>> (XEN)8304103e7e68 8304103b67c0 8304103b7000 8304103b67c0
>> (XEN)000d0037 0003 0002 7f43a3b14004
>> (XEN)7ffd5d925590  0001 
>> (XEN)ea8f8000  7ffd 
>> (XEN)7f43a276f557  ea8f8000 
>> (XEN)7ffd5d9255e0 7f43a23280b2 7ffd5d926058 8304103e7f18
>> (XEN)8300d19fe000 0024 82d0802053e5 deadbeefdeadf00d
>> (XEN)8304103e7f08 82d080351565 01003fff 7f43a3b13004
>> (XEN)deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d
>> (XEN)8800781425c0 88007ce94300 8304103e7ed8 82d0802719ec
>> (XEN) Xen call trace:
>> (XEN)[] hvm_save_one+0x145/0x1fd
>> (XEN)[] arch_do_domctl+0xa7a/0x259f
>> (XEN)[] do_domctl+0x1862/0x1b7b
>> (XEN)[] pv_hypercall+0x1ef/0x42c
>> (XEN)[] entry.o#test_all_events+0/0x30
>> (XEN)
>> (XEN) Pagetable walk from 830492cbb447:
>> (XEN)  L4[0x106] = dbc36063 
>> (XEN)  L3[0x012] =  ffff
>> (XEN)
>> (XEN) 
>> (XEN) Panic on CPU 5:
>> (XEN) FATAL PAGE FAULT
>> (XEN) [error_code=]
>> (XEN) Faulting linear address: 830492cbb447
>> (XEN) 
>>
>> Reported-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
> 
> With this it's not really clear to me whether Andrew or you is the
> patch author.

Andrew deserves the credit. While I've found and tested the issue, and
did the actual submitting / small changes, his help finding the exact
cause, and advice on the best way to fix it is what has made the difference.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V2] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
hvm_save_cpu_ctxt() returns success without writing any data into
hvm_domain_context_t when all VCPUs are offline. This can then crash
the hypervisor (with FATAL PAGE FAULT) in hvm_save_one() via the
"off < (ctxt.cur - sizeof(*desc))" for() test, where ctxt.cur remains 0,
causing an underflow which leads the hypervisor to go off the end of the
ctxt buffer.

This has been broken since Xen 4.4 (c/s e019c606f59).
It has happened in practice with an HVM Linux VM (Debian 8) queried around
shutdown:

(XEN) hvm.c:1595:d3v0 All CPUs offline -- powering off.
(XEN) [ Xen-4.9-rc  x86_64  debug=y   Not tainted ]
(XEN) CPU:5
(XEN) RIP:e008:[] hvm_save_one+0x145/0x1fd
(XEN) RFLAGS: 00010286   CONTEXT: hypervisor (d0v2)
(XEN) rax: 830492cbb445   rbx:    rcx: 83039343b400
(XEN) rdx: ff88004d   rsi: fff8   rdi: 
(XEN) rbp: 8304103e7c88   rsp: 8304103e7c48   r8:  0001
(XEN) r9:  deadbeefdeadf00d   r10:    r11: 0282
(XEN) r12: 7f43a3b14004   r13: fffe   r14: 
(XEN) r15: 830400c41000   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 000402e13000   cr2: 830492cbb447
(XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
(XEN) Xen code around  (hvm_save_one+0x145/0x1fd):
(XEN)  00 00 48 01 c8 83 c2 08 <66> 39 58 02 75 64 eb 08 48 89 c8 ba 08 00 00 00
(XEN) Xen stack trace from rsp=8304103e7c48:
(XEN)0410 83039343b400 8304103e7c70 8304103e7da8
(XEN)830400c41000 7f43a3b13004 8304103b7000 ffea
(XEN)8304103e7d48 82d0802683d4 8300d19fd000 82d0802320d8
(XEN)830400c41000  8304103e7cd8 82d08026ff3d
(XEN) 8300d19fd000 8304103e7cf8 82d080232142
(XEN) 8300d19fd000 8304103e7d28 82d080207051
(XEN)8304103e7d18 830400c41000 0202 830400c41000
(XEN) 7f43a3b13004  deadbeefdeadf00d
(XEN)8304103e7e68 82d080206c47 0700 830410375bd0
(XEN)0296 830410375c78 830410375c80 0003
(XEN)8304103e7e68 8304103b67c0 8304103b7000 8304103b67c0
(XEN)000d0037 0003 0002 7f43a3b14004
(XEN)7ffd5d925590  0001 
(XEN)ea8f8000  7ffd 
(XEN)7f43a276f557  ea8f8000 
(XEN)7ffd5d9255e0 7f43a23280b2 7ffd5d926058 8304103e7f18
(XEN)8300d19fe000 0024 82d0802053e5 deadbeefdeadf00d
(XEN)8304103e7f08 82d080351565 01003fff 7f43a3b13004
(XEN)deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d
(XEN)8800781425c0 88007ce94300 8304103e7ed8 82d0802719ec
(XEN) Xen call trace:
(XEN)[] hvm_save_one+0x145/0x1fd
(XEN)[] arch_do_domctl+0xa7a/0x259f
(XEN)[] do_domctl+0x1862/0x1b7b
(XEN)[] pv_hypercall+0x1ef/0x42c
(XEN)[] entry.o#test_all_events+0/0x30
(XEN)
(XEN) Pagetable walk from 830492cbb447:
(XEN)  L4[0x106] = dbc36063 
(XEN)  L3[0x012] =  
(XEN)
(XEN) 
(XEN) Panic on CPU 5:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=]
(XEN) Faulting linear address: 830492cbb447
(XEN) 

Reported-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Tested-by: Razvan Cojocaru <rcojoc...@bitdefender.com>

---
Changes since V1:
 - Corrected patch description.
 - Now checking whether the function got back any data at all, prior to
   entering the for() loop.
---
 xen/common/hvm/save.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c
index 78706f5..3bdd124 100644
--- a/xen/common/hvm/save.c
+++ b/xen/common/hvm/save.c
@@ -113,6 +113,9 @@ int hvm_save_one(struct domain *d, uint16_t typecode, 
uint16_t instance,
 const struct hvm_save_descriptor *desc;
 
 rv = -ENOENT;
+if ( ctxt.cur < sizeof(*desc) )
+goto out;
+
 for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += desc->length )
 {
 desc = (void *)(ctxt.data + off);
@@ -132,6 +135,7 @@ int hvm_save_one(struct domain *d, uint16_t typecode, 
uint16_t instance,
 }
 }
 
+ out:
 xfree(ctxt.data);
 return rv;
 }
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
On 05/02/17 17:09, Jan Beulich wrote:
 On 02.05.17 at 15:54,  wrote:
>> On 05/02/17 16:48, Jan Beulich wrote:
>> On 02.05.17 at 15:25,  wrote:
 --- a/xen/common/hvm/save.c
 +++ b/xen/common/hvm/save.c
 @@ -113,7 +113,7 @@ int hvm_save_one(struct domain *d, uint16_t typecode, 
 uint16_t instance,
  const struct hvm_save_descriptor *desc;
  
  rv = -ENOENT;
 -for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += 
 desc->length 
>> )
 +for ( off = 0; (off + sizeof(*desc)) < ctxt.cur; off += 
 desc->length 
>> )
  {
  desc = (void *)(ctxt.data + off);
  /* Move past header */
>>>
>>> I don't think this is an appropriate fix. Instead I think the function
>>> should check whether it got back any data at all, prior to entering
>>> the loop. Furthermore it might be worth considering to (also)
>>> refuse doing anything here if the domain's is_dying marker has
>>> already been set.
>>
>> hvm_save_one() already checks is_dying:
>>
>>  77 /* Extract a single instance of a save record, by marshalling all
>>  78  * records of that type and copying out the one we need. */
>>  79 int hvm_save_one(struct domain *d, uint16_t typecode, uint16_t
>> instance,
>>  80  XEN_GUEST_HANDLE_64(uint8) handle)
>>  81 {
>>  82 int rv = 0;
>>  83 size_t sz = 0;
>>  84 struct vcpu *v;
>>  85 hvm_domain_context_t ctxt = { 0, };
>>  86
>>  87 if ( d->is_dying
>>  88  || typecode > HVM_SAVE_CODE_MAX
>>  89  || hvm_sr_handlers[typecode].size < sizeof(struct
>> hvm_save_descriptor)
>>  90  || hvm_sr_handlers[typecode].save == NULL )
>>  91 return -EINVAL;
> 
> Hmm, interesting. The timing window to see is_dying clear here,
> bit no vCPU-s left there should be pretty small, so I wonder how
> you've managed to hit it. But anyway ...
> 
>> As for checking whether the handler wrote any data, I believe that
>> Andrew has checked and none of the handlers report when no data is being
>> passed on.
> 
> ... that's not what I've read out of his replies. I don't think the
> handlers need to report anything special. It is the caller which
> should check whether, despite having got back "success" there's
> no data in the buffer.

So you would prefer something like this?

diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c
index 78706f5..d4c8d84 100644
--- a/xen/common/hvm/save.c
+++ b/xen/common/hvm/save.c
@@ -113,6 +113,10 @@ int hvm_save_one(struct domain *d, uint16_t
typecode, uint16_t instance,
 const struct hvm_save_descriptor *desc;

 rv = -ENOENT;
+
+if ( !ctxt.cur )
+goto out;
+
 for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off +=
desc->length )
 {
 desc = (void *)(ctxt.data + off);
@@ -132,6 +136,7 @@ int hvm_save_one(struct domain *d, uint16_t
typecode, uint16_t instance,
 }
 }

+out:
 xfree(ctxt.data);
 return rv;
}


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
On 05/02/17 16:48, Jan Beulich wrote:
 On 02.05.17 at 15:25,  wrote:
>> hvm_save_cpu_ctxt() does a memset(, 0, sizeof(ctxt)), which
>> can lead to ctxt.cur being 0. This can then crash the hypervisor
>> (with FATAL PAGE FAULT) in hvm_save_one() via the
>> "off < (ctxt.cur - sizeof(*desc))" for() test. This has happened
>> in practice with a Linux VM queried around shutdown:
> 
> While a fix is indeed needed here, I can't connect your description
> with the actual crash: The memset() you refer to affects a local
> variable only, which happens to have the same name as a different
> variable in the caller. You also don't make clear at all why this would
> be an issue after guest shutdown was initiated already. Aiui the
> problem is that hvm_save_cpu_ctxt() might find all vCPU-s having
> VPF_down set, in which case it wouldn't put anything into the passed
> in buffer.

That is indeed the case, I've misunderstood the cause of the issue.

>> --- a/xen/common/hvm/save.c
>> +++ b/xen/common/hvm/save.c
>> @@ -113,7 +113,7 @@ int hvm_save_one(struct domain *d, uint16_t typecode, 
>> uint16_t instance,
>>  const struct hvm_save_descriptor *desc;
>>  
>>  rv = -ENOENT;
>> -for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += 
>> desc->length )
>> +for ( off = 0; (off + sizeof(*desc)) < ctxt.cur; off += 
>> desc->length )
>>  {
>>  desc = (void *)(ctxt.data + off);
>>  /* Move past header */
> 
> I don't think this is an appropriate fix. Instead I think the function
> should check whether it got back any data at all, prior to entering
> the loop. Furthermore it might be worth considering to (also)
> refuse doing anything here if the domain's is_dying marker has
> already been set.

hvm_save_one() already checks is_dying:

 77 /* Extract a single instance of a save record, by marshalling all
 78  * records of that type and copying out the one we need. */
 79 int hvm_save_one(struct domain *d, uint16_t typecode, uint16_t
instance,
 80  XEN_GUEST_HANDLE_64(uint8) handle)
 81 {
 82 int rv = 0;
 83 size_t sz = 0;
 84 struct vcpu *v;
 85 hvm_domain_context_t ctxt = { 0, };
 86
 87 if ( d->is_dying
 88  || typecode > HVM_SAVE_CODE_MAX
 89  || hvm_sr_handlers[typecode].size < sizeof(struct
hvm_save_descriptor)
 90  || hvm_sr_handlers[typecode].save == NULL )
 91 return -EINVAL;

As for checking whether the handler wrote any data, I believe that
Andrew has checked and none of the handlers report when no data is being
passed on.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
On 05/02/17 16:41, Tim Deegan wrote:
> Hi,
> 
> At 16:25 +0300 on 02 May (1493742339), Razvan Cojocaru wrote:
>> hvm_save_cpu_ctxt() does a memset(, 0, sizeof(ctxt)), which
>> can lead to ctxt.cur being 0. This can then crash the hypervisor
>> (with FATAL PAGE FAULT) in hvm_save_one() via the
>> "off < (ctxt.cur - sizeof(*desc))" for() test. This has happened
>> in practice with a Linux VM queried around shutdown:
> 
> Good fix, thanks!  But I think that memset is innocent -- it's
> clearing a local "struct hvm_hw_cpu ctxt", not the caller's
> "hvm_domain_context_t ctxt".  AFAICS the underflow happens when the
> per-type handler returns no data at all (because all VCPUs are
> offline).
> 
> With the commit message fixed,
> 
> Reviewed-by: Tim Deegan <t...@xen.org>

Indeed, sorry about the misunderstanding. I'll fix the commit message
and resend V2.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen/hvm: fix hypervisor crash with hvm_save_one()

2017-05-02 Thread Razvan Cojocaru
hvm_save_cpu_ctxt() does a memset(, 0, sizeof(ctxt)), which
can lead to ctxt.cur being 0. This can then crash the hypervisor
(with FATAL PAGE FAULT) in hvm_save_one() via the
"off < (ctxt.cur - sizeof(*desc))" for() test. This has happened
in practice with a Linux VM queried around shutdown:

(XEN) hvm.c:1595:d3v0 All CPUs offline -- powering off.
(XEN) [ Xen-4.9-rc  x86_64  debug=y   Not tainted ]
(XEN) CPU:5
(XEN) RIP:e008:[] hvm_save_one+0x145/0x1fd
(XEN) RFLAGS: 00010286   CONTEXT: hypervisor (d0v2)
(XEN) rax: 830492cbb445   rbx:    rcx: 83039343b400
(XEN) rdx: ff88004d   rsi: fff8   rdi: 
(XEN) rbp: 8304103e7c88   rsp: 8304103e7c48   r8:  0001
(XEN) r9:  deadbeefdeadf00d   r10:    r11: 0282
(XEN) r12: 7f43a3b14004   r13: fffe   r14: 
(XEN) r15: 830400c41000   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 000402e13000   cr2: 830492cbb447
(XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
(XEN) Xen code around  (hvm_save_one+0x145/0x1fd):
(XEN)  00 00 48 01 c8 83 c2 08 <66> 39 58 02 75 64 eb 08 48 89 c8 ba 08 00 00 00
(XEN) Xen stack trace from rsp=8304103e7c48:
(XEN)0410 83039343b400 8304103e7c70 8304103e7da8
(XEN)830400c41000 7f43a3b13004 8304103b7000 ffea
(XEN)8304103e7d48 82d0802683d4 8300d19fd000 82d0802320d8
(XEN)830400c41000  8304103e7cd8 82d08026ff3d
(XEN) 8300d19fd000 8304103e7cf8 82d080232142
(XEN) 8300d19fd000 8304103e7d28 82d080207051
(XEN)8304103e7d18 830400c41000 0202 830400c41000
(XEN) 7f43a3b13004  deadbeefdeadf00d
(XEN)8304103e7e68 82d080206c47 0700 830410375bd0
(XEN)0296 830410375c78 830410375c80 0003
(XEN)8304103e7e68 8304103b67c0 8304103b7000 8304103b67c0
(XEN)000d0037 0003 0002 7f43a3b14004
(XEN)7ffd5d925590  0001 
(XEN)ea8f8000  7ffd 
(XEN)7f43a276f557  ea8f8000 
(XEN)7ffd5d9255e0 7f43a23280b2 7ffd5d926058 8304103e7f18
(XEN)8300d19fe000 0024 82d0802053e5 deadbeefdeadf00d
(XEN)8304103e7f08 82d080351565 01003fff 7f43a3b13004
(XEN)deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d
(XEN)8800781425c0 88007ce94300 8304103e7ed8 82d0802719ec
(XEN) Xen call trace:
(XEN)[] hvm_save_one+0x145/0x1fd
(XEN)[] arch_do_domctl+0xa7a/0x259f
(XEN)[] do_domctl+0x1862/0x1b7b
(XEN)[] pv_hypercall+0x1ef/0x42c
(XEN)[] entry.o#test_all_events+0/0x30
(XEN)
(XEN) Pagetable walk from 830492cbb447:
(XEN)  L4[0x106] = dbc36063 
(XEN)  L3[0x012] =  
(XEN)
(XEN) 
(XEN) Panic on CPU 5:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=]
(XEN) Faulting linear address: 830492cbb447
(XEN) 

Reported-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Tested-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
---
 xen/common/hvm/save.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c
index 78706f5..ea2e251 100644
--- a/xen/common/hvm/save.c
+++ b/xen/common/hvm/save.c
@@ -113,7 +113,7 @@ int hvm_save_one(struct domain *d, uint16_t typecode, 
uint16_t instance,
 const struct hvm_save_descriptor *desc;
 
 rv = -ENOENT;
-for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += desc->length )
+for ( off = 0; (off + sizeof(*desc)) < ctxt.cur; off += desc->length )
 {
 desc = (void *)(ctxt.data + off);
 /* Move past header */
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 4/4] arm/mem_access: Add software guest-page-table walk

2017-05-01 Thread Razvan Cojocaru
On 04/30/2017 10:48 PM, Sergej Proskurin wrote:
> The function p2m_mem_access_check_and_get_page in mem_access.c
> translates a gva to an ipa by means of the hardware functionality
> implemented in the function gva_to_ipa. If mem_access is active,
> hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
> guest's translation tables, access to which might be restricted by the
> active VTTBR. To address this issue, we perform the gva to ipa
> translation in software.
> 
> Signed-off-by: Sergej Proskurin <prosku...@sec.in.tum.de>
> ---
> Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Tamas K Lengyel <ta...@tklengyel.com>
> Cc: Stefano Stabellini <sstabell...@kernel.org>
> Cc: Julien Grall <julien.gr...@arm.com>
> ---
>  xen/arch/arm/mem_access.c | 140 
> +-
>  1 file changed, 139 insertions(+), 1 deletion(-)

My ARM knowledge is scant to say the least, and I have no way to test
this code, so I'll leave it to Tamas who has done some ARM work in the
past. In any case - to state the obvious - the main acks here I believe
are Julien and Stefano.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()

2017-04-28 Thread Razvan Cojocaru
On 04/28/2017 09:25 AM, Jan Beulich wrote:
 On 27.04.17 at 19:31,  wrote:
>> On 27/04/17 09:52, Jan Beulich wrote:
>> On 27.04.17 at 10:34,  wrote:
 On 27/04/2017 09:01, Jan Beulich wrote:
 On 27.04.17 at 09:22,  wrote:
>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -357,6 +357,13 @@ void vm_event_resume(struct domain *d, struct 
>> vm_event_domain *ved)
>>  {
>>  vm_event_response_t rsp;
>>  
>> +/*
>> + * vm_event_resume() runs either from XEN_DOMCTL_VM_EVENT_OP_*, or
>> + * EVTCHN_send from the introspection consumer.  Both contexts are
>> + * guaranteed not to be the subject of vm_event responses.
>> + */
>> +ASSERT(d != current->domain);
> What is this meant to guard against?
 Documentation, as much as anything else.  It took a long time to
 untangle the contexts involved here; I'm not convinced the logic is safe
 to run in context, and it certainly doesn't need to.
>>> Well, as said - I think it is too strict now: You only need the vCPU
>>> not be current afaict, and it really matters here which of the two
>>> "current"-s you actually mean (and it looks to me as if you mean
>>> the other one, guaranteeing register state to no longer be on the
>>> stack).
>>
>> We don't know the vcpu at this point; that information comes out of the
>> replies on the ring.
>>
>> We also may process multiple replies in this one loop.  In the worse
>> case, we process one reply for every vcpu in d.
>>
>> If the ASSERT() were to be deferred until the middle of the request
>> loop, we could ASSERT(v != current) for every vcpu in d, but that is
>> identical to this single assertion.
> 
> No, it isn't. It would only be if you iterated over all vCPU-s of that
> domain (which as you say may or may not happen).

I will modify to the code to whatever you and Andrew decide is best, but
just in case this helps decide, in my experience iterating over all
VCPUs here will happen more often than not - even looking at
xen-access.c today, it poll()s with a timeout, processes all collected
events (which, if they are sync events - and they always are with our
application - there can be several of them only if they correspond to
different VCPUs), and only then signals the event channel.

But there are of course cases in which less than all the VCPUs have
placed events in the ring buffer.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()

2017-04-27 Thread Razvan Cojocaru
On 04/27/17 12:00, Jan Beulich wrote:
 On 27.04.17 at 10:34,  wrote:
>> On 04/27/17 11:18, Jan Beulich wrote:
>> On 27.04.17 at 10:11,  wrote:
 On 04/27/17 11:01, Jan Beulich wrote:
 On 27.04.17 at 09:22,  wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -473,6 +473,39 @@ static bool hvm_get_pending_event(struct vcpu *v, 
>> struct x86_event *info)
>>  return hvm_funcs.get_pending_event(v, info);
>>  }
>>  
>> +static void hvm_vm_event_set_registers(struct vcpu *v)
>> +{
>> +ASSERT(v == current);
>> +
>> +if ( v->arch.vm_event->set_gprs )
>
> I think we will want an unlikely() here. We anyway can only hope for
> the compiler to always inline this function, such that non-VM-event
> setups don't get penalized by the extra call here. Strictly speaking
> the function doesn't belong into this file ...

 Should I move it to the x86 vm_event code?
>>>
>>> If you do, then you'll want to have an inline wrapper with the if()
>>> in it, so the actual call at the machine level would be avoided in the
>>> common case.
>>
>> Sorry, I'm not sure I understand: if moving it is not required, I'd
>> prefer to leave it where it is, as we already have
>> vm_event_set_registers() in arch/x86/vm_event.c and I feel it would
>> complicate matters there (I'd have to perhaps prepend "hvm_" to it in
>> which case it wouldn't really belong in vm_event.c either). But if it's
>> necessary, I'll move it - do you want me to move it?
> 
> Well, logically this is a VM event function, so belongs into vm_event.c.
> So I'd _prefer_ for it to be moved, but I don't insist (in the end, by
> leaving it where it is, you and Tamas [wrongly imo] are not the
> maintainers of it). But I agree, x86/vm_event.c isn't an ideal place
> either, better would be x86/hvm/vm_event.c.
> 
> Otoh hvm_do_resume() open codes all the CR and MSR writes
> (getting the order wrong as it seems, btw, as some MSR writes
> depend on certain CR settings), so maybe a separate function isn't
> needed at all here? In fact the bulk of the function is VM event
> code, so one might also view it the other way around, and most or
> all of this should move into a new function, for example
> hvm_vm_event_do_resume().

All fair points. If there are no objections, I'll add x86/hvm/vm_event.c
and asm-x86/hvm/vm_event.h, and place hvm_vm_do_resume() in it. I'll
also add the files to MAINTAINERS under vm_event.

In the process of moving the code, I'll also put the MSR write last.

This will then become a 2-patch series, with the 1st patch doing the
above, and the second patch will be this one.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()

2017-04-27 Thread Razvan Cojocaru
On 04/27/17 11:18, Jan Beulich wrote:
 On 27.04.17 at 10:11,  wrote:
>> On 04/27/17 11:01, Jan Beulich wrote:
>> On 27.04.17 at 09:22,  wrote:
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -473,6 +473,39 @@ static bool hvm_get_pending_event(struct vcpu *v, 
 struct x86_event *info)
  return hvm_funcs.get_pending_event(v, info);
  }
  
 +static void hvm_vm_event_set_registers(struct vcpu *v)
 +{
 +ASSERT(v == current);
 +
 +if ( v->arch.vm_event->set_gprs )
>>>
>>> I think we will want an unlikely() here. We anyway can only hope for
>>> the compiler to always inline this function, such that non-VM-event
>>> setups don't get penalized by the extra call here. Strictly speaking
>>> the function doesn't belong into this file ...
>>
>> Should I move it to the x86 vm_event code?
> 
> If you do, then you'll want to have an inline wrapper with the if()
> in it, so the actual call at the machine level would be avoided in the
> common case.

Sorry, I'm not sure I understand: if moving it is not required, I'd
prefer to leave it where it is, as we already have
vm_event_set_registers() in arch/x86/vm_event.c and I feel it would
complicate matters there (I'd have to perhaps prepend "hvm_" to it in
which case it wouldn't really belong in vm_event.c either). But if it's
necessary, I'll move it - do you want me to move it?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()

2017-04-27 Thread Razvan Cojocaru
On 04/27/17 11:01, Jan Beulich wrote:
 On 27.04.17 at 09:22,  wrote:
>> The introspection agent can reply to a vm_event faster than
>> vmx_vmexit_handler() can complete in some cases, where it is then
>> not safe for vm_event_set_registers() to modify v->arch.user_regs.
> 
> This needs more explanation: I cannot see the connection between
> vmx_vmexit_handler() completing and (un)safety of modification of
> v->arch.user_regs. The latter is unsafe as long as the vCPU hasn't
> gone through __context_switch(), and the former doesn't call that
> function directly or indirectly (i.e. I think it is more than just
> vmx_vmexit_handler() which needs to be completed by the time
> register modification becomes safe to do).

Indeed, this is exactly the case (__context_switch() doesn't go
through). I'll re-word the commit message.

>> This patch ensures that vm_event_resume() code only sets per-VCPU
>> data to be used for the actual setting of registers only in
>> hvm_do_resume() (similar to the model used to control setting of CRs
>> and MSRs).
> 
> I think the second "only" would better be "later".

Yes, it's actually redundant (sorry). I'll change it to "later".

>> The patch additionally removes the sync_vcpu_execstate(v) call from
>> vm_event_resume(), which is no longer necessary, which removes the
>> associated broadcast TLB flush (read: performance improvement).
> 
> Depending on the better explanation above, it may or may not be
> further necessary to also better explain the "no longer necessary"
> part here.
> 
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -473,6 +473,39 @@ static bool hvm_get_pending_event(struct vcpu *v, 
>> struct x86_event *info)
>>  return hvm_funcs.get_pending_event(v, info);
>>  }
>>  
>> +static void hvm_vm_event_set_registers(struct vcpu *v)
> 
> This could be const, as *v itself isn't being modified, but maybe it's
> better to leave it non-const indeed.

I have no problem changing it if there are no objections.

>> +{
>> +ASSERT(v == current);
>> +
>> +if ( v->arch.vm_event->set_gprs )
> 
> I think we will want an unlikely() here. We anyway can only hope for
> the compiler to always inline this function, such that non-VM-event
> setups don't get penalized by the extra call here. Strictly speaking
> the function doesn't belong into this file ...

Should I move it to the x86 vm_event code?

>> +{
>> +struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +
>> +regs->rax = v->arch.vm_event->gprs.rax;
>> +regs->rbx = v->arch.vm_event->gprs.rbx;
>> +regs->rcx = v->arch.vm_event->gprs.rcx;
>> +regs->rdx = v->arch.vm_event->gprs.rdx;
>> +regs->rsp = v->arch.vm_event->gprs.rsp;
>> +regs->rbp = v->arch.vm_event->gprs.rbp;
>> +regs->rsi = v->arch.vm_event->gprs.rsi;
>> +regs->rdi = v->arch.vm_event->gprs.rdi;
>> +
>> +regs->r8 = v->arch.vm_event->gprs.r8;
>> +regs->r9 = v->arch.vm_event->gprs.r9;
>> +regs->r10 = v->arch.vm_event->gprs.r10;
>> +regs->r11 = v->arch.vm_event->gprs.r11;
>> +regs->r12 = v->arch.vm_event->gprs.r12;
>> +regs->r13 = v->arch.vm_event->gprs.r13;
>> +regs->r14 = v->arch.vm_event->gprs.r14;
>> +regs->r15 = v->arch.vm_event->gprs.r15;
>> +
>> +regs->rflags = v->arch.vm_event->gprs.rflags;
>> +regs->rip = v->arch.vm_event->gprs.rip;
>> +
>> +v->arch.vm_event->set_gprs = 0;
> 
> false (and true/bool elsewhere)

I'll change it to bool.

>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -357,6 +357,13 @@ void vm_event_resume(struct domain *d, struct 
>> vm_event_domain *ved)
>>  {
>>  vm_event_response_t rsp;
>>  
>> +/*
>> + * vm_event_resume() runs either from XEN_DOMCTL_VM_EVENT_OP_*, or
>> + * EVTCHN_send from the introspection consumer.  Both contexts are
>> + * guaranteed not to be the subject of vm_event responses.
>> + */
>> +ASSERT(d != current->domain);
> 
> What is this meant to guard against? It surely isn't ...
> 
>> @@ -375,13 +382,6 @@ void vm_event_resume(struct domain *d, struct 
>> vm_event_domain *ved)
>>  v = d->vcpu[rsp.vcpu_id];
>>  
>>  /*
>> - * Make sure the vCPU state has been synchronized for the custom
>> - * handlers.
>> - */
>> -if ( atomic_read(>vm_event_pause_count) )
>> -sync_vcpu_execstate(v);
> 
> ... a "replacement" for this, as the state of "current" doesn't reflect
> whether register state has been saved (that's this_cpu(curr_vcpu)
> instead). Nor would a comparison of domains seem to be the right
> thing - a comparison of vcpus ought to suffice (and be less strict,
> allowing for something hypothetical like self-introspection).

This part (the ASSERT and comment) is from Andrew, he can help us here.


Thanks,
Razvan

___
Xen-devel mailing list

[Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()

2017-04-27 Thread Razvan Cojocaru
The introspection agent can reply to a vm_event faster than
vmx_vmexit_handler() can complete in some cases, where it is then
not safe for vm_event_set_registers() to modify v->arch.user_regs.
This patch ensures that vm_event_resume() code only sets per-VCPU
data to be used for the actual setting of registers only in
hvm_do_resume() (similar to the model used to control setting of CRs
and MSRs).
The patch additionally removes the sync_vcpu_execstate(v) call from
vm_event_resume(), which is no longer necessary, which removes the
associated broadcast TLB flush (read: performance improvement).

Signed-off-by: Razvan Cojocaru <rcojoc...@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/hvm.c | 35 +++
 xen/arch/x86/vm_event.c| 22 ++
 xen/common/vm_event.c  | 14 +++---
 xen/include/asm-x86/vm_event.h |  2 ++
 4 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a441955..520942a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -473,6 +473,39 @@ static bool hvm_get_pending_event(struct vcpu *v, struct 
x86_event *info)
 return hvm_funcs.get_pending_event(v, info);
 }
 
+static void hvm_vm_event_set_registers(struct vcpu *v)
+{
+ASSERT(v == current);
+
+if ( v->arch.vm_event->set_gprs )
+{
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+regs->rax = v->arch.vm_event->gprs.rax;
+regs->rbx = v->arch.vm_event->gprs.rbx;
+regs->rcx = v->arch.vm_event->gprs.rcx;
+regs->rdx = v->arch.vm_event->gprs.rdx;
+regs->rsp = v->arch.vm_event->gprs.rsp;
+regs->rbp = v->arch.vm_event->gprs.rbp;
+regs->rsi = v->arch.vm_event->gprs.rsi;
+regs->rdi = v->arch.vm_event->gprs.rdi;
+
+regs->r8 = v->arch.vm_event->gprs.r8;
+regs->r9 = v->arch.vm_event->gprs.r9;
+regs->r10 = v->arch.vm_event->gprs.r10;
+regs->r11 = v->arch.vm_event->gprs.r11;
+regs->r12 = v->arch.vm_event->gprs.r12;
+regs->r13 = v->arch.vm_event->gprs.r13;
+regs->r14 = v->arch.vm_event->gprs.r14;
+regs->r15 = v->arch.vm_event->gprs.r15;
+
+regs->rflags = v->arch.vm_event->gprs.rflags;
+regs->rip = v->arch.vm_event->gprs.rip;
+
+v->arch.vm_event->set_gprs = 0;
+}
+}
+
 void hvm_do_resume(struct vcpu *v)
 {
 check_wakeup_from_wait();
@@ -487,6 +520,8 @@ void hvm_do_resume(struct vcpu *v)
 {
 struct monitor_write_data *w = >arch.vm_event->write_data;
 
+hvm_vm_event_set_registers(v);
+
 if ( unlikely(v->arch.vm_event->emulate_flags) )
 {
 enum emul_kind kind = EMUL_KIND_NORMAL;
diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 502ccc2..f66780a 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -113,26 +113,8 @@ void vm_event_set_registers(struct vcpu *v, 
vm_event_response_t *rsp)
 {
 ASSERT(atomic_read(>vm_event_pause_count));
 
-v->arch.user_regs.rax = rsp->data.regs.x86.rax;
-v->arch.user_regs.rbx = rsp->data.regs.x86.rbx;
-v->arch.user_regs.rcx = rsp->data.regs.x86.rcx;
-v->arch.user_regs.rdx = rsp->data.regs.x86.rdx;
-v->arch.user_regs.rsp = rsp->data.regs.x86.rsp;
-v->arch.user_regs.rbp = rsp->data.regs.x86.rbp;
-v->arch.user_regs.rsi = rsp->data.regs.x86.rsi;
-v->arch.user_regs.rdi = rsp->data.regs.x86.rdi;
-
-v->arch.user_regs.r8 = rsp->data.regs.x86.r8;
-v->arch.user_regs.r9 = rsp->data.regs.x86.r9;
-v->arch.user_regs.r10 = rsp->data.regs.x86.r10;
-v->arch.user_regs.r11 = rsp->data.regs.x86.r11;
-v->arch.user_regs.r12 = rsp->data.regs.x86.r12;
-v->arch.user_regs.r13 = rsp->data.regs.x86.r13;
-v->arch.user_regs.r14 = rsp->data.regs.x86.r14;
-v->arch.user_regs.r15 = rsp->data.regs.x86.r15;
-
-v->arch.user_regs.rflags = rsp->data.regs.x86.rflags;
-v->arch.user_regs.rip = rsp->data.regs.x86.rip;
+v->arch.vm_event->gprs = rsp->data.regs.x86;
+v->arch.vm_event->set_gprs = 1;
 }
 
 void vm_event_monitor_next_interrupt(struct vcpu *v)
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 0fe9a53..498749b 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -357,6 +357,13 @@ void vm_event_resume(struct domain *d, struct 
vm_event_domain *ved)
 {
 vm_event_response_t rsp;
 
+/*
+ * vm_event_resume() runs either from XEN_DOMCTL_VM_EVENT_OP_*, or
+ * EVTCHN_send from the introspection consumer.  Both contexts are
+ * guaranteed not to be the subject 

Re: [Xen-devel] [PATCH v2 3/4] VMX: don't blindly enable descriptor table exiting control

2017-04-18 Thread Razvan Cojocaru
On 04/18/2017 01:32 PM, Jan Beulich wrote:
> This is an optional feature and hence we should check for it before
> use.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> ---
> v2: Re-do detection of availability, resulting in almost all of the
> changes done here being different than in v1.

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] tools: Use POSIX signal.h instead of sys/signal.h

2017-04-18 Thread Razvan Cojocaru
On 04/18/2017 01:03 PM, Wei Liu wrote:
> On Mon, Apr 17, 2017 at 02:33:11PM -0700, Alistair Francis wrote:
>> The POSIX spec specifies to use:
>> #include 
>> instead of:
>> #include 
>> as seen here:
>>http://pubs.opengroup.org/onlinepubs/009695399/functions/signal.html
>>
>> This removes the warning:
>> #warning redirecting incorrect #include  to 
>> when building with the musl C-library.
>>
>> Signed-off-by: Alistair Francis <alistair.fran...@xilinx.com>
> 
> Acked-by: Wei Liu <wei.l...@citrix.com>

FWIW:

Acked-by: Razvan Cojocaru <rcojoc...@bitdefender.com>


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


  1   2   3   4   5   6   7   8   9   >