Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Jan Beulich
>>> On 24.03.17 at 13:35,  wrote:
> Besides, even if p2m_get_ioreq_server() is used here in
> ept_p2m_type_to_flags(), it can only provide limited protection;
> there's a chance that the flag returned in ept_p2m_type_to_flags()
> is outdated in the above situation.
> 
> So, since we do not really care about the p2m entry permission in
> such an extreme situation, and we cannot 100% guarantee the lock
> will protect it, I do not think we need to use the lock here.
> 
> I am not sure if this explanation is convincing to you, but I'm also
> open to being convinced. :-)

Well, okay, keep it as it is then. I'm not fully convinced, but the
change wouldn't buy us much.

Jan
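
To make the locking question above concrete: a minimal, self-contained C sketch of the consistency property being discussed. The names are simplified stand-ins, not the actual Xen code (which uses Xen's own spinlocks and the p2m->ioreq sub-structure).

#include <pthread.h>
#include <stddef.h>

/* Simplified stand-ins for the Xen structures (assumed, not verbatim). */
struct ioreq_server { int id; };

struct p2m_ioreq_binding {
    pthread_mutex_t lock;
    struct ioreq_server *server;  /* NULL when no server is bound */
    unsigned int flags;           /* access types the server emulates */
};

/*
 * The lock makes (server, flags) a consistent pair: a reader can never
 * observe the server of one binding together with the flags of another.
 * It does NOT keep the returned pointer fresh - the server may be
 * unbound right after the unlock, which is the race discussed above.
 */
static struct ioreq_server *
get_ioreq_server(struct p2m_ioreq_binding *b, unsigned int *flags)
{
    struct ioreq_server *s;

    pthread_mutex_lock(&b->lock);
    s = b->server;
    *flags = b->flags;
    pthread_mutex_unlock(&b->lock);

    return s;
}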



Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Yu Zhang



On 3/24/2017 6:19 PM, Jan Beulich wrote:

On 24.03.17 at 10:05,  wrote:

On 3/23/2017 4:57 PM, Jan Beulich wrote:

On 23.03.17 at 04:23,  wrote:

On 3/22/2017 10:21 PM, Jan Beulich wrote:

On 21.03.17 at 03:52,  wrote:

@@ -177,8 +178,64 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        /*
+         * Xen isn't emulating the instruction internally, so see if
+         * there's an ioreq server that can handle it. Rules:
+         *
+         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
+         *   to choose the ioreq server by range. If no server is found,
+         *   the access is ignored.
+         *
+         * - p2m_ioreq_server accesses are handled by the designated
+         *   ioreq_server for the domain, but there are some corner
+         *   cases:
+         *
+         *   - If the domain ioreq_server is NULL, assume there is a
+         *     race between the unbinding of ioreq server and guest fault
+         *     so re-try the instruction.

And that retry won't come back here because of? (The answer
should not include any behavior added by subsequent patches.)

You got me. :)
In this patch, the retry will come back here. It is only after patch 4
or patch 5 that the retry will be ignored (the p2m type is changed back
to p2m_ram_rw after the unbinding).

In which case I think we shouldn't insist on you changing things, but
you should spell out very clearly that this patch should not go in
without the others going in at the same time.

So maybe it would be better to leave the retry part to a later patch,
say patch 4/5 or patch 5/5, and return unhandleable in this patch?

I don't follow. I've specifically suggested that you don't change
the code, but simply state clearly the requirement that patches
2...5 of this series should all go in at the same time. I don't mind
you making changes, but the risk then is that further round trips
may be required because of there being new issues with the
changes you may make.


Thanks, Jan. I'll keep the code, and add a note in the commit message of 
this patch.



--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )

Btw., how does this addition match up with the rc value being
assigned right before the if()?

Well, returning NESTEDHVM_PAGEFAULT_MMIO in such a case will trigger
handle_mmio() later in hvm_hap_nested_page_fault(). I guess that is
what we expected.

That's probably what is expected, but it's no MMIO which we're
doing in that case. And note that we've stopped abusing
handle_mmio() for non-MMIO purposes a little while ago (commit
3dd00f7b56 ["x86/HVM: restrict permitted instructions during
special purpose emulation"]).

OK. So how about we just remove this "*p2mt == p2m_ioreq_server"?

Well, you must have had a reason to add it. To be honest, I don't
care too much about the nested code (as it's far from production
ready anyway), so leaving the code above untouched would be
fine with me, but taking care of adjustments to nested code where
they're actually needed would be even better. So the preferred
option is for you to explain why you've done the change above,
and why you think it's correct/needed. The next best option might
be to drop the change.


Got it. I now prefer to drop the change. This code was added at an early
stage of this patchset, when we hoped p2m_ioreq_server could always
trigger a handle_mmio(); but frankly we do not, and probably will not,
use the nested case in the foreseeable future.


--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->r = entry->w = entry->x = 1;
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
+        case p2m_ioreq_server:
+            entry->r = 1;
+            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);

Is this effectively open coded p2m_get_ioreq_server() actually
okay? If so, why does the function need to be used elsewhere,
instead of doing direct, lock-free accesses?

Maybe your comment is about whether it is necessary to use the lock in
p2m_get_ioreq_server()? I still believe so: it protects not only the
value of the ioreq server, but also the flag together with it.

Besides, it is used not only in the emulation process, but also in the
hypercall to set the mem type. So the lock can still provide some kind
of protection against p2m_set_ioreq_server() - even if it does not
always do so.

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Yu Zhang



On 3/24/2017 5:26 PM, Tian, Kevin wrote:

From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
Sent: Wednesday, March 22, 2017 6:13 PM

On 3/22/2017 3:49 PM, Tian, Kevin wrote:

From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
Sent: Tuesday, March 21, 2017 10:53 AM

A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to let
one ioreq server claim/disclaim its responsibility for the handling
of guest pages with p2m type p2m_ioreq_server. Users of this DMOP can
specify which kind of operation is supposed to be emulated in a
parameter named flags. Currently, this DMOP only supports the emulation
of write operations. And it can be further extended to support the
emulation of read ones if an ioreq server has such a requirement in
the future.

p2m_ioreq_server was already introduced before. Do you want to give
some background on the current state of that type, which would be
helpful in understanding the purpose of this patch?

Sorry? I thought the background was described in the cover letter.
Previously p2m_ioreq_server was only for write-protection, and was
tracked in an ioreq server's rangeset; this patch binds the p2m type
to an ioreq server directly.

The cover letter will not be in the git repo. Better to include this in
the commit message, to make the commit complete on its own.


OK. Thanks.


For now, we only support one ioreq server for this p2m type, so once
an ioreq server has claimed its ownership, subsequent calls of the
XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can also
disclaim the ownership of guest ram pages with p2m_ioreq_server, by
triggering this new DMOP, with ioreq server id set to the current
owner's and flags parameter set to 0.

Note both XEN_DMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVMs with HAP enabled.

Also note that only after one ioreq server claims its ownership of
p2m_ioreq_server, will the p2m type change to p2m_ioreq_server be
allowed.

Signed-off-by: Paul Durrant 
Signed-off-by: Yu Zhang 
Acked-by: Tim Deegan 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Paul Durrant 
Cc: George Dunlap 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: Tim Deegan 

changes in v8:
- According to comments from Jan & Paul: comment changes in
  hvmemul_do_io().
- According to comments from Jan: remove the redundant code which would
  only be useful for read emulations.
- According to comments from Jan: change interface which maps mem type
  to ioreq server; removed uint16_t pad and added a uint64_t opaque.
- Address other comments from Jan, i.e. correct return values; remove
  stray cast.

changes in v7:
- Use new ioreq server interface -
  XEN_DMOP_map_mem_type_to_ioreq_server.
- According to comments from George: removed domain_pause/unpause() in
  hvm_map_mem_type_to_ioreq_server(), because it's too expensive, and
  we can avoid:
  a> the deadlock issue that existed in the v6 patch, between the p2m
     lock and the ioreq server lock, by using these locks in the same
     order - solved in patch 4;
  b> the race condition between vm exit and ioreq server unbinding -
     we can just retry this instruction.
- According to comments from Jan and George: continue to clarify logic
  in hvmemul_do_io().
- According to comments from Jan: clarify comment in
  p2m_set_ioreq_server().

changes in v6:
- Clarify logic in hvmemul_do_io().
- Use recursive lock for ioreq server lock.
- Remove debug print when mapping ioreq server.
- Clarify code in ept_p2m_type_to_flags() for consistency.
- Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
- Add comments for HVMMEM_ioreq_server to note only changes
  to/from HVMMEM_ram_rw are permitted.
- Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
  to avoid the race condition when a vm exit happens on a write-
  protected page, just to find the ioreq server has been unmapped
  already.
- Introduce a separate patch to delay the release of the p2m
  lock to avoid the race condition.
- Introduce a separate patch to handle the read-modify-write
  operations on a write-protected page.

changes in v5:
- Simplify logic in hvmemul_do_io().
- Use natural width types instead of fixed width types when possible.
- Do not grant executable permission for p2m_ioreq_server entries.
- Clarify comments and commit message.
- Introduce a separate patch to recalculate the p2m types after
  the ioreq server unmaps the p2m_ioreq_server.

changes in v4:
- According to Paul's advice, add comments around the definition
  of HVMMEM_ioreq_server in hvm_op.h.
- According to Wei Liu's comments, change the format of the commit
  message.

changes in v3:
- Only support write emulation in this patch;

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Jan Beulich
>>> On 24.03.17 at 10:05,  wrote:
> On 3/23/2017 4:57 PM, Jan Beulich wrote:
> On 23.03.17 at 04:23,  wrote:
>>> On 3/22/2017 10:21 PM, Jan Beulich wrote:
>>> On 21.03.17 at 03:52,  wrote:
> @@ -177,8 +178,64 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        /*
> +         * Xen isn't emulating the instruction internally, so see if
> +         * there's an ioreq server that can handle it. Rules:
> +         *
> +         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
> +         *   to choose the ioreq server by range. If no server is found,
> +         *   the access is ignored.
> +         *
> +         * - p2m_ioreq_server accesses are handled by the designated
> +         *   ioreq_server for the domain, but there are some corner
> +         *   cases:
> +         *
> +         *   - If the domain ioreq_server is NULL, assume there is a
> +         *     race between the unbinding of ioreq server and guest fault
> +         *     so re-try the instruction.
 And that retry won't come back here because of? (The answer
 should not include any behavior added by subsequent patches.)
>>> You got me. :)
>>> In this patch, the retry will come back here. It is only after patch 4
>>> or patch 5 that the retry will be ignored (the p2m type is changed
>>> back to p2m_ram_rw after the unbinding).
>> In which case I think we shouldn't insist on you changing things, but
>> you should spell out very clearly that this patch should not go in
>> without the others going in at the same time.
> 
> So maybe it would be better to leave the retry part to a later patch,
> say patch 4/5 or patch 5/5, and return unhandleable in this patch?

I don't follow. I've specifically suggested that you don't change
the code, but simply state clearly the requirement that patches
2...5 of this series should all go in at the same time. I don't mind
you making changes, but the risk then is that further round trips
may be required because of there being new issues with the
changes you may make.

> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
>      if ( *p2mt == p2m_mmio_direct )
>          goto direct_mmio_out;
>      rc = NESTEDHVM_PAGEFAULT_MMIO;
> -    if ( *p2mt == p2m_mmio_dm )
> +    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
 Btw., how does this addition match up with the rc value being
 assigned right before the if()?
>>> Well, returning NESTEDHVM_PAGEFAULT_MMIO in such a case will trigger
>>> handle_mmio() later in hvm_hap_nested_page_fault(). I guess that is
>>> what we expected.
>> That's probably what is expected, but it's no MMIO which we're
>> doing in that case. And note that we've stopped abusing
>> handle_mmio() for non-MMIO purposes a little while ago (commit
>> 3dd00f7b56 ["x86/HVM: restrict permitted instructions during
>> special purpose emulation"]).
> 
> OK. So how about we just remove this "*p2mt == p2m_ioreq_server"?

Well, you must have had a reason to add it. To be honest, I don't
care too much about the nested code (as it's far from production
ready anyway), so leaving the code above untouched would be
fine with me, but taking care of adjustments to nested code where
they're actually needed would be even better. So the preferred
option is for you to explain why you've done the change above,
and why you think it's correct/needed. The next best option might
be to drop the change.

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
> +        case p2m_ioreq_server:
> +            entry->r = 1;
> +            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
 Is this effectively open coded p2m_get_ioreq_server() actually
 okay? If so, why does the function need to be used elsewhere,
 instead of doing direct, lock-free accesses?
>>> Maybe your comment is about whether it is necessary to use the lock in
>>> p2m_get_ioreq_server()? I still believe so: it protects not only the
>>> value of the ioreq server, but also the flag together with it.
>>>
>>> Besides, it is used not only in the emulation process, but also in the
>>> hypercall to set the mem type. So the lock can still provide some kind
>>> of protection against p2m_set_ioreq_server() - even if it does not
>>> always do so.

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Tian, Kevin
> From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
> Sent: Wednesday, March 22, 2017 6:13 PM
> 
> On 3/22/2017 3:49 PM, Tian, Kevin wrote:
> >> From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
> >> Sent: Tuesday, March 21, 2017 10:53 AM
> >>
> >> A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to let
> >> one ioreq server claim/disclaim its responsibility for the handling
> >> of guest pages with p2m type p2m_ioreq_server. Users of this DMOP can
> >> specify which kind of operation is supposed to be emulated in a
> >> parameter named flags. Currently, this DMOP only supports the
> >> emulation of write operations.
> >> And it can be further extended to support the emulation of read ones
> >> if an ioreq server has such a requirement in the future.
> > p2m_ioreq_server was already introduced before. Do you want to give
> > some background on the current state of that type, which would be
> > helpful in understanding the purpose of this patch?
> 
> Sorry? I thought the background was described in the cover letter.
> Previously p2m_ioreq_server was only for write-protection, and was
> tracked in an ioreq server's rangeset; this patch binds the p2m type
> to an ioreq server directly.

The cover letter will not be in the git repo. Better to include this in
the commit message, to make the commit complete on its own.

> 
> >> For now, we only support one ioreq server for this p2m type, so once
> >> an ioreq server has claimed its ownership, subsequent calls of the
> >> XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can also
> >> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
> >> triggering this new DMOP, with ioreq server id set to the current
> >> owner's and flags parameter set to 0.
> >>
> >> Note both XEN_DMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
> >> are only supported for HVMs with HAP enabled.
> >>
> >> Also note that only after one ioreq server claims its ownership of
> >> p2m_ioreq_server, will the p2m type change to p2m_ioreq_server be
> >> allowed.
> >>
> >> Signed-off-by: Paul Durrant 
> >> Signed-off-by: Yu Zhang 
> >> Acked-by: Tim Deegan 
> >> ---
> >> Cc: Jan Beulich 
> >> Cc: Andrew Cooper 
> >> Cc: Paul Durrant 
> >> Cc: George Dunlap 
> >> Cc: Jun Nakajima 
> >> Cc: Kevin Tian 
> >> Cc: Tim Deegan 
> >>
> >> changes in v8:
> >>- According to comments from Jan & Paul: comments changes in
> >> hvmemul_do_io().
> >>- According to comments from Jan: remove the redundant code which
> >> would only
> >>  be useful for read emulations.
> >>    - According to comments from Jan: change interface which maps mem
> >>      type to ioreq server; removed uint16_t pad and added a uint64_t
> >>      opaque.
> >>    - Address other comments from Jan, i.e. correct return values;
> >>      remove stray cast.
> >>
> >> changes in v7:
> >>- Use new ioreq server interface -
> >> XEN_DMOP_map_mem_type_to_ioreq_server.
> >>- According to comments from George: removed
> >> domain_pause/unpause() in
> >>  hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
> >>  and we can avoid the:
> >>      a> deadlock issue existed in v6 patch, between p2m lock and
> >>         ioreq server lock by using these locks in the same order -
> >>         solved in patch 4;
> >>      b> for race condition between vm exit and ioreq server unbinding,
> >>         we can just retry this instruction.
> >>    - According to comments from Jan and George: continue to clarify
> >>      logic in hvmemul_do_io().
> >>- According to comments from Jan: clarify comment in
> >> p2m_set_ioreq_server().
> >>
> >> changes in v6:
> >>- Clarify logic in hvmemul_do_io().
> >>- Use recursive lock for ioreq server lock.
> >>- Remove debug print when mapping ioreq server.
> >>- Clarify code in ept_p2m_type_to_flags() for consistency.
> >>- Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
> >>- Add comments for HVMMEM_ioreq_server to note only changes
> >>  to/from HVMMEM_ram_rw are permitted.
> >>    - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
> >>  to avoid the race condition when a vm exit happens on a write-
> >>  protected page, just to find the ioreq server has been unmapped
> >>  already.
> >>    - Introduce a separate patch to delay the release of p2m
> >>  lock to avoid the race condition.
> >>    - Introduce a separate patch to handle the read-modify-write
> >>  operations on a write protected page.
> >>
> >> changes in v5:
> >>- Simplify logic in hvmemul_do_io().
> >>    - Use natural width types instead of fixed width types when possible.
> >>- Do not grant executable permission for p2m_ioreq_server entries.
> >>- Clarify comments and commit message.
> >>    - Introduce a separate patch to

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-24 Thread Yu Zhang



On 3/23/2017 4:57 PM, Jan Beulich wrote:

On 23.03.17 at 04:23,  wrote:

On 3/22/2017 10:21 PM, Jan Beulich wrote:

On 21.03.17 at 03:52,  wrote:

---
   xen/arch/x86/hvm/dm.c| 37 ++--
   xen/arch/x86/hvm/emulate.c   | 65 ---
   xen/arch/x86/hvm/ioreq.c | 38 +
   xen/arch/x86/mm/hap/nested_hap.c |  2 +-
   xen/arch/x86/mm/p2m-ept.c|  8 -
   xen/arch/x86/mm/p2m-pt.c | 19 +++
   xen/arch/x86/mm/p2m.c| 74 

   xen/arch/x86/mm/shadow/multi.c   |  3 +-
   xen/include/asm-x86/hvm/ioreq.h  |  2 ++
   xen/include/asm-x86/p2m.h| 26 --
   xen/include/public/hvm/dm_op.h   | 28 +++
   xen/include/public/hvm/hvm_op.h  |  8 -
   12 files changed, 290 insertions(+), 20 deletions(-)

Btw., isn't there a libdevicemodel wrapper missing here for this new
sub-op?

Yes. I planned to add the wrapper code in another patch after this
series is accepted.
Is this a must in this patchset?

I think so, or else the code you add is effectively dead. We should
avoid encouraging people to bypass libxc.


OK. I'll try to add another patch to do so, along with the existing 
ones. Thanks.
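
For context, a sketch of what such a wrapper could look like - hypothetical: the final name, the xendevicemodel_op() marshalling helper and the dm_op structures are assumptions modeled on the existing ioreq-server wrappers in libxendevicemodel, not code from this series:

/* Hypothetical wrapper sketch; xendevicemodel_op() and the dm_op
 * structures are assumed to follow the existing libxendevicemodel
 * pattern for ioreq-server sub-ops. */
int xendevicemodel_map_mem_type_to_ioreq_server(
    xendevicemodel_handle *dmod, domid_t domid, ioservid_t id,
    uint16_t type, uint32_t flags)
{
    struct xen_dm_op op;
    struct xen_dm_op_map_mem_type_to_ioreq_server *data;

    memset(&op, 0, sizeof(op));

    op.op = XEN_DMOP_map_mem_type_to_ioreq_server;
    data = &op.u.map_mem_type_to_ioreq_server;

    data->id = id;
    data->type = type;
    data->flags = flags;

    return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
}

The point for the thread is simply that some such wrapper should land together with the hypervisor side, so the new sub-op is reachable without bypassing the libraries.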

@@ -177,8 +178,64 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        /*
+         * Xen isn't emulating the instruction internally, so see if
+         * there's an ioreq server that can handle it. Rules:
+         *
+         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
+         *   to choose the ioreq server by range. If no server is found,
+         *   the access is ignored.
+         *
+         * - p2m_ioreq_server accesses are handled by the designated
+         *   ioreq_server for the domain, but there are some corner
+         *   cases:
+         *
+         *   - If the domain ioreq_server is NULL, assume there is a
+         *     race between the unbinding of ioreq server and guest fault
+         *     so re-try the instruction.

And that retry won't come back here because of? (The answer
should not include any behavior added by subsequent patches.)

You got me. :)
In this patch, the retry will come back here. It is only after patch 4
or patch 5 that the retry will be ignored (the p2m type is changed back
to p2m_ram_rw after the unbinding).

In which case I think we shouldn't insist on you changing things, but
you should spell out very clearly that this patch should not go in
without the others going in at the same time.


So maybe it would be better to leave the retry part to a later patch,
say patch 4/5 or patch 5/5, and return unhandleable in this patch?


--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )

Btw., how does this addition match up with the rc value being
assigned right before the if()?

Well, returning NESTEDHVM_PAGEFAULT_MMIO in such a case will trigger
handle_mmio() later in hvm_hap_nested_page_fault(). I guess that is
what we expected.

That's probably what is expected, but it's no MMIO which we're
doing in that case. And note that we've stopped abusing
handle_mmio() for non-MMIO purposes a little while ago (commit
3dd00f7b56 ["x86/HVM: restrict permitted instructions during
special purpose emulation"]).


OK. So how about we just remove this "*p2mt == p2m_ioreq_server"?


--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->r = entry->w = entry->x = 1;
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
+        case p2m_ioreq_server:
+            entry->r = 1;
+            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);

Is this effectively open coded p2m_get_ioreq_server() actually
okay? If so, why does the function need to be used elsewhere,
instead of doing direct, lock-free accesses?

Maybe your comment is about whether it is necessary to use the lock in
p2m_get_ioreq_server()? I still believe so: it protects not only the
value of the ioreq server, but also the flag together with it.

Besides, it is used not only in the emulation process, but also in the
hypercall to set the mem type. So the lock can still provide some kind
of protection against p2m_set_ioreq_server() - even if it does not
always do so.

The question, fundamentally, is about consistency: The same
access model should be followed universally, unless 

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-23 Thread Jan Beulich
>>> On 23.03.17 at 04:23,  wrote:
> On 3/22/2017 10:21 PM, Jan Beulich wrote:
> On 21.03.17 at 03:52,  wrote:
>>> ---
>>>   xen/arch/x86/hvm/dm.c| 37 ++--
>>>   xen/arch/x86/hvm/emulate.c   | 65 ---
>>>   xen/arch/x86/hvm/ioreq.c | 38 +
>>>   xen/arch/x86/mm/hap/nested_hap.c |  2 +-
>>>   xen/arch/x86/mm/p2m-ept.c|  8 -
>>>   xen/arch/x86/mm/p2m-pt.c | 19 +++
>>>   xen/arch/x86/mm/p2m.c| 74 
>>> 
>>>   xen/arch/x86/mm/shadow/multi.c   |  3 +-
>>>   xen/include/asm-x86/hvm/ioreq.h  |  2 ++
>>>   xen/include/asm-x86/p2m.h| 26 --
>>>   xen/include/public/hvm/dm_op.h   | 28 +++
>>>   xen/include/public/hvm/hvm_op.h  |  8 -
>>>   12 files changed, 290 insertions(+), 20 deletions(-)
>> Btw., isn't there a libdevicemodel wrapper missing here for this new
>> sub-op?
> 
> Yes. I planned to add the wrapper code in another patch after this series
> is accepted.
> Is this a must in this patchset?

I think so, or else the code you add is effectively dead. We should
avoid encouraging people to bypass libxc.

>>> @@ -177,8 +178,64 @@ static int hvmemul_do_io(
>>>          break;
>>>      case X86EMUL_UNHANDLEABLE:
>>>      {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        /*
>>> +         * Xen isn't emulating the instruction internally, so see if
>>> +         * there's an ioreq server that can handle it. Rules:
>>> +         *
>>> +         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
>>> +         *   to choose the ioreq server by range. If no server is found,
>>> +         *   the access is ignored.
>>> +         *
>>> +         * - p2m_ioreq_server accesses are handled by the designated
>>> +         *   ioreq_server for the domain, but there are some corner
>>> +         *   cases:
>>> +         *
>>> +         *   - If the domain ioreq_server is NULL, assume there is a
>>> +         *     race between the unbinding of ioreq server and guest fault
>>> +         *     so re-try the instruction.
>> And that retry won't come back here because of? (The answer
>> should not include any behavior added by subsequent patches.)
> 
> You got me. :)
> In this patch, the retry will come back here. It is only after patch 4
> or patch 5 that the retry will be ignored (the p2m type is changed back
> to p2m_ram_rw after the unbinding).

In which case I think we shouldn't insist on you changing things, but
you should spell out very clearly that this patch should not go in
without the others going in at the same time.

>>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>>> @@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
>>>      if ( *p2mt == p2m_mmio_direct )
>>>          goto direct_mmio_out;
>>>      rc = NESTEDHVM_PAGEFAULT_MMIO;
>>> -    if ( *p2mt == p2m_mmio_dm )
>>> +    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
>> Btw., how does this addition match up with the rc value being
>> assigned right before the if()?
> 
> Well, returning NESTEDHVM_PAGEFAULT_MMIO in such a case will trigger
> handle_mmio() later in hvm_hap_nested_page_fault(). I guess that is
> what we expected.

That's probably what is expected, but it's no MMIO which we're
doing in that case. And note that we've stopped abusing
handle_mmio() for non-MMIO purposes a little while ago (commit
3dd00f7b56 ["x86/HVM: restrict permitted instructions during
special purpose emulation"]).

>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>>>              entry->r = entry->w = entry->x = 1;
>>>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>>              break;
>>> +        case p2m_ioreq_server:
>>> +            entry->r = 1;
>>> +            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
>> Is this effectively open coded p2m_get_ioreq_server() actually
>> okay? If so, why does the function need to be used elsewhere,
>> instead of doing direct, lock-free accesses?
> 
> Maybe your comment is about whether it is necessary to use the lock in
> p2m_get_ioreq_server()? I still believe so: it protects not only the
> value of the ioreq server, but also the flag together with it.
> 
> Besides, it is used not only in the emulation process, but also in the
> hypercall to set the mem type. So the lock can still provide some kind
> of protection against p2m_set_ioreq_server() - even if it does not
> always do so.

The question, fundamentally, is about consistency: The same
access model should be followed universally, unless there is an
explicit reason for an exception.

Jan


Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-22 Thread Yu Zhang



On 3/22/2017 10:21 PM, Jan Beulich wrote:

On 21.03.17 at 03:52,  wrote:

---
  xen/arch/x86/hvm/dm.c| 37 ++--
  xen/arch/x86/hvm/emulate.c   | 65 ---
  xen/arch/x86/hvm/ioreq.c | 38 +
  xen/arch/x86/mm/hap/nested_hap.c |  2 +-
  xen/arch/x86/mm/p2m-ept.c|  8 -
  xen/arch/x86/mm/p2m-pt.c | 19 +++
  xen/arch/x86/mm/p2m.c| 74 
  xen/arch/x86/mm/shadow/multi.c   |  3 +-
  xen/include/asm-x86/hvm/ioreq.h  |  2 ++
  xen/include/asm-x86/p2m.h| 26 --
  xen/include/public/hvm/dm_op.h   | 28 +++
  xen/include/public/hvm/hvm_op.h  |  8 -
  12 files changed, 290 insertions(+), 20 deletions(-)

Btw., isn't there a libdevicemodel wrapper missing here for this new
sub-op?


Yes. I planned to add the wrapper code in another patch after this
series is accepted.

Is this a must in this patchset?


@@ -177,8 +178,64 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        /*
+         * Xen isn't emulating the instruction internally, so see if
+         * there's an ioreq server that can handle it. Rules:
+         *
+         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
+         *   to choose the ioreq server by range. If no server is found,
+         *   the access is ignored.
+         *
+         * - p2m_ioreq_server accesses are handled by the designated
+         *   ioreq_server for the domain, but there are some corner
+         *   cases:
+         *
+         *   - If the domain ioreq_server is NULL, assume there is a
+         *     race between the unbinding of ioreq server and guest fault
+         *     so re-try the instruction.

And that retry won't come back here because of? (The answer
should not include any behavior added by subsequent patches.)


You got me. :)
In this patch, the retry will come back here. It is only after patch 4
or patch 5 that the retry will be ignored (the p2m type is changed back
to p2m_ram_rw after the unbinding).


+         */
+        struct hvm_ioreq_server *s = NULL;
+        p2m_type_t p2mt = p2m_invalid;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned int flags;
+
+                /*
+                 * Value of s could be stale, when we lost a race
+                 * with dm_op which unmaps p2m_ioreq_server from the
+                 * ioreq server. Yet there's no cheap way to avoid
+                 * this, so device model need to do the check.
+                 */
+                s = p2m_get_ioreq_server(currd, &flags);
+
+                /*
+                 * If p2mt is ioreq_server but ioreq_server is NULL,
+                 * we probably lost a race with unbinding of ioreq
+                 * server, just retry the access.
+                 */

This repeats the earlier comment - please settle on where to state
this, but don't say the exact same thing twice within a few lines of
code.


Thanks, will remove this comment.


+                if ( s == NULL )
+                {
+                    rc = X86EMUL_RETRY;
+                    vio->io_req.state = STATE_IOREQ_NONE;
+                    break;
+                }
+            }
+        }
+
+        /*
+         * Value of s could be stale, when we lost a race with dm_op
+         * which unmaps this PIO/MMIO address from the ioreq server.
+         * The device model side need to do the check.

I think "will do" would be more natural here, or add "anyway" to
the end of the sentence.



Got it. Thanks.


@@ -914,6 +916,42 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    /* For now, only HVMMEM_ioreq_server is supported. */
+    if ( type != HVMMEM_ioreq_server )
+        return -EINVAL;
+
+    /* For now, only write emulation is supported. */
+    if ( flags & ~(XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )

Stray parentheses.


Got it.
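
For the record, the fix being asked for is just dropping the parentheses, i.e.:

+    /* For now, only write emulation is supported. */
+    if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
+        return -EINVAL;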

--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )

Btw., how does this addition match up with the rc value being
assigned right before the if()?


Well returning a 

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-22 Thread Jan Beulich
>>> On 21.03.17 at 03:52,  wrote:
> ---
>  xen/arch/x86/hvm/dm.c| 37 ++--
>  xen/arch/x86/hvm/emulate.c   | 65 ---
>  xen/arch/x86/hvm/ioreq.c | 38 +
>  xen/arch/x86/mm/hap/nested_hap.c |  2 +-
>  xen/arch/x86/mm/p2m-ept.c|  8 -
>  xen/arch/x86/mm/p2m-pt.c | 19 +++
>  xen/arch/x86/mm/p2m.c| 74 
> 
>  xen/arch/x86/mm/shadow/multi.c   |  3 +-
>  xen/include/asm-x86/hvm/ioreq.h  |  2 ++
>  xen/include/asm-x86/p2m.h| 26 --
>  xen/include/public/hvm/dm_op.h   | 28 +++
>  xen/include/public/hvm/hvm_op.h  |  8 -
>  12 files changed, 290 insertions(+), 20 deletions(-)

Btw., isn't there a libdevicemodel wrapper missing here for this new
sub-op?

> @@ -177,8 +178,64 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        /*
> +         * Xen isn't emulating the instruction internally, so see if
> +         * there's an ioreq server that can handle it. Rules:
> +         *
> +         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
> +         *   to choose the ioreq server by range. If no server is found,
> +         *   the access is ignored.
> +         *
> +         * - p2m_ioreq_server accesses are handled by the designated
> +         *   ioreq_server for the domain, but there are some corner
> +         *   cases:
> +         *
> +         *   - If the domain ioreq_server is NULL, assume there is a
> +         *     race between the unbinding of ioreq server and guest fault
> +         *     so re-try the instruction.

And that retry won't come back here because of? (The answer
should not include any behavior added by subsequent patches.)

> +         */
> +        struct hvm_ioreq_server *s = NULL;
> +        p2m_type_t p2mt = p2m_invalid;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            if ( p2mt == p2m_ioreq_server )
> +            {
> +                unsigned int flags;
> +
> +                /*
> +                 * Value of s could be stale, when we lost a race
> +                 * with dm_op which unmaps p2m_ioreq_server from the
> +                 * ioreq server. Yet there's no cheap way to avoid
> +                 * this, so device model need to do the check.
> +                 */
> +                s = p2m_get_ioreq_server(currd, &flags);
> +
> +                /*
> +                 * If p2mt is ioreq_server but ioreq_server is NULL,
> +                 * we probably lost a race with unbinding of ioreq
> +                 * server, just retry the access.
> +                 */

This repeats the earlier comment - please settle on where to state
this, but don't say the exact same thing twice within a few lines of
code.

> +                if ( s == NULL )
> +                {
> +                    rc = X86EMUL_RETRY;
> +                    vio->io_req.state = STATE_IOREQ_NONE;
> +                    break;
> +                }
> +            }
> +        }
> +
> +
> +        /*
> +         * Value of s could be stale, when we lost a race with dm_op
> +         * which unmaps this PIO/MMIO address from the ioreq server.
> +         * The device model side need to do the check.

I think "will do" would be more natural here, or add "anyway" to
the end of the sentence.

> @@ -914,6 +916,42 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>      return rc;
>  }
>  
> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                     uint32_t type, uint32_t flags)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    /* For now, only HVMMEM_ioreq_server is supported. */
> +    if ( type != HVMMEM_ioreq_server )
> +        return -EINVAL;
> +
> +    /* For now, only write emulation is supported. */
> +    if ( flags & ~(XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )

Stray parentheses.

> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
>      if ( *p2mt == p2m_mmio_direct )
>          goto direct_mmio_out;
>      rc = NESTEDHVM_PAGEFAULT_MMIO;
> -    if ( *p2mt == p2m_mmio_dm )
> +    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )

Btw., how does this addition match up with the rc value being
assigned right before the if()?

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = 

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-22 Thread Yu Zhang



On 3/22/2017 3:49 PM, Tian, Kevin wrote:

From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
Sent: Tuesday, March 21, 2017 10:53 AM

A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to let
one ioreq server claim/disclaim its responsibility for the handling of guest
pages with p2m type p2m_ioreq_server. Users of this DMOP can specify
which kind of operation is supposed to be emulated in a parameter named
flags. Currently, this DMOP only supports the emulation of write operations.
And it can be further extended to support the emulation of read ones if an
ioreq server has such a requirement in the future.

p2m_ioreq_server was already introduced before. Do you want to
give some background on the current state of that type, which
would be helpful in understanding the purpose of this patch?


Sorry? I thought the background was described in the cover letter.
Previously p2m_ioreq_server was only for write-protection, and was
tracked in an ioreq server's rangeset; this patch binds the p2m type
to an ioreq server directly.
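
To illustrate the difference in a few lines of C (simplified, assumed names - not the actual Xen structures):

struct ioreq_server;

/*
 * Old model: p2m_ioreq_server pages were plain write-protected RAM, and
 * each emulated write was routed by searching every server's rangeset
 * of tracked gfns.
 */
struct ioreq_server *select_by_rangeset(unsigned long gfn);

/*
 * New model (this patch): the p2m records one designated server, plus
 * the access types it emulates, for the whole p2m_ioreq_server type;
 * a fault on such a page goes straight to that server.
 */
struct p2m_ioreq_binding {
    struct ioreq_server *server;  /* NULL until a server claims the type */
    unsigned int flags;           /* e.g. write-access emulation */
};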


For now, we only support one ioreq server for this p2m type, so once an
ioreq server has claimed its ownership, subsequent calls of the
XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can also
disclaim the ownership of guest ram pages with p2m_ioreq_server, by
triggering this new DMOP, with ioreq server id set to the current owner's and
flags parameter set to 0.

Note both XEN_DMOP_map_mem_type_to_ioreq_server and
p2m_ioreq_server are only supported for HVMs with HAP enabled.

Also note that only after one ioreq server claims its ownership of
p2m_ioreq_server, will the p2m type change to p2m_ioreq_server be
allowed.

Signed-off-by: Paul Durrant 
Signed-off-by: Yu Zhang 
Acked-by: Tim Deegan 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Paul Durrant 
Cc: George Dunlap 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: Tim Deegan 

changes in v8:
  - According to comments from Jan & Paul: comment changes in
    hvmemul_do_io().
  - According to comments from Jan: remove the redundant code which
    would only be useful for read emulations.
  - According to comments from Jan: change interface which maps mem type
    to ioreq server; removed uint16_t pad and added a uint64_t opaque.
  - Address other comments from Jan, i.e. correct return values; remove
    stray cast.

changes in v7:
  - Use new ioreq server interface -
    XEN_DMOP_map_mem_type_to_ioreq_server.
  - According to comments from George: removed domain_pause/unpause()
    in hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
    and we can avoid:
    a> the deadlock issue that existed in the v6 patch, between the p2m
       lock and the ioreq server lock, by using these locks in the same
       order - solved in patch 4;
    b> the race condition between vm exit and ioreq server unbinding -
       we can just retry this instruction.
  - According to comments from Jan and George: continue to clarify logic
    in hvmemul_do_io().
  - According to comments from Jan: clarify comment in
    p2m_set_ioreq_server().

changes in v6:
   - Clarify logic in hvmemul_do_io().
   - Use recursive lock for ioreq server lock.
   - Remove debug print when mapping ioreq server.
   - Clarify code in ept_p2m_type_to_flags() for consistency.
   - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
   - Add comments for HVMMEM_ioreq_server to note only changes
 to/from HVMMEM_ram_rw are permitted.
   - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
 to avoid the race condition when a vm exit happens on a write-
 protected page, just to find the ioreq server has been unmapped
 already.
   - Introduce a separate patch to delay the release of p2m
 lock to avoid the race condition.
   - Introduce a separate patch to handle the read-modify-write
 operations on a write protected page.

changes in v5:
   - Simplify logic in hvmemul_do_io().
   - Use natural width types instead of fixed width types when possible.
   - Do not grant executable permission for p2m_ioreq_server entries.
   - Clarify comments and commit message.
   - Introduce a separate patch to recalculate the p2m types after
 the ioreq server unmaps the p2m_ioreq_server.

changes in v4:
   - According to Paul's advice, add comments around the definition
  of HVMMEM_ioreq_server in hvm_op.h.
   - According to Wei Liu's comments, change the format of the commit
 message.

changes in v3:
   - Only support write emulation in this patch;
   - Remove the code to handle race condition in hvmemul_do_io(),
   - No need to reset the p2m type after an ioreq server has disclaimed
 its ownership of p2m_ioreq_server;
   - Only allow p2m type change to p2m_ioreq_server after an ioreq
 server has claimed its ownership of p2m_ioreq_server;
  

Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-22 Thread Tian, Kevin
> From: Yu Zhang [mailto:yu.c.zh...@linux.intel.com]
> Sent: Tuesday, March 21, 2017 10:53 AM
> 
> A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to let
> one ioreq server claim/disclaim its responsibility for the handling of guest
> pages with p2m type p2m_ioreq_server. Users of this DMOP can specify
> which kind of operation is supposed to be emulated in a parameter named
> flags. Currently, this DMOP only supports the emulation of write operations.
> And it can be further extended to support the emulation of read ones if an
> ioreq server has such a requirement in the future.

p2m_ioreq_server was already introduced before. Do you want to
give some background on the current state of that type, which
would be helpful in understanding the purpose of this patch?

> 
> For now, we only support one ioreq server for this p2m type, so once an
> ioreq server has claimed its ownership, subsequent calls of the
> XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can also
> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
> triggering this new DMOP, with ioreq server id set to the current owner's and
> flags parameter set to 0.
> 
> Note both XEN_DMOP_map_mem_type_to_ioreq_server and
> p2m_ioreq_server are only supported for HVMs with HAP enabled.
> 
> Also note that only after one ioreq server claims its ownership of
> p2m_ioreq_server, will the p2m type change to p2m_ioreq_server be
> allowed.
> 
> Signed-off-by: Paul Durrant 
> Signed-off-by: Yu Zhang 
> Acked-by: Tim Deegan 
> ---
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> Cc: Paul Durrant 
> Cc: George Dunlap 
> Cc: Jun Nakajima 
> Cc: Kevin Tian 
> Cc: Tim Deegan 
> 
> changes in v8:
>   - According to comments from Jan & Paul: comments changes in
> hvmemul_do_io().
>   - According to comments from Jan: remove the redundant code which
> would only
> be useful for read emulations.
>   - According to comments from Jan: change interface which maps mem type
>     to ioreq server; removed uint16_t pad and added a uint64_t opaque.
>   - Address other comments from Jan, i.e. correct return values; remove stray
> cast.
> 
> changes in v7:
>   - Use new ioreq server interface -
> XEN_DMOP_map_mem_type_to_ioreq_server.
>   - According to comments from George: removed domain_pause/unpause()
>     in hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
> and we can avoid the:
>     a> deadlock issue existed in v6 patch, between p2m lock and ioreq
>        server lock by using these locks in the same order - solved in
>        patch 4;
>     b> for race condition between vm exit and ioreq server unbinding,
>        we can just retry this instruction.
>   - According to comments from Jan and George: continue to clarify logic in
> hvmemul_do_io().
>   - According to comments from Jan: clarify comment in
> p2m_set_ioreq_server().
> 
> changes in v6:
>   - Clarify logic in hvmemul_do_io().
>   - Use recursive lock for ioreq server lock.
>   - Remove debug print when mapping ioreq server.
>   - Clarify code in ept_p2m_type_to_flags() for consistency.
>   - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>   - Add comments for HVMMEM_ioreq_server to note only changes
> to/from HVMMEM_ram_rw are permitted.
>   - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
> to avoid the race condition when a vm exit happens on a write-
> protected page, just to find the ioreq server has been unmapped
> already.
>   - Introduce a separate patch to delay the release of p2m
> lock to avoid the race condition.
>   - Introduce a separate patch to handle the read-modify-write
> operations on a write protected page.
> 
> changes in v5:
>   - Simplify logic in hvmemul_do_io().
>   - Use natural width types instead of fixed width types when possible.
>   - Do not grant executable permission for p2m_ioreq_server entries.
>   - Clarify comments and commit message.
>   - Introduce a separate patch to recalculate the p2m types after
> the ioreq server unmaps the p2m_ioreq_server.
> 
> changes in v4:
>   - According to Paul's advice, add comments around the definition
> of HVMMEM_ioreq_server in hvm_op.h.
>   - According to Wei Liu's comments, change the format of the commit
> message.
> 
> changes in v3:
>   - Only support write emulation in this patch;
>   - Remove the code to handle race condition in hvmemul_do_io(),
>   - No need to reset the p2m type after an ioreq server has disclaimed
> its ownership of p2m_ioreq_server;
>   - Only allow p2m type change to p2m_ioreq_server after an ioreq
> server has claimed its ownership of p2m_ioreq_server;
>   - Only allow p2m type change to p2m_ioreq_server from pages with type
> p2m_ram_rw, and vice versa;
>   - 

[Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-20 Thread Yu Zhang
A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to
let one ioreq server claim/disclaim its responsibility for the
handling of guest pages with p2m type p2m_ioreq_server. Users
of this DMOP can specify which kind of operation is supposed
to be emulated in a parameter named flags. Currently, this DMOP
only supports the emulation of write operations. And it can be
further extended to support the emulation of read ones if an
ioreq server has such a requirement in the future.

For now, we only support one ioreq server for this p2m type, so
once an ioreq server has claimed its ownership, subsequent calls
of the XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can
also disclaim the ownership of guest ram pages with p2m_ioreq_server,
by triggering this new DMOP, with ioreq server id set to the current
owner's and flags parameter set to 0.

Note both XEN_DMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVMs with HAP enabled.

Also note that only after one ioreq server claims its ownership
of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
be allowed.
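
For readers of the archive, a sketch of the new sub-op's payload as the commit message and the v8 changelog describe it (field layout and flag values here are assumptions reconstructed from the description, not a verbatim copy of dm_op.h):

#include <stdint.h>

/* Sketch, reconstructed from the commit message and v8 changelog. */
struct xen_dm_op_map_mem_type_to_ioreq_server {
    uint16_t id;      /* IN - ioreq server id */
    uint16_t type;    /* IN - memory type; only HVMMEM_ioreq_server */
    uint32_t flags;   /* IN - types of access to be emulated */
/* Only write emulation is supported by this patch: */
#define XEN_DMOP_IOREQ_MEM_ACCESS_READ  (1u << 0)
#define XEN_DMOP_IOREQ_MEM_ACCESS_WRITE (1u << 1)
    uint64_t opaque;  /* IN/OUT - continuation state; added in v8 */
};

An ioreq server claims the type by passing its id with flags set to XEN_DMOP_IOREQ_MEM_ACCESS_WRITE, and disclaims it with the same call and flags set to 0, as described above.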

Signed-off-by: Paul Durrant 
Signed-off-by: Yu Zhang 
Acked-by: Tim Deegan 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Paul Durrant 
Cc: George Dunlap 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: Tim Deegan 

changes in v8:
  - According to comments from Jan & Paul: comment changes in hvmemul_do_io().
  - According to comments from Jan: remove the redundant code which would only
be useful for read emulations.
  - According to comments from Jan: change interface which maps mem type to
    ioreq server; removed uint16_t pad and added a uint64_t opaque.
  - Address other comments from Jan, i.e. correct return values; remove stray
cast.

changes in v7:
  - Use new ioreq server interface - XEN_DMOP_map_mem_type_to_ioreq_server.
  - According to comments from George: removed domain_pause/unpause() in
hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
and we can avoid the:
a> deadlock issue existed in v6 patch, between p2m lock and ioreq server
   lock by using these locks in the same order - solved in patch 4;
b> for race condition between vm exit and ioreq server unbinding, we can
   just retry this instruction.
  - According to comments from Jan and George: continue to clarify logic in
hvmemul_do_io().
  - According to comments from Jan: clarify comment in p2m_set_ioreq_server().

changes in v6:
  - Clarify logic in hvmemul_do_io().
  - Use recursive lock for ioreq server lock.
  - Remove debug print when mapping ioreq server.
  - Clarify code in ept_p2m_type_to_flags() for consistency.
  - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
  - Add comments for HVMMEM_ioreq_server to note only changes
to/from HVMMEM_ram_rw are permitted.
  - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
to avoid the race condition when a vm exit happens on a write-
protected page, just to find the ioreq server has been unmapped
already.
  - Introduce a separate patch to delay the release of p2m
lock to avoid the race condition.
  - Introduce a separate patch to handle the read-modify-write
operations on a write protected page.

changes in v5:
  - Simplify logic in hvmemul_do_io().
  - Use natural width types instead of fixed width types when possible.
  - Do not grant executable permission for p2m_ioreq_server entries.
  - Clarify comments and commit message.
  - Introduce a separate patch to recalculate the p2m types after
the ioreq server unmaps the p2m_ioreq_server.

changes in v4:
  - According to Paul's advice, add comments around the definition
    of HVMMEM_ioreq_server in hvm_op.h.
  - According to Wei Liu's comments, change the format of the commit
message.

changes in v3:
  - Only support write emulation in this patch;
  - Remove the code to handle race condition in hvmemul_do_io(),
  - No need to reset the p2m type after an ioreq server has disclaimed
its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server after an ioreq
server has claimed its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server from pages with type
p2m_ram_rw, and vice versa;
  - HVMOP_map_mem_type_to_ioreq_server interface change - use uint16,
instead of enum to specify the memory type;
  - Function prototype change to p2m_get_ioreq_server();
  - Coding style changes;
  - Commit message changes;
  - Add Tim's Acked-by.

changes in v2:
  - Only support HAP enabled HVMs;
  - Replace p2m_mem_type_changed() with p2m_change_entry_type_global()
to reset the p2m type, when an ioreq server tries to claim/disclaim
its ownership of p2m_ioreq_server;
  -