Re: [Xen-devel] [XenSummit 2017] Notes from the PVH performance session

2017-07-17 Thread Juergen Gross
On 17/07/17 12:15, Andrew Cooper wrote:
> On 17/07/17 11:09, Juergen Gross wrote:
>> Hey,
>>
>> I took a few notes at the PVH performance session at the summit.
>> I hope there isn't any major stuff missing...
>>
>> Participants (at least naming the active ones): Andrew Cooper,
>> Jan Beulich, Paul Durrant, Roger Pau Monné and myself (the list is
>> just from my memory).
>>
>> The following performance problems with PVH, especially when used
>> as Dom0 or in driver domains, have been named as expected:
>>
>> - Domain creation will be slower compared to PV Dom0, especially as
>>   hypercalls are much more expensive in PVH. Most calls into the
>>   hypervisor will result from hypercall continuations. Measurements
>>   with a PVH Dom0 based on BSD by Roger showed a slowdown by a
>>   factor of about 3-4 for domain creation.
>>
>> - Live migration will have the same issues as domain creation;
>>   additionally, mapping/unmapping the guest's memory will add more
>>   overhead.
>>
>> - Backends for PV devices will suffer from worse hypercall performance
>>   as well; event channel operations, maps and unmaps have been named
>>   in particular.
>>
>> The following tuning options have been suggested:
>>
>> - For live migration add a "mem copy" option similar to "grant copy".
>>   This avoids one hypercall compared to "map, copy, unmap" done today.
> 
> I presume you mean "This would be one single hypercall as opposed to the
> three done today" ?

One instead of the two of today (copy isn't a hypercall, of course). So
one hypercall would be avoided.
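
To make the count concrete, here is a toy model in Python. It is only a sketch and does not talk to Xen: FakeHypervisor, map_foreign, unmap_foreign and mem_copy are hypothetical names chosen for illustration, and the "mem copy" operation is the proposal from the notes, not an existing interface.

```python
# Toy model only: it counts simulated hypercalls and never talks to Xen.
# All names below are hypothetical and chosen for illustration.

class FakeHypervisor:
    def __init__(self, guest_pages):
        self.guest_pages = guest_pages  # page number -> page contents (bytes)
        self.hypercalls = 0             # number of simulated hypercalls

    def map_foreign(self, pfn):
        self.hypercalls += 1
        return self.guest_pages[pfn]

    def unmap_foreign(self, pfn):
        self.hypercalls += 1

    def mem_copy(self, pfn):
        # Proposed single operation: the hypervisor does the copy itself.
        self.hypercalls += 1
        return bytes(self.guest_pages[pfn])


def save_page_map_copy_unmap(hv, pfn):
    mapping = hv.map_foreign(pfn)   # hypercall 1
    data = bytes(mapping)           # plain memory copy, not a hypercall
    hv.unmap_foreign(pfn)           # hypercall 2
    return data


def save_page_mem_copy(hv, pfn):
    return hv.mem_copy(pfn)         # the only hypercall


def count_hypercalls(save_page, n_pages):
    pages = {pfn: bytes([pfn % 256]) * 4096 for pfn in range(n_pages)}
    hv = FakeHypervisor(pages)
    for pfn in range(n_pages):
        assert save_page(hv, pfn) == pages[pfn]
    return hv.hypercalls
```

Calling count_hypercalls(save_page_map_copy_unmap, n) gives two hypercalls per page, while count_hypercalls(save_page_mem_copy, n) gives one per page, which is the "one hypercall avoided" from above.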

>> - For domain creation a possible solution could be a service domain
>>   doing the bulk of the hypercalls (this service domain would be
>>   PV again, so Wei's idea of PV inside of a PVH container is not an
>>   option then). Other ideas are asynchronous hypercalls (via a
>>   hypercall ring), but this would require some kind of service-vcpu
>>   in the hypervisor.
>>   This topic has to be discussed further.
>>
>> - Backend performance could be enhanced by using "grant copy" instead
>>   of "map, use, unmap". OTOH this adds the need for bounce buffers.
>>   Depending on the backend type this might be a good idea, though.
>>
>> - A general way to speed up some hypercalls might be the handling of
>>   hypercall parameters: for some hypercalls parameters could be passed
>>   in registers instead of guest memory. This would remove the need for
>>   walking the guest's page tables when retrieving those parameters.
>>   Hypercalls requiring memory parameters can be sped up by registering
>>   the memory buffers and just referencing those buffers when doing the
>>   hypercalls. The buffers could be kept mapped in the hypervisor so
>>   again there would be no need to walk the guest's page tables on a hot
>>   path. Another possibility would be to use guest physical addresses
>>   as hypercall parameters.
>>   This should be sorted out and implemented in 4.10 IMO.
> 
> And some initial patches have already been posted :)

Indeed. :-)


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XenSummit 2017] Notes from the PVH performance session

2017-07-17 Thread Andrew Cooper
On 17/07/17 11:09, Juergen Gross wrote:
> Hey,
>
> I took a few notes at the PVH performance session at the summit.
> I hope there isn't any major stuff missing...
>
> Participants (at least naming the active ones): Andrew Cooper,
> Jan Beulich, Paul Durrant, Roger Pau Monné and myself (the list is
> just from my memory).
>
> The following performance problems with PVH, especially when used
> as Dom0 or in driver domains, have been named as expected:
>
> - Domain creation will be slower compared to PV Dom0, especially as
>   hypercalls are much more expensive in PVH. Most calls into the
>   hypervisor will result from hypercall continuations. Measurements
>   with a PVH Dom0 based on BSD by Roger showed a slowdown by a
>   factor of about 3-4 for domain creation.
>
> - Live migration will have the same issues as domain creation;
>   additionally, mapping/unmapping the guest's memory will add more
>   overhead.
>
> - Backends for PV devices will suffer from worse hypercall performance
>   as well; event channel operations, maps and unmaps have been named
>   in particular.
>
> The following tuning options have been suggested:
>
> - For live migration add a "mem copy" option similar to "grant copy".
>   This avoids one hypercall compared to "map, copy, unmap" done today.

I presume you mean "This would be one single hypercall as opposed to the
three done today" ?

>
> - For domain creation a possible solution could be a service domain
>   doing the bulk of the hypercalls (this service domain would be
>   PV again, so Wei's idea of PV inside of a PVH container is not an
>   option then). Other ideas are asynchronous hypercalls (via a
>   hypercall ring), but this would require some kind of service-vcpu
>   in the hypervisor.
>   This topic has to be discussed further.
>
> - Backend performance could be enhanced by using "grant copy" instead
>   of "map, use, unmap". OTOH this adds the need for bounce buffers.
>   Depending on the backend type this might be a good idea, though.
>
> - A general way to speed up some hypercalls might be the handling of
>   hypercall parameters: for some hypercalls parameters could be passed
>   in registers instead of guest memory. This would remove the need for
>   walking the guest's page tables when retrieving those parameters.
>   Hypercalls requiring memory parameters can be sped up by registering
>   the memory buffers and just referencing those buffers when doing the
>   hypercalls. The buffers could be kept mapped in the hypervisor so
>   again there would be no need to walk the guest's page tables on a hot
>   path. Another possibility would be to use guest physical addresses
>   as hypercall parameters.
>   This should be sorted out and implemented in 4.10 IMO.

And some initial patches have already been posted :)

~Andrew



[Xen-devel] [XenSummit 2017] Notes from the PVH performance session

2017-07-17 Thread Juergen Gross
Hey,

I took a few notes at the PVH performance session at the summit.
I hope there isn't any major stuff missing...

Participants (at least naming the active ones): Andrew Cooper,
Jan Beulich, Paul Durrant, Roger Pau Monné and myself (the list is
just from my memory).

The following performance problems with PVH, especially when used
as Dom0 or in driver domains, have been named as expected:

- Domain creation will be slower compared to PV Dom0, especially as
  hypercalls are much more expensive in PVH. Most calls into the
  hypervisor will result from hypercall continuations. Measurements
  with a PVH Dom0 based on BSD by Roger showed a slowdown by a
  factor of about 3-4 for domain creation.

- Live migration will have the same issues as domain creation;
  additionally, mapping/unmapping the guest's memory will add more
  overhead.

- Backends for PV devices will suffer from worse hypercall performance
  as well; event channel operations, maps and unmaps have been named
  in particular.

The following tuning options have been suggested:

- For live migration add a "mem copy" option similar to "grant copy".
  This avoids one hypercall compared to "map, copy, unmap" done today.

- For domain creation a possible solution could be a service domain
  doing the bulk of the hypercalls (this service domain would be
  PV again, so Wei's idea of PV inside of a PVH container is not an
  option then). Other ideas are asynchronous hypercalls (via a
  hypercall ring), but this would require some kind of service-vcpu
  in the hypervisor.
  This topic has to be discussed further.

- Backend performance could be enhanced by using "grant copy" instead
  of "map, use, unmap". OTOH this adds the need for bounce buffers.
  Depending on the backend type this might be a good idea, though.

- A general way to speed up some hypercalls might be the handling of
  hypercall parameters: for some hypercalls parameters could be passed
  in registers instead of guest memory. This would remove the need for
  walking the guest's page tables when retrieving those parameters.
  Hypercalls requiring memory parameters can be sped up by registering
  the memory buffers and just referencing those buffers when doing the
  hypercalls. The buffers could be kept mapped in the hypervisor so
  again there would be no need to walk the guest's page tables on a hot
  path. Another possibility would be to use guest physical addresses
  as hypercall parameters.
  This should be sorted out and implemented in 4.10 IMO.
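
The three ways of passing memory parameters in the last item can be compared with a small toy model. This is only a sketch in Python and not Xen code: ToyHypervisor and all its methods are hypothetical, and a "walk" here is just a counter standing in for a walk of the guest's page tables.

```python
# Toy model only; every name here is hypothetical, nothing is a Xen interface.

class ToyHypervisor:
    def __init__(self, page_table, memory):
        self.page_table = page_table  # guest virtual page -> guest physical page
        self.memory = memory          # guest physical page -> parameter data
        self.walks = 0                # simulated guest page table walks
        self.registered = {}          # handle -> guest physical page (kept mapped)

    def _walk(self, vpage):
        self.walks += 1
        return self.page_table[vpage]

    def hypercall_by_vaddr(self, vpage):
        # Parameter referenced by guest virtual address: one walk per call.
        return self.memory[self._walk(vpage)]

    def register_buffer(self, vpage):
        # One walk at registration time; the result is remembered.
        handle = len(self.registered)
        self.registered[handle] = self._walk(vpage)
        return handle

    def hypercall_by_handle(self, handle):
        # Hot path: only a lookup of the registered buffer, no walk.
        return self.memory[self.registered[handle]]

    def hypercall_by_gpa(self, ppage):
        # Guest passes a guest physical address directly: no walk at all.
        return self.memory[ppage]


def walks_for(n_calls):
    page_table = {0x10: 0x99}
    memory = {0x99: b"params"}
    results = {}

    hv = ToyHypervisor(page_table, memory)
    for _ in range(n_calls):
        assert hv.hypercall_by_vaddr(0x10) == b"params"
    results["virtual address"] = hv.walks

    hv = ToyHypervisor(page_table, memory)
    handle = hv.register_buffer(0x10)
    for _ in range(n_calls):
        assert hv.hypercall_by_handle(handle) == b"params"
    results["registered buffer"] = hv.walks

    hv = ToyHypervisor(page_table, memory)
    for _ in range(n_calls):
        assert hv.hypercall_by_gpa(0x99) == b"params"
    results["guest physical address"] = hv.walks

    return results
```

In this model walks_for(n) reports n walks for virtual addresses, a single walk (at registration) for a registered buffer, and none for guest physical addresses. It says nothing about the real cost of each walk or of keeping buffers mapped.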

I hope I didn't forget anything.


Juergen
