Re: [Xen-devel] xen-blkfront hang

2017-08-04 Thread Roger Pau Monné
On Fri, Aug 04, 2017 at 05:36:29PM +0800, Dongli Zhang wrote:
> 
> 
> On 08/04/2017 05:13 PM, Roger Pau Monné wrote:
> > On Fri, Aug 04, 2017 at 10:00:09AM +0200, Valentin Vidic wrote:
> >> On Mon, Jul 31, 2017 at 09:09:19AM +0800, Dongli Zhang wrote:
> >>> To verify whether the above patch would help, please check the nr_grant_frames
> >>> value in guest domU. If this value is exactly the same of maximum grant frames
> >>> (by default, xen mainline uses 32) and the number of free grant references is
> >>> very small, the above patch might help.
> >>
> >> You are right, this is what I get after the machine hangs:
> >>
> >>   crash> print nr_grant_frames 
> >>   $1 = 32
> >>   crash> print gnttab_free_count 
> >>   $2 = 9
> >>
> >>> The best way is to increase the gnttab_max_frames to larger value (e.g.,  256)
> >>> in dom0 xen.gz grub.
> >>
> >> Thank you, this seems to help.  The test machine does not hang now and
> >> the numbers are looking better now:
> >>
> >>   crash> print nr_grant_frames 
> >>   $1 = 59
> >>   crash> print gnttab_free_count 
> >>   $2 = 356
> > 
> > At some point I've already expressed my opinion that having a
> > per-queue list of persistent grants was not a good idea. Now I think
> > the only solution is to remove persistent grants, or to lower the
> > default number of per-queue persistent grants to a ridiculously low
> > value.
> 
> It would be more efficient if (1) persistent grants are shared by all queues and

There was a complaint that having a shared pool of persistent grants
introduced too much contention.

> (2) there is a new mechanism to allow frontend to actively ask backend to unmap
> existing persistent grants. So far, it is backend's responsibility to decide
> when to unmap persistent grants.

Hm, I would rather remove them than make the protocol more complex.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] xen-blkfront hang

2017-08-04 Thread Dongli Zhang


On 08/04/2017 05:13 PM, Roger Pau Monné wrote:
> On Fri, Aug 04, 2017 at 10:00:09AM +0200, Valentin Vidic wrote:
>> On Mon, Jul 31, 2017 at 09:09:19AM +0800, Dongli Zhang wrote:
>>> To verify whether the above patch would help, please check the nr_grant_frames
>>> value in guest domU. If this value is exactly the same of maximum grant frames
>>> (by default, xen mainline uses 32) and the number of free grant references is
>>> very small, the above patch might help.
>>
>> You are right, this is what I get after the machine hangs:
>>
>>   crash> print nr_grant_frames 
>>   $1 = 32
>>   crash> print gnttab_free_count 
>>   $2 = 9
>>
>>> The best way is to increase the gnttab_max_frames to larger value (e.g.,  256)
>>> in dom0 xen.gz grub.
>>
>> Thank you, this seems to help.  The test machine does not hang now and
>> the numbers are looking better now:
>>
>>   crash> print nr_grant_frames 
>>   $1 = 59
>>   crash> print gnttab_free_count 
>>   $2 = 356
> 
> At some point I've already expressed my opinion that having a
> per-queue list of persistent grants was not a good idea. Now I think
> the only solution is to remove persistent grants, or to lower the
> default number of per-queue persistent grants to a ridiculously low
> value.

It would be more efficient if (1) persistent grants were shared by all queues and
(2) there were a new mechanism allowing the frontend to actively ask the backend
to unmap existing persistent grants. So far, it is the backend's responsibility
to decide when to unmap persistent grants.

Dongli Zhang

> 
> Roger.
> 



Re: [Xen-devel] xen-blkfront hang

2017-08-04 Thread Roger Pau Monné
On Fri, Aug 04, 2017 at 10:00:09AM +0200, Valentin Vidic wrote:
> On Mon, Jul 31, 2017 at 09:09:19AM +0800, Dongli Zhang wrote:
> > To verify whether the above patch would help, please check the nr_grant_frames
> > value in guest domU. If this value is exactly the same of maximum grant frames
> > (by default, xen mainline uses 32) and the number of free grant references is
> > very small, the above patch might help.
> 
> You are right, this is what I get after the machine hangs:
> 
>   crash> print nr_grant_frames 
>   $1 = 32
>   crash> print gnttab_free_count 
>   $2 = 9
> 
> > The best way is to increase the gnttab_max_frames to larger value (e.g.,  256)
> > in dom0 xen.gz grub.
> 
> Thank you, this seems to help.  The test machine does not hang now and
> the numbers are looking better now:
> 
>   crash> print nr_grant_frames 
>   $1 = 59
>   crash> print gnttab_free_count 
>   $2 = 356

At some point I've already expressed my opinion that having a
per-queue list of persistent grants was not a good idea. Now I think
the only solution is to remove persistent grants, or to lower the
default number of per-queue persistent grants to a ridiculously low
value.
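
A rough sketch of why per-queue lists scale badly follows. The function
names and the example figures (disks, queues, grants per queue) are made up
for illustration and are not taken from blkfront or blkback. The figure of
512 references per frame assumes 4 KiB pages and 8-byte v1 grant entries.

```python
# Rough sketch: worst-case persistent grant demand when every queue keeps
# its own list. All inputs are illustrative assumptions, not driver values.

PAGE_SIZE = 4096          # assumed 4 KiB grant frame
GRANT_ENTRY_V1_SIZE = 8   # assumed size of a v1 grant entry, in bytes
REFS_PER_FRAME = PAGE_SIZE // GRANT_ENTRY_V1_SIZE  # 512 under these assumptions


def persistent_grant_demand(disks, queues_per_disk, grants_per_queue):
    """Worst case: each queue of each disk fills its own persistent list."""
    return disks * queues_per_disk * grants_per_queue


def frames_needed(grant_refs):
    """Grant frames required to hold that many references, rounded up."""
    return -(-grant_refs // REFS_PER_FRAME)


if __name__ == "__main__":
    # Hypothetical guest: 4 disks, 8 queues each, 1024 grants per queue.
    demand = persistent_grant_demand(4, 8, 1024)
    print("grant refs:", demand, "frames:", frames_needed(demand))
```

With these made-up inputs the demand is twice what the default of 32 frames
can hold, which is the kind of multiplication a shared pool would avoid.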

Roger.



Re: [Xen-devel] xen-blkfront hang

2017-08-04 Thread Valentin Vidic
On Mon, Jul 31, 2017 at 09:09:19AM +0800, Dongli Zhang wrote:
> To verify whether the above patch would help, please check the nr_grant_frames
> value in guest domU. If this value is exactly the same of maximum grant frames
> (by default, xen mainline uses 32) and the number of free grant references is
> very small, the above patch might help.

You are right, this is what I get after the machine hangs:

  crash> print nr_grant_frames 
  $1 = 32
  crash> print gnttab_free_count 
  $2 = 9

> The best way is to increase the gnttab_max_frames to larger value (e.g.,  256)
> in dom0 xen.gz grub.

Thank you, this seems to help.  The test machine no longer hangs and
the numbers look better:

  crash> print nr_grant_frames 
  $1 = 59
  crash> print gnttab_free_count 
  $2 = 356
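
For reference, the two pairs of values above can be turned into a usage
summary with a small sketch like the one below. It assumes 4 KiB pages and
8-byte v1 grant entries (512 references per frame) and ignores the few
reserved entries, so the totals are approximate. The function names and the
low_water threshold are hypothetical, chosen only for illustration.

```python
# Sketch: summarise grant table usage from the values printed by crash.
# Assumes 4 KiB frames and 8-byte v1 grant entries; reserved entries ignored.

PAGE_SIZE = 4096
GRANT_ENTRY_V1_SIZE = 8
REFS_PER_FRAME = PAGE_SIZE // GRANT_ENTRY_V1_SIZE  # 512 under these assumptions


def grant_usage(nr_grant_frames, gnttab_free_count):
    """Return (total, used) grant references for the frames allocated so far."""
    total = nr_grant_frames * REFS_PER_FRAME
    return total, total - gnttab_free_count


def looks_exhausted(nr_grant_frames, max_frames, gnttab_free_count, low_water=64):
    """True when the table cannot grow any further and few references are free.

    low_water is an arbitrary illustrative threshold, not a kernel constant.
    """
    return nr_grant_frames >= max_frames and gnttab_free_count < low_water


if __name__ == "__main__":
    # (nr_grant_frames, gnttab_max_frames, gnttab_free_count) from this thread
    for frames, max_frames, free in [(32, 32, 9), (59, 256, 356)]:
        total, used = grant_usage(frames, free)
        print(frames, total, used, looks_exhausted(frames, max_frames, free))
```

The first case (32 of 32 frames, 9 free) matches the condition described for
the hang; the second (59 of 256 frames) still has room to grow.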

-- 
Valentin



Re: [Xen-devel] xen-blkfront hang

2017-07-31 Thread Dongli Zhang
Here are the options:

1. Dump a vmcore of the guest and print nr_grant_frames with the crash utility.

2. Implement a kernel module in the guest to dump nr_grant_frames, if you still
have access to your hung guest domU.

3. There is a new utility in the xen toolstack, tools/misc/xen-diag.c, that dumps
grant table usage for an arbitrary guest domU (including dom0):

./xen-diag gnttab_query_size [domid]

4. If your host's xen toolstack does not have xen-diag, feel free to implement
one yourself via the GNTTABOP_query_size hypercall and compile with -lxenctrl.

Dongli Zhang


On 07/31/2017 02:30 PM, Valentin Vidic wrote:
> On Mon, Jul 31, 2017 at 09:09:19AM +0800, Dongli Zhang wrote:
>> This patch is not able to fix the lack of grant issue permanently. It is used to
>> optimize the utilization of grant table entries.
>>
>> To verify whether the above patch would help, please check the nr_grant_frames
>> value in guest domU. If this value is exactly the same of maximum grant frames
>> (by default, xen mainline uses 32) and the number of free grant references is
>> very small, the above patch might help.
> 
> I can try that, but how do I get the nr_grant_frames value in domU?
> 



Re: [Xen-devel] xen-blkfront hang

2017-07-30 Thread Dongli Zhang
CCed xen-devel so that more people would be able to help.

Dongli Zhang

On 07/31/2017 09:09 AM, Dongli Zhang wrote:
> Hi Valentin,
> 
> On 07/30/2017 03:42 PM, Valentin Vidic wrote:
>> I'm having a problem with a domU hang in disk IO, described here:
>>
>> https://lists.xen.org/archives/html/xen-users/2017-07/msg00057.html
>>
>> Do you think this is a multi-queue issue and applying one of these
>> latest changes would help?
>>
>> xen/blkfront: always allocate grants first from per-queue persistent grants
>> https://github.com/torvalds/linux/commit/bd912ef3e46b6edb51bb8af4b73fd2be7817e305
> 
> This patch is not able to fix the lack of grant issue permanently. It is used to
> optimize the utilization of grant table entries.
> 
> To verify whether the above patch would help, please check the nr_grant_frames
> value in guest domU. If this value is exactly the same of maximum grant frames
> (by default, xen mainline uses 32) and the number of free grant references is
> very small, the above patch might help.
> 
> The best way is to increase the gnttab_max_frames to larger value (e.g.,  256)
> in dom0 xen.gz grub.
> 
> Dongli Zhang
> 
>>
>> xen-blkfront: fix mq start/stop race
>> https://github.com/torvalds/linux/commit/4b422cb99836de3d261faec20a0329385bdec43d
>>
