RE: [PATCH v2] hv: retry infinitely on hypercall transient failures

2017-01-07 Thread Long Li


> -----Original Message-----
> From: Greg KH [mailto:g...@kroah.com]
> Sent: Friday, January 06, 2017 11:43 PM
> To: Long Li <lon...@microsoft.com>
> Cc: KY Srinivasan <k...@microsoft.com>; Haiyang Zhang
> <haiya...@microsoft.com>; de...@linuxdriverproject.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient failures
> 
> On Sat, Jan 07, 2017 at 07:23:14AM +, Long Li wrote:
> > > -----Original Message-----
> > > From: Greg KH [mailto:g...@kroah.com]
> > > Sent: Wednesday, January 04, 2017 11:48 PM
> > > To: Long Li <lon...@microsoft.com>
> > > Cc: KY Srinivasan <k...@microsoft.com>; Haiyang Zhang
> > > <haiya...@microsoft.com>; de...@linuxdriverproject.org; linux-
> > > ker...@vger.kernel.org
> > > Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient
> > > failures
> > >
> > > On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote:
> > > > From: Long Li <lon...@microsoft.com>
> > > >
> > > > Hyper-v host guarantees that a hypercall will finish in reasonable time.
> > > > > Retry infinitely on transient failures to avoid returning error to
> > > > > upper layer.
> > > >
> > > > Again, never retry "forever", always have a way out, otherwise you
> > > > will crash.
> > >
> > > And again, why are you making this change?  What problem does it solve?
> >
> > The problem it tries to solve is that in this code we are returning
> > error prematurely on transient failures. The hypercall is used mostly
> > in channel establishment. If we return a transient failure, the VM may
> > not boot, or may not be useful after boot because some devices are
> > missing.
> >
> > Another approach is to increase the number of retries. But we don't
> > know how many retries are safe, and the Windows host side expects the
> > guest to retry infinitely and not return an error on transient failures.
> 
> That implies a lot of trust in the host side, don't you think?
> 
> Worst case, make the delay a minute or so, but give the system a way out
> in case there's a bug in the host.  As there will be bugs in the host,
> just like there are bugs in the client :)

This makes sense. 1 minute is a long time for a hypercall. I will send V3.

> 
> thanks,
> 
> greg k-h
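
A minimal user-space sketch of the direction agreed above: keep retrying on
transient failures, but only until a time budget runs out, so that a host bug
cannot hang the guest forever. This is not the V3 patch and not the Hyper-V
driver code. The status codes, hc_retry_bounded(), the two fake_* callbacks,
and the POSIX clock and sleep helpers are all assumptions made for
illustration; kernel code would use its own time and delay primitives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical status codes; these are NOT the real Hyper-V values. */
enum hc_status { HC_SUCCESS, HC_TRANSIENT, HC_FATAL, HC_TIMED_OUT };

/* Hypothetical callback type standing in for "issue the hypercall once". */
typedef enum hc_status (*hc_fn)(void *ctx);

/* Monotonic clock in milliseconds, immune to wall-clock adjustments. */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/*
 * Retry while the call reports a transient failure, but give up once
 * budget_ms has elapsed. Success and non-transient errors return at once.
 */
static enum hc_status hc_retry_bounded(hc_fn call, void *ctx,
				       int64_t budget_ms, long delay_ms)
{
	assert(call != NULL);

	int64_t deadline = now_ms() + budget_ms;

	for (;;) {
		enum hc_status st = call(ctx);

		if (st != HC_TRANSIENT)
			return st;
		if (now_ms() >= deadline)
			return HC_TIMED_OUT;	/* the "way out" */
		sleep_ms(delay_ms);
	}
}

/* Demo stand-in: fails transiently twice, then succeeds. */
static enum hc_status fake_succeeds_on_third(void *ctx)
{
	int *calls = ctx;

	return (++*calls >= 3) ? HC_SUCCESS : HC_TRANSIENT;
}

/* Demo stand-in: models a buggy host that never recovers. */
static enum hc_status fake_always_transient(void *ctx)
{
	int *calls = ctx;

	++*calls;
	return HC_TRANSIENT;
}
```

With a budget of 60000 ms this corresponds to the "a minute or so" suggestion.
The caller still has to handle HC_TIMED_OUT, which is the point of having a
way out.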


Re: [PATCH v2] hv: retry infinitely on hypercall transient failures

2017-01-06 Thread Greg KH
On Sat, Jan 07, 2017 at 07:23:14AM +, Long Li wrote:
> > -----Original Message-----
> > From: Greg KH [mailto:g...@kroah.com]
> > Sent: Wednesday, January 04, 2017 11:48 PM
> > To: Long Li <lon...@microsoft.com>
> > Cc: KY Srinivasan <k...@microsoft.com>; Haiyang Zhang
> > <haiya...@microsoft.com>; de...@linuxdriverproject.org; linux-
> > ker...@vger.kernel.org
> > Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient failures
> > 
> > On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote:
> > > From: Long Li <lon...@microsoft.com>
> > >
> > > Hyper-v host guarantees that a hypercall will finish in reasonable time.
> > > Retry infinitely on transient failures to avoid returning error to upper 
> > > layer.
> > 
> > Again, never retry "forever", always have a way out, otherwise you will 
> > crash.
> > 
> > And again, why are you making this change?  What problem does it solve?
> 
> The problem it tries to solve is that in this code we are returning
> error prematurely on transient failures. The hypercall is used mostly
> in channel establishment. If we return a transient failure, the VM may
> not boot, or may not be useful after boot because some devices are
> missing.
> 
> Another approach is to increase the number of retries. But we don't
> know how many retries are safe, and the Windows host side expects the
> guest to retry infinitely and not return an error on transient failures.

That implies a lot of trust in the host side, don't you think?

Worst case, make the delay a minute or so, but give the system a way out
in case there's a bug in the host.  As there will be bugs in the host,
just like there are bugs in the client :)

thanks,

greg k-h


RE: [PATCH v2] hv: retry infinitely on hypercall transient failures

2017-01-06 Thread Long Li
> -----Original Message-----
> From: Greg KH [mailto:g...@kroah.com]
> Sent: Wednesday, January 04, 2017 11:48 PM
> To: Long Li <lon...@microsoft.com>
> Cc: KY Srinivasan <k...@microsoft.com>; Haiyang Zhang
> <haiya...@microsoft.com>; de...@linuxdriverproject.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient failures
> 
> On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote:
> > From: Long Li <lon...@microsoft.com>
> >
> > Hyper-v host guarantees that a hypercall will finish in reasonable time.
> > Retry infinitely on transient failures to avoid returning error to upper 
> > layer.
> 
> Again, never retry "forever", always have a way out, otherwise you will crash.
> 
> And again, why are you making this change?  What problem does it solve?

The problem it tries to solve is that in this code we are returning error 
prematurely on transient failures. The hypercall is used mostly in channel 
establishment. If we return a transient failure, the VM may not boot, or may
not be useful after boot because some devices are missing.

Another approach is to increase the number of retries. But we don't know how
many retries are safe, and the Windows host side expects the guest to retry
infinitely and not return an error on transient failures.

> 
> greg k-h


Re: [PATCH v2] hv: retry infinitely on hypercall transient failures

2017-01-04 Thread Greg KH
On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote:
> From: Long Li 
> 
> Hyper-v host guarantees that a hypercall will finish in reasonable time.
> Retry infinitely on transient failures to avoid returning error to upper 
> layer.

Again, never retry "forever", always have a way out, otherwise you will
crash.

And again, why are you making this change?  What problem does it solve?

greg k-h

