Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 10 December 2019 09:45
> To: Durrant, Paul 
> Cc: Jürgen Groß ; Tian, Kevin ;
> Andrew Cooper ; xen-devel@lists.xenproject.org;
> Roger Pau Monné ; Wei Liu 
> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
> quarantine domain
> 
> On 10.12.2019 10:16, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Jürgen Groß 
> >> Sent: 10 December 2019 09:07
> >> To: Jan Beulich 
> >> Cc: Tian, Kevin ; Durrant, Paul
> >> ; Andrew Cooper ; xen-
> >> de...@lists.xenproject.org; Roger Pau Monné ; Wei
> >> Liu 
> >> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
> >> quarantine domain
> >>
> >> On 10.12.19 09:57, Jan Beulich wrote:
> >>> On 10.12.2019 09:12, Jürgen Groß wrote:
>  On 10.12.19 09:05, Jan Beulich wrote:
> > On 10.12.2019 08:16, Tian, Kevin wrote:
> >> While the quarantine idea sounds good overall, I'm still not
> >> convinced
> >> to have it the only way in place just for handling some known-buggy
> >> device. It kills the possibility of identifying a new buggy device
> >> and then
> >> deciding not to use it in the first space... I thought about
> whether
> >> it
> >> will get better when future IOMMU implements A/D bit - by checking
> >> access bit being set then we'll know some buggy device exists, but,
> >> the scratch page is shared by all devices then we cannot rely on
> this
> >> feature to find out the actual buggy one.
> >
> > Thinking about it - yes, I think I agree. This (as with so many
> > workarounds) would better be an off-by-default one. The main issue
> > I understand this would have is that buggy systems then might hang
> > without even having managed to get a log message out - Paul?
> >
> > Jürgen - would you be amenable to an almost last minute refinement
> > here (would then also need to still be backported to 4.12.2, or
> > the original backport reverted, to avoid giving the impression of
> > a regression)?
> 
>  So what is your suggestion here? To have a boot option (defaulting to
>  off) for enabling the scratch page?
> >>>
> >>> Yes (and despite having seen Paul's reply).
> >>
> >> I'd release ack such a patch in case you come to an agreement regarding
> >> the default soon.
> >>
> >
> > Ok. The default is not that crucial. Perhaps it's just me who thinks
> > defaults should be chosen on the basis of being most likely to result
> > in a working system.
> 
> If it wasn't for quirky hardware (or firmware to cover the general case,
> in particular to avoid getting quoted on this wrt my position on EFI
> workarounds), I'd agree. But personally I think Kevin's point takes
> priority here: Admins should at least be aware of running quirky
> hardware, and hence I'd prefer the default to be logging of faults
> rather than their silencing. Documentation of the new (sub-)option may
> give suitable hints, and we may even go as far as providing a Kconfig
> option for the default to be chosen at build time.
> 
> Main question now is - who's going to make a patch? Will you? Should I?
> 

I'm happy to do it, but it would probably be more expedient if you did.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Jan Beulich
On 10.12.2019 10:16, Durrant, Paul wrote:
>> -Original Message-
>> From: Jürgen Groß 
>> Sent: 10 December 2019 09:07
>> To: Jan Beulich 
>> Cc: Tian, Kevin ; Durrant, Paul
>> ; Andrew Cooper ; xen-
>> de...@lists.xenproject.org; Roger Pau Monné ; Wei
>> Liu 
>> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
>> quarantine domain
>>
>> On 10.12.19 09:57, Jan Beulich wrote:
>>> On 10.12.2019 09:12, Jürgen Groß wrote:
 On 10.12.19 09:05, Jan Beulich wrote:
> On 10.12.2019 08:16, Tian, Kevin wrote:
>> While the quarantine idea sounds good overall, I'm still not
>> convinced
>> to have it the only way in place just for handling some known-buggy
>> device. It kills the possibility of identifying a new buggy device
>> and then
>> deciding not to use it in the first space... I thought about whether
>> it
>> will get better when future IOMMU implements A/D bit - by checking
>> access bit being set then we'll know some buggy device exists, but,
>> the scratch page is shared by all devices then we cannot rely on this
>> feature to find out the actual buggy one.
>
> Thinking about it - yes, I think I agree. This (as with so many
> workarounds) would better be an off-by-default one. The main issue
> I understand this would have is that buggy systems then might hang
> without even having managed to get a log message out - Paul?
>
> Jürgen - would you be amenable to an almost last minute refinement
> here (would then also need to still be backported to 4.12.2, or
> the original backport reverted, to avoid giving the impression of
> a regression)?

 So what is your suggestion here? To have a boot option (defaulting to
 off) for enabling the scratch page?
>>>
>>> Yes (and despite having seen Paul's reply).
>>
>> I'd release ack such a patch in case you come to an agreement regarding
>> the default soon.
>>
> 
> Ok. The default is not that crucial. Perhaps it's just me who thinks
> defaults should be chosen on the basis of being most likely to result
> in a working system.

If it wasn't for quirky hardware (or firmware to cover the general case,
in particular to avoid getting quoted on this wrt my position on EFI
workarounds), I'd agree. But personally I think Kevin's point takes
priority here: Admins should at least be aware of running quirky
hardware, and hence I'd prefer the default to be logging of faults
rather than their silencing. Documentation of the new (sub-)option may
give suitable hints, and we may even go as far as providing a Kconfig
option for the default to be chosen at build time.

Main question now is - who's going to make a patch? Will you? Should I?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Durrant, Paul
> -Original Message-
> From: Jürgen Groß 
> Sent: 10 December 2019 09:07
> To: Jan Beulich 
> Cc: Tian, Kevin ; Durrant, Paul
> ; Andrew Cooper ; xen-
> de...@lists.xenproject.org; Roger Pau Monné ; Wei
> Liu 
> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
> quarantine domain
> 
> On 10.12.19 09:57, Jan Beulich wrote:
> > On 10.12.2019 09:12, Jürgen Groß wrote:
> >> On 10.12.19 09:05, Jan Beulich wrote:
> >>> On 10.12.2019 08:16, Tian, Kevin wrote:
>  While the quarantine idea sounds good overall, I'm still not
> convinced
>  to have it the only way in place just for handling some known-buggy
>  device. It kills the possibility of identifying a new buggy device
> and then
>  deciding not to use it in the first space... I thought about whether
> it
>  will get better when future IOMMU implements A/D bit - by checking
>  access bit being set then we'll know some buggy device exists, but,
>  the scratch page is shared by all devices then we cannot rely on this
>  feature to find out the actual buggy one.
> >>>
> >>> Thinking about it - yes, I think I agree. This (as with so many
> >>> workarounds) would better be an off-by-default one. The main issue
> >>> I understand this would have is that buggy systems then might hang
> >>> without even having managed to get a log message out - Paul?
> >>>
> >>> Jürgen - would you be amenable to an almost last minute refinement
> >>> here (would then also need to still be backported to 4.12.2, or
> >>> the original backport reverted, to avoid giving the impression of
> >>> a regression)?
> >>
> >> So what is your suggestion here? To have a boot option (defaulting to
> >> off) for enabling the scratch page?
> >
> > Yes (and despite having seen Paul's reply).
> 
> I'd release ack such a patch in case you come to an agreement regarding
> the default soon.
> 

Ok. The default is not that crucial. Perhaps it's just me who thinks defaults 
should be chosen on the basis of being most likely to result in a working 
system.

  Paul

> 
> Juergen
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Jürgen Groß

On 10.12.19 09:57, Jan Beulich wrote:

On 10.12.2019 09:12, Jürgen Groß wrote:

On 10.12.19 09:05, Jan Beulich wrote:

On 10.12.2019 08:16, Tian, Kevin wrote:

While the quarantine idea sounds good overall, I'm still not convinced
to have it the only way in place just for handling some known-buggy
device. It kills the possibility of identifying a new buggy device and then
deciding not to use it in the first space... I thought about whether it
will get better when future IOMMU implements A/D bit - by checking
access bit being set then we'll know some buggy device exists, but,
the scratch page is shared by all devices then we cannot rely on this
feature to find out the actual buggy one.


Thinking about it - yes, I think I agree. This (as with so many
workarounds) would better be an off-by-default one. The main issue
I understand this would have is that buggy systems then might hang
without even having managed to get a log message out - Paul?

Jürgen - would you be amenable to an almost last minute refinement
here (would then also need to still be backported to 4.12.2, or
the original backport reverted, to avoid giving the impression of
a regression)?


So what is your suggestion here? To have a boot option (defaulting to
off) for enabling the scratch page?


Yes (and despite having seen Paul's reply).


I'd release ack such a patch in case you come to an agreement regarding
the default soon.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Jan Beulich
On 10.12.2019 09:12, Jürgen Groß wrote:
> On 10.12.19 09:05, Jan Beulich wrote:
>> On 10.12.2019 08:16, Tian, Kevin wrote:
>>> While the quarantine idea sounds good overall, I'm still not convinced
>>> to have it the only way in place just for handling some known-buggy
>>> device. It kills the possibility of identifying a new buggy device and then
>>> deciding not to use it in the first space... I thought about whether it
>>> will get better when future IOMMU implements A/D bit - by checking
>>> access bit being set then we'll know some buggy device exists, but,
>>> the scratch page is shared by all devices then we cannot rely on this
>>> feature to find out the actual buggy one.
>>
>> Thinking about it - yes, I think I agree. This (as with so many
>> workarounds) would better be an off-by-default one. The main issue
>> I understand this would have is that buggy systems then might hang
>> without even having managed to get a log message out - Paul?
>>
>> Jürgen - would you be amenable to an almost last minute refinement
>> here (would then also need to still be backported to 4.12.2, or
>> the original backport reverted, to avoid giving the impression of
>> a regression)?
> 
> So what is your suggestion here? To have a boot option (defaulting to
> off) for enabling the scratch page?

Yes (and despite having seen Paul's reply).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Durrant, Paul
> -Original Message-
> From: Jürgen Groß 
> Sent: 10 December 2019 08:12
> To: Jan Beulich ; Tian, Kevin ;
> Durrant, Paul 
> Cc: Andrew Cooper ; xen-
> de...@lists.xenproject.org; Roger Pau Monné ; Wei
> Liu 
> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
> quarantine domain
> 
> On 10.12.19 09:05, Jan Beulich wrote:
> > On 10.12.2019 08:16, Tian, Kevin wrote:
> >>> From: Jan Beulich 
> >>> Sent: Tuesday, December 3, 2019 5:36 PM
> >>>
> >>> On 28.11.2019 12:32, Jürgen Groß wrote:
>  On 28.11.19 12:17, Jan Beulich wrote:
> > On 27.11.2019 18:11, Paul Durrant wrote:
> >> This patch introduces a new iommu_op to facilitate a per-
> >>> implementation
> >> quarantine set up, and then further code for x86 implementations
> >> (amd and vtd) to set up a read-only scratch page to serve as the
> source
> >> for DMA reads whilst a device is assigned to dom_io. DMA writes
> will
> >> continue to fault as before.
> >>
> >> The reason for doing this is that some hardware may continue to re-
> try
> >> DMA (despite FLR) in the event of an error, or even BME being
> cleared,
> >>> and
> >> will fail to deal with DMA read faults gracefully. Having a scratch
> page
> >> mapped will allow pending DMA reads to complete and thus such buggy
> >> hardware will eventually be quiesced.
> >>
> >> NOTE: These modifications are restricted to x86 implementations
> only as
> >> the buggy h/w I am aware of is only used with Xen in an x86
> >> environment. ARM may require similar code but, since I am
> not
> >> aware of the need, this patch does not modify any ARM
> >>> implementation.
> >>
> >> Signed-off-by: Paul Durrant 
> >
> > Reviewed-by: Jan Beulich 
> >
> >> There is still the open question of whether use of a scratch page
> ought
> >> to be gated on something, either are run-time or compile-time.
> >
> > I have no clear opinion either way here. The workaround seems low
> > overhead enough that there may not be a need to have an admin (or
> > build time) control for this.
> >
> > As to 4.13: The quarantining as a whole is pretty fresh. While it
> > has been backported to security maintained trees, I'd still consider
> > it a new feature in 4.13, and hence this workaround at least
> eligible
> > for consideration.
> 
>  I agree.
> 
>  Release-acked-by: Juergen Gross 
> >>>
> >>> I notice this has been committed meanwhile. I had specifically not
> >>> done so due to the still missing VT-d ack, seeing that this wasn't
> >>> an entirely "trivial" change.
> >>>
> >>
> >> While the quarantine idea sounds good overall, I'm still not convinced
> >> to have it the only way in place just for handling some known-buggy
> >> device. It kills the possibility of identifying a new buggy device and
> then
> >> deciding not to use it in the first space... I thought about whether it
> >> will get better when future IOMMU implements A/D bit - by checking
> >> access bit being set then we'll know some buggy device exists, but,
> >> the scratch page is shared by all devices then we cannot rely on this
> >> feature to find out the actual buggy one.
> >
> > Thinking about it - yes, I think I agree. This (as with so many
> > workarounds) would better be an off-by-default one. The main issue
> > I understand this would have is that buggy systems then might hang
> > without even having managed to get a log message out - Paul?
> >

Yes, that's the concern. Some systems will wedge hard on the first fault with 
no logging. I have no objection to a command line parameter but IMO it ought to 
default on, at least in non-debug mode.

  Paul


> > Jürgen - would you be amenable to an almost last minute refinement
> > here (would then also need to still be backported to 4.12.2, or
> > the original backport reverted, to avoid giving the impression of
> > a regression)?
> 
> So what is your suggestion here? To have a boot option (defaulting to
> off) for enabling the scratch page?
> 
> 
> Juergen
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Jürgen Groß

On 10.12.19 09:05, Jan Beulich wrote:

On 10.12.2019 08:16, Tian, Kevin wrote:

From: Jan Beulich 
Sent: Tuesday, December 3, 2019 5:36 PM

On 28.11.2019 12:32, Jürgen Groß wrote:

On 28.11.19 12:17, Jan Beulich wrote:

On 27.11.2019 18:11, Paul Durrant wrote:

This patch introduces a new iommu_op to facilitate a per-

implementation

quarantine set up, and then further code for x86 implementations
(amd and vtd) to set up a read-only scratch page to serve as the source
for DMA reads whilst a device is assigned to dom_io. DMA writes will
continue to fault as before.

The reason for doing this is that some hardware may continue to re-try
DMA (despite FLR) in the event of an error, or even BME being cleared,

and

will fail to deal with DMA read faults gracefully. Having a scratch page
mapped will allow pending DMA reads to complete and thus such buggy
hardware will eventually be quiesced.

NOTE: These modifications are restricted to x86 implementations only as
the buggy h/w I am aware of is only used with Xen in an x86
environment. ARM may require similar code but, since I am not
aware of the need, this patch does not modify any ARM

implementation.


Signed-off-by: Paul Durrant 


Reviewed-by: Jan Beulich 


There is still the open question of whether use of a scratch page ought
to be gated on something, either are run-time or compile-time.


I have no clear opinion either way here. The workaround seems low
overhead enough that there may not be a need to have an admin (or
build time) control for this.

As to 4.13: The quarantining as a whole is pretty fresh. While it
has been backported to security maintained trees, I'd still consider
it a new feature in 4.13, and hence this workaround at least eligible
for consideration.


I agree.

Release-acked-by: Juergen Gross 


I notice this has been committed meanwhile. I had specifically not
done so due to the still missing VT-d ack, seeing that this wasn't
an entirely "trivial" change.



While the quarantine idea sounds good overall, I'm still not convinced
to have it the only way in place just for handling some known-buggy
device. It kills the possibility of identifying a new buggy device and then
deciding not to use it in the first space... I thought about whether it
will get better when future IOMMU implements A/D bit - by checking
access bit being set then we'll know some buggy device exists, but,
the scratch page is shared by all devices then we cannot rely on this
feature to find out the actual buggy one.


Thinking about it - yes, I think I agree. This (as with so many
workarounds) would better be an off-by-default one. The main issue
I understand this would have is that buggy systems then might hang
without even having managed to get a log message out - Paul?

Jürgen - would you be amenable to an almost last minute refinement
here (would then also need to still be backported to 4.12.2, or
the original backport reverted, to avoid giving the impression of
a regression)?


So what is your suggestion here? To have a boot option (defaulting to
off) for enabling the scratch page?


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-10 Thread Jan Beulich
On 10.12.2019 08:16, Tian, Kevin wrote:
>> From: Jan Beulich 
>> Sent: Tuesday, December 3, 2019 5:36 PM
>>
>> On 28.11.2019 12:32, Jürgen Groß wrote:
>>> On 28.11.19 12:17, Jan Beulich wrote:
 On 27.11.2019 18:11, Paul Durrant wrote:
> This patch introduces a new iommu_op to facilitate a per-
>> implementation
> quarantine set up, and then further code for x86 implementations
> (amd and vtd) to set up a read-only scratch page to serve as the source
> for DMA reads whilst a device is assigned to dom_io. DMA writes will
> continue to fault as before.
>
> The reason for doing this is that some hardware may continue to re-try
> DMA (despite FLR) in the event of an error, or even BME being cleared,
>> and
> will fail to deal with DMA read faults gracefully. Having a scratch page
> mapped will allow pending DMA reads to complete and thus such buggy
> hardware will eventually be quiesced.
>
> NOTE: These modifications are restricted to x86 implementations only as
>the buggy h/w I am aware of is only used with Xen in an x86
>environment. ARM may require similar code but, since I am not
>aware of the need, this patch does not modify any ARM
>> implementation.
>
> Signed-off-by: Paul Durrant 

 Reviewed-by: Jan Beulich 

> There is still the open question of whether use of a scratch page ought
> to be gated on something, either are run-time or compile-time.

 I have no clear opinion either way here. The workaround seems low
 overhead enough that there may not be a need to have an admin (or
 build time) control for this.

 As to 4.13: The quarantining as a whole is pretty fresh. While it
 has been backported to security maintained trees, I'd still consider
 it a new feature in 4.13, and hence this workaround at least eligible
 for consideration.
>>>
>>> I agree.
>>>
>>> Release-acked-by: Juergen Gross 
>>
>> I notice this has been committed meanwhile. I had specifically not
>> done so due to the still missing VT-d ack, seeing that this wasn't
>> an entirely "trivial" change.
>>
> 
> While the quarantine idea sounds good overall, I'm still not convinced
> to have it the only way in place just for handling some known-buggy
> device. It kills the possibility of identifying a new buggy device and then 
> deciding not to use it in the first space... I thought about whether it
> will get better when future IOMMU implements A/D bit - by checking
> access bit being set then we'll know some buggy device exists, but, 
> the scratch page is shared by all devices then we cannot rely on this 
> feature to find out the actual buggy one.

Thinking about it - yes, I think I agree. This (as with so many
workarounds) would better be an off-by-default one. The main issue
I understand this would have is that buggy systems then might hang
without even having managed to get a log message out - Paul?

Jürgen - would you be amenable to an almost last minute refinement
here (would then also need to still be backported to 4.12.2, or
the original backport reverted, to avoid giving the impression of
a regression)?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-09 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Tuesday, December 3, 2019 5:36 PM
> 
> On 28.11.2019 12:32, Jürgen Groß wrote:
> > On 28.11.19 12:17, Jan Beulich wrote:
> >> On 27.11.2019 18:11, Paul Durrant wrote:
> >>> This patch introduces a new iommu_op to facilitate a per-
> implementation
> >>> quarantine set up, and then further code for x86 implementations
> >>> (amd and vtd) to set up a read-only scratch page to serve as the source
> >>> for DMA reads whilst a device is assigned to dom_io. DMA writes will
> >>> continue to fault as before.
> >>>
> >>> The reason for doing this is that some hardware may continue to re-try
> >>> DMA (despite FLR) in the event of an error, or even BME being cleared,
> and
> >>> will fail to deal with DMA read faults gracefully. Having a scratch page
> >>> mapped will allow pending DMA reads to complete and thus such buggy
> >>> hardware will eventually be quiesced.
> >>>
> >>> NOTE: These modifications are restricted to x86 implementations only as
> >>>the buggy h/w I am aware of is only used with Xen in an x86
> >>>environment. ARM may require similar code but, since I am not
> >>>aware of the need, this patch does not modify any ARM
> implementation.
> >>>
> >>> Signed-off-by: Paul Durrant 
> >>
> >> Reviewed-by: Jan Beulich 
> >>
> >>> There is still the open question of whether use of a scratch page ought
> >>> to be gated on something, either are run-time or compile-time.
> >>
> >> I have no clear opinion either way here. The workaround seems low
> >> overhead enough that there may not be a need to have an admin (or
> >> build time) control for this.
> >>
> >> As to 4.13: The quarantining as a whole is pretty fresh. While it
> >> has been backported to security maintained trees, I'd still consider
> >> it a new feature in 4.13, and hence this workaround at least eligible
> >> for consideration.
> >
> > I agree.
> >
> > Release-acked-by: Juergen Gross 
> 
> I notice this has been committed meanwhile. I had specifically not
> done so due to the still missing VT-d ack, seeing that this wasn't
> an entirely "trivial" change.
> 

While the quarantine idea sounds good overall, I'm still not convinced
to have it the only way in place just for handling some known-buggy
device. It kills the possibility of identifying a new buggy device and then 
deciding not to use it in the first space... I thought about whether it
will get better when future IOMMU implements A/D bit - by checking
access bit being set then we'll know some buggy device exists, but, 
the scratch page is shared by all devices then we cannot rely on this 
feature to find out the actual buggy one.

Thanks
Kevin
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-12-03 Thread Jan Beulich
On 28.11.2019 12:32, Jürgen Groß wrote:
> On 28.11.19 12:17, Jan Beulich wrote:
>> On 27.11.2019 18:11, Paul Durrant wrote:
>>> This patch introduces a new iommu_op to facilitate a per-implementation
>>> quarantine set up, and then further code for x86 implementations
>>> (amd and vtd) to set up a read-only scratch page to serve as the source
>>> for DMA reads whilst a device is assigned to dom_io. DMA writes will
>>> continue to fault as before.
>>>
>>> The reason for doing this is that some hardware may continue to re-try
>>> DMA (despite FLR) in the event of an error, or even BME being cleared, and
>>> will fail to deal with DMA read faults gracefully. Having a scratch page
>>> mapped will allow pending DMA reads to complete and thus such buggy
>>> hardware will eventually be quiesced.
>>>
>>> NOTE: These modifications are restricted to x86 implementations only as
>>>the buggy h/w I am aware of is only used with Xen in an x86
>>>environment. ARM may require similar code but, since I am not
>>>aware of the need, this patch does not modify any ARM implementation.
>>>
>>> Signed-off-by: Paul Durrant 
>>
>> Reviewed-by: Jan Beulich 
>>
>>> There is still the open question of whether use of a scratch page ought
>>> to be gated on something, either are run-time or compile-time.
>>
>> I have no clear opinion either way here. The workaround seems low
>> overhead enough that there may not be a need to have an admin (or
>> build time) control for this.
>>
>> As to 4.13: The quarantining as a whole is pretty fresh. While it
>> has been backported to security maintained trees, I'd still consider
>> it a new feature in 4.13, and hence this workaround at least eligible
>> for consideration.
> 
> I agree.
> 
> Release-acked-by: Juergen Gross 

I notice this has been committed meanwhile. I had specifically not
done so due to the still missing VT-d ack, seeing that this wasn't
an entirely "trivial" change.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-11-28 Thread Jürgen Groß

On 28.11.19 12:17, Jan Beulich wrote:

On 27.11.2019 18:11, Paul Durrant wrote:

This patch introduces a new iommu_op to facilitate a per-implementation
quarantine set up, and then further code for x86 implementations
(amd and vtd) to set up a read-only scratch page to serve as the source
for DMA reads whilst a device is assigned to dom_io. DMA writes will
continue to fault as before.

The reason for doing this is that some hardware may continue to re-try
DMA (despite FLR) in the event of an error, or even BME being cleared, and
will fail to deal with DMA read faults gracefully. Having a scratch page
mapped will allow pending DMA reads to complete and thus such buggy
hardware will eventually be quiesced.

NOTE: These modifications are restricted to x86 implementations only as
   the buggy h/w I am aware of is only used with Xen in an x86
   environment. ARM may require similar code but, since I am not
   aware of the need, this patch does not modify any ARM implementation.

Signed-off-by: Paul Durrant 


Reviewed-by: Jan Beulich 


There is still the open question of whether use of a scratch page ought
to be gated on something, either are run-time or compile-time.


I have no clear opinion either way here. The workaround seems low
overhead enough that there may not be a need to have an admin (or
build time) control for this.

As to 4.13: The quarantining as a whole is pretty fresh. While it
has been backported to security maintained trees, I'd still consider
it a new feature in 4.13, and hence this workaround at least eligible
for consideration.


I agree.

Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain

2019-11-28 Thread Jan Beulich
On 27.11.2019 18:11, Paul Durrant wrote:
> This patch introduces a new iommu_op to facilitate a per-implementation
> quarantine set up, and then further code for x86 implementations
> (amd and vtd) to set up a read-only scratch page to serve as the source
> for DMA reads whilst a device is assigned to dom_io. DMA writes will
> continue to fault as before.
> 
> The reason for doing this is that some hardware may continue to re-try
> DMA (despite FLR) in the event of an error, or even BME being cleared, and
> will fail to deal with DMA read faults gracefully. Having a scratch page
> mapped will allow pending DMA reads to complete and thus such buggy
> hardware will eventually be quiesced.
> 
> NOTE: These modifications are restricted to x86 implementations only as
>   the buggy h/w I am aware of is only used with Xen in an x86
>   environment. ARM may require similar code but, since I am not
>   aware of the need, this patch does not modify any ARM implementation.
> 
> Signed-off-by: Paul Durrant 

Reviewed-by: Jan Beulich 

> There is still the open question of whether use of a scratch page ought
> to be gated on something, either are run-time or compile-time.

I have no clear opinion either way here. The workaround seems low
overhead enough that there may not be a need to have an admin (or
build time) control for this.

As to 4.13: The quarantining as a whole is pretty fresh. While it
has been backported to security maintained trees, I'd still consider
it a new feature in 4.13, and hence this workaround at least eligible
for consideration.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel