[virtio-dev] Re: [virtio] [PATCH v2] README.md: clean up build instructions

2019-10-09 Thread Stefan Hajnoczi
On Wed, Sep 25, 2019 at 07:53:05AM -0400, Michael S. Tsirkin wrote:
> Switch to  from manual formatting with  and .
> Clarify wording a bit.
> Add hints on what to do in case of missing fonts.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> changes from v1:
> - drop TeX link as it's confusing (you want TeX live)
> - MacTex-> MacTeX
> 
>  README.md | 56 ---
>  1 file changed, 37 insertions(+), 19 deletions(-)

Reviewed-by: Stefan Hajnoczi 




[virtio-dev] Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting

2019-10-09 Thread Alexander Duyck
On Wed, 2019-10-09 at 13:08 -0400, Nitesh Narayan Lal wrote:
> On 10/9/19 12:50 PM, Alexander Duyck wrote:
> > On Wed, 2019-10-09 at 12:25 -0400, Nitesh Narayan Lal wrote:
> > > On 10/7/19 1:20 PM, Alexander Duyck wrote:
> > > > On Mon, Oct 7, 2019 at 10:07 AM Nitesh Narayan Lal  
> > > > wrote:
> > > > > On 10/7/19 12:27 PM, Alexander Duyck wrote:
> > > > > > On Mon, 2019-10-07 at 12:19 -0400, Nitesh Narayan Lal wrote:
> > > > > > > On 10/7/19 11:33 AM, Alexander Duyck wrote:
> > > > > > > > On Mon, 2019-10-07 at 08:29 -0400, Nitesh Narayan Lal wrote:
> > > > > > > > > On 10/2/19 10:25 AM, Alexander Duyck wrote:
> > > > 
> > > > 
> > > > > > > > > page_reporting.c change:
> > > > > > > > > @@ -101,8 +101,12 @@ static void scan_zone_bitmap(struct 
> > > > > > > > > page_reporting_config
> > > > > > > > > *phconf,
> > > > > > > > > /* Process only if the page is still online */
> > > > > > > > > page = pfn_to_online_page((setbit << 
> > > > > > > > > PAGE_REPORTING_MIN_ORDER) +
> > > > > > > > >   zone->base_pfn);
> > > > > > > > > -   if (!page)
> > > > > > > > > +   if (!page || !PageBuddy(page)) {
> > > > > > > > > +   clear_bit(setbit, zone->bitmap);
> > > > > > > > > +   atomic_dec(&zone->free_pages);
> > > > > > > > > continue;
> > > > > > > > > +   }
> > > > > > > > > 
> > > > > > > > I suspect the zone->free_pages is going to be expensive for you to
> > > > > > > > deal with. It is a global atomic value, so the cache line that
> > > > > > > > contains it is going to bounce. As a result things like setting the
> > > > > > > > bitmap will be more expensive, as every time a CPU increments
> > > > > > > > free_pages it will likely have to take the cache line containing the
> > > > > > > > bitmap pointer as well.
> > > > > > > I see, I will have to explore this more. I am wondering if there is
> > > > > > > a way to measure this if its effect is not visible in
> > > > > > > will-it-scale/page_fault1. If there is a noticeable amount of
> > > > > > > degradation, I will have to address this.
> > > > > > If nothing else you might look at seeing if you can split up the
> > > > > > structures so that the bitmap and nr_bits is in a different region
> > > > > > somewhere since those are read-mostly values.
> > > > > ok, I will try to understand the issue and your suggestion.
> > > > > Thank you for bringing this up.
> > > > > 
> > > > > > Also you are now updating the bitmap and free_pages both inside and
> > > > > > outside of the zone lock so that will likely have some impact.
> > > > > So as per your previous suggestion, I have made the bitmap structure
> > > > > object an RCU-protected pointer. So we are safe from that side.
> > > > > The other downside which I can think of is a race where one page is
> > > > > trying to increment free_pages and another is trying to decrement it.
> > > > > However, since it is an atomic variable, that should not be a problem.
> > > > > Did I miss anything?
> > > > I'm not so much worried about a race as the cache line bouncing
> > > > effect. Basically your notifier combined within this hinting thread
> > > > will likely result in more time spent by the thread that holds the
> > > > lock since it will be trying to access the bitmap to set the bit and
> > > > the free_pages to report the bit, but at the same time you will have
> > > > this thread clearing bits and decrementing the free_pages values.
> > > > 
> > > > One thing you could consider in your worker thread would be to
> > > > reallocate and replace the bitmap every time you plan to walk it. By
> > > > doing that you would avoid the cacheline bouncing on the bitmap since
> > > > you would only have to read it, and you would no longer have another
> > > > thread dirtying it. You could essentially reset the free_pages at the
> > > > same time you replace the bitmap. It would need to all happen with the
> > > > zone lock held though when you swap it out.
> > > If I am not mistaken then from what you are suggesting, I will have to
> > > hold the zone lock for the entire duration of swap & scan, which would be
> > > costly if the bitmap is large, wouldn't it? Also, we might end up missing
> > > pages that are getting freed while we are scanning.
> > You would only need to hold the zone lock when you swap the bitmap. Once
> > it is swapped you wouldn't need to worry about the locking again for
> > bitmap access since your worker thread would be the only one holding the
> > current bitmap. Think of it as a batch clearing of the bits.
> 
> I see.
> 
> > You already end up missing pages freed while scanning since you are doing
> > it linearly.
> 
> I was referring to free pages for whom bits will not be set while we
> 

[virtio-dev] Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting

2019-10-09 Thread Nitesh Narayan Lal


On 10/9/19 12:50 PM, Alexander Duyck wrote:
> On Wed, 2019-10-09 at 12:25 -0400, Nitesh Narayan Lal wrote:
>> On 10/7/19 1:20 PM, Alexander Duyck wrote:
>>> On Mon, Oct 7, 2019 at 10:07 AM Nitesh Narayan Lal  
>>> wrote:
>>>> On 10/7/19 12:27 PM, Alexander Duyck wrote:
>>>>> On Mon, 2019-10-07 at 12:19 -0400, Nitesh Narayan Lal wrote:
>>>>>> On 10/7/19 11:33 AM, Alexander Duyck wrote:
>>>>>>> On Mon, 2019-10-07 at 08:29 -0400, Nitesh Narayan Lal wrote:
>>>>>>>> On 10/2/19 10:25 AM, Alexander Duyck wrote:
>>> 
>>>
>>>>>>>> page_reporting.c change:
>>>>>>>> @@ -101,8 +101,12 @@ static void scan_zone_bitmap(struct page_reporting_config *phconf,
>>>>>>>> /* Process only if the page is still online */
>>>>>>>> page = pfn_to_online_page((setbit << PAGE_REPORTING_MIN_ORDER) +
>>>>>>>>   zone->base_pfn);
>>>>>>>> -   if (!page)
>>>>>>>> +   if (!page || !PageBuddy(page)) {
>>>>>>>> +   clear_bit(setbit, zone->bitmap);
>>>>>>>> +   atomic_dec(&zone->free_pages);
>>>>>>>> continue;
>>>>>>>> +   }

>>>>>>> I suspect the zone->free_pages is going to be expensive for you to deal
>>>>>>> with. It is a global atomic value, so the cache line that contains it is
>>>>>>> going to bounce. As a result things like setting the bitmap will be more
>>>>>>> expensive, as every time a CPU increments free_pages it will likely have
>>>>>>> to take the cache line containing the bitmap pointer as well.
>>>>>> I see, I will have to explore this more. I am wondering if there is a way
>>>>>> to measure this if its effect is not visible in will-it-scale/page_fault1.
>>>>>> If there is a noticeable amount of degradation, I will have to address this.
>>>>> If nothing else you might look at seeing if you can split up the
>>>>> structures so that the bitmap and nr_bits is in a different region
>>>>> somewhere since those are read-mostly values.
>>>> ok, I will try to understand the issue and your suggestion.
>>>> Thank you for bringing this up.
>>>>
>>>>> Also you are now updating the bitmap and free_pages both inside and
>>>>> outside of the zone lock so that will likely have some impact.
>>>> So as per your previous suggestion, I have made the bitmap structure
>>>> object an RCU-protected pointer. So we are safe from that side.
>>>> The other downside which I can think of is a race where one page is
>>>> trying to increment free_pages and another is trying to decrement it.
>>>> However, since it is an atomic variable, that should not be a problem.
>>>> Did I miss anything?
>>> I'm not so much worried about a race as the cache line bouncing
>>> effect. Basically your notifier combined within this hinting thread
>>> will likely result in more time spent by the thread that holds the
>>> lock since it will be trying to access the bitmap to set the bit and
>>> the free_pages to report the bit, but at the same time you will have
>>> this thread clearing bits and decrementing the free_pages values.
>>>
>>> One thing you could consider in your worker thread would be to
>>> reallocate and replace the bitmap every time you plan to walk it. By
>>> doing that you would avoid the cacheline bouncing on the bitmap since
>>> you would only have to read it, and you would no longer have another
>>> thread dirtying it. You could essentially reset the free_pages at the
>>> same time you replace the bitmap. It would need to all happen with the
>>> zone lock held though when you swap it out.
>> If I am not mistaken then from what you are suggesting, I will have to hold
>> the zone lock for the entire duration of swap & scan, which would be costly
>> if the bitmap is large, wouldn't it? Also, we might end up missing pages
>> that are getting freed while we are scanning.
> You would only need to hold the zone lock when you swap the bitmap. Once
> it is swapped you wouldn't need to worry about the locking again for
> bitmap access since your worker thread would be the only one holding the
> current bitmap. Think of it as a batch clearing of the bits.

I see.

>
> You already end up missing pages freed while scanning since you are doing
> it linearly.

I was referring to free pages for whom bits will not be set while we
are doing the batch clearing of the bits.

>
>> As far as the free_pages count is concerned, I am wondering whether I should
>> replace it with zone->free_area[REPORTING_ORDER].nr_free, which is already
>> there (I still need to explore this in a bit more depth).
>>
>>> - Alex
> So there end up being two ways you could use nr_free. One is to track it
> the way I did with the number of reported pages being tracked; however,
> that requires reducing the count when reported pages are pulled from the
> free_area and identifying reported pages vs unreported ones.
>
> The other option would be to look at converting nr_free 

[virtio-dev] Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting

2019-10-09 Thread Alexander Duyck
On Wed, 2019-10-09 at 12:25 -0400, Nitesh Narayan Lal wrote:
> On 10/7/19 1:20 PM, Alexander Duyck wrote:
> > On Mon, Oct 7, 2019 at 10:07 AM Nitesh Narayan Lal  
> > wrote:
> > > On 10/7/19 12:27 PM, Alexander Duyck wrote:
> > > > On Mon, 2019-10-07 at 12:19 -0400, Nitesh Narayan Lal wrote:
> > > > > On 10/7/19 11:33 AM, Alexander Duyck wrote:
> > > > > > On Mon, 2019-10-07 at 08:29 -0400, Nitesh Narayan Lal wrote:
> > > > > > > On 10/2/19 10:25 AM, Alexander Duyck wrote:
> > 
> > 
> > > > > > > page_reporting.c change:
> > > > > > > @@ -101,8 +101,12 @@ static void scan_zone_bitmap(struct 
> > > > > > > page_reporting_config
> > > > > > > *phconf,
> > > > > > > /* Process only if the page is still online */
> > > > > > > page = pfn_to_online_page((setbit << 
> > > > > > > PAGE_REPORTING_MIN_ORDER) +
> > > > > > >   zone->base_pfn);
> > > > > > > -   if (!page)
> > > > > > > +   if (!page || !PageBuddy(page)) {
> > > > > > > +   clear_bit(setbit, zone->bitmap);
> > > > > > > +   atomic_dec(&zone->free_pages);
> > > > > > > continue;
> > > > > > > +   }
> > > > > > > 
> > > > > > I suspect the zone->free_pages is going to be expensive for you to
> > > > > > deal with. It is a global atomic value, so the cache line that
> > > > > > contains it is going to bounce. As a result things like setting the
> > > > > > bitmap will be more expensive, as every time a CPU increments
> > > > > > free_pages it will likely have to take the cache line containing the
> > > > > > bitmap pointer as well.
> > > > > I see, I will have to explore this more. I am wondering if there is a
> > > > > way to measure this if its effect is not visible in
> > > > > will-it-scale/page_fault1. If there is a noticeable amount of
> > > > > degradation, I will have to address this.
> > > > If nothing else you might look at seeing if you can split up the
> > > > structures so that the bitmap and nr_bits is in a different region
> > > > somewhere since those are read-mostly values.
> > > ok, I will try to understand the issue and your suggestion.
> > > Thank you for bringing this up.
> > > 
> > > > Also you are now updating the bitmap and free_pages both inside and
> > > > outside of the zone lock so that will likely have some impact.
> > > So as per your previous suggestion, I have made the bitmap structure
> > > object an RCU-protected pointer. So we are safe from that side.
> > > The other downside which I can think of is a race where one page is
> > > trying to increment free_pages and another is trying to decrement it.
> > > However, since it is an atomic variable, that should not be a problem.
> > > Did I miss anything?
> > I'm not so much worried about a race as the cache line bouncing
> > effect. Basically your notifier combined within this hinting thread
> > will likely result in more time spent by the thread that holds the
> > lock since it will be trying to access the bitmap to set the bit and
> > the free_pages to report the bit, but at the same time you will have
> > this thread clearing bits and decrementing the free_pages values.
> > 
> > One thing you could consider in your worker thread would be to
> > reallocate and replace the bitmap every time you plan to walk it. By
> > doing that you would avoid the cacheline bouncing on the bitmap since
> > you would only have to read it, and you would no longer have another
> > thread dirtying it. You could essentially reset the free_pages at the
> > same time you replace the bitmap. It would need to all happen with the
> > zone lock held though when you swap it out.
> 
> If I am not mistaken then from what you are suggesting, I will have to hold
> the zone lock for the entire duration of swap & scan, which would be costly
> if the bitmap is large, wouldn't it? Also, we might end up missing pages
> that are getting freed while we are scanning.

You would only need to hold the zone lock when you swap the bitmap. Once
it is swapped you wouldn't need to worry about the locking again for
bitmap access since your worker thread would be the only one holding the
current bitmap. Think of it as a batch clearing of the bits.

You already end up missing pages freed while scanning since you are doing
it linearly.
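
As a rough user-space illustration of the swap described above (a sketch
only, not code from the patch set): the free path sets bits under a lock,
and the worker takes that lock just long enough to exchange the bitmap
pointer and reset the counter, then walks the old bitmap with no lock
held. The struct, the function names and the atomic_flag spinlock standing
in for the zone lock are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the per-zone tracking state. */
struct zone_track {
    atomic_flag lock;       /* plays the role of the zone lock */
    unsigned long *bitmap;  /* written by the free path, under lock */
    size_t nr_bits;
    long free_pages;        /* number of set bits, under lock */
};

static void zt_lock(struct zone_track *zt)
{
    while (atomic_flag_test_and_set_explicit(&zt->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void zt_unlock(struct zone_track *zt)
{
    atomic_flag_clear_explicit(&zt->lock, memory_order_release);
}

int zone_track_init(struct zone_track *zt, size_t nr_bits)
{
    assert(nr_bits > 0);
    zt->bitmap = calloc(BITMAP_WORDS(nr_bits), sizeof(unsigned long));
    if (!zt->bitmap)
        return -1;
    zt->nr_bits = nr_bits;
    zt->free_pages = 0;
    atomic_flag_clear(&zt->lock);
    return 0;
}

/* Free path: mark one chunk as free; counts each bit only once. */
void zone_track_set(struct zone_track *zt, size_t bit)
{
    assert(bit < zt->nr_bits);
    zt_lock(zt);
    unsigned long mask = 1UL << (bit % BITS_PER_WORD);
    unsigned long *word = &zt->bitmap[bit / BITS_PER_WORD];
    if (!(*word & mask)) {
        *word |= mask;
        zt->free_pages++;
    }
    zt_unlock(zt);
}

/*
 * Worker: install an empty bitmap under the lock, then scan the old one
 * without holding it. Returns the number of set bits, or -1 if the
 * replacement bitmap could not be allocated.
 */
long zone_track_scan(struct zone_track *zt,
                     void (*report)(size_t bit, void *ctx), void *ctx)
{
    unsigned long *fresh = calloc(BITMAP_WORDS(zt->nr_bits),
                                  sizeof(unsigned long));
    if (!fresh)
        return -1;

    zt_lock(zt);
    unsigned long *old = zt->bitmap;
    zt->bitmap = fresh;
    zt->free_pages = 0;     /* batch clearing: counter restarts with the map */
    zt_unlock(zt);

    long found = 0;
    for (size_t bit = 0; bit < zt->nr_bits; bit++) {
        if (old[bit / BITS_PER_WORD] & (1UL << (bit % BITS_PER_WORD))) {
            if (report)
                report(bit, ctx);
            found++;
        }
    }
    free(old);
    return found;
}
```

Bits set while the scan is running land in the new bitmap and are picked
up by the next pass, so the lock hold time does not grow with the bitmap
size; the cost moves to one allocation per pass.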

> As far as the free_pages count is concerned, I am wondering whether I should
> replace it with zone->free_area[REPORTING_ORDER].nr_free, which is already
> there (I still need to explore this in a bit more depth).
> 
> > - Alex

So there end up being two ways you could use nr_free. One is to track it
the way I did with the number of reported pages being tracked; however,
that requires reducing the count when reported pages are pulled from the
free_area and identifying reported pages vs unreported ones.

The other option would be to look at converting 

[virtio-dev] Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting

2019-10-09 Thread Nitesh Narayan Lal


On 10/7/19 1:20 PM, Alexander Duyck wrote:
> On Mon, Oct 7, 2019 at 10:07 AM Nitesh Narayan Lal  wrote:
>>
>> On 10/7/19 12:27 PM, Alexander Duyck wrote:
>>> On Mon, 2019-10-07 at 12:19 -0400, Nitesh Narayan Lal wrote:
>>>> On 10/7/19 11:33 AM, Alexander Duyck wrote:
>>>>> On Mon, 2019-10-07 at 08:29 -0400, Nitesh Narayan Lal wrote:
>>>>>> On 10/2/19 10:25 AM, Alexander Duyck wrote:
> 
>
>>>>>> page_reporting.c change:
>>>>>> @@ -101,8 +101,12 @@ static void scan_zone_bitmap(struct page_reporting_config *phconf,
>>>>>> /* Process only if the page is still online */
>>>>>> page = pfn_to_online_page((setbit << PAGE_REPORTING_MIN_ORDER) +
>>>>>>   zone->base_pfn);
>>>>>> -   if (!page)
>>>>>> +   if (!page || !PageBuddy(page)) {
>>>>>> +   clear_bit(setbit, zone->bitmap);
>>>>>> +   atomic_dec(&zone->free_pages);
>>>>>> continue;
>>>>>> +   }
>>
>>>>> I suspect the zone->free_pages is going to be expensive for you to deal
>>>>> with. It is a global atomic value, so the cache line that contains it is
>>>>> going to bounce. As a result things like setting the bitmap will be more
>>>>> expensive, as every time a CPU increments free_pages it will likely have
>>>>> to take the cache line containing the bitmap pointer as well.
>>>> I see, I will have to explore this more. I am wondering if there is a way to
>>>> measure this if its effect is not visible in will-it-scale/page_fault1. If
>>>> there is a noticeable amount of degradation, I will have to address this.
>>> If nothing else you might look at seeing if you can split up the
>>> structures so that the bitmap and nr_bits is in a different region
>>> somewhere since those are read-mostly values.
>> ok, I will try to understand the issue and your suggestion.
>> Thank you for bringing this up.
>>
>>> Also you are now updating the bitmap and free_pages both inside and
>>> outside of the zone lock so that will likely have some impact.
>> So as per your previous suggestion, I have made the bitmap structure
>> object an RCU-protected pointer. So we are safe from that side.
>> The other downside which I can think of is a race where one page is
>> trying to increment free_pages and another is trying to decrement it.
>> However, since it is an atomic variable, that should not be a problem.
>> Did I miss anything?
> I'm not so much worried about a race as the cache line bouncing
> effect. Basically your notifier combined within this hinting thread
> will likely result in more time spent by the thread that holds the
> lock since it will be trying to access the bitmap to set the bit and
> the free_pages to report the bit, but at the same time you will have
> this thread clearing bits and decrementing the free_pages values.
>
> One thing you could consider in your worker thread would be to
> reallocate and replace the bitmap every time you plan to walk it. By
> doing that you would avoid the cacheline bouncing on the bitmap since
> you would only have to read it, and you would no longer have another
> thread dirtying it. You could essentially reset the free_pages at the
> same time you replace the bitmap. It would need to all happen with the
> zone lock held though when you swap it out.

If I am not mistaken then from what you are suggesting, I will have to hold
the zone lock for the entire duration of swap & scan, which would be costly
if the bitmap is large, wouldn't it? Also, we might end up missing pages
that are getting freed while we are scanning.

As far as the free_pages count is concerned, I am wondering whether I should
replace it with zone->free_area[REPORTING_ORDER].nr_free, which is already
there (I still need to explore this in a bit more depth).

>
> - Alex
-- 
Thanks
Nitesh


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v11 0/6] mm / virtio: Provide support for unused page reporting

2019-10-09 Thread Nitesh Narayan Lal


On 10/7/19 1:06 PM, Nitesh Narayan Lal wrote:
[...]
>> So what was the size of your guest? One thing that just occurred to me is
>> that you might be running a much smaller guest than I was.
> I am running a 30 GB guest.
>
>>>> If so I would have expected a much higher difference versus
>>>> baseline as zeroing/faulting the pages in the host gets expensive fairly
>>>> quickly. What is the host kernel you are running your test on? I'm just
>>>> wondering if there is some additional overhead currently limiting your
>>>> setup. My host kernel was just the same kernel I was running in the guest,
>>>> just built without the patches applied.
>>> Right now I have a different host-kernel. I can install the same kernel to
>>> the host as well and see if that changes anything.
>> The host kernel will have a fairly significant impact as I recall. For
>> example running a stock CentOS kernel lowered the performance compared to
>> running a linux-next kernel. As a result the numbers looked better since
>> the overall baseline was lower to begin with as the host OS was
>> introducing additional overhead.
> I see, in that case I will try installing the same guest kernel
> on the host as well.

As per your suggestion, I tried replacing the host kernel with an
upstream kernel without my patches, i.e., my host has a kernel built from
the upstream kernel's master branch at a Sept 23rd commit, and the guest
has the same kernel for the no-hinting case and the same kernel + my patches
for the page reporting case.

With the changes reported earlier on top of v12, I am not seeing any further
degradation (other than what I have previously reported).

To be sure that THP is actively used, I did an experiment where I changed the
MEMSIZE in the page_fault. On doing so THP usage checked via /proc/meminfo also
increased as I expected.
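
For anyone wanting to repeat that check, a minimal sketch of reading the
THP counter programmatically follows. It assumes the AnonHugePages field
of /proc/meminfo is the value of interest; the helper names are made up
for this example and are not part of any test harness mentioned here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the numeric value (in kB) of the "key:" line in meminfo-style
 * text, or -1 if the key is absent or has no number after it.
 */
long meminfo_field_kb(const char *text, const char *key)
{
    assert(text && key);
    size_t klen = strlen(key);

    for (const char *line = text; line && *line; ) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;
            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;  /* advance to the next line, if any */
    }
    return -1;
}

/* Read /proc/meminfo and return AnonHugePages in kB, or -1 on failure. */
long anon_hugepages_kb(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field_kb(buf, "AnonHugePages");
}
```

Sampling anon_hugepages_kb() before and during a benchmark run gives a
quick indication of whether huge pages are actually being used.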

In any case, if you find something else please let me know and I will look
into it again.


I am still looking into your suggestion about cache line bouncing and will
reply if I have more questions.
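
For reference, the structure split suggested earlier in this thread could
look roughly like the sketch below: the read-mostly fields (bitmap pointer,
nr_bits) sit together, and the frequently written counter starts on its own
cache line so that updating it does not invalidate the line holding them.
The struct name, the base_pfn placement and the 64-byte line size are
assumptions for illustration; in kernel code an annotation such as
____cacheline_aligned_in_smp would normally be used in place of alignas().

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHELINE_BYTES 64  /* assumed cache line size */

/* Hypothetical layout; field names follow the discussion above. */
struct zone_track_split {
    /* Read-mostly: set up once, then only read by the free path. */
    unsigned long *bitmap;
    unsigned long nr_bits;
    unsigned long base_pfn;

    /* Write-hot: forced onto the start of a separate cache line. */
    alignas(CACHELINE_BYTES) atomic_long free_pages;
};

static_assert(offsetof(struct zone_track_split, free_pages)
              % CACHELINE_BYTES == 0,
              "free_pages must start a cache line");

/* True if two member offsets fall into the same cache line. */
static inline int same_cacheline(size_t off_a, size_t off_b)
{
    return off_a / CACHELINE_BYTES == off_b / CACHELINE_BYTES;
}

/* Free path: bump the counter without touching the read-mostly line. */
static inline void note_page_freed(struct zone_track_split *zt)
{
    atomic_fetch_add_explicit(&zt->free_pages, 1, memory_order_relaxed);
}
```

This only separates the counter from the pointer; writes to the bitmap
contents themselves would still bounce between CPUs.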


[...]



-- 
Thanks
Nitesh





Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-10-09 Thread Cornelia Huck
[a bit late to the party, sorry]

On Wed, 25 Sep 2019 06:38:53 -0400
"Michael S. Tsirkin"  wrote:

> On Tue, Sep 10, 2019 at 03:31:45PM +0100, Dr. David Alan Gilbert wrote:
> > * Halil Pasic (pa...@linux.ibm.com) wrote:  
> > > On Tue, 10 Sep 2019 14:09:20 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >   
> > > > * Halil Pasic (pa...@linux.ibm.com) wrote:  
> > > > > On Thu, 29 Aug 2019 14:52:06 +0100
> > > > > Stefan Hajnoczi  wrote:
> > > > >   
> > > > > > Describe how shared memory region ID 0 is the DAX window and how
> > > > > > FUSE_SETUPMAPPING maps file ranges into the window.
> > > > > > 
> > > > > > Signed-off-by: Stefan Hajnoczi 
> > > > > > ---
> > > > > > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux 
> > > > > > patches:
> > > > > > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > > > > > 
> > > > > > v8:
> > > > > >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> > > > > >Window clearer [Cornelia]
> > > > > > v7:
> > > > > >  * Clarify that the DAX Window is optional and can be used together 
> > > > > > with
> > > > > >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > > > > > v6:
> > > > > >  * Document timing side-channel attacks [Michael]
> > > > > > ---
> > > > > >  virtio-fs.tex | 66 
> > > > > > +++
> > > > > >  1 file changed, 66 insertions(+)
> > > > > > 
> > > > > > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > > > > > index 1ae17f8..158d066 100644
> > > > > > --- a/virtio-fs.tex
> > > > > > +++ b/virtio-fs.tex
> > > > > > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > > > > > Queue}\label{sec:Device Types / F
> > > > > >  
> > > > > >  The driver MUST anticipate that request queues are processed 
> > > > > > concurrently with the hiprio queue.
> > > > > >  
> > > > > > +\subsubsection{Device Operation: DAX Window}\label{sec:Device 
> > > > > > Types / File System Device / Device Operation / Device Operation: 
> > > > > > DAX Window}
> > > > > > +
> > > > > > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between 
> > > > > > the
> > > > > > +driver-provided buffer and the device.  In cases where data 
> > > > > > transfer is
> > > > > > +undesirable, the device can map file contents into the DAX window 
> > > > > > shared memory
> > > > > > +region.  The driver then accesses file contents directly in 
> > > > > > device-owned memory
> > > > > > +without a data transfer.
> > > > > > +
> > > > > > +The DAX Window is an alternative mechanism for accessing file 
> > > > > > contents.
> > > > > > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are 
> > > > > > possible at the
> > > > > > +same time.  Providing the DAX Window is optional for devices.  
> > > > > > Using the DAX
> > > > > > +Window is optional for drivers.
> > > > > > +
> > > > > > +Shared memory region ID 0 is called the DAX window.  Drivers map 
> > > > > > this shared
> > > > > > +memory region with writeback caching as if it were regular RAM.  
> > > > > > The contents
> > > > > > +of the DAX window are undefined unless a mapping exists for that 
> > > > > > range.  
> > > > > 
> > > > > > This last paragraph is a bit concerning from an s390x perspective.
> > > > > > In case of a PCI transport the shared memory region is a chunk of
> > > > > > PCI memory (and must be contained within the declared BAR, as
> > > > > > mandated by commit 855ad7af2bd6).
> > > > > 
> > > > > > The PCI architecture on s390x is at the moment such that PCI memory
> > > > > > *can't be accessed like regular RAM*; specialized instructions have
> > > > > > to be used. I've tried to raise concerns about this multiple times.
> > > > > > Thus the virtio spec would contradict itself a little (at least on
> > > > > > s390x).

I saw a set of new instructions being introduced in the kernel which
seem to do just that, but I obviously don't know the details.

> > > > > 
> > > > > Of course for virtual zPCI devices we can make this work. But 
> > > > > including
> > > > > this paragraph in the VIRTIO specification would mean if one were to
> > > > > implement this in HW it would not work for s390.
> > > > > 
> > > > > > I don't have anything better to propose, so I intend to vote yes
> > > > > for this. I just wanted to make sure, we all are aware of the
> > > > > consequences.  
> > > > 
> > > > Thanks.
> > > > 
> > > > Note this is just specifying the way virtiofs uses the existing
> > > > (accepted) shared memory region spec.  You can add a CCW transport of
> > > > that spec to make it appropriate for 390 if needed.
> > > >   
> > > 
> > > On s390x we have both CCW and PCI transport. And that makes things even
> > > more complicated.
> > > 
> > > IMHO specifying that virtiofs uses the existing shared memory
> > > specification like regular RAM conflicts with what is architecturally
> > > possible on s390x when the transport is PCI.  
> > 
> > OK.
> > 
> >   
> > > Because the fact that