Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-10-09 Thread Cornelia Huck
[a bit late to the party, sorry]

On Wed, 25 Sep 2019 06:38:53 -0400
"Michael S. Tsirkin"  wrote:

> On Tue, Sep 10, 2019 at 03:31:45PM +0100, Dr. David Alan Gilbert wrote:
> > * Halil Pasic (pa...@linux.ibm.com) wrote:  
> > > On Tue, 10 Sep 2019 14:09:20 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >   
> > > > * Halil Pasic (pa...@linux.ibm.com) wrote:  
> > > > > On Thu, 29 Aug 2019 14:52:06 +0100
> > > > > Stefan Hajnoczi  wrote:
> > > > >   
> > > > > > Describe how shared memory region ID 0 is the DAX window and how
> > > > > > FUSE_SETUPMAPPING maps file ranges into the window.
> > > > > > 
> > > > > > Signed-off-by: Stefan Hajnoczi 
> > > > > > ---
> > > > > > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux 
> > > > > > patches:
> > > > > > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > > > > > 
> > > > > > v8:
> > > > > >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> > > > > >Window clearer [Cornelia]
> > > > > > v7:
> > > > > >  * Clarify that the DAX Window is optional and can be used together 
> > > > > > with
> > > > > >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > > > > > v6:
> > > > > >  * Document timing side-channel attacks [Michael]
> > > > > > ---
> > > > > >  virtio-fs.tex | 66 
> > > > > > +++
> > > > > >  1 file changed, 66 insertions(+)
> > > > > > 
> > > > > > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > > > > > index 1ae17f8..158d066 100644
> > > > > > --- a/virtio-fs.tex
> > > > > > +++ b/virtio-fs.tex
> > > > > > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > > > > > Queue}\label{sec:Device Types / F
> > > > > >  
> > > > > >  The driver MUST anticipate that request queues are processed 
> > > > > > concurrently with the hiprio queue.
> > > > > >  
> > > > > > +\subsubsection{Device Operation: DAX Window}\label{sec:Device 
> > > > > > Types / File System Device / Device Operation / Device Operation: 
> > > > > > DAX Window}
> > > > > > +
> > > > > > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between 
> > > > > > the
> > > > > > +driver-provided buffer and the device.  In cases where data 
> > > > > > transfer is
> > > > > > +undesirable, the device can map file contents into the DAX window 
> > > > > > shared memory
> > > > > > +region.  The driver then accesses file contents directly in 
> > > > > > device-owned memory
> > > > > > +without a data transfer.
> > > > > > +
> > > > > > +The DAX Window is an alternative mechanism for accessing file 
> > > > > > contents.
> > > > > > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are 
> > > > > > possible at the
> > > > > > +same time.  Providing the DAX Window is optional for devices.  
> > > > > > Using the DAX
> > > > > > +Window is optional for drivers.
> > > > > > +
> > > > > > +Shared memory region ID 0 is called the DAX window.  Drivers map 
> > > > > > this shared
> > > > > > +memory region with writeback caching as if it were regular RAM.  
> > > > > > The contents
> > > > > > +of the DAX window are undefined unless a mapping exists for that 
> > > > > > range.  
> > > > > 
> > > > > > This last paragraph is a bit concerning from an s390x perspective. In
> > > > > > case of a PCI transport the shared memory region is a chunk of PCI
> > > > > > memory (and must be contained within the declared bar, as mandated by
> > > > > > commit 855ad7af2bd6).
> > > > > 
> > > > > The PCI architecture on s390x is at the moment such, that PCI memory
> > > > > *can't be accessed like regular RAM* but specialized instructions have
> > > > > > to be used. I've tried to raise concerns about this multiple times. Thus
> > > > > the virtio spec would contradict itself a little (at least on s390x).

I saw a set of new instructions being introduced in the kernel which
seem to do just that, but I obviously don't know the details.

> > > > > 
> > > > > > Of course for virtual zPCI devices we can make this work. But including
> > > > > > this paragraph in the VIRTIO specification would mean if one were to
> > > > > > implement this in HW it would not work for s390.
> > > > > 
> > > > > > I don't have anything better to propose, so I intend to vote yes
> > > > > for this. I just wanted to make sure, we all are aware of the
> > > > > consequences.  
> > > > 
> > > > Thanks.
> > > > 
> > > > Note this is just specifying the way virtiofs uses the existing
> > > > (accepted) shared memory region spec.  You can add a CCW transport of
> > > > that spec to make it appropriate for 390 if needed.
> > > >   
> > > 
> > > On s390x we have both CCW and PCI transport. And that makes things even
> > > more complicated.
> > > 
> > > IMHO specifying that virtiofs uses the existing shared memory
> > > specification like regular RAM conflicts with what is architecturally
> > > possible on s390x when the transport is PCI.  
> > 
> > OK.
> > 
> >   
> > > Because the fact that 

Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-09-25 Thread Michael S. Tsirkin
On Tue, Sep 10, 2019 at 03:31:45PM +0100, Dr. David Alan Gilbert wrote:
> * Halil Pasic (pa...@linux.ibm.com) wrote:
> > On Tue, 10 Sep 2019 14:09:20 +0100
> > "Dr. David Alan Gilbert"  wrote:
> > 
> > > * Halil Pasic (pa...@linux.ibm.com) wrote:
> > > > On Thu, 29 Aug 2019 14:52:06 +0100
> > > > Stefan Hajnoczi  wrote:
> > > > 
> > > > > Describe how shared memory region ID 0 is the DAX window and how
> > > > > FUSE_SETUPMAPPING maps file ranges into the window.
> > > > > 
> > > > > Signed-off-by: Stefan Hajnoczi 
> > > > > ---
> > > > > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> > > > > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > > > > 
> > > > > v8:
> > > > >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> > > > >Window clearer [Cornelia]
> > > > > v7:
> > > > >  * Clarify that the DAX Window is optional and can be used together 
> > > > > with
> > > > >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > > > > v6:
> > > > >  * Document timing side-channel attacks [Michael]
> > > > > ---
> > > > >  virtio-fs.tex | 66 
> > > > > +++
> > > > >  1 file changed, 66 insertions(+)
> > > > > 
> > > > > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > > > > index 1ae17f8..158d066 100644
> > > > > --- a/virtio-fs.tex
> > > > > +++ b/virtio-fs.tex
> > > > > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > > > > Queue}\label{sec:Device Types / F
> > > > >  
> > > > >  The driver MUST anticipate that request queues are processed 
> > > > > concurrently with the hiprio queue.
> > > > >  
> > > > > +\subsubsection{Device Operation: DAX Window}\label{sec:Device Types 
> > > > > / File System Device / Device Operation / Device Operation: DAX 
> > > > > Window}
> > > > > +
> > > > > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between 
> > > > > the
> > > > > +driver-provided buffer and the device.  In cases where data transfer 
> > > > > is
> > > > > +undesirable, the device can map file contents into the DAX window 
> > > > > shared memory
> > > > > +region.  The driver then accesses file contents directly in 
> > > > > device-owned memory
> > > > > +without a data transfer.
> > > > > +
> > > > > +The DAX Window is an alternative mechanism for accessing file 
> > > > > contents.
> > > > > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible 
> > > > > at the
> > > > > +same time.  Providing the DAX Window is optional for devices.  Using 
> > > > > the DAX
> > > > > +Window is optional for drivers.
> > > > > +
> > > > > +Shared memory region ID 0 is called the DAX window.  Drivers map 
> > > > > this shared
> > > > > +memory region with writeback caching as if it were regular RAM.  The 
> > > > > contents
> > > > > +of the DAX window are undefined unless a mapping exists for that 
> > > > > range.
> > > > 
> > > > This last paragraph is a bit concerning from an s390x perspective. In
> > > > case of a PCI transport the shared memory region is a chunk of PCI
> > > > memory (and must be contained within the declared bar, as mandated by
> > > > commit 855ad7af2bd6).
> > > > 
> > > > The PCI architecture on s390x is at the moment such, that PCI memory
> > > > *can't be accessed like regular RAM* but specialized instructions have
> > > > to be used. I've tried to raise concerns about this multiple times. Thus
> > > > the virtio spec would contradict itself a little (at least on s390x).
> > > > 
> > > > Of course for virtual zPCI devices we can make this work. But including
> > > > this paragraph in the VIRTIO specification would mean if one were to
> > > > implement this in HW it would not work for s390.
> > > > 
> > > > I don't have anything better to propose, so I intend to vote yes
> > > > for this. I just wanted to make sure, we all are aware of the
> > > > consequences.
> > > 
> > > Thanks.
> > > 
> > > Note this is just specifying the way virtiofs uses the existing
> > > (accepted) shared memory region spec.  You can add a CCW transport of
> > > that spec to make it appropriate for 390 if needed.
> > > 
> > 
> > On s390x we have both CCW and PCI transport. And that makes things even
> > more complicated.
> > 
> > IMHO specifying that virtiofs uses the existing shared memory
> > specification like regular RAM conflicts with what is architecturally
> > possible on s390x when the transport is PCI.
> 
> OK.
> 
> 
> > Because the fact that this is memory exposed by a PCI device and
> > contained within a bar with the current s390 architecture implies that
> > this memory can not be used as regular RAM but needs to be accessed via
> > specialized instructions (PCI LOAD, PCI STORE). @Pierre: please confirm
> > or disprove me.
> > 
> > Of course both simply not doing DAX window on s390 if transport PCI,
> > or conceptually extending the architecture (for virtual systems) and
> > making it work in the non s390 way is an option.
> >
> > 

Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-09-10 Thread Dr. David Alan Gilbert
* Halil Pasic (pa...@linux.ibm.com) wrote:
> On Tue, 10 Sep 2019 14:09:20 +0100
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Halil Pasic (pa...@linux.ibm.com) wrote:
> > > On Thu, 29 Aug 2019 14:52:06 +0100
> > > Stefan Hajnoczi  wrote:
> > > 
> > > > Describe how shared memory region ID 0 is the DAX window and how
> > > > FUSE_SETUPMAPPING maps file ranges into the window.
> > > > 
> > > > Signed-off-by: Stefan Hajnoczi 
> > > > ---
> > > > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> > > > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > > > 
> > > > v8:
> > > >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> > > >Window clearer [Cornelia]
> > > > v7:
> > > >  * Clarify that the DAX Window is optional and can be used together with
> > > >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > > > v6:
> > > >  * Document timing side-channel attacks [Michael]
> > > > ---
> > > >  virtio-fs.tex | 66 +++
> > > >  1 file changed, 66 insertions(+)
> > > > 
> > > > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > > > index 1ae17f8..158d066 100644
> > > > --- a/virtio-fs.tex
> > > > +++ b/virtio-fs.tex
> > > > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > > > Queue}\label{sec:Device Types / F
> > > >  
> > > >  The driver MUST anticipate that request queues are processed 
> > > > concurrently with the hiprio queue.
> > > >  
> > > > +\subsubsection{Device Operation: DAX Window}\label{sec:Device Types / 
> > > > File System Device / Device Operation / Device Operation: DAX Window}
> > > > +
> > > > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between the
> > > > +driver-provided buffer and the device.  In cases where data transfer is
> > > > +undesirable, the device can map file contents into the DAX window 
> > > > shared memory
> > > > +region.  The driver then accesses file contents directly in 
> > > > device-owned memory
> > > > +without a data transfer.
> > > > +
> > > > +The DAX Window is an alternative mechanism for accessing file contents.
> > > > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible 
> > > > at the
> > > > +same time.  Providing the DAX Window is optional for devices.  Using 
> > > > the DAX
> > > > +Window is optional for drivers.
> > > > +
> > > > +Shared memory region ID 0 is called the DAX window.  Drivers map this 
> > > > shared
> > > > +memory region with writeback caching as if it were regular RAM.  The 
> > > > contents
> > > > +of the DAX window are undefined unless a mapping exists for that range.
> > > 
> > > This last paragraph is a bit concerning from an s390x perspective. In case
> > > of a PCI transport the shared memory region is a chunk of PCI memory (and
> > > must be contained within the declared bar, as mandated by commit
> > > 855ad7af2bd6).
> > > 
> > > The PCI architecture on s390x is at the moment such, that PCI memory
> > > *can't be accessed like regular RAM* but specialized instructions have
> > > to be used. I've tried to raise concerns about this multiple times. Thus
> > > the virtio spec would contradict itself a little (at least on s390x).
> > > 
> > > Of course for virtual zPCI devices we can make this work. But including
> > > this paragraph in the VIRTIO specification would mean if one were to
> > > implement this in HW it would not work for s390.
> > > 
> > > I don't have anything better to propose, so I intend to vote yes
> > > for this. I just wanted to make sure, we all are aware of the
> > > consequences.
> > 
> > Thanks.
> > 
> > Note this is just specifying the way virtiofs uses the existing
> > (accepted) shared memory region spec.  You can add a CCW transport of
> > that spec to make it appropriate for 390 if needed.
> > 
> 
> On s390x we have both CCW and PCI transport. And that makes things even
> more complicated.
> 
> IMHO specifying that virtiofs uses the existing shared memory
> specification like regular RAM conflicts with what is architecturally
> possible on s390x when the transport is PCI.

OK.


> Because the fact that this is memory exposed by a PCI device and
> contained within a bar with the current s390 architecture implies that
> this memory can not be used as regular RAM but needs to be accessed via
> specialized instructions (PCI LOAD, PCI STORE). @Pierre: please confirm
> or disprove me.
> 
> Of course both simply not doing DAX window on s390 if transport PCI,
> or conceptually extending the architecture (for virtual systems) and
> making it work in the non s390 way is an option.
>
> And yes for the CCW transport we can do whatever we want. And I think
> we do want regular RAM for CCW transport, because architecturally there
> is no way a CCW device can expose memory. So we would/will need to build
> something virtual. And if we do, we should do it the way it suits us
> best.

Yes; I'm assuming you'd do whatever is appropriate on CCW and just not
use 

Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-09-10 Thread Halil Pasic
On Tue, 10 Sep 2019 14:09:20 +0100
"Dr. David Alan Gilbert"  wrote:

> * Halil Pasic (pa...@linux.ibm.com) wrote:
> > On Thu, 29 Aug 2019 14:52:06 +0100
> > Stefan Hajnoczi  wrote:
> > 
> > > Describe how shared memory region ID 0 is the DAX window and how
> > > FUSE_SETUPMAPPING maps file ranges into the window.
> > > 
> > > Signed-off-by: Stefan Hajnoczi 
> > > ---
> > > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> > > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > > 
> > > v8:
> > >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> > >Window clearer [Cornelia]
> > > v7:
> > >  * Clarify that the DAX Window is optional and can be used together with
> > >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > > v6:
> > >  * Document timing side-channel attacks [Michael]
> > > ---
> > >  virtio-fs.tex | 66 +++
> > >  1 file changed, 66 insertions(+)
> > > 
> > > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > > index 1ae17f8..158d066 100644
> > > --- a/virtio-fs.tex
> > > +++ b/virtio-fs.tex
> > > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > > Queue}\label{sec:Device Types / F
> > >  
> > >  The driver MUST anticipate that request queues are processed 
> > > concurrently with the hiprio queue.
> > >  
> > > +\subsubsection{Device Operation: DAX Window}\label{sec:Device Types / 
> > > File System Device / Device Operation / Device Operation: DAX Window}
> > > +
> > > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between the
> > > +driver-provided buffer and the device.  In cases where data transfer is
> > > +undesirable, the device can map file contents into the DAX window shared 
> > > memory
> > > +region.  The driver then accesses file contents directly in device-owned 
> > > memory
> > > +without a data transfer.
> > > +
> > > +The DAX Window is an alternative mechanism for accessing file contents.
> > > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible at 
> > > the
> > > +same time.  Providing the DAX Window is optional for devices.  Using the 
> > > DAX
> > > +Window is optional for drivers.
> > > +
> > > +Shared memory region ID 0 is called the DAX window.  Drivers map this 
> > > shared
> > > +memory region with writeback caching as if it were regular RAM.  The 
> > > contents
> > > +of the DAX window are undefined unless a mapping exists for that range.
> > 
> > This last paragraph is a bit concerning from an s390x perspective. In case
> > of a PCI transport the shared memory region is a chunk of PCI memory (and
> > must be contained within the declared bar, as mandated by commit
> > 855ad7af2bd6).
> > 
> > The PCI architecture on s390x is at the moment such, that PCI memory
> > *can't be accessed like regular RAM* but specialized instructions have
> > to be used. I've tried to raise concerns about this multiple times. Thus
> > the virtio spec would contradict itself a little (at least on s390x).
> > 
> > Of course for virtual zPCI devices we can make this work. But including
> > this paragraph in the VIRTIO specification would mean if one were to
> > implement this in HW it would not work for s390.
> > 
> > I don't have anything better to propose, so I intend to vote yes
> > for this. I just wanted to make sure, we all are aware of the
> > consequences.
> 
> Thanks.
> 
> Note this is just specifying the way virtiofs uses the existing
> (accepted) shared memory region spec.  You can add a CCW transport of
> that spec to make it appropriate for 390 if needed.
> 

On s390x we have both CCW and PCI transport. And that makes things even
more complicated.

IMHO specifying that virtiofs uses the existing shared memory
specification like regular RAM conflicts with what is architecturally
possible on s390x when the transport is PCI.

Because the fact that this is memory exposed by a PCI device and
contained within a bar with the current s390 architecture implies that
this memory can not be used as regular RAM but needs to be accessed via
specialized instructions (PCI LOAD, PCI STORE). @Pierre: please confirm
or disprove me.

Of course both simply not doing DAX window on s390 if transport PCI,
or conceptually extending the architecture (for virtual systems) and
making it work in the non s390 way is an option.
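
Purely as an illustration of the first option (not something proposed in
this thread or in the patch), the driver-side decision could be sketched
roughly as below. The patch's driver-normative text already says a driver
SHOULD be prepared to find shared memory region ID 0 absent and fall back
to FUSE_READ/FUSE_WRITE. The function name, the dict of region sizes and
the can_map_as_ram flag are all hypothetical stand-ins for whatever a
real driver and platform would provide.

```python
# Shared memory region ID 0 is the DAX window (from the patch text).
DAX_WINDOW_SHM_ID = 0


def choose_data_path(shared_regions, can_map_as_ram):
    """Pick how file contents are accessed.

    shared_regions: hypothetical dict mapping shared memory region ID to
        its length in bytes, as discovered from the transport.
    can_map_as_ram: hypothetical flag; False models a platform/transport
        combination where the region cannot be accessed like regular RAM.
    Returns "dax" or "fuse-rw".
    """
    window_len = shared_regions.get(DAX_WINDOW_SHM_ID)
    if window_len is None or window_len <= 0:
        # Device does not provide the DAX window: it is optional.
        return "fuse-rw"
    if not can_map_as_ram:
        # Window exists but cannot be used as RAM here: do not use it.
        return "fuse-rw"
    return "dax"
```

With this shape, "not doing DAX window on s390 if transport PCI" would
just mean passing can_map_as_ram=False for that combination; the device
still has to support FUSE_READ and FUSE_WRITE either way.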

And yes for the CCW transport we can do whatever we want. And I think
we do want regular RAM for CCW transport, because architecturally there
is no way a CCW device can expose memory. So we would/will need to build
something virtual. And if we do, we should do it the way it suits us
best.

Regards,
Halil



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-09-10 Thread Dr. David Alan Gilbert
* Halil Pasic (pa...@linux.ibm.com) wrote:
> On Thu, 29 Aug 2019 14:52:06 +0100
> Stefan Hajnoczi  wrote:
> 
> > Describe how shared memory region ID 0 is the DAX window and how
> > FUSE_SETUPMAPPING maps file ranges into the window.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> > The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> > https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> > 
> > v8:
> >  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
> >Window clearer [Cornelia]
> > v7:
> >  * Clarify that the DAX Window is optional and can be used together with
> >FUSE_READ/FUSE_WRITE requests [Cornelia]
> > v6:
> >  * Document timing side-channel attacks [Michael]
> > ---
> >  virtio-fs.tex | 66 +++
> >  1 file changed, 66 insertions(+)
> > 
> > diff --git a/virtio-fs.tex b/virtio-fs.tex
> > index 1ae17f8..158d066 100644
> > --- a/virtio-fs.tex
> > +++ b/virtio-fs.tex
> > @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> > Queue}\label{sec:Device Types / F
> >  
> >  The driver MUST anticipate that request queues are processed concurrently 
> > with the hiprio queue.
> >  
> > +\subsubsection{Device Operation: DAX Window}\label{sec:Device Types / File 
> > System Device / Device Operation / Device Operation: DAX Window}
> > +
> > +FUSE\_READ and FUSE\_WRITE requests transfer file contents between the
> > +driver-provided buffer and the device.  In cases where data transfer is
> > +undesirable, the device can map file contents into the DAX window shared 
> > memory
> > +region.  The driver then accesses file contents directly in device-owned 
> > memory
> > +without a data transfer.
> > +
> > +The DAX Window is an alternative mechanism for accessing file contents.
> > +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible at the
> > +same time.  Providing the DAX Window is optional for devices.  Using the 
> > DAX
> > +Window is optional for drivers.
> > +
> > +Shared memory region ID 0 is called the DAX window.  Drivers map this 
> > shared
> > +memory region with writeback caching as if it were regular RAM.  The 
> > contents
> > +of the DAX window are undefined unless a mapping exists for that range.
> 
> This last paragraph is a bit concerning from an s390x perspective. In case
> of a PCI transport the shared memory region is a chunk of PCI memory (and
> must be contained within the declared bar, as mandated by commit
> 855ad7af2bd6).
> 
> The PCI architecture on s390x is at the moment such, that PCI memory
> *can't be accessed like regular RAM* but specialized instructions have
> to be used. I've tried to raise concerns about this multiple times. Thus
> the virtio spec would contradict itself a little (at least on s390x).
> 
> Of course for virtual zPCI devices we can make this work. But including
> this paragraph in the VIRTIO specification would mean if one were to
> implement this in HW it would not work for s390.
> 
> I don't have anything better to propose, so I intend to vote yes
> for this. I just wanted to make sure, we all are aware of the
> consequences.

Thanks.

Note this is just specifying the way virtiofs uses the existing
(accepted) shared memory region spec.  You can add a CCW transport of
that spec to make it appropriate for 390 if needed.

Dave

> Sorry for bringing this up again so late.
> 
> Adding Christian and Pierre as cc.

> Regards,
> Halil
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-09-10 Thread Halil Pasic
On Thu, 29 Aug 2019 14:52:06 +0100
Stefan Hajnoczi  wrote:

> Describe how shared memory region ID 0 is the DAX window and how
> FUSE_SETUPMAPPING maps file ranges into the window.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
> The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> 
> v8:
>  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
>Window clearer [Cornelia]
> v7:
>  * Clarify that the DAX Window is optional and can be used together with
>FUSE_READ/FUSE_WRITE requests [Cornelia]
> v6:
>  * Document timing side-channel attacks [Michael]
> ---
>  virtio-fs.tex | 66 +++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/virtio-fs.tex b/virtio-fs.tex
> index 1ae17f8..158d066 100644
> --- a/virtio-fs.tex
> +++ b/virtio-fs.tex
> @@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority 
> Queue}\label{sec:Device Types / F
>  
>  The driver MUST anticipate that request queues are processed concurrently 
> with the hiprio queue.
>  
> +\subsubsection{Device Operation: DAX Window}\label{sec:Device Types / File 
> System Device / Device Operation / Device Operation: DAX Window}
> +
> +FUSE\_READ and FUSE\_WRITE requests transfer file contents between the
> +driver-provided buffer and the device.  In cases where data transfer is
> +undesirable, the device can map file contents into the DAX window shared 
> memory
> +region.  The driver then accesses file contents directly in device-owned 
> memory
> +without a data transfer.
> +
> +The DAX Window is an alternative mechanism for accessing file contents.
> +FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible at the
> +same time.  Providing the DAX Window is optional for devices.  Using the DAX
> +Window is optional for drivers.
> +
> +Shared memory region ID 0 is called the DAX window.  Drivers map this shared
> +memory region with writeback caching as if it were regular RAM.  The contents
> +of the DAX window are undefined unless a mapping exists for that range.

This last paragraph is a bit concerning from an s390x perspective. In case
of a PCI transport the shared memory region is a chunk of PCI memory (and
must be contained within the declared bar, as mandated by commit
855ad7af2bd6).

The PCI architecture on s390x is at the moment such, that PCI memory
*can't be accessed like regular RAM* but specialized instructions have
to be used. I've tried to raise concerns about this multiple times. Thus
the virtio spec would contradict itself a little (at least on s390x).

Of course for virtual zPCI devices we can make this work. But including
this paragraph in the VIRTIO specification would mean if one were to
implement this in HW it would not work for s390.

I don't have anything better to propose, so I intend to vote yes
for this. I just wanted to make sure, we all are aware of the
consequences.

Sorry for bringing this up again so late.

Adding Christian and Pierre as cc.

Regards,
Halil






Re: [virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-08-29 Thread Cornelia Huck
On Thu, 29 Aug 2019 14:52:06 +0100
Stefan Hajnoczi  wrote:

> Describe how shared memory region ID 0 is the DAX window and how
> FUSE_SETUPMAPPING maps file ranges into the window.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
> The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
> https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h
> 
> v8:
>  * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
>Window clearer [Cornelia]
> v7:
>  * Clarify that the DAX Window is optional and can be used together with
>FUSE_READ/FUSE_WRITE requests [Cornelia]
> v6:
>  * Document timing side-channel attacks [Michael]
> ---
>  virtio-fs.tex | 66 +++
>  1 file changed, 66 insertions(+)

Reviewed-by: Cornelia Huck 




[virtio-dev] [PATCH v8 2/2] virtio-fs: add DAX window

2019-08-29 Thread Stefan Hajnoczi
Describe how shared memory region ID 0 is the DAX window and how
FUSE_SETUPMAPPING maps file ranges into the window.

Signed-off-by: Stefan Hajnoczi 
---
The FUSE_SETUPMAPPING message is part of the virtio-fs Linux patches:
https://gitlab.com/virtio-fs/linux/blob/virtio-fs/include/uapi/linux/fuse.h

v8:
 * Make language about using both FUSE_READ/FUSE_WRITE and the DAX
   Window clearer [Cornelia]
v7:
 * Clarify that the DAX Window is optional and can be used together with
   FUSE_READ/FUSE_WRITE requests [Cornelia]
v6:
 * Document timing side-channel attacks [Michael]
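
For readers who want a concrete picture of the overlap rules in the DAX
Window text of this patch, here is a rough Python sketch. It is a toy
model under stated assumptions, not the device implementation and not
part of the patch: the names Mapping, DaxWindow, setup_mapping and
remove_mapping are made up for illustration, alignment constraints from
FUSE_INIT and resource exhaustion are ignored, and the real request
layout is whatever the fuse.h linked above defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    window_offset: int  # start offset inside the DAX window
    length: int         # number of bytes mapped
    file_offset: int    # file offset that backs window_offset

    @property
    def end(self):
        return self.window_offset + self.length


class DaxWindow:
    """Toy model of the mapping rules (hypothetical, for illustration)."""

    def __init__(self, size):
        self.size = size
        self.mappings = []  # non-overlapping, sorted by window_offset

    def _punch(self, start, end):
        # Drop [start, end) from existing mappings.  A mapping that is
        # only partially covered is split into one or two smaller ones.
        kept = []
        for m in self.mappings:
            if m.end <= start or m.window_offset >= end:
                kept.append(m)  # no overlap, untouched
                continue
            if m.window_offset < start:  # left remainder survives
                kept.append(Mapping(m.window_offset,
                                    start - m.window_offset,
                                    m.file_offset))
            if m.end > end:  # right remainder survives
                skipped = end - m.window_offset
                kept.append(Mapping(end, m.end - end,
                                    m.file_offset + skipped))
        self.mappings = kept

    def setup_mapping(self, window_offset, length, file_offset):
        # Mappings that go beyond the end of the window are rejected.
        if window_offset < 0 or length <= 0 \
                or window_offset + length > self.size:
            raise ValueError("mapping goes beyond the DAX window")
        # Perfect overlap replaces; partial overlap splits the old one.
        self._punch(window_offset, window_offset + length)
        self.mappings.append(Mapping(window_offset, length, file_offset))
        self.mappings.sort(key=lambda m: m.window_offset)

    def remove_mapping(self, window_offset, length):
        # Partial unmapping also splits the affected mapping.
        self._punch(window_offset, window_offset + length)
```

For example, mapping window range [0, 0x3000) and then mapping
[0x1000, 0x2000) to a different file offset leaves three mappings in
this model: the two outer pieces of the first mapping and the new one in
the middle.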
---
 virtio-fs.tex | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/virtio-fs.tex b/virtio-fs.tex
index 1ae17f8..158d066 100644
--- a/virtio-fs.tex
+++ b/virtio-fs.tex
@@ -179,6 +179,62 @@ \subsubsection{Device Operation: High Priority Queue}\label{sec:Device Types / F
 
 The driver MUST anticipate that request queues are processed concurrently with the hiprio queue.
 
+\subsubsection{Device Operation: DAX Window}\label{sec:Device Types / File System Device / Device Operation / Device Operation: DAX Window}
+
+FUSE\_READ and FUSE\_WRITE requests transfer file contents between the
+driver-provided buffer and the device.  In cases where data transfer is
+undesirable, the device can map file contents into the DAX window shared memory
+region.  The driver then accesses file contents directly in device-owned memory
+without a data transfer.
+
+The DAX Window is an alternative mechanism for accessing file contents.
+FUSE\_READ/FUSE\_WRITE requests and DAX Window accesses are possible at the
+same time.  Providing the DAX Window is optional for devices.  Using the DAX
+Window is optional for drivers.
+
+Shared memory region ID 0 is called the DAX window.  Drivers map this shared
+memory region with writeback caching as if it were regular RAM.  The contents
+of the DAX window are undefined unless a mapping exists for that range.
+
+The driver maps a file range into the DAX window using the FUSE\_SETUPMAPPING
+request.  Alignment constraints for FUSE\_SETUPMAPPING and FUSE\_REMOVEMAPPING
+requests are communicated during FUSE\_INIT negotiation.
+
+When a FUSE\_SETUPMAPPING request perfectly overlaps a previous mapping, the
+previous mapping is replaced.  When a mapping partially overlaps a previous
+mapping, the previous mapping is split into one or two smaller mappings.  When
+a mapping is partially unmapped it is also split into one or two smaller
+mappings.
+
+Establishing new mappings or splitting existing mappings consumes resources.
+If the device runs out of resources the FUSE\_SETUPMAPPING request fails until
+resources are available again following FUSE\_REMOVEMAPPING.
+
+After FUSE\_SETUPMAPPING has completed successfully the file range is
+accessible from the DAX window at the offset provided by the driver in the
+request.  A mapping is removed using the FUSE\_REMOVEMAPPING request.
+
+Data is only guaranteed to be persistent when a FUSE\_FSYNC request is used by
+the device after having been made available by the driver following the write.
+
+\devicenormative{\paragraph}{Device Operation: DAX Window}{Device Types / File System Device / Device Operation / Device Operation: DAX Window}
+
+The device MAY provide the DAX Window to memory-mapped access to file contents.  If present, the DAX Window MUST be shared memory region ID 0.
+
+The device MUST support FUSE\_READ and FUSE\_WRITE requests regardless of whether the DAX Window is being used or not.
+
+The device MUST allow mappings that completely or partially overlap existing mappings within the DAX window.
+
+The device MUST reject mappings that would go beyond the end of the DAX window.
+
+\drivernormative{\paragraph}{Device Operation: DAX Window}{Device Types / File System Device / Device Operation / Device Operation: DAX Window}
+
+The driver SHOULD be prepared to find shared memory region ID 0 absent and fall back to FUSE\_READ and FUSE\_WRITE requests.
+
+The driver MAY use both FUSE\_READ/FUSE\_WRITE requests and the DAX Window to access file contents.
+
+The driver MUST NOT access DAX window areas that have not been mapped.
+
 \subsubsection{Security Considerations}\label{sec:Device Types / File System Device / Security Considerations}
 
 The device provides access to a file system containing files owned by one or
@@ -207,6 +263,16 @@ \subsubsection{Security Considerations}\label{sec:Device Types / File System Dev
 virtio-fs.  They are typically managed at the file system administration level
 by providing shared access only to mutually trusted users.
 
+Multiple machines sharing access to a file system are susceptible to timing
+side-channel attacks.  By measuring the latency of accesses to file contents or
+file system metadata it is possible to infer whether other machines also
+accessed the same information.  Short latencies indicate that the information
+was cached due to a previous access.  This can reveal