On Friday, May 1, 2020 at 1:31:00 AM UTC+3, Waldek Kozaczuk wrote:
>
>
>
> On Thu, Apr 30, 2020 at 6:19 PM Fotis Xenakis <fo...@windowslive.com> wrote:
>
>> Indeed, QEMU 5.0 does not support DAX, and the virtiofsd in QEMU 5.0 won't 
>> accept any version other than 7.31, as I see here 
>> <https://github.com/qemu/qemu/blob/27c94566379069fb8930bb1433dcffbf7df3203d/tools/virtiofsd/fuse_lowlevel.c#L1920>, 
>> thus the mount fails.
>> Both on the QEMU and the Linux side, DAX is not close to upstreaming yet. 
>> Although it seems no longer marked as "experimental" here 
>> <https://virtio-fs.gitlab.io/howto-qemu.html>, I think it's still under 
>> development (*not* verified with the devs) and that's the source for 
>> some instability.
>>
>> To summarize:
>>
>>    - Upstream QEMU 5.0 includes stable virtio-fs support, with the basic 
>>    feature set. It negotiates *FUSE 7.31* (latest in upstream Linux).
>>    - Downstream virtio-fs QEMU <https://gitlab.com/virtio-fs/qemu> 
>>    currently contains:
>>       - The default (thus recommended in the docs 
>>       <https://virtio-fs.gitlab.io/howto-qemu.html>) virtio-fs 
>>       <https://gitlab.com/virtio-fs/qemu/-/tree/virtio-fs> branch. This 
>>       negotiates *FUSE 7.27* and supports DAX. This is the one I have 
>>       based my patches upon, because it is the most stable *with DAX 
>>       support*.
>>       - The development branches, virtio-dev 
>>       <https://gitlab.com/virtio-fs/qemu/-/tree/virtio-dev> and 
>>       virtio-fs-dev 
>>       <https://gitlab.com/virtio-fs/qemu/-/tree/virtio-fs-dev> (don't 
>>       know what distinguishes them TBH). They both negotiate *FUSE 7.31* 
>>       and support DAX (with changed protocol details). These iterate 
>>       quickly, so I haven't used them.
>>    
>> I hadn't anticipated this hard constraint upstream, which poses a 
>> problem, since I guess we want to be compatible with it.
>> My plan is to reach out to the virtio-fs devs, asking for the status of 
>> DAX in the dev branches. If they deem it stabilized, I will probably try to 
>> go with those, offering upstream compatibility *and* DAX.
>> Otherwise, we could have a hybrid approach, compatible with upstream for 
>> the stable features, but following the more stale "virtio-fs" downstream 
>> branch as far as DAX is concerned.
>> What do you think?
>>
> I am not sure I 100% understand what you are proposing. Adding some kind 
> of negotiating logic on OSv side that will be able to deal with both 27 and 
> 31 and "advertise" accordingly? Can we simply send 31 if there is no DAX 
> window detected in the driver layer and 27 otherwise?
>

> I guess for now we could just keep this header as per 7.31 and 
> add FUSE_SETUPMAPPING and FUSE_REMOVEMAPPING to our header, no?
>
This is the "hybrid" approach I was thinking of above and the one I will go 
with for now.
Also, I will contact the virtio-fs devs for insight on how the project will 
evolve in the near future.
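
To make the hybrid approach concrete, here is a minimal sketch in C. It assumes the header keeps advertising 7.31 (what stock QEMU 5.0 accepts), carries the DAX opcodes and request layout from the quoted patch, and only issues mapping requests when the driver has detected a DAX window. The names virtiofs_dev_info, virtiofs_use_dax and virtiofs_prepare_mapping are hypothetical and not existing OSv code. Also note that the development branches are said to support DAX "with changed protocol details", so this layout is only known to match the downstream "virtio-fs" branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Version advertised in FUSE_INIT: 7.31, as required by stock QEMU 5.0. */
#define FUSE_KERNEL_VERSION       7
#define FUSE_KERNEL_MINOR_VERSION 31

/* DAX opcodes, with the values used in the quoted patch. */
enum fuse_dax_opcode {
    FUSE_SETUPMAPPING  = 48,
    FUSE_REMOVEMAPPING = 49,
};

/* Request layout as in the quoted patch. */
struct fuse_setupmapping_in {
    uint64_t fh;      /* an already open handle */
    uint64_t foffset; /* offset into the file to start the mapping */
    uint64_t len;     /* length of mapping required */
    uint64_t flags;   /* FUSE_SETUPMAPPING_FLAG_* */
    uint64_t moffset; /* memory offset into the DAX window */
};

/* Hypothetical driver state: DAX window length, 0 if none was detected. */
struct virtiofs_dev_info {
    uint64_t dax_window_len;
};

/* Hypothetical helper: DAX is usable only if the device exposes a window. */
static bool virtiofs_use_dax(const struct virtiofs_dev_info *dev)
{
    return dev->dax_window_len > 0;
}

/*
 * Hypothetical helper: fill a FUSE_SETUPMAPPING request.
 * Returns false when there is no DAX window or the requested range
 * does not fit inside it, so the caller can fall back to FUSE_READ.
 */
static bool virtiofs_prepare_mapping(const struct virtiofs_dev_info *dev,
                                     uint64_t fh, uint64_t foffset,
                                     uint64_t len, uint64_t moffset,
                                     struct fuse_setupmapping_in *out)
{
    if (!virtiofs_use_dax(dev) || len == 0 ||
        moffset > dev->dax_window_len ||
        len > dev->dax_window_len - moffset) {
        return false;
    }
    out->fh = fh;
    out->foffset = foffset;
    out->len = len;
    out->flags = 0; /* FUSE_SETUPMAPPING_FLAG_WRITE not set */
    out->moffset = moffset;
    return true;
}
```

With stock QEMU 5.0 no DAX window is detected, so virtiofs_use_dax() returns false and only the stable 7.31 feature set is exercised.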

>
> Meanwhile I will roll back this particular patch to make OSv work with 
> stock QEMU and virtiofsd. 
>
Absolutely, this makes sense. 
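
For reference, the failure in the virtiofsd log quoted below (INIT: 7.27 answered with error -71) is consistent with a minimum-version check on the server side. The following is a simplified sketch of such a check, not the actual virtiofsd code; check_init_version, MIN_MAJOR and MIN_MINOR are names made up for illustration, and it assumes that anything older than 7.31 is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed minimum protocol version accepted by the server (7.31). */
#define MIN_MAJOR 7
#define MIN_MINOR 31

/*
 * Hypothetical check: returns 0 if the version sent by the guest in
 * FUSE_INIT is acceptable, or -EPROTO ("Protocol error") otherwise.
 */
static int check_init_version(uint32_t major, uint32_t minor)
{
    if (major < MIN_MAJOR || (major == MIN_MAJOR && minor < MIN_MINOR)) {
        return -EPROTO;
    }
    return 0;
}
```

On Linux, EPROTO is 71, which matches the "error: -71 (Protocol error)" line in the log. With the header rolled back to 7.31, check_init_version(7, 31) returns 0 and the mount can proceed.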

>
>> On Wednesday, April 29, 2020 at 7:48:02 PM UTC+3, Waldek Kozaczuk wrote:
>>>
>>> On Monday, April 20, 2020 at 5:04:27 PM UTC-4, Fotis Xenakis wrote:
>>>>
>>>> Copy from virtiofsd @ 32006c66f2578af4121d7effaccae4aa4fa12e46. This 
>>>> includes the definitions for FUSE_SETUPMAPPING AND FUSE_REMOVEMAPPING. 
>>>>
>>>> Signed-off-by: Fotis Xenakis <fo...@windowslive.com> 
>>>> --- 
>>>>  fs/virtiofs/fuse_kernel.h | 82 ++++++++++++++++++--------------------- 
>>>>  1 file changed, 38 insertions(+), 44 deletions(-) 
>>>>
>>>> diff --git a/fs/virtiofs/fuse_kernel.h b/fs/virtiofs/fuse_kernel.h 
>>>> index 018a00a2..ce46046a 100644 
>>>> --- a/fs/virtiofs/fuse_kernel.h 
>>>> +++ b/fs/virtiofs/fuse_kernel.h 
>>>> @@ -44,7 +44,6 @@ 
>>>>   *  - add lock_owner field to fuse_setattr_in, fuse_read_in and 
>>>> fuse_write_in 
>>>>   *  - add blksize field to fuse_attr 
>>>>   *  - add file flags field to fuse_read_in and fuse_write_in 
>>>> - *  - Add ATIME_NOW and MTIME_NOW flags to fuse_setattr_in 
>>>>   * 
>>>>   * 7.10 
>>>>   *  - add nonseekable open flag 
>>>> @@ -55,7 +54,7 @@ 
>>>>   *  - add POLL message and NOTIFY_POLL notification 
>>>>   * 
>>>>   * 7.12 
>>>> - *  - add umask flag to input argument of create, mknod and mkdir 
>>>> + *  - add umask flag to input argument of open, mknod and mkdir 
>>>>   *  - add notification messages for invalidation of inodes and 
>>>>   *    directory entries 
>>>>   * 
>>>> @@ -120,19 +119,6 @@ 
>>>>   * 
>>>>   *  7.28 
>>>>   *  - add FUSE_COPY_FILE_RANGE 
>>>> - *  - add FOPEN_CACHE_DIR 
>>>> - *  - add FUSE_MAX_PAGES, add max_pages to init_out 
>>>> - *  - add FUSE_CACHE_SYMLINKS 
>>>> - * 
>>>> - *  7.29 
>>>> - *  - add FUSE_NO_OPENDIR_SUPPORT flag 
>>>> - * 
>>>> - *  7.30 
>>>> - *  - add FUSE_EXPLICIT_INVAL_DATA 
>>>> - *  - add FUSE_IOCTL_COMPAT_X32 
>>>> - * 
>>>> - *  7.31 
>>>> - *  - add FUSE_WRITE_KILL_PRIV flag 
>>>>   */ 
>>>>   
>>>>  #ifndef _LINUX_FUSE_H 
>>>> @@ -168,7 +154,7 @@ 
>>>>  #define FUSE_KERNEL_VERSION 7 
>>>>   
>>>>  /** Minor version number of this interface */ 
>>>> -#define FUSE_KERNEL_MINOR_VERSION 31 
>>>> +#define FUSE_KERNEL_MINOR_VERSION 27 
>>>>
>>> I have applied this patch but when I started testing your later patches 
>>> that enable DAX logic I would get error messages about the wrong protocol 
>>> version:
>>>
>>> OSv v0.54.0-179-g2f92fc91
>>> 4 CPUs detected
>>> Firmware vendor: SeaBIOS
>>> bsd: initializing - done
>>> VFS: mounting ramfs at /
>>> VFS: mounting devfs at /dev
>>> net: initializing - done
>>> vga: Add VGA device instance
>>> eth0: ethernet address: 52:54:00:12:34:56
>>> virtio-blk: Add blk device instances 0 as vblk0, devsize=6470656
>>> random: virtio-rng registered as a source.
>>> virtio-fs: Detected device with tag: [myfs] and num_queues: 1
>>> virtio-fs: Detected DAX window with length 67108864
>>> virtio-fs: Add device instance 0 as [virtiofs1]
>>> random: intel drng, rdrand registered as a source.
>>> random: <Software, Yarrow> initialized
>>> VFS: unmounting /dev
>>> VFS: mounting rofs at /rofs
>>> VFS: mounting devfs at /dev
>>> VFS: mounting procfs at /proc
>>> VFS: mounting sysfs at /sys
>>> VFS: mounting ramfs at /tmp
>>> VFS: mounting virtiofs at /virtiofs
>>> [virtiofs] Failed to initialize fuse filesystem!
>>> failed to mount virtiofs, error = Protocol error
>>> [I/43 dhcp]: Broadcasting DHCPDISCOVER message with xid: [1603537588]
>>> [I/43 dhcp]: Waiting for IP...
>>> [I/55 dhcp]: Received DHCPOFFER message from DHCP server: 192.168.122.1 
>>> regarding offerred IP address: 192.168.122.15
>>> [I/55 dhcp]: Broadcasting DHCPREQUEST message with xid: [1603537588] to 
>>> SELECT offered IP: 192.168.122.15
>>> [I/55 dhcp]: Received DHCPACK message from DHCP server: 192.168.122.1 
>>> regarding offerred IP address: 192.168.122.15
>>> [I/55 dhcp]: Server acknowledged IP 192.168.122.15 for interface eth0 
>>> with time to lease in seconds: 86400
>>> eth0: 192.168.122.15
>>> [I/55 dhcp]: Configuring eth0: ip 192.168.122.15 subnet mask 
>>> 255.255.255.0 gateway 192.168.122.1 MTU 1500
>>> Booted up in 140.48 ms
>>> Cmdline: /virtiofs/hello
>>> Failed to load object: /virtiofs/hello. Powering off.
>>>
>>> # and from virtiofsd
>>> [7426562093843] [ID: 00000008] INIT: 7.27
>>> [7426562097664] [ID: 00000008] flags=0x00000000
>>> [7426562100498] [ID: 00000008] max_readahead=0x00001000
>>> [7426562104503] [ID: 00000008] fuse: unsupported protocol version: 7.27
>>> [7426562119457] [ID: 00000008]    unique: 1, error: -71 (Protocol 
>>> error), outsize: 16
>>> [7426562125006] [ID: 00000008] virtio_send_msg: elem 0: with 2 in desc 
>>> of length 80
>>> [7426577096593] [ID: 00000001] virtio_loop: Got VU event
>>>
>>> This happens both when I use stock QEMU 5.0 (released just a couple of 
>>> days ago, and which does not seem to have DAX support yet) and when I use 
>>> the QEMU version from 
>>> https://gitlab.com/virtio-fs/qemu/-/commits/virtio-dev (the virtio-dev 
>>> branch).
>>>
>>> I had to bump the version to 31 and then it works. Could you please 
>>> investigate?
>>>
>>> Waldek
>>>  
>>>
>>>>   
>>>>  /** The node ID of the root inode */ 
>>>>  #define FUSE_ROOT_ID 1 
>>>> @@ -236,14 +222,10 @@ struct fuse_file_lock { 
>>>>   * FOPEN_DIRECT_IO: bypass page cache for this open file 
>>>>   * FOPEN_KEEP_CACHE: don't invalidate the data cache on open 
>>>>   * FOPEN_NONSEEKABLE: the file is not seekable 
>>>> - * FOPEN_CACHE_DIR: allow caching this directory 
>>>> - * FOPEN_STREAM: the file is stream-like (no file position at all) 
>>>>   */ 
>>>>  #define FOPEN_DIRECT_IO                (1 << 0) 
>>>>  #define FOPEN_KEEP_CACHE        (1 << 1) 
>>>>  #define FOPEN_NONSEEKABLE        (1 << 2) 
>>>> -#define FOPEN_CACHE_DIR                (1 << 3) 
>>>> -#define FOPEN_STREAM                (1 << 4) 
>>>>   
>>>>  /** 
>>>>   * INIT request/reply flags 
>>>> @@ -270,10 +252,6 @@ struct fuse_file_lock { 
>>>>   * FUSE_HANDLE_KILLPRIV: fs handles killing suid/sgid/cap on 
>>>> write/chown/trunc 
>>>>   * FUSE_POSIX_ACL: filesystem supports posix acls 
>>>>   * FUSE_ABORT_ERROR: reading the device after abort returns 
>>>> ECONNABORTED 
>>>> - * FUSE_MAX_PAGES: init_out.max_pages contains the max number of req 
>>>> pages 
>>>> - * FUSE_CACHE_SYMLINKS: cache READLINK responses 
>>>> - * FUSE_NO_OPENDIR_SUPPORT: kernel supports zero-message opendir 
>>>> - * FUSE_EXPLICIT_INVAL_DATA: only invalidate cached pages on explicit 
>>>> request 
>>>>   */ 
>>>>  #define FUSE_ASYNC_READ                (1 << 0) 
>>>>  #define FUSE_POSIX_LOCKS        (1 << 1) 
>>>> @@ -297,10 +275,6 @@ struct fuse_file_lock { 
>>>>  #define FUSE_HANDLE_KILLPRIV        (1 << 19) 
>>>>  #define FUSE_POSIX_ACL                (1 << 20) 
>>>>  #define FUSE_ABORT_ERROR        (1 << 21) 
>>>> -#define FUSE_MAX_PAGES                (1 << 22) 
>>>> -#define FUSE_CACHE_SYMLINKS        (1 << 23) 
>>>> -#define FUSE_NO_OPENDIR_SUPPORT (1 << 24) 
>>>> -#define FUSE_EXPLICIT_INVAL_DATA (1 << 25) 
>>>>   
>>>>  /** 
>>>>   * CUSE INIT request/reply flags 
>>>> @@ -330,11 +304,9 @@ struct fuse_file_lock { 
>>>>   * 
>>>>   * FUSE_WRITE_CACHE: delayed write from page cache, file handle is 
>>>> guessed 
>>>>   * FUSE_WRITE_LOCKOWNER: lock_owner field is valid 
>>>> - * FUSE_WRITE_KILL_PRIV: kill suid and sgid bits 
>>>>   */ 
>>>>  #define FUSE_WRITE_CACHE        (1 << 0) 
>>>>  #define FUSE_WRITE_LOCKOWNER        (1 << 1) 
>>>> -#define FUSE_WRITE_KILL_PRIV        (1 << 2) 
>>>>   
>>>>  /** 
>>>>   * Read flags 
>>>> @@ -349,7 +321,6 @@ struct fuse_file_lock { 
>>>>   * FUSE_IOCTL_RETRY: retry with new iovecs 
>>>>   * FUSE_IOCTL_32BIT: 32bit ioctl 
>>>>   * FUSE_IOCTL_DIR: is a directory 
>>>> - * FUSE_IOCTL_COMPAT_X32: x32 compat ioctl on 64bit machine (64bit 
>>>> time_t) 
>>>>   * 
>>>>   * FUSE_IOCTL_MAX_IOV: maximum of in_iovecs + out_iovecs 
>>>>   */ 
>>>> @@ -358,7 +329,6 @@ struct fuse_file_lock { 
>>>>  #define FUSE_IOCTL_RETRY        (1 << 2) 
>>>>  #define FUSE_IOCTL_32BIT        (1 << 3) 
>>>>  #define FUSE_IOCTL_DIR                (1 << 4) 
>>>> -#define FUSE_IOCTL_COMPAT_X32        (1 << 5) 
>>>>   
>>>>  #define FUSE_IOCTL_MAX_IOV        256 
>>>>   
>>>> @@ -369,13 +339,6 @@ struct fuse_file_lock { 
>>>>   */ 
>>>>  #define FUSE_POLL_SCHEDULE_NOTIFY (1 << 0) 
>>>>   
>>>> -/** 
>>>> - * Fsync flags 
>>>> - * 
>>>> - * FUSE_FSYNC_FDATASYNC: Sync data only, not metadata 
>>>> - */ 
>>>> -#define FUSE_FSYNC_FDATASYNC        (1 << 0) 
>>>> - 
>>>>  enum fuse_opcode { 
>>>>          FUSE_LOOKUP                = 1, 
>>>>          FUSE_FORGET                = 2,  /* no reply */ 
>>>> @@ -422,9 +385,11 @@ enum fuse_opcode { 
>>>>          FUSE_RENAME2                = 45, 
>>>>          FUSE_LSEEK                = 46, 
>>>>          FUSE_COPY_FILE_RANGE        = 47, 
>>>> +        FUSE_SETUPMAPPING       = 48, 
>>>> +        FUSE_REMOVEMAPPING      = 49, 
>>>>   
>>>>          /* CUSE specific operations */ 
>>>> -        CUSE_INIT                = 4096 
>>>> +        CUSE_INIT                = 4096, 
>>>>  }; 
>>>>   
>>>>  enum fuse_notify_code { 
>>>> @@ -434,7 +399,7 @@ enum fuse_notify_code { 
>>>>          FUSE_NOTIFY_STORE = 4, 
>>>>          FUSE_NOTIFY_RETRIEVE = 5, 
>>>>          FUSE_NOTIFY_DELETE = 6, 
>>>> -        FUSE_NOTIFY_CODE_MAX 
>>>> +        FUSE_NOTIFY_CODE_MAX, 
>>>>  }; 
>>>>   
>>>>  /* The read buffer is required to be at least 8k, but may be much 
>>>> larger */ 
>>>> @@ -651,9 +616,7 @@ struct fuse_init_out { 
>>>>          uint16_t        congestion_threshold; 
>>>>          uint32_t        max_write; 
>>>>          uint32_t        time_gran; 
>>>> -        uint16_t        max_pages; 
>>>> -        uint16_t        padding; 
>>>> -        uint32_t        unused[8]; 
>>>> +        uint32_t        unused[9]; 
>>>>  }; 
>>>>   
>>>>  #define CUSE_INIT_INFO_MAX 4096 
>>>> @@ -845,4 +808,35 @@ struct fuse_copy_file_range_in { 
>>>>          uint64_t        flags; 
>>>>  }; 
>>>>   
>>>> +#define FUSE_SETUPMAPPING_ENTRIES 8 
>>>> +#define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0) 
>>>> +struct fuse_setupmapping_in { 
>>>> +        /* An already open handle */ 
>>>> +        uint64_t        fh; 
>>>> +        /* Offset into the file to start the mapping */ 
>>>> +        uint64_t        foffset; 
>>>> +        /* Length of mapping required */ 
>>>> +        uint64_t        len; 
>>>> +        /* Flags, FUSE_SETUPMAPPING_FLAG_* */ 
>>>> +        uint64_t        flags; 
>>>> +        /* memory offset in to dax window */ 
>>>> +        uint64_t        moffset; 
>>>> +}; 
>>>> + 
>>>> +struct fuse_setupmapping_out { 
>>>> +        /* Offsets into the cache of mappings */ 
>>>> +        uint64_t        coffset[FUSE_SETUPMAPPING_ENTRIES]; 
>>>> +        /* Lengths of each mapping */ 
>>>> +        uint64_t        len[FUSE_SETUPMAPPING_ENTRIES]; 
>>>> +}; 
>>>> + 
>>>> +struct fuse_removemapping_in { 
>>>> +        /* An already open handle */ 
>>>> +        uint64_t        fh; 
>>>> +        /* Offset into the dax to start the unmapping */ 
>>>> +        uint64_t        moffset; 
>>>> +        /* Length of mapping required */ 
>>>> +        uint64_t        len; 
>>>> +}; 
>>>> + 
>>>>  #endif /* _LINUX_FUSE_H */ 
>>>> -- 
>>>> 2.26.1 
>>>>
>>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to osv...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/osv-dev/99fdd9fc-e1e6-48c9-9b3f-a50965d22654%40googlegroups.com.
>>
>
