Re: [Devel] [PATCH vz7] fuse: relax i_mutex coverage in fuse_fsync

Maxim Patlasov Wed, 30 Nov 2016 10:05:17 -0800

Alexey,

You're right. And while composing the patch I well understood that it'spossible to rework fuse_sync_writes() using a counter instead ofnegative bias. But the problem with flush_mtime still exists anyway.Think about it: we firstly acquire local mtime from local inode, thenfill and submit mtime-update-request. Since then, we don't know whenexactly fuse daemon will apply that new mtime to its metadatastructures. If another mtime-update is generated in-between (e.g. "touch-d <date> file", or even simplier -- just a single direct writeimplicitly updating mtime), we wouldn't know which of those twomtime-update-requests are processed by fused first. That comes from ageneral FUSE protocol limitation: when kernel fuse queues request A,then request B, it cannot be sure if they will be processed by userspaceas <A, then B> or <B, then A>.

The big advantage of the patch I sent is that it's very simple,straightforward and presumably will remove 99% of contention betweenfsync and io_submit (assuming we spend most of time waiting foruserspace ACK for FUSE_FSYNC request. There are actually three questionsto answer:

1) Do we really must honor a crazy app who mixes a lot of fsyncs with alot of io_submits? The goal of fsync is to ensure that some state isactually went to platters. An app who races io_submit-s with fsync-sactually doesn't care which state will come to platters. I'm not surethat it's reasonable to work very hard to achieve the best possibleperformance for such a marginal app.

2) Will the patch (in the form I sent it) break something? I think no.If you know some usecase that can be broken, let's discuss it in moredetails.

3) Should we expect some noticeable (or significant) improvement inperformance comparing fuse_fsync with no locking at all vs. the lockingwe have with that patch applied? I tend to think that the answer is "no"because handling FUSE_FSYNC is notoriously heavy-weight operation. Ifyou disagree, let's firstly measure that difference in performance(simply commenting out lock/unlock(i_mutex) in fuse_fsync) and thenstart to think if it's really worthy to fully re-work locking scheme topreserve flush_mtime correctness w/o i_mutex.



Thanks,

Maxim


On 11/30/2016 05:09 AM, Alexey Kuznetsov wrote:

Sorry, missed that pair fuse_set_nowrite/fuse_release_writes
can be done only under i_mutex.

IMHO it is only due to bad implementation.
If fuse_set_nowrite would be done with separate
count instead of adding negative bias, it would
be possible.


On Wed, Nov 30, 2016 at 3:47 PM, Alexey Kuznetsov <kuz...@virtuozzo.com> wrote:

Hello!

I do not think you got it right.

i_mutex in fsync is not about some atomicity,
it is about stopping data feed while fsync is executed
to prevent livelock.

I cannot tell anything about mtime update, it is just some voodoo
magic for me.

What's about fsync semantics, I see two different ways:

A.

1. Remove useless write_inode_now. Its work is done
     by filemap_write_and_wait_range(), there is no need to repeat it
under mutex.
2. move mutex_lock _after_  fuse_sync_writes(), which is essentially
     fuse continuation forfilemap_write_and_wait_range().
3. i_mutex is preserved only around fsync call.

B.
1. Remove  write_inode_now as well.
2. Remove i_mutex _completely_. (No idea about mtime voodo though)
2. Replace fuse_sync_writes() with fuse_set_nowrite()
     and add release after call to FSYNC.

Both prevent livelock. B is obviosly optimal.

But A preserves historic fuse protocol semantics.
F.e. I have no idea would user space survive truncate
racing with fsync. pstorage should survice, though this
path was never tested.





On Wed, Nov 30, 2016 at 4:02 AM, Maxim Patlasov <mpatla...@virtuozzo.com> wrote:

fuse_fsync_common() does need i_mutex for fuse_sync_writes() and
fuse_flush_mtime(). But when those operations are done, it's actually
doesn't matter whether to hold the lock over fuse_request_send(FUSE_FSYNC)
or not: we ensured that all relevant data were already seen by
userspace fuse daemon, and so it will sync them (by handling FUSE_FSYNC)
anyway; if the user screws up by leaking new data updates in-between, it
is up to the user and doesn't violate fsync(2) semantics.

https://jira.sw.ru/browse/PSBM-55919

Signed-off-by: Maxim Patlasov <mpatla...@virtuozzo.com>
---
  fs/fuse/file.c |    3 +++
  1 file changed, 3 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 464b2f5..559dfd9 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -697,6 +697,8 @@ int fuse_fsync_common(struct file *file, loff_t start, 
loff_t end,
                 goto out;
         }

+       mutex_unlock(&inode->i_mutex);
+
         memset(&inarg, 0, sizeof(inarg));
         inarg.fh = ff->fh;
         inarg.fsync_flags = datasync ? 1 : 0;
@@ -715,6 +717,7 @@ int fuse_fsync_common(struct file *file, loff_t start, 
loff_t end,
                         fc->no_fsync = 1;
                 err = 0;
         }
+       return err;
  out:
         mutex_unlock(&inode->i_mutex);
         return err;


_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel

Re: [Devel] [PATCH vz7] fuse: relax i_mutex coverage in fuse_fsync

Reply via email to