Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Tue, Mar 15, 2011 at 12:30 PM, Aneesh Kumar K. V wrote: > On Tue, 15 Mar 2011 11:11:46 +, Stefan Hajnoczi > wrote: >> On Tue, Mar 15, 2011 at 9:19 AM, Aneesh Kumar K. V >> wrote: >> > On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi >> > wrote: >> >> On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V >> >> wrote: >> >> > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi >> >> > wrote: >> >> >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V >> >> >> wrote: >> >> >> > cache=none implies the file are opened in the host with O_SYNC open >> >> >> > flag >> >> >> >> >> >> O_SYNC does not bypass the host page cache. It ensures that writes >> >> >> only complete once data has been written to the disk. >> >> >> >> >> >> O_DIRECT is a hint to bypass the host page cache when possible. >> >> >> >> >> >> A boolean on|off option would be nicer than an option that takes the >> >> >> special string "none". For example, direct=on|off. It also makes the >> >> >> code nicer by using bools instead of strdup strings that get leaked. >> >> >> >> >> > >> >> > What i wanted is the O_SYNC behavior. Well the comment should be >> >> > updated. I >> >> > want to make sure that we don't have dirty data in host page cache after >> >> > a write. It is always good to make read hit the page cache >> >> >> >> Why silently enforce O_SYNC on the server side? The client does not >> >> know whether or not O_SYNC is in effect, cannot take advantage of that >> >> knowledge, and cannot control it. >> >> >> >> I think a more useful solution is a 9p client mount option called >> >> "sync" that caused the client to always add O_SYNC and skip syncfs. >> >> The whole stack becomes aware of O_SYNC and clients are in control >> >> over whether or not they need O_SYNC semantics. >> > >> > The cache=none specifically enables us to ignore the tsyncfs request on >> > host. tsyncfs on host can be really slow in certain setup. >> >> If I'm a client with the "sync" mount option all my fids are O_SYNC >> and I do not need to send TSYNCFS requests to the server because my >> fids are already stable. > > Having sync mount option is useful, Infact for dotu we already default > O_SYNC on the client side because we don't have tsyncfs. But being able > to avoid the tfsyncfs flush from the server point of view also is > nice. Consider a setup where one doesn't have control on the guest > mount option but can control the qemu export options. If the guest mount is sync it should not send syncfs because all its files are opened O_SYNC and do not need syncing. Toggling this option on the host side has no effect either way. If the guest mount is not sync then setting the option on the host side overrides the guest and forces O_SYNC. The result will likely be slower writes but nop syncfs. In both cases, the guest (and the application) is where O_SYNC matters from a data integrity perspective. The guest always has control over whether it opens files with O_SYNC. So this host side control only enables you to force O_SYNC when the guest doesn't expect it, which does not seem like a useful feature to me. What does this buy us? Stefan
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Tue, 15 Mar 2011 11:11:46 +, Stefan Hajnoczi wrote: > On Tue, Mar 15, 2011 at 9:19 AM, Aneesh Kumar K. V > wrote: > > On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi > > wrote: > >> On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V > >> wrote: > >> > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi > >> > wrote: > >> >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V > >> >> wrote: > >> >> > cache=none implies the file are opened in the host with O_SYNC open > >> >> > flag > >> >> > >> >> O_SYNC does not bypass the host page cache. It ensures that writes > >> >> only complete once data has been written to the disk. > >> >> > >> >> O_DIRECT is a hint to bypass the host page cache when possible. > >> >> > >> >> A boolean on|off option would be nicer than an option that takes the > >> >> special string "none". For example, direct=on|off. It also makes the > >> >> code nicer by using bools instead of strdup strings that get leaked. > >> >> > >> > > >> > What i wanted is the O_SYNC behavior. Well the comment should be > >> > updated. I > >> > want to make sure that we don't have dirty data in host page cache after > >> > a write. It is always good to make read hit the page cache > >> > >> Why silently enforce O_SYNC on the server side? The client does not > >> know whether or not O_SYNC is in effect, cannot take advantage of that > >> knowledge, and cannot control it. > >> > >> I think a more useful solution is a 9p client mount option called > >> "sync" that caused the client to always add O_SYNC and skip syncfs. > >> The whole stack becomes aware of O_SYNC and clients are in control > >> over whether or not they need O_SYNC semantics. > > > > The cache=none specifically enables us to ignore the tsyncfs request on > > host. tsyncfs on host can be really slow in certain setup. > > If I'm a client with the "sync" mount option all my fids are O_SYNC > and I do not need to send TSYNCFS requests to the server because my > fids are already stable. Having sync mount option is useful, Infact for dotu we already default O_SYNC on the client side because we don't have tsyncfs. But being able to avoid the tfsyncfs flush from the server point of view also is nice. Consider a setup where one doesn't have control on the guest mount option but can control the qemu export options. -aneesh
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Tue, Mar 15, 2011 at 9:19 AM, Aneesh Kumar K. V wrote: > On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi > wrote: >> On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V >> wrote: >> > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi >> > wrote: >> >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V >> >> wrote: >> >> > cache=none implies the file are opened in the host with O_SYNC open flag >> >> >> >> O_SYNC does not bypass the host page cache. It ensures that writes >> >> only complete once data has been written to the disk. >> >> >> >> O_DIRECT is a hint to bypass the host page cache when possible. >> >> >> >> A boolean on|off option would be nicer than an option that takes the >> >> special string "none". For example, direct=on|off. It also makes the >> >> code nicer by using bools instead of strdup strings that get leaked. >> >> >> > >> > What i wanted is the O_SYNC behavior. Well the comment should be updated. I >> > want to make sure that we don't have dirty data in host page cache after >> > a write. It is always good to make read hit the page cache >> >> Why silently enforce O_SYNC on the server side? The client does not >> know whether or not O_SYNC is in effect, cannot take advantage of that >> knowledge, and cannot control it. >> >> I think a more useful solution is a 9p client mount option called >> "sync" that caused the client to always add O_SYNC and skip syncfs. >> The whole stack becomes aware of O_SYNC and clients are in control >> over whether or not they need O_SYNC semantics. > > The cache=none specifically enables us to ignore the tsyncfs request on > host. tsyncfs on host can be really slow in certain setup. If I'm a client with the "sync" mount option all my fids are O_SYNC and I do not need to send TSYNCFS requests to the server because my fids are already stable. Stefan
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi wrote: > On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V > wrote: > > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi > > wrote: > >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V > >> wrote: > >> > cache=none implies the file are opened in the host with O_SYNC open flag > >> > >> O_SYNC does not bypass the host page cache. It ensures that writes > >> only complete once data has been written to the disk. > >> > >> O_DIRECT is a hint to bypass the host page cache when possible. > >> > >> A boolean on|off option would be nicer than an option that takes the > >> special string "none". For example, direct=on|off. It also makes the > >> code nicer by using bools instead of strdup strings that get leaked. > >> > > > > What i wanted is the O_SYNC behavior. Well the comment should be updated. I > > want to make sure that we don't have dirty data in host page cache after > > a write. It is always good to make read hit the page cache > > Why silently enforce O_SYNC on the server side? The client does not > know whether or not O_SYNC is in effect, cannot take advantage of that > knowledge, and cannot control it. > > I think a more useful solution is a 9p client mount option called > "sync" that caused the client to always add O_SYNC and skip syncfs. > The whole stack becomes aware of O_SYNC and clients are in control > over whether or not they need O_SYNC semantics. The cache=none specifically enables us to ignore the tsyncfs request on host. tsyncfs on host can be really slow in certain setup. -aneesh
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sun, 13 Mar 2011 20:57:12 +, Stefan Hajnoczi wrote: > On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V > wrote: > > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi > > wrote: > >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V > >> wrote: > >> > cache=none implies the file are opened in the host with O_SYNC open flag > >> > >> O_SYNC does not bypass the host page cache. It ensures that writes > >> only complete once data has been written to the disk. > >> > >> O_DIRECT is a hint to bypass the host page cache when possible. > >> > >> A boolean on|off option would be nicer than an option that takes the > >> special string "none". For example, direct=on|off. It also makes the > >> code nicer by using bools instead of strdup strings that get leaked. > >> > > > > What i wanted is the O_SYNC behavior. Well the comment should be updated. I > > want to make sure that we don't have dirty data in host page cache after > > a write. It is always good to make read hit the page cache > > I have sent a patch to clean up the -virtfs option parsing, you are > CCed. I think it will make it easier to add a new sync=on|off option. Absolutely. So what i will do is i will carry the patch in the series and later will drop the same once your changes get pushed upstream. -aneesh
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V wrote: > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi > wrote: >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V >> wrote: >> > cache=none implies the file are opened in the host with O_SYNC open flag >> >> O_SYNC does not bypass the host page cache. It ensures that writes >> only complete once data has been written to the disk. >> >> O_DIRECT is a hint to bypass the host page cache when possible. >> >> A boolean on|off option would be nicer than an option that takes the >> special string "none". For example, direct=on|off. It also makes the >> code nicer by using bools instead of strdup strings that get leaked. >> > > What i wanted is the O_SYNC behavior. Well the comment should be updated. I > want to make sure that we don't have dirty data in host page cache after > a write. It is always good to make read hit the page cache Why silently enforce O_SYNC on the server side? The client does not know whether or not O_SYNC is in effect, cannot take advantage of that knowledge, and cannot control it. I think a more useful solution is a 9p client mount option called "sync" that caused the client to always add O_SYNC and skip syncfs. The whole stack becomes aware of O_SYNC and clients are in control over whether or not they need O_SYNC semantics. Stefan
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V wrote: > On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi > wrote: >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V >> wrote: >> > cache=none implies the file are opened in the host with O_SYNC open flag >> >> O_SYNC does not bypass the host page cache. It ensures that writes >> only complete once data has been written to the disk. >> >> O_DIRECT is a hint to bypass the host page cache when possible. >> >> A boolean on|off option would be nicer than an option that takes the >> special string "none". For example, direct=on|off. It also makes the >> code nicer by using bools instead of strdup strings that get leaked. >> > > What i wanted is the O_SYNC behavior. Well the comment should be updated. I > want to make sure that we don't have dirty data in host page cache after > a write. It is always good to make read hit the page cache I have sent a patch to clean up the -virtfs option parsing, you are CCed. I think it will make it easier to add a new sync=on|off option. Stefan
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi wrote: > On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V > wrote: > > cache=none implies the file are opened in the host with O_SYNC open flag > > O_SYNC does not bypass the host page cache. It ensures that writes > only complete once data has been written to the disk. > > O_DIRECT is a hint to bypass the host page cache when possible. > > A boolean on|off option would be nicer than an option that takes the > special string "none". For example, direct=on|off. It also makes the > code nicer by using bools instead of strdup strings that get leaked. > What i wanted is the O_SYNC behavior. Well the comment should be updated. I want to make sure that we don't have dirty data in host page cache after a write. It is always good to make read hit the page cache -aneesh
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V wrote: > cache=none implies the file are opened in the host with O_SYNC open flag O_SYNC does not bypass the host page cache. It ensures that writes only complete once data has been written to the disk. O_DIRECT is a hint to bypass the host page cache when possible. A boolean on|off option would be nicer than an option that takes the special string "none". For example, direct=on|off. It also makes the code nicer by using bools instead of strdup strings that get leaked. > @@ -2379,6 +2379,8 @@ int main(int argc, char **argv, char **envp) > } > break; > case QEMU_OPTION_virtfs: { > + const char *cache; > + char *arg_cache = NULL; > char *arg_fsdev = NULL; > char *arg_9p = NULL; > int len = 0; > @@ -2404,7 +2406,18 @@ int main(int argc, char **argv, char **envp) > exit(1); > } > > + cache = qemu_opt_get(opts, "cache"); > + if (cache) { > + len = strlen(",cache="); > + len += strlen(cache); > + arg_cache = qemu_malloc(len+1); > + snprintf(arg_cache, (len+1), ",cache=%s", cache); > + } else { > + arg_cache = (char *)""; > + } > + > len = strlen(",id=,path=,security_model="); > + len += strlen(arg_cache); > len += strlen(qemu_opt_get(opts, "fstype")); > len += strlen(qemu_opt_get(opts, "mount_tag")); > len += strlen(qemu_opt_get(opts, "path")); > @@ -2412,20 +2425,22 @@ int main(int argc, char **argv, char **envp) > arg_fsdev = qemu_malloc((len + 1) * sizeof(*arg_fsdev)); > > snprintf(arg_fsdev, (len + 1) * sizeof(*arg_fsdev), > - "%s,id=%s,path=%s,security_model=%s", > + "%s,id=%s,path=%s,security_model=%s%s", > qemu_opt_get(opts, "fstype"), > qemu_opt_get(opts, "mount_tag"), > qemu_opt_get(opts, "path"), > - qemu_opt_get(opts, "security_model")); > + qemu_opt_get(opts, "security_model"), > + arg_cache); > > len = strlen("virtio-9p-pci,fsdev=,mount_tag="); > len += 2*strlen(qemu_opt_get(opts, "mount_tag")); > arg_9p = qemu_malloc((len + 1) * sizeof(*arg_9p)); > > snprintf(arg_9p, (len + 1) * sizeof(*arg_9p), > - "virtio-9p-pci,fsdev=%s,mount_tag=%s", > + "virtio-9p-pci,fsdev=%s,mount_tag=%s%s", > + qemu_opt_get(opts, "mount_tag"), > qemu_opt_get(opts, "mount_tag"), > - qemu_opt_get(opts, "mount_tag")); > + arg_cache); > > if (!qemu_opts_parse(qemu_find_opts("fsdev"), arg_fsdev, 1)) { > fprintf(stderr, "parse error [fsdev]: %s\n", optarg); arg_cache is leaked. Stefan
[Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
cache=none implies the file are opened in the host with O_SYNC open flag Signed-off-by: Aneesh Kumar K.V --- fsdev/qemu-fsdev.c |8 +++- fsdev/qemu-fsdev.h |1 + hw/9pfs/virtio-9p.c | 17 - hw/file-op-9p.h |1 + qemu-config.c |6 ++ qemu-options.hx | 17 - vl.c| 23 +++ 7 files changed, 58 insertions(+), 15 deletions(-) diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c index 0b33290..d803d47 100644 --- a/fsdev/qemu-fsdev.c +++ b/fsdev/qemu-fsdev.c @@ -33,6 +33,8 @@ int qemu_fsdev_add(QemuOpts *opts) const char *fstype = qemu_opt_get(opts, "fstype"); const char *path = qemu_opt_get(opts, "path"); const char *sec_model = qemu_opt_get(opts, "security_model"); +const char *cache = qemu_opt_get(opts, "cache"); + if (!fsdev_id) { fprintf(stderr, "fsdev: No id specified\n"); @@ -71,10 +73,14 @@ int qemu_fsdev_add(QemuOpts *opts) fsle->fse.path = qemu_strdup(path); fsle->fse.security_model = qemu_strdup(sec_model); fsle->fse.ops = FsTypes[i].ops; +if (cache) { +fsle->fse.cache = qemu_strdup(cache); +} else { +fsle->fse.cache = NULL; +} QTAILQ_INSERT_TAIL(&fstype_entries, fsle, next); return 0; - } FsTypeEntry *get_fsdev_fsentry(char *id) diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h index a704043..d3a0fac 100644 --- a/fsdev/qemu-fsdev.h +++ b/fsdev/qemu-fsdev.h @@ -41,6 +41,7 @@ typedef struct FsTypeEntry { char *fsdev_id; char *path; char *security_model; +char *cache; FileOperations *ops; } FsTypeEntry; diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c index 293a562..75df1a1 100644 --- a/hw/9pfs/virtio-9p.c +++ b/hw/9pfs/virtio-9p.c @@ -70,17 +70,18 @@ static int omode_to_uflags(int8_t mode) return ret; } -static int get_dotl_openflags(int oflags) +static int get_dotl_openflags(V9fsState *s, int oflags) { int flags; /* * Since we can share the fd between multiple fids, * open the file in read write mode */ +flags = s->ctx.open_flags; if ((oflags & O_ACCMODE) != O_RDONLY) { -flags = O_RDWR; +flags |= O_RDWR; } else { -flags = O_RDONLY; +flags |= O_RDONLY; } /* * If the client asked for any of the below flags we @@ -1856,7 +1857,7 @@ static void v9fs_open_post_lstat(V9fsState *s, V9fsOpenState *vs, int err) v9fs_open_post_opendir(s, vs, err); } else { if (s->proto_version == V9FS_PROTO_2000L) { -flags = get_dotl_openflags(vs->mode); +flags = get_dotl_openflags(s, vs->mode); } else { flags = omode_to_uflags(vs->mode); } @@ -1987,7 +1988,7 @@ static void v9fs_lcreate(V9fsState *s, V9fsPDU *pdu) v9fs_string_sprintf(&vs->fullname, "%s/%s", vs->fidp->fsmap.path.data, vs->name.data); -flags = get_dotl_openflags(flags); +flags = get_dotl_openflags(s, flags); vs->fidp->fsmap.fs.fd = v9fs_do_open2(s, vs->fullname.data, vs->fidp->uid, gid, flags | O_CREAT, mode); vs->fidp->fsmap.open_flags = flags; @@ -3912,6 +3913,12 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf) exit(1); } +if (fse->cache && !strcmp(fse->cache, "none")) { +s->ctx.open_flags = O_SYNC; +} else { +s->ctx.open_flags = 0; +} + s->ctx.fs_root = qemu_strdup(fse->path); len = strlen(conf->tag); if (len > MAX_TAG_LEN) { diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h index e306305..8577020 100644 --- a/hw/file-op-9p.h +++ b/hw/file-op-9p.h @@ -54,6 +54,7 @@ typedef struct FsContext char *fs_root; SecModel fs_sm; uid_t uid; +int open_flags; struct xattr_operations **xops; } FsContext; diff --git a/qemu-config.c b/qemu-config.c index 965fa46..4d87a39 100644 --- a/qemu-config.c +++ b/qemu-config.c @@ -165,6 +165,9 @@ QemuOptsList qemu_fsdev_opts = { }, { .name = "security_model", .type = QEMU_OPT_STRING, +}, { +.name = "cache", +.type = QEMU_OPT_STRING, }, { /*End of list */ } }, @@ -187,6 +190,9 @@ QemuOptsList qemu_virtfs_opts = { }, { .name = "security_model", .type = QEMU_OPT_STRING, +}, { +.name = "cache", +.type = QEMU_OPT_STRING, }, { /*End of list */ } diff --git a/qemu-options.hx b/qemu-options.hx index 898561d..ad27bf0 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -486,7 +486,8 @@ ETEXI DEFHEADING(File system options:) DEF("fsdev", HAS_ARG, QEMU_OPTION_fsdev, -"-fsdev local,id=id,path=path,security_model=[mapped|passthrough|none]\n", +"-fsdev local,id=id,path=path,security_model=[mapped|passthrough|none]\n" +" [,cache=none]\n", QEMU_ARCH_ALL)