This is great info - with a lot of options to take in :)

To summarise, to enable direct-io and bypass the kernel filesystem cache for a volume:

1. Mount the volume (fuse client) with the direct-io-mode=enable option
2. Run 'gluster volume set <vol> performance.strict-o-direct on'
3. Update the brick volfiles with the 'o-direct' option in storage/posix (at least for now, since there is no CLI support for it)

Roughly along the lines of the sketch below.
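For example, with a hypothetical volume 'testvol' served from 'server1' and a brick directory of '/bricks/brick1' (all placeholder names - and assuming the generated brick volfile names its storage/posix section '<volname>-posix'), the three steps might look roughly like this:

    # 1. mount the volume through the fuse client, bypassing the kernel page-cache
    mount -t glusterfs -o direct-io-mode=enable server1:/testvol /mnt/testvol

    # 2. make write-behind honour O_DIRECT (applications still have to open
    #    their files with O_DIRECT)
    gluster volume set testvol performance.strict-o-direct on

    # 3. no CLI support yet, so hand-edit the brick volfile and add o-direct
    #    to the storage/posix section
    volume testvol-posix
        type storage/posix
        option directory /bricks/brick1
        option o-direct on
    end-volume

Keep in mind that glusterd regenerates brick volfiles on later volume-set operations, so a hand edit like step 3 may need to be re-applied.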
Is that right?

On Thu, Feb 25, 2016 at 5:56 PM, Raghavendra Gowdappa <[email protected]> wrote:

>
> ----- Original Message -----
> > From: "Krutika Dhananjay" <[email protected]>
> > To: "Gluster Devel" <[email protected]>, "Raghavendra Gowdappa" <[email protected]>
> > Cc: "Paul Cuzner" <[email protected]>
> > Sent: Thursday, February 25, 2016 7:28:30 AM
> > Subject: What's the correct way to enable direct-IO?
> >
> > Hi,
> >
> > git-grep tells me there are multiple options in our code base for enabling
> > direct-IO on a gluster volume, at several layers in the translator stack:
> > i) use the mount option 'direct-io-mode=enable'
>
> This option is between the kernel and glusterfs. Specifically, it asks the
> fuse kernel module to bypass the page-cache. Note that when this option is
> set, direct-io is enabled for _all_ fds, irrespective of whether
> applications have used O_DIRECT in their open/create calls or not.
>
> > ii) enable 'network.remote-dio', which is a protocol/client option, using
> > the volume set command
>
> This is an option introduced by [1] to _filter_ O_DIRECT flags in
> open/create calls before sending those requests to the server. The option
> name is misleading here. However, please note that this is the key (alias?)
> used by glusterd. The exact option name used by protocol/client is
> "filter_O_DIRECT", and it's fine. Probably we should file a bug on glusterd
> to change the name?
>
> Coming to your use case, we don't want to filter O_DIRECT from reaching the
> brick. Hence, we need to set this option to _off_ (by default it is
> disabled).
>
> I am still not sure what the relevance of this option is against the bug it
> was introduced for. If we need direct-io, we have to pass it to the brick
> too, so that the backend fs on the brick is configured appropriately.
>
> [1] http://review.gluster.org/4206
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=845213
>
> > iii) enable performance.strict-o-direct, which is a performance/write-behind
> > option, using the volume-set command
>
> Yes, write-behind honours O_DIRECT only if this option is set. So, we need
> to enable this for your use-case. Also, note that applications still need
> to use O_DIRECT in open/create calls.
>
> To summarize, the following are the ways to bypass the write-behind cache:
> 1. disable write-behind :).
> 2. applications use O_SYNC/O_DSYNC in open calls
> 3. enable performance.strict-o-direct _and_ applications should use
> O_DIRECT in open/create calls.
>
> > iv) use the 'o-direct' option in storage/posix, volume-set on which reports
> > that the option doesn't exist.
>
> The option exists in storage/posix. But there is no way to set it through
> the cli (probably you can send a patch to do that if necessary). With this
> option, O_DIRECT is passed with _every_ open/create call on the brick.
>
> > So then the question is - what is a surefire way to get direct-io-like
> > behavior on gluster volume(s)?
>
> There is no one global option. You need to configure various translators in
> the stack. Probably [2] was asking for such a feature. Also, as you might've
> noticed above, the behavior/interpretation of these options is not the same
> across all translators (some are global and some are local only to an fd,
> etc.).
>
> Also note that, apart from the options you listed above:
> 1. Quick-read is not aware of O_DIRECT. We need to make it disable caching
> if an open happens with O_DIRECT.
> 2. Handling of Quota marker xattrs is not synchronous (though not exactly
> an O_DIRECT requirement), as marking is done after sending the reply to
> calls like writev.
>
> On a related note, I found article [3] to be informative.
>
> [1] http://review.gluster.org/4206
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=845213
> [3] https://lwn.net/Articles/457667/
>
> regards,
> Raghavendra.
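For quick reference, the volume-set equivalents of the write-behind-related points above would look roughly like this ('testvol' is again a placeholder volume name; check the option defaults shipped with your release):

    # keep O_DIRECT flags from being filtered out before they reach the bricks
    # (off is the default, per the explanation above)
    gluster volume set testvol network.remote-dio off

    # way 1: bypass the write-behind cache entirely
    gluster volume set testvol performance.write-behind off

    # way 3: keep write-behind but make it honour O_DIRECT; applications must
    #        still pass O_DIRECT in their open/create calls
    gluster volume set testvol performance.strict-o-direct on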
