Re: [Gluster-devel] Changing the relative order of read-ahead and open-behind

Raghavendra Gowdappa Tue, 25 Jul 2017 03:49:10 -0700


----- Original Message -----
> From: "Amar Tumballi" <[email protected]>
> To: "Raghavendra Gowdappa" <[email protected]>
> Cc: "Pranith Kumar Karampuri" <[email protected]>, "Gluster Devel" 
> <[email protected]>
> Sent: Tuesday, July 25, 2017 4:06:27 PM
> Subject: Re: [Gluster-devel] Changing the relative order of read-ahead and 
> open-behind
> 
> On Tue, Jul 25, 2017 at 2:38 PM, Raghavendra G <[email protected]>
> wrote:
> 
> >
> >
> > On Tue, Jul 25, 2017 at 10:39 AM, Amar Tumballi <[email protected]>
> > wrote:
> >
> >>
> >>
> >> On Tue, Jul 25, 2017 at 9:33 AM, Raghavendra Gowdappa <
> >> [email protected]> wrote:
> >>
> >>>
> >>>
> >>> ----- Original Message -----
> >>> > From: "Pranith Kumar Karampuri" <[email protected]>
> >>> > To: "Raghavendra G" <[email protected]>
> >>> > Cc: "Gluster Devel" <[email protected]>
> >>> > Sent: Tuesday, July 25, 2017 7:51:07 AM
> >>> > Subject: Re: [Gluster-devel] Changing the relative order of read-ahead
> >>> and    open-behind
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Jul 24, 2017 at 5:11 PM, Raghavendra G <
> >>> [email protected] >
> >>> > wrote:
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Jul 21, 2017 at 6:39 PM, Vijay Bellur < [email protected] >
> >>> wrote:
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Jul 21, 2017 at 3:26 AM, Raghavendra Gowdappa <
> >>> [email protected] >
> >>> > wrote:
> >>> >
> >>> >
> >>> > Hi all,
> >>> >
> >>> > We have a bug [1] due to which read-ahead is completely disabled when
> >>> > the workload is read-only. One easy fix is to make read-ahead an
> >>> > ancestor of open-behind in the xlator graph (currently it's a
> >>> > descendant). A patch has been sent out by Rafi to do the same. As noted
> >>> > in one of the comments, one flip side of this solution is that small
> >>> > files (which are eligible to be cached by quick-read) are cached twice,
> >>> > once each in read-ahead and quick-read, wasting precious memory.
> >>> > However, there are no simpler solutions for this issue. If you have
> >>> > concerns about the approach followed by [2] or have other suggestions,
> >>> > please voice them. Otherwise, I am planning to merge [2] for lack of
> >>> > better alternatives.
> >>> >
> >>> >
> >>> > Since the maximum size of files cached by quick-read is 64KB, can we
> >>> > have read-ahead kick in for offsets greater than 64KB?
> >>> >
> >>> > I got your point. We can enable read-ahead only for files whose size
> >>> > is greater than the size eligible for caching by quick-read. IOW,
> >>> > read-ahead gets disabled if the file size is less than 64KB. Thanks
> >>> > for the suggestion.
> >>> >
> >>> > I added a comment on the patch suggesting that the xlators be moved in
> >>> > the reverse of the way the patch currently does it. I think Milind
> >>> > implemented it. Will that lead to any problem?
> >>>
> >>> From gerrit:
> >>>
> >>> <comment>
> >>>
> >>> It fixes the issue too, and it is a better solution than the current one
> >>> as it doesn't run into the duplicate cache problem. The reason
> >>> open-behind was loaded as an ancestor of quick-read was that it seemed
> >>> unnecessary that quick-read should even witness an open. However,
> >>>
> >>>    * Looking into the code, qr_open does set some priority for the
> >>>      inode, which is used when purging the cache after it exceeds the
> >>>      cache limit. So it helps for quick-read to witness an open.
> >>>    * The real benefit of open-behind is avoiding fops over the network.
> >>>      So, as long as open-behind is loaded in the client stack, we reap
> >>>      its benefits.
> >>>    * Also note that if the option "read-after-open" is set in
> >>>      open-behind, an open is done over the network anyway, irrespective
> >>>      of whether quick-read has cached the file, which to me looks
> >>>      unnecessary. By moving open-behind to be a descendant of
> >>>      quick-read, open-behind won't even witness a read when the file is
> >>>      cached by quick-read. But if the read-after-open option was
> >>>      implemented in open-behind with the goal of fixing POSIX
> >>>      non-compliance for the case where a file with an open fd is
> >>>      unlinked, we might regress. Then again, even this approach doesn't
> >>>      fix the compliance problem completely. One has to turn open-behind
> >>>      off to be completely POSIX compliant in this scenario.
> >>>
> >>> Given the reasons above, it helps to just move open-behind to be a
> >>> descendant of read-ahead.
> >>>
> >>> </comment>
> >>>
> >>>
> >> Analysis looks good. But I would like us (all developers) to back up
> >> theories like this with some data.
> >>
> >
> >> How about you plan a test case which can demonstrate the difference?
> >>
> >
> > What is the scenario you want to measure here?
> >
> >
> 
> A scenario where changing the order changes the number of fops on the
> wire. Also, if you have any particular internal metrics for these
> translators, you can implement the 'dump_metrics()' method on the
> experimental branch, and that can be measured in graphite/grafana.


Ok. So basically you want to validate that the fix solves the issue. I think
measuring read latency and throughput at the application is sufficient for
that. The bug has a test case, and we can use it to validate the fix. Of
course, we can watch other metrics too and see whether anything abnormal
happens (e.g., we shouldn't start seeing setattr calls out of the blue in a
read workload).
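
The "anything abnormal" check could be sketched roughly as below. This is
only a minimal sketch, assuming per-fop counters have already been collected
before and after the run and are available as plain dictionaries. The
function name unexpected_fops, the allowed set, and the sample numbers are
all hypothetical and are not part of GlusterFS.

```python
def unexpected_fops(before, after, allowed):
    """Return {fop: delta} for fops whose count grew but are not in `allowed`."""
    grown = {}
    for fop, count in after.items():
        delta = count - before.get(fop, 0)
        if delta > 0 and fop not in allowed:
            grown[fop] = delta
    return grown


# Hypothetical counters, made up purely for illustration.
before = {"OPEN": 10, "READ": 100, "LOOKUP": 50}
after = {"OPEN": 12, "READ": 400, "LOOKUP": 60, "SETATTR": 3}

# Assumed set of fops we expect to see in a read-only workload.
allowed_in_read_workload = {"OPEN", "READ", "LOOKUP", "STAT", "FLUSH", "RELEASE"}

print(unexpected_fops(before, after, allowed_in_read_workload))
# prints {'SETATTR': 3}
```

The same before/after delta can be reused for the open and read counts, by
comparing the deltas with and without the patch.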

We should also confirm that we haven't regressed in other related areas.

So, the metrics I think are useful:

* Measure read latency and throughput at the application layer.
* Since this patch touches open-behind, check whether the number of opens
  sent over the network changes.
* Since we were concerned with quick-read too, check whether reads on files
  smaller than 64KB go over the network (they shouldn't, as they are
  expected to be served by quick-read).
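
For the first metric, an application-level measurement could look something
like the sketch below. It is not the test case attached to the bug. The
function name measure_reads and the 128KB block size are assumptions, and
the temporary file only stands in for a file on a Gluster mount.

```python
import os
import tempfile
import time


def measure_reads(path, block_size=128 * 1024):
    """Read `path` sequentially; return (bytes read, per-read latencies, MB/s)."""
    latencies = []
    total = 0
    start = time.perf_counter()
    # buffering=0 so each read() call maps to one read on the file descriptor.
    with open(path, "rb", buffering=0) as f:
        while True:
            t0 = time.perf_counter()
            chunk = f.read(block_size)
            dt = time.perf_counter() - t0
            if not chunk:
                break
            latencies.append(dt)
            total += len(chunk)
    elapsed = time.perf_counter() - start
    throughput = total / elapsed / 1e6 if elapsed > 0 else 0.0
    return total, latencies, throughput


# Stand-in file; in a real run `path` would point into the Gluster mount.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1024 * 1024))
try:
    total, lat, mbps = measure_reads(tmp.name)
    avg_us = sum(lat) / len(lat) * 1e6
    print(f"{total} bytes, {len(lat)} reads, avg {avg_us:.1f} us, {mbps:.1f} MB/s")
finally:
    os.unlink(tmp.name)
```

Running it with and without the patch on the same file gives comparable
numbers, as long as client-side caching is accounted for between runs.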

regards,
Raghavendra
> 
> -Amar
> 
> 
> >> I will help you set up metrics measurement with graphs [1] on the
> >> experimental branch [2] to actually measure and graphically represent
> >> the hypothesis.
> >>
> >> We can set this as an example for the future, for anyone to try
> >> permutations and combinations of different xlator orders. Who knows, we
> >> may realize that different orders suit different workloads.
> >>
> >> Regards,
> >> Amar
> >>
> >> [1] - https://github.com/amarts/glustermetrics
> >> [2] - https://github.com/gluster/glusterfs/tree/experimental
> >>
> >> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > Thanks,
> >>> > Vijay
> >>> >
> >>> > _______________________________________________
> >>> > Gluster-devel mailing list
> >>> > [email protected]
> >>> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Raghavendra G
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Pranith
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >> Amar Tumballi (amarts)
> >>
> >>
> >
> >
> >
> > --
> > Raghavendra G
> >
> 
> 
> 
> --
> Amar Tumballi (amarts)
> 
