On Tue, May 30, 2017 at 6:42 PM, Shyam <[email protected]> wrote:

> On 05/30/2017 05:28 AM, Krutika Dhananjay wrote:
>
>> You're right. With brick graphs, this will be a problem.
>>
>> Couple of options:
>>
>> 1. To begin with we identify points where we think it would be useful to
>> load io-stats in the brick graph and unconditionally have
>> glusterd-volgen load them in the volfile only at these places (not very
>> useful if we want to load trace xl though. Plus, this again makes
>> io-stats placement static).
>>
>
> I think this is needed (easier to get in), so +1 for this.
>
> Additionally, if this is chosen, we may need specific triggers for each
> instance, to target measuring the io-stats. IOW, generic io-stats can
> measure below

I tried this recently with existing code by configuring the
stats-dump-interval and fop-sample-interval options, and each instance dumps
its stats into a file under /var/lib/glusterd/stats. There's one file per
io-stats xl. The downside is that with each interval, the stats from the
previous interval get overwritten. I'm planning to change this by adding a
timestamp + pid suffix to the dump file name.

> FUSE (as an example) and below server-protocol. Then, we may want to
> enable io-threads (assuming this is one instance on the brick that is a
> static placement), or POSIX (or both/all) specifically, than have them
> enabled by default when io-stats is turned on (which is the current
> behaviour).
>

I didn't follow the io-threads and posix part. Could you rephrase?

I'm thinking of changing the code so that io-stats is loaded above and
below io-threads and also above posix. Not sure if you meant the same thing
in your statement above ;) Would that be fine? Or is there value in loading
it elsewhere?

-Krutika

> Does this make sense?
>
>
>> 2. Embed the trace/io-stats functionality within xlator_t object itself,
>> and keep the accounting disabled by default. Only when required, the
>> user can perhaps enable the accounting options with volume-set or
>> through volume-profile start command for the brief period where they
>> want to capture the stats and disable it as soon as they're done.
>>
>
> This is a better longer term solution IMO. This way there is no further
> injection of io-stats xltor, and we get a lot more control on this better.
>
> Depending on time to completion, I would choose 1/2 as presented above.
> This is because, I see a lot of value in this and in answering user queries
> on what is slowing down their systems, so sooner we have this the better
> (say 3.12), if (2) is possible by then, more power to it.
>
>
>> Let me know what you think.
>>
>> -Krutika
>>
>> On Fri, May 26, 2017 at 9:19 PM, Shyam <[email protected]> wrote:
>>
>>     On 05/26/2017 05:44 AM, Krutika Dhananjay wrote:
>>
>>         Hi,
>>
>>         debug/io-stats and debug/trace are immensely useful for isolating
>>         translators that are performance bottlenecks and those that are
>>         causing iatt inconsistencies, respectively.
>>
>>         There are other translators too under xlators/debug such as
>>         error-gen, which are useful for debugging/testing our code.
>>
>>         The trick is to load these above and below one or more suspect
>>         translators, run the test and analyse the output they dump and
>>         debug your problem.
>>
>>         Unfortunately, there is no way to load these at specific points
>>         in the graph using the volume-set CLI as of today. Our only
>>         option is to manually edit the volfile and restart the process
>>         and be super-careful not to perform *any*
>>         volume-{reset,set,profile} operation and graph switch operations
>>         in general that could rewrite the volfile, wiping out all
>>         previous edits to it.
>>
>>         I propose the following CLI for achieving the same:
>>
>>         # gluster volume set <VOL> {debug.trace, debug.io-stats, debug.error-gen} <xl-name>
>>
>>         where <xl-name> represents the name of the translator above
>>         which you want this translator loaded (as parent).
>>
>>         For example, if i have a 2x2 dis-rep volume named testvol and I
>>         want to load trace above and below first child of DHT, I execute
>>         the following commands:
>>
>>         # gluster volume set <VOL> debug.trace testvol-replicate-0
>>         # gluster volume set <VOL> debug.trace testvol-client-0
>>         # gluster volume set <VOL> debug.trace testvol-client-1
>>
>>         The corresponding debug/trace translators will be named
>>         testvol-replicate-0-trace-parent, testvol-client-0-trace-parent,
>>         testvol-client-1-trace-parent and so on.
>>
>>         To revert the change, the user simply uses volume-reset CLI:
>>
>>         # gluster volume reset <VOL> testvol-replicate-0-trace-parent
>>         # gluster volume reset <VOL> testvol-client-0-trace-parent
>>         # gluster volume reset <VOL> testvol-client-1-trace-parent
>>
>>         What should happen when the translator with a
>>         trace/io-stat/error-gen parent gets disabled?
>>         Well glusterd should be made to take care to remove the trace xl
>>         too from the graph.
>>
>>         Comments and suggestions welcome.
>>
>>
>>     +1, dynamic placement of io-stats was something that I added to this
>>     spec [1] as well. So I am all for the change.
>>
>>     I have one problem though that bothered me when I wrote the spec,
>>     currently brick vol files are static, and do not undergo a graph
>>     change (or code is not yet ready to do that). So when we want to do
>>     this on the bricks, what happens? Do you have solutions for the
>>     same? I am interested, hence asking!
>>
>>     [1] Initial feature description for improved io-stats:
>>     https://review.gluster.org/#/c/16558/1/under_review/Performance_monitoring_and_debugging.md
>>
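
For anyone skimming the thread, here is a rough Python sketch of the graph rewrite the proposed volume-set/volume-reset pair implies for the testvol example quoted above. It is only a toy model, not glusterd-volgen code: the dict layout and the helper names (make_testvol, load_debug_parent, unload_debug_parent) are made up for illustration, and only the <xl-name>-trace-parent naming comes from the proposal itself.

```python
# Toy model of a client graph: xlator name -> (type, [subvolume names]).
# Hypothetical illustration only; this is not how glusterd-volgen stores graphs.

def make_testvol():
    """Build the client graph of a 2x2 distributed-replicate volume 'testvol'."""
    graph = {f"testvol-client-{i}": ("protocol/client", []) for i in range(4)}
    graph["testvol-replicate-0"] = (
        "cluster/replicate", ["testvol-client-0", "testvol-client-1"])
    graph["testvol-replicate-1"] = (
        "cluster/replicate", ["testvol-client-2", "testvol-client-3"])
    graph["testvol-dht"] = (
        "cluster/distribute", ["testvol-replicate-0", "testvol-replicate-1"])
    return graph


def load_debug_parent(graph, target, kind):
    """Insert a debug/<kind> xlator directly above `target` (as its parent).

    Returns the new xlator's name, using the <xl-name>-<kind>-parent
    convention proposed in the thread.
    """
    if target not in graph:
        raise KeyError(f"no such xlator: {target}")
    parent = f"{target}-{kind}-parent"
    if parent in graph:
        return parent  # already loaded; treat a repeated set as a no-op
    # Re-point every xlator that had `target` as a subvolume at the new parent.
    for name, (xl_type, subvols) in graph.items():
        graph[name] = (xl_type, [parent if s == target else s for s in subvols])
    graph[parent] = (f"debug/{kind}", [target])
    return parent


def unload_debug_parent(graph, parent):
    """Remove a debug parent and reconnect its child to the original parents."""
    _, subvols = graph.pop(parent)
    child = subvols[0]
    for name, (xl_type, subs) in graph.items():
        graph[name] = (xl_type, [child if s == parent else s for s in subs])


if __name__ == "__main__":
    g = make_testvol()
    for xl in ("testvol-replicate-0", "testvol-client-0", "testvol-client-1"):
        load_debug_parent(g, xl, "trace")
    for name, (xl_type, subvols) in sorted(g.items()):
        print(name, xl_type, subvols)
```

Disabling a translator that has a debug parent would, in this model, mean calling unload_debug_parent on the parent first, which is the cleanup glusterd is expected to do in the proposal.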
_______________________________________________
Gluster-devel mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-devel
