Re: [Gluster-devel] Proposal for change regarding latency calculation

2014-08-01 Thread Krishnan Parthasarathi
Vipul,

- Original Message -
> I would want tests of all the posix operations. We need to see the difference
> not just in throughput, but in max IOPS for the various ops.

The test that I suggested is incomplete. As Joe has suggested,
we should see the effect of profiling on all FOPs. This could be done by modifying
tests/basic/fops-sanity.t (present in the glusterfs repo) a little.
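
One rough way to get a per-pass timing out of that test, sketched here under the
assumption that the helper driven by the .t script is built from
tests/utils/fops-sanity.c and run against a mounted volume (the paths and volume
name are assumptions):

# build the helper the .t script drives and time a full pass of FOPs
gcc -o fops-sanity tests/utils/fops-sanity.c
time ./fops-sanity /mnt/testvol    # repeat with profiling on and off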

This by itself could take longer than we might like from the glusterfsiostat
perspective. How inconvenient would it be to have glusterfsiostat return an
error message to the user when profiling is disabled on the volume whose
statistics are queried? That could be the approach for the first cut.
Once we have established that the performance impact of enabling profiling by
default is bearable, we could change glusterfsiostat to work seamlessly.
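
A minimal sketch of that first cut, assuming glusterfsiostat can check the
volume's reconfigured options via the gluster CLI (the volume name and message
wording are placeholders):

# refuse to report statistics when profiling is off on the queried volume
if ! gluster volume info "$VOL" | grep -q 'diagnostics.count-fop-hits: on'; then
    echo "glusterfsiostat: profiling is disabled on volume $VOL;" >&2
    echo "enable it with 'gluster volume profile $VOL start'" >&2
    exit 1
fi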

Does that make sense?

~KP

> 
> On 07/27/2014 08:27 AM, Vipul Nayyar wrote:
> 
> Hi
> 
> As guided by you, I performed the experiment to measure the effect of
> always-enabled profiling. I ran two write tests, one with a 20 MB file and the
> other with a 730 MB file. Each file was written 20 times to the mounted volume,
> clearing the buffers on every iteration, and the time taken was measured with
> the time command. I ran the following bash script for this purpose.
> 
i=1
while [[ $i -lt 21 ]]; do
    # flush dirty pages and drop the page cache so each run starts cold
    sync && echo 3 > /proc/sys/vm/drop_caches
    path="/mnt/write_test$i"
    # `time` reports on stderr, so redirect that stream to capture it
    out=$( { time cp /home/vipul/test.avi "$path"; } 2>&1 )
    echo "$out"
    i=$((i+1))
done
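
To summarize the 20 runs numerically as well as graphically, the captured times
can be averaged; a rough sketch, assuming the loop above is saved as
write_test.sh (a hypothetical name):

# average the 'real' lines ("real 0m12.345s") echoed by the loop
./write_test.sh | awk '/^real/ {
    split($2, t, /[ms]/)              # "0m12.345s" -> minutes, seconds
    sum += t[1] * 60 + t[2]; n++
} END { printf "mean: %.2fs over %d runs\n", sum / n, n }'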
> 
> Since the values for writing the same file vary quite a bit from run to run, I
> plotted a graph of the obtained values (the Y-axis represents seconds), which
> can be found attached. As these images show, there is no clear pattern in the
> variation of the write times.
> 
> In my view, the values under both conditions are quite close to each other and
> equally likely to swing well above or below the mean; hence, no negative effect
> is seen from the proposed change. I hope someone else can shed more light on
> whether setting the option (always-enabled profiling) really decreases
> performance or not.
> 
> Regards
> Vipul Nayyar
> 
> 
> 
> On Wednesday, 16 July 2014 12:50 PM, Krishnan Parthasarathi
>  wrote:
> 
> 
> Vipul,
> 
> Hello,
> 
> Following is a proposal for modifying the I/O profiling capability of the
> io-stats xlator. I recently sent in a patch (review.gluster.org/#/c/8244/)
> regarding that, which uses the latency-related functions already present in
> io-stats to dump info through meta, and adds some more data containers that
> track additional FOP-related info each time a request goes through io-stats.
> Currently, the measure_latency and count_fop_hits options must be enabled
> before io-stats' custom latency functions can run. I propose to remove these
> two options entirely from io-stats.
> 
> In order to track I/O performance, these options should be enabled all the
> time, or removed entirely, so that a record of I/O requests is kept from mount
> time onwards; enabling them only when required cannot give average statistics
> over the whole period since the start. This follows the methodology of the
> Linux kernel itself, which maintains its I/O statistics data structures all
> the time and presents them via the /proc filesystem whenever required. No
> option needs to be enabled, and the data represents statistics since boot.
> 
> I would like to hear views on this: would having io-stats profiling info
> available all the time be a good thing?
> Could you run the following experiment to measure the effect of profiling
> being always enabled?
> - Fix the I/O workload to be run.
> - Setup 1 (control group): run the fixed workload on a volume with both
> profiling options NOT set.
> - Setup 2: run the (same) fixed workload on the same volume with the
> profiling options set.
> - In both setups, measure the latencies observed by the workload. You could
> use the time(1) command for a crude measurement.
> 
> This should allow us to make an informed decision on whether there is any
> performance effect when profiling is enabled on a volume by default.
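
A sketch of the two runs, assuming the profiling options map to the
diagnostics.latency-measurement and diagnostics.count-fop-hits volume options,
with 'testvol' and the thread's cp workload as placeholders:

# Setup 1 (control group): profiling off
gluster volume set testvol diagnostics.latency-measurement off
gluster volume set testvol diagnostics.count-fop-hits off
time cp /home/vipul/test.avi /mnt/write_test_control

# Setup 2: same workload with profiling on
gluster volume set testvol diagnostics.latency-measurement on
gluster volume set testvol diagnostics.count-fop-hits on
time cp /home/vipul/test.avi /mnt/write_test_profiled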
> 
> Apart from this, I was going over latency.c in libglusterfs, which does a
> fine job of maintaining latency info for every xlator, and encountered an
> anomaly which I thought should be dealt with. The function
> gf_proc_dump_latency_info, which dumps the latency array for the specified
> xlator, ends with a line that flushes this array through memset after every
> dump. That means you get different latency info every time you read the
> profile file in meta. I think flushing the data structure after every dump is
> wrong: you don't get overall stats since the option was enabled, and more
> importantly, multiple applications reading this file can get wrong info, since
> it gets cleared after the first read.
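
The effect is easy to reproduce from a mount; a sketch, assuming the meta
xlator exposes the dump at a path like the one below (the exact path is an
assumption):

# with the memset in place, the second read shows near-zero stats
# instead of cumulative ones
cat /mnt/testvol/.meta/graphs/active/testvol-io-stats/profile
cat /mnt/testvol/.meta/graphs/active/testvol-io-stats/profile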

Re: [Gluster-devel] Proposal for change regarding latency calculation

2014-07-27 Thread Joe Julian
I would want tests of all the posix operations. We need to see the difference
not just in throughput, but in max IOPS for the various ops.
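
For per-FOP numbers, the volume profile already reports call counts and
latencies per FOP, from which ops/sec can be derived; for example:

gluster volume profile testvol start
# ... run the workload ...
gluster volume profile testvol info    # per-FOP call counts and latencies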


On 07/27/2014 08:27 AM, Vipul Nayyar wrote:

Hi

As guided by you, I performed the experiment to measure the effect of
always-enabled profiling. I ran two write tests, one with a 20 MB file and the
other with a 730 MB file. Each file was written 20 times to the mounted volume,
clearing the buffers on every iteration, and the time taken was measured with
the time command. I ran the following bash script for this purpose.


i=1
while [[ $i -lt 21 ]]; do
    # flush dirty pages and drop the page cache so each run starts cold
    sync && echo 3 > /proc/sys/vm/drop_caches
    path="/mnt/write_test$i"
    # `time` reports on stderr, so redirect that stream to capture it
    out=$( { time cp /home/vipul/test.avi "$path"; } 2>&1 )
    echo "$out"
    i=$((i+1))
done

Since the values for writing the same file vary quite a bit from run to run, I
plotted a graph of the obtained values (the Y-axis represents seconds), which
can be found attached. As these images show, there is no clear pattern in the
variation of the write times.


In my view, the values under both conditions are quite close to each other and
equally likely to swing well above or below the mean; hence, no negative effect
is seen from the proposed change. I hope someone else can shed more light on
whether setting the option (always-enabled profiling) really decreases
performance or not.


Regards
Vipul Nayyar



On Wednesday, 16 July 2014 12:50 PM, Krishnan Parthasarathi 
 wrote:



Vipul,



Hello,

Following is a proposal for modifying the I/O profiling capability of the
io-stats xlator. I recently sent in a patch (review.gluster.org/#/c/8244/)
regarding that, which uses the latency-related functions already present in
io-stats to dump info through meta, and adds some more data containers that
track additional FOP-related info each time a request goes through io-stats.
Currently, the measure_latency and count_fop_hits options must be enabled
before io-stats' custom latency functions can run. I propose to remove these
two options entirely from io-stats.

In order to track I/O performance, these options should be enabled all the
time, or removed entirely, so that a record of I/O requests is kept from mount
time onwards; enabling them only when required cannot give average statistics
over the whole period since the start. This follows the methodology of the
Linux kernel itself, which maintains its I/O statistics data structures all
the time and presents them via the /proc filesystem whenever required. No
option needs to be enabled, and the data represents statistics since boot.

I would like to hear views on this: would having io-stats profiling info
available all the time be a good thing?

Could you run the following experiment to measure the effect of profiling
being always enabled?

- Fix the I/O workload to be run.
- Setup 1 (control group): run the fixed workload on a volume with both
profiling options NOT set.
- Setup 2: run the (same) fixed workload on the same volume with the
profiling options set.
- In both setups, measure the latencies observed by the workload. You could
use the time(1) command for a crude measurement.

This should allow us to make an informed decision on whether there is any
performance effect when profiling is enabled on a volume by default.


Apart from this, I was going over latency.c in libglusterfs, which does a
fine job of maintaining latency info for every xlator, and encountered an
anomaly which I thought should be dealt with. The function
gf_proc_dump_latency_info, which dumps the latency array for the specified
xlator, ends with a line that flushes this array through memset after every
dump. That means you get different latency info every time you read the
profile file in meta. I think flushing the data structure after every dump is
wrong: you don't get overall stats since the option was enabled, and more
importantly, multiple applications reading this file can get wrong info, since
it gets cleared after the first read.

Clearing the statistics on every read sounds incorrect to me.
Could you please send a patch to fix this?


thanks,
Krish


If my reasons seem apt to you, I'll send a patch over for evaluation.

Regards
Vipul Nayyar








Re: [Gluster-devel] Proposal for change regarding latency calculation

2014-07-16 Thread Krishnan Parthasarathi
Vipul, 

- Original Message -

> Hello,

> Following is a proposal for modifying the I/O profiling capability of the
> io-stats xlator. I recently sent in a patch (review.gluster.org/#/c/8244/)
> regarding that, which uses the latency-related functions already present in
> io-stats to dump info through meta, and adds some more data containers that
> track additional FOP-related info each time a request goes through io-stats.
> Currently, the measure_latency and count_fop_hits options must be enabled
> before io-stats' custom latency functions can run. I propose to remove these
> two options entirely from io-stats.

> In order to track I/O performance, these options should be enabled all the
> time, or removed entirely, so that a record of I/O requests is kept from mount
> time onwards; enabling them only when required cannot give average statistics
> over the whole period since the start. This follows the methodology of the
> Linux kernel itself, which maintains its I/O statistics data structures all
> the time and presents them via the /proc filesystem whenever required. No
> option needs to be enabled, and the data represents statistics since boot.

> I would like to hear views on this: would having io-stats profiling info
> available all the time be a good thing?

Could you run the following experiment to measure the effect of profiling
being always enabled?
- Fix the I/O workload to be run.
- Setup 1 (control group): run the fixed workload on a volume with both
profiling options NOT set.
- Setup 2: run the (same) fixed workload on the same volume with the
profiling options set.
- In both setups, measure the latencies observed by the workload. You could
use the time(1) command for a crude measurement.

This should allow us to make an informed decision on whether there is any
performance effect when profiling is enabled on a volume by default.

> Apart from this, I was going over latency.c in libglusterfs, which does a
> fine job of maintaining latency info for every xlator, and encountered an
> anomaly which I thought should be dealt with. The function
> gf_proc_dump_latency_info, which dumps the latency array for the specified
> xlator, ends with a line that flushes this array through memset after every
> dump. That means you get different latency info every time you read the
> profile file in meta. I think flushing the data structure after every dump is
> wrong: you don't get overall stats since the option was enabled, and more
> importantly, multiple applications reading this file can get wrong info, since
> it gets cleared after the first read.

Clearing the statistics on every read sounds incorrect to me. Could you
please send a patch to fix this?

thanks, 
Krish 

> If my reasons seem apt to you, I'll send a patch over for evaluation.

> Regards
> Vipul Nayyar


[Gluster-devel] Proposal for change regarding latency calculation

2014-07-14 Thread Vipul Nayyar
Hello,

Following is a proposal for modifying the I/O profiling capability of the
io-stats xlator. I recently sent in a patch (review.gluster.org/#/c/8244/)
regarding that, which uses the latency-related functions already present in
io-stats to dump info through meta, and adds some more data containers that
track additional FOP-related info each time a request goes through io-stats.
Currently, the measure_latency and count_fop_hits options must be enabled
before io-stats' custom latency functions can run. I propose to remove these
two options entirely from io-stats.

In order to track I/O performance, these options should be enabled all the
time, or removed entirely, so that a record of I/O requests is kept from mount
time onwards; enabling them only when required cannot give average statistics
over the whole period since the start. This follows the methodology of the
Linux kernel itself, which maintains its I/O statistics data structures all
the time and presents them via the /proc filesystem whenever required. No
option needs to be enabled, and the data represents statistics since boot.

I would like to hear views on this: would having io-stats profiling info
available all the time be a good thing?

Apart from this, I was going over latency.c in libglusterfs, which does a
fine job of maintaining latency info for every xlator, and encountered an
anomaly which I thought should be dealt with. The function
gf_proc_dump_latency_info, which dumps the latency array for the specified
xlator, ends with a line that flushes this array through memset after every
dump. That means you get different latency info every time you read the
profile file in meta. I think flushing the data structure after every dump is
wrong: you don't get overall stats since the option was enabled, and more
importantly, multiple applications reading this file can get wrong info, since
it gets cleared after the first read.

If my reasons seem apt to you, I'll send a patch over for evaluation.

Regards
Vipul Nayyar
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel