For what it is worth, I had weird performance issues when I moved from
3.2.5 to 3.3.0 - I saw increased CPU utilization, as well as drastically
increased network utilization between the nodes with the same workload.
I could never really quantify the difference, other than noticing that my
systems mounting the volumes via NFS had more problems when a brick went
offline in an unclean way (e.g. network disappeared). I have six nodes
which run 4-way or 2-way replicas and mount the volumes locally, so it's
a pretty self-contained configuration.
Today I rolled back to 3.2.5 and ran for ~8 hours. My traffic dropped
back to what it looked like previously, and my load dropped to next to
nothing. I just upgraded to 3.2.7, and at least in the last couple of
hours network utilization is about the same as 3.3.0. CPU usage is
actually worse with 3.2.7 than 3.3.0.
Does anyone have a 'good' test of Gluster performance? Most of my
operations take <10s, so when they average 5.4s with 3.3.0 over 100 runs
and 4.6s with 3.2.5, it's hard to tell whether that is a meaningful 15%
difference or bad statistics. I'd like to understand what is different
between 3.2.5 and 3.3.0 or 3.2.7, but I really need a good way to
quantify it.
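
One way to check whether a gap like 5.4s vs 4.6s is more than noise is a
permutation test on the raw per-run timings rather than on the averages.
What follows is only a minimal sketch using the Python standard library;
the helper names time_runs and permutation_test are made up for
illustration, and the operation being timed is whatever workload you
already run against the volume.

```python
import random
import time


def time_runs(fn, runs=100):
    """Call fn() `runs` times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    a, b: per-run timings (seconds) from the two versions being compared.
    Returns (observed_diff, p_value) with observed_diff = mean(a) - mean(b).
    """
    rng = random.Random(seed)
    n_a = len(a)
    n_b = len(b)
    observed = sum(a) / n_a - sum(b) / n_b
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the version labels are interchangeable,
        # so shuffle the pooled timings and split them again.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)
```

Collect one list of timings per Gluster version with time_runs (same
workload, same machines), then pass both lists to permutation_test. A
small p-value suggests the difference is unlikely to be run-to-run noise;
a large one means 100 runs is not enough to tell the versions apart.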
On 6/11/12 10:15 AM, Simon Detheridge wrote:
Hi,
I have a situation where I'm mounting a gluster volume on several web servers
via NFS. The web servers run Rails applications off the gluster NFS mounts. The
whole thing is running on EC2.
On 3.2.5, starting a Rails application on the web server was sluggish but
acceptable. However, after upgrading to 3.2.6 the length of time taken to start
a Rails application has increased by over 10 times, to something that's not
really suitable for a production environment. The situation still occurs with
3.3.0 as well.
If I attach strace to the rails process as it's starting up, I see that it's
looking for a very large number of nonexistent files. I think this is something
that Rails does and that can't be helped - for many things it checks whether a
file is there, and acts accordingly if it is.
Has something changed that could negatively affect the length of time it takes
to stat a nonexistent file over an NFS mount to a gluster volume, between 3.2.5
and 3.2.6? Is there any way I can get the old behaviour without downgrading?
-- I don't currently have proof that it's the nonexistent files that are causing
the problem, but it seems highly likely as performance for the other tasks that
the servers carry out appears unaffected.
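
A micro-benchmark along these lines could test the suspicion directly, by
timing stat calls on files that are known not to exist. This is only a
sketch: the function name time_missing_stats is made up, and the directory
you pass in is assumed to be somewhere on the NFS-mounted gluster volume.
It measures lookup latency only, so a slow result here would support the
theory without proving it is what slows Rails down.

```python
import os
import statistics
import time
import uuid


def time_missing_stats(directory, count=1000):
    """Stat `count` nonexistent paths under `directory`.

    Returns a list with the duration (seconds) of each failed stat call.
    """
    durations = []
    for _ in range(count):
        # Random, never-repeated names, so an earlier failed lookup of the
        # same path cannot be answered from a client-side cache.
        path = os.path.join(directory, "missing-" + uuid.uuid4().hex)
        start = time.perf_counter()
        try:
            os.stat(path)
        except FileNotFoundError:
            pass  # expected: the file does not exist
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, mean, max) of the per-call durations in seconds."""
    return (
        statistics.median(durations),
        statistics.mean(durations),
        max(durations),
    )
```

Running summarize(time_missing_stats(path_on_the_mount)) on the same
client against 3.2.5 and then 3.2.6 or 3.3.0 should show whether failed
lookups alone account for the slowdown.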
Sorry this is slightly vague. I can run some more tests/benchmarks to try and
figure out what's going on in more detail, but thought I would ask here first
in case this is related to a known issue.
Thanks,
Simon
_______________________________________________
Gluster-users mailing list
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users