Hi folks,
I'm seeing reasonable performance when I run rados
benchmarks, but really slow I/O when reading from or
writing to a mounted Ceph filesystem. The rados benchmarks
show about 150 MB/s for both read and write, but when I
go to a client machine with the Ceph filesystem mounted
and rsync a large (60 GB) directory tree onto it,
I get rates of only 2-5 MB/s.
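
For a sense of scale, here is a rough sketch of what the rates quoted above mean for copying the 60 GB tree. It assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and a steady rate, neither of which the measurements guarantee; the name `transfer_hours` is made up for illustration.

```python
# Back-of-the-envelope transfer times for the 60 GB tree.
# Assumption: decimal units; the post does not say GB vs GiB.
TREE_BYTES = 60 * 10**9


def transfer_hours(total_bytes, rate_mb_per_s):
    """Hours needed to move total_bytes at a steady rate given in MB/s."""
    seconds = total_bytes / (rate_mb_per_s * 10**6)
    return seconds / 3600


for label, rate in [
    ("rados bench", 150),
    ("rsync, best case", 5),
    ("rsync, worst case", 2),
]:
    hours = transfer_hours(TREE_BYTES, rate)
    print(f"{label:>18}: {hours:6.2f} h at {rate} MB/s")
```

At the benchmark rate the copy would be a matter of minutes; at the observed rsync rates it is a matter of hours.
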
The OSDs and MDSs are all running 64-bit CentOS 6.3
with the stock CentOS 2.6.32 kernel. The client is also
64-bit CentOS 6.3, but it's running the "elrepo" 3.5.4 kernel.
There are four OSDs, each with a hardware RAID 5 array
and an SSD for the OSD journal. The primary network
is a gigabit network, and the OSD, MDS and MON
machines have a dedicated backend gigabit network on a
second network interface.
Locally on the OSD, "hdparm -t -T" reports read rates
of ~350 MB/s, and bonnie++ shows:
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
osd-local    23800M  1037  99 316048  92 131023  19  2272  98 312781  21 521.0  24
Latency               13103us     183ms     123ms   15316us     100ms   75899us
Version  1.96       ------Sequential Create------ --------Random Create--------
osd-local           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 16817  55 +++++ +++ 28786  77 23890  78 +++++ +++ 27128  75
Latency               21549us     105us     134us     902us      12us     104us
While rsyncing the files, the ceph logs show lots
of warnings of the form:
[WRN] : slow request 91.848407 seconds old, received at 2012-09-26
09:30:52.252449: osd_op(client.5310.1:56400 1000026eda0.00001ec8 [write
2093056~4096] 0.aa047db8 snapc 1=[]) currently waiting for sub ops
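
To get an overview of many such warnings, the fields can be pulled out of each one. The following is a minimal sketch based only on the single sample above: the regular expression, the function name `parse_slow_request`, and the reading of `2093056~4096` as offset~length are assumptions, not a documented log format.

```python
import re

# Pattern derived from the one sample warning in this post; other
# warning variants may not match (assumption, not a documented format).
SLOW_RE = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, received at "
    r"(?P<received>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"osd_op\((?P<client>\S+) (?P<object>\S+) "
    r"\[(?P<op>\w+) (?P<offset>\d+)~(?P<length>\d+)\]"
    r".*? currently (?P<state>.+)$"
)


def parse_slow_request(text):
    """Return a dict of fields from one slow-request warning, or None."""
    # Mail wrapping can split a warning across lines; collapse whitespace.
    flat = " ".join(text.split())
    match = SLOW_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    fields["age"] = float(fields["age"])
    # Assumed meaning of "2093056~4096": byte offset, then length.
    fields["offset"] = int(fields["offset"])
    fields["length"] = int(fields["length"])
    return fields


sample = """[WRN] : slow request 91.848407 seconds old, received at 2012-09-26
09:30:52.252449: osd_op(client.5310.1:56400 1000026eda0.00001ec8 [write
2093056~4096] 0.aa047db8 snapc 1=[]) currently waiting for sub ops"""

parsed = parse_slow_request(sample)
print(parsed["age"], parsed["op"], parsed["length"], parsed["state"])
```

Grouping the parsed warnings by `state` or by `object` would show whether the delays cluster on particular operations.
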
Watching the traffic with Wireshark shows bursts of
activity separated by long idle periods (30-60 sec).
My first thought was that I was seeing a kind of
"bufferbloat". The SSDs are 120 GB, so the journal could easily
hold enough data to take a long time to flush. I switched to a
journal file limited to 1 GB, but I still see the same slow
behavior.
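
The journal-size reasoning can be roughed out numerically. This sketch assumes a completely full journal being flushed at the bonnie++ sequential block-write rate shown earlier (316048 K/sec, read here as KiB/s) and decimal gigabytes; both are simplifications, and `drain_seconds` is a made-up name.

```python
def drain_seconds(journal_bytes, drain_rate_bytes_per_s):
    """Seconds to flush a completely full journal at a steady rate."""
    return journal_bytes / drain_rate_bytes_per_s


# Assumption: the bonnie++ "K/sec" block-write figure is KiB per second.
BLOCK_WRITE_RATE = 316048 * 1024

for label, size in [
    ("120 GB SSD journal", 120 * 10**9),
    ("1 GB journal file", 1 * 10**9),
]:
    secs = drain_seconds(size, BLOCK_WRITE_RATE)
    print(f"{label:>20}: {secs:8.1f} s to flush")
```

Under these assumptions a full 120 GB journal would take on the order of six minutes to flush and a 1 GB journal about three seconds, so a journal backlog alone would not obviously account for 30-60 second stalls with the smaller journal.
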
Any advice about how to go about debugging this would
be appreciated.
Thanks,
Bryan
--
========================================================================
Bryan Wright |"If you take cranberries and stew them like
Physics Department | applesauce, they taste much more like prunes
University of Virginia | than rhubarb does." -- Groucho
Charlottesville, VA 22901|
(434) 924-7218 | [email protected]
========================================================================