Re: slow requests, hunting for new mon

2013-02-14 Thread Chris Dunlop
On 2013-02-12, Chris Dunlop ch...@onthe.net.au wrote:
 Hi,

 What are likely causes for "slow requests" and "monclient: hunting for new
 mon" messages? E.g.:

 2013-02-12 16:27:07.318943 7f9c0bc16700  0 monclient: hunting for new mon
 ...
 2013-02-12 16:27:45.892314 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 
 included below; oldest blocked for > 30.383883 secs
 2013-02-12 16:27:45.892323 7f9c13c26700  0 log [WRN] : slow request 30.383883 
 seconds old, received at 2013-02-12 16:27:15.508374: 
 osd_op(client.9821.0:122242 rb.0.209f.74b0dc51.0120 [write 
 921600~4096] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892328 7f9c13c26700  0 log [WRN] : slow request 30.383782 
 seconds old, received at 2013-02-12 16:27:15.508475: 
 osd_op(client.9821.0:122243 rb.0.209f.74b0dc51.0120 [write 
 987136~4096] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892334 7f9c13c26700  0 log [WRN] : slow request 30.383720 
 seconds old, received at 2013-02-12 16:27:15.508537: 
 osd_op(client.9821.0:122244 rb.0.209f.74b0dc51.0120 [write 
 1036288~8192] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892338 7f9c13c26700  0 log [WRN] : slow request 30.383684 
 seconds old, received at 2013-02-12 16:27:15.508573: 
 osd_op(client.9821.0:122245 rb.0.209f.74b0dc51.0122 [write 
 1454080~4096] 2.fff29a9a) v4 currently no flag points reached
 2013-02-12 16:27:45.892341 7f9c13c26700  0 log [WRN] : slow request 30.328986 
 seconds old, received at 2013-02-12 16:27:15.563271: 
 osd_op(client.9821.0:122246 rb.0.209f.74b0dc51.0122 [write 
 1482752~4096] 2.fff29a9a) v4 currently no flag points reached

OK, for the sake of anyone who might come across this thread when
searching for similar issues...

http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#slow-or-unresponsive-osd

...unfortunately the error message in the link above says "old request"
rather than "slow request" (old code?), so that page doesn't come up
when googling for the "slow request" message. The page needs
updating.
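
For anyone else chasing these: on recent releases you can also ask the OSD
itself what a stuck op is waiting on via its admin socket (assuming the
admin socket is enabled at its default path; the osd id below is just an
example), e.g.:

  # list the requests currently in flight on osd.1, with their age and state
  ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok dump_ops_in_flight

  # overall cluster state while the warnings are appearing
  ceph -s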

The underlying problem in our case seems to have been spikes in the
number of IOPS going to the disks (e.g. watch 'iostat -x' output). Whilst
the disks were coping with steady state load, occasionally something (in
this case, activity in a vm running on rbd) would cause a spike in
activity and the disks couldn't cope. I'd initially looked at the amount
of data going to the disks and thought it was well within the disks'
capabilities, however I hadn't considered the IOPS.
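
For reference, this is roughly what we were watching on the OSD hosts
(plain iostat, nothing Ceph-specific):

  # extended per-device stats every 5 seconds; the first report is the
  # since-boot average, so ignore it.  r/s + w/s is the IOPS actually
  # hitting each disk, and %util sitting near 100 with await climbing
  # means the disk isn't keeping up with the spikes.
  iostat -x 5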

The (partial?) solution was to move the journals onto a separate device,
halving the IOPS going to the data disk (write journal, write data) as
well as avoiding having the heads slamming back and forth between the
data and journal. We're continuing to watch the IOPS and will add more
OSDs to spread the load further if necessary.
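
In case it helps anyone, this is roughly the procedure we followed per OSD
(sketched from memory; "1" and /dev/sdb1 are placeholders for your own osd
id and journal partition, so double-check against the docs before copying):

  # stop the OSD and flush its current journal into the object store
  service ceph stop osd.1
  ceph-osd -i 1 --flush-journal

  # point the OSD at the new journal device in ceph.conf, e.g.
  #   [osd.1]
  #   osd journal = /dev/sdb1

  # create the journal on the new device and bring the OSD back
  ceph-osd -i 1 --mkjournal
  service ceph start osd.1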

I still don't know what the hunting messages actually indicate, but
they've also disappeared since fixing the slow request messages.

Incidentally, it strikes me that there is a significant amount of write
amplification going on when running vms with a file system such as xfs
or ext4 (with journal) on top of rbd/rados (with journal) on top of xfs
(with journal). I.e. a single write from a vm can turn into up to 8
separate writes by the time it hits the underlying xfs filesystem. I
think this is why our ceph setup is struggling at far less load on the
same hardware compared to the drbd setup we're wanting to replace.
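
To spell out where the "up to 8" comes from (worst case, counting only a
single replica and ignoring filesystem metadata):

  guest fs (ext4/xfs, with journal):  journal write + data write    = x2
  ceph osd:                           osd journal + filestore data  = x2
  backing xfs under the osd:          xfs journal + data            = x2

  one write in the vm  =>  up to 2 x 2 x 2 = 8 writes hitting the platter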

Cheers!

Chris.



Re: slow requests, hunting for new mon

2013-02-14 Thread Sage Weil
On Fri, 15 Feb 2013, Chris Dunlop wrote:
 On 2013-02-12, Chris Dunlop ch...@onthe.net.au wrote:
  Hi,
 
  What are likely causes for "slow requests" and "monclient: hunting for new
  mon" messages? E.g.:
 
  2013-02-12 16:27:07.318943 7f9c0bc16700  0 monclient: hunting for new mon
  ...
  2013-02-12 16:27:45.892314 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 
  included below; oldest blocked for > 30.383883 secs
  2013-02-12 16:27:45.892323 7f9c13c26700  0 log [WRN] : slow request 
  30.383883 seconds old, received at 2013-02-12 16:27:15.508374: 
  osd_op(client.9821.0:122242 rb.0.209f.74b0dc51.0120 [write 
  921600~4096] 2.981cf6bc) v4 currently no flag points reached
  2013-02-12 16:27:45.892328 7f9c13c26700  0 log [WRN] : slow request 
  30.383782 seconds old, received at 2013-02-12 16:27:15.508475: 
  osd_op(client.9821.0:122243 rb.0.209f.74b0dc51.0120 [write 
  987136~4096] 2.981cf6bc) v4 currently no flag points reached
  2013-02-12 16:27:45.892334 7f9c13c26700  0 log [WRN] : slow request 
  30.383720 seconds old, received at 2013-02-12 16:27:15.508537: 
  osd_op(client.9821.0:122244 rb.0.209f.74b0dc51.0120 [write 
  1036288~8192] 2.981cf6bc) v4 currently no flag points reached
  2013-02-12 16:27:45.892338 7f9c13c26700  0 log [WRN] : slow request 
  30.383684 seconds old, received at 2013-02-12 16:27:15.508573: 
  osd_op(client.9821.0:122245 rb.0.209f.74b0dc51.0122 [write 
  1454080~4096] 2.fff29a9a) v4 currently no flag points reached
  2013-02-12 16:27:45.892341 7f9c13c26700  0 log [WRN] : slow request 
  30.328986 seconds old, received at 2013-02-12 16:27:15.563271: 
  osd_op(client.9821.0:122246 rb.0.209f.74b0dc51.0122 [write 
  1482752~4096] 2.fff29a9a) v4 currently no flag points reached
 
 OK, for the sake of anyone who might come across this thread when
 searching for similar issues...
 
 http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#slow-or-unresponsive-osd
 
 ...unfortunately the error message in the link above says "old request"
 rather than "slow request" (old code?), so that page doesn't come up
 when googling for the "slow request" message. The page needs
 updating.

Updated, thanks!

 The underlying problem in our case seems to have been spikes in the
 number of IOPS going to the disks (e.g. watch 'iostat -x' output). Whilst
 the disks were coping with steady state load, occasionally something (in
 this case, activity in a vm running on rbd) would cause a spike in
 activity and the disks couldn't cope. I'd initially looked at the amount
 of data going to the disks and thought it was well within the disks'
 capabilities, however I hadn't considered the IOPS.
 
 The (partial?) solution was to move the journals onto a separate device,
 halving the IOPS going to the data disk (write journal, write data) as
 well as avoiding having the heads slamming back and forth between the
 data and journal. We're continuing to watch the IOPS and will add more
 OSDs to spread the load further if necessary.
 
 I still don't know what the hunting messages actually indicate, but
 they've also disappeared since fixing the slow request messages.

This usually means the monitor it was connected to stopped responding, and 
we (the OSD or client) are trying to reconnect (to a random monitor).
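
If it comes back, a quick sanity check is whether the mons are actually
reachable and in quorum at that moment, e.g. (output details vary a bit
between versions):

  # cluster summary, including which mons are currently in quorum
  ceph -s

  # more detail on the current quorum and monmap
  ceph quorum_status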

 Incidentally, it strikes me that there is a significant amount of write
 amplification going on when running vms with a file system such as xfs
 or ext4 (with journal) on top of rbd/rados (with journal) on top of xfs
 (with journal). I.e. a single write from a vm can turn into up to 8
 separate writes by the time it hits the underlying xfs filesystem. I
 think this is why our ceph setup is struggling at far less load on the
 same hardware compared to the drbd setup we're wanting to replace.

Currently, yes.  There is always going to be some additional overhead 
because the object data is stored in a file system.  We were/are doing 
several other non-optimal things too; however, that is being improved in 
the current master branch (moving some metadata into leveldb, which does a 
better job of managing the IO pattern).  Stay tuned!

sage



Re: slow requests, hunting for new mon

2013-02-12 Thread Chris Dunlop
On Tue, Feb 12, 2013 at 06:28:15PM +1100, Chris Dunlop wrote:
 Hi,
 
 What are likely causes for "slow requests" and "monclient: hunting for new
 mon" messages? E.g.:
 
 2013-02-12 16:27:07.318943 7f9c0bc16700  0 monclient: hunting for new mon
 ...
 2013-02-12 16:27:45.892314 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 
 included below; oldest blocked for > 30.383883 secs
 2013-02-12 16:27:45.892323 7f9c13c26700  0 log [WRN] : slow request 30.383883 
 seconds old, received at 2013-02-12 16:27:15.508374: 
 osd_op(client.9821.0:122242 rb.0.209f.74b0dc51.0120 [write 
 921600~4096] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892328 7f9c13c26700  0 log [WRN] : slow request 30.383782 
 seconds old, received at 2013-02-12 16:27:15.508475: 
 osd_op(client.9821.0:122243 rb.0.209f.74b0dc51.0120 [write 
 987136~4096] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892334 7f9c13c26700  0 log [WRN] : slow request 30.383720 
 seconds old, received at 2013-02-12 16:27:15.508537: 
 osd_op(client.9821.0:122244 rb.0.209f.74b0dc51.0120 [write 
 1036288~8192] 2.981cf6bc) v4 currently no flag points reached
 2013-02-12 16:27:45.892338 7f9c13c26700  0 log [WRN] : slow request 30.383684 
 seconds old, received at 2013-02-12 16:27:15.508573: 
 osd_op(client.9821.0:122245 rb.0.209f.74b0dc51.0122 [write 
 1454080~4096] 2.fff29a9a) v4 currently no flag points reached
 2013-02-12 16:27:45.892341 7f9c13c26700  0 log [WRN] : slow request 30.328986 
 seconds old, received at 2013-02-12 16:27:15.563271: 
 osd_op(client.9821.0:122246 rb.0.209f.74b0dc51.0122 [write 
 1482752~4096] 2.fff29a9a) v4 currently no flag points reached

Sorry, I forgot to add what led me to look into this in the first place...

The b2 machine is running a number of libvirt / kvm virtuals on rbd via
librados:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <auth username='xx'>
    <secret type='ceph' uuid='----'/>
  </auth>
  <source protocol='rbd' name='rbd/foo:rbd_cache=1'/>
  <target dev='vdb' bus='virtio'/>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/>
</disk>

Around the time the slow request messages pop up, the virtual machines
experience a spike in disk latency, e.g. from inside one of the virtuals:

                     load %user %nice %sys  %iow %stl %idle  dev rrqm/s wrqm/s  r/s   w/s rkB/s  wkB/s arq-sz aqu-sz     await rwait    wwait  %util
2013-02-12-18:55:23   1.7   0.3   0.0  0.9  42.2  0.0  18.5  vdb    0.0   25.8  0.0   9.6  0.00 136.53  28.44   1.34    139.56  0.00    139.6   70.7
2013-02-12-18:55:38   1.5   0.9   0.0  0.4  35.8  0.0  42.8  vdb    0.0   26.6  0.0  10.8  0.00 143.20  26.52   0.46     41.98  0.00     42.0   40.0
2013-02-12-18:55:53   1.4   0.7   0.0  0.4  38.0  0.0  51.7  vdb    0.0    7.5  0.0   6.8  0.00  53.33  15.69   0.46     68.08  0.00     68.1   38.8
2013-02-12-18:56:10   2.2   0.0   0.0  0.1   8.0  0.0  15.8  vdb    0.0    2.1  0.0   0.2  0.00   9.18  78.00   0.98   4164.00  0.00   4164.0   70.6
2013-02-12-18:56:32   3.7   0.0   0.0  0.1   0.0  0.0   0.0  vdb    0.0    0.1  0.0   0.8  0.00   3.27   8.00  24.12  14519.78  0.00  14519.8  100.0
2013-02-12-18:56:47   5.1   0.5   0.0  0.6   0.0  0.0   0.0  vdb    0.0   11.5  0.0   5.4  0.00  65.87  24.40   1.60   3620.15  0.00   3620.1   88.4
2013-02-12-18:57:03   5.2   0.6   0.0  1.1   3.5  0.0   0.0  vdb    0.0   19.6  0.0   6.9  0.00 101.25  29.19   1.13    162.41  0.00    162.4   87.4
2013-02-12-18:57:20   5.2   0.6   0.0  0.9   9.7  0.0   0.0  vdb    0.0   44.6  0.0   9.7  0.00 214.82  44.27   2.41    248.22  0.00    248.2   85.3
2013-02-12-18:57:36   4.4   0.4   0.0  0.5  28.7  0.0  46.5  vdb    0.0   17.6  0.0   5.2  0.00  87.50  33.33   0.56    107.14  0.00    107.1   48.5

...corresponding to this in the b5 / osd.1 log:

2013-02-12 18:52:08.812880 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 18:55:18.851791 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 18:56:25.414948 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 
included below; oldest blocked for > 30.372124 secs
2013-02-12 18:56:25.414958 7f9c13c26700  0 log [WRN] : slow request 30.372124 
seconds old, received at 2013-02-12 18:55:55.042767: 
osd_op(client.9821.0:144779 rb.0.209f.74b0dc51.0023 [write 
1593344~4096] 2.1882ddb7) v4 currently no flag points reached
...
2013-02-12 18:57:13.427008 7f9c13c26700  0 log [WRN] : slow request 40.721769 
seconds old, received at 2013-02-12 18:56:32.705190: 
osd_op(client.9821.0:146756 rb.0.209f.74b0dc51.0128 [write 819200~8192] 
2.b4390173) v4 currently commit sent
2013-02-12 18:59:43.886517 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 19:02:53.911641 7f9c0bc16700  0 monclient: hunting for new mon


Cheers,

Chris.