Hello cephers, 

I've been testing flashcache and enhanceio block device caching for the osds 
and i've noticed i have started getting the slow requests. The caching type 
that I use is ready only, so all writes bypass the caching ssds and go directly 
to osds, just like what it used to be before introducing the caching layer. 
Prior to introducing caching, i rarely had the slow requests. Judging by the 
logs, all slow requests are looking like these: 


2014-10-16 01:09:15.600807 osd.7 192.168.168.200:6836/32031 100 : [WRN] slow 
request 30.999641 seconds old, received at 2014-10-16 01:08:44.601040: 
osd_op(client.36035566.0:16626375 rbd_data.51da686763845e 
.0000000000005a15 [set-alloc-hint object_size 4194304 write_size 4194304,write 
2007040~16384] 5.7b16421b snapc c4=[c4] ack+ondisk+write e61892) v4 currently 
waiting for subops from 9 
2014-10-16 01:09:15.600811 osd.7 192.168.168.200:6836/32031 101 : [WRN] slow 
request 30.999581 seconds old, received at 2014-10-16 01:08:44.601100: 
osd_op(client.36035566.0:16626376 rbd_data.51da686763845e 
.0000000000005a15 [set-alloc-hint object_size 4194304 write_size 4194304,write 
2039808~16384] 5.7b16421b snapc c4=[c4] ack+ondisk+write e61892) v4 currently 
waiting for subops from 9 
2014-10-16 01:09:16.185530 osd.2 192.168.168.200:6811/31891 76 : [WRN] 20 slow 
requests, 1 included below; oldest blocked for > 57.003961 secs 
2014-10-16 01:09:16.185564 osd.2 192.168.168.200:6811/31891 77 : [WRN] slow 
request 30.098574 seconds old, received at 2014-10-16 01:08:46.086854: 
osd_op(client.38917806.0:3481697 rbd_data.251d05e3db45a54. 
0000000000000304 [stat,set-alloc-hint object_size 4194304 write_size 
4194304,write 2732032~8192] 5.e4683bbb ack+ondisk+write e61892) v4 currently 
waiting for subops from 11 
2014-10-16 01:09:16.601020 osd.7 192.168.168.200:6836/32031 102 : [WRN] 16 slow 
requests, 2 included below; oldest blocked for > 43.531516 secs 


In general, I see between 0 and about 2,000 slow request log entries per day. 
On one day I saw over 100k entries, but it only happened once. 

I am struggling to understand what is casing the slow requests? If all the 
writes go the same path as before caching was introduced, how come I am getting 
them? How can I investigate this further? 

Thanks 

Andrei 

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Reply via email to