Hi list,
For an RBD write request, Ceph needs to do 3 writes:
2013-01-10 13:10:15.539967 7f52f516c700 10 filestore(/data/osd.21)
_do_transaction on 0x327d790
2013-01-10 13:10:15.539979 7f52f516c700 15 filestore(/data/osd.21) write
meta/516b801c/pglog_2.1a/0//-1 36015~147
2013-01-10 13:10:15.540016 7f52f516c700 15 filestore(/data/osd.21) path:
/data/osd.21/current/meta/DIR_C/pglog\u2.1a__0_516B801C__none
2013-01-10 13:10:15.540164 7f52f516c700 15 filestore(/data/osd.21) write
meta/28d2f4a8/pginfo_2.1a/0//-1 0~496
2013-01-10 13:10:15.540189 7f52f516c700 15 filestore(/data/osd.21) path:
/data/osd.21/current/meta/DIR_8/pginfo\u2.1a__0_28D2F4A8__none
2013-01-10 13:10:15.540217 7f52f516c700 10 filestore(/data/osd.21)
_do_transaction on 0x327d708
2013-01-10 13:10:15.540222 7f52f516c700 15 filestore(/data/osd.21) write
2.1a_head/8abf341a/rb.0.106e.6b8b4567.0000000002d3/head//2 3227648~524288
2013-01-10 13:10:15.540245 7f52f516c700 15 filestore(/data/osd.21) path:
/data/osd.21/current/2.1a_head/rb.0.106e.6b8b4567.0000000002d3__head_8ABF341A__2
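As an illustration (my own sketch, not Ceph code), something like the following Python snippet can count the filestore writes per transaction from a debug log like the one above. The log path is an assumption, and it assumes each "write" entry sits on a single line (the lines above are wrapped by the mail client):

import re
from collections import defaultdict

LOG_PATH = "/var/log/ceph/ceph-osd.21.log"  # hypothetical log location

txn_re = re.compile(r"_do_transaction on (0x[0-9a-f]+)")
write_re = re.compile(r"filestore\(\S+\) write (\S+) (\d+)~(\d+)")

writes = defaultdict(list)   # transaction address -> list of (object, offset, length)
current = None
with open(LOG_PATH) as f:
    for line in f:
        m = txn_re.search(line)
        if m:
            current = m.group(1)
            continue
        m = write_re.search(line)
        if m and current:
            obj, off, length = m.group(1), int(m.group(2)), int(m.group(3))
            writes[current].append((obj, off, length))

for txn, ops in writes.items():
    print("%s: %d write(s)" % (txn, len(ops)))
    for obj, off, length in ops:
        print("    %s %d~%d" % (obj, off, length))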
If XFS is used as the backend file system, running on top of a traditional SATA disk, this introduces a lot of seeks and therefore reduces bandwidth. A blktrace is available here (http://ww3.sinaimg.cn/mw690/6e1aee47jw1e0qsbxbvddj.jpg) to demonstrate the issue (a single client running dd on top of a new RBD volume).
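For completeness, roughly how the test can be reproduced (device names, image size and trace duration are all assumptions; it needs root and blktrace installed):

import subprocess

OSD_DISK = "/dev/sdb"   # assumed backing disk of the OSD
RBD_DEV = "/dev/rbd0"   # assumed device of the freshly mapped test image

# Capture 60 seconds of block-layer events on the OSD disk.
tracer = subprocess.Popen(
    ["blktrace", "-d", OSD_DISK, "-w", "60", "-o", "osd_trace"])

# Single client doing sequential writes onto the RBD device.
subprocess.check_call(
    ["dd", "if=/dev/zero", "of=" + RBD_DEV,
     "bs=1M", "count=2048", "oflag=direct"])

tracer.wait()
# Then inspect with blkparse -i osd_trace (or plot it, e.g. with seekwatcher).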
Then I tried moving /osd.X/current/meta to a separate disk, and the bandwidth improved significantly (see the blktrace at http://ww4.sinaimg.cn/mw690/6e1aee47jw1e0qsadz1bij.jpg).
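The hack itself amounts to something like the sketch below (the target mount point is an assumption and must already exist; stop the OSD daemon before doing this and restart it afterwards):

import os
import shutil

OSD_CURRENT = "/data/osd.21/current"      # from the log paths above
META_DIR = os.path.join(OSD_CURRENT, "meta")
NEW_META = "/mnt/meta_ssd/osd.21_meta"    # hypothetical mount on a separate disk

shutil.move(META_DIR, NEW_META)   # copies across filesystems, then removes the source
os.symlink(NEW_META, META_DIR)    # the OSD keeps using .../current/meta via the symlink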
I haven't tested other access patterns yet, but it looks to me like moving this metadata to a separate disk (an SSD, or a SATA disk with btrfs) will benefit Ceph write performance. Is that true? Will Ceph introduce this feature in the future? Are there any potential problems with such a hack?
Xiaoxi