Hi all,

I ran into a problem during an availability test.

Setup:
Linux kernel: 3.2.0
OS: Ubuntu 12.04
Storage server: 11 HDDs (7200 RPM, 1 TB each, one OSD per HDD) + 10GbE NIC
RAID card: LSI MegaRAID SAS 9260-4i; each HDD is configured as a single-disk
RAID0 with Write Policy: Write Back with BBU, Read Policy: ReadAhead,
IO Policy: Direct
Number of storage servers: 2

Ceph version: 0.48.2
Replicas: 2
Number of monitors: 3


The two storage servers form a cluster, and we use a Ceph client to create a
1 TB RBD image for testing. The client also has a 10GbE NIC and runs Linux
kernel 3.2.0 on Ubuntu 12.04.
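
For reference, we create and map the image on the client roughly like this
(the image name is a placeholder; rbd takes --size in MB):

rbd create test-img --size 1048576   # 1 TB image in the default rbd pool
rbd map test-img                     # appears on the client as /dev/rbd0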

We use fio to generate the workload.

fio commands (run against the mapped RBD device, shown here as /dev/rbd0):

[Sequential Read]
fio --name=seqread --filename=/dev/rbd0 --iodepth=32 --numjobs=1 \
    --runtime=120 --bs=65536 --rw=read --ioengine=libaio \
    --group_reporting --direct=1 --eta=always --ramp_time=10 --thinktime=10

[Sequential Write]
fio --name=seqwrite --filename=/dev/rbd0 --iodepth=32 --numjobs=1 \
    --runtime=120 --bs=65536 --rw=write --ioengine=libaio \
    --group_reporting --direct=1 --eta=always --ramp_time=10 --thinktime=10


Now I want to observe the Ceph cluster state when one storage server crashes,
so I shut down one storage server's networking. We expected read and write
operations to resume quickly, or even continue uninterrupted, while Ceph
recovers, but in our experiments both reads and writes pause for about 20~30
seconds during recovery.
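
While fio runs, we watch the cluster state from a monitor node, e.g.:

ceph -w          # streams cluster status and PG state changes
ceph osd tree    # shows which OSDs are marked up/down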

My questions are:
1. Is an I/O pause normal while Ceph is recovering?
2. Is the I/O pause unavoidable during recovery?
3. How can we reduce the I/O pause time? (Our current understanding of the
relevant settings is sketched below.)


Thanks!!