On Wed, Apr 8, 2015 at 7:36 PM, Lionel Bouton <[email protected]> wrote:
> On 04/08/15 18:24, Jeff Epstein wrote:
>> Hi, I'm having sporadic very poor performance running ceph. Right now
>> mkfs, even with nodiscard, takes 30 minutes or more. These kinds of
>> delays happen often but irregularly. There seems to be no common
>> denominator. Clearly, however, they make it impossible to deploy ceph
>> in production.
>>
>> I reported this problem earlier on ceph's IRC, and was told to add
>> nodiscard to mkfs. That didn't help. Here is the command that I'm
>> using to format an rbd:
>>
>> For example: mkfs.ext4 -text4 -m0 -b4096 -E nodiscard /dev/rbd1
>
> I probably won't be able to help much, but people knowing more will need
> at least:
> - your Ceph version,
> - the kernel version of the host on which you are trying to format
> /dev/rbd1,
> - which hardware and network you are using for this cluster (CPU, RAM,
> HDD or SSD models, network cards, jumbo frames, ...).
>
>>
>> Ceph says everything is okay:
>>
>> cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
>> health HEALTH_OK
>> monmap e1: 3 mons at
>> {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0},
>> election
>> epoch 12, quorum 0,1,2 a,b,c
>> osdmap e972: 6 osds: 6 up, 6 in
>> pgmap v4821: 4400 pgs, 44 pools, 5157 MB data, 1654 objects
>> 46138 MB used, 1459 GB / 1504 GB avail
>> 4400 active+clean
Are there any "slow request" warnings in the logs?
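Something like this should turn them up (log paths assume the default /var/log/ceph location; adjust for your setup):

```shell
# Look for "slow request" warnings in the cluster log and the
# per-OSD logs on each OSD host:
grep -i "slow request" /var/log/ceph/ceph.log /var/log/ceph/ceph-osd.*.log
```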
Assuming a 30 minute mkfs is somewhat reproducible, can you bump osd
and ms log levels and try to capture it?
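For example (the exact debug levels are just a suggestion, and the restore values below assume stock defaults, which may differ on your cluster):

```shell
# Raise OSD and messenger debug levels on all OSDs at runtime:
ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'

# ...reproduce the slow mkfs, collect the logs, then turn debugging
# back down so the logs don't fill the disk:
ceph tell osd.* injectargs '--debug-osd 1/5 --debug-ms 0/5'
```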
Thanks,
Ilya
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com