Hi Sage,
Thanks for the information. At least now I know my hardware is OK... :)
I will upgrade Ceph to the latest version and see what I find.

By the way, could you tell me how to set "osd heartbeat grace"?
I can't find how to do it in the Ceph wiki.
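
My best guess, going only by the option name, is that it is an ordinary
ceph.conf setting under the [osd] section, something like the sketch below.
The section placement and the value of 60 (which I assume is in seconds) are
my assumptions, not something I found documented, so please correct me if
this is wrong.

```ini
[osd]
    ; assumption: a larger grace lets heartbeats arrive later
    ; before a peer OSD is reported as down
    osd heartbeat grace = 60
```

I would then restart the OSDs so they pick up the new value.
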
Thanks!
--
Best Regards,
Sylar Shen

2011/3/28 Sage Weil <[email protected]>:
> On Mon, 28 Mar 2011, Sylar Shen wrote:
>> Hi,
>> I set up an environment of 20 servers: 2 MDSs, 3 MONs and 18 OSDs
>> (the 3 monitors run on the OSD servers).
>> My version is 0.24.3 and the OS is Fedora 14.
>> I ran into a problem while doing write tests.
>> Whether or not I was writing data, some OSDs were randomly marked
>> down and out, one by one, after a period of time.
>> When that happened, overall performance quickly got worse and worse.
>> I checked /var/log/ceph/osd.log but found nothing.
>> So I am curious whether anyone else has had the same problem.
>> Or maybe it's just a problem with my hardware......><
>
> Hi Sylar,
>
> This is/was a known problem.  There's a long thread from a couple weeks
> back with Jim Schutt debugging the issue.  We've fixed a few different
> things that have significantly improved the situation, but the heartbeats
> are still failing from time to time.
>
> I suspect using a more recent release will be sufficient at your scale,
> either 0.25.2 or the latest 'next' branch from git (there are autobuilt
> debs for that too).  You can also increase the 'osd heartbeat grace' to
> make the system less sensitive to the transient hangs that are preventing
> the heartbeats from going out.
>
> Please let us know what you find, either here or on #ceph.
>
> Thanks!
> sage