Re: [ceph-users] ceph journal failed?

2015-12-22 Thread yuyang
OK, you gave me the answer, thanks a lot.

But I don't know the answers to your questions.

Maybe someone else can answer.

--Original--
From: "Loris Cuoghi" <l...@stella-telecom.fr>
Date: Tue, Dec 22, 2015 07:31 PM
To: "ceph-users" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] ceph journal failed?

On 22/12/2015 09:42, yuyang wrote:
 Hello, everyone,
[snip snap]

Hi

 If the SSD fails or goes down, can the OSD still work?
 Is the OSD down, or can it only be read?

If you don't have a journal anymore, the OSD has already quit: it
can't continue writing, nor can it ensure data consistency, since
in-flight writes have probably been interrupted.

The Ceph community's general assumption is that a dead journal means a dead OSD.

But.

http://www.sebastien-han.fr/blog/2014/11/27/ceph-recover-osds-after-ssd-journal-failure/

How does this apply in reality?
Is the solution that Sébastien is proposing viable?
In most/all cases?
Will the OSD continue chugging along after this kind of surgery?
Is it necessary/suggested to deep scrub the OSD's placement groups ASAP?
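
In short, the procedure sketched in that post looks roughly like this
(OSD id and sysvinit-style service commands are illustrative; treat it
as an outline, not a tested recipe):

    # keep CRUSH from rebalancing while the OSD is down
    ceph osd set noout
    # stop the affected OSD (init syntax varies by distro/release)
    service ceph stop osd.0
    # after replacing the SSD and recreating the journal partition,
    # rebuild the journal and bring the OSD back
    ceph-osd -i 0 --mkjournal
    service ceph start osd.0
    ceph osd unset noout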
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph journal failed?

2015-12-22 Thread Christian Balzer

Hello,

On Wed, 23 Dec 2015 11:46:58 +0800 yuyang wrote:

> OK, you gave me the answer, thanks a lot.
>

Assume that a journal SSD failure means the loss of all associated OSDs.

So in your case a single SSD failure will cause the loss of a whole
node's data.

If you have 15 or more of those nodes, your cluster should be able to
handle the resulting I/O storm from recovering 9 OSDs, but with just a few
nodes you will have a severe performance impact and also risk data loss if
other failures occur during recovery.

Lastly, a 1:9 SSD journal to SATA ratio also sounds wrong when it comes to
performance: your SSD would need to be able to handle about 900 MB/s of sync
writes, and that's very expensive territory.
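
The arithmetic behind that figure, assuming roughly 100 MB/s of sustained
sequential writes per SATA spinner (a common ballpark, not a measurement):

    # every client write is journaled on the SSD before it reaches the
    # data disk, so one SSD fronting nine spinners must absorb their
    # combined write bandwidth
    disks=9; per_disk_mb_s=100
    echo "$((disks * per_disk_mb_s)) MB/s of sync writes"   # -> 900 MB/s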

Christian
 
> But I don't know the answers to your questions.
> 
> Maybe someone else can answer.
> 
> --Original--
> From: "Loris Cuoghi" <l...@stella-telecom.fr>
> Date: Tue, Dec 22, 2015 07:31 PM
> To: "ceph-users" <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] ceph journal failed?
> 
> On 22/12/2015 09:42, yuyang wrote:
>  Hello, everyone,
> [snip snap]
> 
> Hi
> 
>  If the SSD fails or goes down, can the OSD still work?
>  Is the OSD down, or can it only be read?
> 
> If you don't have a journal anymore, the OSD has already quit: it
> can't continue writing, nor can it ensure data consistency, since
> in-flight writes have probably been interrupted.
> 
> The Ceph community's general assumption is that a dead journal means a
> dead OSD.
> 
> But.
> 
> http://www.sebastien-han.fr/blog/2014/11/27/ceph-recover-osds-after-ssd-journal-failure/
> 
> How does this apply in reality?
> Is the solution that Sébastien is proposing viable?
> In most/all cases?
> Will the OSD continue chugging along after this kind of surgery?
> Is it necessary/suggested to deep scrub the OSD's placement groups ASAP?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph journal failed?

2015-12-22 Thread yuyang
Hello, everyone,
I have a ceph cluster with several nodes; every node has 1 SSD and 9 SATA disks.
Every SATA disk is used as an OSD, and in order to improve IO performance, the SSD
is used as the journal disk.
That is, there are 9 journal files on every SSD.
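
(A layout like that is typically built by pointing each data disk at the
shared SSD, so each OSD gets its own journal partition there; a sketch
with illustrative device names, using the ceph-disk tool of that era:

    # carve one journal partition per OSD out of the SSD (/dev/sda)
    ceph-disk prepare /dev/sdb /dev/sda
    ceph-disk prepare /dev/sdc /dev/sda
    # ...and so on for the remaining seven data disks
)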

If the SSD fails or goes down, can the OSD still work?
Is the OSD down, or can it only be read?

Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph journal failed?

2015-12-22 Thread Loris Cuoghi

On 22/12/2015 09:42, yuyang wrote:

Hello, everyone,

[snip snap]

Hi

> If the SSD fails or goes down, can the OSD still work?
> Is the OSD down, or can it only be read?

If you don't have a journal anymore, the OSD has already quit: it
can't continue writing, nor can it ensure data consistency, since
in-flight writes have probably been interrupted.

The Ceph community's general assumption is that a dead journal means a dead OSD.

But.

http://www.sebastien-han.fr/blog/2014/11/27/ceph-recover-osds-after-ssd-journal-failure/

How does this apply in reality?
Is the solution that Sébastien is proposing viable?
In most/all cases?
Will the OSD continue chugging along after this kind of surgery?
Is it necessary/suggested to deep scrub the OSD's placement groups ASAP?
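
(On that last question: a deep scrub can at least be requested by hand
with the standard CLI once the OSD is back; OSD and PG ids here are
illustrative:

    # deep-scrub every PG for which osd.0 is the primary
    ceph osd deep-scrub 0
    # or target individual placement groups
    ceph pg deep-scrub 1.2f
)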
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com