Hi,

Could you explain why "if the journal is lost, the OSD is lost"? If the 
journal is lost, only the part that had not yet been replayed is actually lost.
Take a similar case as an example: an OSD is down for some time, so its journal 
is out of date (it is missing part of the journal), yet it can still catch up 
with the other OSDs. Why?
That example suggests that either an outdated OSD can get the missing journal 
entries from the others, or "catching up" works by a different mechanism than 
the journal.
Could you explain?
 
 
 
thanks








At 2014-08-14 05:21:20, "Craig Lewis" <[email protected]> wrote:

If the journal is lost, the OSD is lost.  This can be a problem if you use a 
single SSD to hold the journals for many OSDs.
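
To make the shared-journal risk concrete, here is a minimal Python sketch. It 
is only an illustration, not anything taken from Ceph: the OSD names, the 
device names, and the helper osds_lost_with() are all hypothetical, and the 
only rule it encodes is the one stated above (lose the journal, lose the OSD).

```python
# Hypothetical layout (an assumption for illustration only):
# OSD id -> device that holds that OSD's journal.
JOURNAL_DEVICE = {
    "osd.0": "ssd-a",
    "osd.1": "ssd-a",
    "osd.2": "ssd-a",
    "osd.3": "ssd-b",
}


def osds_lost_with(device, layout):
    """Return, sorted, the OSDs whose journal lives on the failed device.

    Encodes only the rule "if the journal is lost, the OSD is lost";
    it does not model Ceph's actual recovery behaviour.
    """
    return sorted(osd for osd, dev in layout.items() if dev == device)


if __name__ == "__main__":
    for dev in sorted(set(JOURNAL_DEVICE.values())):
        lost = osds_lost_with(dev, JOURNAL_DEVICE)
        print(f"{dev} fails -> {len(lost)} OSD(s) lost: {', '.join(lost)}")
```

With this layout, losing ssd-a takes three OSDs with it while losing ssd-b 
takes only one, which is why the ratio of OSDs per journal SSD matters.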


There has been some discussion about making the OSDs able to recover from a 
lost journal, but I haven't heard anything else about it.  I haven't been 
paying much attention to the developer mailing list though.




For your second question, I'd start by looking at the source code in 
src/osd/ReplicatedPG.cc (for standard replication), or src/osd/ECBackend.cc 
(for Erasure Coding).  I'm not a Ceph developer though, so that might not be 
the right place to start.





On Tue, Aug 12, 2014 at 7:08 PM, yuelongguang <[email protected]> wrote:

Hi all,

1. Can an OSD start up if its journal is lost and has not been replayed?

2. How does an OSD catch up to the latest epoch? Where is the code for that? 
It would be best to consider both cases: the journal is lost, and it is not.
As I understand it, the journal only contains metadata and read/write 
operations, not the file data itself.
 
 
Thanks



_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


