Rule of thumb with batteries:
- the closer to their proper temperature you run them, the more life you get out of them
- the more overpowered the battery is for your application, the longer it will survive.
Get yourself an LSI 94** controller and use it as an HBA and you will be fine. But get MORE DRIVES !!!!! …

> On 28 Aug 2017, at 23:10, hjcho616 <hjcho...@yahoo.com> wrote:
>
> Thank you Tomasz and Ronny. I'll have to order some hdd soon and try these
> out. Car battery idea is nice! I may try that.. =) Do they last longer?
> Ones that fit the UPS original battery spec didn't last very long... part of
> the reason why I gave up on them.. =P My wife probably won't like the idea
> of car battery hanging out though ha!
>
> The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard
> doesn't have any additional SATA connectors available. Would it be safe to
> add another OSD host?
>
> Regards,
> Hong
>
>
> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz <tom.kusmi...@gmail.com>
> wrote:
>
>
> Sorry for being brutal … anyway
> 1. get the battery for the UPS ( a car battery will do as well, I’ve modded one UPS
> in the past with a truck battery and it was working like a charm :D )
> 2. get spare drives and put those in because your cluster CAN NOT get out of
> error due to lack of space
> 3. follow the advice of Ronny Aasen on how to recover data from hard drives
> 4. get cooling for the drives or you will lose more !
>
>
>> On 28 Aug 2017, at 22:39, hjcho616 <hjcho...@yahoo.com
>> <mailto:hjcho...@yahoo.com>> wrote:
>>
>> Tomasz,
>>
>> Those machines are behind a surge protector. Doesn't appear to be a good
>> one! I do have a UPS... but it is my fault... no battery. Power was pretty
>> reliable for a while... and UPS was just beeping every chance it had,
>> disrupting some sleep.. =P So running on surge protector only. I am
>> running this in home environment. So far, HDD failures have been very rare
>> for this environment. =) It just doesn't get loaded as much! I am not sure
>> what to expect, seeing that "unfound" and just a feeling of possibility of
>> maybe getting OSD back made me excited about it. =) Thanks for letting me
>> know what should be the priority.
>> I just lack experience and knowledge in
>> this. =) Please do continue to guide me through this.
>>
>> Thank you for the decode of those SMART messages! I do agree that looks like
>> it is on its way out. I would like to know how to get good portion of it
>> back if possible. =)
>>
>> I think I just set the size and min_size to 1.
>> # ceph osd lspools
>> 0 data,1 metadata,2 rbd,
>> # ceph osd pool set rbd size 1
>> set pool 2 size to 1
>> # ceph osd pool set rbd min_size 1
>> set pool 2 min_size to 1
>>
>> Seems to be doing some backfilling work.
>>
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs
>> backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded;
>> 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs
>> stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130
>> pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests
>> are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%);
>> recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990
>> unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD
>> present but 'sortbitwise' flag is not set
>>
>>
>> Regards,
>> Hong
>>
>>
>> On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz <tom.kusmi...@gmail.com
>> <mailto:tom.kusmi...@gmail.com>> wrote:
>>
>>
>> So to decode a few things about your disk:
>>
>>   1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       37
>> 37 read errors and only one sector marked as pending - fun disk :/
>>
>> 181 Program_Fail_Cnt_Total  0x0022   099   099   000    Old_age   Always       -       35325174
>> So firmware has quite a few bugs, that’s nice
>>
>> 191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       2855
>> disk was thrown around while operational, even more nice.
>>
>> 194 Temperature_Celsius     0x0002   047   041   000    Old_age   Always       -       53 (Min/Max 15/59)
>> if your disk passes 50 you should not consider using it, high temperatures
>> demagnetise the platter layer and you will see more errors in the very near future.
>>
>> 197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
>> as mentioned before :)
>>
>> 200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       4222
>> your heads keep missing tracks … bent ? I don’t even know how to comment
>> here.
>>
>>
>> generally fun drive you’ve got there … rescue as much as you can and throw
>> it away !!!
>>
>
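For anyone who wants to script the SMART check walked through in the quoted mail, here is a rough Python sketch, not a definitive tool. It assumes the usual ten-column attribute table (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value) as seen in the rows quoted above; the column spacing in the sample is reconstructed. The 50 degree limit is the rule of thumb from this thread, while the "any pending sector is a warning" limit and the names `SMART_ROWS`, `LIMITS`, `parse_rows` and `find_warnings` are my own assumptions for illustration.

```python
import re

# Attribute rows copied from the SMART output quoted in this thread
# (column spacing reconstructed).
SMART_ROWS = """\
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       37
181 Program_Fail_Cnt_Total  0x0022   099   099   000    Old_age   Always       -       35325174
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       2855
194 Temperature_Celsius     0x0002   047   041   000    Old_age   Always       -       53 (Min/Max 15/59)
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       4222
"""

# Illustrative limits on the raw value (assumptions, not vendor thresholds).
LIMITS = {
    "Temperature_Celsius": 50,    # rule of thumb from the thread
    "Current_Pending_Sector": 0,  # assumed: any pending sector is worth a look
}


def parse_rows(text):
    """Return {attribute_name: leading integer of the raw value}."""
    rows = {}
    for line in text.splitlines():
        # Ten columns; the last one keeps extras such as "(Min/Max 15/59)".
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip headers and anything that is not an attribute row
        match = re.match(r"\d+", fields[9])
        if match:
            rows[fields[1]] = int(match.group())
    return rows


def find_warnings(rows, limits=LIMITS):
    """List the attributes whose raw value is above the configured limit."""
    return [
        f"{name}: raw value {rows[name]} is above {limit}"
        for name, limit in limits.items()
        if name in rows and rows[name] > limit
    ]


if __name__ == "__main__":
    for warning in find_warnings(parse_rows(SMART_ROWS)):
        print(warning)
```

Only the leading integer of the raw column is compared, so the "(Min/Max 15/59)" suffix on the temperature row is ignored. Other attributes can be added to `LIMITS` once you decide what counts as too high for your drives.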
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com