Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread Ronny Aasen

On 28. sep. 2017 18:53, hjcho616 wrote:
Yay! Finally, after almost exactly one month, I am able to mount the drive!  Now it is time to see how my data is doing. =P  Doesn't look too bad though.


Got to love open source. =)  I downloaded the ceph source code, built it, tried to run a ceph-objectstore-tool export on that osd.4, and then started debugging it.  Obviously I don't have any idea of what everything does, but I was able to trace to the error message.  The corruption appears to be in the mount region.  When it tries to decode a buffer, most buffers had very periodic access to data (looking at the printfs I put in), but a few of them had huge numbers.  Oh, that "1" that didn't make sense came from where the corruption happened: the struct_v portion of the data changed to an ASCII value of 1, which happily printed 1. =P  Since it was the mount portion, and hoping it doesn't impact the data much, I went ahead and allowed those corrupted values.  I was able to export osd.4 with the journal!


congratulations and well done :)

just imagine trying to do this on $vendor's proprietary black box...

Ronny Aasen



Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread hjcho616
Yay! Finally, after almost exactly one month, I am able to mount the drive!  Now it is time to see how my data is doing. =P  Doesn't look too bad though.

Got to love open source. =)  I downloaded the ceph source code, built it, tried to run a ceph-objectstore-tool export on that osd.4, and then started debugging it.  Obviously I don't have any idea of what everything does, but I was able to trace to the error message.  The corruption appears to be in the mount region.  When it tries to decode a buffer, most buffers had very periodic access to data (looking at the printfs I put in), but a few of them had huge numbers.  Oh, that "1" that didn't make sense came from where the corruption happened: the struct_v portion of the data changed to an ASCII value of 1, which happily printed 1. =P  Since it was the mount portion, and hoping it doesn't impact the data much, I went ahead and allowed those corrupted values.  I was able to export osd.4 with the journal!
Then I imported that PG..  But the OSDs wouldn't take it.. as the cluster decided to create an empty PG 1.28 and mark it active.  So.. just as the "Incomplete PGs Oh My!" page suggested, I pulled those OSDs down, removed those empty heads, and started them back up.  At that point, no more incomplete data!
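(For anyone following along, a rough sketch of that empty-head removal with ceph-objectstore-tool, assuming Jewel-era FileStore OSDs under the default paths; the OSD id is a placeholder, and newer releases may also want --force on the remove:)

systemctl stop ceph-osd@11
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11 --journal-path /var/lib/ceph/osd/ceph-11/journal --pgid 1.28 --op remove
systemctl start ceph-osd@11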
Working on that inconsistent data... it looks like this is somewhat new in the 10.2 releases.  I was able to get it working with rados get and put and a deep-scrub:
https://www.spinics.net/lists/ceph-users/msg39063.html
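(Roughly what that looks like for the object from the PG 2.7 errors further down; a sketch of the approach rather than exact history, and the temp file path is arbitrary:)

rados -p rbd get rb.0.145d.2ae8944a.00bb /tmp/obj    # read the object out through the primary
rados -p rbd put rb.0.145d.2ae8944a.00bb /tmp/obj    # write it back so the stored digest gets refreshed
ceph pg deep-scrub 2.7                               # re-scrub; the inconsistency should clear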

At this point, everything was active+clean.  But the MDS wasn't happy.  It seems to suggest the journal is broken:
HEALTH_ERR mds rank 0 is damaged; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Found this and did everything down to "cephfs-table-tool all reset session":
http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/
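(From memory, the sequence on that page runs roughly as below; treat it as a sketch and read the doc before copying anything, since journal resets are destructive and the backup comes first:)

cephfs-journal-tool journal export backup.bin        # keep a copy of the damaged MDS journal
cephfs-journal-tool event recover_dentries summary   # salvage whatever metadata events it can
cephfs-journal-tool journal reset                    # truncate the damaged journal
cephfs-table-tool all reset session                  # drop the stale client session table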

Restarted the MDS.
HEALTH_WARN no legacy OSD present but 'sortbitwise' flag is not set
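(That remaining warning is just the flag itself; as far as I know it can be cleared with the command below once no pre-Jewel OSDs remain, but double-check your OSD versions before setting it:)

ceph osd set sortbitwise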
Mounted!  Thank you everyone for the help!  Learned a lot!
Regards,
Hong
 

On Friday, September 22, 2017 1:01 AM, hjcho616  wrote:
 

 Ronny,
Could you help me with this log?  I got this with debug osd=20 filestore=20 ms=20.  This one is running "ceph pg repair 2.7".  This is one of the smaller PGs, thus the log was smaller.  Others have similar errors.  I can see the lines with ERR, but other than that is there something I should be paying attention to?
https://drive.google.com/file/d/0By7YztAJNGUWNkpCV090dHBmOWc/view?usp=sharing
Error messages look like this:
2017-09-21 23:53:31.545510 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545520 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545531 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
I did try to move that object to a different location as suggested on this page:
http://ceph.com/geen-categorie/ceph-manually-repair-object/

This is what I ran:
systemctl stop ceph-osd@7
ceph-osd -i 7 --flush-journal
cd /var/lib/ceph/osd/ceph-7
cd current/2.7_head/
mv rb.0.145d.2ae8944a.00bb__head_6F5DBE87__2 ~/
ceph osd tree
systemctl start ceph-osd@7
ceph pg repair 2.7
Then I just get this:
2017-09-22 00:41:06.495399 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-22 00:41:06.495417 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7 missing 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head
2017-09-22 00:41:06.495424 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
Moving it from osd.2 results in a similar error message; it just says "missing" on the top one instead. =P

I was hoping this time would give me a different result, as I let one more OSD copy the object from OSD1 by taking osd.7 down with noout set.  But it doesn't appear to care about that extra copy.  Maybe that only works when size is 3?  Basically, since I had most OSDs alive on OSD1, I was trying to favor the data from OSD1. =P
What can I do in this case? According to 

Re: [ceph-users] Power outages!!! help!

2017-09-22 Thread hjcho616
Ronny,
Could you help me with this log?  I got this with debug osd=20 filestore=20 ms=20.  This one is running "ceph pg repair 2.7".  This is one of the smaller PGs, thus the log was smaller.  Others have similar errors.  I can see the lines with ERR, but other than that is there something I should be paying attention to?
https://drive.google.com/file/d/0By7YztAJNGUWNkpCV090dHBmOWc/view?usp=sharing
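(For reference, those debug levels can be bumped at runtime with injectargs, or set under [osd] in ceph.conf; a sketch, with the osd id chosen to match the shard in the log below:)

ceph tell osd.2 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 20'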
Error messages look like this:
2017-09-21 23:53:31.545510 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545520 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545531 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
I did try to move that object to a different location as suggested on this page:
http://ceph.com/geen-categorie/ceph-manually-repair-object/

This is what I ran:
systemctl stop ceph-osd@7
ceph-osd -i 7 --flush-journal
cd /var/lib/ceph/osd/ceph-7
cd current/2.7_head/
mv rb.0.145d.2ae8944a.00bb__head_6F5DBE87__2 ~/
ceph osd tree
systemctl start ceph-osd@7
ceph pg repair 2.7
Then I just get this:
2017-09-22 00:41:06.495399 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-22 00:41:06.495417 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7 missing 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head
2017-09-22 00:41:06.495424 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
Moving it from osd.2 results in a similar error message; it just says "missing" on the top one instead. =P

I was hoping this time would give me a different result, as I let one more OSD copy the object from OSD1 by taking osd.7 down with noout set.  But it doesn't appear to care about that extra copy.  Maybe that only works when size is 3?  Basically, since I had most OSDs alive on OSD1, I was trying to favor the data from OSD1. =P
What can I do in this case?  According to http://ceph.com/geen-categorie/incomplete-pgs-oh-my/ inconsistent data can be expected with --skip-journal-replay, and I had to use it as the export crashed without it. =P  But it doesn't say much about what to do in that case:

"If all went well, then your cluster is now back to 100% active+clean / HEALTH_OK state. Note that you may still have inconsistent or stale data stored inside the PG. This is because the state of the data on the OSD that failed is a bit unknown, especially if you had to use the '--skip-journal-replay' option on the export. For RBD data, the client which utilizes the RBD should run a filesystem check against the RBD."

Regards,
Hong

On Thursday, September 21, 2017 1:46 AM, Ronny Aasen 
 wrote:
 

 On 21. sep. 2017 00:35, hjcho616 wrote:
> # rados list-inconsistent-pg data
> ["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
> # rados list-inconsistent-pg metadata
> ["1.d","1.3d"]
> # rados list-inconsistent-pg rbd
> ["2.7"]
> # rados list-inconsistent-obj 0.0 --format=json-pretty
> {
>      "epoch": 23112,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.5 --format=json-pretty
> {
>      "epoch": 23078,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.a --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.e --format=json-pretty
> {
>      "epoch": 23068,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.1c --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.29 --format=json-pretty
> {
>      "epoch": 22974,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.2c --format=json-pretty
> {
>      "epoch": 23194,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.d --format=json-pretty
> {
>      "epoch": 23072,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.3d --format=json-pretty
> {
>      "epoch": 23221,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 2.7 --format=json-pretty
> {
>      

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
# rados list-inconsistent-pg data
["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
# rados list-inconsistent-pg metadata
["1.d","1.3d"]
# rados list-inconsistent-pg rbd
["2.7"]
# rados list-inconsistent-obj 0.0 --format=json-pretty
{    "epoch": 23112,    "inconsistents": [] }
# rados list-inconsistent-obj 0.5 --format=json-pretty
{    "epoch": 23078,    "inconsistents": [] }
# rados list-inconsistent-obj 0.a --format=json-pretty
{    "epoch": 22954,    "inconsistents": [] }
# rados list-inconsistent-obj 0.e --format=json-pretty
{    "epoch": 23068,    "inconsistents": [] }
# rados list-inconsistent-obj 0.1c --format=json-pretty
{    "epoch": 22954,    "inconsistents": [] }
# rados list-inconsistent-obj 0.29 --format=json-pretty
{    "epoch": 22974,    "inconsistents": [] }
# rados list-inconsistent-obj 0.2c --format=json-pretty
{    "epoch": 23194,    "inconsistents": [] }
# rados list-inconsistent-obj 1.d --format=json-pretty
{    "epoch": 23072,    "inconsistents": [] }
# rados list-inconsistent-obj 1.3d --format=json-pretty
{    "epoch": 23221,    "inconsistents": [] }
# rados list-inconsistent-obj 2.7 --format=json-pretty
{    "epoch": 23032,    "inconsistents": [] }
Looks like not much information is there.  Could you elaborate on the items you mentioned under "find the object"?  How do I check the metadata?  What are we looking for with md5sum?
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.

I tried those "Ceph: manually repair object" methods on PG 2.7 before..  I tried the 3-replica case, which would result in a missing shard regardless of which copy I moved, and the 2-replica case... hmm, I guess I don't know how long "wait a bit" is; I just turned it back on after a minute or so and it just returns to the same inconsistent message.. =P  Are we expecting the entire stopped OSD to map to a different OSD and get a third replica before starting the stopped OSD again?
Regards,
Hong

 

On Wednesday, September 20, 2017 4:47 PM, hjcho616  
wrote:
 

Thanks Ronny.  I'll try that inconsistent issue soon.
I think the OSD drive that PG 1.28 is sitting on is still ok... just file corruption happened when the power outage happened.. =P  As you suggested:

cd /var/lib/ceph/osd/ceph-4/current/
tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-10/tmposd
mkdir current
chown ceph.ceph current/
cd current/
tar --xattrs --preserve-permissions -zxvf /var/lib/ceph/osd/ceph-4/current/osd.4.tar.gz
systemctl start ceph-osd@8

I created a temp OSD like I did during the import, then set the crush reweight to 0.  I noticed the current directory was missing. =P  So I created a current directory and copied the content there.
Starting the OSD doesn't appear to show any activity.  Is there any other file I need to copy over other than the 1.28_head and 1.28_tail directories?
Regards,
Hong

On Wednesday, September 20, 2017 4:04 PM, Ronny Aasen 
 wrote:
 

i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.


scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:

- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.


if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 good luck 
 Ronny Aasen
 
 On 20.09.2017 22:17, hjcho616 wrote:
  
  Thanks Ronny. 
  I decided to try to tar everything under current directory.  Is this correct 
command for it?  Is there any directory we do not want in the new drive?  
commit_op_seq, meta, nosnap, omap?  
  tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz . 
  As far as inconsistent PGs... I am running in to these errors.  I tried 
moving one copy of pg to other location, but it just says moved shard is 
missing.  Tried setting 'noout ' and turn one of them down, seems to work on 
something but then back to same error.  Currently trying to move to different 
osd... 

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Thanks Ronny.  I'll try that inconsistent issue soon.
I think the OSD drive that PG 1.28 is sitting on is still ok... just file corruption happened when the power outage happened.. =P  As you suggested:

cd /var/lib/ceph/osd/ceph-4/current/
tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-10/tmposd
mkdir current
chown ceph.ceph current/
cd current/
tar --xattrs --preserve-permissions -zxvf /var/lib/ceph/osd/ceph-4/current/osd.4.tar.gz
systemctl start ceph-osd@8

I created a temp OSD like I did during the import, then set the crush reweight to 0.  I noticed the current directory was missing. =P  So I created a current directory and copied the content there.
Starting the OSD doesn't appear to show any activity.  Is there any other file I need to copy over other than the 1.28_head and 1.28_tail directories?
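(For following along, the temp OSD's log can be watched with its debug levels turned up; a rough sketch, assuming it really did come up as osd.8 as in the commands above:)

ceph tell osd.8 injectargs '--debug-osd 20 --debug-filestore 20'   # bump debug at runtime
tail -f /var/log/ceph/ceph-osd.8.log                               # look for it scanning and peering 1.28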
Regards,
Hong

On Wednesday, September 20, 2017 4:04 PM, Ronny Aasen 
 wrote:
 

i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.


scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:

- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.


if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 good luck 
 Ronny Aasen
 
 On 20.09.2017 22:17, hjcho616 wrote:
  
  Thanks Ronny. 
  I decided to try to tar everything under current directory.  Is this correct 
command for it?  Is there any directory we do not want in the new drive?  
commit_op_seq, meta, nosnap, omap?  
  tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz . 
  As far as inconsistent PGs... I am running in to these errors.  I tried 
moving one copy of pg to other location, but it just says moved shard is 
missing.  Tried setting 'noout ' and turn one of them down, seems to work on 
something but then back to same error.  Currently trying to move to different 
osd... making sure the drive is not faulty, got few of them.. but still 
persisting..  I've been kicking off ceph pg repair PG#, hoping it would fix 
them. =P  Any other suggestion? 
2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log [INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 repair 4 errors, 0 fixed

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread Ronny Aasen
i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.



scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:


- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad. (a command-level sketch follows below)
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.
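(A rough illustration of the "find the object" step for the PG 2.7 object discussed elsewhere in this thread, assuming FileStore OSDs under the default /var/lib/ceph paths and the attr tool installed; the object name pattern is only an example:)

find /var/lib/ceph/osd/ceph-*/current/2.7_head -name 'rb.0.145d*' -exec md5sum {} \;   # compare data between the copies
find /var/lib/ceph/osd/ceph-*/current/2.7_head -name 'rb.0.145d*' -exec attr -l {} \;  # compare xattrs (object metadata) as well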



if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/


good luck
Ronny Aasen

On 20.09.2017 22:17, hjcho616 wrote:

Thanks Ronny.

I decided to try to tar everything under current directory.  Is this 
correct command for it?  Is there any directory we do not want in the 
new drive?  commit_op_seq, meta, nosnap, omap?


tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz .

As far as inconsistent PGs... I am running in to these errors.  I 
tried moving one copy of pg to other location, but it just says moved 
shard is missing.  Tried setting 'noout ' and turn one of them down, 
seems to work on something but then back to same error.  Currently 
trying to move to different osd... making sure the drive is not 
faulty, got few of them.. but still persisting..  I've been kicking 
off ceph pg repair PG#, hoping it would fix them. =P  Any other 
suggestion?


2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log 
[INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head 
data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 
0:97126ead:::200014ce4c3.028f:head(19366'539375 
client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 
539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head 
data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 
0:97126ead:::200014ce4c3.028f:head(19366'539375 
client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 
539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to 
pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head 
data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 
0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 
dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  
alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head 
data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 
0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 
dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  
alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to 
pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 repair 4 errors, 0 fixed


Latest health...
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
down; 1 pgs incomplete; 9 pgs inconsistent; 1 pgs repair; 1 pgs stuck 
inactive; 1 pgs stuck unclean; 68 scrub errors; mds rank 0 has failed; 
mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag 
is not set


Regards,
Hong




On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen 
 wrote:



On 20.09.2017 16:49, hjcho616 wrote:

Anyone?  Can this PG be saved?  If not, what are my options?

Regards,
Hong


On Saturday, September 16, 2017 1:55 AM, hjcho616 
  wrote:



Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 
1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 
30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) 
set; 

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Thanks Ronny.
I decided to try to tar everything under the current directory.  Is this the correct command for it?  Is there any directory we do not want on the new drive?  commit_op_seq, meta, nosnap, omap?

tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz .

As far as the inconsistent PGs... I am running into these errors.  I tried moving one copy of a PG to another location, but it just says the moved shard is missing.  Tried setting 'noout' and turning one of them down; it seems to work on something but then goes back to the same error.  Currently trying to move to a different OSD... making sure the drive is not faulty (I've got a few of them).. but it is still persisting..  I've been kicking off "ceph pg repair PG#", hoping it would fix them. =P  Any other suggestions?
2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log [INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 repair 4 errors, 0 fixed
Latest health...
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down; 1 pgs incomplete; 9 pgs inconsistent; 1 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 68 scrub errors; mds rank 0 has failed; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
Regards,
Hong

 

On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen 
 wrote:
 

  On 20.09.2017 16:49, hjcho616 wrote:
  
Anyone?  Can this PG be saved?  If not, what are my options?
  Regards, Hong 
 
  On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:
  
 
Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
  
Now PG1.28.. looking at all old osds dead or alive.  Only one with a DIR_* directory is in osd.4.  This appears to be the metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  Exporting through ceph-objectstore-tool appears to crash, even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in an earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
  
 
 [SNIP]
 
 What can I do to save that PG1.28?  Please let me know if you need 
more information.  So close!... =)  
  Regards, Hong 
   
12 inconsistent PGs and 109 scrub errors are something you should fix first of all.
also you can consider using the paid services of the many ceph support companies that specialize in these kinds of situations.
--
that being said, here are some suggestions...
when it comes to lost object recovery you have come about as far as i have ever experienced, so everything after here is just

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread Ronny Aasen

On 20.09.2017 16:49, hjcho616 wrote:

Anyone?  Can this PG be saved?  If not, what are my options?

Regards,
Hong


On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:



Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 
pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 
30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) 
set; no legacy OSD present but 'sortbitwise' flag is not set


Now PG1.28.. looking at all old osds dead or alive.  Only one with 
DIR_* directory is in osd.4. This appears to be metadata pool!  21M of 
metadata can be quite a bit of stuff.. so I would like to rescue this! 
 But I am not able to start this OSD.  exporting through 
ceph-objectstore-tool appears to crash.  Even with 
--skip-journal-replay and --skip-mount-omap (different failure).  As I 
mentioned in earlier email, that exception thrown message is bogus...
# ceph-objectstore-tool --op export --pgid 1.28  --data-path 
/var/lib/ceph/osd/ceph-4 --journal-path 
/var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export

terminate called after throwing an instance of 'std::domain_error'



[SNIP]
What can I do to save that PG1.28?  Please let me know if you need 
more information.  So close!... =)


Regards,
Hong

12 inconsistent PGs and 109 scrub errors are something you should fix first of all.


also you can consider using the paid services of the many ceph support companies that specialize in these kinds of situations.


--

that being said, here are some suggestions...

when it comes to lost object recovery you have come about as far as i have ever experienced, so everything after here is just assumptions and wild guesswork as to what you can try.  I hope others shout out if i tell you wildly wrong things.


if you have found data for pg1.28 on the broken osd and have checked all other working and non-working drives for that pg, then you need to try and extract the pg from the broken drive. As always in recovery cases, take a dd clone of the drive and work from the cloned image, to avoid more damage to the drive and to allow you to try multiple times.


you should add a temporary injection drive large enough for that pg, and 
set its crush weight to 0 so it always drains. make sure it is up and 
registered properly in ceph.
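(A rough sketch of what adding such an injection OSD could look like with ceph-disk, the tool used elsewhere in this thread; /dev/sdX and osd.12 are placeholders for whatever device and id you end up with:)

ceph-disk prepare /dev/sdX              # create the temporary injection osd
ceph-disk activate /dev/sdX1            # bring it up and into the crush map
ceph osd crush reweight osd.12 0        # weight 0, so data only ever drains off it
ceph osd tree                           # confirm it is up and weighted 0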


the idea is to copy the pg manually from the broken osd to the injection drive, since the export/import fails.. making sure you get all xattrs included.  one can either copy the whole pg, or just the "missing" objects.  if there are few objects i would go for that; if there are many i would take the whole pg. you won't get data from leveldb, so i am not at all sure this would work, but it is worth a shot.


- stop your injection osd, verify it is down and the process not running.
- from the mountpoint of your broken osd go into the current directory and tar up pg1.28; make sure you use -p and --xattrs when you create the archive.
- if tar errors out on unreadable files, just rm those (since you are working on a copy of your rescue image, you can always try again)
- copy the tar file to the injection drive and extract it while sitting in the current directory (remember --xattrs)

- set debug options on the injection drive in ceph.conf
- start the injection drive, and follow along in the log file. hopefully it should scan, locate the pg, and replicate the pg1.28 objects off to the current primary drive for pg1.28. and since it has crush weight 0 it should drain out.
- if that works, verify the injection drive is drained, stop it and remove it from ceph, then zap the drive. (a rough command sketch of these steps follows below)
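(A minimal command-level sketch of those steps, assuming the dd-cloned image of the broken osd.4 is mounted at /mnt/rescue and the injection OSD came up as osd.12; both are made-up names, and the tar flags mirror the ones used earlier in this thread:)

systemctl stop ceph-osd@12                   # injection osd must be down first
cd /mnt/rescue/current
tar --xattrs --preserve-permissions -zcvf /tmp/pg1.28.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-12/current
tar --xattrs --preserve-permissions -zxvf /tmp/pg1.28.tar.gz
chown -R ceph:ceph .
systemctl start ceph-osd@12                  # follow its log; with crush weight 0 it should push the objects to the primary and drain out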



this is all, as i said, guesstimates, so your mileage may vary.
good luck

Ronny Aasen









Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Anyone?  Can this PG be saved?  If not, what are my options?
Regards,
Hong

On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:
 

Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set

Now PG1.28.. looking at all old osds dead or alive.  Only one with DIR_* directory is in osd.4.  This appears to be metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  exporting through ceph-objectstore-tool appears to crash.  Even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7f812e7fb940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55dee175fa57]
 2: (()+0x110c0) [0x7f812d0050c0]
 3: (gsignal()+0xcf) [0x7f812b438fcf]
 4: (abort()+0x16a) [0x7f812b43a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f812bd1fb3d]
 6: (()+0x5ebb6) [0x7f812bd1dbb6]
 7: (()+0x5ec01) [0x7f812bd1dc01]
 8: (()+0x5ee19) [0x7f812bd1de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55dee143001e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55dee156d5f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55dee1562bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55dee1561eb8]
 13: (FileStore::mount()+0x2525) [0x55dee1498eb5]
 14: (main()+0x28c0) [0x55dee10c9400]
 15: (__libc_start_main()+0xf1) [0x7f812b4262b1]
 16: (()+0x34f747) [0x55dee1118747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-journal-replay
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fa6d087b940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55abd356aa57]
 2: (()+0x110c0) [0x7fa6cf0850c0]
 3: (gsignal()+0xcf) [0x7fa6cd4b8fcf]
 4: (abort()+0x16a) [0x7fa6cd4ba3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fa6cdd9fb3d]
 6: (()+0x5ebb6) [0x7fa6cdd9dbb6]
 7: (()+0x5ec01) [0x7fa6cdd9dc01]
 8: (()+0x5ee19) [0x7fa6cdd9de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55abd323b01e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55abd33785f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55abd336dbb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55abd336ceb8]
 13: (FileStore::mount()+0x2525) [0x55abd32a3eb5]
 14: (main()+0x28c0) [0x55abd2ed4400]
 15: (__libc_start_main()+0xf1) [0x7fa6cd4a62b1]
 16: (()+0x34f747) [0x55abd2f23747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-mount-omap
ceph-objectstore-tool: /usr/include/boost/smart_ptr/scoped_ptr.hpp:99: T* boost::scoped_ptr::operator->() const [with T = ObjectMap]: Assertion `px != 0' failed.
*** Caught signal (Aborted) **
 in thread 7f14345c5940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x5575b50a9a57]
 2: (()+0x110c0) [0x7f1432dcf0c0]
 3: (gsignal()+0xcf) [0x7f1431202fcf]
 4: (abort()+0x16a) [0x7f14312043fa]
 5: (()+0x2be37) [0x7f14311fbe37]
 6: (()+0x2bee2) [0x7f14311fbee2]
 7: (()+0x2fa19c) [0x5575b4a0d19c]
 8: (FileStore::omap_get_values(coll_t const&, ghobject_t const&, std::set const&, std::map >*)+0x6c2) [0x5575b4dc9322]
 9: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x235) [0x5575b4ab3135]
 10: (main()+0x5bd6) [0x5575b4a16716]
 11: (__libc_start_main()+0xf1) [0x7f14311f02b1]
 12: (()+0x34f747) [0x5575b4a62747]

When trying to bring up osd.4 we get this message.  Feels very similar to that crash in first two above.
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x960e57) [0x5565e564ae57]
 2: (()+0x110c0) [0x7f34aa17e0c0]
 3: (gsignal()+0xcf) [0x7f34a81c4fcf]
 4: (abort()+0x16a) [0x7f34a81c63fa]
 5:

Re: [ceph-users] Power outages!!! help!

2017-09-16 Thread hjcho616
Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set

Now PG1.28.. looking at all old osds dead or alive.  Only one with DIR_* directory is in osd.4.  This appears to be metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  exporting through ceph-objectstore-tool appears to crash.  Even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7f812e7fb940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55dee175fa57]
 2: (()+0x110c0) [0x7f812d0050c0]
 3: (gsignal()+0xcf) [0x7f812b438fcf]
 4: (abort()+0x16a) [0x7f812b43a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f812bd1fb3d]
 6: (()+0x5ebb6) [0x7f812bd1dbb6]
 7: (()+0x5ec01) [0x7f812bd1dc01]
 8: (()+0x5ee19) [0x7f812bd1de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55dee143001e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55dee156d5f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55dee1562bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55dee1561eb8]
 13: (FileStore::mount()+0x2525) [0x55dee1498eb5]
 14: (main()+0x28c0) [0x55dee10c9400]
 15: (__libc_start_main()+0xf1) [0x7f812b4262b1]
 16: (()+0x34f747) [0x55dee1118747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-journal-replay
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fa6d087b940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55abd356aa57]
 2: (()+0x110c0) [0x7fa6cf0850c0]
 3: (gsignal()+0xcf) [0x7fa6cd4b8fcf]
 4: (abort()+0x16a) [0x7fa6cd4ba3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fa6cdd9fb3d]
 6: (()+0x5ebb6) [0x7fa6cdd9dbb6]
 7: (()+0x5ec01) [0x7fa6cdd9dc01]
 8: (()+0x5ee19) [0x7fa6cdd9de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55abd323b01e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55abd33785f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55abd336dbb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55abd336ceb8]
 13: (FileStore::mount()+0x2525) [0x55abd32a3eb5]
 14: (main()+0x28c0) [0x55abd2ed4400]
 15: (__libc_start_main()+0xf1) [0x7fa6cd4a62b1]
 16: (()+0x34f747) [0x55abd2f23747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-mount-omap
ceph-objectstore-tool: /usr/include/boost/smart_ptr/scoped_ptr.hpp:99: T* boost::scoped_ptr::operator->() const [with T = ObjectMap]: Assertion `px != 0' failed.
*** Caught signal (Aborted) **
 in thread 7f14345c5940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x5575b50a9a57]
 2: (()+0x110c0) [0x7f1432dcf0c0]
 3: (gsignal()+0xcf) [0x7f1431202fcf]
 4: (abort()+0x16a) [0x7f14312043fa]
 5: (()+0x2be37) [0x7f14311fbe37]
 6: (()+0x2bee2) [0x7f14311fbee2]
 7: (()+0x2fa19c) [0x5575b4a0d19c]
 8: (FileStore::omap_get_values(coll_t const&, ghobject_t const&, std::set const&, std::map >*)+0x6c2) [0x5575b4dc9322]
 9: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x235) [0x5575b4ab3135]
 10: (main()+0x5bd6) [0x5575b4a16716]
 11: (__libc_start_main()+0xf1) [0x7f14311f02b1]
 12: (()+0x34f747) [0x5575b4a62747]

When trying to bring up osd.4 we get this message.  Feels very similar to that crash in first two above.
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x960e57) [0x5565e564ae57]
 2: (()+0x110c0) [0x7f34aa17e0c0]
 3: (gsignal()+0xcf) [0x7f34a81c4fcf]
 4: (abort()+0x16a) [0x7f34a81c63fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f34a8aabb3d]
 6: (()+0x5ebb6) [0x7f34a8aa9bb6]
 7: (()+0x5ec01) [0x7f34a8aa9c01]
 8: (()+0x5ee19) [0x7f34a8aa9e19]
 9:

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread hjcho616
After running "ceph osd lost osd.0", it started backfilling... I figured that was supposed to happen earlier when I added those missing PGs.  Running into "too few PGs per OSD", I removed OSDs after the cluster stopped working after adding OSDs.  But I guess I still needed them.  Currently I see several incomplete PGs and am trying to import those PGs back. =P
As far as 1.28 goes, it didn't look like it was limited by osd.0; the logs didn't show any signs of osd.0, and the data is only available on osd.4, which wouldn't export... So I still need to deal with that one.  It is still showing up as incomplete.. =P  Any recommendations on how to get that back?

pg 1.28 is stuck inactive since forever, current state down+incomplete, last acting [11,6]
pg 1.28 is stuck unclean since forever, current state down+incomplete, last acting [11,6]
pg 1.28 is down+incomplete, acting [11,6] (reducing pool metadata min_size from 2 may help; search ceph.com/docs for 'incomplete')
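(That hint refers to something like the commands below; dropping min_size lets the PG go active with a single surviving copy, and it should go back to 2 once recovery finishes. A sketch, not a blanket recommendation:)

ceph osd pool set metadata min_size 1    # allow the metadata pool's PGs to peer with one copy
ceph -s                                  # watch whether 1.28 goes active and recovers
ceph osd pool set metadata min_size 2    # restore once healthy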
Regards,
Hong


On Friday, September 15, 2017 4:51 AM, Ronny Aasen 
 wrote:
 

 
you write you had all PGs exported except one, so i assume you have injected those PGs into the cluster again using the method linked a few times in this thread. How did that go? Were you successful in recovering those PGs?

kind regards.
Ronny Aasen



On 15. sep. 2017 07:52, hjcho616 wrote:
> I just did this and backfilling started.  Let's see where this takes me.
> ceph osd lost 0 --yes-i-really-mean-it
> 
> Regards,
> Hong
> 
> 
> On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:
> 
> 
> Ronny,
> 
> Working with all of the pgs shown in the "ceph health detail", I ran 
> below for each PG to export.
> ceph-objectstore-tool --op export --pgid 0.1c  --data-path 
> /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal 
> --skip-journal-replay --file 0.1c.export
> 
> I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This 
> error doesn't make much sense to me.  Looking at the source code from 
> https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that 
> message is telling me struct_v is 1... but not sure how it ended up in 
> the default in the case statement when 1 case is defined...  I tried 
> with --skip-journal-replay, fails with same error message.
> ceph-objectstore-tool --op export --pgid 1.28  --data-path 
> /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal 
> --file 1.28.export
> terminate called after throwing an instance of 'std::domain_error'
>    what():  coll_t::decode(): don't know how to decode version 1
> *** Caught signal (Aborted) **
>  in thread 7fabc5ecc940 thread_name:ceph-objectstor
>  ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
>  1: (()+0x996a57) [0x55b2d3323a57]
>  2: (()+0x110c0) [0x7fabc46d50c0]
>  3: (gsignal()+0xcf) [0x7fabc2b08fcf]
>  4: (abort()+0x16a) [0x7fabc2b0a3fa]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
>  6: (()+0x5ebb6) [0x7fabc33edbb6]
>  7: (()+0x5ec01) [0x7fabc33edc01]
>  8: (()+0x5ee19) [0x7fabc33ede19]
>  9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
>  10: 
> (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) 
> [0x55b2d31315f5]
>  11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
>  12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
>  13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
>  14: (main()+0x28c0) [0x55b2d2c8d400]
>  15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
>  16: (()+0x34f747) [0x55b2d2cdc747]
> Aborted
> 
> Then wrote a simple script to run import process... just created an OSD 
> per PG.  Basically ran below for each PG.
> mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph osd crush reweight osd.$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
> systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
> ceph-objectstore-tool --op import --pgid 0.1c  --data-path 
> /var/lib/ceph/osd/ceph-$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path 
> /var/lib/ceph/osd/ceph-$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file 
> ./export/0.1c.export
> chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
> 
> Sometimes import didn't work.. but stopping OSD and rerunning 
> ceph-objectstore-tool again seems to help or when some PG didn't really 
> want to import .
> 
> Unfound messages are gone!  But I still have down+peering, or 
> down+remapped+peering.
> # ceph health detail
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
> down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs 
> stuck unclean; 1 requests are blocked > 32 sec; 1 

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread Ronny Aasen


you write you had all PGs exported except one, so i assume you have injected those PGs into the cluster again using the method linked a few times in this thread. How did that go? Were you successful in recovering those PGs?


kind regards.
Ronny Aasen



On 15. sep. 2017 07:52, hjcho616 wrote:

I just did this and backfilling started.  Let's see where this takes me.
ceph osd lost 0 --yes-i-really-mean-it

Regards,
Hong


On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:


Ronny,

Working with all of the pgs shown in the "ceph health detail", I ran 
below for each PG to export.
ceph-objectstore-tool --op export --pgid 0.1c   --data-path 
/var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal 
--skip-journal-replay --file 0.1c.export


I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This 
error doesn't make much sense to me.  Looking at the source code from 
https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that 
message is telling me struct_v is 1... but not sure how it ended up in 
the default in the case statement when 1 case is defined...  I tried 
with --skip-journal-replay, fails with same error message.
ceph-objectstore-tool --op export --pgid 1.28  --data-path 
/var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal 
--file 1.28.export

terminate called after throwing an instance of 'std::domain_error'
   what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
  in thread 7fabc5ecc940 thread_name:ceph-objectstor
  ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
  1: (()+0x996a57) [0x55b2d3323a57]
  2: (()+0x110c0) [0x7fabc46d50c0]
  3: (gsignal()+0xcf) [0x7fabc2b08fcf]
  4: (abort()+0x16a) [0x7fabc2b0a3fa]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
  6: (()+0x5ebb6) [0x7fabc33edbb6]
  7: (()+0x5ec01) [0x7fabc33edc01]
  8: (()+0x5ee19) [0x7fabc33ede19]
  9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
  10: 
(DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) 
[0x55b2d31315f5]

  11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
  12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
  13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
  14: (main()+0x28c0) [0x55b2d2c8d400]
  15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
  16: (()+0x34f747) [0x55b2d2cdc747]
Aborted

Then wrote a simple script to run import process... just created an OSD 
per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0

systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c   --data-path 
/var/lib/ceph/osd/ceph-$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path 
/var/lib/ceph/osd/ceph-$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file 
./export/0.1c.export

chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)

Sometimes import didn't work.. but stopping OSD and rerunning 
ceph-objectstore-tool again seems to help or when some PG didn't really 
want to import .


Unfound messages are gone!   But I still have down+peering, or 
down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs 
stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow 
requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no 
legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 0.a is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state 
down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.3d is stuck inactive since forever, current state 
down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.0 is stuck inactive since forever, current state 
down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 1.28 is stuck inactive since 

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
I just did this and backfilling started.  Let's see where this takes me.
ceph osd lost 0 --yes-i-really-mean-it
Regards,
Hong

On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:
 

 Ronny,
Working with all of the pgs shown in "ceph health detail", I ran the below for each PG to export:
ceph-objectstore-tool --op export --pgid 0.1c   --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export

I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This error doesn't make much sense to me.  Looking at the source code from https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that message is telling me struct_v is 1... but not sure how it ended up in the default in the case statement when 1 case is defined...  I tried with --skip-journal-replay, fails with same error message.

ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file 1.28.export
terminate called after throwing an instance of 'std::domain_error'
  what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fabc5ecc940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55b2d3323a57]
 2: (()+0x110c0) [0x7fabc46d50c0]
 3: (gsignal()+0xcf) [0x7fabc2b08fcf]
 4: (abort()+0x16a) [0x7fabc2b0a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
 6: (()+0x5ebb6) [0x7fabc33edbb6]
 7: (()+0x5ec01) [0x7fabc33edc01]
 8: (()+0x5ee19) [0x7fabc33ede19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55b2d31315f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
 13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
 14: (main()+0x28c0) [0x55b2d2c8d400]
 15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
 16: (()+0x34f747) [0x55b2d2cdc747]
Aborted
Then wrote a simple script to run the import process... just created an OSD per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file ./export/0.1c.export
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
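For illustration only, those per-PG steps could be wrapped in a small loop; this is a sketch that assumes the same tmposd_<pgid> directory naming under /var/lib/ceph/osd/ceph-5 and that the export files live in ./export/:

for pg in 0.1c 0.a 2.8; do                     # the pg ids that were exported; adjust to your list
  dir=/var/lib/ceph/osd/ceph-5/tmposd_$pg
  mkdir -p "$dir"
  ceph-disk prepare "$dir"
  chown -R ceph:ceph "$dir"
  ceph-disk activate "$dir"
  id=$(cat "$dir/whoami")
  ceph osd crush reweight "osd.$id" 0          # weight 0 so the temporary OSD never receives new data
  systemctl stop "ceph-osd@$id"
  ceph-objectstore-tool --op import --pgid "$pg" \
    --data-path "/var/lib/ceph/osd/ceph-$id" \
    --journal-path "/var/lib/ceph/osd/ceph-$id/journal" \
    --file "./export/$pg.export"
  chown -R ceph:ceph "/var/lib/ceph/osd/ceph-$id"
  systemctl start "ceph-osd@$id"
done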
Sometimes the import didn't work, but stopping the OSD and rerunning ceph-objectstore-tool again seemed to help when a PG didn't really want to import.
Unfound messages are gone!   But I still have down+peering, or down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.3d is stuck inactive since forever, current state down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.0 is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 2.3 is stuck inactive since forever, current state down+peering, last acting [11,7]
pg 1.1b is stuck inactive since forever, current state down+remapped+peering, last acting [11,6]
pg 0.d is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.c is 

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
Ronny,
Working with all of the pgs shown in "ceph health detail", I ran below for each PG to export:

ceph-objectstore-tool --op export --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export

I have all PGs exported, except one: PG 1.28.  It is on ceph-4.  This error doesn't make much sense to me.  Looking at the source code from https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that message is telling me struct_v is 1... but I am not sure how it ended up in the default branch of the case statement when a case for 1 is defined.  I tried with --skip-journal-replay; it fails with the same error message.

ceph-objectstore-tool --op export --pgid 1.28 --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file 1.28.export

terminate called after throwing an instance of 'std::domain_error'
  what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fabc5ecc940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55b2d3323a57]
 2: (()+0x110c0) [0x7fabc46d50c0]
 3: (gsignal()+0xcf) [0x7fabc2b08fcf]
 4: (abort()+0x16a) [0x7fabc2b0a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
 6: (()+0x5ebb6) [0x7fabc33edbb6]
 7: (()+0x5ec01) [0x7fabc33edc01]
 8: (()+0x5ee19) [0x7fabc33ede19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55b2d31315f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
 13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
 14: (main()+0x28c0) [0x55b2d2c8d400]
 15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
 16: (()+0x34f747) [0x55b2d2cdc747]
Aborted

Then wrote a simple script to run the import process... just created an OSD per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file ./export/0.1c.export
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)

Sometimes the import didn't work, but stopping the OSD and rerunning ceph-objectstore-tool again seemed to help when a PG didn't really want to import.

Unfound messages are gone!   But I still have down+peering, or down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.3d is stuck inactive since forever, current state down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.0 is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 2.3 is stuck inactive since forever, current state down+peering, last acting [11,7]
pg 1.1b is stuck inactive since forever, current state down+remapped+peering, last acting [11,6]
pg 0.d is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.c is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 0.3b is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 2.39 is stuck inactive since 

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread hjcho616
Ronny,
Just tried hooking osd.0 back up.  osd.0 seems to be better, as I was able to run ceph-objectstore-tool export, so I decided to try hooking it up.  Looks like the journal is not happy.  Is there any way to get this running?  Or do I need to start getting data out using ceph-objectstore-tool?
2017-09-13 18:51:50.051421 7f44dd847800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-13 18:51:50.051435 7f44dd847800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 3899
2017-09-13 18:51:50.052323 7f44dd847800  0 pidfile_write: ignore empty --pid-file
2017-09-13 18:51:50.061586 7f44dd847800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-13 18:51:50.061823 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-13 18:51:50.061826 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-13 18:51:50.061838 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-13 18:51:50.077506 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-13 18:51:50.077549 7f44dd847800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-13 18:51:50.078066 7f44dd847800  1 leveldb: Recovering log #28069
2017-09-13 18:51:50.177610 7f44dd847800  1 leveldb: Delete type=0 #28069
2017-09-13 18:51:50.177708 7f44dd847800  1 leveldb: Delete type=3 #28068
2017-09-13 18:51:57.946233 7f44dd847800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-13 18:51:57.947293 7f44dd847800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-13 18:51:57.949835 7f44dd847800 -1 journal Unable to read past sequence 27057121 but header indicates the journal has committed up through 27057593, journal is corrupt
2017-09-13 18:51:57.951824 7f44dd847800 -1 os/filestore/FileJournal.cc: In function 'bool FileJournal::read_entry(ceph::bufferlist&, uint64_t&, bool*)' thread 7f44dd847800 time 2017-09-13 18:51:57.949837
os/filestore/FileJournal.cc: 2036: FAILED assert(0)
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x82) [0x55c809640d02]
 2: (FileJournal::read_entry(ceph::buffer::list&, unsigned long&, bool*)+0xa84) [0x55c8093c4da4]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x205) [0x55c8092feb95]
 4: (FileStore::mount()+0x2e28) [0x55c8092d0a88]
 5: (OSD::init()+0x27d) [0x55c808f697ed]
 6: (main()+0x2a64) [0x55c808ed05d4]
 7: (__libc_start_main()+0xf5) [0x7f44da6e3b45]
 8: (()+0x341117) [0x55c808f1b117]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
--- begin dump of recent events ---
   -59> 2017-09-13 18:51:50.043283 7f44dd847800  5 asok(0x55c813d76000) register_command perfcounters_dump hook 0x55c813cbe030
   -58> 2017-09-13 18:51:50.043312 7f44dd847800  5 asok(0x55c813d76000) register_command 1 hook 0x55c813cbe030
   -57> 2017-09-13 18:51:50.043317 7f44dd847800  5 asok(0x55c813d76000) register_command perf dump hook 0x55c813cbe030
   -56> 2017-09-13 18:51:50.043322 7f44dd847800  5 asok(0x55c813d76000) register_command perfcounters_schema hook 0x55c813cbe030
   -55> 2017-09-13 18:51:50.043326 7f44dd847800  5 asok(0x55c813d76000) register_command 2 hook 0x55c813cbe030
   -54> 2017-09-13 18:51:50.043330 7f44dd847800  5 asok(0x55c813d76000) register_command perf schema hook 0x55c813cbe030
   -53> 2017-09-13 18:51:50.043334 7f44dd847800  5 asok(0x55c813d76000) register_command perf reset hook 0x55c813cbe030
   -52> 2017-09-13 18:51:50.043339 7f44dd847800  5 asok(0x55c813d76000) register_command config show hook 0x55c813cbe030
   -51> 2017-09-13 18:51:50.043344 7f44dd847800  5 asok(0x55c813d76000) register_command config set hook 0x55c813cbe030
   -50> 2017-09-13 18:51:50.043349 7f44dd847800  5 asok(0x55c813d76000) register_command config get hook 0x55c813cbe030
   -49> 2017-09-13 18:51:50.043355 7f44dd847800  5 asok(0x55c813d76000) register_command config diff hook 0x55c813cbe030
   -48> 2017-09-13 18:51:50.043361 7f44dd847800  5 asok(0x55c813d76000) register_command log flush hook 0x55c813cbe030
   -47> 2017-09-13 18:51:50.043367 7f44dd847800  5 asok(0x55c813d76000) register_command log dump hook 0x55c813cbe030
   -46> 2017-09-13 18:51:50.043373 7f44dd847800  5 asok(0x55c813d76000) register_command log reopen hook 0x55c813cbe030
   -45> 2017-09-13 18:51:50.051421 7f44dd847800  0 set uid:gid to 1001:1001 (ceph:ceph)
   -44> 2017-09-13 18:51:50.051435 7f44dd847800  0 ceph version 10.2.9 

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread Ronny Aasen

On 13. sep. 2017 07:04, hjcho616 wrote:

Ronny,

Did bunch of ceph pg repair pg# and got the scrub errors down to 10... 
well was 9, trying to fix one became 10.. waiting for it to fix (I did 
that noout trick as I only have two copies).  8 of those scrub errors 
looks like it would need data from osd.0.


HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 3 pgs inconsistent; 6 pgs peering; 6 pgs 
recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 
16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 
pgs undersized; 1 requests are blocked > 32 sec; recovery 221990/4503980 
objects degraded (4.929%); recovery 147/2251990 unfound (0.007%); 10 
scrub errors; mds cluster is degraded; no legacy OSD present but 
'sortbitwise' flag is not set


 From what I saw from ceph health detail, running osd.0 would solve 
majority of the problems.  But that was the disk with the smart error 
earlier.  I did move to new drive using ddrescue.  When trying to start 
osd.0, I get this.  Is there anyway I can get around this?




running a rescued disk is not something you should try. this is when you
should try to export using the objectstore tool


this was the drive that failed to export pg's because of missing
superblock ? you could also try the export directly on the failed drive,
just to see if that works. you may have to run the tool as ceph user if
that is the user owning all the files


you could try running the export of one of the pg's on osd.0 again and 
post all commands and output.


good luck

Ronny





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
Ronny,
Did bunch of ceph pg repair pg# and got the scrub errors down to 10... well was 
9, trying to fix one became 10.. waiting for it to fix (I did that noout trick 
as I only have two copies).  8 of those scrub errors looks like it would need 
data from osd.0.
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 3 pgs inconsistent; 6 pgs peering; 6 pgs recovering; 16 
pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 28 
pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are 
blocked > 32 sec; recovery 221990/4503980 objects degraded (4.929%); recovery 
147/2251990 unfound (0.007%); 10 scrub errors; mds cluster is degraded; no 
legacy OSD present but 'sortbitwise' flag is not set
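For reference, the "noout trick" plus repair mentioned above amounts to something like this (a sketch only; <pgid> is a placeholder for each pg reported inconsistent in ceph health detail):

ceph osd set noout        # keep OSDs from being marked out while repairs run
ceph pg repair <pgid>     # repeat for each inconsistent pg
ceph osd unset noout      # clear the flag once scrubs come back clean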
From what I saw from ceph health detail, running osd.0 would solve the majority of the problems.  But that was the disk with the smart error earlier.  I did move to a new drive using ddrescue.  When trying to start osd.0, I get this.  Is there any way I can get around this?

2017-09-12 01:31:55.205898 7fb61521a800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-12 01:31:55.205915 7fb61521a800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 4822
2017-09-12 01:31:55.206955 7fb61521a800  0 pidfile_write: ignore empty --pid-file
2017-09-12 01:31:55.217615 7fb61521a800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-12 01:31:55.217854 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-12 01:31:55.217858 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-12 01:31:55.217871 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-12 01:31:55.268117 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-12 01:31:55.268190 7fb61521a800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-12 01:31:55.269266 7fb61521a800  1 leveldb: Recovering log #29056
2017-09-12 01:31:55.502001 7fb61521a800  1 leveldb: Delete type=0 #29056
2017-09-12 01:31:55.502079 7fb61521a800  1 leveldb: Delete type=3 #29055
2017-09-12 01:32:03.165991 7fb61521a800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-12 01:32:03.167009 7fb61521a800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-12 01:32:03.170097 7fb61521a800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-12 01:32:03.170530 7fb61521a800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-12 01:32:03.170643 7fb61521a800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-12 01:32:03.170657 7fb61521a800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-12 01:32:03.171059 7fb61521a800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-12 01:32:03.193741 7fb61521a800 -1  ** ERROR: osd init failed: (22) Invalid argument
Trying to attack the down+peering issue.  Seems like the same problem as above.  Any way around this one?  A lot of these say "last acting [0]".  Should it matter if I grab from another OSD?

# ceph-objectstore-tool --op export --pgid 0.2c --data-path /var/lib/ceph/osd/ceph-0/ --journal-path /var/lib/ceph/osd/ceph-0/journal --file 0.2c.export.0
Failure to read OSD superblock: (2) No such file or directory

Regards,
Hong

 

On Tuesday, September 12, 2017 10:04 AM, hjcho616  
wrote:
 

 Thank you for those references!  I'll have to go study some more.  Good 
portion of that inconsistent seems to be from missing data from osd.0. =P  
There appears to be some from okay drives. =P  Kicked off "ceph pg repair pg#" 
few times, but doesn't seem to change much yet. =P  As far as smart output 
goes, they are showing status of PASS for all of them.  and all 
current_pending_sector is 0. =)  There are some Raw_Read_Error_Rate with low 
numbers.. like 2 or 6, but some are huge numbers (Seagate drives do this?) and 
they are not being flagged.  =P  Seek Error seems to be the same... Samsung 
drives show 0 while Seagate drives show huge numbers. =P  Even the new ones.  
Is there any particular one I should be concentrated on for the smart?
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
Thank you for those references!  I'll have to go study some more.  Good portion 
of that inconsistent seems to be from missing data from osd.0. =P  There 
appears to be some from okay drives. =P  Kicked off "ceph pg repair pg#" few 
times, but doesn't seem to change much yet. =P  As far as smart output goes, 
they are showing status of PASS for all of them.  and all 
current_pending_sector is 0. =)  There are some Raw_Read_Error_Rate with low 
numbers.. like 2 or 6, but some are huge numbers (Seagate drives do this?) and 
they are not being flagged.  =P  Seek Error seems to be the same... Samsung 
drives show 0 while Seagate drives show huge numbers. =P  Even the new ones.  
Is there any particular one I should be concentrated on for the smart?
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0

# cat /etc/ceph/ceph.conf
[global]
#fsid = 383ef3b1-ba70-43e2-8294-fb2fc2fb6f6a
fsid = 9b2c9bca-112e-48b0-86fc-587ef9a52948
mon_initial_members = MDS1
mon_host = 192.168.1.20
#auth_cluster_required = cephx
#auth_service_required = cephx
#auth_client_required  = cephx
auth_cluster_required = none
auth_service_required = none
auth_client_required  = none
filestore_xattr_use_omap = true
public network = 192.168.1.0/24
cluster_network = 192.168.2.0/24
osd_client_op_priority = 63
osd_recovery_op_priority = 1
osd_max_backfills = 5
osd_recovery_max_active = 5

# ceph osd df
ID WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE  VAR  PGS
 1 1.95250  1.0    1862G   797G 1064G 42.84 0.97  66
 0 1.95250    0        0      0     0  -nan -nan  16
 7 0.31239  1.0     297G 41438M  257G 13.58 0.31   3
 6 1.95250  1.0    1862G   599G 1262G 32.21 0.73  48
 2 1.95250  1.0    1862G   756G 1105G 40.63 0.92  59
 3 1.95250    0        0      0     0  -nan -nan   0
 4 1.95250    0        0      0     0  -nan -nan   0
 5 1.95250    0        0      0     0  -nan -nan   0
 8 1.95250    0        0      0     0  -nan -nan   0
 9 0.31239  1.0     297G   168M  297G  0.06 0.00   2
10 1.81360  1.0    1857G   792G 1064G 42.67 0.96  59
11 1.81360  1.0    1857G  1398G  458G 75.32 1.70 116
              TOTAL 9896G  4386G 5510G 44.32
MIN/MAX VAR: 0.00/1.70  STDDEV: 24.00
Thank you!
Regards,Hong
 

On Tuesday, September 12, 2017 3:07 AM, Ronny Aasen 
 wrote:
 

you can start by posting more details. at least
"ceph osd tree" "cat ceph.conf" and "ceph osd df" so we can see what 
settings you are running, and how your cluster is balanced at the moment.

generally:

inconsistent pg's are pg's that have scrub errors. use rados 
list-inconsistent-pg [pool] and rados list-inconsistent-obj [pg] to 
locate the objects with problems. compare and fix the objects using info 
from 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
 
also read http://ceph.com/geen-categorie/ceph-manually-repair-object/


since you have so many scrub errors i would assume there are more bad 
disks, check all disk's smart values and look for read errors in logs.
if you find any you should drain those disks by setting crush weight to 
0. and  when they are empty remove them from the cluster. personally i 
use smartmontools it sends me emails about bad disks, and check disks 
manually with    smartctl -a /dev/sda || echo bad-disk: $?


pg's that are down+peering need to have one of the acting osd's started 
again. or to have the objects recovered using the methods we have 
discussed previously.
ref: 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure

nb: do not mark any osd's as lost since that = dataloss.


I would
- check smart stats of all disks.  drain disks that are going bad. make 
sure you have enough space on good disks to drain them properly.
- check scrub errors and objects. fix those that are fixable. some may 
require an object from a down osd.
- try to get down osd's running again if possible. if you manage to get 
one running, let it recover and stabilize.
- recover and inject objects from osd's that do not run. start by doing 
one and one pg. and once you get the hang of the method you can do 
multiple pg's at the same time.


good luck

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread Ronny Aasen

you can start by posting more details. at least
"ceph osd tree" "cat ceph.conf" and "ceph osd df" so we can see what 
settings you are running, and how your cluster is balanced at the moment.


generally:

inconsistent pg's are pg's that have scrub errors. use rados 
list-inconsistent-pg [pool] and rados list-inconsistent-obj [pg] to 
locate the objects with problems. compare and fix the objects using info 
from 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent 
also read http://ceph.com/geen-categorie/ceph-manually-repair-object/
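A minimal sketch of that workflow, assuming jewel-era commands, a pool named rbd and pg 2.1a purely as examples:

rados list-inconsistent-pg rbd                          # pgs in this pool with scrub errors
rados list-inconsistent-obj 2.1a --format=json-pretty   # which objects/shards differ, and how
ceph pg repair 2.1a                                     # only after deciding which copy is the good one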



since you have so many scrub errors i would assume there are more bad 
disks, check all disk's smart values and look for read errors in logs.
if you find any you should drain those disks by setting crush weight to 
0. and  when they are empty remove them from the cluster. personally i 
use smartmontools it sends me emails about bad disks, and check disks 
manually with smartctl -a /dev/sda || echo bad-disk: $?
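A hedged sketch of that check-and-drain routine (the device names and osd id are examples only; smartmontools must be installed):

for dev in /dev/sd[a-d]; do
  smartctl -H -A "$dev" || echo "bad-disk: $dev"   # overall health verdict plus raw attributes
done
ceph osd crush reweight osd.3 0                    # drain a suspect osd; remove it once it is empty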



pg's that are down+peering need to have one of the acting osd's started 
again. or to have the objects recovered using the methods we have 
discussed previously.
ref: 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure


nb: do not mark any osd's as lost since that = dataloss.


I would
- check smart stats of all disks.  drain disks that are going bad. make 
sure you have enough space on good disks to drain them properly.
- check scrub errors and objects. fix those that are fixable. some may 
require an object from a down osd.
- try to get down osd's running again if possible. if you manage to get 
one running, let it recover and stabilize.
- recover and inject objects from osd's that do not run. start by doing 
one and one pg. and once you get the hang of the method you can do 
multiple pg's at the same time.



good luck
Ronny Aasen



On 11. sep. 2017 06:51, hjcho616 wrote:
It took a while.  It appears to have cleaned up quite a bit... but still 
has issues.  I've been seeing below message for more than a day and cpu 
utilization and io utilization is low... looks like something is 
stuck...  I rebooted OSDs several times when it looked like it was stuck 
earlier and it would work on something else, but now it is not changing 
much.  What can I try now?


Regards,
Hong

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 11 pgs inconsistent; 6 pgs peering; 6 pgs 
recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 
16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 
pgs undersized; 1 requests are blocked > 32 sec; 1 osds have slow 
requests; recovery 221990/4503980 objects degraded (4.929%); recovery 
147/2251990 unfound (0.007%); 95 scrub errors; mds cluster is degraded; 
no legacy OSD present but 'sortbitwise' flag is not set
pg 0.e is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.d is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.28 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 0.29 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 1.2b is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.2c is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.e is stuck unclean since forever, current state down+peering, last 
acting [11,2]
pg 0.a is stuck unclean for 1233182.248198, current state 
stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck unclean for 1238044.714421, current state 
stale+active+undersized+degraded, last acting [0]
pg 2.1a is stuck unclean for 1238933.203920, current state 
active+recovering+degraded, last acting [2,11]
pg 2.3 is stuck unclean for 1238882.443876, current state 
stale+active+undersized+degraded, last acting [0]
pg 2.27 is stuck unclean for 1295260.765981, current state 
active+recovering+degraded, last acting [11,6]
pg 0.d is stuck unclean for 1230831.504001, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.c is stuck unclean for 1238044.715698, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.3d is stuck unclean for 1232066.572856, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.28 is stuck unclean since forever, current state down+peering, last 
acting [11,6]
pg 0.29 is stuck unclean since forever, current state down+peering, last 
acting [11,6]
pg 1.2b is stuck unclean since forever, current state down+peering, last 
acting [1,11]
pg 2.2f is stuck unclean for 1238127.474088, current state 
active+recovering+degraded+remapped, last acting [9,10]
pg 0.0 is stuck unclean for 1233182.247776, current state 
stale+active+undersized+degraded, last acting [0]
pg 0.2c is stuck unclean since forever, current state down+peering, last 
acting 

Re: [ceph-users] Power outages!!! help!

2017-09-10 Thread hjcho616
It took a while.  It appears to have cleaned up quite a bit... but still has 
issues.  I've been seeing below message for more than a day and cpu utilization 
and io utilization is low... looks like something is stuck...  I rebooted OSDs 
several times when it looked like it was stuck earlier and it would work on 
something else, but now it is not changing much.  What can I try now?
Regards,Hong
# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs degraded; 6 pgs down; 11 pgs inconsistent; 6 pgs peering; 6 pgs recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are blocked > 32 sec; 1 osds have slow requests; recovery 221990/4503980 objects degraded (4.929%); recovery 147/2251990 unfound (0.007%); 95 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.e is stuck unclean since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck unclean for 1233182.248198, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck unclean for 1238044.714421, current state stale+active+undersized+degraded, last acting [0]
pg 2.1a is stuck unclean for 1238933.203920, current state active+recovering+degraded, last acting [2,11]
pg 2.3 is stuck unclean for 1238882.443876, current state stale+active+undersized+degraded, last acting [0]
pg 2.27 is stuck unclean for 1295260.765981, current state active+recovering+degraded, last acting [11,6]
pg 0.d is stuck unclean for 1230831.504001, current state stale+active+undersized+degraded, last acting [0]
pg 1.c is stuck unclean for 1238044.715698, current state stale+active+undersized+degraded, last acting [0]
pg 1.3d is stuck unclean for 1232066.572856, current state stale+active+undersized+degraded, last acting [0]
pg 1.28 is stuck unclean since forever, current state down+peering, last acting [11,6]
pg 0.29 is stuck unclean since forever, current state down+peering, last acting [11,6]
pg 1.2b is stuck unclean since forever, current state down+peering, last acting [1,11]
pg 2.2f is stuck unclean for 1238127.474088, current state active+recovering+degraded+remapped, last acting [9,10]
pg 0.0 is stuck unclean for 1233182.247776, current state stale+active+undersized+degraded, last acting [0]
pg 0.2c is stuck unclean since forever, current state down+peering, last acting [1,11]
pg 2.b is stuck unclean for 1238044.640982, current state stale+active+undersized+degraded, last acting [0]
pg 1.1b is stuck unclean for 1234021.660986, current state stale+active+undersized+degraded, last acting [0]
pg 0.1c is stuck unclean for 1232574.189549, current state stale+active+undersized+degraded, last acting [0]
pg 1.4 is stuck unclean for 1293624.075753, current state stale+active+undersized+degraded, last acting [0]
pg 0.5 is stuck unclean for 1237356.776788, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.1f is stuck unclean for 8825246.729513, current state active+recovering+degraded, last acting [10,2]
pg 1.d is stuck unclean since forever, current state down+peering, last acting [11,2]
pg 2.39 is stuck unclean for 1238933.214406, current state stale+active+undersized+degraded, last acting [0]
pg 1.3a is stuck unclean for 2125299.164204, current state stale+active+undersized+degraded, last acting [0]
pg 0.3b is stuck unclean for 1233432.895409, current state stale+active+undersized+degraded, last acting [0]
pg 2.3c is stuck unclean for 1238933.208648, current state active+recovering+degraded, last acting [10,2]
pg 2.35 is stuck unclean for 1295260.753354, current state active+recovering+degraded, last acting [11,6]
pg 1.9 is stuck unclean for 1238044.722811, current state stale+active+undersized+degraded, last acting [0]
pg 0.a is stuck undersized for 1229917.081228, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck undersized for 1229917.081016, current state stale+active+undersized+degraded, last acting [0]
pg 2.b is stuck undersized for 1229917.068181, current state stale+active+undersized+degraded, last acting [0]
pg 1.9 is stuck undersized for 1229917.075164, current state stale+active+undersized+degraded, last acting [0]
pg 0.5 is stuck undersized for 1229917.085330, current state stale+active+undersized+degraded+inconsistent, 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Hmm.. I hope I don't really need anything from osd.0. =P

# ceph-objectstore-tool --op export --pgid 2.35 --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.35.export
Failure to read OSD superblock: (2) No such file or directory
# ceph-objectstore-tool --op export --pgid 2.2f --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.2f.export
Failure to read OSD superblock: (2) No such file or directory
Regards,Hong 
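Both exports fail before touching any pg data because the tool cannot read the OSD superblock; a quick hedged sanity check on a filestore data directory is simply:

ls -l /var/lib/ceph/osd/ceph-0/superblock              # present and non-empty on a healthy filestore OSD
hexdump -C /var/lib/ceph/osd/ceph-0/superblock | head  # rough check that it was not zeroed or truncated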

On Monday, September 4, 2017 2:29 AM, hjcho616  wrote:
 

 Ronny,
While letting the cluster replicate (looks like this might take a while), I decided to look into where those pgs are missing.  From "ceph health detail" I found the pgs that are unfound, then found the directories that have those pgs, pasted to the right of each detail message below:

pg 2.35 is active+recovering+degraded, acting [11,6], 29 unfound, ceph-0/current/2.35_head, ceph-8/current/2.35_head
pg 2.2f is active+recovery_wait+degraded+remapped, acting [9,10], 24 unfound, ceph-0/current/2.2f_head, ceph-2/current/2.2f_head
pg 2.27 is active+recovery_wait+degraded, acting [11,6], 19 unfound, ceph-4/current/2.27_head
pg 2.1a is active+recovery_wait+degraded, acting [2,11], 29 unfound, ceph-0/current/2.1a_head, ceph-3/current/2.1a_head, ceph-4/current/2.1a_head
pg 2.1f is active+recovery_wait+degraded, acting [10,2], 20 unfound, ceph-3/current/2.1f_head, ceph-4/current/2.1f_head
pg 2.3c is active+recovery_wait+degraded, acting [10,2], 26 unfound, ceph-0/current/2.3c_head, ceph-4/current/2.3c_head

Basically, I just went to look at the pg directories with rb.* files in them.  I noticed that there is more than one of those directories throughout the osds.  Should it matter which one of them I export?  Or do I need both?  Since all of them can be found outside of ceph-0, I'll probably grab from non-ceph-0 OSDs if I can grab from any.

One strange thing I notice is pg 2.2f: it has rb.* files on the active node ceph-2 but is still marked unfound?  Maybe that means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this?
Regards,Hong 

On Monday, September 4, 2017 1:20 AM, hjcho616  wrote:
 

 Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Ronny,
While letting the cluster replicate (looks like this might take a while), I decided to look into where those pgs are missing.  From "ceph health detail" I found the pgs that are unfound, then found the directories that have those pgs, pasted to the right of each detail message below:

pg 2.35 is active+recovering+degraded, acting [11,6], 29 unfound, ceph-0/current/2.35_head, ceph-8/current/2.35_head
pg 2.2f is active+recovery_wait+degraded+remapped, acting [9,10], 24 unfound, ceph-0/current/2.2f_head, ceph-2/current/2.2f_head
pg 2.27 is active+recovery_wait+degraded, acting [11,6], 19 unfound, ceph-4/current/2.27_head
pg 2.1a is active+recovery_wait+degraded, acting [2,11], 29 unfound, ceph-0/current/2.1a_head, ceph-3/current/2.1a_head, ceph-4/current/2.1a_head
pg 2.1f is active+recovery_wait+degraded, acting [10,2], 20 unfound, ceph-3/current/2.1f_head, ceph-4/current/2.1f_head
pg 2.3c is active+recovery_wait+degraded, acting [10,2], 26 unfound, ceph-0/current/2.3c_head, ceph-4/current/2.3c_head

Basically, I just went to look at the pg directories with rb.* files in them.  I noticed that there is more than one of those directories throughout the osds.  Should it matter which one of them I export?  Or do I need both?  Since all of them can be found outside of ceph-0, I'll probably grab from non-ceph-0 OSDs if I can grab from any.

One strange thing I notice is pg 2.2f: it has rb.* files on the active node ceph-2 but is still marked unfound?  Maybe that means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this?
Regards,Hong 
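That directory hunt can be scripted; a rough sketch, assuming filestore OSDs mounted under /var/lib/ceph/osd and the unfound pg ids listed above:

for pg in 2.35 2.2f 2.27 2.1a 2.1f 2.3c; do
  echo "== $pg =="
  ls -d /var/lib/ceph/osd/ceph-*/current/${pg}_head 2>/dev/null   # every local copy of this pg's head directory
done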

On Monday, September 4, 2017 1:20 AM, hjcho616  wrote:
 

 Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery drive, 
and install it as a new osd to the cluster.
 
 
 
 kind regards
 Ronny Aasen
 
 
 On 03.09.2017 06:20, hjcho616 wrote:
  
  I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation? 
  2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade 2017-09-01 22:27:35.590149 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery drive, 
and install it as a new osd to the cluster.
 
 
 
 kind regards
 Ronny Aasen
 
 
 On 03.09.2017 06:20, hjcho616 wrote:
  
  I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation? 
  2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade 2017-09-01 22:27:35.590149 
7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find 
#-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory 
  Regards, Hong 
 
  On Friday, September 1, 2017 11:10 PM, hjcho616  
wrote:
  
 
 Just realized there is a file called superblock in the ceph directory.  
ceph-1 and ceph-2's superblock file is identical, ceph-6 and ceph-7 are 
identical, but not between the two groups.   When I originally created the 
OSDs, I created ceph-0 through 5.  Can superblock file be copied over from 
ceph-1 to ceph-0? 
  Hmm.. it appears to be doing something in the background even though osd.0 is 
down.  ceph health output is changing! # ceph health HEALTH_ERR 40 pgs are 
stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs 
degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 
pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 
30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs 
undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects 
degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); 
recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; 
no legacy OSD present but  'sortbitwise' flag is not set 
  Regards, Hong 
 
   On Friday, September 1, 2017 10:37 PM, hjcho616  
wrote:
  
 
 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.   Below is the log.  What can I do about this? 
  2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 
(ceph:ceph) 2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 

Re: [ceph-users] Power outages!!! help!

2017-09-03 Thread Ronny Aasen
I would not even attempt to connect a recovered drive to ceph, 
especially not one that have had xfs errors and corruption.


your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set


osd crush chooseleaf type = 0

to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal 
as much as possible (no misplaced or degraded objects)  this require 
that ceph have space for the recovery.

i would run with size=2 min_size=2
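Spelled out, those two suggestions look roughly like this; the pool name rbd is only an example, and on an already-running cluster the chooseleaf change usually has to be made in the CRUSH rule itself rather than just in ceph.conf:

ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 2
# temporary, in ceph.conf, so both replicas may land on the same host:
#   osd crush chooseleaf type = 0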

you should also look at the 7 scrub errors. they indicate that there can 
be other drives with issues, you want to locate where those inconsistent 
objects are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/


then you would sit with the 103 unfound objects, and those you should 
try to recover from the recovered drive.
by using the ceph-objectstore-tool export/import to try and 
export pg's missing objects  to a dedicated temporary added import drive.
the import drive does not need to be very large. since you can do one 
and one pg at the time. and you should only recover pg's that contain 
unfound objects. there are really only 103 unfound objects that you need 
to recover.
once the recovery is complete you can wipe the functioning recovery 
drive, and install it as a new osd to the cluster.
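One hedged way to limit the work to pgs that actually have unfound objects is to pull their ids out of ceph health detail before exporting; this is only a sketch, using the recovered osd.0 data/journal paths seen elsewhere in this thread:

ceph health detail | awk '$1 == "pg" && /unfound/ {print $2}' | sort -u > unfound_pgs.txt
while read -r pg; do
  ceph-objectstore-tool --op export --pgid "$pg" \
    --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --skip-journal-replay --file "$pg.export"
done < unfound_pgs.txt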




kind regards
Ronny Aasen


On 03.09.2017 06:20, hjcho616 wrote:
I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It 
still fails with the same error message.  Looks like when I updated to 
10.2.9, some osd needs to be updated and that process is not finding 
the data it needs?  What can I do about this situation?


2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 
filestore(/var/lib/ceph/osd/ceph-0) could not find 
#-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory


Regards,
Hong


On Friday, September 1, 2017 11:10 PM, hjcho616  
wrote:



Just realized there is a file called superblock in the ceph directory. 
 ceph-1 and ceph-2's superblock file is identical, ceph-6 and ceph-7 
are identical, but not between the two groups.  When I originally 
created the OSDs, I created ceph-0 through 5.  Can superblock file be 
copied over from ceph-1 to ceph-0?


Hmm.. it appears to be doing something in the background even though 
osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs 
backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 
pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 
pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs 
stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests 
are blocked > 32 sec; recovery 221826/2473662 objects degraded 
(8.968%); recovery 254711/2473662 objects misplaced (10.297%); 
recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is 
degraded; no legacy OSD present but 'sortbitwise' flag is not set


Regards,
Hong


On Friday, September 1, 2017 10:37 PM, hjcho616  
wrote:



Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?


2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 
(ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 
(2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty 
--pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config 
option
2017-09-01 22:27:27.647091 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize 
is disabled by conf

2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054

2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053

2017-09-01 

Re: [ceph-users] Power outages!!! help!

2017-09-02 Thread hjcho616
I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation?
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
Regards,Hong 

On Friday, September 1, 2017 11:10 PM, hjcho616  wrote:
 

 Just realized there is a file called superblock in the ceph directory.  ceph-1 
and ceph-2's superblock file is identical, ceph-6 and ceph-7 are identical, but 
not between the two groups.  When I originally created the OSDs, I created 
ceph-0 through 5.  Can superblock file be copied over from ceph-1 to ceph-0?
Hmm.. it appears to be doing something in the background even though osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
Regards,Hong 

On Friday, September 1, 2017 10:37 PM, hjcho616  wrote:
 

 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1  ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% 

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Just realized there is a file called superblock in the ceph directory.  ceph-1 
and ceph-2's superblock file is identical, ceph-6 and ceph-7 are identical, but 
not between the two groups.  When I originally created the OSDs, I created 
ceph-0 through 5.  Can superblock file be copied over from ceph-1 to ceph-0?
Hmm.. it appears to be doing something in the background even though osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Regards,
Hong

On Friday, September 1, 2017 10:37 PM, hjcho616  wrote:
 

 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1 ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% /var/lib/ceph/osd/ceph-0
/dev/sde1      1952559676 640365952 1312193724  33% /var/lib/ceph/osd/ceph-6
/dev/sdd1      1952559676 712018768 1240540908  37% /var/lib/ceph/osd/ceph-2
/dev/sdc1      1952559676 755827440 1196732236  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42538060  269879500  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0

# cd /var/lib/ceph/osd/ceph-0
# ls
activate.monmap  current  journal_uuid  magic          superblock  whoami
active           fsid     keyring       ready          sysvinit
ceph_fsid        journal  lost+found    store_version  type

Regards,
Hong

On Friday, September 1, 2017 2:59 PM, hjcho616  

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Tried connecting recovered osd.  Looks like some of the files in the lost+found 
are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1 ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% /var/lib/ceph/osd/ceph-0
/dev/sde1      1952559676 640365952 1312193724  33% /var/lib/ceph/osd/ceph-6
/dev/sdd1      1952559676 712018768 1240540908  37% /var/lib/ceph/osd/ceph-2
/dev/sdc1      1952559676 755827440 1196732236  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42538060  269879500  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0

# cd /var/lib/ceph/osd/ceph-0
# ls
activate.monmap  current  journal_uuid  magic          superblock  whoami
active           fsid     keyring       ready          sysvinit
ceph_fsid        journal  lost+found    store_version  type

Regards,
Hong

On Friday, September 1, 2017 2:59 PM, hjcho616  wrote:
 

 Found the partition, wasn't able to mount the partition right away... Did a 
xfs_repair on that drive.  
Got bunch of messages like this.. =(

entry "10a89fd.__head_AE319A25__0" in shortform directory 845908970 references non-existent inode 605294241
junking entry "10a89fd.__head_AE319A25__0" in directory inode 845908970

Was able to mount.  lost+found has lots of files there. =P  Running du seems to show OK files in current directory.
Will it be safe to attach this one back to the cluster?  Is there a way to specify to use this drive if the data is missing? =)  Or am I being paranoid?  Just plug it? =)

Regards,
Hong
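For anyone following along, the repair-and-inspect pass described above is roughly this (a sketch only; the device, partition and mount point are examples, and no OSD daemon should be running against it at the time):

# xfs_repair /dev/sdc1
# mount /dev/sdc1 /var/lib/ceph/osd/ceph-0
# ls /var/lib/ceph/osd/ceph-0/lost+found | head
# du -sh /var/lib/ceph/osd/ceph-0/current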

On Friday, September 1, 2017 9:01 AM, hjcho616  wrote:
 

Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Found the partition, wasn't able to mount the partition right away... Did a 
xfs_repair on that drive.  
Got bunch of messages like this.. =(

entry "10a89fd.__head_AE319A25__0" in shortform directory 845908970 references non-existent inode 605294241
junking entry "10a89fd.__head_AE319A25__0" in directory inode 845908970

Was able to mount.  lost+found has lots of files there. =P  Running du seems to show OK files in current directory.
Will it be safe to attach this one back to the cluster?  Is there a way to specify to use this drive if the data is missing? =)  Or am I being paranoid?  Just plug it? =)

Regards,
Hong

On Friday, September 1, 2017 9:01 AM, hjcho616  wrote:
 

Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
  rescued:    2000 GB,      errors:        1,  remaining time:         n/a
percent rescued:  99.99%      time since last successful read:         39s
Finished

Still missing partition in the new drive. =P  I found this util called testdisk for broken partition tables.  Will try that tonight. =P

Regards,
Hong
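For reference, with the map file ('./rescue.log' above) ddrescue can be re-run to retry just the one bad area, and testdisk can then scan the copy for the lost partition table (a sketch; device names are examples and -r3 simply means three retry passes):

# ddrescue -f -r3 /dev/sda /dev/sdc ./rescue.log
# testdisk /log /dev/sdc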
 

On Wednesday, August 30, 2017 9:18 AM, Ronny Aasen 
 wrote:
 

  On 30.08.2017 15:32, Steve Taylor wrote:
  
 
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools. 
 i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

 this page has some details on the differences between dd and the ddrescue variants:
 http://www.toad.com/gnu/sysadmin/index.html#ddrescue
 
 kind regards
 Ronny Aasen
 
 
 
 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

  
 On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote: 
 On 29-8-2017 19:12, Steve Taylor wrote:

Hong,

Probably your best chance at recovering any data without special, expensive, forensic procedures is to perform a dd from /dev/sdb to somewhere else large enough to hold a full disk image and attempt to repair that. You'll want to use 'conv=noerror' with your dd command since your disk is failing. Then you could either re-attach the OSD from the new source or attempt to retrieve objects from the filestore on it.

Like somebody else already pointed out: in problem cases like this disk, use dd_rescue. It has really a far better chance of restoring a copy of your disk.
--WjW

I have actually done this before by creating an RBD that matches the disk size, performing the dd, running xfs_repair, and eventually adding it back to the cluster as an OSD. RBDs as OSDs is certainly a temporary arrangement for repair only, but I'm happy to report that it worked flawlessly in my case. I was able to weight the OSD to 0, offload all of its data, then remove it for a full recovery, at which point I just deleted the RBD. The possibilities afforded by Ceph inception are endless. ☺

Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:

Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out of them
- more battery is overpowered for your application the longer it will survive.

Get your self a LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …

On 28 Aug 2017, at 23:10, hjcho616  wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  Ones that fit the UPS original battery spec didn't last very long... part of the reason why I gave up on them.. =P  My wife probably won't like the idea of car battery hanging

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
  rescued:    2000 GB,      errors:        1,  remaining time:         n/a
percent rescued:  99.99%      time since last successful read:         39s
Finished

Still missing partition in the new drive. =P  I found this util called testdisk for broken partition tables.  Will try that tonight. =P

Regards,
Hong
 

On Wednesday, August 30, 2017 9:18 AM, Ronny Aasen 
 wrote:
 

  On 30.08.2017 15:32, Steve Taylor wrote:
  
 
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools. 
 i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

 this page has some details on the differences between dd and the ddrescue variants:
 http://www.toad.com/gnu/sysadmin/index.html#ddrescue
 
 kind regards
 Ronny Aasen
 
 
 
 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

  
 On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote: 
 On 29-8-2017 19:12, Steve Taylor wrote:

Hong,

Probably your best chance at recovering any data without special, expensive, forensic procedures is to perform a dd from /dev/sdb to somewhere else large enough to hold a full disk image and attempt to repair that. You'll want to use 'conv=noerror' with your dd command since your disk is failing. Then you could either re-attach the OSD from the new source or attempt to retrieve objects from the filestore on it.

Like somebody else already pointed out: in problem cases like this disk, use dd_rescue. It has really a far better chance of restoring a copy of your disk.
--WjW

I have actually done this before by creating an RBD that matches the disk size, performing the dd, running xfs_repair, and eventually adding it back to the cluster as an OSD. RBDs as OSDs is certainly a temporary arrangement for repair only, but I'm happy to report that it worked flawlessly in my case. I was able to weight the OSD to 0, offload all of its data, then remove it for a full recovery, at which point I just deleted the RBD. The possibilities afforded by Ceph inception are endless. ☺

Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:

Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out of them
- more battery is overpowered for your application the longer it will survive.

Get your self a LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …

On 28 Aug 2017, at 23:10, hjcho616  wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  Ones that fit the UPS original battery spec didn't last very long... part of the reason why I gave up on them.. =P  My wife probably won't like the idea of car battery hanging out though ha!

The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard doesn't have any additional SATA connectors available.  Would it be safe to add another OSD host?

Regards,
Hong

On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:

Sorry for being brutal … anyway
1. get the battery for UPS ( a car battery will do as well, I’ve moded on ups in the past with truck battery and it was working like a charm :D )
2. get spare drives and put those in because your cluster CAN NOT get out of error due to lack of space
3. Follow advice of Ronny Aasen on how to recover data from hard drives
4. get cooling to drives or you will lose more !

On 28 Aug 2017, at 22:39, hjcho616  wrote:

Tomasz,
Those

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen

On 30.08.2017 15:32, Steve Taylor wrote:
I'm not familiar with dd_rescue, but I've just been reading about it. 
I'm not seeing any features that would be beneficial in this scenario 
that aren't also available in dd. What specific features give it 
"really a far better chance of restoring a copy of your disk" than dd? 
I'm always interested in learning about new recovery tools.


i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

this page has some details on the differences between dd and the ddrescue variants:

http://www.toad.com/gnu/sysadmin/index.html#ddrescue

kind regards
Ronny Aasen
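Worth spelling out for Debian users, since the package and the binary are named differently:

# apt-get install gddrescue
# ddrescue --version

i.e. installing the gddrescue package gives you a binary simply called ddrescue (GNU ddrescue), which is the tool used later in this thread.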







Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |


If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this 
message is prohibited.




On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote:

On 29-8-2017 19:12, Steve Taylor wrote:
Hong, Probably your best chance at recovering any data without 
special, expensive, forensic procedures is to perform a dd from 
/dev/sdb to somewhere else large enough to hold a full disk image 
and attempt to repair that. You'll want to use 'conv=noerror' with 
your dd command since your disk is failing. Then you could either 
re-attach the OSD from the new source or attempt to retrieve objects 
from the filestore on it. 



Like somebody else already pointed out
In problem "cases like disk, use dd_rescue.
It has really a far better chance of restoring a copy of your disk

--WjW

I have actually done this before by creating an RBD that matches the 
disk size, performing the dd, running xfs_repair, and eventually 
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a 
temporary arrangement for repair only, but I'm happy to report that 
it worked flawlessly in my case. I was able to weight the OSD to 0, 
offload all of its data, then remove it for a full recovery, at 
which point I just deleted the RBD. The possibilities afforded by 
Ceph inception are endless. ☺ Steve Taylor | Senior Software 
Engineer | StorageCraft Technology Corporation 380 Data Drive Suite 
300 | Draper | Utah | 84020 Office: 801.871.2799 | If you are not 
the intended recipient of this message or received it erroneously, 
please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of 
this message is prohibited. On Mon, 2017-08-28 at 23:17 +0100, 
Tomasz Kusmierz wrote:
Rule of thumb with batteries is: - more “proper temperature” you 
run them at the more life you get out of them - more battery is 
overpowered for your application the longer it will survive. Get 
your self a LSI 94** controller and use it as HBA and you will be 
fine. but get MORE DRIVES ! …
On 28 Aug 2017, at 23:10, hjcho616 > wrote: Thank you Tomasz and Ronny. 
 I'll have to order some hdd soon and try these out.  Car battery 
idea is nice!  I may try that.. =)  Do they last longer?  Ones 
that fit the UPS original battery spec didn't last very long... 
part of the reason why I gave up on them.. =P  My wife probably 
won't like the idea of car battery hanging out though ha! The OSD1 
(one with mostly ok OSDs, except that smart failure) motherboard 
doesn't have any additional SATA connectors available.  Would it 
be safe to add another OSD host? Regards, Hong On Monday, August 
28, 2017 4:43 PM, Tomasz Kusmierz  wrote: 
Sorry for being brutal … anyway 1. get the battery for UPS ( a car 
battery will do as well, I’ve moded on ups in the past with truck 
battery and it was working like a charm :D ) 2. get spare drives 
and put those in because your cluster CAN NOT get out of error due 
to lack of space 3. Follow advice of Ronny Aasen on how to recover 
data from hard drives 4. get cooling to drives or you will lose 
more !
On 28 Aug 2017, at 22:39, hjcho616 > wrote: Tomasz, Those machines are 
behind a surge protector.  Doesn't appear to be a good one!  I do 
have a UPS... but it is my fault... no battery.  Power was pretty 
reliable for a while... and UPS was just beeping every chance it 
had, disrupting some sleep.. =P  So running on surge protector 
only.  I am running this in home environment.   So far, HDD 
failures have been very rare for this environment. =)  It just 
doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools.






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote:

On 29-8-2017 19:12, Steve Taylor wrote:


Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.



Like somebody else already pointed out
In problem "cases like disk, use dd_rescue.
It has really a far better chance of restoring a copy of your disk

--WjW



I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:


Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out
of them
- more battery is overpowered for your application the longer it will
survive.

Get your self a LSI 94** controller and use it as HBA and you will be
fine. but get MORE DRIVES ! …


On 28 Aug 2017, at 23:10, hjcho616 
> wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
try these out.  Car battery idea is nice!  I may try that.. =)  Do
they last longer?  Ones that fit the UPS original battery spec
didn't last very long... part of the reason why I gave up on them..
=P  My wife probably won't like the idea of car battery hanging out
though ha!

The OSD1 (one with mostly ok OSDs, except that smart failure)
motherboard doesn't have any additional SATA connectors available.
 Would it be safe to add another OSD host?

Regards,
Hong



On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:


Sorry for being brutal … anyway
1. get the battery for UPS ( a car battery will do as well, I’ve
moded on ups in the past with truck battery and it was working like
a charm :D )
2. get spare drives and put those in because your cluster CAN NOT
get out of error due to lack of space
3. Follow advice of Ronny Aasen on how to recover data from hard
drives
4. get cooling to drives or you will lose more !




On 28 Aug 2017, at 22:39, hjcho616 
> wrote:

Tomasz,

Those machines are behind a surge protector.  Doesn't appear to
be a good one!  I do have a UPS... but it is my fault... no
battery.  Power was pretty reliable for a while... and UPS was
just beeping every chance it had, disrupting some sleep.. =P  So
running on surge protector only.  I am running this in home
environment.   So far, HDD failures have been very rare for this
environment. =)  It just doesn't get loaded as much!  I am not
sure what to expect, seeing that "unfound" and just a feeling of
possibility of maybe getting OSD back made me excited about it.
=) Thanks for letting me know what should be the priority.  I
just lack experience and knowledge in this. =) Please do continue
to guide me though this.

Thank you for the decode of that smart messages!  I do agree that
looks like it is on its way out.  I would like to know how to get
good portion of it back if possible. =)

I think I 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
Yes, if I had created the RBD in the same cluster I was trying to repair then I 
would have used rbd-fuse to "map" the RBD in order to avoid potential deadlock 
issues with the kernel client. I had another cluster available, so I copied its 
config file to the osd node, created the RBD in the second cluster, and used 
the kernel client for the dd, xfs_repair, and mount. Worked like a charm.
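Put together, the whole trick as described in this thread looks roughly like the sketch below. Every name and size here is an example, the RBD lives in a second cluster reached through its own config file, and this is strictly a temporary repair arrangement:

# rbd -c /etc/ceph/second-cluster.conf create recovery/osd0-image --size 2097152
# rbd -c /etc/ceph/second-cluster.conf map recovery/osd0-image
# dd if=/dev/sdb of=/dev/rbd0 bs=1M conv=noerror
# xfs_repair /dev/rbd0
# mount /dev/rbd0 /var/lib/ceph/osd/ceph-0

From there the OSD is started normally, weighted to 0 so it drains, and once empty it is removed and the RBD deleted. If the image had to live in the same cluster, rbd-fuse rather than the kernel client would be the safer way to attach it, for the deadlock reason mentioned above.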






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Tue, 2017-08-29 at 18:04 +, David Turner wrote:

But it was absolutely awesome to run an osd off of an rbd after the disk failed.

On Tue, Aug 29, 2017, 1:42 PM David Turner 
> wrote:

To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently dangerous for a few reasons.

On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
> wrote:
Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive.
>
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! …
> > On 28 Aug 2017, at 23:10, hjcho616 
> > > wrote:
> >
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> >
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> >
> > Regards,
> > Hong
> >
> >
> >
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> >
> >
> > Sorry for being brutal … anyway
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on how to recover data from hard
> > drives
> > 4. get cooling to drives or you will lose more !
> >
> >
> > > On 28 Aug 2017, at 22:39, hjcho616 
> > > > wrote:
> > >
> > > Tomasz,
> > >
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen

[snip]

I'm not sure if I am liking what I see on fdisk... it doesn't show sdb1. 
  I hope it shows up when I run dd_rescue to other drive... =P


# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

/dev/sdb: device contains a valid 'xfs' signature, it's strongly 
recommended to wipe the device by command wipefs(8) if this setup is 
unexpected to avoid possible collisions.


Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xe684adb6.

Command (m for help): p
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xe684adb6



Command (m for help):




Do not use fdisk for osd drives. they are using the GPT partition 
structure. and depend on the GPT uuid to be correct.  So use either 
parted or gdisk/cgdisk/sgdisk  if you want to look at it.


writing a mbr partition table to the osd will break it naturally.
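For the record, the GPT-aware equivalents of that fdisk check would be along these lines (read-only; the device name is an example):

# sgdisk --print /dev/sdc
# parted /dev/sdc print

Both simply read the existing GPT instead of offering to create a new DOS disklabel the way fdisk did above.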

kind regards
Ronny Aasen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
This is what it looks like today.  Seems like the ceph-osds are sitting at 0% cpu, so all the migrations appear to be done.  Does this look ok to shut down and continue when I get the HDD on Thursday?

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 20 pgs backfill_wait; 23 pgs degraded; 6 pgs down; 2 pgs inconsistent; 6 pgs peering; 4 pgs recovering; 3 pgs recovery_wait; 16 pgs stale; 23 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 49 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are blocked > 32 sec; recovery 221870/2473686 objects degraded (8.969%); recovery 365398/2473686 objects misplaced (14.771%); recovery 147/2251990 unfound (0.007%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9212    1575568   1% /run
/dev/sda1        15247760   9610208    4839960  67% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdd1      1952559676 712028032 1240531644  37% /var/lib/ceph/osd/ceph-2
/dev/sde1      1952559676 628862040 1323697636  33% /var/lib/ceph/osd/ceph-6
/dev/sdc1      1952559676 755815036 1196744640  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42551928  269865632  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0
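Before powering the node down, a couple of read-only checks can confirm that recovery really has gone idle (a sketch; nothing here changes cluster state):

# ceph -s
# ceph pg dump_stuck unclean
# ceph osd df

If ceph -s shows no recovery io line and the stuck-PG list stops shrinking between runs, the cluster has done what it can with the OSDs it currently has.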
I'm not sure if I am liking what I see on fdisk... it doesn't show sdb1.  I 
hope it shows up when I run dd_rescue to other drive... =P
# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

/dev/sdb: device contains a valid 'xfs' signature, it's strongly recommended to wipe the device by command wipefs(8) if this setup is unexpected to avoid possible collisions.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xe684adb6.

Command (m for help): p
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xe684adb6


Command (m for help):
 

On Tuesday, August 29, 2017 3:29 PM, Tomasz Kusmierz 
 wrote:
 

 Maged, on second host he has 4 out of 5 OSD failed on him … I think he’s past 
the trying to increase the backfill threshold :) of course he could try to 
degrade the cluster by letting it mirror within the same host :) 

On 29 Aug 2017, at 21:26, Maged Mokhtar  wrote:

One of the things to watch out for in small clusters is OSDs can get full rather unexpectedly in recovery/backfill cases:

In your case you have 2 OSD nodes with 5 disks each. Since you have a replica of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs will have to be re-created on the same host, meaning they will be distributed only among the 4 OSDs on the same host, which will quickly bump their usage by nearly 20% each.

The default osd_backfill_full_ratio is 85%, so if any of the 4 OSDs was near 70% util before the failure, it will easily reach 85% and cause the cluster to error with the backfill_toofull message you see.  This is why i suggest you add an extra disk or try your luck raising osd_backfill_full_ratio to 92%; it may fix things.

/Maged

On 2017-08-29 21:13, hjcho616 wrote:

Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)  That does sound great.  I don't quite have another cluster so waiting for a drive to arrive! =)

After setting min and max_min to 1, looks like the toofull flag is gone... Maybe when I was making that video copy the OSDs were already down... and those two OSDs were not enough to take too much extra... and on top of it that last OSD alive was a smaller disk (2TB vs 320GB)... so it probably was filling up faster.  I should have captured that message... but turned the machine off and now I am at work. =P  When I get back home, I'll try to grab that and share.  Maybe I don't need to try to add another OSD to that cluster just yet!  OSDs are about 50% full on OSD1.

So next up, fixing osd0!

Regards,
Hong

 On Tuesday, August 29, 2017 1:05 PM, David Turner  
wrote:


But it was absolutely awesome to run an osd off of an rbd after the disk failed.
On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:
To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Tomasz Kusmierz
Maged, on second host he has 4 out of 5 OSD failed on him … I think he’s past 
the trying to increase the backfill threshold :) of course he could try to 
degrade the cluster by letting it mirror within the same host :) 
> On 29 Aug 2017, at 21:26, Maged Mokhtar  wrote:
> 
> One of the things to watch out in small clusters is OSDs can get full rather 
> unexpectedly in recovery/backfill cases:
> 
> In your case you have 2 OSD nodes with 5 disks each. Since you have a replica 
> of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs 
> will have to be re-created on the same host, meaning they will be distributed 
> only among the 4 OSDs on the same host, which will quickly bump their usage 
> by nearly 20% each.
> the default osd_backfill_full_ratio is 85% so if any of the 4 OSDs was near 
> 70% util before the failure, it will easily reach 85% and cause the cluster 
> to error with backfill_toofull message you see.  This is why i suggest you 
> add an extra disk or try your luck raising osd_backfill_full_ratio to 92%; it 
> may fix things.
> 
> /Maged
> 
> On 2017-08-29 21:13, hjcho616 wrote:
> 
>> Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =) 
>>  That does sound great.  I don't quite have another cluster so waiting for a 
>> drive to arrive! =)  
>>  
>> After setting min and max_min to 1, looks like toofull flag is gone... Maybe 
>> when I was making that video copy OSDs were already down... and those two 
>> OSDs were not enough to take too much extra...  and on top of it that last 
>> OSD alive was smaller disk (2TB vs 320GB)... so it probably was filling up 
>> faster.  I should have captured that message... but turned machine off and 
>> now I am at work. =P  When I get back home, I'll try to grab that and share. 
>>  Maybe I don't need to try to add another OSD to that cluster just yet!  
>> OSDs are about 50% full on OSD1.
>>  
>> So next up, fixing osd0!
>>  
>> Regards,
>> Hong  
>> 
>> 
>> On Tuesday, August 29, 2017 1:05 PM, David Turner  
>> wrote:
>> 
>> 
>> But it was absolutely awesome to run an osd off of an rbd after the disk 
>> failed.
>> 
>> On Tue, Aug 29, 2017, 1:42 PM David Turner > > wrote:
>> To addend Steve's success, the rbd was created in a second cluster in the 
>> same datacenter so it didn't run the risk of deadlocking that mapping rbds 
>> on machines running osds has.  It is still theoretical to work on the same 
>> cluster, but more inherently dangerous for a few reasons.
>> 
>> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor > > wrote:
>> Hong,
>> 
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
>> 
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>> 
>> The possibilities afforded by Ceph inception are endless. ☺
>> 
>> 
>> 
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 |
>> 
>> If you are not the intended recipient of this message or received it 
>> erroneously, please notify the sender and delete it, together with any 
>> attachments, and be advised that any dissemination or copying of this 
>> message is prohibited.
>> 
>> 
>> 
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> > Rule of thumb with batteries is:
>> > - more "proper temperature" you run them at the more life you get out
>> > of them
>> > - more battery is overpowered for your application the longer it will
>> > survive. 
>> >
>> > Get your self a LSI 94** controller and use it as HBA and you will be
>> > fine. but get MORE DRIVES ! ... 
>> > > On 28 Aug 2017, at 23:10, hjcho616 > > > > wrote:
>> > >
>> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
>> > > they last longer?  Ones that fit the UPS original battery spec
>> > > didn't last very long... part of the reason why I gave up on them..
>> > > =P  My wife probably won't like the idea of car 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Maged Mokhtar
One of the things to watch out in small clusters is OSDs can get full
rather unexpectedly in recovery/backfill cases: 

In your case you have 2 OSD nodes with 5 disks each. Since you have a
replica of 2, each PG will have 1 copy on each host, so if an OSD fails,
all its PGs will have to be re-created on the same host, meaning they
will be distributed only among the 4 OSDs on the same host, which will
quickly bump their usage by nearly 20% each.
the default osd_backfill_full_ratio is 85% so if any of the 4 OSDs was
near 70% util before the failure, it will easily reach 85% and cause the
cluster to error with backfill_toofull message you see.  This is why i
suggest you add an extra disk or try your luck raising
osd_backfill_full_ratio to 92%; it may fix things. 

/Maged 
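If anyone wants to try the ratio bump Maged describes, it can usually be injected at runtime (the 0.92 value is just his suggested example, and it should be put back to the default once backfill completes):

# ceph tell osd.* injectargs '--osd_backfill_full_ratio 0.92'
# ceph osd df tree

The second command is only there to keep an eye on per-OSD utilisation while the backfill runs.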

On 2017-08-29 21:13, hjcho616 wrote:

> Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)  
> That does sound great.  I don't quite have another cluster so waiting for a 
> drive to arrive! =)   
> 
> After setting min and max_min to 1, looks like toofull flag is gone... Maybe 
> when I was making that video copy OSDs were already down... and those two 
> OSDs were not enough to take too much extra...  and on top of it that last 
> OSD alive was smaller disk (2TB vs 320GB)... so it probably was filling up 
> faster.  I should have captured that message... but turned machine off and 
> now I am at work. =P  When I get back home, I'll try to grab that and share.  
> Maybe I don't need to try to add another OSD to that cluster just yet!  OSDs 
> are about 50% full on OSD1. 
> 
> So next up, fixing osd0! 
> 
> Regards, 
> Hong   
> 
> On Tuesday, August 29, 2017 1:05 PM, David Turner  
> wrote:
> 
> But it was absolutely awesome to run an osd off of an rbd after the disk 
> failed. 
> 
> On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote: 
> To addend Steve's success, the rbd was created in a second cluster in the 
> same datacenter so it didn't run the risk of deadlocking that mapping rbds on 
> machines running osds has.  It is still theoretical to work on the same 
> cluster, but more inherently dangerous for a few reasons. 
> 
> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor  
> wrote: Hong,
> 
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.
> 
> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
> 
> The possibilities afforded by Ceph inception are endless. ☺
> 
> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 |
> 
> If you are not the intended recipient of this message or received it 
> erroneously, please notify the sender and delete it, together with any 
> attachments, and be advised that any dissemination or copying of this message 
> is prohibited.
> 
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> Rule of thumb with batteries is:
>> - more "proper temperature" you run them at the more life you get out
>> of them
>> - more battery is overpowered for your application the longer it will
>> survive. 
>> 
>> Get your self a LSI 94** controller and use it as HBA and you will be
>> fine. but get MORE DRIVES ! ... 
>>> On 28 Aug 2017, at 23:10, hjcho616  wrote:
>>>
>>> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>>> try these out.  Car battery idea is nice!  I may try that.. =)  Do
>>> they last longer?  Ones that fit the UPS original battery spec
>>> didn't last very long... part of the reason why I gave up on them..
>>> =P  My wife probably won't like the idea of car battery hanging out
>>> though ha!
>>>
>>> The OSD1 (one with mostly ok OSDs, except that smart failure)
>>> motherboard doesn't have any additional SATA connectors available.
>>>  Would it be safe to add another OSD host?
>>>
>>> Regards,
>>> Hong
>>>
>>>
>>>
>>> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz >> mail.com [1]> wrote:
>>>
>>>
>>> Sorry for being brutal ... anyway 
>>> 1. get the battery for UPS ( a car battery will do as well, I've
>>> moded on ups in the past with truck battery and it was working like
>>> a 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Tomasz Kusmierz
Just FYI, setting size and min_size to 1 is a last resort in my mind - to get 
you out of dodge !! 

Before setting that you should have made your self 105% certain that all OSD 
you leave ON, have NO bad sectors or no sectors pending or no any errors of any 
kind. 

once you can mount the cephfs, just delete everything you don’t actually need. 
Trust everybody has some data that they don’t truly need … this pron 
collection that you can redownload ;) that set of iso files that you downloaded 
from ubuntu but you can download them later … it might turn out that one of 
those files will contain the missing objects and your recovery will be 
pointless. 
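For completeness, the last-resort settings Tomasz is talking about are plain pool options (the pool name is a placeholder; size 1 means a single copy, so any further disk error is unrecoverable data loss):

# ceph osd pool set <poolname> size 1
# ceph osd pool set <poolname> min_size 1

And once things are healthy again they should go back to sensible values, e.g. size 2 or 3 with min_size 2.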

> On 29 Aug 2017, at 20:49, Willem Jan Withagen  wrote:
> 
> On 29-8-2017 19:12, Steve Taylor wrote:
>> Hong,
>> 
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
> 
> Like somebody else already pointed out
> In problem cases like this disk, use dd_rescue.
> It has really a far better chance of restoring a copy of your disk
> 
> --WjW
> 
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>> 
>> The possibilities afforded by Ceph inception are endless. ☺
>> 
>> 
>> 
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 | 
>> 
>> If you are not the intended recipient of this message or received it 
>> erroneously, please notify the sender and delete it, together with any 
>> attachments, and be advised that any dissemination or copying of this 
>> message is prohibited.
>> 
>> 
>> 
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>>> Rule of thumb with batteries is:
>>> - more “proper temperature” you run them at the more life you get out
>>> of them
>>> - more battery is overpowered for your application the longer it will
>>> survive. 
>>> 
>>> Get your self a LSI 94** controller and use it as HBA and you will be
>>> fine. but get MORE DRIVES ! … 
 On 28 Aug 2017, at 23:10, hjcho616  wrote:
 
 Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
 try these out.  Car battery idea is nice!  I may try that.. =)  Do
 they last longer?  Ones that fit the UPS original battery spec
 didn't last very long... part of the reason why I gave up on them..
 =P  My wife probably won't like the idea of car battery hanging out
 though ha!
 
 The OSD1 (one with mostly ok OSDs, except that smart failure)
 motherboard doesn't have any additional SATA connectors available.
  Would it be safe to add another OSD host?
 
 Regards,
 Hong
 
 
 
 On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:
 
 
 Sorry for being brutal … anyway 
 1. get the battery for UPS ( a car battery will do as well, I’ve
 moded on ups in the past with truck battery and it was working like
 a charm :D )
 2. get spare drives and put those in because your cluster CAN NOT
 get out of error due to lack of space
 3. Follow advice of Ronny Aasen on how to recover data from hard
 drives 
 4. get cooling to drives or you will lose more ! 
 
 
> On 28 Aug 2017, at 22:39, hjcho616  wrote:
> 
> Tomasz,
> 
> Those machines are behind a surge protector.  Doesn't appear to
> be a good one!  I do have a UPS... but it is my fault... no
> battery.  Power was pretty reliable for a while... and UPS was
> just beeping every chance it had, disrupting some sleep.. =P  So
> running on surge protector only.  I am running this in home
> environment.   So far, HDD failures have been very rare for this
> environment. =)  It just doesn't get loaded as much!  I am not
> sure what to expect, seeing that "unfound" and just a feeling of
> possibility of maybe getting OSD back made me excited about it.
> =) Thanks for letting me know what should be the priority.  I
> just lack experience and knowledge in this. =) Please do continue
> to guide me though this. 
> 
> Thank you for the decode of that smart messages!  I do agree that
> 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Willem Jan Withagen
On 29-8-2017 19:12, Steve Taylor wrote:
> Hong,
> 
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.

Like somebody else already pointed out:
in problem cases like this disk, use dd_rescue.
It really has a far better chance of restoring a copy of your disk.

--WjW
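
Something along these lines works for the cloning step (a sketch only; GNU ddrescue
is shown here, which on Debian comes from the gddrescue package, and the device name,
image and map paths are assumptions):

# ddrescue -n /dev/sdb /mnt/spare/sdb.img /mnt/spare/sdb.map
# ddrescue -r3 /dev/sdb /mnt/spare/sdb.img /mnt/spare/sdb.map

The first pass with -n grabs the easy areas quickly; the second pass retries the bad
areas up to 3 times. The map file lets you stop and resume without losing progress.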

> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
> 
> The possibilities afforded by Ceph inception are endless. ☺
> 
> 
>  
> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 | 
>  
> If you are not the intended recipient of this message or received it 
> erroneously, please notify the sender and delete it, together with any 
> attachments, and be advised that any dissemination or copying of this message 
> is prohibited.
> 
>  
> 
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> Rule of thumb with batteries is:
>> - more “proper temperature” you run them at the more life you get out
>> of them
>> - more battery is overpowered for your application the longer it will
>> survive. 
>>
>> Get your self a LSI 94** controller and use it as HBA and you will be
>> fine. but get MORE DRIVES ! … 
>>> On 28 Aug 2017, at 23:10, hjcho616  wrote:
>>>
>>> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>>> try these out.  Car battery idea is nice!  I may try that.. =)  Do
>>> they last longer?  Ones that fit the UPS original battery spec
>>> didn't last very long... part of the reason why I gave up on them..
>>> =P  My wife probably won't like the idea of car battery hanging out
>>> though ha!
>>>
>>> The OSD1 (one with mostly ok OSDs, except that smart failure)
>>> motherboard doesn't have any additional SATA connectors available.
>>>  Would it be safe to add another OSD host?
>>>
>>> Regards,
>>> Hong
>>>
>>>
>>>
>>> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz >> mail.com> wrote:
>>>
>>>
>>> Sorry for being brutal … anyway 
>>> 1. get the battery for UPS ( a car battery will do as well, I’ve
>>> moded on ups in the past with truck battery and it was working like
>>> a charm :D )
>>> 2. get spare drives and put those in because your cluster CAN NOT
>>> get out of error due to lack of space
>>> 3. Follow advice of Ronny Aasen on hot to recover data from hard
>>> drives 
>>> 4 get cooling to drives or you will loose more ! 
>>>
>>>
 On 28 Aug 2017, at 22:39, hjcho616  wrote:

 Tomasz,

 Those machines are behind a surge protector.  Doesn't appear to
 be a good one!  I do have a UPS... but it is my fault... no
 battery.  Power was pretty reliable for a while... and UPS was
 just beeping every chance it had, disrupting some sleep.. =P  So
 running on surge protector only.  I am running this in home
 environment.   So far, HDD failures have been very rare for this
 environment. =)  It just doesn't get loaded as much!  I am not
 sure what to expect, seeing that "unfound" and just a feeling of
 possibility of maybe getting OSD back made me excited about it.
 =) Thanks for letting me know what should be the priority.  I
 just lack experience and knowledge in this. =) Please do continue
 to guide me though this. 

 Thank you for the decode of that smart messages!  I do agree that
 looks like it is on its way out.  I would like to know how to get
 good portion of it back if possible. =)

 I think I just set the size and min_size to 1.
 # ceph osd lspools
 0 data,1 metadata,2 rbd,
 # ceph osd pool set rbd size 1
 set pool 2 size to 1
 # ceph osd pool set rbd min_size 1
 set pool 2 min_size to 1

 Seems to be doing some backfilling work.

 # ceph health
 HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling;
 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering;
 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs
 stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101
 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked
> 32 sec; 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)
That does sound great.  I don't quite have another cluster, so I'm waiting for a
drive to arrive! =)
After setting size and min_size to 1, it looks like the toofull flag is gone... Maybe
when I was making that video copy the OSDs were already down... and those two OSDs
were not enough to take on that much extra... and on top of it the last OSD alive
was the smaller disk (2TB vs 320GB)... so it probably was filling up faster.  I
should have captured that message... but I turned the machine off and now I am at
work. =P  When I get back home, I'll try to grab that and share.  Maybe I don't
need to try to add another OSD to that cluster just yet!  OSDs are about 50%
full on OSD1.
So next up, fixing osd0!
Regards,
Hong

On Tuesday, August 29, 2017 1:05 PM, David Turner  
wrote:
 

 But it was absolutely awesome to run an osd off of an rbd after the disk 
failed.
On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:

To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently dangerous for a few reasons.
On Tue, Aug 29, 2017, 1:15 PM Steve Taylor  
wrote:

Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive. 
>
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! … 
> > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> >
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> >
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> >
> > Regards,
> > Hong
> >
> >
> >
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> >
> >
> > Sorry for being brutal … anyway 
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > drives 
> > 4 get cooling to drives or you will loose more ! 
> >
> >
> > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > >
> > > Tomasz,
> > >
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > running on surge protector only.  I am running this in home
> > > environment.   So far, HDD failures have been very rare for this
> > > environment. =)  It just doesn't get loaded as much!  I am not
> > > sure what to expect, seeing that "unfound" and just a feeling of
> > > possibility of maybe getting OSD back 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread David Turner
But it was absolutely awesome to run an osd off of an rbd after the disk
failed.

On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:

> To addend Steve's success, the rbd was created in a second cluster in the
> same datacenter so it didn't run the risk of deadlocking that mapping rbds
> on machines running osds has.  It is still theoretical to work on the same
> cluster, but more inherently dangerous for a few reasons.
>
> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
> wrote:
>
>> Hong,
>>
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
>>
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>>
>> The possibilities afforded by Ceph inception are endless. ☺
>>
>>
>>
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology
>> Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 |
>>
>> If you are not the intended recipient of this message or received it
>> erroneously, please notify the sender and delete it, together with any
>> attachments, and be advised that any dissemination or copying of this
>> message is prohibited.
>>
>>
>>
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> > Rule of thumb with batteries is:
>> > - more “proper temperature” you run them at the more life you get out
>> > of them
>> > - more battery is overpowered for your application the longer it will
>> > survive.
>> >
>> > Get your self a LSI 94** controller and use it as HBA and you will be
>> > fine. but get MORE DRIVES ! …
>> > > On 28 Aug 2017, at 23:10, hjcho616  wrote:
>> > >
>> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
>> > > they last longer?  Ones that fit the UPS original battery spec
>> > > didn't last very long... part of the reason why I gave up on them..
>> > > =P  My wife probably won't like the idea of car battery hanging out
>> > > though ha!
>> > >
>> > > The OSD1 (one with mostly ok OSDs, except that smart failure)
>> > > motherboard doesn't have any additional SATA connectors available.
>> > >  Would it be safe to add another OSD host?
>> > >
>> > > Regards,
>> > > Hong
>> > >
>> > >
>> > >
>> > > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz > > > mail.com> wrote:
>> > >
>> > >
>> > > Sorry for being brutal … anyway
>> > > 1. get the battery for UPS ( a car battery will do as well, I’ve
>> > > moded on ups in the past with truck battery and it was working like
>> > > a charm :D )
>> > > 2. get spare drives and put those in because your cluster CAN NOT
>> > > get out of error due to lack of space
>> > > 3. Follow advice of Ronny Aasen on hot to recover data from hard
>> > > drives
>> > > 4 get cooling to drives or you will loose more !
>> > >
>> > >
>> > > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
>> > > >
>> > > > Tomasz,
>> > > >
>> > > > Those machines are behind a surge protector.  Doesn't appear to
>> > > > be a good one!  I do have a UPS... but it is my fault... no
>> > > > battery.  Power was pretty reliable for a while... and UPS was
>> > > > just beeping every chance it had, disrupting some sleep.. =P  So
>> > > > running on surge protector only.  I am running this in home
>> > > > environment.   So far, HDD failures have been very rare for this
>> > > > environment. =)  It just doesn't get loaded as much!  I am not
>> > > > sure what to expect, seeing that "unfound" and just a feeling of
>> > > > possibility of maybe getting OSD back made me excited about it.
>> > > > =) Thanks for letting me know what should be the priority.  I
>> > > > just lack experience and knowledge in this. =) Please do continue
>> > > > to guide me though this.
>> > > >
>> > > > Thank you for the decode of that smart messages!  I do agree that
>> > > > looks like it is on its way out.  I would like to know how to get
>> > > > good portion of it back if possible. =)
>> > > >
>> > > > I think I just set the size and min_size to 1.
>> > > > # ceph osd lspools
>> > > > 0 data,1 metadata,2 rbd,
>> > > > # ceph osd pool set rbd size 1
>> > > > set pool 2 size to 1
>> > > > 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread David Turner
To add to Steve's success: the rbd was created in a second cluster in the
same datacenter, so it didn't run the risk of deadlocking that mapping rbds
on machines running osds has.  In theory it should still work on the same
cluster, but that is more inherently dangerous for a few reasons.

On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
wrote:

> Hong,
>
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.
>
> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
>
> The possibilities afforded by Ceph inception are endless. ☺
>
>
>
> Steve Taylor | Senior Software Engineer | StorageCraft Technology
> Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 |
>
> If you are not the intended recipient of this message or received it
> erroneously, please notify the sender and delete it, together with any
> attachments, and be advised that any dissemination or copying of this
> message is prohibited.
>
>
>
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> > Rule of thumb with batteries is:
> > - more “proper temperature” you run them at the more life you get out
> > of them
> > - more battery is overpowered for your application the longer it will
> > survive.
> >
> > Get your self a LSI 94** controller and use it as HBA and you will be
> > fine. but get MORE DRIVES ! …
> > > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> > >
> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > > they last longer?  Ones that fit the UPS original battery spec
> > > didn't last very long... part of the reason why I gave up on them..
> > > =P  My wife probably won't like the idea of car battery hanging out
> > > though ha!
> > >
> > > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > > motherboard doesn't have any additional SATA connectors available.
> > >  Would it be safe to add another OSD host?
> > >
> > > Regards,
> > > Hong
> > >
> > >
> > >
> > > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > > mail.com> wrote:
> > >
> > >
> > > Sorry for being brutal … anyway
> > > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > > moded on ups in the past with truck battery and it was working like
> > > a charm :D )
> > > 2. get spare drives and put those in because your cluster CAN NOT
> > > get out of error due to lack of space
> > > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > > drives
> > > 4 get cooling to drives or you will loose more !
> > >
> > >
> > > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > > >
> > > > Tomasz,
> > > >
> > > > Those machines are behind a surge protector.  Doesn't appear to
> > > > be a good one!  I do have a UPS... but it is my fault... no
> > > > battery.  Power was pretty reliable for a while... and UPS was
> > > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > > running on surge protector only.  I am running this in home
> > > > environment.   So far, HDD failures have been very rare for this
> > > > environment. =)  It just doesn't get loaded as much!  I am not
> > > > sure what to expect, seeing that "unfound" and just a feeling of
> > > > possibility of maybe getting OSD back made me excited about it.
> > > > =) Thanks for letting me know what should be the priority.  I
> > > > just lack experience and knowledge in this. =) Please do continue
> > > > to guide me though this.
> > > >
> > > > Thank you for the decode of that smart messages!  I do agree that
> > > > looks like it is on its way out.  I would like to know how to get
> > > > good portion of it back if possible. =)
> > > >
> > > > I think I just set the size and min_size to 1.
> > > > # ceph osd lspools
> > > > 0 data,1 metadata,2 rbd,
> > > > # ceph osd pool set rbd size 1
> > > > set pool 2 size to 1
> > > > # ceph osd pool set rbd min_size 1
> > > > set pool 2 min_size to 1
> > > >
> > > > Seems to be doing some backfilling work.
> > > >
> > > > # ceph health
> > > > HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
> > > > pgs backfill_toofull; 74 pgs 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Steve Taylor
Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺
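
For the record, a rough sketch of that workflow (the image size, pool, device names
and mount path are made up for the example, not Steve's actual values):

# rbd create rescue/osd-clone --size 2097152      (a 2 TB image, ideally in another cluster)
# rbd map rescue/osd-clone                        (shows up as e.g. /dev/rbd0)
# dd if=/dev/sdb1 of=/dev/rbd0 bs=4M conv=noerror,sync
# xfs_repair /dev/rbd0
# mount /dev/rbd0 /var/lib/ceph/osd/ceph-4        (then try starting the osd again)

Adding ',sync' to conv pads unreadable blocks with zeros so the copy keeps its
offsets aligned; plain 'conv=noerror' alone can silently shorten the image.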


 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 | 
 
If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.

 

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive. 
> 
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! … 
> > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> > 
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> > 
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> > 
> > Regards,
> > Hong
> > 
> > 
> > 
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> > 
> > 
> > Sorry for being brutal … anyway 
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > drives 
> > 4 get cooling to drives or you will loose more ! 
> > 
> > 
> > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > > 
> > > Tomasz,
> > > 
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > running on surge protector only.  I am running this in home
> > > environment.   So far, HDD failures have been very rare for this
> > > environment. =)  It just doesn't get loaded as much!  I am not
> > > sure what to expect, seeing that "unfound" and just a feeling of
> > > possibility of maybe getting OSD back made me excited about it.
> > > =) Thanks for letting me know what should be the priority.  I
> > > just lack experience and knowledge in this. =) Please do continue
> > > to guide me though this. 
> > > 
> > > Thank you for the decode of that smart messages!  I do agree that
> > > looks like it is on its way out.  I would like to know how to get
> > > good portion of it back if possible. =)
> > > 
> > > I think I just set the size and min_size to 1.
> > > # ceph osd lspools
> > > 0 data,1 metadata,2 rbd,
> > > # ceph osd pool set rbd size 1
> > > set pool 2 size to 1
> > > # ceph osd pool set rbd min_size 1
> > > set pool 2 min_size to 1
> > > 
> > > Seems to be doing some backfilling work.
> > > 
> > > # ceph health
> > > HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
> > > pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling;
> > > 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering;
> > > 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs
> > > stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101
> > > pgs stuck undersized; 101 pgs undersized; 1 requests are blocked
> > > > 32 sec; recovery 1790657/4502340 objects degraded (39.772%);
> > > recovery 641906/4502340 objects misplaced (14.257%); recovery
> > > 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is
> > > degraded; no legacy OSD 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Rule of thumb with batteries is:
- the closer to the "proper temperature" you run them at, the more life you get out of them
- the more the battery is overpowered for your application, the longer it will survive.

Get yourself an LSI 94** controller and use it as an HBA and you will be fine. But
get MORE DRIVES! …
> On 28 Aug 2017, at 23:10, hjcho616  wrote:
> 
> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these 
> out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  
> Ones that fit the UPS original battery spec didn't last very long... part of 
> the reason why I gave up on them.. =P  My wife probably won't like the idea 
> of car battery hanging out though ha!
> 
> The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard 
> doesn't have any additional SATA connectors available.  Would it be safe to 
> add another OSD host?
> 
> Regards,
> Hong
> 
> 
> 
> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> Sorry for being brutal … anyway 
> 1. get the battery for UPS ( a car battery will do as well, I’ve moded on ups 
> in the past with truck battery and it was working like a charm :D )
> 2. get spare drives and put those in because your cluster CAN NOT get out of 
> error due to lack of space
> 3. Follow advice of Ronny Aasen on hot to recover data from hard drives 
> 4 get cooling to drives or you will loose more ! 
> 
> 
>> On 28 Aug 2017, at 22:39, hjcho616 > > wrote:
>> 
>> Tomasz,
>> 
>> Those machines are behind a surge protector.  Doesn't appear to be a good 
>> one!  I do have a UPS... but it is my fault... no battery.  Power was pretty 
>> reliable for a while... and UPS was just beeping every chance it had, 
>> disrupting some sleep.. =P  So running on surge protector only.  I am 
>> running this in home environment.   So far, HDD failures have been very rare 
>> for this environment. =)  It just doesn't get loaded as much!  I am not sure 
>> what to expect, seeing that "unfound" and just a feeling of possibility of 
>> maybe getting OSD back made me excited about it. =) Thanks for letting me 
>> know what should be the priority.  I just lack experience and knowledge in 
>> this. =) Please do continue to guide me though this. 
>> 
>> Thank you for the decode of that smart messages!  I do agree that looks like 
>> it is on its way out.  I would like to know how to get good portion of it 
>> back if possible. =)
>> 
>> I think I just set the size and min_size to 1.
>> # ceph osd lspools
>> 0 data,1 metadata,2 rbd,
>> # ceph osd pool set rbd size 1
>> set pool 2 size to 1
>> # ceph osd pool set rbd min_size 1
>> set pool 2 min_size to 1
>> 
>> Seems to be doing some backfilling work.
>> 
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs 
>> backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 
>> 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs 
>> stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 
>> pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests 
>> are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); 
>> recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 
>> unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD 
>> present but 'sortbitwise' flag is not set
>> 
>> 
>> 
>> Regards,
>> Hong
>> 
>> 
>> On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz > > wrote:
>> 
>> 
>> So to decode few things about your disk:
>> 
>>   1 Raw_Read_Error_Rate0x002f  100  100  051Pre-fail  Always  -  
>> 37
>> 37 read erros and only one sector marked as pending - fun disk :/ 
>> 
>> 181 Program_Fail_Cnt_Total  0x0022  099  099  000Old_age  Always  -  
>> 35325174
>> So firmware has quite few bugs, that’s nice
>> 
>> 191 G-Sense_Error_Rate  0x0022  100  100  000Old_age  Always  -  
>> 2855
>> disk was thrown around while operational even more nice.
>> 
>> 194 Temperature_Celsius0x0002  047  041  000Old_age  Always  -   
>>53 (Min/Max 15/59)
>> if your disk passes 50 you should not consider using it, high temperatures 
>> demagnetise plate layer and you will see more errors in very near future.
>> 
>> 197 Current_Pending_Sector  0x0032  100  100  000Old_age  Always  -  
>> 1
>> as mentioned before :)
>> 
>> 200 Multi_Zone_Error_Rate  0x002a  100  100  000Old_age  Always  -   
>>4222
>> your heads keep missing tracks … bent ? I don’t even know how to comment 
>> here.
>> 
>> 
>> generally fun drive you’ve got there … rescue as much as you can and throw 
>> it away !!!
>> 
>> 
> 
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these 
out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  
Ones that fit the UPS original battery spec didn't last very long... part of 
the reason why I gave up on them.. =P  My wife probably won't like the idea of 
car battery hanging out though ha!
The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard 
doesn't have any additional SATA connectors available.  Would it be safe to add 
another OSD host?
Regards,Hong
 

On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz 
 wrote:
 

 Sorry for being brutal … anyway:
1. get a battery for the UPS (a car battery will do as well; I've modded a UPS in the
past with a truck battery and it was working like a charm :D )
2. get spare drives and put those in, because your cluster CAN NOT get out of the
error state due to lack of space
3. follow the advice of Ronny Aasen on how to recover data from the hard drives
4. get cooling to the drives or you will lose more!


On 28 Aug 2017, at 22:39, hjcho616  wrote:
Tomasz,
Those machines are behind a surge protector.  Doesn't appear to be a good one!  
I do have a UPS... but it is my fault... no battery.  Power was pretty reliable 
for a while... and UPS was just beeping every chance it had, disrupting some 
sleep.. =P  So running on surge protector only.  I am running this in home 
environment.   So far, HDD failures have been very rare for this environment. 
=)  It just doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a feeling of possibility of maybe getting OSD back made 
me excited about it. =) Thanks for letting me know what should be the priority. 
 I just lack experience and knowledge in this. =) Please do continue to guide 
me though this. 
Thank you for the decode of that smart messages!  I do agree that looks like it 
is on its way out.  I would like to know how to get good portion of it back if 
possible. =)
I think I just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1

Seems to be doing some backfilling work.

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
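
For anyone following along, the usual way to dig into a state like that is the
standard ceph CLI of that era (the pg id here is just an example):

# ceph health detail              (lists the individual pgs behind each warning)
# ceph pg dump_stuck unclean
# ceph pg 2.7 query               (peering/recovery state of a single pg)
# ceph pg repair 2.7              (for pgs flagged inconsistent by scrub)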


Regards,Hong 

On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz 
 wrote:
 

 So to decode few things about your disk:

  1 Raw_Read_Error_Rate    0x002f  100  100  051    Pre-fail  Always      -     
 37
37 read erros and only one sector marked as pending - fun disk :/ 

181 Program_Fail_Cnt_Total  0x0022  099  099  000    Old_age  Always      -     
 35325174
So firmware has quite few bugs, that’s nice

191 G-Sense_Error_Rate      0x0022  100  100  000    Old_age  Always      -     
 2855
disk was thrown around while operational even more nice.

194 Temperature_Celsius    0x0002  047  041  000    Old_age  Always      -      
53 (Min/Max 15/59)
if your disk passes 50 you should not consider using it, high temperatures 
demagnetise plate layer and you will see more errors in very near future.

197 Current_Pending_Sector  0x0032  100  100  000    Old_age  Always      -     
 1
as mentioned before :)

200 Multi_Zone_Error_Rate  0x002a  100  100  000    Old_age  Always      -      
4222
your heads keep missing tracks … bent ? I don’t even know how to comment here.


generally fun drive you’ve got there … rescue as much as you can and throw it 
away !!!

   



   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Sorry for being brutal … anyway:
1. get a battery for the UPS (a car battery will do as well; I've modded a UPS in the
past with a truck battery and it was working like a charm :D )
2. get spare drives and put those in, because your cluster CAN NOT get out of the
error state due to lack of space
3. follow the advice of Ronny Aasen on how to recover data from the hard drives
4. get cooling to the drives or you will lose more!


> On 28 Aug 2017, at 22:39, hjcho616  wrote:
> 
> Tomasz,
> 
> Those machines are behind a surge protector.  Doesn't appear to be a good 
> one!  I do have a UPS... but it is my fault... no battery.  Power was pretty 
> reliable for a while... and UPS was just beeping every chance it had, 
> disrupting some sleep.. =P  So running on surge protector only.  I am running 
> this in home environment.   So far, HDD failures have been very rare for this 
> environment. =)  It just doesn't get loaded as much!  I am not sure what to 
> expect, seeing that "unfound" and just a feeling of possibility of maybe 
> getting OSD back made me excited about it. =) Thanks for letting me know what 
> should be the priority.  I just lack experience and knowledge in this. =) 
> Please do continue to guide me though this. 
> 
> Thank you for the decode of that smart messages!  I do agree that looks like 
> it is on its way out.  I would like to know how to get good portion of it 
> back if possible. =)
> 
> I think I just set the size and min_size to 1.
> # ceph osd lspools
> 0 data,1 metadata,2 rbd,
> # ceph osd pool set rbd size 1
> set pool 2 size to 1
> # ceph osd pool set rbd min_size 1
> set pool 2 min_size to 1
> 
> Seems to be doing some backfilling work.
> 
> # ceph health
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs 
> backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 
> 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs 
> stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 
> pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests 
> are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); 
> recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 
> unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD 
> present but 'sortbitwise' flag is not set
> 
> 
> 
> Regards,
> Hong
> 
> 
> On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> So to decode few things about your disk:
> 
>   1 Raw_Read_Error_Rate0x002f  100  100  051Pre-fail  Always  -   
>37
> 37 read erros and only one sector marked as pending - fun disk :/ 
> 
> 181 Program_Fail_Cnt_Total  0x0022  099  099  000Old_age  Always  -   
>35325174
> So firmware has quite few bugs, that’s nice
> 
> 191 G-Sense_Error_Rate  0x0022  100  100  000Old_age  Always  -   
>2855
> disk was thrown around while operational even more nice.
> 
> 194 Temperature_Celsius0x0002  047  041  000Old_age  Always  -
>   53 (Min/Max 15/59)
> if your disk passes 50 you should not consider using it, high temperatures 
> demagnetise plate layer and you will see more errors in very near future.
> 
> 197 Current_Pending_Sector  0x0032  100  100  000Old_age  Always  -   
>1
> as mentioned before :)
> 
> 200 Multi_Zone_Error_Rate  0x002a  100  100  000Old_age  Always  -
>   4222
> your heads keep missing tracks … bent ? I don’t even know how to comment here.
> 
> 
> generally fun drive you’ve got there … rescue as much as you can and throw it 
> away !!!
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz,
Those machines are behind a surge protector.  Doesn't appear to be a good one!  
I do have a UPS... but it is my fault... no battery.  Power was pretty reliable 
for a while... and UPS was just beeping every chance it had, disrupting some 
sleep.. =P  So running on surge protector only.  I am running this in home 
environment.   So far, HDD failures have been very rare for this environment. 
=)  It just doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a feeling of possibility of maybe getting OSD back made 
me excited about it. =) Thanks for letting me know what should be the priority. 
 I just lack experience and knowledge in this. =) Please do continue to guide 
me though this. 
Thank you for the decode of that smart messages!  I do agree that looks like it 
is on its way out.  I would like to know how to get good portion of it back if 
possible. =)
I think I just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1

Seems to be doing some backfilling work.

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set


Regards,Hong 

On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz 
 wrote:
 

 So to decode few things about your disk:

  1 Raw_Read_Error_Rate    0x002f  100  100  051    Pre-fail  Always      -     
 37
37 read erros and only one sector marked as pending - fun disk :/ 

181 Program_Fail_Cnt_Total  0x0022  099  099  000    Old_age  Always      -     
 35325174
So firmware has quite few bugs, that’s nice

191 G-Sense_Error_Rate      0x0022  100  100  000    Old_age  Always      -     
 2855
disk was thrown around while operational even more nice.

194 Temperature_Celsius    0x0002  047  041  000    Old_age  Always      -      
53 (Min/Max 15/59)
if your disk passes 50 you should not consider using it, high temperatures 
demagnetise plate layer and you will see more errors in very near future.

197 Current_Pending_Sector  0x0032  100  100  000    Old_age  Always      -     
 1
as mentioned before :)

200 Multi_Zone_Error_Rate  0x002a  100  100  000    Old_age  Always      -      
4222
your heads keep missing tracks … bent ? I don’t even know how to comment here.


generally fun drive you’ve got there … rescue as much as you can and throw it 
away !!!

   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

> [SNIP - bad drives]

Generally when a disk is displaying bad blocks to the OS, the drive has been
remapping blocks for ages in the background, and the disk is really on its last
legs.  It is a bit unlikely that you get so many disks dying at the same time,
though; but the problem can have been silently worsening and was not really
noticed until the osds had to restart due to the power loss.



If this is _very_ important data I would recommend you start by taking
the bad drives out of operation and cloning each bad drive block by
block onto a good one using dd_rescue. It is also a good idea to store an
image of the disk so you can try the different rescue methods several
times.  In the very worst case, send the disk to a professional data
recovery company.


Once that is done, you have 2 options:
Try to make the osd run again: xfs_repair, plus manually finding corrupt
objects (find + md5sum, looking for read errors) and deleting them, has
helped me in the past. If you manage to get the osd to run, drain it by
setting its crush weight to 0, and eventually remove the disk from the cluster.
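
A minimal sketch of that check and the drain (assuming a filestore osd mounted at
the default path and osd.4 as the suspect; the object path below is hypothetical):

# find /var/lib/ceph/osd/ceph-4/current -type f -exec md5sum {} \; 2>&1 | grep -i 'input/output error'
# rm '/var/lib/ceph/osd/ceph-4/current/2.7_head/<offending object>'
# ceph osd crush reweight osd.4 0        (once the osd runs again, drain it)

Reading every object with md5sum forces the read errors to surface, so the grep
shows exactly which files are sitting on bad sectors.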

Alternatively, if you can not get the osd running again:
use ceph-objectstore-tool to extract objects and inject them using a
clean node and osd, like described in
http://ceph.com/geen-categorie/incomplete-pgs-oh-my/ . Read the man page
and the help for the tool; I think the arguments have changed slightly since
that blogpost.
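
Roughly like this (a sketch, not Ronny's exact commands; the osd numbers, pg id and
file path are placeholders, the osds involved must be stopped, and the exact flags
differ between releases, so check ceph-objectstore-tool --help first):

# ceph-objectstore-tool --op export --pgid 2.7 \
    --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal \
    --file /mnt/rescue/2.7.export
# ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-10 --journal-path /var/lib/ceph/osd/ceph-10/journal \
    --file /mnt/rescue/2.7.export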


you may also run into read errors on corrupt objects, stopping your 
export.  in that case rm the offending object and rerun the export.

repeat for all bad drives.

When doing the inject it is important that your cluster is operational
and able to accept objects from the draining drive, so either set the
minimal replication failure domain (crush chooseleaf type) to OSD, or,
even better, add more osd nodes to make an operational cluster (with
missing objects).



Also, I see in your log you have os-prober testing all partitions. I tend
to remove os-prober on machines that do not dual-boot with another OS.
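
On a Debian-based host that is a one-liner (assuming apt is the package manager):

# apt-get purge os-prober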


Rules of thumb for future ceph clusters:
min_size=2 is there for a reason; it should never be 1 unless data loss is wanted.
size=3 if you need the cluster to keep operating with a drive or node in an
error state. size=2 gives you more space, but the cluster will block on
errors until the recovery is done; better to be blocking than losing data.
If you have size=3 and 3 nodes and you lose a node, then your cluster
can not self-heal. You should have more nodes than you have set size to.
Have free space on the drives; this is where data is replicated to in case
of a down node. If you have 4 nodes and you want to be able to lose
one and still operate, you need leftover room on your 3 remaining nodes
to cover for the lost one. The more nodes you have, the smaller the impact
of a node failure is, and the less spare room is needed.  For a 4-node
cluster you should not fill more than 66% if you want to be able to
self-heal + operate.
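
In command form, once there is enough capacity to return to a sane replication
level (using the rbd pool named earlier in this thread):

# ceph osd pool set rbd size 3
# ceph osd pool set rbd min_size 2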




good luck
Ronny Aasen


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
So, to decode a few things about your disk:

  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       37
37 read errors and only one sector marked as pending - fun disk :/

181 Program_Fail_Cnt_Total  0x0022   099   099   000    Old_age   Always       -       35325174
So the firmware has quite a few bugs, that's nice.

191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       2855
The disk was thrown around while operational; even more nice.

194 Temperature_Celsius     0x0002   047   041   000    Old_age   Always       -       53 (Min/Max 15/59)
If your disk passes 50 you should not consider using it; high temperatures
demagnetise the platter layer and you will see more errors in the very near future.

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
As mentioned before :)

200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       4222
Your heads keep missing tracks … bent? I don't even know how to comment here.


Generally a fun drive you've got there … rescue as much as you can and throw it
away !!!
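
For reference, these attributes come from smartmontools; a quick way to pull the
table and run a surface test on the suspect drive (assuming /dev/sdb, as in the
smartd log further down the thread):

# smartctl -A /dev/sdb             (the attribute table quoted above)
# smartctl -t long /dev/sdb        (start an extended offline self-test)
# smartctl -l selftest /dev/sdb    (read the result once it finishes)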
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you are looking at something more like this :

https://www.google.co.uk/imgres?imgurl=https%3A%2F%2Fthumbs.dreamstime.com%2Fz%2Fhard-drive-being-destroyed-hammer-16668693.jpg=https%3A%2F%2Fwww.dreamstime.com%2Fstock-photos-hard-drive-being-destroyed-hammer-image16668693=Ofi7hHnUFmPsyM=Ak6YfqQVvZWCsM%3A=10ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA..i=1300=1130=safari=1116=1920=hdd%20hammer=0ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA=mrc=8

:P

I'll sound brusque now, so buckle up.

You really need to set your priorities straight now. If you want to rescue a
disk that has a pending sector, you are setting yourself up for failure. You said
that several consecutive power outages killed your cluster, yet you showed no
concern about investing in at least a surge protector (outages can create
surges, or a surge can cut the power by burning fuses in the substation), which
actually causes hardware failures. I've fired at you a control statement about
how to save your data, but you keep returning trying to save some osd's, which
to me looks like you don't really care about your data.

If any of those were at any point on your mind, you would shut down those 
systems to limit possibility of another outage destroy more data. You would 
protect your self via a simple surge protector or at least basic UPS. You would 
borrow / beg / steal to get a spare hard drive and backup your stuff. You would 
set your pool size and min size to 1, let the cluster get out of warning state 
and only then you will be able to mount it to attempt data recovery. You would 
check remaining disks for SMART errors.

Then and only then you can start playing with as complex repairs. 

Right now you are in the middle of a shit creek without a paddle. 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you are looking at something more like this :

https://www.google.co.uk/imgres?imgurl=https%3A%2F%2Fthumbs.dreamstime.com%2Fz%2Fhard-drive-being-destroyed-hammer-16668693.jpg=https%3A%2F%2Fwww.dreamstime.com%2Fstock-photos-hard-drive-being-destroyed-hammer-image16668693=Ofi7hHnUFmPsyM=Ak6YfqQVvZWCsM%3A=10ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA..i=1300=1130=safari=1116=1920=hdd%20hammer=0ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA=mrc=8
 


:P

I’ll sound brusk now so buckle up.

You really need to set your priorities straight now, if you want to rescue a 
disk that has a pending sector you set your self up for failure. You said that 
several consecutive power outage killed your cluster, yet you did show no 
concern of investing in at least anti surge protector (outages can create 
surges, or a surge can cut the power by burning fuses in sub station) which 
actually cause hardware failures. I’ve fired at you a control statement about 
how to save your data, but you keep returning trying to save some osd’s - which 
to me look like you don’t really care about your data.

If any of those were at any point on your mind, you would shut down those 
systems to limit possibility of another outage destroy more data. You would 
protect your self via a simple surge protector or at least basic UPS. You would 
borrow / beg / steal to get a spare hard drive and backup your stuff. You would 
set your pool size and min size to 1, let the cluster get out of warning state 
and only then you will be able to mount it to attempt data recovery. You would 
check remaining disks for SMART errors.

Then and only then you can start playing with as complex repairs. 

Right now you are in the middle of a shit creek without a paddle. 


> On 28 Aug 2017, at 21:45, hjcho616  wrote:
> 
> So.. would doing something like this could potentially bring it back to life? 
> =)
> 
> Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki 
> 
> 
> 
> Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki
>  
> 
> 
> 
> 
> On Monday, August 28, 2017 3:24 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> I think you’ve got your anwser:
> 
> 197 Current_Pending_Sector  0x0032   100   100   000Old_age   Always  
>  -   1
> 
>> On 28 Aug 2017, at 21:22, hjcho616 > > wrote:
>> 
>> Steve,
>> 
>> I thought that was odd too.. 
>> 
>> Below is from the log, This captures transition from good to bad. Looks like 
>> there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors". 
>>  And looks like I did a repair with /dev/sdb1... =P
>> 
>> # grep sdb syslog.1
>> Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 43
>> Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 43 to 45
>> Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 45 to 44
>> Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 45
>> Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 45 to 44
>> Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 45
>> Aug 27 13:53:34 OSD1 kernel: [1.454082] sd 1:0:0:0: [sdb] 3907029168 
>> 512-byte logical blocks: (2.00 TB/1.82 TiB)
>> Aug 27 13:53:34 OSD1 kernel: [1.454447] sd 1:0:0:0: [sdb] Write Protect 
>> is off
>> Aug 27 13:53:34 OSD1 kernel: [1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 
>> 3a 00 00
>> Aug 27 13:53:34 OSD1 kernel: [1.454488] sd 1:0:0:0: [sdb] Write cache: 
>> enabled, read cache: enabled, doesn't support DPO or FUA
>> Aug 27 13:53:34 OSD1 kernel: [1.501349]  sdb: sdb1
>> Aug 27 13:53:34 OSD1 kernel: [1.501796] sd 1:0:0:0: [sdb] Attached SCSI 
>> disk
>> Aug 27 13:53:34 OSD1 kernel: [4.033081] XFS (sdb1): Mounting V4 
>> Filesystem
>> Aug 27 13:53:34 OSD1 kernel: [4.207191] XFS (sdb1): Starting recovery 
>> (logdev: internal)
>> Aug 27 13:53:34 OSD1 kernel: [5.656298] XFS (sdb1): Ending recovery 
>> (logdev: internal)
>> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 
>> 'scsi' 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
So.. would doing something like this could potentially bring it back to life? =)
Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki
  

On Monday, August 28, 2017 3:24 PM, Tomasz Kusmierz 
 wrote:
 

 I think you've got your answer:
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1

On 28 Aug 2017, at 21:22, hjcho616  wrote:
Steve,
I thought that was odd too.. 
Below is from the log, This captures transition from good to bad. Looks like 
there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
And looks like I did a repair with /dev/sdb1... =P
# grep sdb syslog.1Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], 
SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 43Aug 27 
06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 
Temperature_Celsius changed from 43 to 45Aug 27 07:27:21 OSD1 smartd[1031]: 
Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed 
from 45 to 44
Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 13:53:34 OSD1 kernel: [    1.454082] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Aug 27 13:53:34 OSD1 kernel: [    1.454447] sd 1:0:0:0: [sdb] Write Protect is off
Aug 27 13:53:34 OSD1 kernel: [    1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 27 13:53:34 OSD1 kernel: [    1.454488] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 27 13:53:34 OSD1 kernel: [    1.501349]  sdb: sdb1
Aug 27 13:53:34 OSD1 kernel: [    1.501796] sd 1:0:0:0: [sdb] Attached SCSI disk
Aug 27 13:53:34 OSD1 kernel: [    4.033081] XFS (sdb1): Mounting V4 Filesystem
Aug 27 13:53:34 OSD1 kernel: [    4.207191] XFS (sdb1): Starting recovery (logdev: internal)
Aug 27 13:53:34 OSD1 kernel: [    5.656298] XFS (sdb1): Ending recovery (logdev: internal)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd database: SAMSUNG SpinPoint F4 EG (AF)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using smartmontools or hdparm with this
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: exiting
Aug 27

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you’ve got your answer:

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
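
If you want to double check, something like this on OSD1 should show it (assuming the suspect disk is still /dev/sdb; smartctl comes from smartmontools):

# smartctl -A /dev/sdb | egrep 'Current_Pending_Sector|Offline_Uncorrectable|Reallocated_Sector'
# smartctl -l error /dev/sdb        # any logged ATA errors
# smartctl -t long /dev/sdb         # start a long self-test, read the result later with "smartctl -l selftest /dev/sdb"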

> On 28 Aug 2017, at 21:22, hjcho616  wrote:
> 
> Steve,
> 
> I thought that was odd too.. 
> 
> Below is from the log. This captures the transition from good to bad. Looks like 
> there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
> And looks like I did a repair with /dev/sdb1... =P
> 
> # grep sdb syslog.1
> Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 43
> Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 43 to 45
> Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 45
> Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 45
> Aug 27 13:53:34 OSD1 kernel: [1.454082] sd 1:0:0:0: [sdb] 3907029168 
> 512-byte logical blocks: (2.00 TB/1.82 TiB)
> Aug 27 13:53:34 OSD1 kernel: [1.454447] sd 1:0:0:0: [sdb] Write Protect 
> is off
> Aug 27 13:53:34 OSD1 kernel: [1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 
> 3a 00 00
> Aug 27 13:53:34 OSD1 kernel: [1.454488] sd 1:0:0:0: [sdb] Write cache: 
> enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 27 13:53:34 OSD1 kernel: [1.501349]  sdb: sdb1
> Aug 27 13:53:34 OSD1 kernel: [1.501796] sd 1:0:0:0: [sdb] Attached SCSI 
> disk
> Aug 27 13:53:34 OSD1 kernel: [4.033081] XFS (sdb1): Mounting V4 Filesystem
> Aug 27 13:53:34 OSD1 kernel: [4.207191] XFS (sdb1): Starting recovery 
> (logdev: internal)
> Aug 27 13:53:34 OSD1 kernel: [5.656298] XFS (sdb1): Ending recovery 
> (logdev: internal)
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' 
> to 'sat'
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, 
> S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd 
> database: SAMSUNG SpinPoint F4 EG (AF)
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using 
> smartmontools or hdparm with this
> Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. 
> Adding to "monitor" list.
> Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from 
> /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
> Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to 
> /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ 
> partition: exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/90linux-distro on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Steve,
I thought that was odd too.. 
Below is from the log. This captures the transition from good to bad. Looks like 
there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
And looks like I did a repair with /dev/sdb1... =P
# grep sdb syslog.1
Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 43
Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 43 to 45
Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 13:53:34 OSD1 kernel: [    1.454082] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Aug 27 13:53:34 OSD1 kernel: [    1.454447] sd 1:0:0:0: [sdb] Write Protect is off
Aug 27 13:53:34 OSD1 kernel: [    1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 27 13:53:34 OSD1 kernel: [    1.454488] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 27 13:53:34 OSD1 kernel: [    1.501349]  sdb: sdb1
Aug 27 13:53:34 OSD1 kernel: [    1.501796] sd 1:0:0:0: [sdb] Attached SCSI disk
Aug 27 13:53:34 OSD1 kernel: [    4.033081] XFS (sdb1): Mounting V4 Filesystem
Aug 27 13:53:34 OSD1 kernel: [    4.207191] XFS (sdb1): Starting recovery (logdev: internal)
Aug 27 13:53:34 OSD1 kernel: [    5.656298] XFS (sdb1): Ending recovery (logdev: internal)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd database: SAMSUNG SpinPoint F4 EG (AF)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using smartmontools or hdparm with this
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/90linux-distro on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/90solaris on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:53:06 OSD1 os-prober: debug: running

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Steve Taylor
I'm jumping in a little late here, but running xfs_repair on your partition 
can't frag your partition table. The partition table lives outside the 
partition block device and xfs_repair doesn't have access to it when run 
against /dev/sdb1. I haven't actually tested it, but it seems unlikely that 
running xfs_repair on /dev/sdb would do it either. I would assume it would just 
give you an error about /dev/sdb not containing an XFS filesystem. That's a 
guess though. I haven't ever tried anything like that.

Are you sure there isn't physical damage to the disk? I wouldn't say it's 
common, but power outages can do that. You can run 'dmesg | grep sdb' and 
'smartctl -a /dev/sdb' to see if there are kernel errors or SMART errors 
indicative of physical problems. If the disk is physically sound and the 
partition table really has been fragged, you may be able to restore it from the 
backup at the end of the disk, assuming it's GPT. If you can't find a partition 
or a filesystem somehow, then you're probably out of luck as far as retrieving 
any objects from that OSD. If the disk is physically damaged and your partition 
is gone, then it probably isn't worth wasting additional time on it.
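
A quick way to check both, roughly (adjust the device name as needed; gdisk is from the gdisk package):

# dmesg | grep -i sdb          # kernel-level I/O or link errors
# smartctl -a /dev/sdb         # SMART health, error log, pending/reallocated sectors
# gdisk -l /dev/sdb            # prints the partition table and complains if the main or backup GPT is damaged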






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 19:18 +, hjcho616 wrote:
Tomasz,

Looks like when I did xfs_repair -L /dev/sdb1 it did something to partition 
table and I don't see /dev/sdb1 anymore... or maybe I missed 1 in the 
/dev/sdb1? =(. Yes.. that extra power outage did a pretty good damage... =P  I 
am hoping 0.007% is very small...=P  Any recommendations on fixing xfs 
partition I am missing? =)

Ronny,

Thank you for that link!

No I haven't done anything to osds... not touching them, hoping that I can 
revive some of them.. =)  Only thing done is trying to start and stop them..

Below are the links to newer files with just one start attempt. =)
ceph-osd.3_single.log
ceph-osd.4_single.log
ceph-osd.5_single.log
ceph-osd.8_single.log


Regards,
Hong


On Monday, August 28, 2017 12:53 PM, Ronny Aasen  
wrote:


comments inline

On 28.08.2017 18:31, hjcho616 wrote:


I'll see what I can do on that... Looks like I may have to add another OSD host 
as I utilized all of the SATA ports on those boards. =P

Ronny,

I am running with size=2 min_size=1.  I created everything with ceph-deploy and 
didn't touch much of that pool settings...  I hope not, but sounds like I may 
have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P


This is a bad idea as you have found out. once your cluster is healthy you 
should look at improving this.

The dead osd.3 message is probably me trying to stop and start the osd.  There 
were some cases where stop didn't kill the ceph-osd process.  I just started or 
restarted osd to try and see if that worked..  After that, there were some 
reboots and I am not seeing those messages after it...


when providing logs. try to move away the old one. do a single startup. and 
post that. it makes it easier to read when you have a single run in the file.


This is 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz,
Looks like when I did xfs_repair -L /dev/sdb1 it did something to partition 
table and I don't see /dev/sdb1 anymore... or maybe I missed 1 in the 
/dev/sdb1? =(. Yes.. that extra power outage did a pretty good damage... =P  I 
am hoping 0.007% is very small...=P  Any recommendations on fixing xfs 
partition I am missing? =)
Ronny,
Thank you for that link!
No I haven't done anything to osds... not touching them, hoping that I can 
revive some of them.. =)  Only thing done is trying to start and stop them..
Below are the links to newer files with just one start attempt. =)
ceph-osd.3_single.log
ceph-osd.4_single.log
ceph-osd.5_single.log
ceph-osd.8_single.log

Regards,
Hong

On Monday, August 28, 2017 12:53 PM, Ronny Aasen 
 wrote:
 

  comments inline
 
 On 28.08.2017 18:31, hjcho616 wrote:
  
 
 
  I'll see what I can do on that... Looks like I may have to add another OSD 
host as I utilized all of the SATA ports on those boards. =P 
  Ronny, 
  I am running with size=2 min_size=1.  I created everything with ceph-deploy 
and didn't touch much of that pool settings...  I hope not, but sounds like I 
may have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P 
   
 
 This is a bad idea as you have found out. once your cluster is healthy you 
should look at improving this.
 
 
  The dead osd.3 message is probably me trying to stop and start the osd.  
There were some cases where stop didn't kill the ceph-osd process.  I just 
started or restarted osd to try and see if that worked..  After that, there 
were some reboots and I am not seeing those messages after it... 
   
 
 when providing logs. try to move away the old one. do a single startup. and 
post that. it makes it easier to read when you have a single run in the file.
 
 
  
  This is something I am running at home.  I am the only user.  In a way it is 
production environment but just driven by me. =) 
  Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and 
osd.8 come back up without removing them?  I have a feeling I can get some data 
back with some of them intact.  
 
 just incase you are not able to make them run again, does not automatically 
mean the data is lost. i have successfully recovered lost object using these 
instructions  http://ceph.com/geen-categorie/incomplete-pgs-oh-my/  
 
 I would start by  renaming the osd's log file, do a single try at starting the 
osd. and posting that log. have you done anything to the osd's that could make 
them not run ? 
 
 kind regards
 Ronny Aasen
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

comments inline

On 28.08.2017 18:31, hjcho616 wrote:



I'll see what I can do on that... Looks like I may have to add another 
OSD host as I utilized all of the SATA ports on those boards. =P


Ronny,

I am running with size=2 min_size=1.  I created everything with 
ceph-deploy and didn't touch much of that pool settings...  I hope 
not, but sounds like I may have lost some files!  I do want some of 
those OSDs to come back online somehow... to get that confidence level 
up. =P




This is a bad idea, as you have found out. Once your cluster is healthy 
you should look at improving this.


The dead osd.3 message is probably me trying to stop and start the 
osd.  There were some cases where stop didn't kill the ceph-osd 
process.  I just started or restarted osd to try and see if that 
worked..  After that, there were some reboots and I am not seeing 
those messages after it...




When providing logs, try to move away the old one, do a single startup, 
and post that. It makes it easier to read when you have a single run in 
the file.




This is something I am running at home.  I am the only user.  In a way 
it is production environment but just driven by me. =)


Do you have any suggestions to get any of those osd.3, osd.4, osd.5, 
and osd.8 come back up without removing them?  I have a feeling I can 
get some data back with some of them intact.


Just in case you are not able to make them run again, that does not 
automatically mean the data is lost. I have successfully recovered lost 
objects using these instructions: 
http://ceph.com/geen-categorie/incomplete-pgs-oh-my/
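
The heavy lifting in that procedure is done by ceph-objectstore-tool: stop the 
osd, export the PG you need, then import it into a healthy osd. Roughly (the 
pgid and osd numbers are placeholders, and this assumes systemd units):

# systemctl stop ceph-osd@4
# ceph-objectstore-tool --op export --pgid <pgid> \
    --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal \
    --file /tmp/<pgid>.export
# ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-<n> --journal-path /var/lib/ceph/osd/ceph-<n>/journal \
    --file /tmp/<pgid>.export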


I would start by renaming the osd's log file, doing a single try at 
starting the osd, and posting that log. Have you done anything to the 
osd's that could make them not run?
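
For example, for osd.3 (assuming systemd and the default log location):

# systemctl stop ceph-osd@3
# mv /var/log/ceph/ceph-osd.3.log /var/log/ceph/ceph-osd.3.log.old
# systemctl start ceph-osd@3
# less /var/log/ceph/ceph-osd.3.log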


kind regards
Ronny Aasen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Sorry mate I’ve just noticed the 
"unfound (0.007%)”
I think that your main culprit here is osd.0. You need to have all osd’s on one 
host to get all the data back.

Also, for the time being I would just change size and min_size down to 1 and try to 
figure out which osds you actually need to get all the data. Then try to fix 
your machine problems. From my experience, regardless of the solution, when you are 
in degraded mode and try to fix stuff, things only get worse.
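
That would be something like the following (pool name is a placeholder; with a 
single copy any further disk failure means data loss, so only do it while you 
sort things out):

# ceph osd pool set <poolname> size 1
# ceph osd pool set <poolname> min_size 1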


> On 28 Aug 2017, at 17:31, hjcho616  wrote:
> 
> Thank you all for suggestions!
> 
> Maged,
> 
> I'll see what I can do on that... Looks like I may have to add another OSD 
> host as I utilized all of the SATA ports on those boards. =P
> 
> Ronny,
> 
> I am running with size=2 min_size=1.  I created everything with ceph-deploy 
> and didn't touch much of that pool settings...  I hope not, but sounds like I 
> may have lost some files!  I do want some of those OSDs to come back online 
> somehow... to get that confidence level up. =P
> 
> The dead osd.3 message is probably me trying to stop and start the osd.  
> There were some cases where stop didn't kill the ceph-osd process.  I just 
> started or restarted osd to try and see if that worked..  After that, there 
> were some reboots and I am not seeing those messages after it...
> 
> Tomasz,
> 
> This is something I am running at home.  I am the only user.  In a way it is 
> production environment but just driven by me. =)
> 
> Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and 
> osd.8 come back up without removing them?  I have a feeling I can get some 
> data back with some of them intact.
> 
> Thank you!
> 
> Regards,
> Hong
> 
> 
> On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz  
> wrote:
> 
> 
> Personally I would suggest to:
> - change minimal replication type to OSD (from default host)
> - remove the OSD from the host with all those "down OSD’s" (note that they 
> are down not out which makes it more weird)
> - let single node cluster stabilise, yes performance will suck but at least 
> you will have data on two copies on singular host … better this than nothing.
> - fix whatever issues you have on host OSD2 
> - add all osd on OSD2 and mark all osd from OSD1 with weight 0 - this will 
> make ceph migrate all data away from host OSD1
> - fix all the problem you’ve got on host OSD1 
> 
> reason I suggest that is that is seems that you’ve got issues everywhere and 
> since you are running a production environment (at least it seem like that to 
> me) data and down time is main priority.
> 
> > On 28 Aug 2017, at 11:58, Ronny Aasen  > > wrote:
> > 
> > On 28. aug. 2017 08:01, hjcho616 wrote:
> >> Hello!
> >> I've been using ceph for long time mostly for network CephFS storage, even 
> >> before Argonaut release!  It's been working very well for me.  Yes, I had 
> >> some power outtages before and asked few questions on this list before and 
> >> got resolved happily!  Thank you all!
> >> Not sure why but we've been having quite a bit of power outages lately.  
> >> Ceph appear to be running OK with those going on.. so I was pretty happy 
> >> and didn't thought much of it... till yesterday, When I started to move 
> >> some videos to cephfs, ceph decided that it was full although df showed 
> >> only 54% utilization!  Then I looked up, some of the osds were down! (only 
> >> 3 at that point!)
> >> I am running pretty simple ceph configuration... I have one machine 
> >> running MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 
> >> SSD for journal named OSD1 and OSD2.
> >> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's 
> >> log file and googled some of them... they appeared to be tied to version 
> >> 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't solve my 
> >> problems.. =P  While looking at some of this.. there was another power 
> >> outage!  D'oh!  I may need to invest in a UPS or something... Until this 
> >> happened, all of the osd down were from OSD2.  But OSD1 took a hit!  
> >> Couldn't boot, because osd-0 was damaged... I tried xfs_repair -L 
> >> /dev/sdb1 as suggested by command line.. I was able to mount it again, 
> >> phew, reboot... then /dev/sdb1 is no longer accessible!  N!!!
> >> So this is what I have today!  I am a bit concerned as half of the osds 
> >> are down!  and osd.0 doesn't look good at all...
> >> # ceph osd tree
> >> ID WEIGHT  TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> -1 16.24478 root default
> >> -2  8.12239host OSD1
> >>  1  1.95250osd.1  up  1.0  1.0
> >>  0  1.95250osd.0down0  1.0
> >>  7  0.31239osd.7  up  1.0  1.0
> >>  6  1.95250osd.6  up  1.0  1.0
> >>  2  1.95250osd.2  up  1.0  1.0
> >> -3  8.12239host 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Thank you all for suggestions!
Maged,
I'll see what I can do on that... Looks like I may have to add another OSD host 
as I utilized all of the SATA ports on those boards. =P
Ronny,
I am running with size=2 min_size=1.  I created everything with ceph-deploy and 
didn't touch much of that pool settings...  I hope not, but sounds like I may 
have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P
The dead osd.3 message is probably me trying to stop and start the osd.  There 
were some cases where stop didn't kill the ceph-osd process.  I just started or 
restarted osd to try and see if that worked..  After that, there were some 
reboots and I am not seeing those messages after it...
Tomasz,
This is something I am running at home.  I am the only user.  In a way it is 
production environment but just driven by me. =)
Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and osd.8 
come back up without removing them?  I have a feeling I can get some data back 
with some of them intact.
Thank you!
Regards,
Hong

On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz 
 wrote:
 

Personally I would suggest to:
- change the minimal replication type to OSD (from the default host)
- remove the OSDs from the host with all those "down OSD’s" (note that they are down, not out, which makes it more weird)
- let the single-node cluster stabilise; yes, performance will suck, but at least you will have the data in two copies on a single host … better this than nothing.
- fix whatever issues you have on host OSD2
- add all osds on OSD2 and mark all osds from OSD1 with weight 0 (see the example below) - this will make ceph migrate all data away from host OSD1
- fix all the problems you’ve got on host OSD1
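
The weight-0 step is just a crush reweight per osd; for the osds shown on host OSD1 in your tree that would be roughly:

# for id in 0 1 2 6 7; do ceph osd crush reweight osd.$id 0; done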

The reason I suggest that is that it seems you’ve got issues everywhere, and 
since you are running a production environment (at least it seems like that to 
me) data and downtime are the main priority.

> On 28 Aug 2017, at 11:58, Ronny Aasen  wrote:
> 
> On 28. aug. 2017 08:01, hjcho616 wrote:
>> Hello!
>> I've been using ceph for long time mostly for network CephFS storage, even 
>> before Argonaut release!  It's been working very well for me.  Yes, I had 
>> some power outtages before and asked few questions on this list before and 
>> got resolved happily!  Thank you all!
>> Not sure why but we've been having quite a bit of power outages lately.  
>> Ceph appear to be running OK with those going on.. so I was pretty happy and 
>> didn't thought much of it... till yesterday, When I started to move some 
>> videos to cephfs, ceph decided that it was full although df showed only 54% 
>> utilization!  Then I looked up, some of the osds were down! (only 3 at that 
>> point!)
>> I am running pretty simple ceph configuration... I have one machine running 
>> MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 SSD for 
>> journal named OSD1 and OSD2.
>> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's 
>> log file and googled some of them... they appeared to be tied to version 
>> 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't solve my 
>> problems.. =P  While looking at some of this.. there was another power 
>> outage!  D'oh!  I may need to invest in a UPS or something... Until this 
>> happened, all of the osd down were from OSD2.  But OSD1 took a hit!  
>> Couldn't boot, because osd-0 was damaged... I tried xfs_repair -L /dev/sdb1 
>> as suggested by command line.. I was able to mount it again, phew, reboot... 
>> then /dev/sdb1 is no longer accessible!  N!!!
>> So this is what I have today!  I am a bit concerned as half of the osds are 
>> down!  and osd.0 doesn't look good at all...
>> # ceph osd tree
>> ID WEIGHT  TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 16.24478 root default
>> -2  8.12239    host OSD1
>>  1  1.95250        osd.1      up  1.0          1.0
>>  0  1.95250        osd.0    down        0          1.0
>>  7  0.31239        osd.7      up  1.0          1.0
>>  6  1.95250        osd.6      up  1.0          1.0
>>  2  1.95250        osd.2      up  1.0          1.0
>> -3  8.12239    host OSD2
>>  3  1.95250        osd.3    down        0          1.0
>>  4  1.95250        osd.4    down        0          1.0
>>  5  1.95250        osd.5    down        0          1.0
>>  8  1.95250        osd.8    down        0          1.0
>>  9  0.31239        osd.9      up  1.0          1.0
>> This looked alot better before that last extra power outage... =(  Can't 
>> mount it anymore!
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
>> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs 
>> inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs 
>> stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159 
>> pgs 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

On 28. aug. 2017 08:01, hjcho616 wrote:

Hello!

I've been using ceph for long time mostly for network CephFS storage, 
even before Argonaut release!  It's been working very well for me.  Yes, 
I had some power outages before and asked a few questions on this list 
before and got resolved happily!  Thank you all!


Not sure why but we've been having quite a bit of power outages lately. 
  Ceph appeared to be running OK with those going on.. so I was pretty 
happy and didn't think much of it... till yesterday, when I started to 
move some videos to cephfs, ceph decided that it was full although df 
showed only 54% utilization!  Then I looked up, some of the osds were 
down! (only 3 at that point!)


I am running pretty simple ceph configuration... I have one machine 
running MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 
SSD for journal named OSD1 and OSD2.


At the time, I was running jewel 10.2.2. I looked at some of downed 
OSD's log file and googled some of them... they appeared to be tied to 
version 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't 
solve my problems.. =P  While looking at some of this.. there was 
another power outage!  D'oh!  I may need to invest in a UPS or 
something... Until this happened, all of the osd down were from OSD2. 
  But OSD1 took a hit!  Couldn't boot, because osd-0 was damaged... I 
tried xfs_repair -L /dev/sdb1 as suggested by command line.. I was able 
to mount it again, phew, reboot... then /dev/sdb1 is no longer 
accessible!  N!!!


So this is what I have today!  I am a bit concerned as half of the osds 
are down!  and osd.0 doesn't look good at all...

# ceph osd tree
ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 16.24478 root default
-2  8.12239 host OSD1
  1  1.95250 osd.1  up  1.0  1.0
  0  1.95250 osd.0down0  1.0
  7  0.31239 osd.7  up  1.0  1.0
  6  1.95250 osd.6  up  1.0  1.0
  2  1.95250 osd.2  up  1.0  1.0
-3  8.12239 host OSD2
  3  1.95250 osd.3down0  1.0
  4  1.95250 osd.4down0  1.0
  5  1.95250 osd.5down0  1.0
  8  1.95250 osd.8down0  1.0
  9  0.31239 osd.9  up  1.0  1.0

This looked alot better before that last extra power outage... =(  Can't 
mount it anymore!

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 
pgs inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 
16 pgs stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck 
stale; 159 pgs stuck unclean; 102 pgs stuck undersized; 102 pgs 
undersized; 1 requests are blocked > 32 sec; recovery 1803466/4503980 
objects degraded (40.042%); recovery 692976/4503980 objects misplaced 
(15.386%); recovery 147/2251990 unfound (0.007%); 1 near full osd(s); 54 
scrub errors; mds cluster is degraded; no legacy OSD present but 
'sortbitwise' flag is not set


Each of osds are showing different failure signature.

I've uploaded osd log with debug osd = 20, debug filestore = 20, and 
debug ms = 20.  You can find it in below links.  Let me know if there is 
preferred way to share this!

https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc (ceph-osd.3.log)
https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc (ceph-osd.4.log)
https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k (ceph-osd.5.log)
https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE (ceph-osd.8.log)

So how does this look?  Can this be fixed? =)  If so please let me know. 
  I used to take backups but since it grew so big, I wasn't able to do 
so anymore... and would like to get most of these back if I can.  Please 
let me know if you need more info!


Thank you!

Regards,
Hong




With only 2 osd hosts, how are you doing replication? I assume you use 
size=2, and that is somewhat OK if you have min_size=2, but if you have 
min_size=1 it can quickly become a big problem of lost objects.


With size=2, min_size=2 your data should be on 2 drives safely (if you 
can get one of them running again), but your cluster will block when 
there is an issue.
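
You can check what the pools are actually set to with, for example:

# ceph osd dump | grep 'replicated size'
# ceph osd pool get <poolname> size
# ceph osd pool get <poolname> min_size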


If at all possible I would add a third osd node to your cluster, so your 
OK PGs can replicate to it and you can work on the down osd's without 
fear of losing additional working osd's.


Also some of your logs contain lines like...

failed to bind the UNIX domain socket to 
'/var/run/ceph/ceph-osd.3.asok': (17) File exists


filestore(/var/lib/ceph/osd/ceph-3) lock_fsid failed to lock 
/var/lib/ceph/osd/ceph-3/fsid, is another ceph-osd still running? (11) 
Resource temporarily unavailable
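
Those usually just mean a previous ceph-osd process (or a stale admin socket) 
is still hanging around for that osd id; worth checking with something like:

# ps aux | grep '[c]eph-osd'
# systemctl status ceph-osd@3
# ls -l /var/run/ceph/ceph-osd.3.asok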


7faf16e23800 -1 osd.3 0 OSD::pre_init: object store 
'/var/lib/ceph/osd/ceph-3' is 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Maged Mokhtar
I would suggest either adding 1 new disk on each of the 2 machines or
increasing the osd_backfill_full_ratio to something like 90 or 92 from the
default 85.
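
If you go the ratio route, it can be bumped at runtime without restarting the osds, e.g.:

# ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'

(and put "osd backfill full ratio = 0.92" in ceph.conf if you want it to survive 
restarts). Note this only relaxes the backfill threshold, it does not create any space.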

/Maged  

On 2017-08-28 08:01, hjcho616 wrote:

> Hello! 
> 
> I've been using ceph for long time mostly for network CephFS storage, even 
> before Argonaut release!  It's been working very well for me.  Yes, I had 
> some power outtages before and asked few questions on this list before and 
> got resolved happily!  Thank you all! 
> 
> Not sure why but we've been having quite a bit of power outages lately.  Ceph 
> appear to be running OK with those going on.. so I was pretty happy and 
> didn't thought much of it... till yesterday, When I started to move some 
> videos to cephfs, ceph decided that it was full although df showed only 54% 
> utilization!  Then I looked up, some of the osds were down! (only 3 at that 
> point!) 
> 
> I am running pretty simple ceph configuration... I have one machine running 
> MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 SSD for 
> journal named OSD1 and OSD2. 
> 
> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's log 
> file and googled some of them... they appeared to be tied to version 10.2.2.  
> So I just upgraded all to 10.2.9.  Well that didn't solve my problems.. =P  
> While looking at some of this.. there was another power outage!  D'oh!  I may 
> need to invest in a UPS or something... Until this happened, all of the osd 
> down were from OSD2.  But OSD1 took a hit!  Couldn't boot, because osd-0 was 
> damaged... I tried xfs_repair -L /dev/sdb1 as suggested by command line.. I 
> was able to mount it again, phew, reboot... then /dev/sdb1 is no longer 
> accessible!  N!!! 
> 
> So this is what I have today!  I am a bit concerned as half of the osds are 
> down!  and osd.0 doesn't look good at all... 
> # ceph osd tree 
> ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY 
> -1 16.24478 root default 
> -2  8.12239 host OSD1 
> 1  1.95250 osd.1  up  1.0  1.0 
> 0  1.95250 osd.0down0  1.0 
> 7  0.31239 osd.7  up  1.0  1.0 
> 6  1.95250 osd.6  up  1.0  1.0 
> 2  1.95250 osd.2  up  1.0  1.0 
> -3  8.12239 host OSD2 
> 3  1.95250 osd.3down0  1.0 
> 4  1.95250 osd.4down0  1.0 
> 5  1.95250 osd.5down0  1.0 
> 8  1.95250 osd.8down0  1.0 
> 9  0.31239 osd.9  up  1.0  1.0 
> 
> This looked alot better before that last extra power outage... =(  Can't 
> mount it anymore! 
> # ceph health 
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs 
> inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs 
> stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159 
> pgs stuck unclean; 102 pgs stuck undersized; 102 pgs undersized; 1 requests 
> are blocked > 32 sec; recovery 1803466/4503980 objects degraded (40.042%); 
> recovery 692976/4503980 objects misplaced (15.386%); recovery 147/2251990 
> unfound (0.007%); 1 near full osd(s); 54 scrub errors; mds cluster is 
> degraded; no legacy OSD present but 'sortbitwise' flag is not set 
> 
> Each of osds are showing different failure signature.  
> 
> I've uploaded osd log with debug osd = 20, debug filestore = 20, and debug ms 
> = 20.  You can find it in below links.  Let me know if there is preferred way 
> to share this! 
> https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc 
> (ceph-osd.3.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc 
> (ceph-osd.4.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k 
> (ceph-osd.5.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE 
> (ceph-osd.8.log) 
> 
> So how does this look?  Can this be fixed? =)  If so please let me know.  I 
> used to take backups but since it grew so big, I wasn't able to do so 
> anymore... and would like to get most of these back if I can.  Please let me 
> know if you need more info! 
> 
> Thank you! 
> 
> Regards, 
> Hong 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com