Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread Ronny Aasen

On 28. sep. 2017 18:53, hjcho616 wrote:
Yay! Finally, after almost exactly one month, I am able to mount the drive!  Now it is time to see how my data is doing. =P  Doesn't look too bad though.


Got to love open source. =)  I downloaded the ceph source code, built it, tried to run a ceph-objectstore-tool export on that osd.4, and then started debugging it.  Obviously I don't have any idea of what everything does, but I was able to trace to the error message.  The corruption appears to be in the mount region.  When it tries to decode a buffer, most buffers had very periodic access to data (looking at the printfs I put in), but a few of them had huge numbers.  Oh, that "1" that didn't make sense came from where the corruption happened: the struct_v portion of the data changed to an ASCII value of 1, which happily printed 1. =P  Since it was the mount portion, and hoping it doesn't impact the data much, I went ahead and allowed those corrupted values.  I was able to export osd.4 with the journal!


congratulations and well done :)

just imagine trying to do this on $vendor's proprietary black box...

Ronny Aasen



Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread hjcho616
Yay! Finally, after almost exactly one month, I am able to mount the drive!  Now it is time to see how my data is doing. =P  Doesn't look too bad though.

Got to love open source. =)  I downloaded the ceph source code, built it, tried to run a ceph-objectstore-tool export on that osd.4, and then started debugging it.  Obviously I don't have any idea of what everything does, but I was able to trace to the error message.  The corruption appears to be in the mount region.  When it tries to decode a buffer, most buffers had very periodic access to data (looking at the printfs I put in), but a few of them had huge numbers.  Oh, that "1" that didn't make sense came from where the corruption happened: the struct_v portion of the data changed to an ASCII value of 1, which happily printed 1. =P  Since it was the mount portion, and hoping it doesn't impact the data much, I went ahead and allowed those corrupted values.  I was able to export osd.4 with the journal!
Then I imported that PG..  But the OSDs wouldn't take it.. as the cluster decided to create an empty PG 1.28 and mark it active.  So.. just as the "Incomplete PGs Oh My!" page suggested, I pulled those OSDs down, removed those empty heads, and started them back up.  At that point, no more incomplete data!
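(For anyone following along, a rough sketch of that empty-head removal with ceph-objectstore-tool, assuming Jewel-era FileStore OSDs under the default paths; the OSD id is a placeholder, and newer releases may also want --force on the remove:)

systemctl stop ceph-osd@11
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11 --journal-path /var/lib/ceph/osd/ceph-11/journal --pgid 1.28 --op remove
systemctl start ceph-osd@11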
Working on that inconsistent data... it looks like this is somewhat new in the 10.2 releases.  I was able to get it working with rados get and put and a deep-scrub:
https://www.spinics.net/lists/ceph-users/msg39063.html
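(Roughly what that looks like for the object from the PG 2.7 errors further down; a sketch of the approach rather than exact history, and the temp file path is arbitrary:)

rados -p rbd get rb.0.145d.2ae8944a.00bb /tmp/obj    # read the object out through the primary
rados -p rbd put rb.0.145d.2ae8944a.00bb /tmp/obj    # write it back so the stored digest gets refreshed
ceph pg deep-scrub 2.7                               # re-scrub; the inconsistency should clear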

At this point, everything was active+clean.  But the MDS wasn't happy.  It seems to suggest the journal is broken:
HEALTH_ERR mds rank 0 is damaged; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Found this and did everything down to "cephfs-table-tool all reset session":
http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/
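(From memory, the sequence on that page runs roughly as below; treat it as a sketch and read the doc before copying anything, since journal resets are destructive and the backup comes first:)

cephfs-journal-tool journal export backup.bin        # keep a copy of the damaged MDS journal
cephfs-journal-tool event recover_dentries summary   # salvage whatever metadata events it can
cephfs-journal-tool journal reset                    # truncate the damaged journal
cephfs-table-tool all reset session                  # drop the stale client session table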

Restarted the MDS.
HEALTH_WARN no legacy OSD present but 'sortbitwise' flag is not set
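(That remaining warning is just the flag itself; as far as I know it can be cleared with the command below once no pre-Jewel OSDs remain, but double-check your OSD versions before setting it:)

ceph osd set sortbitwise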
Mounted!  Thank you everyone for the help!  Learned a lot!
Regards,
Hong
 

On Friday, September 22, 2017 1:01 AM, hjcho616  wrote:
 

 Ronny,
Could you help me with this log?  I got this with debug osd=20 filestore=20 ms=20.  This one is running "ceph pg repair 2.7".  This is one of the smaller PGs, thus the log was smaller.  Others have similar errors.  I can see the lines with ERR, but other than that is there something I should be paying attention to?
https://drive.google.com/file/d/0By7YztAJNGUWNkpCV090dHBmOWc/view?usp=sharing
Error messages look like this:
2017-09-21 23:53:31.545510 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545520 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545531 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
I did try to move that object to a different location as suggested on this page:
http://ceph.com/geen-categorie/ceph-manually-repair-object/

This is what I ran:
systemctl stop ceph-osd@7
ceph-osd -i 7 --flush-journal
cd /var/lib/ceph/osd/ceph-7
cd current/2.7_head/
mv rb.0.145d.2ae8944a.00bb__head_6F5DBE87__2 ~/
ceph osd tree
systemctl start ceph-osd@7
ceph pg repair 2.7
Then I just get this:
2017-09-22 00:41:06.495399 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-22 00:41:06.495417 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7 missing 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head
2017-09-22 00:41:06.495424 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
Moving it from osd.2 results in a similar error message; it just says "missing" on the top one instead. =P

I was hoping this time would give me a different result, as I let one more OSD copy the object from OSD1 by taking osd.7 down with noout set.  But it doesn't appear to care about that extra copy.  Maybe that only works when size is 3?  Basically, since I had most OSDs alive on OSD1, I was trying to favor the data from OSD1. =P
What can I do in this case? According to 

Re: [ceph-users] Power outages!!! help!

2017-09-22 Thread hjcho616
Ronny,
Could you help me with this log?  I got this with debug osd=20 filestore=20 ms=20.  This one is running "ceph pg repair 2.7".  This is one of the smaller PGs, thus the log was smaller.  Others have similar errors.  I can see the lines with ERR, but other than that is there something I should be paying attention to?
https://drive.google.com/file/d/0By7YztAJNGUWNkpCV090dHBmOWc/view?usp=sharing
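(For reference, those debug levels can be bumped at runtime with injectargs, or set under [osd] in ceph.conf; a sketch, with the osd id chosen to match the shard in the log below:)

ceph tell osd.2 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 20'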
Error messages look like this:
2017-09-21 23:53:31.545510 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545520 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-21 23:53:31.545531 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
I did try to move that object to a different location as suggested on this page:
http://ceph.com/geen-categorie/ceph-manually-repair-object/

This is what I ran:
systemctl stop ceph-osd@7
ceph-osd -i 7 --flush-journal
cd /var/lib/ceph/osd/ceph-7
cd current/2.7_head/
mv rb.0.145d.2ae8944a.00bb__head_6F5DBE87__2 ~/
ceph osd tree
systemctl start ceph-osd@7
ceph pg repair 2.7
Then I just get this:
2017-09-22 00:41:06.495399 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od  alloc_hint [0 0])
2017-09-22 00:41:06.495417 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7 missing 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head
2017-09-22 00:41:06.495424 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.00bb:head: failed to pick suitable auth object
Moving it from osd.2 results in a similar error message; it just says "missing" on the top one instead. =P

I was hoping this time would give me a different result, as I let one more OSD copy the object from OSD1 by taking osd.7 down with noout set.  But it doesn't appear to care about that extra copy.  Maybe that only works when size is 3?  Basically, since I had most OSDs alive on OSD1, I was trying to favor the data from OSD1. =P
What can I do in this case?  According to http://ceph.com/geen-categorie/incomplete-pgs-oh-my/ inconsistent data can be expected with --skip-journal-replay, and I had to use it as the export crashed without it. =P  But it doesn't say much about what to do in that case:

"If all went well, then your cluster is now back to 100% active+clean / HEALTH_OK state. Note that you may still have inconsistent or stale data stored inside the PG. This is because the state of the data on the OSD that failed is a bit unknown, especially if you had to use the '--skip-journal-replay' option on the export. For RBD data, the client which utilizes the RBD should run a filesystem check against the RBD."

Regards,
Hong

On Thursday, September 21, 2017 1:46 AM, Ronny Aasen 
 wrote:
 

 On 21. sep. 2017 00:35, hjcho616 wrote:
> # rados list-inconsistent-pg data
> ["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
> # rados list-inconsistent-pg metadata
> ["1.d","1.3d"]
> # rados list-inconsistent-pg rbd
> ["2.7"]
> # rados list-inconsistent-obj 0.0 --format=json-pretty
> {
>      "epoch": 23112,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.5 --format=json-pretty
> {
>      "epoch": 23078,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.a --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.e --format=json-pretty
> {
>      "epoch": 23068,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.1c --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.29 --format=json-pretty
> {
>      "epoch": 22974,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.2c --format=json-pretty
> {
>      "epoch": 23194,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.d --format=json-pretty
> {
>      "epoch": 23072,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.3d --format=json-pretty
> {
>      "epoch": 23221,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 2.7 --format=json-pretty
> {
>      

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
# rados list-inconsistent-pg data
["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
# rados list-inconsistent-pg metadata
["1.d","1.3d"]
# rados list-inconsistent-pg rbd
["2.7"]
# rados list-inconsistent-obj 0.0 --format=json-pretty
{    "epoch": 23112,    "inconsistents": [] }
# rados list-inconsistent-obj 0.5 --format=json-pretty
{    "epoch": 23078,    "inconsistents": [] }
# rados list-inconsistent-obj 0.a --format=json-pretty
{    "epoch": 22954,    "inconsistents": [] }
# rados list-inconsistent-obj 0.e --format=json-pretty
{    "epoch": 23068,    "inconsistents": [] }
# rados list-inconsistent-obj 0.1c --format=json-pretty
{    "epoch": 22954,    "inconsistents": [] }
# rados list-inconsistent-obj 0.29 --format=json-pretty
{    "epoch": 22974,    "inconsistents": [] }
# rados list-inconsistent-obj 0.2c --format=json-pretty
{    "epoch": 23194,    "inconsistents": [] }
# rados list-inconsistent-obj 1.d --format=json-pretty
{    "epoch": 23072,    "inconsistents": [] }
# rados list-inconsistent-obj 1.3d --format=json-pretty
{    "epoch": 23221,    "inconsistents": [] }
# rados list-inconsistent-obj 2.7 --format=json-pretty
{    "epoch": 23032,    "inconsistents": [] }
Looks like not much information is there.  Could you elaborate on the items you mentioned under "find the object"?  How do I check the metadata?  What are we looking for with md5sum?
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.

I tried those "Ceph: manually repair object" methods on PG 2.7 before..  I tried the 3-replica case, which would result in a missing shard regardless of which copy I moved, and the 2-replica case... hmm, I guess I don't know how long "wait a bit" is; I just turned it back on after a minute or so and it just returns to the same inconsistent message.. =P  Are we expecting the entire stopped OSD to map to a different OSD and get a third replica before starting the stopped OSD again?
Regards,
Hong

 

On Wednesday, September 20, 2017 4:47 PM, hjcho616  
wrote:
 

Thanks Ronny.  I'll try that inconsistent issue soon.
I think the OSD drive that PG 1.28 is sitting on is still ok... just file corruption happened when the power outage happened.. =P  As you suggested:

cd /var/lib/ceph/osd/ceph-4/current/
tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-10/tmposd
mkdir current
chown ceph.ceph current/
cd current/
tar --xattrs --preserve-permissions -zxvf /var/lib/ceph/osd/ceph-4/current/osd.4.tar.gz
systemctl start ceph-osd@8

I created a temp OSD like I did during the import, then set the crush reweight to 0.  I noticed the current directory was missing. =P  So I created a current directory and copied the content there.
Starting the OSD doesn't appear to show any activity.  Is there any other file I need to copy over other than the 1.28_head and 1.28_tail directories?
Regards,
Hong

On Wednesday, September 20, 2017 4:04 PM, Ronny Aasen 
 wrote:
 

i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.


scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:

- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.


if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 good luck 
 Ronny Aasen
 
 On 20.09.2017 22:17, hjcho616 wrote:
  
  Thanks Ronny. 
  I decided to try to tar everything under current directory.  Is this correct 
command for it?  Is there any directory we do not want in the new drive?  
commit_op_seq, meta, nosnap, omap?  
  tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz . 
  As far as inconsistent PGs... I am running in to these errors.  I tried 
moving one copy of pg to other location, but it just says moved shard is 
missing.  Tried setting 'noout ' and turn one of them down, seems to work on 
something but then back to same error.  Currently trying to move to different 
osd... 

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Thanks Ronny.  I'll try that inconsistent issue soon.
I think the OSD drive that PG 1.28 is sitting on is still ok... just file corruption happened when the power outage happened.. =P  As you suggested:

cd /var/lib/ceph/osd/ceph-4/current/
tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-10/tmposd
mkdir current
chown ceph.ceph current/
cd current/
tar --xattrs --preserve-permissions -zxvf /var/lib/ceph/osd/ceph-4/current/osd.4.tar.gz
systemctl start ceph-osd@8

I created a temp OSD like I did during the import, then set the crush reweight to 0.  I noticed the current directory was missing. =P  So I created a current directory and copied the content there.
Starting the OSD doesn't appear to show any activity.  Is there any other file I need to copy over other than the 1.28_head and 1.28_tail directories?
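(For following along, the temp OSD's log can be watched with its debug levels turned up; a rough sketch, assuming it really did come up as osd.8 as in the commands above:)

ceph tell osd.8 injectargs '--debug-osd 20 --debug-filestore 20'   # bump debug at runtime
tail -f /var/log/ceph/ceph-osd.8.log                               # look for it scanning and peering 1.28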
Regards,
Hong

On Wednesday, September 20, 2017 4:04 PM, Ronny Aasen 
 wrote:
 

i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.


scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:

- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad.
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.


if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 good luck 
 Ronny Aasen
 
 On 20.09.2017 22:17, hjcho616 wrote:
  
  Thanks Ronny. 
  I decided to try to tar everything under current directory.  Is this correct 
command for it?  Is there any directory we do not want in the new drive?  
commit_op_seq, meta, nosnap, omap?  
  tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz . 
  As far as inconsistent PGs... I am running in to these errors.  I tried 
moving one copy of pg to other location, but it just says moved shard is 
missing.  Tried setting 'noout ' and turn one of them down, seems to work on 
something but then back to same error.  Currently trying to move to different 
osd... making sure the drive is not faulty, got few of them.. but still 
persisting..  I've been kicking off ceph pg repair PG#, hoping it would fix 
them. =P  Any other suggestion? 
2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log [INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 repair 4 errors, 0 fixed

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread Ronny Aasen
i would only tar the pg you have missing objects from; trying to inject older objects when the pg is correct cannot be good.



scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?  and as you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:


- find the pg  ::  rados list-inconsistent-pg [pool]
- find the problem ::  rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name, look for hints to what the bad object is
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. check the objects on the non-running OSDs and compare there as well. anything to try to determine which object is ok and which is bad. (a command-level sketch follows below)
- fix the problem  :: assuming you find the bad object, stop the affected OSD with the bad object, remove the object manually, restart the OSD, and issue the repair command.
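(A rough illustration of the "find the object" step for the PG 2.7 object discussed elsewhere in this thread, assuming FileStore OSDs under the default /var/lib/ceph paths and the attr tool installed; the object name pattern is only an example:)

find /var/lib/ceph/osd/ceph-*/current/2.7_head -name 'rb.0.145d*' -exec md5sum {} \;   # compare data between the copies
find /var/lib/ceph/osd/ceph-*/current/2.7_head -name 'rb.0.145d*' -exec attr -l {} \;  # compare xattrs (object metadata) as well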



if the rados commands do not give you the info you need, do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/


good luck
Ronny Aasen

On 20.09.2017 22:17, hjcho616 wrote:

Thanks Ronny.

I decided to try to tar everything under current directory.  Is this 
correct command for it?  Is there any directory we do not want in the 
new drive?  commit_op_seq, meta, nosnap, omap?


tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz .

As far as inconsistent PGs... I am running in to these errors.  I 
tried moving one copy of pg to other location, but it just says moved 
shard is missing.  Tried setting 'noout ' and turn one of them down, 
seems to work on something but then back to same error.  Currently 
trying to move to different osd... making sure the drive is not 
faulty, got few of them.. but still persisting..  I've been kicking 
off ceph pg repair PG#, hoping it would fix them. =P  Any other 
suggestion?


2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log 
[INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head 
data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 
0:97126ead:::200014ce4c3.028f:head(19366'539375 
client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 
539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head 
data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 
0:97126ead:::200014ce4c3.028f:head(19366'539375 
client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 
539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log 
[ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to 
pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head 
data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 
0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 
dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  
alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head 
data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 
0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 
dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  
alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to 
pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log 
[ERR] : 0.29 repair 4 errors, 0 fixed


Latest health...
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
down; 1 pgs incomplete; 9 pgs inconsistent; 1 pgs repair; 1 pgs stuck 
inactive; 1 pgs stuck unclean; 68 scrub errors; mds rank 0 has failed; 
mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag 
is not set


Regards,
Hong




On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen 
 wrote:



On 20.09.2017 16:49, hjcho616 wrote:

Anyone?  Can this PG be saved?  If not, what are my options?

Regards,
Hong


On Saturday, September 16, 2017 1:55 AM, hjcho616 
  wrote:



Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 
1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 
30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) 
set; 

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Thanks Ronny.
I decided to try to tar everything under the current directory.  Is this the correct command for it?  Is there any directory we do not want on the new drive?  commit_op_seq, meta, nosnap, omap?

tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz .

As far as the inconsistent PGs... I am running into these errors.  I tried moving one copy of a PG to another location, but it just says the moved shard is missing.  Tried setting 'noout' and turning one of them down; it seems to work on something but then goes back to the same error.  Currently trying to move to a different OSD... making sure the drive is not faulty (I've got a few of them).. but it is still persisting..  I've been kicking off "ceph pg repair PG#", hoping it would fix them. =P  Any other suggestions?
2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log [INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od  alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97126ead:::200014ce4c3.028f:head: failed to pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97d5c15a:::10101b4.6892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::10101b4.6892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od  alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97d5c15a:::10101b4.6892:head: failed to pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 repair 4 errors, 0 fixed
Latest health...
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down; 1 pgs incomplete; 9 pgs inconsistent; 1 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 68 scrub errors; mds rank 0 has failed; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
Regards,
Hong

 

On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen 
 wrote:
 

  On 20.09.2017 16:49, hjcho616 wrote:
  
Anyone?  Can this PG be saved?  If not, what are my options?
  Regards, Hong 
 
  On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:
  
 
Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
  
Now PG1.28.. looking at all old osds dead or alive.  Only one with a DIR_* directory is in osd.4.  This appears to be the metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  Exporting through ceph-objectstore-tool appears to crash, even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in an earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
  
 
 [SNIP]
 
 What can I do to save that PG1.28?  Please let me know if you need 
more information.  So close!... =)  
  Regards, Hong 
   
12 inconsistent PGs and 109 scrub errors are something you should fix first of all.
also you can consider using the paid services of the many ceph support companies that specialize in these kinds of situations.
--
that being said, here are some suggestions...
when it comes to lost object recovery you have come about as far as i have ever experienced, so everything after here is just

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread Ronny Aasen

On 20.09.2017 16:49, hjcho616 wrote:

Anyone?  Can this PG be saved?  If not, what are my options?

Regards,
Hong


On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:



Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs 
incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 
pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 
30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) 
set; no legacy OSD present but 'sortbitwise' flag is not set


Now PG1.28.. looking at all old osds dead or alive.  Only one with 
DIR_* directory is in osd.4. This appears to be metadata pool!  21M of 
metadata can be quite a bit of stuff.. so I would like to rescue this! 
 But I am not able to start this OSD.  exporting through 
ceph-objectstore-tool appears to crash.  Even with 
--skip-journal-replay and --skip-mount-omap (different failure).  As I 
mentioned in earlier email, that exception thrown message is bogus...
# ceph-objectstore-tool --op export --pgid 1.28  --data-path 
/var/lib/ceph/osd/ceph-4 --journal-path 
/var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export

terminate called after throwing an instance of 'std::domain_error'



[SNIP]
What can I do to save that PG1.28?  Please let me know if you need 
more information.  So close!... =)


Regards,
Hong

12 inconsistent PGs and 109 scrub errors are something you should fix first of all.


also you can consider using the paid services of the many ceph support companies that specialize in these kinds of situations.


--

that being said, here are some suggestions...

when it comes to lost object recovery you have come about as far as i have ever experienced, so everything after here is just assumptions and wild guesswork as to what you can try.  I hope others shout out if i tell you wildly wrong things.


if you have found data for pg1.28 on the broken osd and have checked all other working and non-working drives for that pg, then you need to try and extract the pg from the broken drive. As always in recovery cases, take a dd clone of the drive and work from the cloned image, to avoid more damage to the drive and to allow you to try multiple times.


you should add a temporary injection drive large enough for that pg, and 
set its crush weight to 0 so it always drains. make sure it is up and 
registered properly in ceph.
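(A rough sketch of what adding such an injection OSD could look like with ceph-disk, the tool used elsewhere in this thread; /dev/sdX and osd.12 are placeholders for whatever device and id you end up with:)

ceph-disk prepare /dev/sdX              # create the temporary injection osd
ceph-disk activate /dev/sdX1            # bring it up and into the crush map
ceph osd crush reweight osd.12 0        # weight 0, so data only ever drains off it
ceph osd tree                           # confirm it is up and weighted 0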


the idea is to copy the pg manually from the broken osd to the injection drive, since the export/import fails.. making sure you get all xattrs included.  one can either copy the whole pg, or just the "missing" objects.  if there are few objects i would go for that; if there are many i would take the whole pg. you won't get data from leveldb, so i am not at all sure this would work, but it is worth a shot.


- stop your injection osd, verify it is down and the process not running.
- from the mountpoint of your broken osd go into the current directory and tar up pg1.28; make sure you use -p and --xattrs when you create the archive.
- if tar errors out on unreadable files, just rm those (since you are working on a copy of your rescue image, you can always try again)
- copy the tar file to the injection drive and extract it while sitting in the current directory (remember --xattrs)

- set debug options on the injection drive in ceph.conf
- start the injection drive, and follow along in the log file. hopefully it should scan, locate the pg, and replicate the pg1.28 objects off to the current primary drive for pg1.28. and since it has crush weight 0 it should drain out.
- if that works, verify the injection drive is drained, stop it and remove it from ceph, then zap the drive. (a rough command sketch of these steps follows below)
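(A minimal command-level sketch of those steps, assuming the dd-cloned image of the broken osd.4 is mounted at /mnt/rescue and the injection OSD came up as osd.12; both are made-up names, and the tar flags mirror the ones used earlier in this thread:)

systemctl stop ceph-osd@12                   # injection osd must be down first
cd /mnt/rescue/current
tar --xattrs --preserve-permissions -zcvf /tmp/pg1.28.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-12/current
tar --xattrs --preserve-permissions -zxvf /tmp/pg1.28.tar.gz
chown -R ceph:ceph .
systemctl start ceph-osd@12                  # follow its log; with crush weight 0 it should push the objects to the primary and drain out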



this is all, as i said, guesstimates, so your mileage may vary.
good luck

Ronny Aasen









Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Anyone?  Can this PG be saved?  If not, what are my options?
Regards,
Hong

On Saturday, September 16, 2017 1:55 AM, hjcho616  
wrote:
 

Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set

Now PG1.28.. looking at all old osds dead or alive.  Only one with DIR_* directory is in osd.4.  This appears to be metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  exporting through ceph-objectstore-tool appears to crash.  Even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7f812e7fb940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55dee175fa57]
 2: (()+0x110c0) [0x7f812d0050c0]
 3: (gsignal()+0xcf) [0x7f812b438fcf]
 4: (abort()+0x16a) [0x7f812b43a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f812bd1fb3d]
 6: (()+0x5ebb6) [0x7f812bd1dbb6]
 7: (()+0x5ec01) [0x7f812bd1dc01]
 8: (()+0x5ee19) [0x7f812bd1de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55dee143001e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55dee156d5f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55dee1562bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55dee1561eb8]
 13: (FileStore::mount()+0x2525) [0x55dee1498eb5]
 14: (main()+0x28c0) [0x55dee10c9400]
 15: (__libc_start_main()+0xf1) [0x7f812b4262b1]
 16: (()+0x34f747) [0x55dee1118747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-journal-replay
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fa6d087b940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55abd356aa57]
 2: (()+0x110c0) [0x7fa6cf0850c0]
 3: (gsignal()+0xcf) [0x7fa6cd4b8fcf]
 4: (abort()+0x16a) [0x7fa6cd4ba3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fa6cdd9fb3d]
 6: (()+0x5ebb6) [0x7fa6cdd9dbb6]
 7: (()+0x5ec01) [0x7fa6cdd9dc01]
 8: (()+0x5ee19) [0x7fa6cdd9de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55abd323b01e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55abd33785f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55abd336dbb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55abd336ceb8]
 13: (FileStore::mount()+0x2525) [0x55abd32a3eb5]
 14: (main()+0x28c0) [0x55abd2ed4400]
 15: (__libc_start_main()+0xf1) [0x7fa6cd4a62b1]
 16: (()+0x34f747) [0x55abd2f23747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-mount-omap
ceph-objectstore-tool: /usr/include/boost/smart_ptr/scoped_ptr.hpp:99: T* boost::scoped_ptr::operator->() const [with T = ObjectMap]: Assertion `px != 0' failed.
*** Caught signal (Aborted) **
 in thread 7f14345c5940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x5575b50a9a57]
 2: (()+0x110c0) [0x7f1432dcf0c0]
 3: (gsignal()+0xcf) [0x7f1431202fcf]
 4: (abort()+0x16a) [0x7f14312043fa]
 5: (()+0x2be37) [0x7f14311fbe37]
 6: (()+0x2bee2) [0x7f14311fbee2]
 7: (()+0x2fa19c) [0x5575b4a0d19c]
 8: (FileStore::omap_get_values(coll_t const&, ghobject_t const&, std::set const&, std::map >*)+0x6c2) [0x5575b4dc9322]
 9: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x235) [0x5575b4ab3135]
 10: (main()+0x5bd6) [0x5575b4a16716]
 11: (__libc_start_main()+0xf1) [0x7f14311f02b1]
 12: (()+0x34f747) [0x5575b4a62747]

When trying to bring up osd.4 we get this message.  Feels very similar to that crash in first two above.
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x960e57) [0x5565e564ae57]
 2: (()+0x110c0) [0x7f34aa17e0c0]
 3: (gsignal()+0xcf) [0x7f34a81c4fcf]
 4: (abort()+0x16a) [0x7f34a81c63fa]
 5:

Re: [ceph-users] Power outages!!! help!

2017-09-16 Thread hjcho616
Looking better... working on scrubbing..
HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set

Now PG1.28.. looking at all old osds dead or alive.  Only one with DIR_* directory is in osd.4.  This appears to be metadata pool!  21M of metadata can be quite a bit of stuff.. so I would like to rescue this!  But I am not able to start this OSD.  exporting through ceph-objectstore-tool appears to crash.  Even with --skip-journal-replay and --skip-mount-omap (different failure).  As I mentioned in earlier email, that exception thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7f812e7fb940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55dee175fa57]
 2: (()+0x110c0) [0x7f812d0050c0]
 3: (gsignal()+0xcf) [0x7f812b438fcf]
 4: (abort()+0x16a) [0x7f812b43a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f812bd1fb3d]
 6: (()+0x5ebb6) [0x7f812bd1dbb6]
 7: (()+0x5ec01) [0x7f812bd1dc01]
 8: (()+0x5ee19) [0x7f812bd1de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55dee143001e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55dee156d5f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55dee1562bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55dee1561eb8]
 13: (FileStore::mount()+0x2525) [0x55dee1498eb5]
 14: (main()+0x28c0) [0x55dee10c9400]
 15: (__libc_start_main()+0xf1) [0x7f812b4262b1]
 16: (()+0x34f747) [0x55dee1118747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-journal-replay
terminate called after throwing an instance of 'std::domain_error'
 what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fa6d087b940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55abd356aa57]
 2: (()+0x110c0) [0x7fa6cf0850c0]
 3: (gsignal()+0xcf) [0x7fa6cd4b8fcf]
 4: (abort()+0x16a) [0x7fa6cd4ba3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fa6cdd9fb3d]
 6: (()+0x5ebb6) [0x7fa6cdd9dbb6]
 7: (()+0x5ec01) [0x7fa6cdd9dc01]
 8: (()+0x5ee19) [0x7fa6cdd9de19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55abd323b01e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55abd33785f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55abd336dbb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55abd336ceb8]
 13: (FileStore::mount()+0x2525) [0x55abd32a3eb5]
 14: (main()+0x28c0) [0x55abd2ed4400]
 15: (__libc_start_main()+0xf1) [0x7fa6cd4a62b1]
 16: (()+0x34f747) [0x55abd2f23747]
Aborted

# ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export --skip-mount-omap
ceph-objectstore-tool: /usr/include/boost/smart_ptr/scoped_ptr.hpp:99: T* boost::scoped_ptr::operator->() const [with T = ObjectMap]: Assertion `px != 0' failed.
*** Caught signal (Aborted) **
 in thread 7f14345c5940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x5575b50a9a57]
 2: (()+0x110c0) [0x7f1432dcf0c0]
 3: (gsignal()+0xcf) [0x7f1431202fcf]
 4: (abort()+0x16a) [0x7f14312043fa]
 5: (()+0x2be37) [0x7f14311fbe37]
 6: (()+0x2bee2) [0x7f14311fbee2]
 7: (()+0x2fa19c) [0x5575b4a0d19c]
 8: (FileStore::omap_get_values(coll_t const&, ghobject_t const&, std::set const&, std::map >*)+0x6c2) [0x5575b4dc9322]
 9: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x235) [0x5575b4ab3135]
 10: (main()+0x5bd6) [0x5575b4a16716]
 11: (__libc_start_main()+0xf1) [0x7f14311f02b1]
 12: (()+0x34f747) [0x5575b4a62747]

When trying to bring up osd.4 we get this message.  Feels very similar to that crash in first two above.
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x960e57) [0x5565e564ae57]
 2: (()+0x110c0) [0x7f34aa17e0c0]
 3: (gsignal()+0xcf) [0x7f34a81c4fcf]
 4: (abort()+0x16a) [0x7f34a81c63fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f34a8aabb3d]
 6: (()+0x5ebb6) [0x7f34a8aa9bb6]
 7: (()+0x5ec01) [0x7f34a8aa9c01]
 8: (()+0x5ee19) [0x7f34a8aa9e19]
 9:

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread hjcho616
After running "ceph osd lost osd.0", it started backfilling... I figured that was supposed to happen earlier when I added those missing PGs.  Running into "too few PGs per OSD", I removed OSDs after the cluster stopped working after adding OSDs.  But I guess I still needed them.  Currently I see several incomplete PGs and am trying to import those PGs back. =P
As far as 1.28 goes, it didn't look like it was limited by osd.0; the logs didn't show any signs of osd.0, and the data is only available on osd.4, which wouldn't export... So I still need to deal with that one.  It is still showing up as incomplete.. =P  Any recommendations on how to get that back?

pg 1.28 is stuck inactive since forever, current state down+incomplete, last acting [11,6]
pg 1.28 is stuck unclean since forever, current state down+incomplete, last acting [11,6]
pg 1.28 is down+incomplete, acting [11,6] (reducing pool metadata min_size from 2 may help; search ceph.com/docs for 'incomplete')
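(That hint refers to something like the commands below; dropping min_size lets the PG go active with a single surviving copy, and it should go back to 2 once recovery finishes. A sketch, not a blanket recommendation:)

ceph osd pool set metadata min_size 1    # allow the metadata pool's PGs to peer with one copy
ceph -s                                  # watch whether 1.28 goes active and recovers
ceph osd pool set metadata min_size 2    # restore once healthy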
Regards,
Hong


On Friday, September 15, 2017 4:51 AM, Ronny Aasen 
 wrote:
 

 
you write you had all PGs exported except one, so i assume you have injected those PGs into the cluster again using the method linked a few times in this thread. How did that go? Were you successful in recovering those PGs?

kind regards.
Ronny Aasen



On 15. sep. 2017 07:52, hjcho616 wrote:
> I just did this and backfilling started.  Let's see where this takes me.
> ceph osd lost 0 --yes-i-really-mean-it
> 
> Regards,
> Hong
> 
> 
> On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:
> 
> 
> Ronny,
> 
> Working with all of the pgs shown in the "ceph health detail", I ran 
> below for each PG to export.
> ceph-objectstore-tool --op export --pgid 0.1c  --data-path 
> /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal 
> --skip-journal-replay --file 0.1c.export
> 
> I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This 
> error doesn't make much sense to me.  Looking at the source code from 
> https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that 
> message is telling me struct_v is 1... but not sure how it ended up in 
> the default in the case statement when 1 case is defined...  I tried 
> with --skip-journal-replay, fails with same error message.
> ceph-objectstore-tool --op export --pgid 1.28  --data-path 
> /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal 
> --file 1.28.export
> terminate called after throwing an instance of 'std::domain_error'
>    what():  coll_t::decode(): don't know how to decode version 1
> *** Caught signal (Aborted) **
>  in thread 7fabc5ecc940 thread_name:ceph-objectstor
>  ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
>  1: (()+0x996a57) [0x55b2d3323a57]
>  2: (()+0x110c0) [0x7fabc46d50c0]
>  3: (gsignal()+0xcf) [0x7fabc2b08fcf]
>  4: (abort()+0x16a) [0x7fabc2b0a3fa]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
>  6: (()+0x5ebb6) [0x7fabc33edbb6]
>  7: (()+0x5ec01) [0x7fabc33edc01]
>  8: (()+0x5ee19) [0x7fabc33ede19]
>  9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
>  10: 
> (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) 
> [0x55b2d31315f5]
>  11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
>  12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
>  13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
>  14: (main()+0x28c0) [0x55b2d2c8d400]
>  15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
>  16: (()+0x34f747) [0x55b2d2cdc747]
> Aborted
> 
> Then wrote a simple script to run import process... just created an OSD 
> per PG.  Basically ran below for each PG.
> mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> ceph osd crush reweight osd.$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
> systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
> ceph-objectstore-tool --op import --pgid 0.1c  --data-path 
> /var/lib/ceph/osd/ceph-$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path 
> /var/lib/ceph/osd/ceph-$(cat 
> /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file 
> ./export/0.1c.export
> chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
> systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
> 
> Sometimes import didn't work.. but stopping OSD and rerunning 
> ceph-objectstore-tool again seems to help or when some PG didn't really 
> want to import .
> 
> Unfound messages are gone!  But I still have down+peering, or 
> down+remapped+peering.
> # ceph health detail
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
> down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs 
> stuck unclean; 1 requests are blocked > 32 sec; 1 

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread Ronny Aasen


you write you had all PGs exported except one, so i assume you have injected those PGs into the cluster again using the method linked a few times in this thread. How did that go? Were you successful in recovering those PGs?


kind regards.
Ronny Aasen



On 15. sep. 2017 07:52, hjcho616 wrote:

I just did this and backfilling started.  Let's see where this takes me.
ceph osd lost 0 --yes-i-really-mean-it

Regards,
Hong


On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:


Ronny,

Working with all of the pgs shown in the "ceph health detail", I ran 
below for each PG to export.
ceph-objectstore-tool --op export --pgid 0.1c   --data-path 
/var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal 
--skip-journal-replay --file 0.1c.export


I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This 
error doesn't make much sense to me.  Looking at the source code from 
https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that 
message is telling me struct_v is 1... but not sure how it ended up in 
the default in the case statement when 1 case is defined...  I tried 
with --skip-journal-replay, fails with same error message.
ceph-objectstore-tool --op export --pgid 1.28  --data-path 
/var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal 
--file 1.28.export

terminate called after throwing an instance of 'std::domain_error'
   what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
  in thread 7fabc5ecc940 thread_name:ceph-objectstor
  ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
  1: (()+0x996a57) [0x55b2d3323a57]
  2: (()+0x110c0) [0x7fabc46d50c0]
  3: (gsignal()+0xcf) [0x7fabc2b08fcf]
  4: (abort()+0x16a) [0x7fabc2b0a3fa]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
  6: (()+0x5ebb6) [0x7fabc33edbb6]
  7: (()+0x5ec01) [0x7fabc33edc01]
  8: (()+0x5ee19) [0x7fabc33ede19]
  9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
  10: 
(DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) 
[0x55b2d31315f5]

  11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
  12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
  13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
  14: (main()+0x28c0) [0x55b2d2c8d400]
  15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
  16: (()+0x34f747) [0x55b2d2cdc747]
Aborted

Then wrote a simple script to run import process... just created an OSD 
per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0

systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c   --data-path 
/var/lib/ceph/osd/ceph-$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path 
/var/lib/ceph/osd/ceph-$(cat 
/var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file 
./export/0.1c.export

chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)

Sometimes import didn't work.. but stopping OSD and rerunning 
ceph-objectstore-tool again seems to help or when some PG didn't really 
want to import .


Unfound messages are gone!   But I still have down+peering, or 
down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs 
stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow 
requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no 
legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 0.a is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state 
down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state 
down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.3d is stuck inactive since forever, current state 
down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.0 is stuck inactive since forever, current state 
down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 1.28 is stuck inactive since 

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
I just did this and backfilling started.  Let's see where this takes me.
ceph osd lost 0 --yes-i-really-mean-it
Regards,
Hong

On Friday, September 15, 2017 12:44 AM, hjcho616  wrote:
 

 Ronny,
Working with all of the pgs shown in "ceph health detail", I ran the below for each PG to export:
ceph-objectstore-tool --op export --pgid 0.1c   --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export

I have all PGs exported, except 1... PG 1.28.  It is on ceph-4.  This error doesn't make much sense to me.  Looking at the source code from https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that message is telling me struct_v is 1... but not sure how it ended up in the default in the case statement when 1 case is defined...  I tried with --skip-journal-replay, fails with same error message.

ceph-objectstore-tool --op export --pgid 1.28  --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file 1.28.export
terminate called after throwing an instance of 'std::domain_error'
  what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fabc5ecc940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55b2d3323a57]
 2: (()+0x110c0) [0x7fabc46d50c0]
 3: (gsignal()+0xcf) [0x7fabc2b08fcf]
 4: (abort()+0x16a) [0x7fabc2b0a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
 6: (()+0x5ebb6) [0x7fabc33edbb6]
 7: (()+0x5ec01) [0x7fabc33edc01]
 8: (()+0x5ee19) [0x7fabc33ede19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55b2d31315f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
 13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
 14: (main()+0x28c0) [0x55b2d2c8d400]
 15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
 16: (()+0x34f747) [0x55b2d2cdc747]
Aborted
Then wrote a simple script to run the import process... just created an OSD per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file ./export/0.1c.export
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
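For illustration only, those per-PG steps could be wrapped in a small loop; this is a sketch that assumes the same tmposd_<pgid> directory naming under /var/lib/ceph/osd/ceph-5 and that the export files live in ./export/:

for pg in 0.1c 0.a 2.8; do                     # the pg ids that were exported; adjust to your list
  dir=/var/lib/ceph/osd/ceph-5/tmposd_$pg
  mkdir -p "$dir"
  ceph-disk prepare "$dir"
  chown -R ceph:ceph "$dir"
  ceph-disk activate "$dir"
  id=$(cat "$dir/whoami")
  ceph osd crush reweight "osd.$id" 0          # weight 0 so the temporary OSD never receives new data
  systemctl stop "ceph-osd@$id"
  ceph-objectstore-tool --op import --pgid "$pg" \
    --data-path "/var/lib/ceph/osd/ceph-$id" \
    --journal-path "/var/lib/ceph/osd/ceph-$id/journal" \
    --file "./export/$pg.export"
  chown -R ceph:ceph "/var/lib/ceph/osd/ceph-$id"
  systemctl start "ceph-osd@$id"
done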
Sometimes the import didn't work, but stopping the OSD and rerunning ceph-objectstore-tool again seemed to help when a PG didn't really want to import.
Unfound messages are gone!   But I still have down+peering, or down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.3d is stuck inactive since forever, current state down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.0 is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 2.3 is stuck inactive since forever, current state down+peering, last acting [11,7]
pg 1.1b is stuck inactive since forever, current state down+remapped+peering, last acting [11,6]
pg 0.d is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.c is 

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
Ronny,
Working with all of the pgs shown in "ceph health detail", I ran below for each PG to export:

ceph-objectstore-tool --op export --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export

I have all PGs exported, except one: PG 1.28.  It is on ceph-4.  This error doesn't make much sense to me.  Looking at the source code from https://github.com/ceph/ceph/blob/master/src/osd/osd_types.cc, that message is telling me struct_v is 1... but I am not sure how it ended up in the default branch of the case statement when a case for 1 is defined.  I tried with --skip-journal-replay; it fails with the same error message.

ceph-objectstore-tool --op export --pgid 1.28 --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file 1.28.export

terminate called after throwing an instance of 'std::domain_error'
  what():  coll_t::decode(): don't know how to decode version 1
*** Caught signal (Aborted) **
 in thread 7fabc5ecc940 thread_name:ceph-objectstor
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (()+0x996a57) [0x55b2d3323a57]
 2: (()+0x110c0) [0x7fabc46d50c0]
 3: (gsignal()+0xcf) [0x7fabc2b08fcf]
 4: (abort()+0x16a) [0x7fabc2b0a3fa]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fabc33efb3d]
 6: (()+0x5ebb6) [0x7fabc33edbb6]
 7: (()+0x5ec01) [0x7fabc33edc01]
 8: (()+0x5ee19) [0x7fabc33ede19]
 9: (coll_t::decode(ceph::buffer::list::iterator&)+0x21e) [0x55b2d2ff401e]
 10: (DBObjectMap::_Header::decode(ceph::buffer::list::iterator&)+0x125) [0x55b2d31315f5]
 11: (DBObjectMap::check(std::ostream&, bool)+0x279) [0x55b2d3126bb9]
 12: (DBObjectMap::init(bool)+0x288) [0x55b2d3125eb8]
 13: (FileStore::mount()+0x2525) [0x55b2d305ceb5]
 14: (main()+0x28c0) [0x55b2d2c8d400]
 15: (__libc_start_main()+0xf1) [0x7fabc2af62b1]
 16: (()+0x34f747) [0x55b2d2cdc747]
Aborted

Then wrote a simple script to run the import process... just created an OSD per PG.  Basically ran below for each PG.

mkdir /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk prepare /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph-disk activate /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
ceph osd crush reweight osd.$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) 0
systemctl stop ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)
ceph-objectstore-tool --op import --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami) --journal-path /var/lib/ceph/osd/ceph-$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)/journal --file ./export/0.1c.export
chown -R ceph.ceph /var/lib/ceph/osd/ceph-5/tmposd_0.1c/
systemctl start ceph-osd@$(cat /var/lib/ceph/osd/ceph-5/tmposd_0.1c/whoami)

Sometimes the import didn't work, but stopping the OSD and rerunning ceph-objectstore-tool again seemed to help when a PG didn't really want to import.

Unfound messages are gone!   But I still have down+peering, or down+remapped+peering.

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs down; 1 pgs inconsistent; 22 pgs peering; 22 pgs stuck inactive; 22 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests; 2 scrub errors; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.8 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 2.b is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.9 is stuck inactive since forever, current state down+remapped+peering, last acting [11,7]
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.3d is stuck inactive since forever, current state down+remapped+peering, last acting [10,6]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.0 is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 2.3 is stuck inactive since forever, current state down+peering, last acting [11,7]
pg 1.1b is stuck inactive since forever, current state down+remapped+peering, last acting [11,6]
pg 0.d is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 1.c is stuck inactive since forever, current state down+remapped+peering, last acting [7,11]
pg 0.3b is stuck inactive since forever, current state down+remapped+peering, last acting [10,7]
pg 2.39 is stuck inactive since 

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread hjcho616
Ronny,
Just tried hooking osd.0 back up.  osd.0 seems to be better, as I was able to run ceph-objectstore-tool export, so I decided to try hooking it up.  Looks like the journal is not happy.  Is there any way to get this running?  Or do I need to start getting data out using ceph-objectstore-tool?
2017-09-13 18:51:50.051421 7f44dd847800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-13 18:51:50.051435 7f44dd847800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 3899
2017-09-13 18:51:50.052323 7f44dd847800  0 pidfile_write: ignore empty --pid-file
2017-09-13 18:51:50.061586 7f44dd847800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-13 18:51:50.061823 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-13 18:51:50.061826 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-13 18:51:50.061838 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-13 18:51:50.077506 7f44dd847800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-13 18:51:50.077549 7f44dd847800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-13 18:51:50.078066 7f44dd847800  1 leveldb: Recovering log #28069
2017-09-13 18:51:50.177610 7f44dd847800  1 leveldb: Delete type=0 #28069
2017-09-13 18:51:50.177708 7f44dd847800  1 leveldb: Delete type=3 #28068
2017-09-13 18:51:57.946233 7f44dd847800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-13 18:51:57.947293 7f44dd847800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-13 18:51:57.949835 7f44dd847800 -1 journal Unable to read past sequence 27057121 but header indicates the journal has committed up through 27057593, journal is corrupt
2017-09-13 18:51:57.951824 7f44dd847800 -1 os/filestore/FileJournal.cc: In function 'bool FileJournal::read_entry(ceph::bufferlist&, uint64_t&, bool*)' thread 7f44dd847800 time 2017-09-13 18:51:57.949837
os/filestore/FileJournal.cc: 2036: FAILED assert(0)
 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x82) [0x55c809640d02]
 2: (FileJournal::read_entry(ceph::buffer::list&, unsigned long&, bool*)+0xa84) [0x55c8093c4da4]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x205) [0x55c8092feb95]
 4: (FileStore::mount()+0x2e28) [0x55c8092d0a88]
 5: (OSD::init()+0x27d) [0x55c808f697ed]
 6: (main()+0x2a64) [0x55c808ed05d4]
 7: (__libc_start_main()+0xf5) [0x7f44da6e3b45]
 8: (()+0x341117) [0x55c808f1b117]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
--- begin dump of recent events ---
   -59> 2017-09-13 18:51:50.043283 7f44dd847800  5 asok(0x55c813d76000) register_command perfcounters_dump hook 0x55c813cbe030
   -58> 2017-09-13 18:51:50.043312 7f44dd847800  5 asok(0x55c813d76000) register_command 1 hook 0x55c813cbe030
   -57> 2017-09-13 18:51:50.043317 7f44dd847800  5 asok(0x55c813d76000) register_command perf dump hook 0x55c813cbe030
   -56> 2017-09-13 18:51:50.043322 7f44dd847800  5 asok(0x55c813d76000) register_command perfcounters_schema hook 0x55c813cbe030
   -55> 2017-09-13 18:51:50.043326 7f44dd847800  5 asok(0x55c813d76000) register_command 2 hook 0x55c813cbe030
   -54> 2017-09-13 18:51:50.043330 7f44dd847800  5 asok(0x55c813d76000) register_command perf schema hook 0x55c813cbe030
   -53> 2017-09-13 18:51:50.043334 7f44dd847800  5 asok(0x55c813d76000) register_command perf reset hook 0x55c813cbe030
   -52> 2017-09-13 18:51:50.043339 7f44dd847800  5 asok(0x55c813d76000) register_command config show hook 0x55c813cbe030
   -51> 2017-09-13 18:51:50.043344 7f44dd847800  5 asok(0x55c813d76000) register_command config set hook 0x55c813cbe030
   -50> 2017-09-13 18:51:50.043349 7f44dd847800  5 asok(0x55c813d76000) register_command config get hook 0x55c813cbe030
   -49> 2017-09-13 18:51:50.043355 7f44dd847800  5 asok(0x55c813d76000) register_command config diff hook 0x55c813cbe030
   -48> 2017-09-13 18:51:50.043361 7f44dd847800  5 asok(0x55c813d76000) register_command log flush hook 0x55c813cbe030
   -47> 2017-09-13 18:51:50.043367 7f44dd847800  5 asok(0x55c813d76000) register_command log dump hook 0x55c813cbe030
   -46> 2017-09-13 18:51:50.043373 7f44dd847800  5 asok(0x55c813d76000) register_command log reopen hook 0x55c813cbe030
   -45> 2017-09-13 18:51:50.051421 7f44dd847800  0 set uid:gid to 1001:1001 (ceph:ceph)
   -44> 2017-09-13 18:51:50.051435 7f44dd847800  0 ceph version 10.2.9 

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread Ronny Aasen

On 13. sep. 2017 07:04, hjcho616 wrote:

Ronny,

Did bunch of ceph pg repair pg# and got the scrub errors down to 10... 
well was 9, trying to fix one became 10.. waiting for it to fix (I did 
that noout trick as I only have two copies).  8 of those scrub errors 
looks like it would need data from osd.0.


HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 3 pgs inconsistent; 6 pgs peering; 6 pgs 
recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 
16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 
pgs undersized; 1 requests are blocked > 32 sec; recovery 221990/4503980 
objects degraded (4.929%); recovery 147/2251990 unfound (0.007%); 10 
scrub errors; mds cluster is degraded; no legacy OSD present but 
'sortbitwise' flag is not set


 From what I saw from ceph health detail, running osd.0 would solve 
majority of the problems.  But that was the disk with the smart error 
earlier.  I did move to new drive using ddrescue.  When trying to start 
osd.0, I get this.  Is there anyway I can get around this?




running a rescued disk is not something you should try. this is when you
should try to export using the objectstore tool


this was the drive that failed to export pg's because of missing
superblock ? you could also try the export directly on the failed drive,
just to see if that works. you may have to run the tool as ceph user if
that is the user owning all the files


you could try running the export of one of the pg's on osd.0 again and 
post all commands and output.


good luck

Ronny





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
Ronny,
Did bunch of ceph pg repair pg# and got the scrub errors down to 10... well was 
9, trying to fix one became 10.. waiting for it to fix (I did that noout trick 
as I only have two copies).  8 of those scrub errors looks like it would need 
data from osd.0.
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 3 pgs inconsistent; 6 pgs peering; 6 pgs recovering; 16 
pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 28 
pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are 
blocked > 32 sec; recovery 221990/4503980 objects degraded (4.929%); recovery 
147/2251990 unfound (0.007%); 10 scrub errors; mds cluster is degraded; no 
legacy OSD present but 'sortbitwise' flag is not set
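For reference, the "noout trick" plus repair mentioned above amounts to something like this (a sketch only; <pgid> is a placeholder for each pg reported inconsistent in ceph health detail):

ceph osd set noout        # keep OSDs from being marked out while repairs run
ceph pg repair <pgid>     # repeat for each inconsistent pg
ceph osd unset noout      # clear the flag once scrubs come back clean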
From what I saw from ceph health detail, running osd.0 would solve the majority of the problems.  But that was the disk with the smart error earlier.  I did move to a new drive using ddrescue.  When trying to start osd.0, I get this.  Is there any way I can get around this?

2017-09-12 01:31:55.205898 7fb61521a800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-12 01:31:55.205915 7fb61521a800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 4822
2017-09-12 01:31:55.206955 7fb61521a800  0 pidfile_write: ignore empty --pid-file
2017-09-12 01:31:55.217615 7fb61521a800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-12 01:31:55.217854 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-12 01:31:55.217858 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-12 01:31:55.217871 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-12 01:31:55.268117 7fb61521a800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-12 01:31:55.268190 7fb61521a800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-12 01:31:55.269266 7fb61521a800  1 leveldb: Recovering log #29056
2017-09-12 01:31:55.502001 7fb61521a800  1 leveldb: Delete type=0 #29056
2017-09-12 01:31:55.502079 7fb61521a800  1 leveldb: Delete type=3 #29055
2017-09-12 01:32:03.165991 7fb61521a800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-12 01:32:03.167009 7fb61521a800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-12 01:32:03.170097 7fb61521a800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-12 01:32:03.170530 7fb61521a800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-12 01:32:03.170643 7fb61521a800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-12 01:32:03.170657 7fb61521a800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-12 01:32:03.171059 7fb61521a800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-12 01:32:03.193741 7fb61521a800 -1  ** ERROR: osd init failed: (22) Invalid argument
Trying to attack the down+peering issue.  Seems like the same problem as above.  Any way around this one?  A lot of these say "last acting [0]".  Should it matter if I grab from another OSD?

# ceph-objectstore-tool --op export --pgid 0.2c --data-path /var/lib/ceph/osd/ceph-0/ --journal-path /var/lib/ceph/osd/ceph-0/journal --file 0.2c.export.0
Failure to read OSD superblock: (2) No such file or directory

Regards,
Hong

 

On Tuesday, September 12, 2017 10:04 AM, hjcho616  
wrote:
 

 Thank you for those references!  I'll have to go study some more.  Good 
portion of that inconsistent seems to be from missing data from osd.0. =P  
There appears to be some from okay drives. =P  Kicked off "ceph pg repair pg#" 
few times, but doesn't seem to change much yet. =P  As far as smart output 
goes, they are showing status of PASS for all of them.  and all 
current_pending_sector is 0. =)  There are some Raw_Read_Error_Rate with low 
numbers.. like 2 or 6, but some are huge numbers (Seagate drives do this?) and 
they are not being flagged.  =P  Seek Error seems to be the same... Samsung 
drives show 0 while Seagate drives show huge numbers. =P  Even the new ones.  
Is there any particular one I should be concentrated on for the smart?
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
Thank you for those references!  I'll have to go study some more.  Good portion 
of that inconsistent seems to be from missing data from osd.0. =P  There 
appears to be some from okay drives. =P  Kicked off "ceph pg repair pg#" few 
times, but doesn't seem to change much yet. =P  As far as smart output goes, 
they are showing status of PASS for all of them.  and all 
current_pending_sector is 0. =)  There are some Raw_Read_Error_Rate with low 
numbers.. like 2 or 6, but some are huge numbers (Seagate drives do this?) and 
they are not being flagged.  =P  Seek Error seems to be the same... Samsung 
drives show 0 while Seagate drives show huge numbers. =P  Even the new ones.  
Is there any particular one I should be concentrated on for the smart?
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0

# cat /etc/ceph/ceph.conf
[global]
#fsid = 383ef3b1-ba70-43e2-8294-fb2fc2fb6f6a
fsid = 9b2c9bca-112e-48b0-86fc-587ef9a52948
mon_initial_members = MDS1
mon_host = 192.168.1.20
#auth_cluster_required = cephx
#auth_service_required = cephx
#auth_client_required  = cephx
auth_cluster_required = none
auth_service_required = none
auth_client_required  = none
filestore_xattr_use_omap = true
public network = 192.168.1.0/24
cluster_network = 192.168.2.0/24
osd_client_op_priority = 63
osd_recovery_op_priority = 1
osd_max_backfills = 5
osd_recovery_max_active = 5

# ceph osd df
ID WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE  VAR  PGS
 1 1.95250  1.0    1862G   797G 1064G 42.84 0.97  66
 0 1.95250    0        0      0     0  -nan -nan  16
 7 0.31239  1.0     297G 41438M  257G 13.58 0.31   3
 6 1.95250  1.0    1862G   599G 1262G 32.21 0.73  48
 2 1.95250  1.0    1862G   756G 1105G 40.63 0.92  59
 3 1.95250    0        0      0     0  -nan -nan   0
 4 1.95250    0        0      0     0  -nan -nan   0
 5 1.95250    0        0      0     0  -nan -nan   0
 8 1.95250    0        0      0     0  -nan -nan   0
 9 0.31239  1.0     297G   168M  297G  0.06 0.00   2
10 1.81360  1.0    1857G   792G 1064G 42.67 0.96  59
11 1.81360  1.0    1857G  1398G  458G 75.32 1.70 116
              TOTAL 9896G  4386G 5510G 44.32
MIN/MAX VAR: 0.00/1.70  STDDEV: 24.00
Thank you!
Regards,Hong
 

On Tuesday, September 12, 2017 3:07 AM, Ronny Aasen 
 wrote:
 

you can start by posting more details. at least
"ceph osd tree" "cat ceph.conf" and "ceph osd df" so we can see what 
settings you are running, and how your cluster is balanced at the moment.

generally:

inconsistent pg's are pg's that have scrub errors. use rados 
list-inconsistent-pg [pool] and rados list-inconsistent-obj [pg] to 
locate the objects with problems. compare and fix the objects using info 
from 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
 
also read http://ceph.com/geen-categorie/ceph-manually-repair-object/


since you have so many scrub errors i would assume there are more bad 
disks, check all disk's smart values and look for read errors in logs.
if you find any you should drain those disks by setting crush weight to 
0. and  when they are empty remove them from the cluster. personally i 
use smartmontools it sends me emails about bad disks, and check disks 
manually with    smartctl -a /dev/sda || echo bad-disk: $?


pg's that are down+peering need to have one of the acting osd's started 
again. or to have the objects recovered using the methods we have 
discussed previously.
ref: 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure

nb: do not mark any osd's as lost since that = dataloss.


I would
- check smart stats of all disks.  drain disks that are going bad. make 
sure you have enough space on good disks to drain them properly.
- check scrub errors and objects. fix those that are fixable. some may 
require an object from a down osd.
- try to get down osd's running again if possible. if you manage to get 
one running, let it recover and stabilize.
- recover and inject objects from osd's that do not run. start by doing 
one and one pg. and once you get the hang of the method you can do 
multiple pg's at the same time.


good luck

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread Ronny Aasen

you can start by posting more details. at least
"ceph osd tree" "cat ceph.conf" and "ceph osd df" so we can see what 
settings you are running, and how your cluster is balanced at the moment.


generally:

inconsistent pg's are pg's that have scrub errors. use rados 
list-inconsistent-pg [pool] and rados list-inconsistent-obj [pg] to 
locate the objects with problems. compare and fix the objects using info 
from 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent 
also read http://ceph.com/geen-categorie/ceph-manually-repair-object/
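A minimal sketch of that workflow, assuming jewel-era commands, a pool named rbd and pg 2.1a purely as examples:

rados list-inconsistent-pg rbd                          # pgs in this pool with scrub errors
rados list-inconsistent-obj 2.1a --format=json-pretty   # which objects/shards differ, and how
ceph pg repair 2.1a                                     # only after deciding which copy is the good one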



since you have so many scrub errors i would assume there are more bad 
disks, check all disk's smart values and look for read errors in logs.
if you find any you should drain those disks by setting crush weight to 
0. and  when they are empty remove them from the cluster. personally i 
use smartmontools it sends me emails about bad disks, and check disks 
manually with smartctl -a /dev/sda || echo bad-disk: $?
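A hedged sketch of that check-and-drain routine (the device names and osd id are examples only; smartmontools must be installed):

for dev in /dev/sd[a-d]; do
  smartctl -H -A "$dev" || echo "bad-disk: $dev"   # overall health verdict plus raw attributes
done
ceph osd crush reweight osd.3 0                    # drain a suspect osd; remove it once it is empty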



pg's that are down+peering need to have one of the acting osd's started 
again. or to have the objects recovered using the methods we have 
discussed previously.
ref: 
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure


nb: do not mark any osd's as lost since that = dataloss.


I would
- check smart stats of all disks.  drain disks that are going bad. make 
sure you have enough space on good disks to drain them properly.
- check scrub errors and objects. fix those that are fixable. some may 
require an object from a down osd.
- try to get down osd's running again if possible. if you manage to get 
one running, let it recover and stabilize.
- recover and inject objects from osd's that do not run. start by doing 
one and one pg. and once you get the hang of the method you can do 
multiple pg's at the same time.



good luck
Ronny Aasen



On 11. sep. 2017 06:51, hjcho616 wrote:
It took a while.  It appears to have cleaned up quite a bit... but still 
has issues.  I've been seeing below message for more than a day and cpu 
utilization and io utilization is low... looks like something is 
stuck...  I rebooted OSDs several times when it looked like it was stuck 
earlier and it would work on something else, but now it is not changing 
much.  What can I try now?


Regards,
Hong

# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs 
degraded; 6 pgs down; 11 pgs inconsistent; 6 pgs peering; 6 pgs 
recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 
16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 
pgs undersized; 1 requests are blocked > 32 sec; 1 osds have slow 
requests; recovery 221990/4503980 objects degraded (4.929%); recovery 
147/2251990 unfound (0.007%); 95 scrub errors; mds cluster is degraded; 
no legacy OSD present but 'sortbitwise' flag is not set
pg 0.e is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.d is stuck inactive since forever, current state down+peering, last 
acting [11,2]
pg 1.28 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 0.29 is stuck inactive since forever, current state down+peering, 
last acting [11,6]
pg 1.2b is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.2c is stuck inactive since forever, current state down+peering, 
last acting [1,11]
pg 0.e is stuck unclean since forever, current state down+peering, last 
acting [11,2]
pg 0.a is stuck unclean for 1233182.248198, current state 
stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck unclean for 1238044.714421, current state 
stale+active+undersized+degraded, last acting [0]
pg 2.1a is stuck unclean for 1238933.203920, current state 
active+recovering+degraded, last acting [2,11]
pg 2.3 is stuck unclean for 1238882.443876, current state 
stale+active+undersized+degraded, last acting [0]
pg 2.27 is stuck unclean for 1295260.765981, current state 
active+recovering+degraded, last acting [11,6]
pg 0.d is stuck unclean for 1230831.504001, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.c is stuck unclean for 1238044.715698, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.3d is stuck unclean for 1232066.572856, current state 
stale+active+undersized+degraded, last acting [0]
pg 1.28 is stuck unclean since forever, current state down+peering, last 
acting [11,6]
pg 0.29 is stuck unclean since forever, current state down+peering, last 
acting [11,6]
pg 1.2b is stuck unclean since forever, current state down+peering, last 
acting [1,11]
pg 2.2f is stuck unclean for 1238127.474088, current state 
active+recovering+degraded+remapped, last acting [9,10]
pg 0.0 is stuck unclean for 1233182.247776, current state 
stale+active+undersized+degraded, last acting [0]
pg 0.2c is stuck unclean since forever, current state down+peering, last 
acting 

Re: [ceph-users] Power outages!!! help!

2017-09-10 Thread hjcho616
It took a while.  It appears to have cleaned up quite a bit... but still has 
issues.  I've been seeing below message for more than a day and cpu utilization 
and io utilization is low... looks like something is stuck...  I rebooted OSDs 
several times when it looked like it was stuck earlier and it would work on 
something else, but now it is not changing much.  What can I try now?
Regards,Hong
# ceph health detail
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 22 pgs degraded; 6 pgs down; 11 pgs inconsistent; 6 pgs peering; 6 pgs recovering; 16 pgs stale; 22 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 28 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are blocked > 32 sec; 1 osds have slow requests; recovery 221990/4503980 objects degraded (4.929%); recovery 147/2251990 unfound (0.007%); 95 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
pg 0.e is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.d is stuck inactive since forever, current state down+peering, last acting [11,2]
pg 1.28 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 0.29 is stuck inactive since forever, current state down+peering, last acting [11,6]
pg 1.2b is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.2c is stuck inactive since forever, current state down+peering, last acting [1,11]
pg 0.e is stuck unclean since forever, current state down+peering, last acting [11,2]
pg 0.a is stuck unclean for 1233182.248198, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck unclean for 1238044.714421, current state stale+active+undersized+degraded, last acting [0]
pg 2.1a is stuck unclean for 1238933.203920, current state active+recovering+degraded, last acting [2,11]
pg 2.3 is stuck unclean for 1238882.443876, current state stale+active+undersized+degraded, last acting [0]
pg 2.27 is stuck unclean for 1295260.765981, current state active+recovering+degraded, last acting [11,6]
pg 0.d is stuck unclean for 1230831.504001, current state stale+active+undersized+degraded, last acting [0]
pg 1.c is stuck unclean for 1238044.715698, current state stale+active+undersized+degraded, last acting [0]
pg 1.3d is stuck unclean for 1232066.572856, current state stale+active+undersized+degraded, last acting [0]
pg 1.28 is stuck unclean since forever, current state down+peering, last acting [11,6]
pg 0.29 is stuck unclean since forever, current state down+peering, last acting [11,6]
pg 1.2b is stuck unclean since forever, current state down+peering, last acting [1,11]
pg 2.2f is stuck unclean for 1238127.474088, current state active+recovering+degraded+remapped, last acting [9,10]
pg 0.0 is stuck unclean for 1233182.247776, current state stale+active+undersized+degraded, last acting [0]
pg 0.2c is stuck unclean since forever, current state down+peering, last acting [1,11]
pg 2.b is stuck unclean for 1238044.640982, current state stale+active+undersized+degraded, last acting [0]
pg 1.1b is stuck unclean for 1234021.660986, current state stale+active+undersized+degraded, last acting [0]
pg 0.1c is stuck unclean for 1232574.189549, current state stale+active+undersized+degraded, last acting [0]
pg 1.4 is stuck unclean for 1293624.075753, current state stale+active+undersized+degraded, last acting [0]
pg 0.5 is stuck unclean for 1237356.776788, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.1f is stuck unclean for 8825246.729513, current state active+recovering+degraded, last acting [10,2]
pg 1.d is stuck unclean since forever, current state down+peering, last acting [11,2]
pg 2.39 is stuck unclean for 1238933.214406, current state stale+active+undersized+degraded, last acting [0]
pg 1.3a is stuck unclean for 2125299.164204, current state stale+active+undersized+degraded, last acting [0]
pg 0.3b is stuck unclean for 1233432.895409, current state stale+active+undersized+degraded, last acting [0]
pg 2.3c is stuck unclean for 1238933.208648, current state active+recovering+degraded, last acting [10,2]
pg 2.35 is stuck unclean for 1295260.753354, current state active+recovering+degraded, last acting [11,6]
pg 1.9 is stuck unclean for 1238044.722811, current state stale+active+undersized+degraded, last acting [0]
pg 0.a is stuck undersized for 1229917.081228, current state stale+active+undersized+degraded+inconsistent, last acting [0]
pg 2.8 is stuck undersized for 1229917.081016, current state stale+active+undersized+degraded, last acting [0]
pg 2.b is stuck undersized for 1229917.068181, current state stale+active+undersized+degraded, last acting [0]
pg 1.9 is stuck undersized for 1229917.075164, current state stale+active+undersized+degraded, last acting [0]
pg 0.5 is stuck undersized for 1229917.085330, current state stale+active+undersized+degraded+inconsistent, 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Hmm.. I hope I don't really need anything from osd.0. =P

# ceph-objectstore-tool --op export --pgid 2.35 --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.35.export
Failure to read OSD superblock: (2) No such file or directory
# ceph-objectstore-tool --op export --pgid 2.2f --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.2f.export
Failure to read OSD superblock: (2) No such file or directory
Regards,Hong 
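Both exports fail before touching any pg data because the tool cannot read the OSD superblock; a quick hedged sanity check on a filestore data directory is simply:

ls -l /var/lib/ceph/osd/ceph-0/superblock              # present and non-empty on a healthy filestore OSD
hexdump -C /var/lib/ceph/osd/ceph-0/superblock | head  # rough check that it was not zeroed or truncated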

On Monday, September 4, 2017 2:29 AM, hjcho616  wrote:
 

 Ronny,
While letting the cluster replicate (looks like this might take a while), I decided to look into where those pgs are missing.  From "ceph health detail" I found the pgs that are unfound, then found the directories that have those pgs, pasted to the right of each detail message below:

pg 2.35 is active+recovering+degraded, acting [11,6], 29 unfound, ceph-0/current/2.35_head, ceph-8/current/2.35_head
pg 2.2f is active+recovery_wait+degraded+remapped, acting [9,10], 24 unfound, ceph-0/current/2.2f_head, ceph-2/current/2.2f_head
pg 2.27 is active+recovery_wait+degraded, acting [11,6], 19 unfound, ceph-4/current/2.27_head
pg 2.1a is active+recovery_wait+degraded, acting [2,11], 29 unfound, ceph-0/current/2.1a_head, ceph-3/current/2.1a_head, ceph-4/current/2.1a_head
pg 2.1f is active+recovery_wait+degraded, acting [10,2], 20 unfound, ceph-3/current/2.1f_head, ceph-4/current/2.1f_head
pg 2.3c is active+recovery_wait+degraded, acting [10,2], 26 unfound, ceph-0/current/2.3c_head, ceph-4/current/2.3c_head

Basically, I just went to look at the pg directories with rb.* files in them.  I noticed that there is more than one of those directories throughout the osds.  Should it matter which one of them I export?  Or do I need both?  Since all of them can be found outside of ceph-0, I'll probably grab from non-ceph-0 OSDs if I can grab from any.

One strange thing I notice is pg 2.2f: it has rb.* files on the active node ceph-2 but is still marked unfound?  Maybe that means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this?
Regards,Hong 

On Monday, September 4, 2017 1:20 AM, hjcho616  wrote:
 

 Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Ronny,
While letting the cluster replicate (looks like this might take a while), I decided to look into where those pgs are missing.  From "ceph health detail" I found the pgs that are unfound, then found the directories that have those pgs, pasted to the right of each detail message below:

pg 2.35 is active+recovering+degraded, acting [11,6], 29 unfound, ceph-0/current/2.35_head, ceph-8/current/2.35_head
pg 2.2f is active+recovery_wait+degraded+remapped, acting [9,10], 24 unfound, ceph-0/current/2.2f_head, ceph-2/current/2.2f_head
pg 2.27 is active+recovery_wait+degraded, acting [11,6], 19 unfound, ceph-4/current/2.27_head
pg 2.1a is active+recovery_wait+degraded, acting [2,11], 29 unfound, ceph-0/current/2.1a_head, ceph-3/current/2.1a_head, ceph-4/current/2.1a_head
pg 2.1f is active+recovery_wait+degraded, acting [10,2], 20 unfound, ceph-3/current/2.1f_head, ceph-4/current/2.1f_head
pg 2.3c is active+recovery_wait+degraded, acting [10,2], 26 unfound, ceph-0/current/2.3c_head, ceph-4/current/2.3c_head

Basically, I just went to look at the pg directories with rb.* files in them.  I noticed that there is more than one of those directories throughout the osds.  Should it matter which one of them I export?  Or do I need both?  Since all of them can be found outside of ceph-0, I'll probably grab from non-ceph-0 OSDs if I can grab from any.

One strange thing I notice is pg 2.2f: it has rb.* files on the active node ceph-2 but is still marked unfound?  Maybe that means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this?
Regards,Hong 
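That directory hunt can be scripted; a rough sketch, assuming filestore OSDs mounted under /var/lib/ceph/osd and the unfound pg ids listed above:

for pg in 2.35 2.2f 2.27 2.1a 2.1f 2.3c; do
  echo "== $pg =="
  ls -d /var/lib/ceph/osd/ceph-*/current/${pg}_head 2>/dev/null   # every local copy of this pg's head directory
done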

On Monday, September 4, 2017 1:20 AM, hjcho616  wrote:
 

 Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery drive, 
and install it as a new osd to the cluster.
 
 
 
 kind regards
 Ronny Aasen
 
 
 On 03.09.2017 06:20, hjcho616 wrote:
  
  I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation? 
  2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade 2017-09-01 22:27:35.590149 

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
Thank you Ronny.  I've added two OSDs to OSD2, 2TB each.  I hope that would be 
enough. =)  I've changed min_size and size to 2.  OSDs are busy balancing 
again.  I'll try those you recommended and will get back to you with more 
questions! =) 
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1      up  1.0          1.0
 0  1.95250         osd.0    down        0          1.0
 7  0.31239         osd.7      up  1.0          1.0
 6  1.95250         osd.6      up  1.0          1.0
 2  1.95250         osd.2      up  1.0          1.0
-3 11.74959     host OSD2
 3  1.95250         osd.3    down        0          1.0
 4  1.95250         osd.4    down        0          1.0
 5  1.95250         osd.5    down        0          1.0
 8  1.95250         osd.8    down        0          1.0
 9  0.31239         osd.9      up  1.0          1.0
10  1.81360         osd.10     up  1.0          1.0
11  1.81360         osd.11     up  1.0          1.0
Regards,Hong 

On Sunday, September 3, 2017 6:56 AM, Ronny Aasen 
 wrote:
 

  I would not even attempt to connect a recovered drive to ceph, especially not 
one that have had xfs errors and corruption.  
 
your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set 
 osd crush chooseleaf type = 0 
 to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal as much 
as possible (no misplaced or degraded objects)  this require that ceph have 
space for the recovery. 
 i would run with size=2 min_size=2  
 
you should also look at the 7 scrub errors. they indicate that there can be 
other drives with issues, you want to locate where those inconsistent objects 
are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
 then you would sit with the 103 unfound objects, and those you should try to 
recover from the recovered drive. 
 by using the ceph-objectstore-tool export/import  to try and export pg's 
missing objects  to a dedicated temporary added import drive.
 the import drive does not need to be very large. since you can do one and one 
pg at the time. and you should only recover pg's that contain unfound objects. 
there are really only 103 unfound objects that you need to recover. 
 once the recovery is complete you can wipe the functioning recovery drive, 
and install it as a new osd to the cluster.
 
 
 
 kind regards
 Ronny Aasen
 
 
 On 03.09.2017 06:20, hjcho616 wrote:
  
  I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation? 
  2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade 2017-09-01 22:27:35.590149 
7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find 
#-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory 
  Regards, Hong 
 
  On Friday, September 1, 2017 11:10 PM, hjcho616  
wrote:
  
 
 Just realized there is a file called superblock in the ceph directory.  
ceph-1 and ceph-2's superblock file is identical, ceph-6 and ceph-7 are 
identical, but not between the two groups.   When I originally created the 
OSDs, I created ceph-0 through 5.  Can superblock file be copied over from 
ceph-1 to ceph-0? 
  Hmm.. it appears to be doing something in the background even though osd.0 is 
down.  ceph health output is changing! # ceph health HEALTH_ERR 40 pgs are 
stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs 
degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 
pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 
30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs 
undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects 
degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); 
recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; 
no legacy OSD present but  'sortbitwise' flag is not set 
  Regards, Hong 
 
   On Friday, September 1, 2017 10:37 PM, hjcho616  
wrote:
  
 
 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.   Below is the log.  What can I do about this? 
  2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 
(ceph:ceph) 2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 

Re: [ceph-users] Power outages!!! help!

2017-09-03 Thread Ronny Aasen
I would not even attempt to connect a recovered drive to ceph, 
especially not one that have had xfs errors and corruption.


your pg's that are undersized lead me to believe you still need to either 
expand, with more disks, or nodes. or that you need to set


osd crush chooseleaf type = 0

to let ceph pick 2 disks on the same node as a valid object placement.  
(temporary until you get 2 balanced nodes) generally let ceph selfheal 
as much as possible (no misplaced or degraded objects)  this require 
that ceph have space for the recovery.

i would run with size=2 min_size=2
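Spelled out, those two suggestions look roughly like this; the pool name rbd is only an example, and on an already-running cluster the chooseleaf change usually has to be made in the CRUSH rule itself rather than just in ceph.conf:

ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 2
# temporary, in ceph.conf, so both replicas may land on the same host:
#   osd crush chooseleaf type = 0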

you should also look at the 7 scrub errors. they indicate that there can 
be other drives with issues, you want to locate where those inconsistent 
objects are, and fix them. read this page about fixing scrub errors. 
http://ceph.com/geen-categorie/ceph-manually-repair-object/


then you would sit with the 103 unfound objects, and those you should 
try to recover from the recovered drive.
by using the ceph-objectstore-tool export/import to try and 
export pg's missing objects  to a dedicated temporary added import drive.
the import drive does not need to be very large. since you can do one 
and one pg at the time. and you should only recover pg's that contain 
unfound objects. there are really only 103 unfound objects that you need 
to recover.
once the recovery is complete you can wipe the functioning recovery 
drive, and install it as a new osd to the cluster.
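One hedged way to limit the work to pgs that actually have unfound objects is to pull their ids out of ceph health detail before exporting; this is only a sketch, using the recovered osd.0 data/journal paths seen elsewhere in this thread:

ceph health detail | awk '$1 == "pg" && /unfound/ {print $2}' | sort -u > unfound_pgs.txt
while read -r pg; do
  ceph-objectstore-tool --op export --pgid "$pg" \
    --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --skip-journal-replay --file "$pg.export"
done < unfound_pgs.txt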




kind regards
Ronny Aasen


On 03.09.2017 06:20, hjcho616 wrote:
I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It 
still fails with the same error message.  Looks like when I updated to 
10.2.9, some osd needs to be updated and that process is not finding 
the data it needs?  What can I do about this situation?


2017-09-01 22:27:35.590041 7f68837e5800  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 
filestore(/var/lib/ceph/osd/ceph-0) could not find 
#-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory


Regards,
Hong


On Friday, September 1, 2017 11:10 PM, hjcho616  
wrote:



Just realized there is a file called superblock in the ceph directory. 
 ceph-1 and ceph-2's superblock file is identical, ceph-6 and ceph-7 
are identical, but not between the two groups.  When I originally 
created the OSDs, I created ceph-0 through 5.  Can superblock file be 
copied over from ceph-1 to ceph-0?


Hmm.. it appears to be doing something in the background even though 
osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs 
backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 
pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 
pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs 
stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests 
are blocked > 32 sec; recovery 221826/2473662 objects degraded 
(8.968%); recovery 254711/2473662 objects misplaced (10.297%); 
recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is 
degraded; no legacy OSD present but 'sortbitwise' flag is not set


Regards,
Hong


On Friday, September 1, 2017 10:37 PM, hjcho616  
wrote:



Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?


2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 
(ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 
(2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty 
--pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config 
option
2017-09-01 22:27:27.647091 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize 
is disabled by conf

2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054

2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053

2017-09-01 

Re: [ceph-users] Power outages!!! help!

2017-09-02 Thread hjcho616
I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that 
superblock file is the same.  I copied it over and started OSD.  It still fails 
with the same error message.  Looks like when I updated to 10.2.9, some osd 
needs to be updated and that process is not finding the data it needs?  What 
can I do about this situation?
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
Regards,Hong 

On Friday, September 1, 2017 11:10 PM, hjcho616  wrote:
 

 Just realized there is a file called superblock in the ceph directory.  ceph-1 
and ceph-2's superblock file is identical, ceph-6 and ceph-7 are identical, but 
not between the two groups.  When I originally created the OSDs, I created 
ceph-0 through 5.  Can superblock file be copied over from ceph-1 to ceph-0?
Hmm.. it appears to be doing something in the background even though osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
Regards,Hong 

On Friday, September 1, 2017 10:37 PM, hjcho616  wrote:
 

 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1  ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% 

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Just realized there is a file called superblock in the ceph directory.  ceph-1 
and ceph-2's superblock file is identical, ceph-6 and ceph-7 are identical, but 
not between the two groups.  When I originally created the OSDs, I created 
ceph-0 through 5.  Can superblock file be copied over from ceph-1 to ceph-0?
Hmm.. it appears to be doing something in the background even though osd.0 is down.  ceph health output is changing!

# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Regards,
Hong

On Friday, September 1, 2017 10:37 PM, hjcho616  wrote:
 

 Tried connecting recovered osd.  Looks like some of the files in the 
lost+found are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1 ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% /var/lib/ceph/osd/ceph-0
/dev/sde1      1952559676 640365952 1312193724  33% /var/lib/ceph/osd/ceph-6
/dev/sdd1      1952559676 712018768 1240540908  37% /var/lib/ceph/osd/ceph-2
/dev/sdc1      1952559676 755827440 1196732236  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42538060  269879500  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0

# cd /var/lib/ceph/osd/ceph-0
# ls
activate.monmap  current  journal_uuid  magic          superblock  whoami
active           fsid     keyring       ready          sysvinit
ceph_fsid        journal  lost+found    store_version  type

Regards,
Hong

On Friday, September 1, 2017 2:59 PM, hjcho616  

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Tried connecting recovered osd.  Looks like some of the files in the lost+found 
are super blocks.  Below is the log.  What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1 ** ERROR: osd init failed: (22) Invalid argument

Recovered drive is mounted on /var/lib/ceph/osd/ceph-0.

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9172    1575608   1% /run
/dev/sda1        15247760   9319048    5131120  65% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1      1952559676 634913968 1317645708  33% /var/lib/ceph/osd/ceph-0
/dev/sde1      1952559676 640365952 1312193724  33% /var/lib/ceph/osd/ceph-6
/dev/sdd1      1952559676 712018768 1240540908  37% /var/lib/ceph/osd/ceph-2
/dev/sdc1      1952559676 755827440 1196732236  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42538060  269879500  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0

# cd /var/lib/ceph/osd/ceph-0
# ls
activate.monmap  current  journal_uuid  magic          superblock  whoami
active           fsid     keyring       ready          sysvinit
ceph_fsid        journal  lost+found    store_version  type

Regards,
Hong

On Friday, September 1, 2017 2:59 PM, hjcho616  wrote:
 

 Found the partition, wasn't able to mount the partition right away... Did a 
xfs_repair on that drive.  
Got bunch of messages like this.. =(

entry "10a89fd.__head_AE319A25__0" in shortform directory 845908970 references non-existent inode 605294241
junking entry "10a89fd.__head_AE319A25__0" in directory inode 845908970

Was able to mount.  lost+found has lots of files there. =P  Running du seems to show OK files in current directory.
Will it be safe to attach this one back to the cluster?  Is there a way to specify to use this drive if the data is missing? =)  Or am I being paranoid?  Just plug it? =)

Regards,
Hong
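For anyone following along, the repair-and-inspect pass described above is roughly this (a sketch only; the device, partition and mount point are examples, and no OSD daemon should be running against it at the time):

# xfs_repair /dev/sdc1
# mount /dev/sdc1 /var/lib/ceph/osd/ceph-0
# ls /var/lib/ceph/osd/ceph-0/lost+found | head
# du -sh /var/lib/ceph/osd/ceph-0/current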

On Friday, September 1, 2017 9:01 AM, hjcho616  wrote:
 

Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Found the partition, wasn't able to mount the partition right away... Did a 
xfs_repair on that drive.  
Got bunch of messages like this.. =(

entry "10a89fd.__head_AE319A25__0" in shortform directory 845908970 references non-existent inode 605294241
junking entry "10a89fd.__head_AE319A25__0" in directory inode 845908970

Was able to mount.  lost+found has lots of files there. =P  Running du seems to show OK files in current directory.
Will it be safe to attach this one back to the cluster?  Is there a way to specify to use this drive if the data is missing? =)  Or am I being paranoid?  Just plug it? =)

Regards,
Hong

On Friday, September 1, 2017 9:01 AM, hjcho616  wrote:
 

Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
  rescued:    2000 GB,      errors:        1,  remaining time:         n/a
percent rescued:  99.99%      time since last successful read:         39s
Finished

Still missing partition in the new drive. =P  I found this util called testdisk for broken partition tables.  Will try that tonight. =P

Regards,
Hong
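For reference, with the map file ('./rescue.log' above) ddrescue can be re-run to retry just the one bad area, and testdisk can then scan the copy for the lost partition table (a sketch; device names are examples and -r3 simply means three retry passes):

# ddrescue -f -r3 /dev/sda /dev/sdc ./rescue.log
# testdisk /log /dev/sdc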
 

On Wednesday, August 30, 2017 9:18 AM, Ronny Aasen 
 wrote:
 

  On 30.08.2017 15:32, Steve Taylor wrote:
  
 
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools. 
 i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

 this page has some details on the differences between dd and the ddrescue variants:
 http://www.toad.com/gnu/sysadmin/index.html#ddrescue
 
 kind regards
 Ronny Aasen
 
 
 
 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

  
 On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote: 
 On 29-8-2017 19:12, Steve Taylor wrote:

Hong,

Probably your best chance at recovering any data without special, expensive, forensic procedures is to perform a dd from /dev/sdb to somewhere else large enough to hold a full disk image and attempt to repair that. You'll want to use 'conv=noerror' with your dd command since your disk is failing. Then you could either re-attach the OSD from the new source or attempt to retrieve objects from the filestore on it.

Like somebody else already pointed out: in problem cases like this disk, use dd_rescue. It has really a far better chance of restoring a copy of your disk.
--WjW

I have actually done this before by creating an RBD that matches the disk size, performing the dd, running xfs_repair, and eventually adding it back to the cluster as an OSD. RBDs as OSDs is certainly a temporary arrangement for repair only, but I'm happy to report that it worked flawlessly in my case. I was able to weight the OSD to 0, offload all of its data, then remove it for a full recovery, at which point I just deleted the RBD. The possibilities afforded by Ceph inception are endless. ☺

Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:

Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out of them
- more battery is overpowered for your application the longer it will survive.

Get your self a LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …

On 28 Aug 2017, at 23:10, hjcho616  wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  Ones that fit the UPS original battery spec didn't last very long... part of the reason why I gave up on them.. =P  My wife probably won't like the idea of car battery hanging

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
Looks like it has been rescued... Only 1 error as we saw before in the smart log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:    1508 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1508 GB, non-scraped:        0 B,  average rate:  88985 kB/s
non-tried:        0 B,     errsize:     4096 B,      run time:  6h 14m 40s
  rescued:    2000 GB,      errors:        1,  remaining time:         n/a
percent rescued:  99.99%      time since last successful read:         39s
Finished

Still missing partition in the new drive. =P  I found this util called testdisk for broken partition tables.  Will try that tonight. =P

Regards,
Hong
 

On Wednesday, August 30, 2017 9:18 AM, Ronny Aasen 
 wrote:
 

  On 30.08.2017 15:32, Steve Taylor wrote:
  
 
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools. 
 i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

 this page has some details on the differences between dd and the ddrescue variants:
 http://www.toad.com/gnu/sysadmin/index.html#ddrescue
 
 kind regards
 Ronny Aasen
 
 
 
 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

  
 On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote: 
 On 29-8-2017 19:12, Steve Taylor wrote:

Hong,

Probably your best chance at recovering any data without special, expensive, forensic procedures is to perform a dd from /dev/sdb to somewhere else large enough to hold a full disk image and attempt to repair that. You'll want to use 'conv=noerror' with your dd command since your disk is failing. Then you could either re-attach the OSD from the new source or attempt to retrieve objects from the filestore on it.

Like somebody else already pointed out: in problem cases like this disk, use dd_rescue. It has really a far better chance of restoring a copy of your disk.
--WjW

I have actually done this before by creating an RBD that matches the disk size, performing the dd, running xfs_repair, and eventually adding it back to the cluster as an OSD. RBDs as OSDs is certainly a temporary arrangement for repair only, but I'm happy to report that it worked flawlessly in my case. I was able to weight the OSD to 0, offload all of its data, then remove it for a full recovery, at which point I just deleted the RBD. The possibilities afforded by Ceph inception are endless. ☺

Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:

Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out of them
- more battery is overpowered for your application the longer it will survive.

Get your self a LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …

On 28 Aug 2017, at 23:10, hjcho616  wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  Ones that fit the UPS original battery spec didn't last very long... part of the reason why I gave up on them.. =P  My wife probably won't like the idea of car battery hanging out though ha!

The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard doesn't have any additional SATA connectors available.  Would it be safe to add another OSD host?

Regards,
Hong

On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:

Sorry for being brutal … anyway
1. get the battery for UPS ( a car battery will do as well, I’ve moded on ups in the past with truck battery and it was working like a charm :D )
2. get spare drives and put those in because your cluster CAN NOT get out of error due to lack of space
3. Follow advice of Ronny Aasen on how to recover data from hard drives
4. get cooling to drives or you will lose more !

On 28 Aug 2017, at 22:39, hjcho616  wrote:

Tomasz,
Those

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen

On 30.08.2017 15:32, Steve Taylor wrote:
I'm not familiar with dd_rescue, but I've just been reading about it. 
I'm not seeing any features that would be beneficial in this scenario 
that aren't also available in dd. What specific features give it 
"really a far better chance of restoring a copy of your disk" than dd? 
I'm always interested in learning about new recovery tools.


i see i wrote dd_rescue from old habit, but the package one should use on debian is gddrescue, also called gnu ddrescue.

this page has some details on the differences between dd and the ddrescue variants:

http://www.toad.com/gnu/sysadmin/index.html#ddrescue

kind regards
Ronny Aasen
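Worth spelling out for Debian users, since the package and the binary are named differently:

# apt-get install gddrescue
# ddrescue --version

i.e. installing the gddrescue package gives you a binary simply called ddrescue (GNU ddrescue), which is the tool used later in this thread.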







Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |


If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this 
message is prohibited.




On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote:

On 29-8-2017 19:12, Steve Taylor wrote:
Hong, Probably your best chance at recovering any data without 
special, expensive, forensic procedures is to perform a dd from 
/dev/sdb to somewhere else large enough to hold a full disk image 
and attempt to repair that. You'll want to use 'conv=noerror' with 
your dd command since your disk is failing. Then you could either 
re-attach the OSD from the new source or attempt to retrieve objects 
from the filestore on it. 



Like somebody else already pointed out
In problem "cases like disk, use dd_rescue.
It has really a far better chance of restoring a copy of your disk

--WjW

I have actually done this before by creating an RBD that matches the 
disk size, performing the dd, running xfs_repair, and eventually 
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a 
temporary arrangement for repair only, but I'm happy to report that 
it worked flawlessly in my case. I was able to weight the OSD to 0, 
offload all of its data, then remove it for a full recovery, at 
which point I just deleted the RBD. The possibilities afforded by 
Ceph inception are endless. ☺ Steve Taylor | Senior Software 
Engineer | StorageCraft Technology Corporation 380 Data Drive Suite 
300 | Draper | Utah | 84020 Office: 801.871.2799 | If you are not 
the intended recipient of this message or received it erroneously, 
please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of 
this message is prohibited. On Mon, 2017-08-28 at 23:17 +0100, 
Tomasz Kusmierz wrote:
Rule of thumb with batteries is: - more “proper temperature” you 
run them at the more life you get out of them - more battery is 
overpowered for your application the longer it will survive. Get 
your self a LSI 94** controller and use it as HBA and you will be 
fine. but get MORE DRIVES ! …
On 28 Aug 2017, at 23:10, hjcho616 > wrote: Thank you Tomasz and Ronny. 
 I'll have to order some hdd soon and try these out.  Car battery 
idea is nice!  I may try that.. =)  Do they last longer?  Ones 
that fit the UPS original battery spec didn't last very long... 
part of the reason why I gave up on them.. =P  My wife probably 
won't like the idea of car battery hanging out though ha! The OSD1 
(one with mostly ok OSDs, except that smart failure) motherboard 
doesn't have any additional SATA connectors available.  Would it 
be safe to add another OSD host? Regards, Hong On Monday, August 
28, 2017 4:43 PM, Tomasz Kusmierz  wrote: 
Sorry for being brutal … anyway 1. get the battery for UPS ( a car 
battery will do as well, I’ve moded on ups in the past with truck 
battery and it was working like a charm :D ) 2. get spare drives 
and put those in because your cluster CAN NOT get out of error due 
to lack of space 3. Follow advice of Ronny Aasen on how to recover 
data from hard drives 4. get cooling to drives or you will lose 
more !
On 28 Aug 2017, at 22:39, hjcho616 > wrote: Tomasz, Those machines are 
behind a surge protector.  Doesn't appear to be a good one!  I do 
have a UPS... but it is my fault... no battery.  Power was pretty 
reliable for a while... and UPS was just beeping every chance it 
had, disrupting some sleep.. =P  So running on surge protector 
only.  I am running this in home environment.   So far, HDD 
failures have been very rare for this environment. =)  It just 
doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not 
seeing any features that would be beneficial in this scenario that aren't also 
available in dd. What specific features give it "really a far better chance of 
restoring a copy of your disk" than dd? I'm always interested in learning about 
new recovery tools.






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote:

On 29-8-2017 19:12, Steve Taylor wrote:


Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.



Like somebody else already pointed out
In problem "cases like disk, use dd_rescue.
It has really a far better chance of restoring a copy of your disk

--WjW



I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:


Rule of thumb with batteries is:
- more “proper temperature” you run them at the more life you get out
of them
- more battery is overpowered for your application the longer it will
survive.

Get your self a LSI 94** controller and use it as HBA and you will be
fine. but get MORE DRIVES ! …


On 28 Aug 2017, at 23:10, hjcho616 
> wrote:

Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
try these out.  Car battery idea is nice!  I may try that.. =)  Do
they last longer?  Ones that fit the UPS original battery spec
didn't last very long... part of the reason why I gave up on them..
=P  My wife probably won't like the idea of car battery hanging out
though ha!

The OSD1 (one with mostly ok OSDs, except that smart failure)
motherboard doesn't have any additional SATA connectors available.
 Would it be safe to add another OSD host?

Regards,
Hong



On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:


Sorry for being brutal … anyway
1. get the battery for UPS ( a car battery will do as well, I’ve
moded on ups in the past with truck battery and it was working like
a charm :D )
2. get spare drives and put those in because your cluster CAN NOT
get out of error due to lack of space
3. Follow advice of Ronny Aasen on how to recover data from hard
drives
4. get cooling to drives or you will lose more !




On 28 Aug 2017, at 22:39, hjcho616 
> wrote:

Tomasz,

Those machines are behind a surge protector.  Doesn't appear to
be a good one!  I do have a UPS... but it is my fault... no
battery.  Power was pretty reliable for a while... and UPS was
just beeping every chance it had, disrupting some sleep.. =P  So
running on surge protector only.  I am running this in home
environment.   So far, HDD failures have been very rare for this
environment. =)  It just doesn't get loaded as much!  I am not
sure what to expect, seeing that "unfound" and just a feeling of
possibility of maybe getting OSD back made me excited about it.
=) Thanks for letting me know what should be the priority.  I
just lack experience and knowledge in this. =) Please do continue
to guide me though this.

Thank you for the decode of that smart messages!  I do agree that
looks like it is on its way out.  I would like to know how to get
good portion of it back if possible. =)

I think I 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
Yes, if I had created the RBD in the same cluster I was trying to repair then I 
would have used rbd-fuse to "map" the RBD in order to avoid potential deadlock 
issues with the kernel client. I had another cluster available, so I copied its 
config file to the osd node, created the RBD in the second cluster, and used 
the kernel client for the dd, xfs_repair, and mount. Worked like a charm.
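Put together, the whole trick as described in this thread looks roughly like the sketch below. Every name and size here is an example, the RBD lives in a second cluster reached through its own config file, and this is strictly a temporary repair arrangement:

# rbd -c /etc/ceph/second-cluster.conf create recovery/osd0-image --size 2097152
# rbd -c /etc/ceph/second-cluster.conf map recovery/osd0-image
# dd if=/dev/sdb of=/dev/rbd0 bs=1M conv=noerror
# xfs_repair /dev/rbd0
# mount /dev/rbd0 /var/lib/ceph/osd/ceph-0

From there the OSD is started normally, weighted to 0 so it drains, and once empty it is removed and the RBD deleted. If the image had to live in the same cluster, rbd-fuse rather than the kernel client would be the safer way to attach it, for the deadlock reason mentioned above.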






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Tue, 2017-08-29 at 18:04 +, David Turner wrote:

But it was absolutely awesome to run an osd off of an rbd after the disk failed.

On Tue, Aug 29, 2017, 1:42 PM David Turner 
> wrote:

To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently dangerous for a few reasons.

On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
> wrote:
Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive.
>
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! …
> > On 28 Aug 2017, at 23:10, hjcho616 
> > > wrote:
> >
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> >
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> >
> > Regards,
> > Hong
> >
> >
> >
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> >
> >
> > Sorry for being brutal … anyway
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on how to recover data from hard
> > drives
> > 4. get cooling to drives or you will lose more !
> >
> >
> > > On 28 Aug 2017, at 22:39, hjcho616 
> > > > wrote:
> > >
> > > Tomasz,
> > >
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > 

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen

[snip]

I'm not sure if I am liking what I see on fdisk... it doesn't show sdb1. 
  I hope it shows up when I run dd_rescue to other drive... =P


# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

/dev/sdb: device contains a valid 'xfs' signature, it's strongly 
recommended to wipe the device by command wipefs(8) if this setup is 
unexpected to avoid possible collisions.


Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xe684adb6.

Command (m for help): p
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xe684adb6



Command (m for help):




Do not use fdisk for osd drives. they are using the GPT partition 
structure. and depend on the GPT uuid to be correct.  So use either 
parted or gdisk/cgdisk/sgdisk  if you want to look at it.


writing a mbr partition table to the osd will break it naturally.
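For the record, the GPT-aware equivalents of that fdisk check would be along these lines (read-only; the device name is an example):

# sgdisk --print /dev/sdc
# parted /dev/sdc print

Both simply read the existing GPT instead of offering to create a new DOS disklabel the way fdisk did above.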

kind regards
Ronny Aasen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
This is what it looks like today.  Seems like the ceph-osds are sitting at 0% cpu, so all the migrations appear to be done.  Does this look ok to shut down and continue when I get the HDD on Thursday?

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 20 pgs backfill_wait; 23 pgs degraded; 6 pgs down; 2 pgs inconsistent; 6 pgs peering; 4 pgs recovering; 3 pgs recovery_wait; 16 pgs stale; 23 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 49 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 1 requests are blocked > 32 sec; recovery 221870/2473686 objects degraded (8.969%); recovery 365398/2473686 objects misplaced (14.771%); recovery 147/2251990 unfound (0.007%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

# df
Filesystem      1K-blocks      Used  Available Use% Mounted on
udev                10240         0      10240   0% /dev
tmpfs             1584780      9212    1575568   1% /run
/dev/sda1        15247760   9610208    4839960  67% /
tmpfs             3961940         0    3961940   0% /dev/shm
tmpfs                5120         0       5120   0% /run/lock
tmpfs             3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdd1      1952559676 712028032 1240531644  37% /var/lib/ceph/osd/ceph-2
/dev/sde1      1952559676 628862040 1323697636  33% /var/lib/ceph/osd/ceph-6
/dev/sdc1      1952559676 755815036 1196744640  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1       312417560  42551928  269865632  14% /var/lib/ceph/osd/ceph-7
tmpfs              792392         0     792392   0% /run/user/0
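Before powering the node down, a couple of read-only checks can confirm that recovery really has gone idle (a sketch; nothing here changes cluster state):

# ceph -s
# ceph pg dump_stuck unclean
# ceph osd df

If ceph -s shows no recovery io line and the stuck-PG list stops shrinking between runs, the cluster has done what it can with the OSDs it currently has.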
I'm not sure if I am liking what I see on fdisk... it doesn't show sdb1.  I 
hope it shows up when I run dd_rescue to other drive... =P
# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

/dev/sdb: device contains a valid 'xfs' signature, it's strongly recommended to wipe the device by command wipefs(8) if this setup is unexpected to avoid possible collisions.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xe684adb6.

Command (m for help): p
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xe684adb6


Command (m for help):
 

On Tuesday, August 29, 2017 3:29 PM, Tomasz Kusmierz 
 wrote:
 

 Maged, on second host he has 4 out of 5 OSD failed on him … I think he’s past 
the trying to increase the backfill threshold :) of course he could try to 
degrade the cluster by letting it mirror within the same host :) 

On 29 Aug 2017, at 21:26, Maged Mokhtar  wrote:

One of the things to watch out for in small clusters is OSDs can get full rather unexpectedly in recovery/backfill cases:

In your case you have 2 OSD nodes with 5 disks each. Since you have a replica of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs will have to be re-created on the same host, meaning they will be distributed only among the 4 OSDs on the same host, which will quickly bump their usage by nearly 20% each.

The default osd_backfill_full_ratio is 85%, so if any of the 4 OSDs was near 70% util before the failure, it will easily reach 85% and cause the cluster to error with the backfill_toofull message you see.  This is why i suggest you add an extra disk or try your luck raising osd_backfill_full_ratio to 92%; it may fix things.

/Maged

On 2017-08-29 21:13, hjcho616 wrote:

Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)  That does sound great.  I don't quite have another cluster so waiting for a drive to arrive! =)

After setting min and max_min to 1, looks like the toofull flag is gone... Maybe when I was making that video copy the OSDs were already down... and those two OSDs were not enough to take too much extra... and on top of it that last OSD alive was a smaller disk (2TB vs 320GB)... so it probably was filling up faster.  I should have captured that message... but turned the machine off and now I am at work. =P  When I get back home, I'll try to grab that and share.  Maybe I don't need to try to add another OSD to that cluster just yet!  OSDs are about 50% full on OSD1.

So next up, fixing osd0!

Regards,
Hong

 On Tuesday, August 29, 2017 1:05 PM, David Turner  
wrote:


But it was absolutely awesome to run an osd off of an rbd after the disk failed.
On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:
To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Tomasz Kusmierz
Maged, on second host he has 4 out of 5 OSD failed on him … I think he’s past 
the trying to increase the backfill threshold :) of course he could try to 
degrade the cluster by letting it mirror within the same host :) 
> On 29 Aug 2017, at 21:26, Maged Mokhtar  wrote:
> 
> One of the things to watch out in small clusters is OSDs can get full rather 
> unexpectedly in recovery/backfill cases:
> 
> In your case you have 2 OSD nodes with 5 disks each. Since you have a replica 
> of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs 
> will have to be re-created on the same host, meaning they will be distributed 
> only among the 4 OSDs on the same host, which will quickly bump their usage 
> by nearly 20% each.
> the default osd_backfill_full_ratio is 85% so if any of the 4 OSDs was near 
> 70% util before the failure, it will easily reach 85% and cause the cluster 
> to error with backfill_toofull message you see.  This is why i suggest you 
> add an extra disk or try your luck raising osd_backfill_full_ratio to 92%; it 
> may fix things.
> 
> /Maged
> 
> On 2017-08-29 21:13, hjcho616 wrote:
> 
>> Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =) 
>>  That does sound great.  I don't quite have another cluster so waiting for a 
>> drive to arrive! =)  
>>  
>> After setting min and max_min to 1, looks like toofull flag is gone... Maybe 
>> when I was making that video copy OSDs were already down... and those two 
>> OSDs were not enough to take too much extra...  and on top of it that last 
>> OSD alive was smaller disk (2TB vs 320GB)... so it probably was filling up 
>> faster.  I should have captured that message... but turned machine off and 
>> now I am at work. =P  When I get back home, I'll try to grab that and share. 
>>  Maybe I don't need to try to add another OSD to that cluster just yet!  
>> OSDs are about 50% full on OSD1.
>>  
>> So next up, fixing osd0!
>>  
>> Regards,
>> Hong  
>> 
>> 
>> On Tuesday, August 29, 2017 1:05 PM, David Turner  
>> wrote:
>> 
>> 
>> But it was absolutely awesome to run an osd off of an rbd after the disk 
>> failed.
>> 
>> On Tue, Aug 29, 2017, 1:42 PM David Turner > > wrote:
>> To addend Steve's success, the rbd was created in a second cluster in the 
>> same datacenter so it didn't run the risk of deadlocking that mapping rbds 
>> on machines running osds has.  It is still theoretical to work on the same 
>> cluster, but more inherently dangerous for a few reasons.
>> 
>> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor > > wrote:
>> Hong,
>> 
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
>> 
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>> 
>> The possibilities afforded by Ceph inception are endless. ☺
>> 
>> 
>> 
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 |
>> 
>> If you are not the intended recipient of this message or received it 
>> erroneously, please notify the sender and delete it, together with any 
>> attachments, and be advised that any dissemination or copying of this 
>> message is prohibited.
>> 
>> 
>> 
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> > Rule of thumb with batteries is:
>> > - more "proper temperature" you run them at the more life you get out
>> > of them
>> > - more battery is overpowered for your application the longer it will
>> > survive. 
>> >
>> > Get your self a LSI 94** controller and use it as HBA and you will be
>> > fine. but get MORE DRIVES ! ... 
>> > > On 28 Aug 2017, at 23:10, hjcho616 > > > > wrote:
>> > >
>> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
>> > > they last longer?  Ones that fit the UPS original battery spec
>> > > didn't last very long... part of the reason why I gave up on them..
>> > > =P  My wife probably won't like the idea of car 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Maged Mokhtar
One of the things to watch out in small clusters is OSDs can get full
rather unexpectedly in recovery/backfill cases: 

In your case you have 2 OSD nodes with 5 disks each. Since you have a
replica of 2, each PG will have 1 copy on each host, so if an OSD fails,
all its PGs will have to be re-created on the same host, meaning they
will be distributed only among the 4 OSDs on the same host, which will
quickly bump their usage by nearly 20% each.
the default osd_backfill_full_ratio is 85% so if any of the 4 OSDs was
near 70% util before the failure, it will easily reach 85% and cause the
cluster to error with backfill_toofull message you see.  This is why i
suggest you add an extra disk or try your luck raising
osd_backfill_full_ratio to 92%; it may fix things. 

/Maged 
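If anyone wants to try the ratio bump Maged describes, it can usually be injected at runtime (the 0.92 value is just his suggested example, and it should be put back to the default once backfill completes):

# ceph tell osd.* injectargs '--osd_backfill_full_ratio 0.92'
# ceph osd df tree

The second command is only there to keep an eye on per-OSD utilisation while the backfill runs.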

On 2017-08-29 21:13, hjcho616 wrote:

> Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)  
> That does sound great.  I don't quite have another cluster so waiting for a 
> drive to arrive! =)   
> 
> After setting min and max_min to 1, looks like toofull flag is gone... Maybe 
> when I was making that video copy OSDs were already down... and those two 
> OSDs were not enough to take too much extra...  and on top of it that last 
> OSD alive was smaller disk (2TB vs 320GB)... so it probably was filling up 
> faster.  I should have captured that message... but turned machine off and 
> now I am at work. =P  When I get back home, I'll try to grab that and share.  
> Maybe I don't need to try to add another OSD to that cluster just yet!  OSDs 
> are about 50% full on OSD1. 
> 
> So next up, fixing osd0! 
> 
> Regards, 
> Hong   
> 
> On Tuesday, August 29, 2017 1:05 PM, David Turner  
> wrote:
> 
> But it was absolutely awesome to run an osd off of an rbd after the disk 
> failed. 
> 
> On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote: 
> To addend Steve's success, the rbd was created in a second cluster in the 
> same datacenter so it didn't run the risk of deadlocking that mapping rbds on 
> machines running osds has.  It is still theoretical to work on the same 
> cluster, but more inherently dangerous for a few reasons. 
> 
> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor  
> wrote: Hong,
> 
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.
> 
> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
> 
> The possibilities afforded by Ceph inception are endless. ☺
> 
> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 |
> 
> If you are not the intended recipient of this message or received it 
> erroneously, please notify the sender and delete it, together with any 
> attachments, and be advised that any dissemination or copying of this message 
> is prohibited.
> 
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> Rule of thumb with batteries is:
>> - more "proper temperature" you run them at the more life you get out
>> of them
>> - more battery is overpowered for your application the longer it will
>> survive. 
>> 
>> Get your self a LSI 94** controller and use it as HBA and you will be
>> fine. but get MORE DRIVES ! ... 
>>> On 28 Aug 2017, at 23:10, hjcho616  wrote:
>>>
>>> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>>> try these out.  Car battery idea is nice!  I may try that.. =)  Do
>>> they last longer?  Ones that fit the UPS original battery spec
>>> didn't last very long... part of the reason why I gave up on them..
>>> =P  My wife probably won't like the idea of car battery hanging out
>>> though ha!
>>>
>>> The OSD1 (one with mostly ok OSDs, except that smart failure)
>>> motherboard doesn't have any additional SATA connectors available.
>>>  Would it be safe to add another OSD host?
>>>
>>> Regards,
>>> Hong
>>>
>>>
>>>
>>> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz >> mail.com [1]> wrote:
>>>
>>>
>>> Sorry for being brutal ... anyway 
>>> 1. get the battery for UPS ( a car battery will do as well, I've
>>> moded on ups in the past with truck battery and it was working like
>>> a 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Tomasz Kusmierz
Just FYI, setting size and min_size to 1 is a last resort in my mind - to get 
you out of dodge !! 

Before setting that you should have made your self 105% certain that all OSD 
you leave ON, have NO bad sectors or no sectors pending or no any errors of any 
kind. 

once you can mount the cephfs, just delete everything you don’t actually need. 
Trust everybody has some data that they don’t truly need … this pron 
collection that you can redownload ;) that set of iso files that you downloaded 
from ubuntu but you can download them later … it might turn out that one of 
those files will contain the missing objects and your recovery will be 
pointless. 
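For completeness, the last-resort settings Tomasz is talking about are plain pool options (the pool name is a placeholder; size 1 means a single copy, so any further disk error is unrecoverable data loss):

# ceph osd pool set <poolname> size 1
# ceph osd pool set <poolname> min_size 1

And once things are healthy again they should go back to sensible values, e.g. size 2 or 3 with min_size 2.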

> On 29 Aug 2017, at 20:49, Willem Jan Withagen  wrote:
> 
> On 29-8-2017 19:12, Steve Taylor wrote:
>> Hong,
>> 
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
> 
> Like somebody else already pointed out
> In problem cases like this disk, use dd_rescue.
> It has really a far better chance of restoring a copy of your disk
> 
> --WjW
> 
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>> 
>> The possibilities afforded by Ceph inception are endless. ☺
>> 
>> 
>> 
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 | 
>> 
>> If you are not the intended recipient of this message or received it 
>> erroneously, please notify the sender and delete it, together with any 
>> attachments, and be advised that any dissemination or copying of this 
>> message is prohibited.
>> 
>> 
>> 
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>>> Rule of thumb with batteries is:
>>> - more “proper temperature” you run them at the more life you get out
>>> of them
>>> - more battery is overpowered for your application the longer it will
>>> survive. 
>>> 
>>> Get your self a LSI 94** controller and use it as HBA and you will be
>>> fine. but get MORE DRIVES ! … 
 On 28 Aug 2017, at 23:10, hjcho616  wrote:
 
 Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
 try these out.  Car battery idea is nice!  I may try that.. =)  Do
 they last longer?  Ones that fit the UPS original battery spec
 didn't last very long... part of the reason why I gave up on them..
 =P  My wife probably won't like the idea of car battery hanging out
 though ha!
 
 The OSD1 (one with mostly ok OSDs, except that smart failure)
 motherboard doesn't have any additional SATA connectors available.
  Would it be safe to add another OSD host?
 
 Regards,
 Hong
 
 
 
 On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  wrote:
 
 
 Sorry for being brutal … anyway 
 1. get the battery for UPS ( a car battery will do as well, I’ve
 moded on ups in the past with truck battery and it was working like
 a charm :D )
 2. get spare drives and put those in because your cluster CAN NOT
 get out of error due to lack of space
 3. Follow advice of Ronny Aasen on how to recover data from hard
 drives 
 4. get cooling to drives or you will lose more ! 
 
 
> On 28 Aug 2017, at 22:39, hjcho616  wrote:
> 
> Tomasz,
> 
> Those machines are behind a surge protector.  Doesn't appear to
> be a good one!  I do have a UPS... but it is my fault... no
> battery.  Power was pretty reliable for a while... and UPS was
> just beeping every chance it had, disrupting some sleep.. =P  So
> running on surge protector only.  I am running this in home
> environment.   So far, HDD failures have been very rare for this
> environment. =)  It just doesn't get loaded as much!  I am not
> sure what to expect, seeing that "unfound" and just a feeling of
> possibility of maybe getting OSD back made me excited about it.
> =) Thanks for letting me know what should be the priority.  I
> just lack experience and knowledge in this. =) Please do continue
> to guide me though this. 
> 
> Thank you for the decode of that smart messages!  I do agree that
> 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Willem Jan Withagen
On 29-8-2017 19:12, Steve Taylor wrote:
> Hong,
> 
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.

Like somebody else already pointed out:
in problem cases like this disk, use dd_rescue.
It really has a far better chance of restoring a copy of your disk.

--WjW
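
Something along these lines works for the cloning step (a sketch only; GNU ddrescue
is shown here, which on Debian comes from the gddrescue package, and the device name,
image and map paths are assumptions):

# ddrescue -n /dev/sdb /mnt/spare/sdb.img /mnt/spare/sdb.map
# ddrescue -r3 /dev/sdb /mnt/spare/sdb.img /mnt/spare/sdb.map

The first pass with -n grabs the easy areas quickly; the second pass retries the bad
areas up to 3 times. The map file lets you stop and resume without losing progress.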

> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
> 
> The possibilities afforded by Ceph inception are endless. ☺
> 
> 
>  
> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 | 
>  
> If you are not the intended recipient of this message or received it 
> erroneously, please notify the sender and delete it, together with any 
> attachments, and be advised that any dissemination or copying of this message 
> is prohibited.
> 
>  
> 
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> Rule of thumb with batteries is:
>> - more “proper temperature” you run them at the more life you get out
>> of them
>> - more battery is overpowered for your application the longer it will
>> survive. 
>>
>> Get your self a LSI 94** controller and use it as HBA and you will be
>> fine. but get MORE DRIVES ! … 
>>> On 28 Aug 2017, at 23:10, hjcho616  wrote:
>>>
>>> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>>> try these out.  Car battery idea is nice!  I may try that.. =)  Do
>>> they last longer?  Ones that fit the UPS original battery spec
>>> didn't last very long... part of the reason why I gave up on them..
>>> =P  My wife probably won't like the idea of car battery hanging out
>>> though ha!
>>>
>>> The OSD1 (one with mostly ok OSDs, except that smart failure)
>>> motherboard doesn't have any additional SATA connectors available.
>>>  Would it be safe to add another OSD host?
>>>
>>> Regards,
>>> Hong
>>>
>>>
>>>
>>> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz >> mail.com> wrote:
>>>
>>>
>>> Sorry for being brutal … anyway 
>>> 1. get the battery for UPS ( a car battery will do as well, I’ve
>>> moded on ups in the past with truck battery and it was working like
>>> a charm :D )
>>> 2. get spare drives and put those in because your cluster CAN NOT
>>> get out of error due to lack of space
>>> 3. Follow advice of Ronny Aasen on hot to recover data from hard
>>> drives 
>>> 4 get cooling to drives or you will loose more ! 
>>>
>>>
 On 28 Aug 2017, at 22:39, hjcho616  wrote:

 Tomasz,

 Those machines are behind a surge protector.  Doesn't appear to
 be a good one!  I do have a UPS... but it is my fault... no
 battery.  Power was pretty reliable for a while... and UPS was
 just beeping every chance it had, disrupting some sleep.. =P  So
 running on surge protector only.  I am running this in home
 environment.   So far, HDD failures have been very rare for this
 environment. =)  It just doesn't get loaded as much!  I am not
 sure what to expect, seeing that "unfound" and just a feeling of
 possibility of maybe getting OSD back made me excited about it.
 =) Thanks for letting me know what should be the priority.  I
 just lack experience and knowledge in this. =) Please do continue
 to guide me though this. 

 Thank you for the decode of that smart messages!  I do agree that
 looks like it is on its way out.  I would like to know how to get
 good portion of it back if possible. =)

 I think I just set the size and min_size to 1.
 # ceph osd lspools
 0 data,1 metadata,2 rbd,
 # ceph osd pool set rbd size 1
 set pool 2 size to 1
 # ceph osd pool set rbd min_size 1
 set pool 2 min_size to 1

 Seems to be doing some backfilling work.

 # ceph health
 HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling;
 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering;
 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs
 stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101
 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked
> 32 sec; 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)
That does sound great.  I don't quite have another cluster, so I'm waiting for a
drive to arrive! =)
After setting size and min_size to 1, it looks like the toofull flag is gone... Maybe
when I was making that video copy the OSDs were already down... and those two OSDs
were not enough to take on that much extra... and on top of it the last OSD alive
was the smaller disk (2TB vs 320GB)... so it probably was filling up faster.  I
should have captured that message... but I turned the machine off and now I am at
work. =P  When I get back home, I'll try to grab that and share.  Maybe I don't
need to try to add another OSD to that cluster just yet!  OSDs are about 50%
full on OSD1.
So next up, fixing osd0!
Regards,
Hong

On Tuesday, August 29, 2017 1:05 PM, David Turner  
wrote:
 

 But it was absolutely awesome to run an osd off of an rbd after the disk 
failed.
On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:

To addend Steve's success, the rbd was created in a second cluster in the same 
datacenter so it didn't run the risk of deadlocking that mapping rbds on 
machines running osds has.  It is still theoretical to work on the same 
cluster, but more inherently dangerous for a few reasons.
On Tue, Aug 29, 2017, 1:15 PM Steve Taylor  
wrote:

Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺



Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |

If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive. 
>
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! … 
> > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> >
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> >
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> >
> > Regards,
> > Hong
> >
> >
> >
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> >
> >
> > Sorry for being brutal … anyway 
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > drives 
> > 4 get cooling to drives or you will loose more ! 
> >
> >
> > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > >
> > > Tomasz,
> > >
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > running on surge protector only.  I am running this in home
> > > environment.   So far, HDD failures have been very rare for this
> > > environment. =)  It just doesn't get loaded as much!  I am not
> > > sure what to expect, seeing that "unfound" and just a feeling of
> > > possibility of maybe getting OSD back 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread David Turner
But it was absolutely awesome to run an osd off of an rbd after the disk
failed.

On Tue, Aug 29, 2017, 1:42 PM David Turner  wrote:

> To addend Steve's success, the rbd was created in a second cluster in the
> same datacenter so it didn't run the risk of deadlocking that mapping rbds
> on machines running osds has.  It is still theoretical to work on the same
> cluster, but more inherently dangerous for a few reasons.
>
> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
> wrote:
>
>> Hong,
>>
>> Probably your best chance at recovering any data without special,
>> expensive, forensic procedures is to perform a dd from /dev/sdb to
>> somewhere else large enough to hold a full disk image and attempt to
>> repair that. You'll want to use 'conv=noerror' with your dd command
>> since your disk is failing. Then you could either re-attach the OSD
>> from the new source or attempt to retrieve objects from the filestore
>> on it.
>>
>> I have actually done this before by creating an RBD that matches the
>> disk size, performing the dd, running xfs_repair, and eventually
>> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
>> temporary arrangement for repair only, but I'm happy to report that it
>> worked flawlessly in my case. I was able to weight the OSD to 0,
>> offload all of its data, then remove it for a full recovery, at which
>> point I just deleted the RBD.
>>
>> The possibilities afforded by Ceph inception are endless. ☺
>>
>>
>>
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology
>> Corporation
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 |
>>
>> If you are not the intended recipient of this message or received it
>> erroneously, please notify the sender and delete it, together with any
>> attachments, and be advised that any dissemination or copying of this
>> message is prohibited.
>>
>>
>>
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> > Rule of thumb with batteries is:
>> > - more “proper temperature” you run them at the more life you get out
>> > of them
>> > - more battery is overpowered for your application the longer it will
>> > survive.
>> >
>> > Get your self a LSI 94** controller and use it as HBA and you will be
>> > fine. but get MORE DRIVES ! …
>> > > On 28 Aug 2017, at 23:10, hjcho616  wrote:
>> > >
>> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
>> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
>> > > they last longer?  Ones that fit the UPS original battery spec
>> > > didn't last very long... part of the reason why I gave up on them..
>> > > =P  My wife probably won't like the idea of car battery hanging out
>> > > though ha!
>> > >
>> > > The OSD1 (one with mostly ok OSDs, except that smart failure)
>> > > motherboard doesn't have any additional SATA connectors available.
>> > >  Would it be safe to add another OSD host?
>> > >
>> > > Regards,
>> > > Hong
>> > >
>> > >
>> > >
>> > > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz > > > mail.com> wrote:
>> > >
>> > >
>> > > Sorry for being brutal … anyway
>> > > 1. get the battery for UPS ( a car battery will do as well, I’ve
>> > > moded on ups in the past with truck battery and it was working like
>> > > a charm :D )
>> > > 2. get spare drives and put those in because your cluster CAN NOT
>> > > get out of error due to lack of space
>> > > 3. Follow advice of Ronny Aasen on hot to recover data from hard
>> > > drives
>> > > 4 get cooling to drives or you will loose more !
>> > >
>> > >
>> > > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
>> > > >
>> > > > Tomasz,
>> > > >
>> > > > Those machines are behind a surge protector.  Doesn't appear to
>> > > > be a good one!  I do have a UPS... but it is my fault... no
>> > > > battery.  Power was pretty reliable for a while... and UPS was
>> > > > just beeping every chance it had, disrupting some sleep.. =P  So
>> > > > running on surge protector only.  I am running this in home
>> > > > environment.   So far, HDD failures have been very rare for this
>> > > > environment. =)  It just doesn't get loaded as much!  I am not
>> > > > sure what to expect, seeing that "unfound" and just a feeling of
>> > > > possibility of maybe getting OSD back made me excited about it.
>> > > > =) Thanks for letting me know what should be the priority.  I
>> > > > just lack experience and knowledge in this. =) Please do continue
>> > > > to guide me though this.
>> > > >
>> > > > Thank you for the decode of that smart messages!  I do agree that
>> > > > looks like it is on its way out.  I would like to know how to get
>> > > > good portion of it back if possible. =)
>> > > >
>> > > > I think I just set the size and min_size to 1.
>> > > > # ceph osd lspools
>> > > > 0 data,1 metadata,2 rbd,
>> > > > # ceph osd pool set rbd size 1
>> > > > set pool 2 size to 1
>> > > > 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread David Turner
To add to Steve's success: the rbd was created in a second cluster in the
same datacenter, so it didn't run the risk of deadlocking that mapping rbds
on machines running osds has.  In theory it should still work on the same
cluster, but that is more inherently dangerous for a few reasons.

On Tue, Aug 29, 2017, 1:15 PM Steve Taylor 
wrote:

> Hong,
>
> Probably your best chance at recovering any data without special,
> expensive, forensic procedures is to perform a dd from /dev/sdb to
> somewhere else large enough to hold a full disk image and attempt to
> repair that. You'll want to use 'conv=noerror' with your dd command
> since your disk is failing. Then you could either re-attach the OSD
> from the new source or attempt to retrieve objects from the filestore
> on it.
>
> I have actually done this before by creating an RBD that matches the
> disk size, performing the dd, running xfs_repair, and eventually
> adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
> temporary arrangement for repair only, but I'm happy to report that it
> worked flawlessly in my case. I was able to weight the OSD to 0,
> offload all of its data, then remove it for a full recovery, at which
> point I just deleted the RBD.
>
> The possibilities afforded by Ceph inception are endless. ☺
>
>
>
> Steve Taylor | Senior Software Engineer | StorageCraft Technology
> Corporation
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799 |
>
> If you are not the intended recipient of this message or received it
> erroneously, please notify the sender and delete it, together with any
> attachments, and be advised that any dissemination or copying of this
> message is prohibited.
>
>
>
> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> > Rule of thumb with batteries is:
> > - more “proper temperature” you run them at the more life you get out
> > of them
> > - more battery is overpowered for your application the longer it will
> > survive.
> >
> > Get your self a LSI 94** controller and use it as HBA and you will be
> > fine. but get MORE DRIVES ! …
> > > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> > >
> > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > > they last longer?  Ones that fit the UPS original battery spec
> > > didn't last very long... part of the reason why I gave up on them..
> > > =P  My wife probably won't like the idea of car battery hanging out
> > > though ha!
> > >
> > > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > > motherboard doesn't have any additional SATA connectors available.
> > >  Would it be safe to add another OSD host?
> > >
> > > Regards,
> > > Hong
> > >
> > >
> > >
> > > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > > mail.com> wrote:
> > >
> > >
> > > Sorry for being brutal … anyway
> > > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > > moded on ups in the past with truck battery and it was working like
> > > a charm :D )
> > > 2. get spare drives and put those in because your cluster CAN NOT
> > > get out of error due to lack of space
> > > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > > drives
> > > 4 get cooling to drives or you will loose more !
> > >
> > >
> > > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > > >
> > > > Tomasz,
> > > >
> > > > Those machines are behind a surge protector.  Doesn't appear to
> > > > be a good one!  I do have a UPS... but it is my fault... no
> > > > battery.  Power was pretty reliable for a while... and UPS was
> > > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > > running on surge protector only.  I am running this in home
> > > > environment.   So far, HDD failures have been very rare for this
> > > > environment. =)  It just doesn't get loaded as much!  I am not
> > > > sure what to expect, seeing that "unfound" and just a feeling of
> > > > possibility of maybe getting OSD back made me excited about it.
> > > > =) Thanks for letting me know what should be the priority.  I
> > > > just lack experience and knowledge in this. =) Please do continue
> > > > to guide me though this.
> > > >
> > > > Thank you for the decode of that smart messages!  I do agree that
> > > > looks like it is on its way out.  I would like to know how to get
> > > > good portion of it back if possible. =)
> > > >
> > > > I think I just set the size and min_size to 1.
> > > > # ceph osd lspools
> > > > 0 data,1 metadata,2 rbd,
> > > > # ceph osd pool set rbd size 1
> > > > set pool 2 size to 1
> > > > # ceph osd pool set rbd min_size 1
> > > > set pool 2 min_size to 1
> > > >
> > > > Seems to be doing some backfilling work.
> > > >
> > > > # ceph health
> > > > HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
> > > > pgs backfill_toofull; 74 pgs 

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Steve Taylor
Hong,

Probably your best chance at recovering any data without special,
expensive, forensic procedures is to perform a dd from /dev/sdb to
somewhere else large enough to hold a full disk image and attempt to
repair that. You'll want to use 'conv=noerror' with your dd command
since your disk is failing. Then you could either re-attach the OSD
from the new source or attempt to retrieve objects from the filestore
on it.

I have actually done this before by creating an RBD that matches the
disk size, performing the dd, running xfs_repair, and eventually
adding it back to the cluster as an OSD. RBDs as OSDs is certainly a
temporary arrangement for repair only, but I'm happy to report that it
worked flawlessly in my case. I was able to weight the OSD to 0,
offload all of its data, then remove it for a full recovery, at which
point I just deleted the RBD.

The possibilities afforded by Ceph inception are endless. ☺
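
For the record, a rough sketch of that workflow (the image size, pool, device names
and mount path are made up for the example, not Steve's actual values):

# rbd create rescue/osd-clone --size 2097152      (a 2 TB image, ideally in another cluster)
# rbd map rescue/osd-clone                        (shows up as e.g. /dev/rbd0)
# dd if=/dev/sdb1 of=/dev/rbd0 bs=4M conv=noerror,sync
# xfs_repair /dev/rbd0
# mount /dev/rbd0 /var/lib/ceph/osd/ceph-4        (then try starting the osd again)

Adding ',sync' to conv pads unreadable blocks with zeros so the copy keeps its
offsets aligned; plain 'conv=noerror' alone can silently shorten the image.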


 
Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 | 
 
If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.

 

On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
> Rule of thumb with batteries is:
> - more “proper temperature” you run them at the more life you get out
> of them
> - more battery is overpowered for your application the longer it will
> survive. 
> 
> Get your self a LSI 94** controller and use it as HBA and you will be
> fine. but get MORE DRIVES ! … 
> > On 28 Aug 2017, at 23:10, hjcho616  wrote:
> > 
> > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and
> > try these out.  Car battery idea is nice!  I may try that.. =)  Do
> > they last longer?  Ones that fit the UPS original battery spec
> > didn't last very long... part of the reason why I gave up on them..
> > =P  My wife probably won't like the idea of car battery hanging out
> > though ha!
> > 
> > The OSD1 (one with mostly ok OSDs, except that smart failure)
> > motherboard doesn't have any additional SATA connectors available.
> >  Would it be safe to add another OSD host?
> > 
> > Regards,
> > Hong
> > 
> > 
> > 
> > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  > mail.com> wrote:
> > 
> > 
> > Sorry for being brutal … anyway 
> > 1. get the battery for UPS ( a car battery will do as well, I’ve
> > moded on ups in the past with truck battery and it was working like
> > a charm :D )
> > 2. get spare drives and put those in because your cluster CAN NOT
> > get out of error due to lack of space
> > 3. Follow advice of Ronny Aasen on hot to recover data from hard
> > drives 
> > 4 get cooling to drives or you will loose more ! 
> > 
> > 
> > > On 28 Aug 2017, at 22:39, hjcho616  wrote:
> > > 
> > > Tomasz,
> > > 
> > > Those machines are behind a surge protector.  Doesn't appear to
> > > be a good one!  I do have a UPS... but it is my fault... no
> > > battery.  Power was pretty reliable for a while... and UPS was
> > > just beeping every chance it had, disrupting some sleep.. =P  So
> > > running on surge protector only.  I am running this in home
> > > environment.   So far, HDD failures have been very rare for this
> > > environment. =)  It just doesn't get loaded as much!  I am not
> > > sure what to expect, seeing that "unfound" and just a feeling of
> > > possibility of maybe getting OSD back made me excited about it.
> > > =) Thanks for letting me know what should be the priority.  I
> > > just lack experience and knowledge in this. =) Please do continue
> > > to guide me though this. 
> > > 
> > > Thank you for the decode of that smart messages!  I do agree that
> > > looks like it is on its way out.  I would like to know how to get
> > > good portion of it back if possible. =)
> > > 
> > > I think I just set the size and min_size to 1.
> > > # ceph osd lspools
> > > 0 data,1 metadata,2 rbd,
> > > # ceph osd pool set rbd size 1
> > > set pool 2 size to 1
> > > # ceph osd pool set rbd min_size 1
> > > set pool 2 min_size to 1
> > > 
> > > Seems to be doing some backfilling work.
> > > 
> > > # ceph health
> > > HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2
> > > pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling;
> > > 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering;
> > > 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs
> > > stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101
> > > pgs stuck undersized; 101 pgs undersized; 1 requests are blocked
> > > > 32 sec; recovery 1790657/4502340 objects degraded (39.772%);
> > > recovery 641906/4502340 objects misplaced (14.257%); recovery
> > > 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is
> > > degraded; no legacy OSD 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Rule of thumb with batteries is:
- the closer to the "proper temperature" you run them at, the more life you get out of them
- the more the battery is overpowered for your application, the longer it will survive.

Get yourself an LSI 94** controller and use it as an HBA and you will be fine. But
get MORE DRIVES! …
> On 28 Aug 2017, at 23:10, hjcho616  wrote:
> 
> Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these 
> out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  
> Ones that fit the UPS original battery spec didn't last very long... part of 
> the reason why I gave up on them.. =P  My wife probably won't like the idea 
> of car battery hanging out though ha!
> 
> The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard 
> doesn't have any additional SATA connectors available.  Would it be safe to 
> add another OSD host?
> 
> Regards,
> Hong
> 
> 
> 
> On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> Sorry for being brutal … anyway 
> 1. get the battery for UPS ( a car battery will do as well, I’ve moded on ups 
> in the past with truck battery and it was working like a charm :D )
> 2. get spare drives and put those in because your cluster CAN NOT get out of 
> error due to lack of space
> 3. Follow advice of Ronny Aasen on hot to recover data from hard drives 
> 4 get cooling to drives or you will loose more ! 
> 
> 
>> On 28 Aug 2017, at 22:39, hjcho616 > > wrote:
>> 
>> Tomasz,
>> 
>> Those machines are behind a surge protector.  Doesn't appear to be a good 
>> one!  I do have a UPS... but it is my fault... no battery.  Power was pretty 
>> reliable for a while... and UPS was just beeping every chance it had, 
>> disrupting some sleep.. =P  So running on surge protector only.  I am 
>> running this in home environment.   So far, HDD failures have been very rare 
>> for this environment. =)  It just doesn't get loaded as much!  I am not sure 
>> what to expect, seeing that "unfound" and just a feeling of possibility of 
>> maybe getting OSD back made me excited about it. =) Thanks for letting me 
>> know what should be the priority.  I just lack experience and knowledge in 
>> this. =) Please do continue to guide me though this. 
>> 
>> Thank you for the decode of that smart messages!  I do agree that looks like 
>> it is on its way out.  I would like to know how to get good portion of it 
>> back if possible. =)
>> 
>> I think I just set the size and min_size to 1.
>> # ceph osd lspools
>> 0 data,1 metadata,2 rbd,
>> # ceph osd pool set rbd size 1
>> set pool 2 size to 1
>> # ceph osd pool set rbd min_size 1
>> set pool 2 min_size to 1
>> 
>> Seems to be doing some backfilling work.
>> 
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs 
>> backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 
>> 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs 
>> stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 
>> pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests 
>> are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); 
>> recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 
>> unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD 
>> present but 'sortbitwise' flag is not set
>> 
>> 
>> 
>> Regards,
>> Hong
>> 
>> 
>> On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz > > wrote:
>> 
>> 
>> So to decode few things about your disk:
>> 
>>   1 Raw_Read_Error_Rate0x002f  100  100  051Pre-fail  Always  -  
>> 37
>> 37 read erros and only one sector marked as pending - fun disk :/ 
>> 
>> 181 Program_Fail_Cnt_Total  0x0022  099  099  000Old_age  Always  -  
>> 35325174
>> So firmware has quite few bugs, that’s nice
>> 
>> 191 G-Sense_Error_Rate  0x0022  100  100  000Old_age  Always  -  
>> 2855
>> disk was thrown around while operational even more nice.
>> 
>> 194 Temperature_Celsius0x0002  047  041  000Old_age  Always  -   
>>53 (Min/Max 15/59)
>> if your disk passes 50 you should not consider using it, high temperatures 
>> demagnetise plate layer and you will see more errors in very near future.
>> 
>> 197 Current_Pending_Sector  0x0032  100  100  000Old_age  Always  -  
>> 1
>> as mentioned before :)
>> 
>> 200 Multi_Zone_Error_Rate  0x002a  100  100  000Old_age  Always  -   
>>4222
>> your heads keep missing tracks … bent ? I don’t even know how to comment 
>> here.
>> 
>> 
>> generally fun drive you’ve got there … rescue as much as you can and throw 
>> it away !!!
>> 
>> 
> 
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Thank you Tomasz and Ronny.  I'll have to order some hdd soon and try these 
out.  Car battery idea is nice!  I may try that.. =)  Do they last longer?  
Ones that fit the UPS original battery spec didn't last very long... part of 
the reason why I gave up on them.. =P  My wife probably won't like the idea of 
car battery hanging out though ha!
The OSD1 (one with mostly ok OSDs, except that smart failure) motherboard 
doesn't have any additional SATA connectors available.  Would it be safe to add 
another OSD host?
Regards,Hong
 

On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz 
 wrote:
 

 Sorry for being brutal … anyway:
1. get a battery for the UPS (a car battery will do as well; I've modded a UPS in the
past with a truck battery and it was working like a charm :D )
2. get spare drives and put those in, because your cluster CAN NOT get out of the
error state due to lack of space
3. follow the advice of Ronny Aasen on how to recover data from the hard drives
4. get cooling to the drives or you will lose more!


On 28 Aug 2017, at 22:39, hjcho616  wrote:
Tomasz,
Those machines are behind a surge protector.  Doesn't appear to be a good one!  
I do have a UPS... but it is my fault... no battery.  Power was pretty reliable 
for a while... and UPS was just beeping every chance it had, disrupting some 
sleep.. =P  So running on surge protector only.  I am running this in home 
environment.   So far, HDD failures have been very rare for this environment. 
=)  It just doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a feeling of possibility of maybe getting OSD back made 
me excited about it. =) Thanks for letting me know what should be the priority. 
 I just lack experience and knowledge in this. =) Please do continue to guide 
me though this. 
Thank you for the decode of that smart messages!  I do agree that looks like it 
is on its way out.  I would like to know how to get good portion of it back if 
possible. =)
I think I just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1

Seems to be doing some backfilling work.

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
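
For anyone following along, the usual way to dig into a state like that is the
standard ceph CLI of that era (the pg id here is just an example):

# ceph health detail              (lists the individual pgs behind each warning)
# ceph pg dump_stuck unclean
# ceph pg 2.7 query               (peering/recovery state of a single pg)
# ceph pg repair 2.7              (for pgs flagged inconsistent by scrub)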


Regards,Hong 

On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz 
 wrote:
 

 So to decode few things about your disk:

  1 Raw_Read_Error_Rate    0x002f  100  100  051    Pre-fail  Always      -     
 37
37 read erros and only one sector marked as pending - fun disk :/ 

181 Program_Fail_Cnt_Total  0x0022  099  099  000    Old_age  Always      -     
 35325174
So firmware has quite few bugs, that’s nice

191 G-Sense_Error_Rate      0x0022  100  100  000    Old_age  Always      -     
 2855
disk was thrown around while operational even more nice.

194 Temperature_Celsius    0x0002  047  041  000    Old_age  Always      -      
53 (Min/Max 15/59)
if your disk passes 50 you should not consider using it, high temperatures 
demagnetise plate layer and you will see more errors in very near future.

197 Current_Pending_Sector  0x0032  100  100  000    Old_age  Always      -     
 1
as mentioned before :)

200 Multi_Zone_Error_Rate  0x002a  100  100  000    Old_age  Always      -      
4222
your heads keep missing tracks … bent ? I don’t even know how to comment here.


generally fun drive you’ve got there … rescue as much as you can and throw it 
away !!!

   



   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Sorry for being brutal … anyway:
1. get a battery for the UPS (a car battery will do as well; I've modded a UPS in the
past with a truck battery and it was working like a charm :D )
2. get spare drives and put those in, because your cluster CAN NOT get out of the
error state due to lack of space
3. follow the advice of Ronny Aasen on how to recover data from the hard drives
4. get cooling to the drives or you will lose more!


> On 28 Aug 2017, at 22:39, hjcho616  wrote:
> 
> Tomasz,
> 
> Those machines are behind a surge protector.  Doesn't appear to be a good 
> one!  I do have a UPS... but it is my fault... no battery.  Power was pretty 
> reliable for a while... and UPS was just beeping every chance it had, 
> disrupting some sleep.. =P  So running on surge protector only.  I am running 
> this in home environment.   So far, HDD failures have been very rare for this 
> environment. =)  It just doesn't get loaded as much!  I am not sure what to 
> expect, seeing that "unfound" and just a feeling of possibility of maybe 
> getting OSD back made me excited about it. =) Thanks for letting me know what 
> should be the priority.  I just lack experience and knowledge in this. =) 
> Please do continue to guide me though this. 
> 
> Thank you for the decode of that smart messages!  I do agree that looks like 
> it is on its way out.  I would like to know how to get good portion of it 
> back if possible. =)
> 
> I think I just set the size and min_size to 1.
> # ceph osd lspools
> 0 data,1 metadata,2 rbd,
> # ceph osd pool set rbd size 1
> set pool 2 size to 1
> # ceph osd pool set rbd min_size 1
> set pool 2 min_size to 1
> 
> Seems to be doing some backfilling work.
> 
> # ceph health
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs 
> backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 
> 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs 
> stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 
> pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests 
> are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); 
> recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 
> unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD 
> present but 'sortbitwise' flag is not set
> 
> 
> 
> Regards,
> Hong
> 
> 
> On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> So to decode few things about your disk:
> 
>   1 Raw_Read_Error_Rate0x002f  100  100  051Pre-fail  Always  -   
>37
> 37 read erros and only one sector marked as pending - fun disk :/ 
> 
> 181 Program_Fail_Cnt_Total  0x0022  099  099  000Old_age  Always  -   
>35325174
> So firmware has quite few bugs, that’s nice
> 
> 191 G-Sense_Error_Rate  0x0022  100  100  000Old_age  Always  -   
>2855
> disk was thrown around while operational even more nice.
> 
> 194 Temperature_Celsius0x0002  047  041  000Old_age  Always  -
>   53 (Min/Max 15/59)
> if your disk passes 50 you should not consider using it, high temperatures 
> demagnetise plate layer and you will see more errors in very near future.
> 
> 197 Current_Pending_Sector  0x0032  100  100  000Old_age  Always  -   
>1
> as mentioned before :)
> 
> 200 Multi_Zone_Error_Rate  0x002a  100  100  000Old_age  Always  -
>   4222
> your heads keep missing tracks … bent ? I don’t even know how to comment here.
> 
> 
> generally fun drive you’ve got there … rescue as much as you can and throw it 
> away !!!
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz,
Those machines are behind a surge protector.  Doesn't appear to be a good one!  
I do have a UPS... but it is my fault... no battery.  Power was pretty reliable 
for a while... and UPS was just beeping every chance it had, disrupting some 
sleep.. =P  So running on surge protector only.  I am running this in home 
environment.   So far, HDD failures have been very rare for this environment. 
=)  It just doesn't get loaded as much!  I am not sure what to expect, seeing 
that "unfound" and just a feeling of possibility of maybe getting OSD back made 
me excited about it. =) Thanks for letting me know what should be the priority. 
 I just lack experience and knowledge in this. =) Please do continue to guide 
me though this. 
Thank you for the decode of that smart messages!  I do agree that looks like it 
is on its way out.  I would like to know how to get good portion of it back if 
possible. =)
I think I just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1

Seems to be doing some backfilling work.

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set


Regards,Hong 

On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz 
 wrote:
 

 So to decode few things about your disk:

  1 Raw_Read_Error_Rate    0x002f  100  100  051    Pre-fail  Always      -     
 37
37 read erros and only one sector marked as pending - fun disk :/ 

181 Program_Fail_Cnt_Total  0x0022  099  099  000    Old_age  Always      -     
 35325174
So firmware has quite few bugs, that’s nice

191 G-Sense_Error_Rate      0x0022  100  100  000    Old_age  Always      -     
 2855
disk was thrown around while operational even more nice.

194 Temperature_Celsius    0x0002  047  041  000    Old_age  Always      -      
53 (Min/Max 15/59)
if your disk passes 50 you should not consider using it, high temperatures 
demagnetise plate layer and you will see more errors in very near future.

197 Current_Pending_Sector  0x0032  100  100  000    Old_age  Always      -     
 1
as mentioned before :)

200 Multi_Zone_Error_Rate  0x002a  100  100  000    Old_age  Always      -      
4222
your heads keep missing tracks … bent ? I don’t even know how to comment here.


generally fun drive you’ve got there … rescue as much as you can and throw it 
away !!!

   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

> [SNIP - bad drives]

Generally when a disk is displaying bad blocks to the OS, the drive has been
remapping blocks for ages in the background, and the disk is really on its last
legs.  It is a bit unlikely that you get so many disks dying at the same time,
though; but the problem can have been silently worsening and was not really
noticed until the osds had to restart due to the power loss.



If this is _very_ important data I would recommend you start by taking
the bad drives out of operation and cloning each bad drive block by
block onto a good one using dd_rescue. It is also a good idea to store an
image of the disk so you can try the different rescue methods several
times.  In the very worst case, send the disk to a professional data
recovery company.


Once that is done, you have 2 options:
Try to make the osd run again: xfs_repair, plus manually finding corrupt
objects (find + md5sum, looking for read errors) and deleting them, has
helped me in the past. If you manage to get the osd to run, drain it by
setting its crush weight to 0, and eventually remove the disk from the cluster.
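
A minimal sketch of that check and the drain (assuming a filestore osd mounted at
the default path and osd.4 as the suspect; the object path below is hypothetical):

# find /var/lib/ceph/osd/ceph-4/current -type f -exec md5sum {} \; 2>&1 | grep -i 'input/output error'
# rm '/var/lib/ceph/osd/ceph-4/current/2.7_head/<offending object>'
# ceph osd crush reweight osd.4 0        (once the osd runs again, drain it)

Reading every object with md5sum forces the read errors to surface, so the grep
shows exactly which files are sitting on bad sectors.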

Alternatively, if you can not get the osd running again:
use ceph-objectstore-tool to extract objects and inject them using a
clean node and osd, like described in
http://ceph.com/geen-categorie/incomplete-pgs-oh-my/ . Read the man page
and the help for the tool; I think the arguments have changed slightly since
that blogpost.
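
Roughly like this (a sketch, not Ronny's exact commands; the osd numbers, pg id and
file path are placeholders, the osds involved must be stopped, and the exact flags
differ between releases, so check ceph-objectstore-tool --help first):

# ceph-objectstore-tool --op export --pgid 2.7 \
    --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal \
    --file /mnt/rescue/2.7.export
# ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-10 --journal-path /var/lib/ceph/osd/ceph-10/journal \
    --file /mnt/rescue/2.7.export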


you may also run into read errors on corrupt objects, stopping your 
export.  in that case rm the offending object and rerun the export.

repeat for all bad drives.

When doing the inject it is important that your cluster is operational
and able to accept objects from the draining drive, so either set the
minimal replication failure domain (crush chooseleaf type) to OSD, or,
even better, add more osd nodes to make an operational cluster (with
missing objects).



Also, I see in your log you have os-prober testing all partitions. I tend
to remove os-prober on machines that do not dual-boot with another OS.
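
On a Debian-based host that is a one-liner (assuming apt is the package manager):

# apt-get purge os-prober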


Rules of thumb for future ceph clusters:
min_size=2 is there for a reason; it should never be 1 unless data loss is wanted.
size=3 if you need the cluster to keep operating with a drive or node in an
error state. size=2 gives you more space, but the cluster will block on
errors until the recovery is done; better to be blocking than losing data.
If you have size=3 and 3 nodes and you lose a node, then your cluster
can not self-heal. You should have more nodes than you have set size to.
Have free space on the drives; this is where data is replicated to in case
of a down node. If you have 4 nodes and you want to be able to lose
one and still operate, you need leftover room on your 3 remaining nodes
to cover for the lost one. The more nodes you have, the smaller the impact
of a node failure is, and the less spare room is needed.  For a 4-node
cluster you should not fill more than 66% if you want to be able to
self-heal + operate.
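
In command form, once there is enough capacity to return to a sane replication
level (using the rbd pool named earlier in this thread):

# ceph osd pool set rbd size 3
# ceph osd pool set rbd min_size 2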




good luck
Ronny Aasen


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
So, to decode a few things about your disk:

  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       37
37 read errors and only one sector marked as pending - fun disk :/

181 Program_Fail_Cnt_Total  0x0022   099   099   000    Old_age   Always       -       35325174
So the firmware has quite a few bugs, that's nice.

191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       2855
The disk was thrown around while operational; even more nice.

194 Temperature_Celsius     0x0002   047   041   000    Old_age   Always       -       53 (Min/Max 15/59)
If your disk passes 50 you should not consider using it; high temperatures
demagnetise the platter layer and you will see more errors in the very near future.

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
As mentioned before :)

200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       4222
Your heads keep missing tracks … bent? I don't even know how to comment here.


Generally a fun drive you've got there … rescue as much as you can and throw it
away !!!
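
For reference, these attributes come from smartmontools; a quick way to pull the
table and run a surface test on the suspect drive (assuming /dev/sdb, as in the
smartd log further down the thread):

# smartctl -A /dev/sdb             (the attribute table quoted above)
# smartctl -t long /dev/sdb        (start an extended offline self-test)
# smartctl -l selftest /dev/sdb    (read the result once it finishes)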
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you are looking at something more like this :

https://www.google.co.uk/imgres?imgurl=https%3A%2F%2Fthumbs.dreamstime.com%2Fz%2Fhard-drive-being-destroyed-hammer-16668693.jpg=https%3A%2F%2Fwww.dreamstime.com%2Fstock-photos-hard-drive-being-destroyed-hammer-image16668693=Ofi7hHnUFmPsyM=Ak6YfqQVvZWCsM%3A=10ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA..i=1300=1130=safari=1116=1920=hdd%20hammer=0ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA=mrc=8

:P

I'll sound brusque now, so buckle up.

You really need to set your priorities straight now. If you want to rescue a
disk that has a pending sector, you are setting yourself up for failure. You said
that several consecutive power outages killed your cluster, yet you showed no
concern about investing in at least a surge protector (outages can create
surges, or a surge can cut the power by burning fuses in the substation), which
actually causes hardware failures. I've fired at you a control statement about
how to save your data, but you keep returning trying to save some osd's, which
to me looks like you don't really care about your data.

If any of those were at any point on your mind, you would shut down those 
systems to limit possibility of another outage destroy more data. You would 
protect your self via a simple surge protector or at least basic UPS. You would 
borrow / beg / steal to get a spare hard drive and backup your stuff. You would 
set your pool size and min size to 1, let the cluster get out of warning state 
and only then you will be able to mount it to attempt data recovery. You would 
check remaining disks for SMART errors.

Then and only then you can start playing with as complex repairs. 

Right now you are in the middle of a shit creek without a paddle. 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you are looking at something more like this :

https://www.google.co.uk/imgres?imgurl=https%3A%2F%2Fthumbs.dreamstime.com%2Fz%2Fhard-drive-being-destroyed-hammer-16668693.jpg=https%3A%2F%2Fwww.dreamstime.com%2Fstock-photos-hard-drive-being-destroyed-hammer-image16668693=Ofi7hHnUFmPsyM=Ak6YfqQVvZWCsM%3A=10ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA..i=1300=1130=safari=1116=1920=hdd%20hammer=0ahUKEwj56JfI5vrVAhXoCcAKHfkZDn4QMwgmKAAwAA=mrc=8
 


:P

I’ll sound brusk now so buckle up.

You really need to set your priorities straight now, if you want to rescue a 
disk that has a pending sector you set your self up for failure. You said that 
several consecutive power outage killed your cluster, yet you did show no 
concern of investing in at least anti surge protector (outages can create 
surges, or a surge can cut the power by burning fuses in sub station) which 
actually cause hardware failures. I’ve fired at you a control statement about 
how to save your data, but you keep returning trying to save some osd’s - which 
to me look like you don’t really care about your data.

If any of those were at any point on your mind, you would shut down those 
systems to limit possibility of another outage destroy more data. You would 
protect your self via a simple surge protector or at least basic UPS. You would 
borrow / beg / steal to get a spare hard drive and backup your stuff. You would 
set your pool size and min size to 1, let the cluster get out of warning state 
and only then you will be able to mount it to attempt data recovery. You would 
check remaining disks for SMART errors.

Then and only then you can start playing with as complex repairs. 

Right now you are in the middle of a shit creek without a paddle. 


> On 28 Aug 2017, at 21:45, hjcho616  wrote:
> 
> So.. would doing something like this could potentially bring it back to life? 
> =)
> 
> Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki 
> 
> 
> 
> Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki
>  
> 
> 
> 
> 
> On Monday, August 28, 2017 3:24 PM, Tomasz Kusmierz  
> wrote:
> 
> 
> I think you’ve got your anwser:
> 
> 197 Current_Pending_Sector  0x0032   100   100   000Old_age   Always  
>  -   1
> 
>> On 28 Aug 2017, at 21:22, hjcho616 > > wrote:
>> 
>> Steve,
>> 
>> I thought that was odd too.. 
>> 
>> Below is from the log, This captures transition from good to bad. Looks like 
>> there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors". 
>>  And looks like I did a repair with /dev/sdb1... =P
>> 
>> # grep sdb syslog.1
>> Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 43
>> Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 43 to 45
>> Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 45 to 44
>> Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 45
>> Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 45 to 44
>> Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
>> Attribute: 194 Temperature_Celsius changed from 44 to 45
>> Aug 27 13:53:34 OSD1 kernel: [1.454082] sd 1:0:0:0: [sdb] 3907029168 
>> 512-byte logical blocks: (2.00 TB/1.82 TiB)
>> Aug 27 13:53:34 OSD1 kernel: [1.454447] sd 1:0:0:0: [sdb] Write Protect 
>> is off
>> Aug 27 13:53:34 OSD1 kernel: [1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 
>> 3a 00 00
>> Aug 27 13:53:34 OSD1 kernel: [1.454488] sd 1:0:0:0: [sdb] Write cache: 
>> enabled, read cache: enabled, doesn't support DPO or FUA
>> Aug 27 13:53:34 OSD1 kernel: [1.501349]  sdb: sdb1
>> Aug 27 13:53:34 OSD1 kernel: [1.501796] sd 1:0:0:0: [sdb] Attached SCSI 
>> disk
>> Aug 27 13:53:34 OSD1 kernel: [4.033081] XFS (sdb1): Mounting V4 
>> Filesystem
>> Aug 27 13:53:34 OSD1 kernel: [4.207191] XFS (sdb1): Starting recovery 
>> (logdev: internal)
>> Aug 27 13:53:34 OSD1 kernel: [5.656298] XFS (sdb1): Ending recovery 
>> (logdev: internal)
>> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 
>> 'scsi' 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
So.. would doing something like this could potentially bring it back to life? =)
Analyzing a Faulty Hard Disk using Smartctl - Thomas-Krenn-Wiki
  

On Monday, August 28, 2017 3:24 PM, Tomasz Kusmierz 
 wrote:
 

 I think you've got your answer:
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1

On 28 Aug 2017, at 21:22, hjcho616  wrote:
Steve,
I thought that was odd too.. 
Below is from the log, This captures transition from good to bad. Looks like 
there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
And looks like I did a repair with /dev/sdb1... =P
# grep sdb syslog.1Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], 
SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 43Aug 27 
06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 
Temperature_Celsius changed from 43 to 45Aug 27 07:27:21 OSD1 smartd[1031]: 
Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed 
from 45 to 44
Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 13:53:34 OSD1 kernel: [    1.454082] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Aug 27 13:53:34 OSD1 kernel: [    1.454447] sd 1:0:0:0: [sdb] Write Protect is off
Aug 27 13:53:34 OSD1 kernel: [    1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 27 13:53:34 OSD1 kernel: [    1.454488] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 27 13:53:34 OSD1 kernel: [    1.501349]  sdb: sdb1
Aug 27 13:53:34 OSD1 kernel: [    1.501796] sd 1:0:0:0: [sdb] Attached SCSI disk
Aug 27 13:53:34 OSD1 kernel: [    4.033081] XFS (sdb1): Mounting V4 Filesystem
Aug 27 13:53:34 OSD1 kernel: [    4.207191] XFS (sdb1): Starting recovery (logdev: internal)
Aug 27 13:53:34 OSD1 kernel: [    5.656298] XFS (sdb1): Ending recovery (logdev: internal)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd database: SAMSUNG SpinPoint F4 EG (AF)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using smartmontools or hdparm with this
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: exiting
Aug 27

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
I think you’ve got your answer:

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1
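
If you want to double check, something like this on OSD1 should show it (assuming the suspect disk is still /dev/sdb; smartctl comes from smartmontools):

# smartctl -A /dev/sdb | egrep 'Current_Pending_Sector|Offline_Uncorrectable|Reallocated_Sector'
# smartctl -l error /dev/sdb        # any logged ATA errors
# smartctl -t long /dev/sdb         # start a long self-test, read the result later with "smartctl -l selftest /dev/sdb"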

> On 28 Aug 2017, at 21:22, hjcho616  wrote:
> 
> Steve,
> 
> I thought that was odd too.. 
> 
> Below is from the log. This captures the transition from good to bad. Looks like 
> there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
> And looks like I did a repair with /dev/sdb1... =P
> 
> # grep sdb syslog.1
> Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 43
> Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 43 to 45
> Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 45
> Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 44 to 45
> Aug 27 13:53:34 OSD1 kernel: [1.454082] sd 1:0:0:0: [sdb] 3907029168 
> 512-byte logical blocks: (2.00 TB/1.82 TiB)
> Aug 27 13:53:34 OSD1 kernel: [1.454447] sd 1:0:0:0: [sdb] Write Protect 
> is off
> Aug 27 13:53:34 OSD1 kernel: [1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 
> 3a 00 00
> Aug 27 13:53:34 OSD1 kernel: [1.454488] sd 1:0:0:0: [sdb] Write cache: 
> enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 27 13:53:34 OSD1 kernel: [1.501349]  sdb: sdb1
> Aug 27 13:53:34 OSD1 kernel: [1.501796] sd 1:0:0:0: [sdb] Attached SCSI 
> disk
> Aug 27 13:53:34 OSD1 kernel: [4.033081] XFS (sdb1): Mounting V4 Filesystem
> Aug 27 13:53:34 OSD1 kernel: [4.207191] XFS (sdb1): Starting recovery 
> (logdev: internal)
> Aug 27 13:53:34 OSD1 kernel: [5.656298] XFS (sdb1): Ending recovery 
> (logdev: internal)
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' 
> to 'sat'
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, 
> S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd 
> database: SAMSUNG SpinPoint F4 EG (AF)
> Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using 
> smartmontools or hdparm with this
> Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. 
> Adding to "monitor" list.
> Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from 
> /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
> Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 45 to 44
> Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to 
> /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ 
> partition: exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: 
> exiting
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> /usr/lib/os-probes/mounted/90linux-distro on mounted /dev/sdb1
> Aug 27 15:52:36 OSD1 os-prober: debug: running 
> 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Steve,
I thought that was odd too.. 
Below is from the log. This captures the transition from good to bad. Looks like 
there is "Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors".  
And looks like I did a repair with /dev/sdb1... =P
# grep sdb syslog.1
Aug 27 06:27:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 43
Aug 27 06:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 43 to 45
Aug 27 07:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 07:57:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 10:57:22 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:27:21 OSD1 smartd[1031]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 44 to 45
Aug 27 13:53:34 OSD1 kernel: [    1.454082] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Aug 27 13:53:34 OSD1 kernel: [    1.454447] sd 1:0:0:0: [sdb] Write Protect is off
Aug 27 13:53:34 OSD1 kernel: [    1.454448] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 27 13:53:34 OSD1 kernel: [    1.454488] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 27 13:53:34 OSD1 kernel: [    1.501349]  sdb: sdb1
Aug 27 13:53:34 OSD1 kernel: [    1.501796] sd 1:0:0:0: [sdb] Attached SCSI disk
Aug 27 13:53:34 OSD1 kernel: [    4.033081] XFS (sdb1): Mounting V4 Filesystem
Aug 27 13:53:34 OSD1 kernel: [    4.207191] XFS (sdb1): Starting recovery (logdev: internal)
Aug 27 13:53:34 OSD1 kernel: [    5.656298] XFS (sdb1): Ending recovery (logdev: internal)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], opened
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SAMSUNG HD204UI, S/N:S2H7JD1B306112, WWN:5-0024e9-004c7c449, FW:1AQ10001, 2.00 TB
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], found in smartd database: SAMSUNG SpinPoint F4 EG (AF)
Aug 27 13:53:34 OSD1 smartd[1028]: Device: /dev/sdb [SAT], WARNING: Using smartmontools or hdparm with this
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Aug 27 13:53:36 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state read from /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 13:53:45 OSD1 smartd[1028]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 45 to 44
Aug 27 13:53:49 OSD1 smartd[1028]: Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.SAMSUNG_HD204UI-S2H7JD1B306112.ata.state
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10qnx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 10qnx: debug: /dev/sdb1 is not a QNX4 partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20macosx on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 macosx-prober: debug: /dev/sdb1 is not an HFS+ partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/20microsoft on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 20microsoft: debug: /dev/sdb1 is not a MS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/30utility on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 30utility: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/40lsb on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/70hurd on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/80minix on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/83haiku on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 83haiku: debug: /dev/sdb1 is not a BeFS partition: exiting
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/90linux-distro on mounted /dev/sdb1
Aug 27 15:52:36 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/90solaris on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 os-prober: debug: running /usr/lib/os-probes/mounted/10freedos on mounted /dev/sdb1
Aug 27 15:53:06 OSD1 10freedos: debug: /dev/sdb1 is not a FAT partition: exiting
Aug 27 15:53:06 OSD1 os-prober: debug: running

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Steve Taylor
I'm jumping in a little late here, but running xfs_repair on your partition 
can't frag your partition table. The partition table lives outside the 
partition block device and xfs_repair doesn't have access to it when run 
against /dev/sdb1. I haven't actually tested it, but it seems unlikely that 
running xfs_repair on /dev/sdb would do it either. I would assume it would just 
give you an error about /dev/sdb not containing an XFS filesystem. That's a 
guess though. I haven't ever tried anything like that.

Are you sure there isn't physical damage to the disk? I wouldn't say it's 
common, but power outages can do that. You can run 'dmesg | grep sdb' and 
'smartctl -a /dev/sdb' to see if there are kernel errors or SMART errors 
indicative of physical problems. If the disk is physically sound and the 
partition table really has been fragged, you may be able to restore it from the 
backup at the end of the disk, assuming it's GPT. If you can't find a partition 
or a filesystem somehow, then you're probably out of luck as far as retrieving 
any objects from that OSD. If the disk is physically damaged and your partition 
is gone, then it probably isn't worth wasting additional time on it.
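
A quick way to check both, roughly (adjust the device name as needed; gdisk is from the gdisk package):

# dmesg | grep -i sdb          # kernel-level I/O or link errors
# smartctl -a /dev/sdb         # SMART health, error log, pending/reallocated sectors
# gdisk -l /dev/sdb            # prints the partition table and complains if the main or backup GPT is damaged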






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Mon, 2017-08-28 at 19:18 +, hjcho616 wrote:
Tomasz,

Looks like when I did xfs_repair -L /dev/sdb1 it did something to partition 
table and I don't see /dev/sdb1 anymore... or maybe I missed 1 in the 
/dev/sdb1? =(. Yes.. that extra power outage did a pretty good damage... =P  I 
am hoping 0.007% is very small...=P  Any recommendations on fixing xfs 
partition I am missing? =)

Ronny,

Thank you for that link!

No I haven't done anything to osds... not touching them, hoping that I can 
revive some of them.. =)  Only thing done is trying to start and stop them..

Below are the links to newer files with just one start attempt. =)
ceph-osd.3_single.log
ceph-osd.4_single.log
ceph-osd.5_single.log
ceph-osd.8_single.log


Regards,
Hong


On Monday, August 28, 2017 12:53 PM, Ronny Aasen  
wrote:


comments inline

On 28.08.2017 18:31, hjcho616 wrote:


I'll see what I can do on that... Looks like I may have to add another OSD host 
as I utilized all of the SATA ports on those boards. =P

Ronny,

I am running with size=2 min_size=1.  I created everything with ceph-deploy and 
didn't touch much of that pool settings...  I hope not, but sounds like I may 
have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P


This is a bad idea as you have found out. once your cluster is healthy you 
should look at improving this.

The dead osd.3 message is probably me trying to stop and start the osd.  There 
were some cases where stop didn't kill the ceph-osd process.  I just started or 
restarted osd to try and see if that worked..  After that, there were some 
reboots and I am not seeing those messages after it...


when providing logs. try to move away the old one. do a single startup. and 
post that. it makes it easier to read when you have a single run in the file.


This is 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz,
Looks like when I did xfs_repair -L /dev/sdb1 it did something to partition 
table and I don't see /dev/sdb1 anymore... or maybe I missed 1 in the 
/dev/sdb1? =(. Yes.. that extra power outage did a pretty good damage... =P  I 
am hoping 0.007% is very small...=P  Any recommendations on fixing xfs 
partition I am missing? =)
Ronny,
Thank you for that link!
No I haven't done anything to osds... not touching them, hoping that I can 
revive some of them.. =)  Only thing done is trying to start and stop them..
Below are the links to newer files with just one start attempt. =)
ceph-osd.3_single.log
ceph-osd.4_single.log
ceph-osd.5_single.log
ceph-osd.8_single.log

Regards,
Hong

On Monday, August 28, 2017 12:53 PM, Ronny Aasen 
 wrote:
 

  comments inline
 
 On 28.08.2017 18:31, hjcho616 wrote:
  
 
 
  I'll see what I can do on that... Looks like I may have to add another OSD 
host as I utilized all of the SATA ports on those boards. =P 
  Ronny, 
  I am running with size=2 min_size=1.  I created everything with ceph-deploy 
and didn't touch much of that pool settings...  I hope not, but sounds like I 
may have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P 
   
 
 This is a bad idea as you have found out. once your cluster is healthy you 
should look at improving this.
 
 
  The dead osd.3 message is probably me trying to stop and start the osd.  
There were some cases where stop didn't kill the ceph-osd process.  I just 
started or restarted osd to try and see if that worked..  After that, there 
were some reboots and I am not seeing those messages after it... 
   
 
 when providing logs. try to move away the old one. do a single startup. and 
post that. it makes it easier to read when you have a single run in the file.
 
 
  
  This is something I am running at home.  I am the only user.  In a way it is 
production environment but just driven by me. =) 
  Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and 
osd.8 come back up without removing them?  I have a feeling I can get some data 
back with some of them intact.  
 
 just incase you are not able to make them run again, does not automatically 
mean the data is lost. i have successfully recovered lost object using these 
instructions  http://ceph.com/geen-categorie/incomplete-pgs-oh-my/  
 
 I would start by  renaming the osd's log file, do a single try at starting the 
osd. and posting that log. have you done anything to the osd's that could make 
them not run ? 
 
 kind regards
 Ronny Aasen
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


   ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

comments inline

On 28.08.2017 18:31, hjcho616 wrote:



I'll see what I can do on that... Looks like I may have to add another 
OSD host as I utilized all of the SATA ports on those boards. =P


Ronny,

I am running with size=2 min_size=1.  I created everything with 
ceph-deploy and didn't touch much of that pool settings...  I hope 
not, but sounds like I may have lost some files!  I do want some of 
those OSDs to come back online somehow... to get that confidence level 
up. =P




This is a bad idea, as you have found out. Once your cluster is healthy 
you should look at improving this.


The dead osd.3 message is probably me trying to stop and start the 
osd.  There were some cases where stop didn't kill the ceph-osd 
process.  I just started or restarted osd to try and see if that 
worked..  After that, there were some reboots and I am not seeing 
those messages after it...




When providing logs, try to move away the old one, do a single startup, 
and post that. It makes it easier to read when you have a single run in 
the file.




This is something I am running at home.  I am the only user.  In a way 
it is production environment but just driven by me. =)


Do you have any suggestions to get any of those osd.3, osd.4, osd.5, 
and osd.8 come back up without removing them?  I have a feeling I can 
get some data back with some of them intact.


Just in case you are not able to make them run again, that does not 
automatically mean the data is lost. I have successfully recovered lost 
objects using these instructions: 
http://ceph.com/geen-categorie/incomplete-pgs-oh-my/
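
The heavy lifting in that procedure is done by ceph-objectstore-tool: stop the 
osd, export the PG you need, then import it into a healthy osd. Roughly (the 
pgid and osd numbers are placeholders, and this assumes systemd units):

# systemctl stop ceph-osd@4
# ceph-objectstore-tool --op export --pgid <pgid> \
    --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal \
    --file /tmp/<pgid>.export
# ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-<n> --journal-path /var/lib/ceph/osd/ceph-<n>/journal \
    --file /tmp/<pgid>.export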


I would start by renaming the osd's log file, doing a single try at 
starting the osd, and posting that log. Have you done anything to the 
osd's that could make them not run?
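
For example, for osd.3 (assuming systemd and the default log location):

# systemctl stop ceph-osd@3
# mv /var/log/ceph/ceph-osd.3.log /var/log/ceph/ceph-osd.3.log.old
# systemctl start ceph-osd@3
# less /var/log/ceph/ceph-osd.3.log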


kind regards
Ronny Aasen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Tomasz Kusmierz
Sorry mate I’ve just noticed the 
"unfound (0.007%)”
I think that your main culprit here is osd.0. You need to have all osd’s on one 
host to get all the data back.

Also, for the time being I would just change size and min_size down to 1 and try to 
figure out which osds you actually need to get all the data. Then try to fix 
your machine problems. From my experience, regardless of the solution, when you are 
in degraded mode and try to fix stuff, things only get worse.
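
That would be something like the following (pool name is a placeholder; with a 
single copy any further disk failure means data loss, so only do it while you 
sort things out):

# ceph osd pool set <poolname> size 1
# ceph osd pool set <poolname> min_size 1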


> On 28 Aug 2017, at 17:31, hjcho616  wrote:
> 
> Thank you all for suggestions!
> 
> Maged,
> 
> I'll see what I can do on that... Looks like I may have to add another OSD 
> host as I utilized all of the SATA ports on those boards. =P
> 
> Ronny,
> 
> I am running with size=2 min_size=1.  I created everything with ceph-deploy 
> and didn't touch much of that pool settings...  I hope not, but sounds like I 
> may have lost some files!  I do want some of those OSDs to come back online 
> somehow... to get that confidence level up. =P
> 
> The dead osd.3 message is probably me trying to stop and start the osd.  
> There were some cases where stop didn't kill the ceph-osd process.  I just 
> started or restarted osd to try and see if that worked..  After that, there 
> were some reboots and I am not seeing those messages after it...
> 
> Tomasz,
> 
> This is something I am running at home.  I am the only user.  In a way it is 
> production environment but just driven by me. =)
> 
> Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and 
> osd.8 come back up without removing them?  I have a feeling I can get some 
> data back with some of them intact.
> 
> Thank you!
> 
> Regards,
> Hong
> 
> 
> On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz  
> wrote:
> 
> 
> Personally I would suggest to:
> - change minimal replication type to OSD (from default host)
> - remove the OSD from the host with all those "down OSD’s" (note that they 
> are down not out which makes it more weird)
> - let single node cluster stabilise, yes performance will suck but at least 
> you will have data on two copies on singular host … better this than nothing.
> - fix whatever issues you have on host OSD2 
> - add all osd on OSD2 and mark all osd from OSD1 with weight 0 - this will 
> make ceph migrate all data away from host OSD1
> - fix all the problem you’ve got on host OSD1 
> 
> reason I suggest that is that is seems that you’ve got issues everywhere and 
> since you are running a production environment (at least it seem like that to 
> me) data and down time is main priority.
> 
> > On 28 Aug 2017, at 11:58, Ronny Aasen  > > wrote:
> > 
> > On 28. aug. 2017 08:01, hjcho616 wrote:
> >> Hello!
> >> I've been using ceph for long time mostly for network CephFS storage, even 
> >> before Argonaut release!  It's been working very well for me.  Yes, I had 
> >> some power outtages before and asked few questions on this list before and 
> >> got resolved happily!  Thank you all!
> >> Not sure why but we've been having quite a bit of power outages lately.  
> >> Ceph appear to be running OK with those going on.. so I was pretty happy 
> >> and didn't thought much of it... till yesterday, When I started to move 
> >> some videos to cephfs, ceph decided that it was full although df showed 
> >> only 54% utilization!  Then I looked up, some of the osds were down! (only 
> >> 3 at that point!)
> >> I am running pretty simple ceph configuration... I have one machine 
> >> running MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 
> >> SSD for journal named OSD1 and OSD2.
> >> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's 
> >> log file and googled some of them... they appeared to be tied to version 
> >> 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't solve my 
> >> problems.. =P  While looking at some of this.. there was another power 
> >> outage!  D'oh!  I may need to invest in a UPS or something... Until this 
> >> happened, all of the osd down were from OSD2.  But OSD1 took a hit!  
> >> Couldn't boot, because osd-0 was damaged... I tried xfs_repair -L 
> >> /dev/sdb1 as suggested by command line.. I was able to mount it again, 
> >> phew, reboot... then /dev/sdb1 is no longer accessible!  N!!!
> >> So this is what I have today!  I am a bit concerned as half of the osds 
> >> are down!  and osd.0 doesn't look good at all...
> >> # ceph osd tree
> >> ID WEIGHT  TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> -1 16.24478 root default
> >> -2  8.12239host OSD1
> >>  1  1.95250osd.1  up  1.0  1.0
> >>  0  1.95250osd.0down0  1.0
> >>  7  0.31239osd.7  up  1.0  1.0
> >>  6  1.95250osd.6  up  1.0  1.0
> >>  2  1.95250osd.2  up  1.0  1.0
> >> -3  8.12239host 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Thank you all for suggestions!
Maged,
I'll see what I can do on that... Looks like I may have to add another OSD host 
as I utilized all of the SATA ports on those boards. =P
Ronny,
I am running with size=2 min_size=1.  I created everything with ceph-deploy and 
didn't touch much of that pool settings...  I hope not, but sounds like I may 
have lost some files!  I do want some of those OSDs to come back online 
somehow... to get that confidence level up. =P
The dead osd.3 message is probably me trying to stop and start the osd.  There 
were some cases where stop didn't kill the ceph-osd process.  I just started or 
restarted osd to try and see if that worked..  After that, there were some 
reboots and I am not seeing those messages after it...
Tomasz,
This is something I am running at home.  I am the only user.  In a way it is 
production environment but just driven by me. =)
Do you have any suggestions to get any of those osd.3, osd.4, osd.5, and osd.8 
come back up without removing them?  I have a feeling I can get some data back 
with some of them intact.
Thank you!
Regards,
Hong

On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz 
 wrote:
 

Personally I would suggest to:
- change the minimal replication type to OSD (from the default host)
- remove the OSDs from the host with all those "down OSD’s" (note that they are down, not out, which makes it more weird)
- let the single-node cluster stabilise; yes, performance will suck, but at least you will have the data in two copies on a single host … better this than nothing.
- fix whatever issues you have on host OSD2
- add all osds on OSD2 and mark all osds from OSD1 with weight 0 (see the example below) - this will make ceph migrate all data away from host OSD1
- fix all the problems you’ve got on host OSD1
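
The weight-0 step is just a crush reweight per osd; for the osds shown on host OSD1 in your tree that would be roughly:

# for id in 0 1 2 6 7; do ceph osd crush reweight osd.$id 0; done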

The reason I suggest that is that it seems you’ve got issues everywhere, and 
since you are running a production environment (at least it seems like that to 
me) data and downtime are the main priority.

> On 28 Aug 2017, at 11:58, Ronny Aasen  wrote:
> 
> On 28. aug. 2017 08:01, hjcho616 wrote:
>> Hello!
>> I've been using ceph for long time mostly for network CephFS storage, even 
>> before Argonaut release!  It's been working very well for me.  Yes, I had 
>> some power outtages before and asked few questions on this list before and 
>> got resolved happily!  Thank you all!
>> Not sure why but we've been having quite a bit of power outages lately.  
>> Ceph appear to be running OK with those going on.. so I was pretty happy and 
>> didn't thought much of it... till yesterday, When I started to move some 
>> videos to cephfs, ceph decided that it was full although df showed only 54% 
>> utilization!  Then I looked up, some of the osds were down! (only 3 at that 
>> point!)
>> I am running pretty simple ceph configuration... I have one machine running 
>> MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 SSD for 
>> journal named OSD1 and OSD2.
>> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's 
>> log file and googled some of them... they appeared to be tied to version 
>> 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't solve my 
>> problems.. =P  While looking at some of this.. there was another power 
>> outage!  D'oh!  I may need to invest in a UPS or something... Until this 
>> happened, all of the osd down were from OSD2.  But OSD1 took a hit!  
>> Couldn't boot, because osd-0 was damaged... I tried xfs_repair -L /dev/sdb1 
>> as suggested by command line.. I was able to mount it again, phew, reboot... 
>> then /dev/sdb1 is no longer accessible!  N!!!
>> So this is what I have today!  I am a bit concerned as half of the osds are 
>> down!  and osd.0 doesn't look good at all...
>> # ceph osd tree
>> ID WEIGHT  TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 16.24478 root default
>> -2  8.12239    host OSD1
>>  1  1.95250        osd.1      up  1.0          1.0
>>  0  1.95250        osd.0    down        0          1.0
>>  7  0.31239        osd.7      up  1.0          1.0
>>  6  1.95250        osd.6      up  1.0          1.0
>>  2  1.95250        osd.2      up  1.0          1.0
>> -3  8.12239    host OSD2
>>  3  1.95250        osd.3    down        0          1.0
>>  4  1.95250        osd.4    down        0          1.0
>>  5  1.95250        osd.5    down        0          1.0
>>  8  1.95250        osd.8    down        0          1.0
>>  9  0.31239        osd.9      up  1.0          1.0
>> This looked alot better before that last extra power outage... =(  Can't 
>> mount it anymore!
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
>> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs 
>> inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs 
>> stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159 
>> pgs 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Ronny Aasen

On 28. aug. 2017 08:01, hjcho616 wrote:

Hello!

I've been using ceph for long time mostly for network CephFS storage, 
even before Argonaut release!  It's been working very well for me.  Yes, 
I had some power outages before and asked a few questions on this list 
before and got resolved happily!  Thank you all!


Not sure why but we've been having quite a bit of power outages lately. 
  Ceph appeared to be running OK with those going on.. so I was pretty 
happy and didn't think much of it... till yesterday, when I started to 
move some videos to cephfs, ceph decided that it was full although df 
showed only 54% utilization!  Then I looked up, some of the osds were 
down! (only 3 at that point!)


I am running pretty simple ceph configuration... I have one machine 
running MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 
SSD for journal named OSD1 and OSD2.


At the time, I was running jewel 10.2.2. I looked at some of downed 
OSD's log file and googled some of them... they appeared to be tied to 
version 10.2.2.  So I just upgraded all to 10.2.9.  Well that didn't 
solve my problems.. =P  While looking at some of this.. there was 
another power outage!  D'oh!  I may need to invest in a UPS or 
something... Until this happened, all of the osd down were from OSD2. 
  But OSD1 took a hit!  Couldn't boot, because osd-0 was damaged... I 
tried xfs_repair -L /dev/sdb1 as suggested by command line.. I was able 
to mount it again, phew, reboot... then /dev/sdb1 is no longer 
accessible!  N!!!


So this is what I have today!  I am a bit concerned as half of the osds 
are down!  and osd.0 doesn't look good at all...

# ceph osd tree
ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 16.24478 root default
-2  8.12239 host OSD1
  1  1.95250 osd.1  up  1.0  1.0
  0  1.95250 osd.0down0  1.0
  7  0.31239 osd.7  up  1.0  1.0
  6  1.95250 osd.6  up  1.0  1.0
  2  1.95250 osd.2  up  1.0  1.0
-3  8.12239 host OSD2
  3  1.95250 osd.3down0  1.0
  4  1.95250 osd.4down0  1.0
  5  1.95250 osd.5down0  1.0
  8  1.95250 osd.8down0  1.0
  9  0.31239 osd.9  up  1.0  1.0

This looked alot better before that last extra power outage... =(  Can't 
mount it anymore!

# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 
pgs inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 
16 pgs stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck 
stale; 159 pgs stuck unclean; 102 pgs stuck undersized; 102 pgs 
undersized; 1 requests are blocked > 32 sec; recovery 1803466/4503980 
objects degraded (40.042%); recovery 692976/4503980 objects misplaced 
(15.386%); recovery 147/2251990 unfound (0.007%); 1 near full osd(s); 54 
scrub errors; mds cluster is degraded; no legacy OSD present but 
'sortbitwise' flag is not set


Each of osds are showing different failure signature.

I've uploaded osd log with debug osd = 20, debug filestore = 20, and 
debug ms = 20.  You can find it in below links.  Let me know if there is 
preferred way to share this!

https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc (ceph-osd.3.log)
https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc (ceph-osd.4.log)
https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k (ceph-osd.5.log)
https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE (ceph-osd.8.log)

So how does this look?  Can this be fixed? =)  If so please let me know. 
  I used to take backups but since it grew so big, I wasn't able to do 
so anymore... and would like to get most of these back if I can.  Please 
let me know if you need more info!


Thank you!

Regards,
Hong




With only 2 osd hosts, how are you doing replication? I assume you use 
size=2, and that is somewhat OK if you have min_size=2, but if you have 
min_size=1 it can quickly become a big problem of lost objects.


With size=2, min_size=2 your data should be on 2 drives safely (if you 
can get one of them running again), but your cluster will block when 
there is an issue.
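
You can check what the pools are actually set to with, for example:

# ceph osd dump | grep 'replicated size'
# ceph osd pool get <poolname> size
# ceph osd pool get <poolname> min_size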


If at all possible I would add a third osd node to your cluster, so your 
OK PGs can replicate to it and you can work on the down osd's without 
fear of losing additional working osd's.


Also some of your logs contain lines like...

failed to bind the UNIX domain socket to 
'/var/run/ceph/ceph-osd.3.asok': (17) File exists


filestore(/var/lib/ceph/osd/ceph-3) lock_fsid failed to lock 
/var/lib/ceph/osd/ceph-3/fsid, is another ceph-osd still running? (11) 
Resource temporarily unavailable
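
Those usually just mean a previous ceph-osd process (or a stale admin socket) 
is still hanging around for that osd id; worth checking with something like:

# ps aux | grep '[c]eph-osd'
# systemctl status ceph-osd@3
# ls -l /var/run/ceph/ceph-osd.3.asok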


7faf16e23800 -1 osd.3 0 OSD::pre_init: object store 
'/var/lib/ceph/osd/ceph-3' is 

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Maged Mokhtar
I would suggest either adding 1 new disk on each of the 2 machines or
increasing the osd_backfill_full_ratio to something like 90 or 92 from the
default 85.
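
If you go the ratio route, it can be bumped at runtime without restarting the osds, e.g.:

# ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'

(and put "osd backfill full ratio = 0.92" in ceph.conf if you want it to survive 
restarts). Note this only relaxes the backfill threshold, it does not create any space.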

/Maged  

On 2017-08-28 08:01, hjcho616 wrote:

> Hello! 
> 
> I've been using ceph for long time mostly for network CephFS storage, even 
> before Argonaut release!  It's been working very well for me.  Yes, I had 
> some power outtages before and asked few questions on this list before and 
> got resolved happily!  Thank you all! 
> 
> Not sure why but we've been having quite a bit of power outages lately.  Ceph 
> appear to be running OK with those going on.. so I was pretty happy and 
> didn't thought much of it... till yesterday, When I started to move some 
> videos to cephfs, ceph decided that it was full although df showed only 54% 
> utilization!  Then I looked up, some of the osds were down! (only 3 at that 
> point!) 
> 
> I am running pretty simple ceph configuration... I have one machine running 
> MDS and mon named MDS1.  Two OSD machines with 5 2TB HDDs and 1 SSD for 
> journal named OSD1 and OSD2. 
> 
> At the time, I was running jewel 10.2.2. I looked at some of downed OSD's log 
> file and googled some of them... they appeared to be tied to version 10.2.2.  
> So I just upgraded all to 10.2.9.  Well that didn't solve my problems.. =P  
> While looking at some of this.. there was another power outage!  D'oh!  I may 
> need to invest in a UPS or something... Until this happened, all of the osd 
> down were from OSD2.  But OSD1 took a hit!  Couldn't boot, because osd-0 was 
> damaged... I tried xfs_repair -L /dev/sdb1 as suggested by command line.. I 
> was able to mount it again, phew, reboot... then /dev/sdb1 is no longer 
> accessible!  N!!! 
> 
> So this is what I have today!  I am a bit concerned as half of the osds are 
> down!  and osd.0 doesn't look good at all... 
> # ceph osd tree 
> ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY 
> -1 16.24478 root default 
> -2  8.12239 host OSD1 
> 1  1.95250 osd.1  up  1.0  1.0 
> 0  1.95250 osd.0down0  1.0 
> 7  0.31239 osd.7  up  1.0  1.0 
> 6  1.95250 osd.6  up  1.0  1.0 
> 2  1.95250 osd.2  up  1.0  1.0 
> -3  8.12239 host OSD2 
> 3  1.95250 osd.3down0  1.0 
> 4  1.95250 osd.4down0  1.0 
> 5  1.95250 osd.5down0  1.0 
> 8  1.95250 osd.8down0  1.0 
> 9  0.31239 osd.9  up  1.0  1.0 
> 
> This looked alot better before that last extra power outage... =(  Can't 
> mount it anymore! 
> # ceph health 
> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs 
> inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs 
> stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159 
> pgs stuck unclean; 102 pgs stuck undersized; 102 pgs undersized; 1 requests 
> are blocked > 32 sec; recovery 1803466/4503980 objects degraded (40.042%); 
> recovery 692976/4503980 objects misplaced (15.386%); recovery 147/2251990 
> unfound (0.007%); 1 near full osd(s); 54 scrub errors; mds cluster is 
> degraded; no legacy OSD present but 'sortbitwise' flag is not set 
> 
> Each of osds are showing different failure signature.  
> 
> I've uploaded osd log with debug osd = 20, debug filestore = 20, and debug ms 
> = 20.  You can find it in below links.  Let me know if there is preferred way 
> to share this! 
> https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc 
> (ceph-osd.3.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc 
> (ceph-osd.4.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k 
> (ceph-osd.5.log) 
> https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE 
> (ceph-osd.8.log) 
> 
> So how does this look?  Can this be fixed? =)  If so please let me know.  I 
> used to take backups but since it grew so big, I wasn't able to do so 
> anymore... and would like to get most of these back if I can.  Please let me 
> know if you need more info! 
> 
> Thank you! 
> 
> Regards, 
> Hong 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com