Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Oliver Humpage

> On 1 Jun 2017, at 14:38, Steve Taylor  wrote:
> 
> I saw this on several servers, and it took a while to track down as you can 
> imagine. Same symptoms you're reporting.

Thanks, that’s very useful info. We’re using separate Adaptec controllers, but 
will double check firmware on them. Who knows, it may even be a read cache 
issue.

I think we’re OK with the kernel, running recent CentOS.

Cheers all,

Oliver.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Steve Taylor
I've seen similar issues in the past with 4U Supermicro servers populated with 
spinning disks. In my case it turned out to be a specific firmware+BIOS 
combination on the disk controller card that was buggy. I fixed it by updating 
the firmware and BIOS on the card to the latest versions.

I saw this on several servers, and it took a while to track down as you can 
imagine. Same symptoms you're reporting.

There was a data corruption problem a while back with the Linux kernel and 
Samsung 850 Pro drives, but your problem doesn't sound like data corruption. 
Still, I'd check to make sure the kernel version you're running has the fix.






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |







On Thu, 2017-06-01 at 13:40 +0100, Oliver Humpage wrote:


On 1 Jun 2017, at 11:55, Matthew Vernon wrote:

You don't say what's in kern.log - we've had (rotating) disks that were 
throwing read errors but still saying they were OK on SMART.



Fair point. There was nothing correlating to the time that ceph logged an error 
this morning, which is why I didn’t mention it, but looking harder I see 
yesterday there was a

May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 FAILED Result: 
hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Sense Key : Hardware Error 
[current]
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Add. Sense: Internal 
target failure
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 CDB: Read(10) 28 00 77 51 
42 d8 00 02 00 00
May 31 07:20:13 osd1 kernel: blk_update_request: critical target error, dev 
sdi, sector 2001814232

sdi was the disk with the OSD affected today. Guess it’s flakey SSDs then.

Weird that just re-reading the file makes everything OK though - wondering how 
much it’s worth worrying about that, or if there’s a way of making ceph retry 
reads automatically?

Oliver.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Oliver Humpage

> On 1 Jun 2017, at 11:55, Matthew Vernon  wrote:
> 
> You don't say what's in kern.log - we've had (rotating) disks that were 
> throwing read errors but still saying they were OK on SMART.

Fair point. There was nothing correlating to the time that ceph logged an error 
this morning, which is why I didn’t mention it, but looking harder I see 
yesterday there was a

May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 FAILED Result: 
hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Sense Key : Hardware Error 
[current] 
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Add. Sense: Internal 
target failure
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 CDB: Read(10) 28 00 77 51 
42 d8 00 02 00 00
May 31 07:20:13 osd1 kernel: blk_update_request: critical target error, dev 
sdi, sector 2001814232
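
(Decoding that Read(10) CDB, bytes 2-5 are the LBA: 0x775142d8 = 2001814232, 
which matches the sector in the blk_update_request line exactly, so the error 
really is coming back from the drive itself rather than anything higher up the 
stack.)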

sdi was the disk with the OSD affected today. Guess it’s flakey SSDs then. 

Weird that just re-reading the file makes everything OK though - wondering how 
much it’s worth worrying about that, or if there’s a way of making ceph retry 
reads automatically?

Oliver.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Matthew Vernon

Hi,

On 01/06/17 10:38, Oliver Humpage wrote:


These read errors are all on Samsung 850 Pro 2TB disks (journals are
on separate enterprise SSDs). The SMART status on all of them is
similar and shows nothing out of the ordinary.

Has anyone else experienced anything similar? Is this just a curse of
non-enterprise SSDs, or do you think there might be something else
going on, e.g. could it be an XFS issue? Any suggestions as to what
to look at would be welcome.


You don't say what's in kern.log - we've had (rotating) disks that were 
throwing read errors but still saying they were OK on SMART.


Regards,

Matthew


___

ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Read errors on OSD

2017-06-01 Thread Oliver Humpage

Hello,

We have a small cluster of 44 OSDs across 4 servers.

A few times a week, ceph health reports a pg is inconsistent. Looking at the 
relevant OSD’s logs, it always says “head candidate had a read error”. No other 
info, i.e. it’s not that the digest is wrong, it just has an I/O error. It’s 
usually a different OSD each time, so it’s not a specific 
disk/controller/server.

Manually running a deep scrub on the pg succeeds, and ceph health goes back to 
normal.
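
For reference, the dance is roughly the following (the pg id here is made up; 
use whatever ceph health detail reports for your cluster):

  ceph health detail | grep inconsistent    # e.g. "pg 3.1a is active+clean+inconsistent"
  ceph pg deep-scrub 3.1a
  ceph pg repair 3.1a                       # only if the scrub still flags it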

As a test today, before scrubbing the pg I found the relevant file in 
/var/lib/ceph/osd/… and cat(1)ed it. The first time I ran cat(1) on it I got an 
Input/output error. The second time I did it, however, it worked fine.
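
If anyone wants to repeat that test while making sure the second read actually 
hits the disk rather than the page cache, a direct-I/O read is probably more 
telling than a plain cat. Something along these lines, where the path is a 
placeholder for whichever object file you dig out of the OSD's data directory:

  dd if=/path/to/object/file of=/dev/null bs=4M iflag=direct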

These read errors are all on Samsung 850 Pro 2TB disks (journals are on 
separate enterprise SSDs). The SMART status on all of them is similar and shows 
nothing out of the ordinary.
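
By "SMART status" I just mean the usual smartctl output and the drive's error 
log, i.e. roughly the following, with /dev/sdX standing in for each data disk:

  smartctl -a /dev/sdX
  smartctl -l error /dev/sdX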

Has anyone else experienced anything similar? Is this just a curse of 
non-enterprise SSDs, or do you think there might be something else going on, 
e.g. could it be an XFS issue? Any suggestions as to what to look at would be 
welcome.

Many thanks,

Oliver.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-06-02 Thread Nick Fisk
 
 On Sun, May 31, 2015 at 2:09 AM, Nick Fisk  wrote:
 
  Thanks for the suggestions. I will introduce the disk 1st and see if the 
  smart
 stats change from pending sectors to reallocated, if they don't then I will do
 the DD and smart test. It will be a good test as to what to do in this 
 situation
 as I have a feeling this will most likely happen again.
 
 Please post back when you have a result, I'd like to know the outcome.

Well the disk has finished rebalancing back into the cluster. The smart stats 
are not showing any pending sectors anymore, but strangely no reallocated ones 
either. I can only guess that when the drive tried to write to them again it 
succeeded without needing a remap???

I will continue to monitor the disk smart stats and see if I hit the same 
problem again.
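
For reference, the attributes I'm watching can be pulled out with something 
like this (device name as per the kernel log earlier in the thread):

  smartctl -A /dev/sdk | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'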

 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-06-02 Thread Gregory Farnum
On Sat, May 30, 2015 at 2:23 PM, Nick Fisk n...@fisk.me.uk wrote:

 Hi All,



 I was noticing poor performance on my cluster and when I went to investigate 
 I noticed OSD 29 was flapping up and down. On investigation it looks like it 
 has 2 pending sectors, kernel log is filled with the following



 end_request: critical medium error, dev sdk, sector 4483365656

 end_request: critical medium error, dev sdk, sector 4483365872



 I can see in the OSD logs that it looked like when the OSD was crashing it 
 was trying to scrub the PG, probably failing when the kernel passes up the 
 read error.



 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

 1: /usr/bin/ceph-osd() [0xacaf4a]

 2: (()+0x10340) [0x7fdc43032340]

 3: (gsignal()+0x39) [0x7fdc414d1cc9]

 4: (abort()+0x148) [0x7fdc414d50d8]

 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fdc41ddc6b5]

 6: (()+0x5e836) [0x7fdc41dda836]

 7: (()+0x5e863) [0x7fdc41dda863]

 8: (()+0x5eaa2) [0x7fdc41ddaaa2]

 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
 const*)+0x278) [0xbc2908]

 10: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int, bool)+0xc98) [0x9168e8]

 11: (ReplicatedBackend::be_deep_scrub(hobject_t const&, unsigned int, ScrubMap::object&, ThreadPool::TPHandle&)+0x2f9) [0xa05bf9]

 12: (PGBackend::be_scan_list(ScrubMap&, std::vector<hobject_t, std::allocator<hobject_t> > const&, bool, unsigned int, ThreadPool::TPHandle&)+0x2c8) [0x8dab98]

 13: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool, unsigned int, ThreadPool::TPHandle&)+0x1fa) [0x7f099a]

 14: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x4a2) [0x7f1132]

 15: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbe) [0x6e583e]

 16: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb38ae]

 17: (ThreadPool::WorkThread::entry()+0x10) [0xbb4950]

 18: (()+0x8182) [0x7fdc4302a182]

 19: (clone()+0x6d) [0x7fdc4159547d]



 Few questions:

 1.   Is this the expected behaviour, or should Ceph try and do something 
 to either keep the OSD down or rewrite the sector to cause a sector remap?

So the OSD is committing suicide and we want it to stay dead. But the
init system is restarting it. We are actually discussing how that
should change right now, but aren't quite sure what the right settings
are: http://tracker.ceph.com/issues/11798
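
In the meantime, if you want to stop the init system restarting it 
indefinitely, one stopgap on upstart-based boxes is a respawn limit in the 
ceph-osd job. Just a sketch, assuming the stock /etc/init/ceph-osd.conf and 
upstart's respawn-limit syntax:

  # /etc/init/ceph-osd.conf
  respawn
  respawn limit 3 1800    # give up if it crashes more than 3 times in 30 minutes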

Presuming you still have the logs, how long was the cycle time for it
to suicide, restart, and suicide again?


 2.   I am monitoring smart stats, but is there any other way of picking 
 this up or getting Ceph to highlight it? Something like a flapping OSD 
 notification would be nice.

 3.   I’m assuming at this stage this disk will not be replaceable under 
 warranty, am I best to mark it as out, let it drain and then re-introduce it 
 again, which should overwrite the sector and cause a remap? Or is there a 
 better way?

I'm not really sure about these ones. I imagine most users are
covering it via nagios monitoring of the processes themselves?
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-06-02 Thread Nick Fisk




 -Original Message-
 From: Gregory Farnum [mailto:g...@gregs42.com]
 Sent: 02 June 2015 18:34
 To: Nick Fisk
 Cc: ceph-users
 Subject: Re: [ceph-users] Read Errors and OSD Flapping
 
 On Sat, May 30, 2015 at 2:23 PM, Nick Fisk n...@fisk.me.uk wrote:
 
  Hi All,
 
 
 
  I was noticing poor performance on my cluster and when I went to
  investigate I noticed OSD 29 was flapping up and down. On
  investigation it looks like it has 2 pending sectors, kernel log is
  filled with the following
 
 
 
  end_request: critical medium error, dev sdk, sector 4483365656
 
  end_request: critical medium error, dev sdk, sector 4483365872
 
 
 
  I can see in the OSD logs that it looked like when the OSD was crashing it
 was trying to scrub the PG, probably failing when the kernel passes up the
 read error.
 
 
 
  ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 
  1: /usr/bin/ceph-osd() [0xacaf4a]
 
  2: (()+0x10340) [0x7fdc43032340]
 
  3: (gsignal()+0x39) [0x7fdc414d1cc9]
 
  4: (abort()+0x148) [0x7fdc414d50d8]
 
  5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fdc41ddc6b5]
 
  6: (()+0x5e836) [0x7fdc41dda836]
 
  7: (()+0x5e863) [0x7fdc41dda863]
 
  8: (()+0x5eaa2) [0x7fdc41ddaaa2]
 
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
  const*)+0x278) [0xbc2908]
 
  10: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int, bool)+0xc98) [0x9168e8]
  
  11: (ReplicatedBackend::be_deep_scrub(hobject_t const&, unsigned int, ScrubMap::object&, ThreadPool::TPHandle&)+0x2f9) [0xa05bf9]
  
  12: (PGBackend::be_scan_list(ScrubMap&, std::vector<hobject_t, std::allocator<hobject_t> > const&, bool, unsigned int, ThreadPool::TPHandle&)+0x2c8) [0x8dab98]
  
  13: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool, unsigned int, ThreadPool::TPHandle&)+0x1fa) [0x7f099a]
  
  14: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x4a2) [0x7f1132]
  
  15: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbe) [0x6e583e]
 
  16: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb38ae]
 
  17: (ThreadPool::WorkThread::entry()+0x10) [0xbb4950]
 
  18: (()+0x8182) [0x7fdc4302a182]
 
  19: (clone()+0x6d) [0x7fdc4159547d]
 
 
 
  Few questions:
 
  1.   Is this the expected behaviour, or should Ceph try and do something
 to either keep the OSD down or rewrite the sector to cause a sector remap?
 
 So the OSD is committing suicide and we want it to stay dead. But the init
 system is restarting it. We are actually discussing how that should change
 right now, but aren't quite sure what the right settings
 are: http://tracker.ceph.com/issues/11798
 
 Presuming you still have the logs, how long was the cycle time for it to
 suicide, restart, and suicide again?

Just looking through a few examples of it. It looks like it took about 2 
seconds from suicide to restart and then about 5 minutes till it died again.

I have taken a copy of the log, let me know if it's of any use to you.

 
 
  2.   I am monitoring smart stats, but is there any other way of picking 
  this
 up or getting Ceph to highlight it? Something like a flapping OSD notification
 would be nice.
 
  3.   I’m assuming at this stage this disk will not be replaceable under
 warranty, am I best to mark it as out, let it drain and then re-introduce it
 again, which should overwrite the sector and cause a remap? Or is there a
 better way?
 
 I'm not really sure about these ones. I imagine most users are covering it via
 nagios monitoring of the processes themselves?


 -Greg




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-06-01 Thread Robert LeBlanc



On Sun, May 31, 2015 at 2:09 AM, Nick Fisk  wrote:

 Thanks for the suggestions. I will introduce the disk 1st and see if the 
 smart stats change from pending sectors to reallocated, if they don't then I 
 will do the DD and smart test. It will be a good test as to what to do in 
 this situation as I have a feeling this will most likely happen again.

Please post back when you have a result, I'd like to know the outcome.

- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-05-31 Thread Nick Fisk
  Few questions:
 
  1.   Is this the expected behaviour, or should Ceph try and do
  something to either keep the OSD down or rewrite the sector to cause a
  sector remap?
 
 I guess what you see is what you get, but both things, especially the rewrite
 would be better.
 Alas I suppose it is a bit of work for it to do the right thing there 
 (getting the
 replica to rewrite things with data from another node) AND to be certain that this
 wasn't the last good replica, read error or not.

Agreed, it's probably best Ceph doesn't do anything unless it's 100% sure it 
has the correct data before overwriting. But it would be really nice if 
something could be done. 

 
  2.   I am monitoring smart stats, but is there any other way of
  picking this up or getting Ceph to highlight it? Something like a
  flapping OSD notification would be nice.
 
 Lots of improvement opportunities in the Ceph status indeed.
 Starting with what constitutes which level (ERR, WRN, INF).

Or maybe a counter somewhere that monitors read errors; this could help with #1, 
where Ceph could say "if I've tried 10 times to read with no luck, then 
overwrite/delete".

 
  3.   I'm assuming at this stage this disk will not be replaceable
  under warranty, am I best to mark it as out, let it drain and then
  re-introduce it again, which should overwrite the sector and cause a
  remap? Or is there a better way?
 
 That's the safe, easy way. Might want to add a dd zeroing the drive and long
 SMART test afterwards for good measure before re-adding it.
 
 A faster way might be to determine which PG, file is affected just rewrite
 this, preferably even with a good copy of the data.
 After that a deep-scrub of that PG, potentially doing a manual repair if this
 was the acting one.

Thanks for the suggestions. I will introduce the disk 1st and see if the smart 
stats change from pending sectors to reallocated, if they don't then I will do 
the DD and smart test. It will be a good test as to what to do in this 
situation as I have a feeling this will most likely happen again.

 
 Christian
 
 
  Many Thanks,
 
  Nick
 
 
 
 
 
 
 --
 Christian Balzer        Network/Systems Engineer
 ch...@gol.com Global OnLine Japan/Fusion Communications
 http://www.gol.com/




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Read Errors and OSD Flapping

2015-05-30 Thread Christian Balzer

Hello,

On Sat, 30 May 2015 22:23:22 +0100 Nick Fisk wrote:

 Hi All,
 
  
 
 I was noticing poor performance on my cluster and when I went to
 investigate I noticed OSD 29 was flapping up and down. On investigation
 it looks like it has 2 pending sectors, kernel log is filled with the
 following
 
  
 
 end_request: critical medium error, dev sdk, sector 4483365656
 
 end_request: critical medium error, dev sdk, sector 4483365872
 
  
 
 I can see in the OSD logs that it looked like when the OSD was crashing
 it was trying to scrub the PG, probably failing when the kernel passes
 up the read error. 
 
  
 
 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 
 1: /usr/bin/ceph-osd() [0xacaf4a]
 
 2: (()+0x10340) [0x7fdc43032340]
 
 3: (gsignal()+0x39) [0x7fdc414d1cc9]
 
 4: (abort()+0x148) [0x7fdc414d50d8]
 
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fdc41ddc6b5]
 
 6: (()+0x5e836) [0x7fdc41dda836]
 
 7: (()+0x5e863) [0x7fdc41dda863]
 
 8: (()+0x5eaa2) [0x7fdc41ddaaa2]
 
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
 const*)+0x278) [0xbc2908]
 
 10: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int, bool)+0xc98) [0x9168e8]
 
 11: (ReplicatedBackend::be_deep_scrub(hobject_t const&, unsigned int, ScrubMap::object&, ThreadPool::TPHandle&)+0x2f9) [0xa05bf9]
 
 12: (PGBackend::be_scan_list(ScrubMap&, std::vector<hobject_t, std::allocator<hobject_t> > const&, bool, unsigned int, ThreadPool::TPHandle&)+0x2c8) [0x8dab98]
 
 13: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool, unsigned int, ThreadPool::TPHandle&)+0x1fa) [0x7f099a]
 
 14: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x4a2) [0x7f1132]
 
 15: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbe) [0x6e583e]
 
 16: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb38ae]
 
 17: (ThreadPool::WorkThread::entry()+0x10) [0xbb4950]
 
 18: (()+0x8182) [0x7fdc4302a182]
 
 19: (clone()+0x6d) [0x7fdc4159547d]
 
  
 
 Few questions: 
 
 1.   Is this the expected behaviour, or should Ceph try and do
 something to either keep the OSD down or rewrite the sector to cause a
 sector remap?
 
I guess what you see is what you get, but both things, especially the
rewrite would be better.
Alas I suppose it is a bit of work for it to do the right thing there
(getting the replica to rewrite things with data from another node) AND to be
certain that this wasn't the last good replica, read error or not. 

 2.   I am monitoring smart stats, but is there any other way of
 picking this up or getting Ceph to highlight it? Something like a
 flapping OSD notification would be nice.
 
Lots of improvement opportunities in the Ceph status indeed. 
Starting with what constitutes which level (ERR, WRN, INF).

 3.   I'm assuming at this stage this disk will not be replaceable
 under warranty, am I best to mark it as out, let it drain and then
 re-introduce it again, which should overwrite the sector and cause a
 remap? Or is there a better way?

That's the safe, easy way. Might want to add a dd zeroing the drive and
long SMART test afterwards for good measure before re-adding it.

A faster way might be to determine which PG, file is affected just rewrite
this, preferably even with a good copy of the data. 
After that a deep-scrub of that PG, potentially doing a manual repair if
this was the acting one.
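
Put as commands, the slow-but-safe route is roughly this (OSD id and device 
taken from your mail; double-check both before running anything destructive):

  ceph osd out 29                 # let it drain
  # once backfilling finishes, stop the OSD, then:
  dd if=/dev/zero of=/dev/sdk bs=1M oflag=direct
  smartctl -t long /dev/sdk       # review later with: smartctl -l selftest /dev/sdk
  # re-create / re-add the OSD if the drive still looks healthy

and the faster route, once you have found and rewritten the affected file with 
a good copy:

  ceph pg deep-scrub <pgid>
  ceph pg repair <pgid>           # the manual repair, if it was the acting copy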
 
Christian
  
 
 Many Thanks,
 
 Nick
 
 
 
 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Read Errors and OSD Flapping

2015-05-30 Thread Nick Fisk
Hi All,

 

I was noticing poor performance on my cluster and when I went to investigate
I noticed OSD 29 was flapping up and down. On investigation it looks like it
has 2 pending sectors, kernel log is filled with the following

 

end_request: critical medium error, dev sdk, sector 4483365656

end_request: critical medium error, dev sdk, sector 4483365872

 

I can see in the OSD logs that it looked like when the OSD was crashing it
was trying to scrub the PG, probably failing when the kernel passes up the
read error. 

 

ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

1: /usr/bin/ceph-osd() [0xacaf4a]

2: (()+0x10340) [0x7fdc43032340]

3: (gsignal()+0x39) [0x7fdc414d1cc9]

4: (abort()+0x148) [0x7fdc414d50d8]

5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fdc41ddc6b5]

6: (()+0x5e836) [0x7fdc41dda836]

7: (()+0x5e863) [0x7fdc41dda863]

8: (()+0x5eaa2) [0x7fdc41ddaaa2]

9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x278) [0xbc2908]

10: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int, bool)+0xc98) [0x9168e8]

11: (ReplicatedBackend::be_deep_scrub(hobject_t const&, unsigned int, ScrubMap::object&, ThreadPool::TPHandle&)+0x2f9) [0xa05bf9]

12: (PGBackend::be_scan_list(ScrubMap&, std::vector<hobject_t, std::allocator<hobject_t> > const&, bool, unsigned int, ThreadPool::TPHandle&)+0x2c8) [0x8dab98]

13: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool, unsigned int, ThreadPool::TPHandle&)+0x1fa) [0x7f099a]

14: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x4a2) [0x7f1132]

15: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbe) [0x6e583e]

16: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb38ae]

17: (ThreadPool::WorkThread::entry()+0x10) [0xbb4950]

18: (()+0x8182) [0x7fdc4302a182]

19: (clone()+0x6d) [0x7fdc4159547d]

 

Few questions: 

1.   Is this the expected behaviour, or should Ceph try and do something
to either keep the OSD down or rewrite the sector to cause a sector remap?

2.   I am monitoring smart stats, but is there any other way of picking
this up or getting Ceph to highlight it? Something like a flapping OSD
notification would be nice.

3.   I'm assuming at this stage this disk will not be replaceable under
warranty, am I best to mark it as out, let it drain and then re-introduce it
again, which should overwrite the sector and cause a remap? Or is there a
better way?

 

Many Thanks,

Nick




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com