Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-31 Thread Lionel Bouton
On 31/05/2018 14:41, Simon Ironside wrote:
> On 24/05/18 19:21, Lionel Bouton wrote:
>
>> Unfortunately I just learned that Supermicro found an incompatibility
>> between this motherboard and SM863a SSDs (I don't have more information
>> yet) and they proposed S4600 as an alternative. I immediately remembered
>> that there were problems and asked for a delay/more information and dug
>> out this old thread.
>
> In case it helps you, I'm about to go down the same Supermicro EPYC
> and SM863a path as you. I asked about the incompatibility you
> mentioned and they knew what I was referring to. The incompatibility
> is between the on-board SATA controller and the SM863a and has
> apparently already been fixed.

That's good news.

> Even if not fixed, the incompatibility wouldn't be present if you're
> using a RAID controller instead of the on board SATA (which I intend
> to - don't know if you were?).

I wasn't: we plan to use the 14 on-board SATA connectors. For as long as
we can, we use standard SATA/AHCI controllers, as they cause fewer
headaches than RAID controllers, even in HBA mode.
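
For what it's worth, confirming that the disks really sit behind a plain
AHCI controller is quick; something along these lines should do (generic
commands, nothing specific to this board):

lspci | grep -i -E 'sata|ahci'   # should list an AHCI-mode SATA controller
dmesg | grep -i ahci             # confirms the ahci driver claimed it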

Thanks a lot for this information, I've forwarded it to our Supermicro
reseller.

Best regards,

Lionel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-31 Thread Simon Ironside

On 24/05/18 19:21, Lionel Bouton wrote:


Unfortunately I just learned that Supermicro found an incompatibility
between this motherboard and SM863a SSDs (I don't have more information
yet) and they proposed S4600 as an alternative. I immediately remembered
that there were problems and asked for a delay/more information and dug
out this old thread.


In case it helps you, I'm about to go down the same Supermicro EPYC and 
SM863a path as you. I asked about the incompatibility you mentioned and 
they knew what I was referring to. The incompatibility is between the 
on-board SATA controller and the SM863a and has apparently already been 
fixed. Even if not fixed, the incompatibility wouldn't be present if 
you're using a RAID controller instead of the on board SATA (which I 
intend to - don't know if you were?).


HTH,
Simon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-29 Thread Simon Ironside

On 24/05/18 19:21, Lionel Bouton wrote:


Has anyone successfully used Ceph with S4600 ? If so could you share if
you used filestore or bluestore, which firmware was used and
approximately how much data was written on the most used SSDs ?


I have 4 new OSD nodes which have 480GB S4600s (Firmware revision: 
SCV10100) as journals for spinning disk Hammer Filestore OSDs. They're 
relatively new but have been in production for a couple of months 
without issue, touch wood.


My monitors are using relatively new (< 2 months old) 240GB S4500s 
(Firmware revision: SCV10121) again without issue to date.
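
In case anyone wants to double-check what firmware their own drives are
actually running, smartctl reports it (assuming smartmontools is installed;
the exact label can vary slightly between drives, and /dev/sdX is just a
placeholder):

smartctl -i /dev/sdX | grep -i firmware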


Was there any conclusion to this? Was the OP just unlucky? I note that 
Red Hat specifically recommend S4600s here so David's story is a heck of 
a shock:


https://www.redhat.com/cms/managed-files/st-ceph-storage-intel-configuration-guide-technology-detail-f11532-201804-en.pdf

Regards,
Simon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-24 Thread David Turner
I have some bluestore DC S4500s in my 3 node home cluster.  I haven't ever
had any problems with them.  I've used them with an EC cache tier, cephfs
metadata, and VM RBDs.
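
For reference, the cache tier part is just the standard tiering setup, very
roughly like the below (the pool names are only placeholders, and you would
still want to set target_max_bytes and the hit set options before relying
on it):

ceph osd pool create cache-pool 128 128
ceph osd tier add ec-pool cache-pool
ceph osd tier cache-mode cache-pool writeback
ceph osd tier set-overlay ec-pool cache-pool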

On Thu, May 24, 2018 at 2:21 PM Lionel Bouton 
wrote:

> Hi,
>
> On 22/02/2018 23:32, Mike Lovell wrote:
> > hrm. intel has, until a year ago, been very good with ssds. the
> > description of your experience definitely doesn't inspire confidence.
> > intel also dropping the entire s3xxx and p3xxx series last year before
> > having a viable replacement has been driving me nuts.
> >
> > i don't know that i have the luxury of being able to return all of the
> > ones i have or just buying replacements. i'm going to need to at least
> > try them in production. it'll probably happen with the s4600 limited
> > to a particular fault domain. these are also going to be filestore
> > osds so maybe that will result in a different behavior. i'll try to
> > post updates as i have them.
>
> Sorry for the deep digging into the archives. I might be in a situation
> where I could get S4600 (with filestore initially but I would very much
> like them to support Bluestore without bursting into flames).
>
> To expand a Ceph cluster and test EPYC in our context we have ordered a
> server based on a Supermicro EPYC motherboard and SM863a SSDs. For
> reference :
> https://www.supermicro.nl/Aplus/motherboard/EPYC7000/H11DSU-iN.cfm
>
> Unfortunately I just learned that Supermicro found an incompatibility
> between this motherboard and SM863a SSDs (I don't have more information
> yet) and they proposed S4600 as an alternative. I immediately remembered
> that there were problems and asked for a delay/more information and dug
> out this old thread.
>
> Has anyone successfully used Ceph with S4600 ? If so could you share if
> you used filestore or bluestore, which firmware was used and
> approximately how much data was written on the most used SSDs ?
>
> Best regards,
>
> Lionel
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-24 Thread Lionel Bouton
Hi,

On 22/02/2018 23:32, Mike Lovell wrote:
> hrm. intel has, until a year ago, been very good with ssds. the
> description of your experience definitely doesn't inspire confidence.
> intel also dropping the entire s3xxx and p3xxx series last year before
> having a viable replacement has been driving me nuts.
>
> i don't know that i have the luxury of being able to return all of the
> ones i have or just buying replacements. i'm going to need to at least
> try them in production. it'll probably happen with the s4600 limited
> to a particular fault domain. these are also going to be filestore
> osds so maybe that will result in a different behavior. i'll try to
> post updates as i have them.

Sorry for digging so deep into the archives. I might be in a situation
where I could get S4600s (with filestore initially, but I would very much
like them to support Bluestore without bursting into flames).

To expand a Ceph cluster and test EPYC in our context we have ordered a
server based on a Supermicro EPYC motherboard and SM863a SSDs. For
reference:
https://www.supermicro.nl/Aplus/motherboard/EPYC7000/H11DSU-iN.cfm

Unfortunately I just learned that Supermicro found an incompatibility
between this motherboard and SM863a SSDs (I don't have more information
yet) and they proposed S4600 as an alternative. I immediately remembered
that there were problems and asked for a delay/more information and dug
out this old thread.

Has anyone successfully used Ceph with the S4600? If so, could you share
whether you used filestore or bluestore, which firmware was used, and
approximately how much data was written to the most-used SSDs?
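
(Side note: an approximate figure for data written can be pulled from
SMART, assuming smartmontools is installed; the attribute name differs per
vendor, e.g. Total_LBAs_Written, or Host_Writes_32MiB on the Intel DC
drives, and /dev/sdX is a placeholder:)

smartctl -A /dev/sdX | grep -i -E 'written|host_writes|wear'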

Best regards,

Lionel

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-23 Thread Caspar Smit
Hi all,

Thanks for all your follow ups on this. The Samsung SM863a is indeed a very
good alternative, thanks!
We ordered both (SM863a & DC S4600) so we can compare.
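
For the comparison we will most likely run the usual single-threaded
O_DSYNC write test with fio, roughly like the below (the device name is
only an example and the test is destructive to any data on it):

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60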

Intel's response (I mean the lack of it) is not very promising. Although
we have very good experiences with Intel DC SSDs, we still want to give
them a chance.
Hopefully the SCV10111 firmware has fixed the issue! (The changelog for the
firmware doesn't mention any major problem being fixed though, only 'bugfixes'.)

Will let you know the results (probably in a month or two).

Kind regards,
Caspar

2018-02-23 0:18 GMT+01:00 Mike Lovell :

> adding ceph-users back on.
>
> it sounds like the enterprise samsungs and hitachis have been mentioned
> on the list as alternatives. i have 2 micron 5200 (pro i think) that i'm
> beginning testing on and have some micron 9100 nvme drives to use as
> journals. so the enterprise micron might be good. i did try some micron
> m600s a couple years ago and was disappointed by them so i'm avoiding the
> "prosumer" ones from micron if i can. my use case has been the 1TB range
> ssds and am using them mainly as a cache tier and filestore. my needs might
> not line up closely with yours though.
>
> mike
>
> On Thu, Feb 22, 2018 at 3:58 PM, Hans Chris Jones <
> chris.jo...@lambdastack.io> wrote:
>
>> Interesting. This does not inspire confidence. What SSDs (2TB or 4TB) do
>> people have good success with in high use production systems with bluestore?
>>
>> Thanks
>>
>> On Thu, Feb 22, 2018 at 5:32 PM, Mike Lovell 
>> wrote:
>>
>>> hrm. intel has, until a year ago, been very good with ssds. the
>>> description of your experience definitely doesn't inspire confidence. intel
>>> also dropping the entire s3xxx and p3xxx series last year before having a
>>> viable replacement has been driving me nuts.
>>>
>>> i don't know that i have the luxury of being able to return all of the
>>> ones i have or just buying replacements. i'm going to need to at least try
>>> them in production. it'll probably happen with the s4600 limited to a
>>> particular fault domain. these are also going to be filestore osds so maybe
>>> that will result in a different behavior. i'll try to post updates as i
>>> have them.
>>>
>>> mike
>>>
>>> On Thu, Feb 22, 2018 at 2:33 PM, David Herselman  wrote:
>>>
 Hi Mike,



 I eventually got hold of a customer relations manager at Intel but his
 attitude was lack luster and Intel never officially responded to any
 correspondence we sent them. The Intel s4600 drives all passed our standard
 burn-in tests, they exclusively appear to fail once they handle production
 BlueStore usage, generally after a couple days use.



 Intel really didn’t seem interested, even after explaining that the
 drives were in different physical systems in different data centres and
 that I had been in contact with another Intel customer who had experienced
 similar failures in Dell equipment (our servers are pure Intel).





 Perhaps there’s interest in a Lawyer picking up the issue and their
 attitude. Not advising customers of a known issue which leads to data loss
 is simply negligent, especially on a product that they tout as being more
 reliable than spinners and has their Data Centre reliability stamp.



 I returned the lot and am done with Intel SSDs, will advise as many
 customers and peers to do the same…





 Regards

 David Herselman



>>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Mike Lovell
adding ceph-users back on.

it sounds like the enterprise samsungs and hitachis have been mentioned on
the list as alternatives. i have 2 micron 5200 (pro i think) that i'm
beginning testing on and have some micron 9100 nvme drives to use as
journals. so the enterprise micron might be good. i did try some micron
m600s a couple years ago and was disappointed by them so i'm avoiding the
"prosumer" ones from micron if i can. my use case has been the 1TB range
ssds and am using them mainly as a cache tier and filestore. my needs might
not line up closely with yours though.

mike

On Thu, Feb 22, 2018 at 3:58 PM, Hans Chris Jones <
chris.jo...@lambdastack.io> wrote:

> Interesting. This does not inspire confidence. What SSDs (2TB or 4TB) do
> people have good success with in high use production systems with bluestore?
>
> Thanks
>
> On Thu, Feb 22, 2018 at 5:32 PM, Mike Lovell 
> wrote:
>
>> hrm. intel has, until a year ago, been very good with ssds. the
>> description of your experience definitely doesn't inspire confidence. intel
>> also dropping the entire s3xxx and p3xxx series last year before having a
>> viable replacement has been driving me nuts.
>>
>> i don't know that i have the luxury of being able to return all of the
>> ones i have or just buying replacements. i'm going to need to at least try
>> them in production. it'll probably happen with the s4600 limited to a
>> particular fault domain. these are also going to be filestore osds so maybe
>> that will result in a different behavior. i'll try to post updates as i
>> have them.
>>
>> mike
>>
>> On Thu, Feb 22, 2018 at 2:33 PM, David Herselman  wrote:
>>
>>> Hi Mike,
>>>
>>>
>>>
>>> I eventually got hold of a customer relations manager at Intel but his
>>> attitude was lack luster and Intel never officially responded to any
>>> correspondence we sent them. The Intel s4600 drives all passed our standard
>>> burn-in tests, they exclusively appear to fail once they handle production
>>> BlueStore usage, generally after a couple days use.
>>>
>>>
>>>
>>> Intel really didn’t seem interested, even after explaining that the
>>> drives were in different physical systems in different data centres and
>>> that I had been in contact with another Intel customer who had experienced
>>> similar failures in Dell equipment (our servers are pure Intel).
>>>
>>>
>>>
>>>
>>>
>>> Perhaps there’s interest in a Lawyer picking up the issue and their
>>> attitude. Not advising customers of a known issue which leads to data loss
>>> is simply negligent, especially on a product that they tout as being more
>>> reliable than spinners and has their Data Centre reliability stamp.
>>>
>>>
>>>
>>> I returned the lot and am done with Intel SSDs, will advise as many
>>> customers and peers to do the same…
>>>
>>>
>>>
>>>
>>>
>>> Regards
>>>
>>> David Herselman
>>>
>>>
>>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Mike Lovell
hrm. intel has, until a year ago, been very good with ssds. the description
of your experience definitely doesn't inspire confidence. intel also
dropping the entire s3xxx and p3xxx series last year before having a viable
replacement has been driving me nuts.

i don't know that i have the luxury of being able to return all of the ones
i have or just buying replacements. i'm going to need to at least try them
in production. it'll probably happen with the s4600 limited to a particular
fault domain. these are also going to be filestore osds so maybe that will
result in a different behavior. i'll try to post updates as i have them.
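
for the curious, limiting them to a fault domain is just crush bucket
shuffling on my end, roughly something like this (bucket and host names are
made up):

ceph osd crush add-bucket rack-s4600 rack
ceph osd crush move rack-s4600 root=default
ceph osd crush move host-with-s4600s rack=rack-s4600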

mike

On Thu, Feb 22, 2018 at 2:33 PM, David Herselman <d...@syrex.co> wrote:

> Hi Mike,
>
>
>
> I eventually got hold of a customer relations manager at Intel but his
> attitude was lack luster and Intel never officially responded to any
> correspondence we sent them. The Intel s4600 drives all passed our standard
> burn-in tests, they exclusively appear to fail once they handle production
> BlueStore usage, generally after a couple days use.
>
>
>
> Intel really didn’t seem interested, even after explaining that the drives
> were in different physical systems in different data centres and that I had
> been in contact with another Intel customer who had experienced similar
> failures in Dell equipment (our servers are pure Intel).
>
>
>
>
>
> Perhaps there’s interest in a Lawyer picking up the issue and their
> attitude. Not advising customers of a known issue which leads to data loss
> is simply negligent, especially on a product that they tout as being more
> reliable than spinners and has their Data Centre reliability stamp.
>
>
>
> I returned the lot and am done with Intel SSDs, will advise as many
> customers and peers to do the same…
>
>
>
>
>
> Regards
>
> David Herselman
>
>
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Mike Lovell
> *Sent:* Thursday, 22 February 2018 11:19 PM
> *To:* ceph-users@lists.ceph.com
>
> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
> activate pgs?
>
>
>
> has anyone tried with the most recent firmwares from intel? i've had a
> number of s4600 960gb drives that have been waiting for me to get around to
> adding them to a ceph cluster. this as well as having 2 die almost
> simultaneously in a different storage box is giving me pause. i noticed
> that David listed some output showing his ssds were running firmware
> version SCV10100. the drives i have came with the same one. it looks
> like SCV10111 is available through the latest isdct package. i'm working
> through upgrading mine and attempting some burn in testing. just curious if
> anyone has had any luck there.
>
>
>
> mike
>
>
>
> On Thu, Feb 22, 2018 at 9:49 AM, Chris Sarginson <csarg...@gmail.com>
> wrote:
>
> Hi Caspar,
>
>
>
> Sean and I replaced the problematic DC S4600 disks (after all but one had
> failed) in our cluster with Samsung SM863a disks.
>
> There was an NDA for new Intel firmware (as mentioned earlier in the
> thread by David) but given the problems we were experiencing we moved all
> Intel disks to a single failure domain but were unable to get to deploy
> additional firmware to test.
>
>
> The Samsung should fit your requirements.
>
>
>
> http://www.samsung.com/semiconductor/minisite/ssd/
> product/enterprise/sm863a/
>
>
>
> Regards
>
> Chris
>
>
>
> On Thu, 22 Feb 2018 at 12:50 Caspar Smit <caspars...@supernas.eu> wrote:
>
> Hi Sean and David,
>
>
>
> Do you have any follow ups / news on the Intel DC S4600 case? We are
> looking into this drives to use as DB/WAL devices for a new to be build
> cluster.
>
>
>
> Did Intel provide anything (like new firmware) which should fix the issues
> you were having or are these drives still unreliable?
>
>
>
> At the moment we are also looking into the Intel DC S3610 as an
> alternative which are a step back in performance but should be very
> reliable.
>
>
>
> Maybe any other recommendations for a ~200GB 2,5" SATA SSD to use as
> DB/WAL? (Aiming for ~3 DWPD should be sufficient for DB/WAL?)
>
>
>
> Kind regards,
>
> Caspar
>
>
>
> 2018-01-12 15:45 GMT+01:00 Sean Redmond <sean.redmo...@gmail.com>:
>
> Hi David,
>
>
>
> To follow up on this I had a 4th drive fail (out of 12) and have opted to
> order the below disks as a replacement, I have an ongoing case with Intel
> via the supplier - Will report back anything useful - But I am going to
> avoid the Intel s4600 2TB SSD's for the moment.
>
>
>

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread David Herselman
Hi Mike,

I eventually got hold of a customer relations manager at Intel but his attitude
was lackluster and Intel never officially responded to any correspondence we
sent them. The Intel S4600 drives all passed our standard burn-in tests; they
only appear to fail once they handle production BlueStore usage, generally
after a couple of days' use.

Intel really didn’t seem interested, even after explaining that the drives were 
in different physical systems in different data centres and that I had been in 
contact with another Intel customer who had experienced similar failures in 
Dell equipment (our servers are pure Intel).


Perhaps a lawyer would be interested in picking up the issue and their attitude.
Not advising customers of a known issue which leads to data loss is simply
negligent, especially on a product that they tout as being more reliable than
spinners and that carries their Data Centre reliability stamp.

I returned the lot and am done with Intel SSDs, and will advise as many
customers and peers as I can to do the same…


Regards
David Herselman


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mike 
Lovell
Sent: Thursday, 22 February 2018 11:19 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Many concurrent drive failures - How do I activate 
pgs?

has anyone tried with the most recent firmwares from intel? i've had a number 
of s4600 960gb drives that have been waiting for me to get around to adding 
them to a ceph cluster. this as well as having 2 die almost simultaneously in a 
different storage box is giving me pause. i noticed that David listed some 
output showing his ssds were running firmware version SCV10100. the drives i 
have came with the same one. it looks like SCV10111 is available through the 
latest isdct package. i'm working through upgrading mine and attempting some 
burn in testing. just curious if anyone has had any luck there.

mike

On Thu, Feb 22, 2018 at 9:49 AM, Chris Sarginson 
<csarg...@gmail.com<mailto:csarg...@gmail.com>> wrote:
Hi Caspar,

Sean and I replaced the problematic DC S4600 disks (after all but one had 
failed) in our cluster with Samsung SM863a disks.
There was an NDA for new Intel firmware (as mentioned earlier in the thread by 
David) but given the problems we were experiencing we moved all Intel disks to 
a single failure domain but were unable to get to deploy additional firmware to 
test.

The Samsung should fit your requirements.

http://www.samsung.com/semiconductor/minisite/ssd/product/enterprise/sm863a/

Regards
Chris

On Thu, 22 Feb 2018 at 12:50 Caspar Smit 
<caspars...@supernas.eu<mailto:caspars...@supernas.eu>> wrote:
Hi Sean and David,

Do you have any follow ups / news on the Intel DC S4600 case? We are looking 
into this drives to use as DB/WAL devices for a new to be build cluster.

Did Intel provide anything (like new firmware) which should fix the issues you 
were having or are these drives still unreliable?

At the moment we are also looking into the Intel DC S3610 as an alternative 
which are a step back in performance but should be very reliable.

Maybe any other recommendations for a ~200GB 2,5" SATA SSD to use as DB/WAL? 
(Aiming for ~3 DWPD should be sufficient for DB/WAL?)

Kind regards,
Caspar

2018-01-12 15:45 GMT+01:00 Sean Redmond 
<sean.redmo...@gmail.com<mailto:sean.redmo...@gmail.com>>:
Hi David,

To follow up on this I had a 4th drive fail (out of 12) and have opted to order 
the below disks as a replacement, I have an ongoing case with Intel via the 
supplier - Will report back anything useful - But I am going to avoid the Intel 
s4600 2TB SSD's for the moment.

1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND

Regards
Sean Redmond

On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond 
<sean.redmo...@gmail.com<mailto:sean.redmo...@gmail.com>> wrote:
Hi David,

Thanks for your email, they are connected inside Dell R730XD (2.5 inch 24 disk 
model) in None RAID mode via a perc RAID card.

The version of ceph is Jewel with kernel 4.13.X and ubuntu 16.04.

Thanks for your feedback on the HGST disks.

Thanks

On Wed, Jan 10, 2018 at 10:55 PM, David Herselman 
<d...@syrex.co<mailto:d...@syrex.co>> wrote:
Hi Sean,

No, Intel’s feedback has been… Pathetic… I have yet to receive anything more 
than a request to ‘sign’ a non-disclosure agreement, to obtain beta firmware. 
No official answer as to whether or not one can logically unlock the drives, no 
answer to my question whether or not Intel publish serial numbers anywhere 
pertaining to recalled batches and no information pertaining to whether or not 
firmware updates would address any known issues.

This with us being an accredited Intel Gold partner…


We’ve returned the lot and ended up with 9/12 of the drives failing in the same 
manner. The replaced drives, which had different serial number ranges, also 
failed. Very frustrating is that the drives fail in a 

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Mike Lovell
has anyone tried with the most recent firmwares from intel? i've had a
number of s4600 960gb drives that have been waiting for me to get around to
adding them to a ceph cluster. this as well as having 2 die almost
simultaneously in a different storage box is giving me pause. i noticed
that David listed some output showing his ssds were running firmware
version SCV10100. the drives i have came with the same one. it looks
like SCV10111 is available through the latest isdct package. i'm working
through upgrading mine and attempting some burn in testing. just curious if
anyone has had any luck there.
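
in case it saves someone a search, the isdct invocations i'm using are
roughly these (the index 0 is just an example, check which drive it maps to
first):

isdct show -intelssd
isdct load -intelssd 0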

mike

On Thu, Feb 22, 2018 at 9:49 AM, Chris Sarginson <csarg...@gmail.com> wrote:

> Hi Caspar,
>
> Sean and I replaced the problematic DC S4600 disks (after all but one had
> failed) in our cluster with Samsung SM863a disks.
> There was an NDA for new Intel firmware (as mentioned earlier in the
> thread by David) but given the problems we were experiencing we moved all
> Intel disks to a single failure domain but were unable to get to deploy
> additional firmware to test.
>
> The Samsung should fit your requirements.
>
> http://www.samsung.com/semiconductor/minisite/ssd/
> product/enterprise/sm863a/
>
> Regards
> Chris
>
> On Thu, 22 Feb 2018 at 12:50 Caspar Smit <caspars...@supernas.eu> wrote:
>
>> Hi Sean and David,
>>
>> Do you have any follow ups / news on the Intel DC S4600 case? We are
>> looking into this drives to use as DB/WAL devices for a new to be build
>> cluster.
>>
>> Did Intel provide anything (like new firmware) which should fix the
>> issues you were having or are these drives still unreliable?
>>
>> At the moment we are also looking into the Intel DC S3610 as an
>> alternative which are a step back in performance but should be very
>> reliable.
>>
>> Maybe any other recommendations for a ~200GB 2,5" SATA SSD to use as
>> DB/WAL? (Aiming for ~3 DWPD should be sufficient for DB/WAL?)
>>
>> Kind regards,
>> Caspar
>>
>> 2018-01-12 15:45 GMT+01:00 Sean Redmond <sean.redmo...@gmail.com>:
>>
>>> Hi David,
>>>
>>> To follow up on this I had a 4th drive fail (out of 12) and have opted
>>> to order the below disks as a replacement, I have an ongoing case with
>>> Intel via the supplier - Will report back anything useful - But I am going
>>> to avoid the Intel s4600 2TB SSD's for the moment.
>>>
>>> 1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND
>>>
>>> Regards
>>> Sean Redmond
>>>
>>> On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond <sean.redmo...@gmail.com>
>>> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Thanks for your email, they are connected inside Dell R730XD (2.5 inch
>>>> 24 disk model) in None RAID mode via a perc RAID card.
>>>>
>>>> The version of ceph is Jewel with kernel 4.13.X and ubuntu 16.04.
>>>>
>>>> Thanks for your feedback on the HGST disks.
>>>>
>>>> Thanks
>>>>
>>>> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:
>>>>
>>>>> Hi Sean,
>>>>>
>>>>>
>>>>>
>>>>> No, Intel’s feedback has been… Pathetic… I have yet to receive
>>>>> anything more than a request to ‘sign’ a non-disclosure agreement, to
>>>>> obtain beta firmware. No official answer as to whether or not one can
>>>>> logically unlock the drives, no answer to my question whether or not Intel
>>>>> publish serial numbers anywhere pertaining to recalled batches and no
>>>>> information pertaining to whether or not firmware updates would address 
>>>>> any
>>>>> known issues.
>>>>>
>>>>>
>>>>>
>>>>> This with us being an accredited Intel Gold partner…
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> We’ve returned the lot and ended up with 9/12 of the drives failing in
>>>>> the same manner. The replaced drives, which had different serial number
>>>>> ranges, also failed. Very frustrating is that the drives fail in a way 
>>>>> that
>>>>> result in unbootable servers, unless one adds ‘rootdelay=240’ to the 
>>>>> kernel.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I would be interested to know what platfo

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Chris Sarginson
Hi Caspar,

Sean and I replaced the problematic DC S4600 disks (after all but one had
failed) in our cluster with Samsung SM863a disks.
There was an NDA for new Intel firmware (as mentioned earlier in the thread
by David), but given the problems we were experiencing we moved all Intel
disks to a single failure domain and were never able to deploy the
additional firmware to test.

The Samsung should fit your requirements.

http://www.samsung.com/semiconductor/minisite/ssd/product/enterprise/sm863a/

Regards
Chris

On Thu, 22 Feb 2018 at 12:50 Caspar Smit <caspars...@supernas.eu> wrote:

> Hi Sean and David,
>
> Do you have any follow ups / news on the Intel DC S4600 case? We are
> looking into this drives to use as DB/WAL devices for a new to be build
> cluster.
>
> Did Intel provide anything (like new firmware) which should fix the issues
> you were having or are these drives still unreliable?
>
> At the moment we are also looking into the Intel DC S3610 as an
> alternative which are a step back in performance but should be very
> reliable.
>
> Maybe any other recommendations for a ~200GB 2,5" SATA SSD to use as
> DB/WAL? (Aiming for ~3 DWPD should be sufficient for DB/WAL?)
>
> Kind regards,
> Caspar
>
> 2018-01-12 15:45 GMT+01:00 Sean Redmond <sean.redmo...@gmail.com>:
>
>> Hi David,
>>
>> To follow up on this I had a 4th drive fail (out of 12) and have opted to
>> order the below disks as a replacement, I have an ongoing case with Intel
>> via the supplier - Will report back anything useful - But I am going to
>> avoid the Intel s4600 2TB SSD's for the moment.
>>
>> 1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND
>>
>> Regards
>> Sean Redmond
>>
>> On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond <sean.redmo...@gmail.com>
>> wrote:
>>
>>> Hi David,
>>>
>>> Thanks for your email, they are connected inside Dell R730XD (2.5 inch
>>> 24 disk model) in None RAID mode via a perc RAID card.
>>>
>>> The version of ceph is Jewel with kernel 4.13.X and ubuntu 16.04.
>>>
>>> Thanks for your feedback on the HGST disks.
>>>
>>> Thanks
>>>
>>> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:
>>>
>>>> Hi Sean,
>>>>
>>>>
>>>>
>>>> No, Intel’s feedback has been… Pathetic… I have yet to receive anything
>>>> more than a request to ‘sign’ a non-disclosure agreement, to obtain beta
>>>> firmware. No official answer as to whether or not one can logically unlock
>>>> the drives, no answer to my question whether or not Intel publish serial
>>>> numbers anywhere pertaining to recalled batches and no information
>>>> pertaining to whether or not firmware updates would address any known
>>>> issues.
>>>>
>>>>
>>>>
>>>> This with us being an accredited Intel Gold partner…
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> We’ve returned the lot and ended up with 9/12 of the drives failing in
>>>> the same manner. The replaced drives, which had different serial number
>>>> ranges, also failed. Very frustrating is that the drives fail in a way that
>>>> result in unbootable servers, unless one adds ‘rootdelay=240’ to the 
>>>> kernel.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I would be interested to know what platform your drives were in and
>>>> whether or not they were connected to a RAID module/card.
>>>>
>>>>
>>>>
>>>> PS: After much searching we’ve decided to order the NVMe conversion kit
>>>> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
>>>> rating.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> David Herselman
>>>>
>>>>
>>>>
>>>> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com]
>>>> *Sent:* Thursday, 11 January 2018 12:45 AM
>>>> *To:* David Herselman <d...@syrex.co>
>>>> *Cc:* Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
>>>>
>>>> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
>>>> activate pgs?
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> I have a case where 3 out to 12 of 

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Caspar Smit
Hi Sean and David,

Do you have any follow-ups / news on the Intel DC S4600 case? We are
looking into these drives to use as DB/WAL devices for a new cluster we are
about to build.

Did Intel provide anything (like new firmware) which should fix the issues
you were having, or are these drives still unreliable?

At the moment we are also looking into the Intel DC S3610 as an
alternative, which is a step back in performance but should be very reliable.

Maybe any other recommendations for a ~200GB 2.5" SATA SSD to use as
DB/WAL? (Aiming for ~3 DWPD, which should be sufficient for DB/WAL?)
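
(Rough sanity check on the endurance figure, assuming a 200GB drive, 3 DWPD
and a 5 year warranty period:

echo "200 * 3 * 365 * 5" | bc    # ~1,095,000 GB, i.e. roughly 1.1 PB of writes over the warranty

which should be plenty for a small DB/WAL partition.)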

Kind regards,
Caspar

2018-01-12 15:45 GMT+01:00 Sean Redmond <sean.redmo...@gmail.com>:

> Hi David,
>
> To follow up on this I had a 4th drive fail (out of 12) and have opted to
> order the below disks as a replacement, I have an ongoing case with Intel
> via the supplier - Will report back anything useful - But I am going to
> avoid the Intel s4600 2TB SSD's for the moment.
>
> 1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND
>
> Regards
> Sean Redmond
>
> On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond <sean.redmo...@gmail.com>
> wrote:
>
>> Hi David,
>>
>> Thanks for your email, they are connected inside Dell R730XD (2.5 inch 24
>> disk model) in None RAID mode via a perc RAID card.
>>
>> The version of ceph is Jewel with kernel 4.13.X and ubuntu 16.04.
>>
>> Thanks for your feedback on the HGST disks.
>>
>> Thanks
>>
>> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:
>>
>>> Hi Sean,
>>>
>>>
>>>
>>> No, Intel’s feedback has been… Pathetic… I have yet to receive anything
>>> more than a request to ‘sign’ a non-disclosure agreement, to obtain beta
>>> firmware. No official answer as to whether or not one can logically unlock
>>> the drives, no answer to my question whether or not Intel publish serial
>>> numbers anywhere pertaining to recalled batches and no information
>>> pertaining to whether or not firmware updates would address any known
>>> issues.
>>>
>>>
>>>
>>> This with us being an accredited Intel Gold partner…
>>>
>>>
>>>
>>>
>>>
>>> We’ve returned the lot and ended up with 9/12 of the drives failing in
>>> the same manner. The replaced drives, which had different serial number
>>> ranges, also failed. Very frustrating is that the drives fail in a way that
>>> result in unbootable servers, unless one adds ‘rootdelay=240’ to the kernel.
>>>
>>>
>>>
>>>
>>>
>>> I would be interested to know what platform your drives were in and
>>> whether or not they were connected to a RAID module/card.
>>>
>>>
>>>
>>> PS: After much searching we’ve decided to order the NVMe conversion kit
>>> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
>>> rating.
>>>
>>>
>>>
>>>
>>>
>>> Regards
>>>
>>> David Herselman
>>>
>>>
>>>
>>> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com]
>>> *Sent:* Thursday, 11 January 2018 12:45 AM
>>> *To:* David Herselman <d...@syrex.co>
>>> *Cc:* Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
>>>
>>> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
>>> activate pgs?
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> I have a case where 3 out to 12 of these Intel S4600 2TB model failed
>>> within a matter of days after being burn-in tested then placed into
>>> production.
>>>
>>>
>>>
>>> I am interested to know, did you every get any further feedback from the
>>> vendor on your issue?
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Thu, Dec 21, 2017 at 1:38 PM, David Herselman <d...@syrex.co> wrote:
>>>
>>> Hi,
>>>
>>> I assume this can only be a physical manufacturing flaw or a firmware
>>> bug? Do Intel publish advisories on recalled equipment? Should others be
>>> concerned about using Intel DC S4600 SSD drives? Could this be an
>>> electrical issue on the Hot Swap Backplane or BMC firmware issue? Either
>>> way, all pure Intel...
>>>
>>> The hole is only 1.3 GB (4 MB x 339 objects) but perfectly striped
>>> through images, file systems are subsequently severely damaged.
>

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-01-12 Thread Sean Redmond
Hi David,

To follow up on this, I had a 4th drive fail (out of 12) and have opted to
order the disks below as a replacement. I have an ongoing case with Intel
via the supplier and will report back anything useful, but I am going to
avoid the Intel S4600 2TB SSDs for the moment.

1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND

Regards
Sean Redmond

On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond <sean.redmo...@gmail.com>
wrote:

> Hi David,
>
> Thanks for your email, they are connected inside Dell R730XD (2.5 inch 24
> disk model) in None RAID mode via a perc RAID card.
>
> The version of ceph is Jewel with kernel 4.13.X and ubuntu 16.04.
>
> Thanks for your feedback on the HGST disks.
>
> Thanks
>
> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:
>
>> Hi Sean,
>>
>>
>>
>> No, Intel’s feedback has been… Pathetic… I have yet to receive anything
>> more than a request to ‘sign’ a non-disclosure agreement, to obtain beta
>> firmware. No official answer as to whether or not one can logically unlock
>> the drives, no answer to my question whether or not Intel publish serial
>> numbers anywhere pertaining to recalled batches and no information
>> pertaining to whether or not firmware updates would address any known
>> issues.
>>
>>
>>
>> This with us being an accredited Intel Gold partner…
>>
>>
>>
>>
>>
>> We’ve returned the lot and ended up with 9/12 of the drives failing in
>> the same manner. The replaced drives, which had different serial number
>> ranges, also failed. Very frustrating is that the drives fail in a way that
>> result in unbootable servers, unless one adds ‘rootdelay=240’ to the kernel.
>>
>>
>>
>>
>>
>> I would be interested to know what platform your drives were in and
>> whether or not they were connected to a RAID module/card.
>>
>>
>>
>> PS: After much searching we’ve decided to order the NVMe conversion kit
>> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
>> rating.
>>
>>
>>
>>
>>
>> Regards
>>
>> David Herselman
>>
>>
>>
>> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com]
>> *Sent:* Thursday, 11 January 2018 12:45 AM
>> *To:* David Herselman <d...@syrex.co>
>> *Cc:* Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
>>
>> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
>> activate pgs?
>>
>>
>>
>> Hi,
>>
>>
>>
>> I have a case where 3 out to 12 of these Intel S4600 2TB model failed
>> within a matter of days after being burn-in tested then placed into
>> production.
>>
>>
>>
>> I am interested to know, did you every get any further feedback from the
>> vendor on your issue?
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Thu, Dec 21, 2017 at 1:38 PM, David Herselman <d...@syrex.co> wrote:
>>
>> Hi,
>>
>> I assume this can only be a physical manufacturing flaw or a firmware
>> bug? Do Intel publish advisories on recalled equipment? Should others be
>> concerned about using Intel DC S4600 SSD drives? Could this be an
>> electrical issue on the Hot Swap Backplane or BMC firmware issue? Either
>> way, all pure Intel...
>>
>> The hole is only 1.3 GB (4 MB x 339 objects) but perfectly striped
>> through images, file systems are subsequently severely damaged.
>>
>> Is it possible to get Ceph to read in partial data shards? It would
>> provide between 25-75% more yield...
>>
>>
>> Is there anything wrong with how we've proceeded thus far? Would be nice
>> to reference examples of using ceph-objectstore-tool but documentation is
>> virtually non-existent.
>>
>> We used another SSD drive to simulate bringing all the SSDs back online.
>> We carved up the drive to provide equal partitions to essentially simulate
>> the original SSDs:
>>   # Partition a drive to provide 12 x 150GB partitions, eg:
>> sdd   8:48   0   1.8T  0 disk
>> |-sdd18:49   0   140G  0 part
>> |-sdd28:50   0   140G  0 part
>> |-sdd38:51   0   140G  0 part
>> |-sdd48:52   0   140G  0 part
>> |-sdd58:53   0   140G  0 part
>> |-sdd68:54   0   140G  0 part
>> |-sdd78:55   0   140G  0 part
>> |-sdd88:56   0   140G  0 part
>> |-sdd98:57   0   140G  0 part
>> |-sdd10   8:58   0   140G  0 part
>> |-sdd11   

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-01-10 Thread Sean Redmond
Hi David,

Thanks for your email, they are connected inside a Dell R730XD (2.5 inch,
24 disk model) in non-RAID mode via a PERC RAID card.

The version of Ceph is Jewel, with kernel 4.13.x and Ubuntu 16.04.

Thanks for your feedback on the HGST disks.

Thanks

On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:

> Hi Sean,
>
>
>
> No, Intel’s feedback has been… Pathetic… I have yet to receive anything
> more than a request to ‘sign’ a non-disclosure agreement, to obtain beta
> firmware. No official answer as to whether or not one can logically unlock
> the drives, no answer to my question whether or not Intel publish serial
> numbers anywhere pertaining to recalled batches and no information
> pertaining to whether or not firmware updates would address any known
> issues.
>
>
>
> This with us being an accredited Intel Gold partner…
>
>
>
>
>
> We’ve returned the lot and ended up with 9/12 of the drives failing in the
> same manner. The replaced drives, which had different serial number ranges,
> also failed. Very frustrating is that the drives fail in a way that result
> in unbootable servers, unless one adds ‘rootdelay=240’ to the kernel.
>
>
>
>
>
> I would be interested to know what platform your drives were in and
> whether or not they were connected to a RAID module/card.
>
>
>
> PS: After much searching we’ve decided to order the NVMe conversion kit
> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
> rating.
>
>
>
>
>
> Regards
>
> David Herselman
>
>
>
> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com]
> *Sent:* Thursday, 11 January 2018 12:45 AM
> *To:* David Herselman <d...@syrex.co>
> *Cc:* Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
>
> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
> activate pgs?
>
>
>
> Hi,
>
>
>
> I have a case where 3 out to 12 of these Intel S4600 2TB model failed
> within a matter of days after being burn-in tested then placed into
> production.
>
>
>
> I am interested to know, did you every get any further feedback from the
> vendor on your issue?
>
>
>
> Thanks
>
>
>
> On Thu, Dec 21, 2017 at 1:38 PM, David Herselman <d...@syrex.co> wrote:
>
> Hi,
>
> I assume this can only be a physical manufacturing flaw or a firmware bug?
> Do Intel publish advisories on recalled equipment? Should others be
> concerned about using Intel DC S4600 SSD drives? Could this be an
> electrical issue on the Hot Swap Backplane or BMC firmware issue? Either
> way, all pure Intel...
>
> The hole is only 1.3 GB (4 MB x 339 objects) but perfectly striped through
> images, file systems are subsequently severely damaged.
>
> Is it possible to get Ceph to read in partial data shards? It would
> provide between 25-75% more yield...
>
>
> Is there anything wrong with how we've proceeded thus far? Would be nice
> to reference examples of using ceph-objectstore-tool but documentation is
> virtually non-existent.
>
> We used another SSD drive to simulate bringing all the SSDs back online.
> We carved up the drive to provide equal partitions to essentially simulate
> the original SSDs:
>   # Partition a drive to provide 12 x 150GB partitions, eg:
> sdd   8:48   0   1.8T  0 disk
> |-sdd18:49   0   140G  0 part
> |-sdd28:50   0   140G  0 part
> |-sdd38:51   0   140G  0 part
> |-sdd48:52   0   140G  0 part
> |-sdd58:53   0   140G  0 part
> |-sdd68:54   0   140G  0 part
> |-sdd78:55   0   140G  0 part
> |-sdd88:56   0   140G  0 part
> |-sdd98:57   0   140G  0 part
> |-sdd10   8:58   0   140G  0 part
> |-sdd11   8:59   0   140G  0 part
> +-sdd12   8:60   0   140G  0 part
>
>
>   Pre-requisites:
> ceph osd set noout;
> apt-get install uuid-runtime;
>
>
>   for ID in `seq 24 35`; do
> UUID=`uuidgen`;
> OSD_SECRET=`ceph-authtool --gen-print-key`;
> DEVICE='/dev/sdd'$[$ID-23]; # 24-23 = /dev/sdd1, 35-23 = /dev/sdd12
> echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | ceph osd new $UUID $ID -i
> - -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring;
> mkdir /var/lib/ceph/osd/ceph-$ID;
> mkfs.xfs $DEVICE;
> mount $DEVICE /var/lib/ceph/osd/ceph-$ID;
> ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring
> --name osd.$ID --add-key $OSD_SECRET;
> ceph-osd -i $ID --mkfs --osd-uuid $UUID;
> chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID;
> systemctl enable ceph-osd@$ID;
> systemctl start ceph-osd@$ID;

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-01-10 Thread David Herselman
Hi Sean,

No, Intel’s feedback has been… Pathetic… I have yet to receive anything more 
than a request to ‘sign’ a non-disclosure agreement, to obtain beta firmware. 
No official answer as to whether or not one can logically unlock the drives, no 
answer to my question whether or not Intel publish serial numbers anywhere 
pertaining to recalled batches and no information pertaining to whether or not 
firmware updates would address any known issues.

This with us being an accredited Intel Gold partner…


We’ve returned the lot and ended up with 9/12 of the drives failing in the same
manner. The replaced drives, which had different serial number ranges, also
failed. Very frustrating is that the drives fail in a way that results in
unbootable servers, unless one adds ‘rootdelay=240’ to the kernel command line.
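
For anyone hitting the same thing, adding the parameter on a Debian-based
host is roughly the following (adjust to taste):

sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/&rootdelay=240 /' /etc/default/grub
update-grub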


I would be interested to know what platform your drives were in and whether or 
not they were connected to a RAID module/card.

PS: After much searching we’ve decided to order the NVMe conversion kit and 
have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD rating.


Regards
David Herselman

From: Sean Redmond [mailto:sean.redmo...@gmail.com]
Sent: Thursday, 11 January 2018 12:45 AM
To: David Herselman <d...@syrex.co>
Cc: Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Many concurrent drive failures - How do I activate 
pgs?

Hi,

I have a case where 3 out to 12 of these Intel S4600 2TB model failed within a 
matter of days after being burn-in tested then placed into production.

I am interested to know, did you every get any further feedback from the vendor 
on your issue?

Thanks

On Thu, Dec 21, 2017 at 1:38 PM, David Herselman 
<d...@syrex.co<mailto:d...@syrex.co>> wrote:
Hi,

I assume this can only be a physical manufacturing flaw or a firmware bug? Do 
Intel publish advisories on recalled equipment? Should others be concerned 
about using Intel DC S4600 SSD drives? Could this be an electrical issue on the 
Hot Swap Backplane or BMC firmware issue? Either way, all pure Intel...

The hole is only 1.3 GB (4 MB x 339 objects) but perfectly striped through 
images, file systems are subsequently severely damaged.

Is it possible to get Ceph to read in partial data shards? It would provide 
between 25-75% more yield...


Is there anything wrong with how we've proceeded thus far? Would be nice to 
reference examples of using ceph-objectstore-tool but documentation is 
virtually non-existent.

We used another SSD drive to simulate bringing all the SSDs back online. We 
carved up the drive to provide equal partitions to essentially simulate the 
original SSDs:
  # Partition a drive to provide 12 x 150GB partitions, eg:
sdd   8:48   0   1.8T  0 disk
|-sdd18:49   0   140G  0 part
|-sdd28:50   0   140G  0 part
|-sdd38:51   0   140G  0 part
|-sdd48:52   0   140G  0 part
|-sdd58:53   0   140G  0 part
|-sdd68:54   0   140G  0 part
|-sdd78:55   0   140G  0 part
|-sdd88:56   0   140G  0 part
|-sdd98:57   0   140G  0 part
|-sdd10   8:58   0   140G  0 part
|-sdd11   8:59   0   140G  0 part
+-sdd12   8:60   0   140G  0 part


  Pre-requisites:
ceph osd set noout;
apt-get install uuid-runtime;


  for ID in `seq 24 35`; do
UUID=`uuidgen`;
OSD_SECRET=`ceph-authtool --gen-print-key`;
DEVICE='/dev/sdd'$[$ID-23]; # 24-23 = /dev/sdd1, 35-23 = /dev/sdd12
echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | ceph osd new $UUID $ID -i - -n 
client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring;
mkdir /var/lib/ceph/osd/ceph-$ID;
mkfs.xfs $DEVICE;
mount $DEVICE /var/lib/ceph/osd/ceph-$ID;
ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring --name 
osd.$ID --add-key $OSD_SECRET;
ceph-osd -i $ID --mkfs --osd-uuid $UUID;
chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID;
systemctl enable ceph-osd@$ID;
systemctl start ceph-osd@$ID;
  done


Once up we imported previous exports of empty head files in to 'real' OSDs:
  kvm5b:
systemctl stop ceph-osd@8;
ceph-objectstore-tool --op import --pgid 7.4s0 --data-path 
/var/lib/ceph/osd/ceph-8 --journal-path /var/lib/ceph/osd/ceph-8/journal --file 
/var/lib/vz/template/ssd_recovery/osd8_7.4s0.export;
chown ceph:ceph -R /var/lib/ceph/osd/ceph-8;
systemctl start ceph-osd@8;
  kvm5f:
systemctl stop ceph-osd@23;
ceph-objectstore-tool --op import --pgid 7.fs0 --data-path 
/var/lib/ceph/osd/ceph-23 --journal-path /var/lib/ceph/osd/ceph-23/journal 
--file /var/lib/vz/template/ssd_recovery/osd23_7.fs0.export;
chown ceph:ceph -R /var/lib/ceph/osd/ceph-23;
systemctl start ceph-osd@23;


Bulk import previously exported objects:
cd /var/lib/vz/template/ssd_recovery;
for FILE in `ls -1A osd*_*.export | grep -Pv '^osd(8|23)_'`; do
  OSD=`echo $FILE | perl -pe 's/^osd(\d+).*/\1/'`;
  PGID=`echo $FILE | 

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-01-10 Thread Sean Redmond
/osd/ceph-30/journal --file /var/lib/vz/template/ssd_
> recovery/osd30_7.fs4.export
> systemctl stop ceph-osd@31   ceph-objectstore-tool --op import --pgid
> 7.4s2 --data-path /var/lib/ceph/osd/ceph-31 --journal-path
> /var/lib/ceph/osd/ceph-31/journal --file /var/lib/vz/template/ssd_
> recovery/osd31_7.4s2.export
> systemctl stop ceph-osd@32   ceph-objectstore-tool --op import --pgid
> 7.4s4 --data-path /var/lib/ceph/osd/ceph-32 --journal-path
> /var/lib/ceph/osd/ceph-32/journal --file /var/lib/vz/template/ssd_
> recovery/osd32_7.4s4.export
> systemctl stop ceph-osd@32   ceph-objectstore-tool --op import --pgid
> 7.fs2 --data-path /var/lib/ceph/osd/ceph-32 --journal-path
> /var/lib/ceph/osd/ceph-32/journal --file /var/lib/vz/template/ssd_
> recovery/osd32_7.fs2.export
> systemctl stop ceph-osd@34   ceph-objectstore-tool --op import --pgid
> 7.4s5 --data-path /var/lib/ceph/osd/ceph-34 --journal-path
> /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_
> recovery/osd34_7.4s5.export
> systemctl stop ceph-osd@34   ceph-objectstore-tool --op import --pgid
> 7.fs1 --data-path /var/lib/ceph/osd/ceph-34 --journal-path
> /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_
> recovery/osd34_7.fs1.export
>
>
> Reset permissions and then started the OSDs:
> for OSD in 27 30 31 32 34; do
>   chown -R ceph:ceph /var/lib/ceph/osd/ceph-$OSD;
>   systemctl start ceph-osd@$OSD;
> done
>
>
> Then finally started all the OSDs... Now to hope that Intel have a way of
> accessing drives that are in a 'disable logical state'.
>
>
>
> The imports succeed, herewith a link to the output after running an import
> for placement group 7.4s2 on OSD 31:
>   https://drive.google.com/open?id=1-Jo1jmrWrGLO2OgflacGPlEf2p32Y4hn
>
> Sample snippet:
> Write 1#7:fffcd2ec:::rbd_data.4.be8e9974b0dc51.2869:head#
> snapset 0=[]:{}
> Write 1#7:fffd4823:::rbd_data.4.ba24ef2ae8944a.a2b0:head#
> snapset 0=[]:{}
> Write 1#7:fffd6fb6:::benchmark_data_kvm5b_20945_object14722:head#
> snapset 0=[]:{}
> Write 1#7:a069:::rbd_data.4.ba24ef2ae8944a.aea9:head#
> snapset 0=[]:{}
> Import successful
>
>
> Data does get written, I can tell by the size of the FileStore mount
> points:
>   [root@kvm5b ssd_recovery]# df -h | grep -P 'ceph-(27|30|31|32|34)$'
>   /dev/sdd4   140G  5.2G  135G   4% /var/lib/ceph/osd/ceph-27
>   /dev/sdd7   140G   14G  127G  10% /var/lib/ceph/osd/ceph-30
>   /dev/sdd8   140G   14G  127G  10% /var/lib/ceph/osd/ceph-31
>   /dev/sdd9   140G   22G  119G  16% /var/lib/ceph/osd/ceph-32
>   /dev/sdd11  140G   22G  119G  16% /var/lib/ceph/osd/ceph-34
>
>
> How do I tell Ceph to read these object shards?
>
>
>
> PS: It's probably a good idea to reweight the OSDs to 0 before starting
> again. This should prevent data flowing on to them, if they are not in a
> different device class or other crush selection ruleset. Ie:
>   for OSD in `seq 24 35`; do
> ceph osd crush reweight osd.$OSD 0;
>   done
>
>
> Regards
> David Herselman
>
> -Original Message-
> From: David Herselman
> Sent: Thursday, 21 December 2017 3:49 AM
> To: 'Christian Balzer' <ch...@gol.com>; ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] Many concurrent drive failures - How do I
> activate pgs?
>
> Hi Christian,
>
> Thanks for taking the time, I haven't been contacted by anyone yet but
> managed to get the down placement groups cleared by exporting 7.4s0 and
> 7.fs0 and then marking them as complete on the surviving OSDs:
> kvm5c:
>   ceph-objectstore-tool --op export --pgid 7.4s0 --data-path
> /var/lib/ceph/osd/ceph-8 --journal-path /var/lib/ceph/osd/ceph-8/journal
> --file /var/lib/vz/template/ssd_recovery/osd8_7.4s0.export;
>   ceph-objectstore-tool --op mark-complete --data-path
> /var/lib/ceph/osd/ceph-8 --journal-path /var/lib/ceph/osd/ceph-8/journal
> --pgid 7.4s0;
> kvm5f:
>   ceph-objectstore-tool --op export --pgid 7.fs0 --data-path
> /var/lib/ceph/osd/ceph-23 --journal-path /var/lib/ceph/osd/ceph-23/journal
> --file /var/lib/vz/template/ssd_recovery/osd23_7.fs0.export;
>   ceph-objectstore-tool --op mark-complete --data-path
> /var/lib/ceph/osd/ceph-23 --journal-path /var/lib/ceph/osd/ceph-23/journal
> --pgid 7.fs0;
>
> This would presumably simply punch holes in the RBD images but at least we
> can copy them out of that pool and hope that Intel can somehow unlock the
> drives for us to then export/import objects.
>
>
> To answer your questions though, we have 6 near identical Intel Wildcat
> Pass 1U servers and have Proxmox loaded on them. Proxmox uses a Debian

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-21 Thread Dénes Dolhay

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-21 Thread David Herselman
ceph-objectstore-tool --op import --pgid 7.4s5 --data-path /var/lib/ceph/osd/ceph-34 --journal-path /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_recovery/osd34_7.4s5.export
systemctl stop ceph-osd@34
ceph-objectstore-tool --op import --pgid 7.fs1 --data-path /var/lib/ceph/osd/ceph-34 --journal-path /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_recovery/osd34_7.fs1.export
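
For reference, the same import pattern can be written as a loop. This is only a
sketch covering the OSD/PG pairs mentioned earlier in the thread, assuming the
export files keep the osd<ID>_<PGID>.export naming above and that each OSD is
stopped before ceph-objectstore-tool touches its store:
  for PAIR in 30:7.fs4 31:7.4s2 32:7.4s4 32:7.fs2 34:7.4s5 34:7.fs1; do
    OSD=${PAIR%%:*}; PGID=${PAIR#*:};
    systemctl stop ceph-osd@$OSD;
    ceph-objectstore-tool --op import --pgid $PGID \
      --data-path /var/lib/ceph/osd/ceph-$OSD \
      --journal-path /var/lib/ceph/osd/ceph-$OSD/journal \
      --file /var/lib/vz/template/ssd_recovery/osd${OSD}_${PGID}.export;
  done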


Reset permissions and then started the OSDs:
for OSD in 27 30 31 32 34; do
  chown -R ceph:ceph /var/lib/ceph/osd/ceph-$OSD;
  systemctl start ceph-osd@$OSD;
done


Then finally started all the OSDs... Now to hope that Intel have a way of 
accessing drives that are in a 'disable logical state'.



The imports succeed; herewith a link to the output after running an import for 
placement group 7.4s2 on OSD 31: 
  https://drive.google.com/open?id=1-Jo1jmrWrGLO2OgflacGPlEf2p32Y4hn

Sample snippet:
Write 1#7:fffcd2ec:::rbd_data.4.be8e9974b0dc51.2869:head#
snapset 0=[]:{}
Write 1#7:fffd4823:::rbd_data.4.ba24ef2ae8944a.a2b0:head#
snapset 0=[]:{}
Write 1#7:fffd6fb6:::benchmark_data_kvm5b_20945_object14722:head#
snapset 0=[]:{}
Write 1#7:a069:::rbd_data.4.ba24ef2ae8944a.aea9:head#
snapset 0=[]:{}
Import successful


Data does get written; I can tell by the size of the FileStore mount points:
  [root@kvm5b ssd_recovery]# df -h | grep -P 'ceph-(27|30|31|32|34)$'
  /dev/sdd4   140G  5.2G  135G   4% /var/lib/ceph/osd/ceph-27
  /dev/sdd7   140G   14G  127G  10% /var/lib/ceph/osd/ceph-30
  /dev/sdd8   140G   14G  127G  10% /var/lib/ceph/osd/ceph-31
  /dev/sdd9   140G   22G  119G  16% /var/lib/ceph/osd/ceph-32
  /dev/sdd11  140G   22G  119G  16% /var/lib/ceph/osd/ceph-34


How do I tell Ceph to read these object shards?
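
One way to sanity-check that the imported shards are actually present (a sketch
only; --op list has to run while the OSD is stopped):
  systemctl stop ceph-osd@31;
  ceph-objectstore-tool --op list --pgid 7.4s2 \
    --data-path /var/lib/ceph/osd/ceph-31 \
    --journal-path /var/lib/ceph/osd/ceph-31/journal;
  systemctl start ceph-osd@31;
  # then, with the OSDs running, ask the cluster what it thinks of the PG:
  ceph pg 7.4 query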



PS: It's probably a good idea to reweight the OSDs to 0 before starting again. 
This should prevent data flowing onto them if they are not in a different 
device class or other CRUSH selection ruleset, i.e.:
  for OSD in `seq 24 35`; do
ceph osd crush reweight osd.$OSD 0;
  done
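
To confirm the reweight took effect, the CRUSH weights can be checked
afterwards, e.g.:
  ceph osd tree
  ceph osd df tree
Both should show a CRUSH weight of 0 for osd.24 through osd.35.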


Regards
David Herselman

-Original Message-
From: David Herselman 
Sent: Thursday, 21 December 2017 3:49 AM
To: 'Christian Balzer' <ch...@gol.com>; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Many concurrent drive failures - How do I activate 
pgs?

Hi Christian,

Thanks for taking the time. I haven't been contacted by anyone yet, but I managed 
to get the down placement groups cleared by exporting 7.4s0 and 7.fs0 and then 
marking them as complete on the surviving OSDs:
kvm5c:
  ceph-objectstore-tool --op export --pgid 7.4s0 --data-path 
/var/lib/ceph/osd/ceph-8 --journal-path /var/lib/ceph/osd/ceph-8/journal --file 
/var/lib/vz/template/ssd_recovery/osd8_7.4s0.export;
  ceph-objectstore-tool --op mark-complete --data-path 
/var/lib/ceph/osd/ceph-8 --journal-path /var/lib/ceph/osd/ceph-8/journal --pgid 
7.4s0;
kvm5f:
  ceph-objectstore-tool --op export --pgid 7.fs0 --data-path 
/var/lib/ceph/osd/ceph-23 --journal-path /var/lib/ceph/osd/ceph-23/journal 
--file /var/lib/vz/template/ssd_recovery/osd23_7.fs0.export;
  ceph-objectstore-tool --op mark-complete --data-path 
/var/lib/ceph/osd/ceph-23 --journal-path /var/lib/ceph/osd/ceph-23/journal 
--pgid 7.fs0;
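
A quick check that the two PGs are no longer reported as down after the
mark-complete, sketched with the standard CLI:
  ceph health detail | grep 'pg 7\.'
  ceph pg 7.4 query | grep -m1 '"state"'
  ceph pg 7.f query | grep -m1 '"state"'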

This would presumably simply punch holes in the RBD images but at least we can 
copy them out of that pool and hope that Intel can somehow unlock the drives 
for us to then export/import objects.
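
For the copy itself, something along these lines should do; the target pool and
image names below are only placeholders, and the copied images may well contain
holes where objects were lost:
  rbd -p rbd_ssd ls
  rbd cp rbd_ssd/vm-100-disk-1 rbd_hdd/vm-100-disk-1
  rbd export rbd_ssd/vm-100-disk-1 /var/lib/vz/template/ssd_recovery/vm-100-disk-1.raw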


To answer your questions though, we have six near-identical Intel Wildcat Pass 1U 
servers with Proxmox loaded on them. Proxmox uses a Debian 9 base with the 
Ubuntu kernel, to which they apply cherry-picked kernel patches (e.g. Intel NIC 
driver updates, vhost perf regression and mem-leak fixes):

kvm5a:
  Intel R1208WTTGSR System (serial: BQWS55091014)
  Intel S2600WTTR Motherboard (serial: BQWL54950385, BIOS ID: SE5C610.86B.01.01.0021.032120170601)
  2 x Intel Xeon E5-2640v4 2.4GHz (HT disabled)
  24 x Micron 8GB DDR4 2133MHz (24 x 18ASF1G72PZ-2G1B1)
  Intel AXX10GBNIA I/O Module
kvm5b:
  Intel R1208WTTGS System (serial: BQWS53890178)
  Intel S2600WTT Motherboard (serial: BQWL52550359, BIOS ID: SE5C610.86B.01.01.0021.032120170601)
  2 x Intel Xeon E5-2640v4 2.4GHz (HT enabled)
  4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
  Intel AXX10GBNIA I/O Module
kvm5c:
  Intel R1208WT2GS System (serial: BQWS50490279)
  Intel S2600WT2 Motherboard (serial: BQWL44650203, BIOS ID: SE5C610.86B.01.01.0021.032120170601)
  2 x Intel Xeon E5-2640v3 2.6GHz (HT enabled)
  4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
  Intel AXX10GBNIA I/O Module
kvm5d:
  Intel R1208WTTGSR System (serial: BQWS62291318)
  Intel S2600WTTR Motherboard (serial: BQWL61855187, BIOS ID: SE5C610.86B.01.01.0021.032120170601)
  2 x In

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-21 Thread Denes Dolhay


Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-20 Thread David Herselman
SanitizeCryptoScrambleSupported : True
SanitizeSupported : True
SataGen1 : True
SataGen2 : True
SataGen3 : True
SataNegotiatedSpeed : Unknown
SectorSize : 512
SecurityEnabled : False
SecurityFrozen : False
SecurityLocked : False
SecuritySupported : False
SerialNumber : PHYM7276031E1P9DGN
TCGSupported : False
TargetID : 0
TempThreshold : Selected drive is in a disable logical state.
TemperatureLoggingInterval : Selected drive is in a disable logical state.
TimeLimitedErrorRecovery : Selected drive is in a disable logical state.
TrimSize : 4
TrimSupported : True
VolatileWriteCacheEnabled : Selected drive is in a disable logical state.
WWID : 3959312879584368077
WriteAtomicityDisableNormal : Selected drive is in a disable logical state.
WriteCacheEnabled : True
WriteCacheReorderingStateEnabled : Selected drive is in a disable logical state.
WriteCacheState : Selected drive is in a disable logical state.
WriteCacheSupported : True
WriteErrorRecoveryTimer : Selected drive is in a disable logical state.



SMART information is inaccessible and the overall status is failed. Herewith the 
stats from a partner disk, which was still working when the others failed:
Device Model: INTEL SSDSC2KG019T7
Serial Number:PHYM727602TM1P9DGN
LU WWN Device Id: 5 5cd2e4 14e1636bb
Firmware Version: SCV10100
User Capacity:1,920,383,410,176 bytes [1.92 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate:Solid State Device
Form Factor:  2.5 inches
Device is:Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:Mon Dec 18 19:33:51 2017 SAST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       98
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
175 Program_Fail_Count_Chip 0x0033   100   100   010    Pre-fail  Always       -       17567121432
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   077   076   000    Old_age   Always       -       23 (Min/Max 17/29)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       23
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
225 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       14195
226 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       0
227 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       42
228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       5905
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
234 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       14195
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       10422
243 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       41906


Media wear out : 0% used
LBAs written: 14195
Power on hours: <100
Power cycle count: once at the factory, once at our offices to check if there 
was newer firmware (there wasn't) and once when we restarted the node to see if 
it could then access a failed drive.


Regards
David Herselman



Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-20 Thread Christian Balzer

Hello,

first off, I don't have anything to add to your conclusions of the current
status, alas there are at least 2 folks here on the ML making a living
from Ceph disaster recovery, so I hope you have been contacted already.

Now once your data is safe or you have a moment, I and others here would
probably be quite interested in some more details, see inline below.

On Wed, 20 Dec 2017 22:25:23 + David Herselman wrote:

[snip]
> 
> We've happily been running a 6 node cluster with 4 x FileStore HDDs per node 
> (journals on SSD partitions) for over a year and recently upgraded all nodes 
> to Debian 9, Ceph Luminous 12.2.2 and kernel 4.13.8. We ordered 12 x Intel DC 
> S4600 SSDs which arrived last week so we added two per node on Thursday 
> evening and brought them up as BlueStore OSDs. We had proactively updated our 
> existing pools to reference only devices classed as 'hdd', so that we could 
> move select images over to ssd replicated and erasure coded pools.
> 
Could you tell us more about that cluster, as in HW, how the SSDs are
connected, and the FW version of the controller if applicable?

Kernel 4.13.8 suggests that this is a handrolled, upstream kernel.
While not necessarily related I'll note that as far as Debian kernels
(which are very lightly if at all patched) are concerned, nothing beyond
4.9 has been working to my satisfaction. 
4.11 still worked, but 4.12 crash-reboot-looped on all my Supermicro X10
machines (quite a varied selection). 
The current 4.13.13 backport boots on some of those machines, but still
throws errors with the EDAC devices, which works fine with 4.9.

4.14 is known to happily destroy data if used with bcache and even if one
doesn't use that it should give you pause.

> We were pretty diligent and downloaded Intel's Firmware Update Tool and 
> validated that each new drive had the latest available firmware before 
> installing them in the nodes. We did numerous benchmarks on Friday and 
> eventually moved some images over to the new storage pools. Everything was 
> working perfectly and extensive tests on Sunday showed excellent performance. 
> Sunday night one of the new SSDs died and Ceph replicated and redistributed 
> data accordingly, then another failed in the early hours of Monday morning 
> and Ceph did what it needed to.
> 
> We had the two failed drives replaced by 11am and Ceph was up to 2/4918587 
> objects degraded (0.000%) when a third drive failed. At this point we updated 
> the crush maps for the rbd_ssd and ec_ssd pools and set the device class to 
> 'hdd', to essentially evacuate everything off the SSDs. Other SSDs then 
> failed at 3:22pm, 4:19pm, 5:49pm and 5:50pm. We've ultimately lost half the 
> Intel S4600 drives, which are all completely inaccessible. Our status at 
> 11:42pm Monday night was: 1/1398478 objects unfound (0.000%) and 339/4633062 
> objects degraded (0.007%).
> 
The relevant logs showing when and how those SSDs failed would be interesting. 
Was the distribution of the failed SSDs random among the cluster?
Are you running smartd and did it have something to say?
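
Something along these lines would pull that evidence together (unit names, log
paths and device letters will of course vary):
  journalctl -u smartd --since "2017-12-17"
  smartctl -x /dev/sdX
  grep -iE 'error|fail|abort' /var/log/ceph/ceph-osd.*.log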

Completely inaccessible sounds a lot like the infamous "self-bricking" of
Intel SSDs when they discover something isn't right, or they don't like
the color scheme of the server inside (^.^). 

I'm using quite a lot of Intel SSDs and had only one "fatal" incident.
A DC S3700 detected that its powercap had failed, but of course kept
working fine. Until a reboot was needed, when it promptly bricked itself,
data inaccessible, SMART reporting barely that something was there.

So one wonders what caused your SSDs to get their knickers in such a twist.
Are the survivors showing any unusual signs in their SMART output?

Of course what your vendor/Intel will have to say will also be of
interest. ^o^

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com