SSD vendors should use our tests for QA'ing their new SSDs!

Mike McCandless

http://blog.mikemccandless.com


On Sat, Aug 31, 2019 at 7:50 AM Uwe Schindler <[email protected]> wrote:

> Hi,
>
>
>
> the service to replace those SSDs is included in the rental fee 😊
>
>
>
> I am not sure why it writes so much, but I think Solr is hammering our
> SSDs harder. Lucene’s tests do not do much I/O. Nevertheless, the SSD
> survived more than 2 years. The server was installed on 2017-05-19. After
> some runtime I calculated the approximate lifetime, and my estimate was
> not bad: I said 2 years 😊
>
>
>
> FYI, at the moment they are replacing disk #2 (I rebuilt the RAID array
> beforehand).
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: [email protected]
>
>
>
> *From:* Michael McCandless <[email protected]>
> *Sent:* Saturday, August 31, 2019 1:32 PM
> *To:* Lucene/Solr dev <[email protected]>
> *Subject:* Re: NVMe - SSD shredding due to Lucene :-)
>
>
>
> Nice to know :) Thanks for upgrading, Uwe.
>
>
>
> I thought we randomly disable fsync in tests just to protect our precious
> SSDs?
>
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
>
>
>
> On Sat, Aug 31, 2019 at 6:20 AM Uwe Schindler <[email protected]> wrote:
>
> Hi all,
>
> I just wanted to inform you that I asked the provider of the Policeman
> Jenkins Server to replace the first of two NVMe SSDs, because it failed
> with fatal warnings due to too many writes and no more spare sectors:
>
> > root@serv1 ~ # nvme smart-log /dev/nvme0
> > Smart Log for NVME device:nvme0 namespace-id:ffffffff
> > critical_warning                    : 0x1
> > temperature                         : 76 C
> > available_spare                     : 2%
> > available_spare_threshold           : 10%
> > percentage_used                     : 67%
> > data_units_read                     : 62,129,054
> > data_units_written                  : 648,788,135
> > host_read_commands                  : 6,426,997,226
> > host_write_commands                 : 5,582,107,803
> > controller_busy_time                : 86,754
> > power_cycles                        : 21
> > power_on_hours                      : 20,252
> > unsafe_shutdowns                    : 16
> > media_errors                        : 0
> > num_err_log_entries                 : 0
> > Warning Temperature Time            : 7855
> > Critical Composite Temperature Time : 0
> > Temperature Sensor 1                : 76 C
> > Thermal Management T1 Trans Count   : 0
> > Thermal Management T2 Trans Count   : 0
> > Thermal Management T1 Total Time    : 0
> > Thermal Management T2 Total Time    : 0
>
> The second one looks a bit better, but will be changed later, too. I have
> no idea what a data unit is (512 bytes, 2048 bytes,... - I think one LBA).
>
> So we are really shredding SSDs with Lucene tests 😊
>
> Uwe
>
> P.S.: The replacement is currently in progress...
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: [email protected]
>
>
>
>
>
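
On the open question in the quoted message about what a "data unit" is: as far as I know, the NVMe specification reports data_units_read and data_units_written in thousands of 512-byte blocks, so one unit is 512,000 bytes regardless of the LBA size. The sketch below is only an illustration under that assumption. It parses a few lines copied from the smart log above and converts them to bytes. The helper names (parse_smart_log, to_int, data_units_to_bytes, spare_exhausted) are made up for this example and are not part of nvme-cli or Lucene.

```python
import re

# Assumption: one NVMe SMART "data unit" is 1000 blocks of 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000

# A few lines copied from the smart log quoted in this thread.
SAMPLE_LOG = """\
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning                    : 0x1
available_spare                     : 2%
available_spare_threshold           : 10%
data_units_read                     : 62,129,054
data_units_written                  : 648,788,135
power_on_hours                      : 20,252
"""


def parse_smart_log(text):
    """Return a dict of 'key : value' fields; lines without ' : ' are skipped."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s+:\s+(.+?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def to_int(value):
    """Read the leading number of a field such as '648,788,135', '2%' or '76 C'."""
    digits = re.match(r"[\d,]+", value)
    return int(digits.group(0).replace(",", ""))


def data_units_to_bytes(units):
    """Convert SMART data units to bytes (see the assumption above)."""
    return units * BYTES_PER_DATA_UNIT


def spare_exhausted(fields):
    """True if the spare capacity is below the threshold reported by the drive."""
    spare = to_int(fields["available_spare"])
    threshold = to_int(fields["available_spare_threshold"])
    return spare < threshold


if __name__ == "__main__":
    fields = parse_smart_log(SAMPLE_LOG)
    written = data_units_to_bytes(to_int(fields["data_units_written"]))
    hours = to_int(fields["power_on_hours"])
    print(f"{written / 1e12:.1f} TB written")
    print(f"{written / hours / 1e9:.1f} GB written per power-on hour")
    print(f"spare exhausted: {spare_exhausted(fields)}")
```

Under that assumption, the 648,788,135 data units written correspond to roughly 332 TB (decimal) over 20,252 power-on hours, and the 2% available spare is well below the drive's 10% threshold, which matches the critical warning in the log.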
