Yes, indeed. It seems not to matter much if you do not have a write-intensive 
cluster.

We have Intel 520s which were in production for over 2 years and only used 5% 
of their life according to SMART. I've also used Samsung 840 Pros, which showed 
similar figures over a year of usage. So I guess for my purposes, endurance is 
not such a big deal. However, the SSDs that I have absolutely suck 
performance-wise as Ceph journals, especially the Samsung drives. That's the 
main reason for wanting the 3700/3500 or their equivalent.

Andrei

----- Original Message -----
> From: "Tyler Bishop" <[email protected]>
> To: "Lionel Bouton" <[email protected]>
> Cc: "Andrei Mikhailovsky" <[email protected]>, "ceph-users" 
> <[email protected]>
> Sent: Tuesday, 22 December, 2015 16:36:21
> Subject: Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio 
> results

> Write endurance is kinda bullshit.
> 
> We have Crucial 960GB drives storing data and we've only managed to take 2% 
> off the drives' life in a year, with hundreds of TB written weekly.
> 
> 
> Stuff is way more durable than anyone gives it credit for.
> 
> 
> ----- Original Message -----
> From: "Lionel Bouton" <[email protected]>
> To: "Andrei Mikhailovsky" <[email protected]>, "ceph-users"
> <[email protected]>
> Sent: Tuesday, December 22, 2015 11:04:26 AM
> Subject: Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio 
> results
> 
> On 22/12/2015 13:43, Andrei Mikhailovsky wrote:
>> Hello guys,
>>
>> Was wondering if anyone has done testing on the Samsung PM863 120GB version 
>> to see how it performs? IMHO the 480GB version seems like a waste for the 
>> journal, as you only need a small disk to fit 3-4 OSD journals. Unless you 
>> get far greater durability.
> 
> The problem is endurance. If we use the 480GB model for 3 OSDs each on the
> cluster we might build, we expect 3 years (with some margin for error, but
> not including any write amplification at the SSD level) before the SSDs
> fail.
> In our context a 120GB model might not even last a year (its endurance is
> 1/4 that of the 480GB model). This is why SM863 models will probably be
> more suitable if you have access to them: you can use smaller ones which
> cost less and get more endurance (you'll have to check the performance,
> though; usually smaller models have lower IOPS and bandwidth).
> 
>> I am planning to replace my current journal SSDs over the next month or so 
>> and would like to find out if there is a good alternative to the Intel 
>> 3700/3500 series.
> 
> The 3700s are a safe bet (the 100GB model is rated for ~1.8 PBW). 3500
> models probably don't have enough endurance for many Ceph clusters to be
> cost effective: the 120GB model is only rated for 70 TBW, and you have to
> consider both client writes and rebalance events.
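
That endurance arithmetic can be sketched roughly like this (the daily write
rate and write-amplification factor below are illustrative assumptions, not
figures from this thread; only the 70 TBW and ~1.8 PBW ratings come from it):

```python
# Rough journal-SSD lifespan estimate from a rated TBW endurance figure.
# Ratings mentioned above: Intel S3500 120GB ~= 70 TBW,
# Intel S3700 100GB ~= 1.8 PBW (i.e. 1800 TBW).

def lifespan_years(rated_tbw, daily_client_tb, write_amplification):
    """Years until the rated endurance is consumed at a steady write rate."""
    daily_tb_to_ssd = daily_client_tb * write_amplification
    return rated_tbw / daily_tb_to_ssd / 365

# Assume (hypothetically) 0.1 TB/day of client writes hitting the journal
# and a 2x overall write amplification (journaling + SSD-internal).
print(round(lifespan_years(70, 0.1, 2), 1))    # S3500 120GB: about a year
print(round(lifespan_years(1800, 0.1, 2), 1))  # S3700 100GB: decades
```

The point of the exercise is the ratio, not the absolute numbers: with the
same workload the 70 TBW drive wears out ~25x sooner than the 1.8 PBW one,
and rebalance events only make the gap more dangerous.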
> I'm uneasy with SSDs expected to fail within the life of the system they
> are in: you can have a cascade effect where an SSD failure brings down
> several OSDs triggering a rebalance which might make SSDs installed at
> the same time fail too. In this case in the best scenario you will reach
> your min_size (>=2) and block any writes which would prevent more SSD
> failures until you move journals to fresh SSDs. If min_size = 1 you
> might actually lose data.
> 
> If you expect to replace your current journal SSDs, if I were you I would
> stagger the deployment over several months to a year to avoid them all
> failing at the same time in case of an unforeseen problem. In addition,
> this would allow you to evaluate the performance and behavior of a new SSD
> model with your hardware (there have been reports of performance problems
> with some combinations of RAID controllers and SSD models/firmware
> versions) without impacting your cluster's overall performance too much.
> 
> When using SSDs for journals you have to monitor both:
> * the SSD wear leveling or something equivalent for each SSD (SMART data
> may not be available if you use a RAID controller, but usually you can
> still get the total amount of data written),
> * the client writes on the whole cluster.
> Then check periodically how much expected lifespan is left for each of
> your SSDs based on their current state, average write speed, estimated
> write amplification (both due to the pool's size parameter and the SSD
> model's inherent write amplification) and the amount of data moved by the
> rebalance events you expect to happen.
> Ideally you should make this computation before choosing the SSD models,
> but several variables are not always easy to predict and will probably
> change during the life of your cluster.
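
A minimal sketch of that periodic check, assuming you can read a wear
percentage from SMART (attribute names vary by vendor, and the 5%-in-2-years
figure is just the example from earlier in this thread):

```python
# Project remaining SSD life from observed wear so far.
# In practice the wear figure comes from SMART (e.g. a media-wearout or
# percent-used attribute), which may be hidden behind a RAID controller.

def remaining_years(percent_worn, days_in_service):
    """Linear projection: years left at the observed average wear rate."""
    if percent_worn <= 0:
        raise ValueError("need a non-zero wear figure to extrapolate")
    wear_per_day = percent_worn / days_in_service
    days_left = (100 - percent_worn) / wear_per_day
    return days_left / 365

# e.g. a drive ~5% worn after ~2 years (730 days) in production:
print(round(remaining_years(5, 730), 1))  # -> 38.0 years at that rate
```

Note that this linear extrapolation is exactly what a big rebalance event
breaks: a burst of rewrites can consume wear far faster than the historical
average suggests, which is why the rebalance traffic has to be estimated
separately.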
> 
> Lionel
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com