> On Aug 5, 2017, at 21:03, Ivan Kudryavtsev <kudryavtsev...@bw-sw.com> wrote:
> Hi, I think Eric's comments are too tough. E.g. I have 11xSSD 1TB with
> linux soft raid 5 and Ext4 and it works like a charm without special
> tuning.
> Qcow2 also not so bad. LVM2 does it better of course (if not being
> snapshotted). Our users have different workloads and nobody claims disk
> performance is a problem. Read/write 100 MB/sec over 10G connection is not
> a problem at all for the setup specified above.

100 MB/sec is the speed of a single vintage 2010 5200 RPM SATA-2 drive. For 
many people, that is not a problem. For some, it is. For example, I have a 
12x-SSD RAID10 for a database. This RAID10 is on a SAS2 bus with 4 channels 
thus capable of 2.4 gigaBYTES per second raw throughput. Yes, I have validated 
that the SAS2 bus is the limit on throughput for my SSD array. If I provided a 
qcow2 volume to the database instance that only managed 100MB/sec, my database 
people would howl.
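
For anyone who wants to check the 2.4 figure, here is a rough sketch of the
arithmetic. The 6 Gbit/s per-lane line rate and the 8b/10b encoding overhead
are my assumptions about SAS2, not something measured on the array, and the
function name is made up for the example.

```python
# Back-of-the-envelope check of the SAS2 throughput figure.
# Assumptions (not measured): 6 Gbit/s line rate per SAS2 lane and 8b/10b
# encoding, i.e. 8 payload bits for every 10 bits on the wire.

LANES = 4
LINE_RATE_GBIT = 6.0
ENCODING_EFFICIENCY = 8 / 10


def sas_payload_gbytes_per_sec(lanes, line_rate_gbit, efficiency):
    """Return usable payload bandwidth in gigabytes per second."""
    payload_gbit = lanes * line_rate_gbit * efficiency
    return payload_gbit / 8  # 8 bits per byte


wide_port = sas_payload_gbytes_per_sec(LANES, LINE_RATE_GBIT, ENCODING_EFFICIENCY)
ratio_to_100mb = wide_port * 1000 / 100  # versus a 100 MB/sec volume

print(f"4-lane SAS2 payload bandwidth: {wide_port:.1f} GB/s")
print(f"That is about {ratio_to_100mb:.0f}x a 100 MB/sec volume")
```

Under those assumptions the bus has more than an order of magnitude of headroom
over a 100 MB/sec volume, which is the gap the database people would notice.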

I have many virtual machines that run quite happily with thin qcow2 volumes on
12-disk RAID6 XFS datastores (spinning storage), because they don't care about
disk throughput: they are there to process data, provide services like DNS or
a Wiki knowledge base, or otherwise do things that aren't particularly
time-critical in our environment. So it's all about your customer and his
needs. If maximum throughput is the goal, though, qcow2 on an ext4 soft RAID
that manages 100 MB/sec is very... 2010 spinning storage... and people who
need more than that, like database people, will be extremely dissatisfied.

Thus my suggestions of ways to improve performance by providing a custom disk
offering for those cases where disk performance, and specifically write
performance, is a problem.

First, switch to 'sparse' rather than 'thin' as the provisioning mechanism.
That greatly speeds writes, since only the filesystem's block allocation
mechanisms get invoked rather than qcow2's, and qcow2 then has only a single
allocation zone, which greatly speeds its own lookups.

Second, use a different underlying filesystem that has proven to have more
consistent performance. xfs isn't much faster than ext4 under most scenarios,
but it doesn't have the lengthy dropouts in performance that come with lots of
writes on ext4.

Third, possibly flip on async caching in the disk offering if data integrity
isn't a problem. For an Elasticsearch instance, for example, the data is all
replicated across multiple nodes on multiple datastores anyhow, so if I lose an
Elasticsearch node's data, so what? I just destroy that instance and create a
new one to join the cluster!

And of course there's always the option of simply avoiding qcow2 altogether and
providing the data via iSCSI or NFS directly to the instance, which may be what
you need to do for something like a database that has some very specific
performance and throughput requirements.
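
To illustrate what I mean by leaving block allocation to the filesystem, here
is a minimal sketch of a sparse file on a POSIX system. It involves neither
qcow2 nor CloudStack; it only shows that a file can advertise its full size up
front while the filesystem allocates blocks as data lands. The file name, the
sizes, and the helper names are invented for the example, and the exact block
counts depend on the filesystem.

```python
import os
import tempfile


def make_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks is in 512-byte units)."""
    return os.stat(path).st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "volume.img")  # hypothetical volume file
    make_sparse_file(path, 1 << 30)         # 1 GiB apparent size

    apparent = os.path.getsize(path)
    before = allocated_bytes(path)

    # Write 4 KiB in the middle of the file; only then does the filesystem
    # have to allocate blocks for that region.
    with open(path, "r+b") as f:
        f.seek(512 << 20)
        f.write(b"x" * 4096)
        f.flush()
        os.fsync(f.fileno())

    after = allocated_bytes(path)

print(f"apparent size:          {apparent} bytes")
print(f"allocated before write: {before} bytes")
print(f"allocated after write:  {after} bytes")
```

On a filesystem that supports sparse files, the allocated size stays far below
the apparent size and grows only as data is written.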
