On 19/7/25 04:08, Zdenek Kabelac wrote:
> Especially in virtual world you often get already some 'provisioned' space
> which nowhere near to anything physical. Try to stripe such drives makes
> no practical sense.

I've been following this discussion with interest, and I think the quote 
above sums up the entire discussion.

A few more things I want to add.

1. Adding more controllers, even in a physical machine, each with its own 
disks, and then striping across them is by far the worst thing you can do. 
Controllers, especially the high-end ones, have a huge amount of logic and 
knowledge of the underlying topology, so they can and will optimise read 
and write operations to align with head positions and other characteristics 
in the case of HDDs; on SSDs they can align to the drive firmware's settings 
and instructions. In 99% of all cases this information is not available to 
the OS, because it is masked by the controller. If you add more controllers, 
they all operate independently, have no insight into each other's IO 
patterns, and can therefore only optimise the IOs they each see. Since the 
stripe then creates dependencies across different controllers and different 
disks, all coordination has to come from the OS side.

2. Using guest IO controls (such as direct IO parameters). These kinds of 
parameters will almost always be ignored by the underlying hypervisor, as 
it has to determine the IO parameters itself, in conjunction with the IO 
activity of the other guests on that hypervisor. See the sketch below for 
what this looks like from the guest side.
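To make that concrete, here is a minimal, Linux-only sketch of what "direct 
IO from the guest" actually controls. The device path and block size are 
placeholders for illustration; the point is that O_DIRECT only bypasses the 
guest's own page cache, while the hypervisor may still cache, merge or 
reorder the request on the host side.

    # Minimal sketch: a direct read from inside a guest (Linux only).
    # /dev/vdb and the 4 KiB block size are hypothetical placeholders.
    import os
    import mmap

    DEV = "/dev/vdb"     # hypothetical virtual disk inside the guest
    BLOCK = 4096         # assumed logical block size

    # O_DIRECT requires block-aligned buffers; an anonymous mmap is
    # page-aligned, which satisfies the usual 512 B / 4 KiB requirement.
    buf = mmap.mmap(-1, BLOCK)

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    try:
        os.readv(fd, [buf])   # bypasses the guest cache, not the host's
    finally:
        os.close(fd)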

3. Unrealistic cache optimisations. dd is by far the worst option for 
performance tests, as it will never (OK, almost never) align with real 
workloads. If you tune a system based on dd results, that will backfire in 
most cases once a normal workload is applied. The main reason is that dd 
always generates a sequential workload, unless you start a large number of 
dd instances against the same disk at once with different offsets, and even 
then the numbers you get back are of dubious value. A rough sketch of a 
more workload-like access pattern follows below.
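As a rough illustration of the difference: instead of one sequential stream, 
a closer approximation of a real workload issues concurrent reads at random 
offsets. The file path, IO size and worker count below are placeholders, 
not a recommendation; this is only to show the shape of the access pattern, 
not a replacement for a proper benchmark tool.

    # Sketch: concurrent random-offset reads instead of one sequential pass.
    import os
    import random
    from concurrent.futures import ThreadPoolExecutor

    PATH = "/path/to/testfile"   # hypothetical pre-created test file
    IO_SIZE = 8192               # 8 KiB per read
    READS_PER_WORKER = 1000
    WORKERS = 8

    def random_reads(_):
        fd = os.open(PATH, os.O_RDONLY)
        size = os.fstat(fd).st_size   # file must be larger than IO_SIZE
        try:
            for _ in range(READS_PER_WORKER):
                # pick a random IO_SIZE-aligned offset inside the file
                offset = random.randrange(0, size - IO_SIZE, IO_SIZE)
                os.pread(fd, IO_SIZE, offset)
        finally:
            os.close(fd)

    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        list(pool.map(random_reads, range(WORKERS)))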

4. Use a tool that can record and replay IO workloads, such as SWAT or 
VDbench.
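For anyone unfamiliar with the record-and-replay idea, the toy loop below 
shows the principle only: read a captured trace of (delay, operation, 
offset, length) records and reissue them against a target. The trace format 
and paths are invented for the example; real tools such as SWAT and VDbench 
do far more (timing fidelity, queue depths, verification).

    # Toy replayer: illustrates the record-and-replay principle only.
    import csv
    import os
    import time

    TARGET = "/path/to/target"   # hypothetical replay target
    TRACE = "trace.csv"          # hypothetical capture: delay,op,offset,length

    fd = os.open(TARGET, os.O_RDWR)
    try:
        with open(TRACE, newline="") as f:
            for delay, op, offset, length in csv.reader(f):
                time.sleep(float(delay))          # preserve inter-IO gaps
                if op == "R":
                    os.pread(fd, int(length), int(offset))
                else:
                    os.pwrite(fd, b"\x00" * int(length), int(offset))
    finally:
        os.close(fd)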

There is a massive number of dependencies when it comes to storage 
performance profiling and optimisation. You may get really disappointing 
outcomes when systems are configured with settings that come out of an 
unrealistic profiling exercise and a real-world workload is then deployed 
on them.


cheers

Erwin
