I've been thinking about the point below, about using dd as a performance test. Although I wholeheartedly agree that real-life workload tests must be conducted (ideally by recording a production workload and playing it back in a test environment), I still think that dd and fio provide valuable insights into disk performance.
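
To make the comparison concrete, here is a rough sketch (Python, purely illustrative) of how roughly equivalent dd and fio command lines for a sequential direct write could be put together. The target path, block size, and block count are placeholders, not the values from my tests, and the function names are made up for this example. The script only builds and prints the commands; it does not run them.

```python
import shlex

MIB = 1024 * 1024


def dd_direct_write_cmd(target, block_bytes=MIB, count=1024):
    """Build a GNU dd command for a sequential write that bypasses the page cache.

    WARNING: if `target` is a block device, running the command overwrites it.
    """
    return [
        "dd",
        "if=/dev/zero",
        f"of={target}",
        f"bs={block_bytes}",   # block size in bytes
        f"count={count}",      # number of blocks to write
        "oflag=direct",        # open the output with O_DIRECT
    ]


def fio_direct_write_cmd(target, block_bytes=MIB, count=1024):
    """Build a fio command approximating the dd run above.

    One job, sequential writes, synchronous I/O engine, O_DIRECT.
    """
    return [
        "fio",
        "--name=seqwrite",            # arbitrary job name
        f"--filename={target}",
        "--rw=write",                 # sequential write
        f"--bs={block_bytes}",
        f"--size={block_bytes * count}",
        "--direct=1",                 # O_DIRECT, like oflag=direct
        "--ioengine=psync",           # synchronous I/O, similar to dd's write loop
        "--numjobs=1",
    ]


if __name__ == "__main__":
    # Hypothetical scratch file on the LV under test; adjust before use.
    target = "/mnt/test/ddtest.img"
    print(shlex.join(dd_direct_write_cmd(target)))
    print(shlex.join(fio_direct_write_cmd(target)))
```

Both commands write the same total number of bytes with the same block size, so their reported throughput can be compared directly. Sizes are passed as plain byte counts to avoid any difference in how the two tools interpret unit suffixes.
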
On real physical hardware, where I know all of the variables, the product spec sheets, and the theoretical performance ceiling of every component, dd and fio give me results that are close to the physical capabilities of the hardware involved. To be clear, fio shows the same results as dd when using direct writes, which indicates that dd isn't that bad as a quick test and is a viable stand-in for fio when all I want is a rough idea of where performance lies.

In these cases, what I'm after when testing a system with dd or fio is a validation of the physical capabilities of the hardware, as a starting point for performance tuning. If these results aren't in line with expectations, it's pointless to tune any other layer until that is resolved.

Unless someone can provide evidence that dd or fio do not, and simply cannot, show true hardware speeds when testing in a virtualized environment, I can only continue to use these tools for benchmarking. When it comes down to it, we buy hardware for its performance characteristics, and we want to verify that we are seeing that same performance in our environment after purchase.

/AH

-----Original Message-----
From: Erwin van Londen <er...@erwinvanlonden.net>
Sent: 21 July 2025 05:26
To: Zdenek Kabelac <zdenek.kabe...@gmail.com>; Henry, Andrew <andrew.he...@eon.se>; linux-lvm@lists.linux.dev
Subject: Re: striped LV and expected performance

3. Unreal cache optimisations.

Using dd is by far the worst option to use for performance tests as it will never (Ok, almost never) align with real workloads. If you use dd for performance test you will find that this will backfire in most cases when a normal workload is applied. The main reason is that dd will always have a sequential workload unless you start a large amount of dd instances to the same disk at once with different offsets. Even then you will see an obscure number coming back.