Greg wrote:
> Hi there,
>
> I have a "server" running 24/7 with a lot of RAM. I would like to speed up
> disk system by giving much higher priority to reads and delaying writes.
>
> YES, I KNOW THE RISK!
>
> As I understand, there are two things to tune:
>
> 1. I/O Scheduler. The default is mq-deadline. Let say I have md_raid array
> /dev/md0 consisting of /dev/sd[ab]. Should I keep it for both, md0 and
> physical disks? Which I/O scheduler parameters should I change in case of
> md0 and which in case of sd[ab]?
>
> 2. Kernel runtime parameters. As I understand I should focus on vm.dirty_*
> parameters. My Idea is to set vm.dirty_ratio=70 and
> vm.dirty_expire_centisecs to something like 10 to 60min. Should I change
> anything else?
>
> PS. Among others, I'm trying to learn something about Linux caching. So
> please stick to above questions.

The first thing you should do is establish whether you have a problem.

    grep -E "dirty|writeback" /proc/vmstat

nr_dirty is the number of pages waiting to be written; they will be
written out once they have been dirty for longer than
vm.dirty_expire_centisecs. If this number is low while your server is
actually doing things, then there is no point in trying to delay writes
further - it is not pushing enough data to disk to be worthwhile.

What's low? Well, a page is typically 4KB, and anything that takes less
than a tenth of a second to write is going to be fast.

If you have a RAID that can write at 150MB/s (a typical speed for a
single rotating disk) then less than 15MB is negligible. That would be
an nr_dirty around 3800.

If you are using a SATA SSD, a write speed of 500MB/s is a good
assumption, so an nr_dirty exceeding roughly 12500 would hit the 0.1
second threshold.

If you are using PCIe 4 NVMe SSDs, 1-2 GB/s is plausible, and the
nr_dirty would have to be 25000 to 50000.

All this assumes a fairly linear write pattern. If you are using
rotating disks and all your writes are small and to random locations -
the worst case - then an nr_dirty of 100 might be interesting.

If you want to gather stats for a longer period of time while you run a
typical workload, the command you want is called sar.

First establish that you have a problem.

-dsr-
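
The 0.1 second rule of thumb above is just arithmetic, so here is a rough sketch of it in shell. The function name threshold_pages is made up for illustration, and the 4KB page size and the throughput figures are assumptions carried over from the text; check your real page size with `getconf PAGESIZE` and measure your own write speed before trusting the result.

```shell
#!/bin/sh
# Hypothetical helper: number of dirty pages that take about 0.1 s to
# write at a given sequential throughput.
# Arguments: throughput in MB/s, optional page size in KB (default 4).
threshold_pages() {
    mbps=$1
    page_kb=${2:-4}
    # MB/s * 1024 = KB/s; / page size = pages/s; / 10 = pages per 0.1 s
    echo $(( mbps * 1024 / page_kb / 10 ))
}

# Assumed speeds from the discussion: rotating disk, SATA SSD, NVMe range.
for speed in 150 500 1000 2000; do
    printf '%5d MB/s -> about %d pages\n' "$speed" "$(threshold_pages "$speed")"
done

# Show the live counter for comparison (only where /proc/vmstat exists).
if [ -r /proc/vmstat ]; then
    awk '$1 == "nr_dirty" { print "current nr_dirty:", $2 }' /proc/vmstat
fi
```

If the current nr_dirty stays well under the figure for your hardware while the machine is busy, delaying writes further is unlikely to buy anything. This only models linear writes; for small random writes on rotating disks a far lower count can already matter.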

