I have not read the whole thread -- sorry. I have a suggestion. If I were to set out to make a dent in I/O performance, I would invest time in some groundwork first:
- Stop abusing "struct buf" to describe I/O requests. Introduce an iorequest_t or something.
- Currently we do a virtual->physical->virtual->physical dance for many I/Os. That has quite a bit of overhead, and many of those I/Os never need to appear in kernel space, so the dance is not needed. Make I/O requests pass around lists of vm_pages or physical addresses or whatever,