On 08/11/2015 10:38, Matthew Dillon wrote:
> The basic problem is that heavy use of dsched (the disk scheduler) has
> caused crashes and lockups ever since dsched went into the tree, and is
> still causing crashes and lockups today. Several people have tried to
> bandaid it but honestly I think the only correct solution is to remove it
> and then start over from scratch at some future date with a better design.
>
> So I would like to remove it, and I am soliciting opinions on that.

I'm all for it.

The current design is heavy-handed and just plain overkill. Linking the I/O to threads on a fine-grained basis is just asking for trouble.

I've offered blueprints for a redesign on IRC to several people over the years - without going into many details, the key aspect would be to get rid of the heavy-handed linking to threads, processes, etc. Just hashing based on the thread ID would simplify the design significantly and make it a lot easier to work with.

If anybody is serious about replacing it with a more sensible design, I can dig out some more information from my logged IRC rumblings.

However, with the prevalence of SSDs, NVMe, etc. nowadays, I have a feeling that adding any more latency to disk access is rather counter-productive, even if the scheduling were optimal. In other words - I'm unconvinced we would even benefit from such a framework going forward.

Instead, it might be significantly more interesting to trim down the latencies through CAM, or even to come up with a possible replacement for it, with a focus on low latency/overhead.

Cheers,
Alex