Thank you for the detailed update, Xavi! This looks very interesting.

On Thu, Jan 24, 2019 at 7:50 AM Xavi Hernandez <[email protected]> wrote:
> Hi all,
>
> I've just updated a patch [1] that implements a new thread pool based on a
> wait-free queue provided by the userspace-rcu library. The patch also
> includes an auto-scaling mechanism that only keeps running the number of
> threads needed for the current workload.
>
> This new approach has some advantages:
>
>    - It's provided globally inside libglusterfs instead of inside an
>    xlator
>
>    This makes it possible for the fuse thread and epoll threads to hand a
>    received request over to another thread sooner, wasting less CPU and
>    reacting sooner to other incoming requests.
>
>    - Adding jobs to the queue used by the thread pool only requires an
>    atomic operation
>
>    This makes the producer side of the queue really fast, with almost no
>    delay.
>
>    - Contention is reduced
>
>    The producer side has negligible contention thanks to the wait-free
>    enqueue operation based on an atomic access. The consumer side requires
>    a mutex, but it is held only very briefly, and the scaling mechanism
>    makes sure that there are no more threads than needed contending for
>    the mutex.
>
> This change disables io-threads, since it replaces part of its
> functionality. However, there are two things that could still be needed
> from io-threads:
>
>    - Prioritization of fops
>
>    Currently, io-threads assigns a priority to each fop, so that some fops
>    are handled before others.
>
>    - Fair distribution of execution slots between clients
>
>    Currently, io-threads processes requests from each client in
>    round-robin.
>
> These features are not implemented right now. If they are needed, probably
> the best thing to do would be to keep them inside io-threads, but change
> its implementation so that it uses the global threads from the thread pool
> instead of its own threads.

These features are indeed useful to have, so modifying the implementation of
io-threads to provide this behavior would be welcome.

> These tests have shown that the limiting factor has been the disk in most
> cases, so it's hard to tell if the change has really improved things. There
> is only one clear exception: self-heal on a dispersed volume completes
> 12.7% faster. CPU utilization has also dropped drastically:
>
> Old implementation: 12.30 user, 41.78 sys, 43.16 idle, 0.73 wait
> New implementation:  4.91 user,  5.52 sys, 81.60 idle, 5.91 wait
>
> Now I'm running some more tests on NVMe to try to see the effects of the
> change when the disk is not limiting performance. I'll update once I have
> more data.

Will look forward to these numbers.

Regards,
Vijay
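For reference, below is a minimal sketch of the enqueue/dequeue pattern
offered by liburcu's wait-free concurrent queue (the cds_wfcq_* API the
patch builds on). The job structure, the semaphore-based wake-up and the
fixed worker count are illustrative assumptions only and are not taken
from the patch, which keeps a mutex on the consumer side and auto-scales
the number of worker threads as described above.

/*
 * Sketch of a thread pool fed by liburcu's wait-free concurrent queue.
 * The job struct, semaphore wake-up and fixed worker count are
 * illustrative assumptions, not the actual GlusterFS patch.
 */
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

#include <urcu/compiler.h>   /* caa_container_of() */
#include <urcu/wfcqueue.h>   /* cds_wfcq_* wait-free concurrent queue */

struct job {
        struct cds_wfcq_node node;   /* queue linkage */
        void (*fn)(void *data);      /* work to execute */
        void *data;
};

static struct cds_wfcq_head queue_head;
static struct cds_wfcq_tail queue_tail;
static sem_t pending;                /* counts queued jobs */

/* Producer side: wait-free enqueue, essentially one atomic exchange. */
static int submit(void (*fn)(void *), void *data)
{
        struct job *job = malloc(sizeof(*job));

        if (!job)
                return -1;
        cds_wfcq_node_init(&job->node);
        job->fn = fn;
        job->data = data;
        cds_wfcq_enqueue(&queue_head, &queue_tail, &job->node);
        sem_post(&pending);          /* wake one sleeping worker */
        return 0;
}

/* Consumer side: dequeues are serialized by the queue's internal dequeue
 * lock, which is held only for the duration of a single dequeue. */
static void *worker(void *arg)
{
        (void)arg;
        for (;;) {
                struct cds_wfcq_node *node;
                struct job *job;

                sem_wait(&pending);  /* sleep until a job is available */
                node = cds_wfcq_dequeue_blocking(&queue_head, &queue_tail);
                if (!node)
                        continue;
                job = caa_container_of(node, struct job, node);
                job->fn(job->data);
                free(job);
        }
        return NULL;
}

static void pool_init(int nthreads)
{
        pthread_t tid;

        cds_wfcq_init(&queue_head, &queue_tail);
        sem_init(&pending, 0, 0);
        while (nthreads-- > 0)
                pthread_create(&tid, NULL, worker, NULL);
}

The point of the pattern is that the producer path is a single
cds_wfcq_enqueue() call on the tail, which is why adding a job costs only
one atomic operation, while mutual exclusion is needed only between
dequeuers and only for the short dequeue itself.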
_______________________________________________
Gluster-devel mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-devel
