On Thu, Jan 31, 2019 at 11:12 PM Xavi Hernandez <xhernan...@redhat.com>
wrote:

> On Fri, Feb 1, 2019 at 7:54 AM Vijay Bellur <vbel...@redhat.com> wrote:
>> On Thu, Jan 31, 2019 at 10:01 AM Xavi Hernandez <xhernan...@redhat.com>
>> wrote:
>>> Hi,
>>> I've been doing some tests with the global thread pool [1], and I've
>>> observed one important thing:
>>> Since this new thread pool has very low contention (apparently), it
>>> exposes other problems when the number of threads grows. What I've seen is
>>> that some workloads use all available threads on bricks to do I/O, causing
>>> avgload to grow rapidly, saturating the machine (or so it seems), which
>>> really makes everything slower. Reducing the maximum number of threads
>>> actually improves performance. Other workloads, though, do little I/O
>>> (probably most is locking or smallfile operations). In this case limiting
>>> the number of threads to a small value causes a performance reduction. To
>>> increase performance we need more threads.
>>> So this is making me think that maybe we should implement some sort of
>>> I/O queue with a maximum I/O depth for each brick (or disk, if bricks
>>> share the same disk). This way we can limit the number of requests
>>> physically
>>> accessing the underlying FS concurrently, without actually limiting the
>>> number of threads that can be doing other things on each brick. I think
>>> this could improve performance.
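A minimal sketch of that per-brick I/O depth idea, using a counting semaphore (illustrative Python, not Gluster code; all names here are made up):

```python
import threading
import time

MAX_IO_DEPTH = 4   # hypothetical per-brick/disk depth limit
NUM_REQUESTS = 32  # simulated concurrent requests

io_slots = threading.BoundedSemaphore(MAX_IO_DEPTH)
lock = threading.Lock()
in_flight = 0
max_in_flight = 0

def brick_io():
    """Simulated FS request; only MAX_IO_DEPTH run concurrently."""
    global in_flight, max_in_flight
    with io_slots:                # blocks once the depth limit is hit
        with lock:
            in_flight += 1
            max_in_flight = max(max_in_flight, in_flight)
        time.sleep(0.005)         # stand-in for the actual disk access
        with lock:
            in_flight -= 1

threads = [threading.Thread(target=brick_io) for _ in range(NUM_REQUESTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent I/O:", max_in_flight)  # never above MAX_IO_DEPTH
```

Requests beyond the depth limit simply block at the semaphore instead of piling onto the FS, while the rest of the thread pool stays free for non-I/O work.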
>> Perhaps we could throttle both aspects - number of I/O requests per disk
>> and the number of threads too?  That way we will have the ability to behave
>> well when there is bursty I/O to the same disk and when there are multiple
>> concurrent requests to different disks. Do you have a reason to not limit
>> the number of threads?
> No, in fact the global thread pool does have a limit for the number of
> threads. I'm not saying we should replace the thread limit with I/O depth
> control; I think we need both. We need to clearly identify which threads
> are doing I/O and limit them, even if there are more threads available.
> The reason is simple: suppose we have a fixed number of threads. If we
> have heavy
> load sent in parallel, it's quite possible that all threads get blocked
> doing some I/O. This has two consequences:
>    1. There are no more threads to execute other things, like sending
>    answers to the client or starting to process new incoming requests,
>    so the CPU is underutilized.
>    2. Massively parallel access to a FS actually decreases performance.
> This means we get less work done and each piece of work takes longer,
> which is bad.
> If we limit the number of threads that can actually be doing FS I/O, it's
> easy to keep FS responsive and we'll still have more threads to do other
> work.

Got it, thx.
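One way to picture that split between I/O threads and other work is a small dedicated I/O pool that general workers hand FS operations to (again an illustrative sketch with made-up names, not the Gluster implementation):

```python
import queue
import threading

IO_THREADS = 2          # only these threads ever block on the FS
io_queue = queue.Queue()

def io_worker():
    """Dedicated I/O thread: pulls FS operations off the queue."""
    while True:
        job = io_queue.get()
        if job is None:         # shutdown sentinel
            break
        op, done = job
        op()                    # the only place that blocks on I/O
        done.set()

def submit_io(op):
    """Called by a general worker; returns immediately with an Event."""
    done = threading.Event()
    io_queue.put((op, done))
    return done

workers = [threading.Thread(target=io_worker) for _ in range(IO_THREADS)]
for w in workers:
    w.start()

# A general worker queues an FS op, stays free to handle other
# requests, and only waits when it finally needs the result.
results = []
done = submit_io(lambda: results.append("wrote block"))
done.wait()

for _ in workers:
    io_queue.put(None)
for w in workers:
    w.join()

print(results)  # ['wrote block']
```

However many general workers exist, at most IO_THREADS of them can be stuck on the FS at once; the rest keep answering clients and accepting new requests.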

>>> Maybe this approach could also be useful on the client side, but I think
>>> it's not so critical there.
>> Agree, rate limiting on the server side would be more appropriate.
> The only thing to consider here is that if we limit the rate on servers but
> clients can generate more requests without limit, we may require lots of
> memory to track all ongoing requests. Anyway, I think this is not the most
> important thing now, so if we solve the server-side problem, then we can
> check if this is really needed or not (it could happen that client
> applications limit themselves automatically because they will be waiting
> for answers from the server before sending more requests, unless the
> number of applications running concurrently is really huge).

We could enable throttling in the RPC layer to handle a client performing
aggressive I/O.  RPC throttling should be able to handle the scenario
described above.
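For the RPC-side throttling, a token bucket is one common shape for this: a client gets a burst allowance and is delayed once it exceeds the sustained rate. The sketch below is illustrative only; the class and parameter names are invented, not an existing Gluster or RPC API:

```python
import time

class TokenBucket:
    """Delay callers that exceed `rate` requests/sec beyond a burst."""
    def __init__(self, rate, burst):
        self.rate = rate              # tokens added per second
        self.capacity = burst         # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:           # over the limit: wait for a token
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 1.0
        self.tokens -= 1

bucket = TokenBucket(rate=100, burst=5)
start = time.monotonic()
for _ in range(10):   # 5 pass as a burst, the rest are rate-limited
    bucket.acquire()
elapsed = time.monotonic() - start
print(f"10 requests took {elapsed:.3f}s")
```

Because delayed requests simply wait in `acquire()` before being dispatched, the server never has to track an unbounded backlog of in-flight requests from an aggressive client.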

Gluster-devel mailing list
