Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-27 Thread Raghavendra G
I've filed a bug on the issue at: https://bugzilla.redhat.com/show_bug.cgi?id=1360689 On Fri, Jul 15, 2016 at 12:44 PM, Raghavendra G wrote: > Hi Patrick, > > Is it possible to test out whether the patch fixes your issue? There is > nothing like validation from user

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-15 Thread Raghavendra G
Hi Patrick, Is it possible to test out whether the patch fixes your issue? There is nothing like validation from a user experiencing the problem first-hand. regards, Raghavendra On Tue, Jul 12, 2016 at 10:40 PM, Jeff Darcy wrote: > > Thanks for responding so quickly. I'm not

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-12 Thread Glomski, Patrick
Hello, Jeff. Thanks for responding so quickly. I'm not familiar with the codebase, so if you don't mind me asking, how much would that list reordering slow things down for, say, a queue of 1500 client machines? i.e., roughly how long a client list would significantly affect latency? I only
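How much the reordering costs depends more on the data structure than on the list length. A minimal timing sketch (not Gluster code; the 1500 figure comes from the question above, everything else — names, layouts — is illustrative) comparing a naive linear-scan reorder against an O(1) hash-indexed move-to-end:

```python
import timeit
from collections import OrderedDict

N_CLIENTS = 1500  # queue size asked about in the thread (illustrative)

# Naive layout: clients kept in a plain list; moving one to the back
# needs a linear scan, so each reorder costs O(n).
clients_list = list(range(N_CLIENTS))

def move_to_back_list(cid):
    clients_list.remove(cid)  # O(n) scan to find the entry
    clients_list.append(cid)

# Hash-indexed layout: OrderedDict is a hash map over a doubly linked
# list, so moving an entry to the end is O(1) regardless of length.
clients_od = OrderedDict((c, None) for c in range(N_CLIENTS))

def move_to_back_od(cid):
    clients_od.move_to_end(cid)  # O(1)

slow = timeit.timeit(lambda: move_to_back_list(750), number=10_000)
fast = timeit.timeit(lambda: move_to_back_od(750), number=10_000)
print(f"linear scan: {slow:.4f}s  move_to_end: {fast:.4f}s")
```

With a hash-indexed structure the per-request reorder stays constant-time, so a 1500-entry client list should not by itself add measurable latency; only the naive linear-scan layout degrades as the client count grows.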

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-12 Thread Jeff Darcy
> > * We might be able to tweak io-threads (which already runs on the > > bricks and already has a global queue) to schedule requests in a > > fairer way across clients. Right now it executes them in the > > same order that they were read from the network. > > This sounds like an easier fix. We
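The fairness idea quoted above — per-client queues drained round-robin instead of one global arrival-order FIFO — can be illustrated outside the io-threads codebase. A hypothetical sketch (names like `FairQueue` are invented for illustration and are not Gluster APIs):

```python
from collections import OrderedDict, deque

class FairQueue:
    """Hypothetical round-robin scheduler across per-client queues.

    Sketches the idea discussed for io-threads: instead of one global
    FIFO in network-arrival order, keep a queue per client and serve
    the clients in rotation, one request at a time.
    """

    def __init__(self):
        self._queues = OrderedDict()  # client_id -> deque of pending requests

    def enqueue(self, client_id, request):
        self._queues.setdefault(client_id, deque()).append(request)

    def dequeue(self):
        if not self._queues:
            return None
        # Serve the least-recently-served client first.
        client_id = next(iter(self._queues))
        queue = self._queues[client_id]
        request = queue.popleft()
        if queue:
            self._queues.move_to_end(client_id)  # rotate client to the back
        else:
            del self._queues[client_id]          # no more pending work
        return request

# One "heavy" client floods three requests before a "light" client sends one.
fq = FairQueue()
for r in ("h1", "h2", "h3"):
    fq.enqueue("heavy", r)
fq.enqueue("light", "l1")

served = [fq.dequeue() for _ in range(4)]
print(served)  # round-robin order: ['h1', 'l1', 'h2', 'h3']
```

With a single arrival-order FIFO the light client's request would be served last; under round-robin it is served second, so one flooding client can no longer starve the rest.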

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-11 Thread Raghavendra G
On Fri, Jul 8, 2016 at 8:02 PM, Jeff Darcy wrote: > > In either of these situations, one glusterfsd process on whatever peer > the > > client is currently talking to will skyrocket to *nproc* cpu usage (800%, > > 1600%) and the storage cluster is essentially useless; all other

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-08 Thread Jeff Darcy
> In either of these situations, one glusterfsd process on whatever peer the > client is currently talking to will skyrocket to *nproc* cpu usage (800%, > 1600%) and the storage cluster is essentially useless; all other clients > will eventually try to read or write data to the overloaded peer

[Gluster-devel] One client can effectively hang entire gluster array

2016-07-08 Thread Glomski, Patrick
Hello, users and devs. TL;DR: One gluster client can essentially cause denial of service / availability loss to entire gluster array. There's no way to stop it and almost no way to find the bad client. Probably all (at least 3.6 and 3.7) versions are affected. We have two large replicate gluster