TThreadPoolServer does its IO in many threads, so it can be faster in some circumstances. However, it isn't "fair" in terms of requests: once a client connection is assigned a thread, it monopolizes that thread until the connection closes, even when it isn't making any requests. In practice, this means you have to allow essentially unlimited threads.
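That connection-holds-a-thread behavior can be illustrated with a plain `java.util.concurrent` sketch (no Thrift involved; the pool size, class name, and timing are purely illustrative): two connected-but-idle "clients" pin both worker threads, so a third client with a real request just queues.

```java
import java.util.concurrent.*;

public class ThreadPerConnectionDemo {
    // Returns true if a third client's request completes within waitMillis
    // while two idle "connections" are each holding a worker thread.
    static boolean thirdClientServed(long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // 2 worker threads
        CountDownLatch release = new CountDownLatch(1);
        // Two connected-but-idle clients: each occupies a thread doing nothing.
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                try { release.await(); } catch (InterruptedException ignored) { }
            });
        }
        // A third client with an actual (trivial) request queues behind them.
        Future<?> activeRequest = pool.submit(() -> { });
        boolean served;
        try {
            activeRequest.get(waitMillis, TimeUnit.MILLISECONDS);
            served = true;
        } catch (Exception e) {
            served = false; // timed out: no worker thread ever freed up
        }
        release.countDown();
        pool.shutdown();
        return served;
    }
}
```

With a pool sized below the number of open connections, the active request never runs until an idle connection releases its thread, which is why thread-per-connection servers effectively need at least one thread per client.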
I don't know why thread pool server would stop responding after 30 minutes.
We'd need to see more information to have any chance of figuring that out.
1/10th the clients will be serviced much more quickly by any server, I'd
imagine. How many worker threads does the HsHa server have configured?

On Wed, Apr 27, 2011 at 10:46 AM, Michi Mutsuzaki <[email protected]> wrote:

> I tried using TThreadPoolServer (max threads=100), and latency was much
> lower. I thought I'd just use TThreadPoolServer instead, but it stopped
> responding after 30 minutes...
>
> - Has anybody seen TThreadPoolServer stop responding under heavy load?
>   Now I can't even telnet to the server.
> - I tried reducing the number of clients to 10, and the latency for
>   THsHaServer went down significantly. Is this an expected behavior?
>
> Thanks!
> --Michi
>
> On 4/26/11 4:07 PM, "Michi Mutsuzaki" <[email protected]> wrote:
>
> > Hi Bryan,
> >
> > Thank you for your response. I tried 100 worker threads, but it didn't
> > help... I'm still seeing a big time gap between the first and second
> > recv. Could it be because I'm using an older version of Thrift on the
> > client?
> >
> > Thanks!
> > --Michi
> >
> > On 4/26/11 3:53 PM, "Bryan Duxbury" <[email protected]> wrote:
> >
> >> The big imbalance between worker threads and client processes means
> >> that your clients are queuing like mad - on average, each client is
> >> waiting for about 12 other requests to finish. Increase your number of
> >> threads to be greater than the number of client processes and you
> >> should see a difference.
> >>
> >> On Tue, Apr 26, 2011 at 12:40 PM, Michi Mutsuzaki
> >> <[email protected]> wrote:
> >>
> >>> Hello,
> >>>
> >>> I'm doing performance tests on THsHaServer, and I'd like to check
> >>> whether my setup and results look reasonable.
> >>>
> >>> Thrift RPC API:
> >>> I have 1 method called ping() that simply returns 0.
> >>>
> >>> client:
> >>> - 100 client processes.
> >>> - Each process sends ping() in a loop.
> >>> - Thrift 0.4 with C++.
> >>>
> >>>   boost::shared_ptr<TSocket> socket(new TSocket(host, port));
> >>>   boost::shared_ptr<TTransport> transport(new TFramedTransport(socket));
> >>>   boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
> >>>
> >>> server:
> >>> - RHEL4 on a 4-core box.
> >>> - Thrift 0.6 with Java. 8 worker threads.
> >>>
> >>>   TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
> >>>   THsHaServer.Args args = new THsHaServer.Args(trans);
> >>>   args.workerThreads(8);
> >>>   TServer server = new THsHaServer(args);
> >>>
> >>> result:
> >>> - average latency: 30 ms
> >>> - throughput: 3100 requests/sec
> >>> - strace on the client process shows a big time gap (~30 ms) between
> >>>   the first and second recv for many requests.
> >>>
> >>>   12:24:05.322485 send(35, "...", 21, MSG_NOSIGNAL) = 21 <0.000018>
> >>>   12:24:05.322559 recv(35, "...", 4, 0) = 4 <0.029323>
> >>>   12:24:05.352003 recv(35, "...", 24, 0) = 24 <0.000009>
> >>>
> >>> - the server spends most of its time in futex() calls.
> >>>
> >>>   % time     seconds  usecs/call     calls    errors syscall
> >>>   ------ ----------- ----------- --------- --------- ----------------
> >>>    85.33   23.120969         261     88620     11395 futex
> >>>    10.23    2.771762         145     19128           write
> >>>     2.87    0.777199          20     37917           read
> >>>   ...
> >>>
> >>> I'm looking to see how I can reduce latency. Please let me know if
> >>> I'm missing something obvious.
> >>>
> >>> Thanks!
> >>> --Michi
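Concretely, Bryan's suggestion amounts to raising `workerThreads` past the client-process count in the THsHaServer setup quoted above. A minimal sketch reusing that snippet (the value 128 is an assumption, chosen only because it exceeds the 100 client processes; `port` comes from the surrounding code):

```java
// Sketch: size the worker pool above the number of concurrent client
// processes (100 in this test), so requests don't queue waiting for a worker.
TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
THsHaServer.Args args = new THsHaServer.Args(trans);
args.workerThreads(128);  // assumption: anything > 100 clients; was 8
TServer server = new THsHaServer(args);
server.serve();
```

With 8 workers and 100 looping clients, each completed call leaves roughly 12 clients waiting per worker, which matches the ~30 ms queuing gap visible between the first and second recv in the strace output.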