With THsHaServer (100 threads), CPU is about 60%, and it looks like the server process is thrashing (lots of futex() calls). With TThreadPoolServer (100 threads), CPU is 100%.
I'll try to write a small program to reproduce this.

Thanks!
--Michi

On 4/27/11 11:06 AM, "Bryan Duxbury" <[email protected]> wrote:

> I doubt the client has anything to do with it. I'd be interested to hear
> about the condition of the server in general - is the server's java process
> using up 100% cpu? That could point to the IO thread being saturated.
>
> On Tue, Apr 26, 2011 at 4:07 PM, Michi Mutsuzaki <[email protected]> wrote:
>
>> Hi Bryan,
>>
>> Thank you for your response. I tried 100 worker threads, but it didn't
>> help... I'm still seeing a big time gap between the first and second recv.
>> Could it be because I'm using an older version of thrift on the client?
>>
>> Thanks!
>> --Michi
>>
>> On 4/26/11 3:53 PM, "Bryan Duxbury" <[email protected]> wrote:
>>
>>> The big imbalance between worker threads and client processes means that
>>> your clients are queuing like mad - on average, each client is waiting
>>> for like 12 other requests to finish. Increase your number of threads to
>>> be greater than the number of client processes and you should see a
>>> difference.
>>>
>>> On Tue, Apr 26, 2011 at 12:40 PM, Michi Mutsuzaki <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm doing a performance test on THsHaServer, and I'd like to check if
>>>> my setup and result look reasonable.
>>>>
>>>> Thrift RPC API:
>>>> I have 1 method called ping() that simply returns 0.
>>>>
>>>> client:
>>>> - 100 client processes.
>>>> - Each process sends ping() in a loop.
>>>> - Thrift 0.4 with C++.
>>>>
>>>>     boost::shared_ptr<TSocket> socket(new TSocket(host, port));
>>>>     boost::shared_ptr<TTransport> transport(new TFramedTransport(socket));
>>>>     boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
>>>>
>>>> server:
>>>> - RHEL4 on a 4-core box.
>>>> - Thrift 0.6 with Java. 8 worker threads.
>>>>
>>>>     TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
>>>>     THsHaServer.Args args = new THsHaServer.Args(trans);
>>>>     args.workerThreads(8);
>>>>     TServer server = new THsHaServer(args);
>>>>
>>>> result:
>>>> - average latency: 30 ms
>>>> - throughput: 3100 requests/sec
>>>> - strace on the client process shows a big time gap (~30ms) between the
>>>>   first and second recv for many requests.
>>>>
>>>>     12:24:05.322485 send(35, "...", 21, MSG_NOSIGNAL) = 21 <0.000018>
>>>>     12:24:05.322559 recv(35, "...", 4, 0)  = 4  <0.029323>
>>>>     12:24:05.352003 recv(35, "...", 24, 0) = 24 <0.000009>
>>>>
>>>> - the server spends most of its time in futex() calls.
>>>>
>>>>     % time     seconds  usecs/call     calls    errors syscall
>>>>     ------ ----------- ----------- --------- --------- ----------------
>>>>      85.33   23.120969         261     88620     11395 futex
>>>>      10.23    2.771762         145     19128           write
>>>>       2.87    0.777199          20     37917           read
>>>>     ...
>>>>
>>>> I'm looking to see how I can reduce latency. Please let me know if I'm
>>>> missing something obvious.
>>>>
>>>> Thanks!
>>>> --Michi
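[Editor's note: the queueing arithmetic in Bryan's reply can be checked against the measured numbers with Little's law (L = λ·W). With 100 blocking clients each keeping one request in flight and ~30 ms average latency, predicted throughput is 100 / 0.030 ≈ 3333 req/s, close to the observed 3100 req/s, which suggests the 30 ms is mostly queueing delay on the 8 worker threads rather than slow request processing. A minimal sketch using only the figures from this thread (no Thrift calls):]

```java
public class LittlesLawCheck {
    public static void main(String[] args) {
        int clients = 100;          // blocking client processes, 1 outstanding request each
        double latencySec = 0.030;  // measured average latency (30 ms)
        int workers = 8;            // server worker threads

        // Little's law: throughput = concurrency / latency
        double predictedThroughput = clients / latencySec;
        System.out.printf("predicted throughput: %.0f req/s (measured: ~3100)%n",
                predictedThroughput);

        // With 100 clients sharing 8 workers, each request waits behind
        // roughly (clients / workers) - 1 others on average -- Bryan's "like 12".
        double queuedAhead = (double) clients / workers - 1;
        System.out.printf("avg requests queued ahead of each client: ~%.1f%n",
                queuedAhead);
    }
}
```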
