The big imbalance between worker threads (8) and client processes (100) means that your clients are queuing heavily: on average, each request is waiting behind roughly 12 others (100 / 8 = 12.5). Increase the number of worker threads so that it exceeds the number of client processes and you should see a difference in latency.
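For what it's worth, here is the rough arithmetic behind that estimate, as a small sketch rather than anything rigorous. It assumes each client keeps exactly one request outstanding and that the worker pool is the bottleneck; the class and method names are made up for illustration and are not part of Thrift. The inputs are the figures from your message (100 clients, 8 workers, 3100 requests/sec, 30 ms).

```java
// Back-of-the-envelope queueing estimate using the numbers from the quoted test.
// QueueEstimate and its method names are illustrative only, not a Thrift API.
class QueueEstimate {
    // Little's law: requests in flight = throughput * latency.
    static double inFlight(double requestsPerSec, double latencyMs) {
        return requestsPerSec * latencyMs / 1000.0;
    }

    // Average number of outstanding requests competing for each worker thread,
    // assuming every client has exactly one request in flight.
    static double perWorker(int clients, int workerThreads) {
        return clients / (double) workerThreads;
    }

    public static void main(String[] args) {
        System.out.println("requests in flight: " + inFlight(3100, 30));
        System.out.println("requests per worker: " + perWorker(100, 8));
    }
}
```

The in-flight figure comes out close to your 100 clients, which is consistent with nearly every client sitting in a queue at any given moment.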
On Tue, Apr 26, 2011 at 12:40 PM, Michi Mutsuzaki <[email protected]> wrote:
> Hello,
>
> I'm doing performance test on THsHaServer, and I like to check if my setup
> and result look reasonable.
>
> Thrift RCP API:
> I have 1 method called ping() that simply returns 0.
>
> client:
> - 100 client processes.
> - Each process sends ping() in a loop.
> - Thrift 0.4 with C++.
>   boost::shared_ptr<TSocket> socket(new TSocket(host, port));
>   boost::shared_ptr<TTransport> transport(new TFramedTransport(socket));
>   boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
>
> server:
> - RHEL4 on a 4-core box.
> - Thrift 0.6 with Java. 8 worker threads.
>   TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
>   THsHaServer.Args args = new THsHaServer.Args(trans);
>   args.workerThreads(8);
>   TServer server = new THsHaServer(args);
>
> result:
> - average latency: 30 ms
> - throughput: 3100 requests/sec
> - strace on the client process shows that there is a big time gap (~30ms)
>   between first and sencond recv for many requests.
>
>   12:24:05.322485 send(35, "...", 21, MSG_NOSIGNAL) = 21 <0.000018>
>   12:24:05.322559 recv(35, "...", 4, 0) = 4 <0.029323>
>   12:24:05.352003 recv(35, "...", 24, 0) = 24 <0.000009>
>
> - the server spends most of the time in futex() call.
>
>   % time     seconds  usecs/call     calls    errors syscall
>   ------ ----------- ----------- --------- --------- ----------------
>    85.33   23.120969         261     88620     11395 futex
>    10.23    2.771762         145     19128           write
>     2.87    0.777199          20     37917           read
>   ...
>
> I'm looking to see how I can reduce latency. Please let me know if I'm
> missing something obvious.
>
> Thanks!
> --Michi
>
>
