Hello,
I'm running a performance test on THsHaServer, and I'd like to check whether
my setup and results look reasonable.
Thrift RPC API:
I have one method, ping(), that simply returns 0.
client:
- 100 client processes.
- Each process sends ping() in a loop.
- Thrift 0.4 with C++.
    boost::shared_ptr<TSocket> socket(new TSocket(host, port));
    boost::shared_ptr<TTransport> transport(new TFramedTransport(socket));
    boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
server:
- RHEL4 on a 4-core box.
- Thrift 0.6 with Java. 8 worker threads.
    TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
    THsHaServer.Args args = new THsHaServer.Args(trans);
    args.workerThreads(8);
    TServer server = new THsHaServer(args);
result:
- average latency: 30 ms
- throughput: 3100 requests/sec
- strace on the client process shows a large gap (~30 ms) between the first
and second recv() for many requests.
    12:24:05.322485 send(35, "...", 21, MSG_NOSIGNAL) = 21 <0.000018>
    12:24:05.322559 recv(35, "...", 4, 0) = 4 <0.029323>
    12:24:05.352003 recv(35, "...", 24, 0) = 24 <0.000009>
- strace on the server shows that it spends most of its time in futex() calls.
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     85.33   23.120969         261     88620     11395 futex
     10.23    2.771762         145     19128           write
      2.87    0.777199          20     37917           read
    ...
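
As a rough sanity check on the numbers above: assuming each client process keeps exactly one blocking ping() in flight at a time (the send/recv pattern in the client strace suggests this, but it is an assumption), throughput should be roughly the number of clients divided by the average latency. A minimal sketch of that arithmetic follows. The helper names are made up for illustration and are not part of Thrift.

```cpp
#include <cassert>

// Closed-loop estimate (Little's law): with N clients that each keep exactly
// one request in flight, throughput ~= N / average latency.
// ASSUMPTION: every client blocks on ping() before sending the next request.
double expectedThroughput(int clients, double latencySeconds) {
    assert(clients > 0 && latencySeconds > 0.0);
    return clients / latencySeconds;
}

// Inverse of the above: the average latency implied by a measured throughput.
double impliedLatencySeconds(int clients, double requestsPerSecond) {
    assert(clients > 0 && requestsPerSecond > 0.0);
    return clients / requestsPerSecond;
}
```

With the figures above, expectedThroughput(100, 0.030) gives about 3333 requests/sec, close to the measured 3100, and impliedLatencySeconds(100, 3100.0) gives about 32 ms. Under that assumption, the throughput looks bounded by the ~30 ms per-request latency rather than by server CPU, so removing the gap between the two recv() calls should raise throughput as well.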
I'm looking for ways to reduce the latency. Please let me know if I'm
missing something obvious.
Thanks!
--Michi