Hello,

I'm running a performance test on THsHaServer, and I'd like to check whether
my setup and results look reasonable.

Thrift RPC API:
I have 1 method called ping() that simply returns 0.

client:
- 100 client processes.
- Each process sends ping() in a loop.
- Thrift 0.4 with C++.
    boost::shared_ptr<TSocket> socket(new TSocket(host, port));
    boost::shared_ptr<TTransport> transport(new TFramedTransport(socket));
    boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
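
A minimal sketch of the kind of per-process timing loop described above, in
case it helps to compare setups. It assumes a C++11 compiler for
std::chrono; the helper name average_latency_ms is hypothetical and is not
part of Thrift, and the real client code may differ.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of Thrift): call `request` `iterations`
// times back to back and return the mean wall-clock time per call in
// milliseconds. Returns 0.0 when no iterations are requested.
template <typename Request>
double average_latency_ms(Request request, std::size_t iterations) {
    if (iterations == 0) {
        return 0.0;
    }
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        request();  // e.g. a blocking ping() on an already-open transport
    }
    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

With a generated client object, the call would look something like
average_latency_ms([&client]() { client.ping(); }, 10000), where client is
whatever the Thrift compiler generated for the service.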

server:
- RHEL4 on a 4-core box.
- Thrift 0.6 with Java. 8 worker threads.
    TNonblockingServerTransport trans = new TNonblockingServerSocket(port);
    THsHaServer.Args args = new THsHaServer.Args(trans);
    args.workerThreads(8);
    TServer server = new THsHaServer(args);

result:
- average latency: 30 ms
- throughput: 3100 requests/sec
- strace on a client process shows a large gap (~30 ms) between the first
and second recv for many requests.

    12:24:05.322485 send(35, "...", 21, MSG_NOSIGNAL) = 21 <0.000018>
    12:24:05.322559 recv(35, "...", 4, 0) = 4 <0.029323>
    12:24:05.352003 recv(35, "...", 24, 0) = 24 <0.000009>

- the server spends most of its time in futex() calls.

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     85.33   23.120969         261     88620     11395 futex
     10.23    2.771762         145     19128           write
      2.87    0.777199          20     37917           read
    ...
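
The latency and throughput figures above can be cross-checked against each
other. This is a rough sketch, assuming each client process keeps exactly
one ping() in flight at a time (a closed loop, which is what the send/recv
pattern in the strace output suggests but which I have not verified for
every process). The function names are made up for illustration.

```cpp
#include <cmath>

// Closed-loop model: each of `clients` waits for a reply before sending the
// next request, so aggregate throughput is roughly clients / latency.
double expected_throughput_per_sec(int clients, double latency_ms) {
    return clients * 1000.0 / latency_ms;
}

// The same relation solved for latency, given a measured throughput.
double implied_latency_ms(int clients, double throughput_per_sec) {
    return clients * 1000.0 / throughput_per_sec;
}

// True when two values agree to within `tolerance`.
bool roughly_equal(double a, double b, double tolerance) {
    return std::fabs(a - b) <= tolerance;
}
```

For 100 clients at 30 ms this gives about 3333 requests/sec, and 3100
requests/sec implies about 32 ms per request, so the two measurements look
consistent with each other. If the model holds, throughput here is bounded
by the per-request latency rather than by server CPU.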

I'm looking to see how I can reduce latency. Please let me know if I'm
missing something obvious.

Thanks!
--Michi
