On 8/10/12 8:50 AM, Antonio Rodriges wrote:
After further analysis of the statistics we have identified the bottleneck,
but have not yet found a solution.

You are right that when the client waits for the response, the throughput
will be low. However, we measure all stages of the query processing,
and the transfer stage takes a long time. We synchronize the clocks between
the server, the gate (see below) and the client to within +/-5 ms. 10 clients
can generate and receive responses for about 7 queries per minute.

This is very low.
Both client and server use Mina.

Client:
while ( time not finished )
{
   q = generateQueryString            // it is several bytes
   send (q)
   wait for response ()
}  // that's all

There is a gate (also Mina based) which simply retransfers queries to
servers and results back to clients:

Gate (has 32 nio acceptors)
Unpack query
Parse query
Choose server
Transfer request → server

Server
Unpack request
Extract data (implies disk IO)
Create response
Pack response
Transfer resp → Gate

Again Gate:
Unpack response from server
RePack response for client
Transfer  Gate → Client

So this is:

Client --4MB--> Gate --4MB--> Server --4MB--> Disk

and back (assuming that the response is just an Ack).

Transferring 4 MB on a 1 Gb/s network costs roughly 1000/40 = 25 ms, so it will take 50 ms to transfer a message from a client to the server. You will saturate the network with 20 messages sent per second.
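
The per-hop estimate can be reproduced with a few lines of Java. This is only a sketch under stated assumptions: "4 MB" is taken as 4 * 10^6 bytes, a byte as 8 bits on the wire, and protocol overhead and latency are ignored; the class and method names are made up for the illustration. With these assumptions the result is about 32 ms per hop and about 31 messages per second before one link is full, the same order of magnitude as the figures quoted above.

```java
final class TransferEstimate {

    /** Milliseconds needed to push payloadBytes through a link, ignoring overhead and latency. */
    static double transferMillis(long payloadBytes, double linkBitsPerSecond) {
        return payloadBytes * 8.0 / linkBitsPerSecond * 1000.0;
    }

    /** Messages per second at which a single link is fully used. */
    static double saturationRate(long payloadBytes, double linkBitsPerSecond) {
        return linkBitsPerSecond / (payloadBytes * 8.0);
    }

    public static void main(String[] args) {
        long payload = 4_000_000L; // assumption: "4 MB" means 4 * 10^6 bytes
        double link = 1e9;         // 1 Gb/s

        System.out.printf("one hop:   %.1f ms%n", transferMillis(payload, link));
        System.out.printf("two hops:  %.1f ms%n", 2 * transferMillis(payload, link));
        System.out.printf("link full: %.1f msg/s%n", saturationRate(payload, link));
    }
}
```

Counting 10 bits per byte to allow for framing overhead, as is common in quick estimates, moves the per-hop time to about 40 ms.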

Writing 4 MB to disk will take roughly 50 to 100 ms too (faster if you use an SSD).

The global roundtrip will take around 25 ms + 25 ms + 50 ms = around 100 ms. You should measure this atomic roundtrip. That also means you will be able to process 10 messages per second on a single client, but no more than 20 messages per second before saturating the network.

The overall performance must be dominated by disk IO which is
currently up to 200 ms. However, as I mentioned before, we measure all
stages.
200 ms to write data on disk is very slow... That will limit even further the number of messages a client can process every second: 25 ms + 25 ms + 200 ms = 250 ms, so around 300 ms to process a message, which is about 3 messages per second per client.
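
The round-trip arithmetic can be written down as a small Java sketch. It assumes the stages run strictly one after another and that a client sends its next query only after the previous response has arrived; the stage values are the estimates from this message, and the class and method names are hypothetical.

```java
final class RoundTripEstimate {

    /** Sum of the sequential stage times of one request, in milliseconds. */
    static double roundTripMillis(double... stageMillis) {
        double total = 0.0;
        for (double stage : stageMillis) {
            total += stage;
        }
        return total;
    }

    /** Requests per second for one client that waits for each response before sending again. */
    static double perClientRate(double roundTripMillis) {
        return 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        // client -> gate, gate -> server, disk write (estimates, not measurements)
        double fastDisk = roundTripMillis(25, 25, 50);
        double slowDisk = roundTripMillis(25, 25, 200);

        System.out.printf("disk 50 ms:  %.0f ms, %.1f msg/s per client%n",
                fastDisk, perClientRate(fastDisk));
        System.out.printf("disk 200 ms: %.0f ms, %.1f msg/s per client%n",
                slowDisk, perClientRate(slowDisk));
    }
}
```

With a 200 ms disk write the sum is 250 ms, or 4 messages per second per client; rounding the round trip up to about 300 ms gives the figure of roughly 3 per second.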

The median for each stage is given to the right of it. The statistics
are for 10 clients and 4 MB query results. They are able to receive 7
queries per minute.
So you mean that with 10 clients sending queries, each client is capable of processing 0.7 messages per *minute* (about 0.01 per second) at most?

The query response reaches the client in 7237 ms on average, while
Gate -> client takes 6627 ms and the response server -> gate takes only
220 ms on average. It is interesting why the gate does not keep pace with
the overall load, while it simply retransfers the message.

client --> gate ..>
client <-- gate <..

takes 7.237 seconds,

and

..>gate --> server
<..gate <-- server

takes only 220 ms? (which is in line with the back-of-the-envelope math from the beginning of my response)
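
To see how lopsided these numbers are, the reported averages can be broken down as in the Java sketch below. It assumes that the gate -> client and server -> gate stages are sequential parts of the 7237 ms total, which the figures suggest but do not prove; the names are made up for the illustration.

```java
final class StageBreakdown {

    /** Fraction of the total response time spent in one stage. */
    static double share(double stageMillis, double totalMillis) {
        return stageMillis / totalMillis;
    }

    /** Time left over once the listed stages are subtracted from the total. */
    static double unaccounted(double totalMillis, double... stageMillis) {
        double rest = totalMillis;
        for (double stage : stageMillis) {
            rest -= stage;
        }
        return rest;
    }

    public static void main(String[] args) {
        double total = 7237;        // average response time seen by the client, ms
        double gateToClient = 6627; // average gate -> client transfer, ms
        double serverToGate = 220;  // average server -> gate transfer, ms

        System.out.printf("gate -> client share:  %.1f%%%n", 100 * share(gateToClient, total));
        System.out.printf("server -> gate share:  %.1f%%%n", 100 * share(serverToGate, total));
        System.out.printf("everything else:       %.0f ms%n",
                unaccounted(total, gateToClient, serverToGate));
    }
}
```

Under that assumption the gate -> client leg accounts for more than 90% of the response time, and everything else, including the disk access, fits in well under a second.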

Maybe too much IO for a
single machine, or maybe Mina can be tuned?
I'm just wondering what the Gate is doing... You may want to add a Logger filter in the Gate, on both sides (if it's using MINA too), to see where you are losing time.
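
The idea of timing both sides of the gate can be sketched in plain Java, independent of any framework. `StageTimer` and the stage names below are hypothetical and not part of MINA; in a MINA 2.x application the usual way to get comparable traces is to add a `LoggingFilter` to the filter chains of the acceptor and of the connector, assuming the gate is built that way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StageTimer {

    // Stage name -> timestamp in nanoseconds, kept in the order the marks were taken.
    private final Map<String, Long> marks = new LinkedHashMap<>();

    /** Records a stage boundary with an explicit timestamp (handy for tests and replays). */
    void mark(String stage, long nanoTime) {
        marks.put(stage, nanoTime);
    }

    /** Records a stage boundary using the monotonic clock. */
    void mark(String stage) {
        mark(stage, System.nanoTime());
    }

    /** Milliseconds elapsed between each mark and the one before it. */
    Map<String, Double> elapsedMillis() {
        Map<String, Double> elapsed = new LinkedHashMap<>();
        Long previous = null;
        for (Map.Entry<String, Long> entry : marks.entrySet()) {
            if (previous != null) {
                elapsed.put(entry.getKey(), (entry.getValue() - previous) / 1_000_000.0);
            }
            previous = entry.getValue();
        }
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        StageTimer timer = new StageTimer();
        timer.mark("response received from server");
        Thread.sleep(20); // stands in for unpacking and repacking the response
        timer.mark("response repacked");
        Thread.sleep(50); // stands in for writing the response to the client session
        timer.mark("response written to client");
        timer.elapsedMillis().forEach((stage, ms) ->
                System.out.printf("%-28s %.1f ms%n", stage, ms));
    }
}
```

Because all marks are taken inside the gate process, the comparison does not depend on clock synchronization between machines.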

Also if you have routers between the client and the gate, I think you should check them.


--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
