On 8/10/12 8:50 AM, Antonio Rodriges wrote:
After more analysis of the statistics we identified the bottleneck, but
have not yet found the solution.
You are right that when the client waits for the response, the throughput
will be low. However, we measure all stages of the query processing,
and the transfer stage takes long. We synchronize the clocks between
the server, the gate (see below) and the client to within +/-5 ms. 10 clients
can generate and receive responses for about 7 queries per minute.
This is very low.
Both client and server use Mina.
Client:
while ( time not finished )
{
q = generateQueryString /// it is several bytes
send (q)
wait for response ()
} // that's all
There is a gate (also Mina based) which simply forwards queries to
servers and results back to clients:
Gate (has 32 nio acceptors)
Unpack query
Parse query
Choose server
Transfer request → server
Server
Unpack request
Extract data (implies disk IO)
Create response
Pack response
Transfer resp → Gate
Again Gate:
Unpack response from server
RePack response for client
Transfer Gate → Client
So this is:
Client --4 MB--> Gate --4 MB--> Server --4 MB--> Disk
and back (assuming that the response is just an Ack).
Transferring 4 MB on a 1 Gb/s network costs roughly 1000/40 = 25 ms, so it
will take 50 ms to transfer a message from a client to the server. You will
saturate the network with 20 messages sent per second.
Writing 4 MB to disk will take roughly 50 to 100 ms too (faster
if you use an SSD).
The global round trip will take around 25 ms + 25 ms + 50 ms = around 100 ms.
You should measure this atomic round trip. That also means you will be
able to process 10 messages per second on a single client, but no more
than 20 messages per second before saturating the network.
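The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the payload is taken to be 4 megabytes (4 x 10^6 bytes), the link is an ideal 1 Gb/s with no protocol overhead, and the class and method names are made up for illustration. With these raw figures one hop comes out at about 32 ms rather than 25 ms, which is the same order of magnitude and does not change the conclusion.

```java
/** Back-of-the-envelope estimate for a request relayed through a gate (illustrative names). */
final class RoundTripEstimate {

    /** Milliseconds needed to push payloadBytes over a link, ignoring protocol overhead. */
    static double transferMillis(long payloadBytes, double linkBitsPerSecond) {
        return payloadBytes * 8.0 / linkBitsPerSecond * 1000.0;
    }

    /** Requests per second a single client can complete if it waits for each response. */
    static double requestsPerSecond(double roundTripMillis) {
        return 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        long payloadBytes = 4_000_000L;          // assumption: 4 MB = 4 * 10^6 bytes
        double linkBitsPerSecond = 1e9;          // assumption: ideal 1 Gb/s link
        double diskMillis = 50.0;                // lower bound of the 50-100 ms disk estimate

        double hop = transferMillis(payloadBytes, linkBitsPerSecond);
        double roundTrip = 2 * hop + diskMillis; // client -> gate, gate -> server, disk write

        System.out.printf("one hop:        %.1f ms%n", hop);
        System.out.printf("round trip:     %.1f ms%n", roundTrip);
        System.out.printf("per client:     %.1f requests/s%n", requestsPerSecond(roundTrip));
    }
}
```

Replacing `diskMillis` with the measured 200 ms shows directly how much the disk write lowers the per-client rate.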
The overall performance must be dominated by disk IO which is
currently up to 200 ms. However, as I mentioned before, we measure all
stages.
200 ms to write data on disk is very slow... That will limit even
further the number of messages a client can process every second: 25 ms +
25 ms + 200 ms = around 300 ms to process a message, so this is 3 messages
per second per client.
The median for each stage is given to the right of it. The statistics
are for 10 clients and 4 MB query results. They are able to receive 7
queries per minute.
So you mean that with 10 clients sending queries, each client is capable
of processing 0.7 messages per *minute* = 0.01 per second, max?
The query response reaches the client in 7237 ms on average, of which
Gate -> client takes 6627 ms and the response server -> gate takes only
220 ms on average. Why does the gate not keep pace with the overall
load, when it simply retransfers the message?
client --> gate ..>
client <-- gate <..
takes 7.237 seconds,
and
..>gate --> server
<..gate <-- server
takes only 220 ms? (which is in line with the back-of-the-envelope math
from the beginning of my response)
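To see how lopsided these measurements are, the reported averages can be turned into shares of the total. This is a small sketch that uses only the three numbers quoted in this thread (7237 ms, 6627 ms, 220 ms) and assumes the two measured stages do not overlap. The class and method names are illustrative.

```java
/** Splits a measured total latency into per-stage shares (illustrative names). */
final class LatencyBreakdown {

    /** Share of one stage in the total, as a percentage. */
    static double sharePercent(long stageMillis, long totalMillis) {
        return 100.0 * stageMillis / totalMillis;
    }

    /** Time left over once the known stages are subtracted from the total. */
    static long unaccountedMillis(long totalMillis, long... stageMillis) {
        long sum = 0;
        for (long s : stageMillis) {
            sum += s;
        }
        return totalMillis - sum;
    }

    public static void main(String[] args) {
        long total = 7237;         // average time until the client has the response
        long gateToClient = 6627;  // average gate -> client transfer
        long serverToGate = 220;   // average server -> gate transfer

        System.out.printf("gate -> client: %.1f%% of the total%n",
                sharePercent(gateToClient, total));
        System.out.printf("server -> gate: %.1f%% of the total%n",
                sharePercent(serverToGate, total));
        System.out.printf("everything else: %d ms%n",
                unaccountedMillis(total, gateToClient, serverToGate));
    }
}
```

Under that assumption the gate -> client hop accounts for roughly nine tenths of the total, so that hop is the one worth instrumenting first.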
Maybe too much IO for a
single machine, or maybe Mina can be tuned?
I'm just wondering what the Gate is doing... You may want to add a Logger
filter in the Gate, on both sides (if it's using MINA too) to see where
you are losing time.
Also, if you have routers between the client and the gate, I think you
should check them.
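As an illustration of what such instrumentation could record, here is a minimal JDK-only sketch of a per-message stage timer. It does not use MINA at all (in a MINA-based gate the natural place for these calls would be a filter in the chain, such as the LoggingFilter that MINA ships), and the class name, method names and stage labels are hypothetical. Because all timestamps come from `System.nanoTime()` inside one JVM, the +/-5 ms clock synchronization between machines does not affect the result.

```java
import java.util.ArrayList;
import java.util.List;

/** Records named checkpoints for one message and reports the time spent between them. */
final class StageTimer {
    private final List<String> names = new ArrayList<>();
    private final List<Long> stamps = new ArrayList<>();

    /** Records a checkpoint using the JVM's monotonic clock. */
    void mark(String stage) {
        mark(stage, System.nanoTime());
    }

    /** Records a checkpoint with an explicit timestamp in nanoseconds (handy for tests). */
    void mark(String stage, long nanos) {
        names.add(stage);
        stamps.add(nanos);
    }

    /** One line per pair of consecutive checkpoints, with the elapsed milliseconds. */
    List<String> report() {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i < stamps.size(); i++) {
            long millis = (stamps.get(i) - stamps.get(i - 1)) / 1_000_000L;
            lines.add(names.get(i - 1) + " -> " + names.get(i) + ": " + millis + " ms");
        }
        return lines;
    }

    public static void main(String[] args) throws InterruptedException {
        StageTimer timer = new StageTimer();
        timer.mark("response received from server");
        Thread.sleep(20);  // stand-in for unpacking and repacking the response
        timer.mark("response repacked");
        Thread.sleep(50);  // stand-in for writing the response to the client
        timer.mark("response written to client");
        timer.report().forEach(System.out::println);
    }
}
```

Logging one such report per message on the gate would show whether the time goes into repacking, into the write to the client, or into waiting before the write even starts.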
--
Regards,
Emmanuel Lécharny
www.iktek.com