Re: Too slow network IO

Emmanuel Lécharny Fri, 10 Aug 2012 06:50:21 -0700

On 8/10/12 8:50 AM, Antonio Rodriges wrote:

After more analysis of the statistics we identified the bottleneck but
have not yet found the solution.


You are right that when the client waits for the response, the throughput
will be low. However, we measure all stages of the query processing,
and the transfer stage takes long. We synchronize the clocks between
the server, gate (see below) and the client within +/-5 ms. 10 clients
can generate and receive responses for about 7 queries per minute.


This is very low.

Both client and server use Mina.

Client:
while ( time not finished )
{
   q = generateQueryString            // it is several bytes
   send (q)
   wait for response ()
}  // that's all

There is a gate (also Mina based) which simply forwards queries to
servers and results back to clients:

Gate (has 32 nio acceptors)
Unpack query
Parse query
Choose server
Transfer request → server

Server
Unpack request
Extract data (implies disk IO)
Create response
Pack response
Transfer resp → Gate

Again Gate:
Unpack response from server
RePack response for client
Transfer  Gate → Client


so this is :

Client --4Mb--> Gate --4Mb--> Server --4Mb--> Disk

and back (assuming that the response is just an Ack).

Transferring 4Mb on a 1Gb/s network costs 1000/40 = 25 ms, so it will take 50 ms to transfer a message from a client to the server. You will saturate the network with 20 messages sent per second.

Writing 4Mb on disk will take roughly between 50 and 100 ms too (faster if you use a SSD).

The global roundtrip will take around 25ms + 25ms + 50ms = around 100ms. You should measure this atomic roundtrip. That also means you will be able to process 10 messages per second on a single client, but no more than 20 messages per second before saturating the network.
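
As a rough sketch of that arithmetic, using only the estimates above (25 ms per hop, 50 ms for the disk write). The class and method names are made up for illustration, and the model assumes a strictly synchronous client that waits for each response before sending the next query:

```java
public class RoundTripEstimate {
    // Sum of the per-stage latencies, in milliseconds.
    static long roundTripMs(long... stageMs) {
        long total = 0;
        for (long ms : stageMs) {
            total += ms;
        }
        return total;
    }

    // Messages per second if each message occupies the given time.
    static double messagesPerSecond(long perMessageMs) {
        return 1000.0 / perMessageMs;
    }

    public static void main(String[] args) {
        long hopMs = 25;   // estimated transfer time per hop (figure from the text)
        long diskMs = 50;  // lower bound of the disk write estimate (figure from the text)

        long roundTrip = roundTripMs(hopMs, hopMs, diskMs);
        System.out.println("round trip:  " + roundTrip + " ms");
        System.out.println("per client:  " + messagesPerSecond(roundTrip) + " msg/s");
        // Two hops of 25 ms each keep the network busy for 50 ms per message.
        System.out.println("network cap: " + messagesPerSecond(hopMs + hopMs) + " msg/s");
    }
}
```

Replacing the three inputs with the measured per-stage values gives the number to compare against the atomic roundtrip once it has been measured.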


The overall performance must be dominated by disk IO which is
currently up to 200 ms. However, as I mentioned before, we measure all
stages.

200 ms to write data on disk is very slow... That will limit even further the number of messages a client can process every second: 25ms + 25ms + 200 = around 300ms to process a message, so this is 3 messages per second per client.


The median for each stage is given to the right of it. The statistics
are for 10 clients and 4 MB query results. They are able to receive 7
queries per minute.

So you mean that with 10 clients sending queries, each client is capable of processing 0.7 messages per *minute* = 0.01 per second, max?
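
Spelling that conversion out, assuming the 7 queries per minute is the total across all 10 clients (that is my reading of your numbers, and the class and method names are only illustrative):

```java
public class ObservedThroughput {
    // Queries per minute completed by one client, given the total for all clients.
    static double perClientPerMinute(double totalPerMinute, int clients) {
        return totalPerMinute / clients;
    }

    // Convert a per-minute rate to a per-second rate.
    static double perSecond(double perMinute) {
        return perMinute / 60.0;
    }

    public static void main(String[] args) {
        double perMinute = perClientPerMinute(7.0, 10); // figures from the text
        System.out.println("per client per minute: " + perMinute);
        System.out.println("per client per second: " + perSecond(perMinute));
    }
}
```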


The query response reaches the client in 7237 ms on average, while
Gate -> client takes 6627 ms and the server -> gate response takes only
220 ms on average. Why does the gate not keep pace with the overall
load, when it simply retransfers the message?


client --> gate ..>
client <-- gate <..

takes 7.237 seconds,

and

..>gate --> server
<..gate <-- server

takes only 220 ms? (which is in line with the back-of-the-envelope math from the beginning of my response)
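
To put those averages side by side, here is a small sketch that expresses each stage as a share of the total response time. It assumes the stage times are roughly additive parts of the 7237 ms total, which the numbers suggest but do not prove; the class and method names are hypothetical:

```java
public class StageShare {
    // Percentage of the total response time spent in one stage.
    static double sharePercent(long stageMs, long totalMs) {
        return 100.0 * stageMs / totalMs;
    }

    public static void main(String[] args) {
        long totalMs = 7237;        // average time for the response to reach the client
        long gateToClientMs = 6627; // average Gate -> client time
        long serverToGateMs = 220;  // average server -> gate time

        System.out.println("gate -> client: " + sharePercent(gateToClientMs, totalMs) + " %");
        System.out.println("server -> gate: " + sharePercent(serverToGateMs, totalMs) + " %");
    }
}
```

With these figures the Gate -> client leg accounts for more than 90% of the total, so that is the leg worth instrumenting first.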

Maybe too much IO for a
single machine, or maybe Mina can be tuned?

I'm just wondering what the Gate is doing... You may want to add a Logger filter in the Gate, on both sides (if it's using MINA too) to see where you are losing time.

Also, if you have routers between the client and the gate, I think you should check them.



--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
