Taj,
Even 4 times faster (400 sec for 68 MB, about 170 KB/s) is still not very
fast. First try to scp a similarly sized file between the hosts involved;
if that transfer is also slow, fix the network issue before looking at
HDFS. Place the test file on the same partition where the HDFS data is
stored, so the disk is the same in both tests.
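For example, something like this (the path and hostname are placeholders,
not your actual setup), timing the copy:

  scp /data/hdfs/testfile user@datanode1:/tmp/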
With tcpdump, first make sure the amount of data transferred matches the
roughly 68 MB you expect, and check for any large gaps between data
packets coming to the client. Also, while the client is reading, check
netstat on both the client and the datanode, and note the send buffer on
the datanode and the receive buffer on the client. If the datanode's send
buffer is non-zero most of the time, you have some network issue; if the
receive buffer on the client is full, the client itself is reading slowly
for some reason, etc.
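For example (the interface and file names are assumptions; Recv-Q and
Send-Q are the standard netstat columns for these buffers):

  tcpdump -i eth0 -w client.pcap port 50010   (capture the read on the client)
  netstat -tn | grep 50010                    (watch the Recv-Q / Send-Q columns)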
hope this helps.
Raghu.
j2eeiscool wrote:
Hi Raghu,
Good catch, thanx. totalBytesRead is not used for any decisions, etc.
I ran the client from another machine and the read was about 4 times faster.
I have the tcpdump from the original client machine.
This is probably asking too much, but is there anything in particular I
should be looking for in the tcpdump? It is about 16 MB in size.
Thanx,
Taj
Raghu Angadi wrote:
That's too long; buffer size does not explain it. The only small problem
I see in your code:
> totalBytesRead += bytesReadThisRead;
> fileNotReadFully = (bytesReadThisRead != -1);
totalBytesRead ends up off by 1: the final read() returns -1 at end of
file, and that -1 gets added to the total. Not sure where totalBytesRead
is used.
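A minimal sketch of a fixed loop (the stream and method names here are
assumptions, not your actual code):

  import java.io.IOException;
  import java.io.InputStream;

  // Drains the stream and returns an exact byte count. read() returns -1
  // only at end of file, and that -1 is never added to the total.
  static long countBytes(InputStream in) throws IOException {
      byte[] buf = new byte[64 * 1024];
      long totalBytesRead = 0;
      int bytesReadThisRead;
      while ((bytesReadThisRead = in.read(buf)) != -1) {
          totalBytesRead += bytesReadThisRead;
      }
      return totalBytesRead;
  }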
If you can, check a tcpdump on your client machine (for datanode port
50010).
Raghu.
j2eeiscool wrote:
Hi Raghu,
Many thanx for your reply:
The write takes approximately 11367 millisecs.
The read takes approximately 1610565 millisecs.
The file size is 68573254 bytes and the HDFS block size is 64 MB.
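(For scale, that works out to roughly 68573254 bytes / 11.4 sec ≈ 6 MB/sec
for the write, but 68573254 bytes / 1610.6 sec ≈ 42 KB/sec for the read.)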