Re: HDFS data transfer!

jason hadoop Fri, 12 Jun 2009 18:55:20 -0700

Also check the I/O wait time on your datanodes. If the I/O wait time is high,
you can't win.
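
A minimal sketch of one way to measure this, assuming the datanodes run Linux and expose the usual aggregate "cpu" line in /proc/stat. The function names are illustrative only and are not part of Hadoop. The sketch compares two snapshots and reports the share of CPU time spent in iowait between them:

```python
def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Field order: user nice system idle iowait irq softirq steal.
            # The guest fields that follow are already counted in user/nice,
            # so they are left out of the total.
            return [int(x) for x in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_proc_stat(path="/proc/stat"):
    """Read one snapshot of the kernel CPU counters (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a datanode, take two snapshots a few seconds apart with read_proc_stat() while the copy is running and pass them to iowait_percent(). Tools such as iostat or top report the same counter if you would rather not script it.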


On Fri, Jun 12, 2009 at 11:24 AM, Brian Bockelman <[email protected]> wrote:

> What's your replication factor?  What aggregate I/O rates do you see in
> Ganglia?  Is the I/O spiky, or has it plateaued?
>
> We can hit close to network rate (1Gbps) per node locally, and have pretty
> similar hardware.
>
> Brian
>
>
> On Jun 12, 2009, at 9:03 AM, Scott wrote:
>
>> I ran the put command on 3 of the nodes simultaneously to copy files that
>> were local on those machines into HDFS.
>>
>> Brian Bockelman wrote:
>>
>>> What'd you do for the tests?  Was it a single stream or a multiple stream
>>> test?
>>>
>>> Brian
>>>
>>> On Jun 12, 2009, at 6:48 AM, Scott wrote:
>>>
>>>> So is ~1 GB/minute transfer rate a reasonable performance benchmark?
>>>> Our test cluster consists of 4 quad-core Xeon machines with 2 non-RAIDed
>>>> drives each.  My initial tests show a transfer rate of around 1 GB/minute,
>>>> and that was slower than I expected it to be.
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>>
>>>> Brian Bockelman wrote:
>>>>
>>>>> Hey Sugandha,
>>>>>
>>>>> Transfer rates depend on the quality/quantity of your hardware and the
>>>>> quality of your client disk that is generating the data.  I usually say
>>>>> that you should expect near-hardware-bottleneck speeds for an otherwise
>>>>> idle cluster.
>>>>>
>>>>> There should be no "make it fast" required (though you should review
>>>>> the logs for errors if it's going slow).  I would expect a 5GB file to
>>>>> take around 3-5 minutes to write on our cluster, but it's a well-tuned and
>>>>> operational cluster.
>>>>>
>>>>> As Todd (I think) mentioned before, we can't help any when you say "I
>>>>> want to make it faster".  You need to provide diagnostic information -
>>>>> logs, Ganglia plots, stack traces, something - that folks can look at.
>>>>>
>>>>> Brian
>>>>>
>>>>> On Jun 10, 2009, at 2:25 AM, Sugandha Naolekar wrote:
>>>>>
>>>>>> But if I want to make it fast, then what? I want to place the data in
>>>>>> HDFS and replicate it in a fraction of a second. Is that possible, and how?
>>>>>>
>>>>>> On Wed, Jun 10, 2009 at 2:47 PM, kartik saxena <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I would suppose about 2-3 hours. It took me some 2 days to load a 160
>>>>>>> GB file.
>>>>>>> Secura
>>>>>>>
>>>>>>> On Wed, Jun 10, 2009 at 11:56 AM, Sugandha Naolekar
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> If I try to transfer a 5GB VDI file from a remote host (not a part of
>>>>>>>> the hadoop cluster) into HDFS, and get it back, how much time is it
>>>>>>>> supposed to take?
>>>>>>>>
>>>>>>>> No map-reduce involved. Simply writing files in and out of HDFS through
>>>>>>>> simple Java code (usage of the APIs).
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards!
>>>>>>>> Sugandha
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards!
>>>>>> Sugandha
>>>>>>
>>>>>
>>>>>
>>>
>
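
To put the rates quoted above on one scale, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 GB = 1000 MB), and the replication factor of 3 is only an assumption (it is the HDFS default, but Scott's setting is not stated in this thread). The helper names are made up for illustration:

```python
def gb_per_minute_to_mbit_per_s(gb_per_minute):
    """Convert GB/minute to Mbit/s using decimal units (1 GB = 1000 MB)."""
    megabytes_per_second = gb_per_minute * 1000 / 60.0
    return megabytes_per_second * 8


def aggregate_disk_write_mb_per_s(client_mb_per_s, replication=3):
    """Cluster-wide disk write rate implied by one client stream.

    Every block is written to `replication` datanodes, so the disks
    collectively absorb that multiple of the client rate.
    replication=3 is an assumption, not a value taken from the thread.
    """
    return client_mb_per_s * replication


observed_mbit = gb_per_minute_to_mbit_per_s(1.0)  # the ~1 GB/minute figure
link_mbit = 1000.0                                # the 1 Gbps network rate
print("observed: %.0f Mbit/s, %.0f%% of a 1 Gbps link"
      % (observed_mbit, 100 * observed_mbit / link_mbit))
print("aggregate disk writes at replication 3: %.0f MB/s"
      % aggregate_disk_write_mb_per_s(observed_mbit / 8))
```

By this arithmetic 1 GB/minute is roughly 133 Mbit/s, well below what a 1 Gbps link can carry, which is why the replication factor and the disk I/O wait are the next things to look at.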


-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals
