Thanks, Harsh!
That means both the Hadoop APIs and the hadoop client commands basically
allow only serial writes.
Are there other ways to write data to HDFS in parallel, besides using
multiple parallel threads?
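
For reference, here is a minimal sketch of the multi-threaded approach I mean: each file is still written start to end by a single task, and the only parallelism is across files. The class name ParallelFileWriter and the StreamOpener hook are hypothetical names made up for this illustration, and the demo writes to a local temp directory so it runs without a cluster. I am assuming that with Hadoop on the classpath the opener could wrap FileSystem.create(Path) instead, but I have not tested that here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFileWriter {

    /** Hypothetical hook: opens an output stream for a target file name. */
    public interface StreamOpener {
        OutputStream open(String name) throws IOException;
    }

    /**
     * Writes every entry of files (name -> contents) as its own file.
     * Each file is written serially by exactly one task; only different
     * files are written concurrently. Returns the number of files written.
     */
    public static int writeAll(Map<String, byte[]> files, StreamOpener opener, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map.Entry<String, byte[]> entry : files.entrySet()) {
                pending.add(pool.submit(() -> {
                    // One stream per file, written from start to end.
                    try (OutputStream out = opener.open(entry.getKey())) {
                        out.write(entry.getValue());
                    }
                    return null;
                }));
            }
            // Wait for all tasks; get() rethrows any write failure.
            for (Future<?> f : pending) {
                f.get();
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Local stand-in for the target file system (assumption for the demo).
        Path dir = Files.createTempDirectory("parallel-write-demo");
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("part-0", "first".getBytes(StandardCharsets.UTF_8));
        files.put("part-1", "second".getBytes(StandardCharsets.UTF_8));
        files.put("part-2", "third".getBytes(StandardCharsets.UTF_8));

        int written = writeAll(files, name -> Files.newOutputStream(dir.resolve(name)), 3);
        System.out.println("wrote " + written + " files under " + dir);
    }
}
```

This only spreads the work across files; it does nothing to make a single file's write parallel.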

Thanks,
JJ

On May 17, 2011, at 10:59 PM, Harsh J <[email protected]> wrote:

> Hello,
> 
> Adding to Joey's response, copyFromLocal's current implementation is serial:
> given a list of files, it copies them one at a time.
> 
> On Wed, May 18, 2011 at 9:57 AM, Mapred Learn <[email protected]>
> wrote:
>> Thanks, Joey!
>> I will try to find out about copyFromLocal. It looks like the Hadoop APIs
>> write serially, as you pointed out.
>> 
>> Thanks,
>> -JJ
>> 
>> On May 17, 2011, at 8:32 PM, Joey Echeverria <[email protected]> wrote:
>> 
>>> The sequence file writer definitely does it serially as you can only
>>> ever write to the end of a file in Hadoop.
>>> 
>>> Doing copyFromLocal could write multiple files in parallel (I'm not
>>> sure if it does or not), but a single file would be written serially.
>>> 
>>> -Joey
>>> 
>>> On Tue, May 17, 2011 at 5:44 PM, Mapred Learn <[email protected]>
>>> wrote:
>>>> Hi,
>>>> My question is: when I run a command from the HDFS client, e.g. hadoop fs
>>>> -copyFromLocal, or create a sequence file writer in Java code and append
>>>> key/values to it through the Hadoop APIs, does it internally transfer/write
>>>> data to HDFS serially or in parallel?
>>>> 
>>>> Thanks in advance,
>>>> -JJ
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>> 
> 
> -- 
> Harsh J
