Thanks Patrick! This would work if a directory is to be uploaded, but for streaming, I guess, this would not work.
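
For reference, here is a minimal sketch of the shell approach quoted below, with a wait step added so the script knows when every background copy has finished and whether any failed. The function name parallel_put and the UPLOAD_CMD override are hypothetical names introduced only for this illustration; the only Hadoop command assumed is the hadoop fs -copyFromLocal from the thread. Each file is still written serially; only separate files are copied concurrently.

```shell
#!/bin/sh
# Sketch only: one background "hadoop fs -copyFromLocal" process per file.
# UPLOAD_CMD is a hypothetical override so the loop can be dry-run without a cluster.
UPLOAD_CMD=${UPLOAD_CMD:-"hadoop fs -copyFromLocal"}

# parallel_put DEST FILE... (hypothetical helper name)
# Note: uses plain global variables to stay portable across sh implementations.
parallel_put() {
  dest=$1; shift
  pids=""
  failed=0
  for file in "$@"; do
    # Start one copy per file in the background and remember its PID.
    $UPLOAD_CMD "$file" "$dest" &
    pids="$pids $!"
  done
  for pid in $pids; do
    # wait returns the exit status of that copy; count the failures.
    wait "$pid" || failed=$((failed + 1))
  done
  # Succeed only if every copy succeeded.
  [ "$failed" -eq 0 ]
}

# Example call (paths are placeholders):
#   parallel_put "$DEST_PATH" $LIST_OF_FILES
```

This starts one process per file with no upper bound, so for a long file list it may be worth capping the number of concurrent copies.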

Sent from my iPhone

On May 18, 2011, at 9:39 AM, Patrick Angeles <[email protected]> wrote:

> kinda clunky but you could do this via shell:
>
> for FILE in $LIST_OF_FILES ; do
>   hadoop fs -copyFromLocal $FILE $DEST_PATH &
> done
>
> If doing this via the Java API, then, yes you will have to use multiple
> threads.
>
> On Wed, May 18, 2011 at 1:04 AM, Mapred Learn <[email protected]> wrote:
>
>> Thanks Harsh !
>> That means basically both APIs as well as hadoop client commands allow only
>> serial writes.
>> I was wondering what could be other ways to write data in parallel to HDFS
>> other than using multiple parallel threads.
>>
>> Thanks,
>> JJ
>>
>> Sent from my iPhone
>>
>> On May 17, 2011, at 10:59 PM, Harsh J <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> Adding to Joey's response, copyFromLocal's current implementation is
>>> serial given a list of files.
>>>
>>> On Wed, May 18, 2011 at 9:57 AM, Mapred Learn <[email protected]>
>>> wrote:
>>>> Thanks Joey !
>>>> I will try to find out about copyFromLocal. Looks like Hadoop APIs write
>>>> serially as you pointed out.
>>>>
>>>> Thanks,
>>>> -JJ
>>>>
>>>> On May 17, 2011, at 8:32 PM, Joey Echeverria <[email protected]> wrote:
>>>>
>>>>> The sequence file writer definitely does it serially as you can only
>>>>> ever write to the end of a file in Hadoop.
>>>>>
>>>>> Doing copyFromLocal could write multiple files in parallel (I'm not
>>>>> sure if it does or not), but a single file would be written serially.
>>>>>
>>>>> -Joey
>>>>>
>>>>> On Tue, May 17, 2011 at 5:44 PM, Mapred Learn <[email protected]>
>>>>> wrote:
>>>>>> Hi,
>>>>>> My question is when I run a command from hdfs client, e.g. hadoop fs
>>>>>> -copyFromLocal, or create a sequence file writer in java code and
>>>>>> append key/values to it through Hadoop APIs, does it internally
>>>>>> transfer/write data to HDFS serially or in parallel ?
>>>>>>
>>>>>> Thanks in advance,
>>>>>> -JJ
>>>>>>
>>>>>
>>>>> --
>>>>> Joseph Echeverria
>>>>> Cloudera, Inc.
>>>>> 443.305.9434
>>>>
>>>
>>> --
>>> Harsh J
>>
