Re: seqdirectory command in MapReduce

Dan Filimon Sat, 16 Feb 2013 10:35:47 -0800

But why would this be a problem? As long as it's using HDFS to access
the files, it should be able to fetch the chunks from wherever they
might be in the cluster.


I don't see why it wouldn't work. Let us know if it works!
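
For reference, the record layout described further down in this thread (document name as key, document content as value) can be sketched in a few lines. This is only an illustration using the Python standard library on a local folder, not Mahout's code and not the SequenceFile binary format. The names `directory_to_records` and `build_demo_records` are hypothetical, and using the path relative to the input folder as the key is an assumption.

```python
import os
import tempfile


def directory_to_records(root, encoding="utf-8"):
    """Yield (key, value) pairs: key is the file path relative to root,
    value is the file's full text content."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            # Assumption: the key is the relative path with "/" separators.
            key = os.path.relpath(full_path, root).replace(os.sep, "/")
            with open(full_path, encoding=encoding) as handle:
                yield key, handle.read()


def build_demo_records():
    """Create a small temporary folder and return its records as a list."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.txt"), "w", encoding="utf-8") as f:
            f.write("first document")
        with open(os.path.join(root, "sub", "b.txt"), "w", encoding="utf-8") as f:
            f.write("second document")
        return list(directory_to_records(root))


if __name__ == "__main__":
    for key, value in build_demo_records():
        print(key, "->", value)
```

Each file becomes one independent record, so in principle the list of files could be divided among several workers. Whether that is worth doing for a given dataset is the open question in this thread.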

On Sat, Feb 16, 2013 at 7:38 PM, Claudio Reggiani <[email protected]> wrote:
> Yes, thank you Steve. And sorry for my encoded messages.
>
> Claudio
>
>
> 2013/2/16 Steve Chien <[email protected]>
>
>> I think he meant that the code reads and converts the files from the
>> input directory as a standalone program, not as a MapReduce program.
>>
>> On Feb 16, 2013, at 11:22, Dan Filimon <[email protected]>
>> wrote:
>>
>> > Hi Claudio,
>> >
>> > Could you be more specific? What does 'MapReduce style' mean?
>> > seqdirectory should create sequence files from the documents in a
>> > folder, where the keys are the document names and the values are the
>> > documents' content.
>> >
>> > What do you need it to do?
>> >
>> > On Sat, Feb 16, 2013 at 5:55 PM, Claudio Reggiani <[email protected]> wrote:
>> >> Hello,
>> >>
>> >> I have a text dataset. Running the "seqdirectory" command on it, I see
>> >> that it is not written in MapReduce style (the source code of
>> >> SequenceFilesFromDirectory confirms that).
>> >>
>> >> What if I have a big dataset stored in HDFS and would like to convert it
>> >> to SequenceFile format? Do I need to create my own custom job, or does
>> >> seqdirectory do that?
>> >>
>> >> Thanks
>> >> Claudio Reggiani
>>
