Hi Claudio,

Could you be more specific? What does 'MapReduce style' mean? seqdirectory should create sequence files from the documents in a folder, where the keys are the document names and the values are the documents' content.
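
To make the key/value layout concrete, here is a rough sketch in plain Python. It is not Mahout's code and it does not write Hadoop's SequenceFile format; it only walks a local folder and yields (document name, document content) pairs. The names iter_documents and build_demo are made up for illustration, and using the relative path as the key is an assumption.

```python
import os
import tempfile


def iter_documents(root):
    """Yield (key, value) pairs for every file under root.

    key   = file path relative to root (assumed naming scheme, not Mahout's)
    value = the file's full text content
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so subfolders are visited in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            key = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, encoding="utf-8") as f:
                yield key, f.read()


def build_demo():
    """Create a small temporary folder with two documents (hypothetical data)."""
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "a.txt"), "w", encoding="utf-8") as f:
        f.write("first document")
    os.mkdir(os.path.join(root, "sub"))
    with open(os.path.join(root, "sub", "b.txt"), "w", encoding="utf-8") as f:
        f.write("second document")
    return root


if __name__ == "__main__":
    for key, value in iter_documents(build_demo()):
        print(key, "->", value)
```

The loop is sequential on a single machine, one file after another, with nothing distributed across mappers.
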
What do you need it to do?

On Sat, Feb 16, 2013 at 5:55 PM, Claudio Reggiani <[email protected]> wrote:

> Hello,
>
> I have a text dataset. Running "seqdirectory" command on it I see it's not
> written in MapReduce style (looking at the source code of
> SequenceFilesFromDirectory confirms that).
>
> What if I have a big dataset stored in HDFS and I would like to convert it
> in SequenceFile format? Do I need to create my own custom job or
> seqdirectory does that?
>
> Thanks
> Claudio Reggiani
