Thanks Josh, thank you for your reply.

I have one more quick question: are the Mahout sequence file format and the Hadoop sequence file format the same or different?

Please reply.

./rahul

> On Wed, Dec 28, 2011 at 10:27 PM, Josh Patterson <[email protected]> wrote:
>
> Rahul,
>
> Currently the text file to sequence file functionality is contained in
> (as of Mahout 0.6 / trunk):
>
> org.apache.mahout.text.SequenceFilesFromDirectory
>
> and it writes a K/V pair to a standard sequence file in the form of:
>
> { filepath (Text), contents of file (Text) }
>
> In the current single-process form of the code it uses a custom
> PathFilter (SequenceFilesFromDirectoryFilter) to recursively walk down
> a directory and its child directories to write the contained files
> into a series of sequence files based on a variety of options like
> "chunk size".
>
> An example of running this would be:
>
> bin/mahout seqdirectory -c UTF-8 -i reuters/ -o reuters-seqfiles
>
> Josh
>
> On Wed, Dec 28, 2011 at 7:00 AM, rahul raghavendhra <[email protected]> wrote:
> > I am new to Mahout. I just want to know how a text file is converted into
> > a seqfile and then to sparse vectors.
> > Can any kind of text file be converted into a seq file using ./mahout
> > seqdirectory?
> >
> > Thanks in advance.
> >
> > ./rahul
>
> --
> Twitter: @jpatanooga
> Solution Architect @ Cloudera
> hadoop: http://www.cloudera.com
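
For anyone reading this thread later, the behaviour described in Josh's reply can be roughly sketched in plain Python. This is only an illustration of the idea (a recursive directory walk, one (filepath, contents) pair per file, output grouped by a chunk size). It is not Mahout's code and it does not write real Hadoop sequence files. The names `walk_text_files` and `chunk_pairs`, and the byte-based chunk limit, are assumptions made up for this example.

```python
import os
from typing import Iterator, List, Tuple

# One record per input file, mirroring { filepath (Text), contents (Text) }.
Pair = Tuple[str, str]


def walk_text_files(root: str, charset: str = "utf-8") -> Iterator[Pair]:
    """Recursively walk `root` and yield one (path, contents) pair per file."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so the traversal order is deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding=charset) as handle:
                yield path, handle.read()


def chunk_pairs(pairs: Iterator[Pair], chunk_bytes: int) -> Iterator[List[Pair]]:
    """Group pairs into chunks; start a new chunk when the limit would be exceeded.

    The limit here is a hypothetical byte count over keys and values; a chunk
    always holds at least one pair, even if that pair alone exceeds the limit.
    """
    chunk: List[Pair] = []
    size = 0
    for key, value in pairs:
        pair_size = len(key.encode("utf-8")) + len(value.encode("utf-8"))
        if chunk and size + pair_size > chunk_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append((key, value))
        size += pair_size
    if chunk:
        yield chunk
```

Each chunk produced by `chunk_pairs(walk_text_files("reuters/"), chunk_bytes)` would correspond loosely to one output file in the series Josh mentions; the actual on-disk format and the meaning of Mahout's chunk size option are not modelled here.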
