They are the same format. Both map to the same class in Hadoop, org.apache.hadoop.io.SequenceFile:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html

JP

On Thu, Dec 29, 2011 at 3:32 AM, rahul raghavendhra <[email protected]> wrote:
> Thanks Josh,
> thank you for your reply. I have one more silly doubt: are the Mahout
> sequence file format and the Hadoop sequence file format the same or
> different?
> Please reply.
> ./rahul
>
> On Wed, Dec 28, 2011 at 10:27 PM, Josh Patterson <[email protected]> wrote:
>> Rahul,
>> Currently the text file to sequence file functionality is contained in
>> (as of Mahout 0.6 / trunk):
>>
>> org.apache.mahout.text.SequenceFilesFromDirectory
>>
>> and it writes a K/V pair to a standard sequence file in the form of:
>>
>> { filepath (Text), contents of file (Text) }
>>
>> In the current single process form of the code it uses a custom
>> PathFilter (SequenceFilesFromDirectoryFilter) to recursively walk down
>> a directory and its child directories to write the contained files
>> into a series of sequence files based on a variety of options like
>> "chunk size".
>>
>> An example of running this would be:
>>
>> bin/mahout seqdirectory -c UTF-8 -i reuters/ -o reuters-seqfiles
>>
>> Josh
>>
>> On Wed, Dec 28, 2011 at 7:00 AM, rahul raghavendhra
>> <[email protected]> wrote:
>>> I am new to Mahout. I just want to know how a text file is converted into
>>> a seqfile and then to sparse vectors.
>>> Can any kind of text file be converted into a seq file using ./mahout
>>> seqdirectory?
>>>
>>> Thanks in advance.
>>>
>>> ./rahul
>>
>> --
>> Twitter: @jpatanooga
>> Solution Architect @ Cloudera
>> hadoop: http://www.cloudera.com

--
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
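
For anyone reading this thread later, the seqdirectory behavior described above (walk a directory recursively, emit one { filepath, contents } pair per file, split the output by a chunk size) can be sketched in plain Java. This is only an illustration under stated assumptions, not Mahout's code: it uses no Hadoop or Mahout classes and writes no sequence files. The names SeqDirectorySketch, Entry, collect, chunk and maxChunkChars are hypothetical, the keys are simply paths relative to the input directory, and the chunk limit is counted in characters purely for simplicity.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: mimics the (file path, file contents) pairing, without Hadoop or Mahout. */
final class SeqDirectorySketch {

    /** One key/value pair: key = file path, value = file contents. */
    static final class Entry {
        final String path;
        final String contents;

        Entry(String path, String contents) {
            this.path = path;
            this.contents = contents;
        }
    }

    /** Recursively walks root and returns one Entry per regular file, sorted by path. */
    static List<Entry> collect(Path root, Charset charset) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Entry> entries = new ArrayList<>();
        for (Path file : files) {
            // Assumption: the key is the path relative to root, with '/' separators.
            String key = root.relativize(file).toString().replace('\\', '/');
            String value = new String(Files.readAllBytes(file), charset);
            entries.add(new Entry(key, value));
        }
        return entries;
    }

    /** Groups entries into chunks; a new chunk starts when adding an entry would exceed maxChunkChars. */
    static List<List<Entry>> chunk(List<Entry> entries, int maxChunkChars) {
        List<List<Entry>> chunks = new ArrayList<>();
        List<Entry> current = new ArrayList<>();
        int size = 0;
        for (Entry e : entries) {
            if (!current.isEmpty() && size + e.contents.length() > maxChunkChars) {
                chunks.add(current);
                current = new ArrayList<>();
                size = 0;
            }
            current.add(e);
            size += e.contents.length();
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java SeqDirectorySketch <input-dir>");
            return;
        }
        List<Entry> entries = collect(Paths.get(args[0]), Charset.forName("UTF-8"));
        List<List<Entry>> chunks = chunk(entries, 64 * 1024);
        for (int i = 0; i < chunks.size(); i++) {
            for (Entry e : chunks.get(i)) {
                System.out.println("chunk-" + i + "\t" + e.path + "\t" + e.contents.length() + " chars");
            }
        }
    }
}
```

Running it against a directory such as reuters/ lists which files would land in which chunk. The real tool writes each chunk as a Hadoop SequenceFile with Text keys and Text values, which this sketch deliberately leaves out.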
