Re: File system

Dennis Kubes Mon, 15 Dec 2008 17:22:04 -0800

The Nutch databases are in either SequenceFile or MapFile format, both of which store key and value pairs. Their keys and values are Writable implementations, which translate an object into its byte equivalent and vice versa.
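To make the Writable idea concrete, here is a minimal sketch of the round trip from object to bytes and back. It uses only the JDK so that it runs on its own. SimpleWritable is a local stand-in that has the same two methods as Hadoop's Writable interface (write and readFields), and PageInfo is a made-up value type for illustration, not an actual Nutch class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class WritableSketch {

    // Local stand-in with the same two methods as Hadoop's Writable interface.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical value type for illustration only (not a real Nutch class).
    static class PageInfo implements SimpleWritable {
        String url = "";
        long fetchTime;

        PageInfo() {}

        PageInfo(String url, long fetchTime) {
            this.url = url;
            this.fetchTime = fetchTime;
        }

        // Object -> bytes: write each field in a fixed order.
        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);
            out.writeLong(fetchTime);
        }

        // Bytes -> object: read the fields back in the same order.
        @Override
        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();
            fetchTime = in.readLong();
        }
    }

    // Serializes any SimpleWritable into a byte array.
    static byte[] toBytes(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    // Fills an existing instance from a byte array.
    static void fromBytes(SimpleWritable w, byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            w.readFields(in);
        }
    }
}
```

Because the fields are written as raw binary rather than as text, the resulting bytes are not readable in a text editor, even without any compression.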

The data and index files together make up a MapFile. The data file is a SequenceFile; the index file is what MapFile uses to seek to a specific key.
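As a rough illustration of how a sorted data file and a sparse index can work together, here is a small in-memory sketch. It is not Hadoop's MapFile code: MapFileSketch, its methods, and the indexInterval parameter are all names made up for this example, and the real files live on disk with byte offsets rather than list positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapFileSketch {

    // Stand-in for the "data" file: key/value records in ascending key order.
    private final List<String[]> data = new ArrayList<>();

    // Stand-in for the "index" file: every Nth key mapped to its record position.
    private final TreeMap<String, Integer> index = new TreeMap<>();

    private final int indexInterval;

    MapFileSketch(int indexInterval) {
        if (indexInterval < 1) {
            throw new IllegalArgumentException("indexInterval must be at least 1");
        }
        this.indexInterval = indexInterval;
    }

    // Appends a record; keys must arrive in strictly ascending order.
    void append(String key, String value) {
        if (!data.isEmpty() && data.get(data.size() - 1)[0].compareTo(key) >= 0) {
            throw new IllegalArgumentException("keys must be appended in ascending order");
        }
        // Only every indexInterval-th record gets an index entry.
        if (data.size() % indexInterval == 0) {
            index.put(key, data.size());
        }
        data.add(new String[] {key, value});
    }

    // Looks up a key: jump to the nearest indexed key at or before it, then scan forward.
    String get(String key) {
        Map.Entry<String, Integer> start = index.floorEntry(key);
        if (start == null) {
            return null; // key sorts before the first record
        }
        for (int i = start.getValue(); i < data.size(); i++) {
            int cmp = data.get(i)[0].compareTo(key);
            if (cmp == 0) {
                return data.get(i)[1];
            }
            if (cmp > 0) {
                break; // passed the place where the key would be
            }
        }
        return null;
    }

    // Number of entries kept in the sparse index.
    int indexSize() {
        return index.size();
    }
}
```

With an interval of 2 and five records, the index holds only three keys, yet any key can still be found by scanning at most two records from the indexed position.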

Please see the Hadoop wiki for more information about SequenceFile, MapFile, and Writable formats.


Dennis

oSilvio wrote:

Does somebody know how the file structure works, briefly? It seems that the data is compressed or something; it's not possible to
understand what's recorded in the data or index files.
Thanks
Silvio
