Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "NutchFileFormats" page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/NutchFileFormats?action=diff&rev1=5&rev2=6

  ./src/java/org/apache/nutch/scoring/webgraph/Node.java
  }}}
- = CrawlDB =
+ With the above in mind, lets look at the composite features of some of these custom Writable's
+ = Writable Composition =
- TODO
-
- = LinkDB =
-
- TODO
-
- = Segments =

  == org.apache.hadoop.io.Text ==

@@ -66, +60 @@

  [[http://hadoop.apache.org/docs/current2/api/index.html?org/apache/hadoop/io/ArrayFile.html|ArrayFile]] is a specialization of MapFile, specifically a dense file-based mapping from integers to values where the keys are long integers. Finally you can also see [[http://hadoop.apache.org/docs/current2/api/index.html?org/apache/hadoop/io/SetFile.html|SetFile]] which is a file representing a file-based set of keys. Additional files in [[http://hadoop.apache.org/docs/current2/api/index.html?org/apache/hadoop/io/package-summary.html|org.apache.hadoop.io.*]] package contains the actual Writer, Reader and Sorter implementations as well.
+
+ = CrawlDB =
+
+ Content here is under construction.
+ Content here is under construction.
+
+ = LinkDB =
+
+ Content here is under construction.
+ Content here is under construction.
+
+ = Segments =

  When Nutch crawls the web, each resulting segment has four subdirectories, each containing an ArrayFile (a MapFile having keys that are long integers):

