Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "IndexStructure" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=11&rev2=12

  The index structure formed after indexing is shown below :

- ||'''Field Name'''||'''Stored'''||'''Index'''|| '''Indexing Filter/Plugin''' ||'''Comment'''||
+ ||'''Field Name'''||'''Stored'''||'''Index'''|| '''Plugin''' ||'''Comment'''||
- || boost || YES || Not Indexed || Indexer || ||
- || digest || YES || Not Indexed || Indexer || ||
+ || boost || YES || Not Indexed || /!\ NEEDS COMMENT /!\ || /!\ NEEDS COMMENT /!\ ||
+ || digest || YES || Not Indexed || /!\ NEEDS COMMENT /!\ || /!\ NEEDS COMMENT /!\ ||
  || lang || YES || Un-Tokenized || language-identifier || Add a '''lang''', language field to a document.||
- || segment || YES || Not Indexed || Indexer || ||
- || tstamp || YES || Tokenized || Indexer || ||
+ || segment || YES || Not Indexed || /!\ NEEDS COMMENT /!\ || /!\ NEEDS COMMENT /!\ ||
+ || tstamp || YES || Tokenized || /!\ NEEDS COMMENT /!\ || /!\ NEEDS COMMENT /!\ ||
  || cc:license || YES || Indexed, Tokenized || creativecommons || Adds the entire license as '''cc:license=xxx''' and '''attributes''' extracted of the license url||
  || cc:meta || YES || Indexed, Tokenized || creativecommons || Adds the license location as '''cc:meta=xxx''' ||
  || cc:type || YES || Indexed,Tokenized || creativecommons || Adds the work type as '''cc:type=xxx'''||
@@ -21, +21 @@
  || content || NO || Tokenized || index-basic || Adds basic searchable '''content field''' to a document. ||
  || lastModified || NO || Indexed, Un-Tokenized || index-more || Adds some time related meta info in the form of '''last-modified''' if present. ||
  || date || NO || Indexed, Un-Tokenized || index-more || Index date as last-modified, or, if that's not present, uses fetch time. ||
- || contentLength || NO || Indexed, Un-Tokenized || index-more || /!\ NEEDS COMMENT/!\ ||
+ || contentLength || NO || Indexed, Un-Tokenized || index-more || /!\ NEEDS COMMENT /!\ ||
  || type || NO || Indexed, Un-Tokenized || index-more || Adds contentType, primaryType, subType (all mime-types) ||
  || primaryType || NO || Indexed, Un-Tokenized || index-more || primaryType (mime-type) ||
  || subType || NO || Indexed, Un-Tokenized || index-more || subType (mime-type) ||

