Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "IndexStructure" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=7&rev2=8

  The index structure formed after indexing is shown below :
- ||'''FieldName'''||'''Stored'''||'''Index'''|| '''IndexingFilter''' ||'''Comment'''||
+ ||'''Field Name'''||'''Stored'''||'''Index'''|| '''Indexing Filter/Plugin''' ||'''Comment'''||
- || boost || YES || NotIndexed || Indexer || ||
+ || boost || YES || Not Indexed || Indexer || ||
- || digest || YES || NotIndexed || Indexer || ||
+ || digest || YES || Not Indexed || Indexer || ||
- || lang || YES || UnTokenized || language-identifier || ||
+ || lang || YES || Un-Tokenized || language-identifier || ||
- || segment || YES || NotIndexed || Indexer || ||
+ || segment || YES || Not Indexed || Indexer || ||
  || tstamp || YES || Tokenized || Indexer || ||
  || anchor || NO || Tokenized || index-anchor || Indexing filter that indexes all inbound '''anchor text''' for a document.||
  || title || YES || Tokenized || index-basic || Adds basic searchable '''title field''' to a document. Also indexed by index-more ||
- || site || NO || UnTokenized || index-basic || Adds basic searchable '''site field''' to a document. ||
+ || site || NO || Un-Tokenized || index-basic || Adds basic searchable '''site field''' to a document. ||
  || host || NO || Tokenized || index-basic || Adds basic searchable '''hostname field''' to a document. ||
  || url || YES || Tokenized || index-basic || Adds basic searchable '''URL field''' to a document. ||
  || content || NO || Tokenized || index-basic || Adds basic searchable '''content field''' to a document. ||
  || lastModified || YES || NotIndexed || index-more || ||
- || date || NO || UnTokenized || index-more || ||
+ || date || NO || Un-Tokenized || index-more || ||
- || contentLength || YES || NotIndexed || index-more || ||
+ || contentLength || YES || Not Indexed || index-more || ||
- || type || NO || UnTokenized || index-more || contentType,primaryType,subType (all mime-types) ||
+ || type || NO || Un-Tokenized || index-more || contentType,primaryType,subType (all mime-types) ||
- || primaryType || YES || UnTokenized || index-more || primaryType (mime-type) ||
+ || primaryType || YES || Un-Tokenized || index-more || primaryType (mime-type) ||
- || subType || YES || UnTokenized || index-more || subType (mime-type) ||
+ || subType || YES || Un-Tokenized || index-more || subType (mime-type) ||
- || tld || YES || UnTokenized / NotStored(based on conf) || tld || see http://issues.apache.org/jira/browse/NUTCH-439 ||
+ || tld || YES || Un-Tokenized / NotStored(based on conf) || tld || see http://issues.apache.org/jira/browse/NUTCH-439 ||
- || category || NO || UnTokenized || index-url-category || see http://issues.apache.org/jira/browse/NUTCH-386 ||
+ || category || NO || Un-Tokenized || index-url-category || see http://issues.apache.org/jira/browse/NUTCH-386 ||
  || subcollection || YES || Tokenized || subcollection || see subcollection plugin ||
  ----
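
As a reading aid for the Stored and Index columns, here is a minimal Python sketch that models a handful of the rows above as plain records. It is not Nutch or Lucene code. The `IndexField` class and the helper functions are hypothetical names introduced only for illustration. The sketch also assumes the usual Lucene reading of the columns, which the page itself does not spell out: a stored field can be read back from a search hit, and any Index value other than "Not Indexed" means the field can be searched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexField:
    """One row of the table (hypothetical helper type, not a Nutch class)."""
    name: str
    stored: bool   # "Stored" column: YES -> True, NO -> False
    index: str     # "Index" column: "Tokenized", "Un-Tokenized" or "Not Indexed"
    plugin: str    # "Indexing Filter/Plugin" column

    @property
    def searchable(self) -> bool:
        # Assumption: anything other than "Not Indexed" can be queried.
        return self.index != "Not Indexed"


# A few rows copied from the table above; this is not the full list.
FIELDS = [
    IndexField("boost", True, "Not Indexed", "Indexer"),
    IndexField("lang", True, "Un-Tokenized", "language-identifier"),
    IndexField("anchor", False, "Tokenized", "index-anchor"),
    IndexField("title", True, "Tokenized", "index-basic"),
    IndexField("site", False, "Un-Tokenized", "index-basic"),
    IndexField("content", False, "Tokenized", "index-basic"),
    IndexField("contentLength", True, "Not Indexed", "index-more"),
]


def searchable_only(fields):
    """Names of fields that can be queried but are not stored."""
    return [f.name for f in fields if f.searchable and not f.stored]


def stored_only(fields):
    """Names of fields that are stored but cannot be queried."""
    return [f.name for f in fields if f.stored and not f.searchable]


if __name__ == "__main__":
    print("searchable, not stored:", searchable_only(FIELDS))
    print("stored, not searchable:", stored_only(FIELDS))
```

Under those assumptions, fields such as content and anchor can be searched but not read back from a hit, while boost and contentLength are carried with the hit but cannot be queried.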

