On 13.07.2011 15:37, Gabriele Kahlout wrote:
> Well, I'm not sure how usual this scenario would be:
> 1. In general those using solr with nutch don't store the content field to
> avoid storing the whole web/intranet in their index twice (once in the form
> of stored data, and once in the form of indexed data).

Not exactly. The indexed form is quite different from the stored form:
the index holds only the tokens, each distinct token only once, plus
some additional data like the document count and, maybe, shingle
information etc.

Hence, indexed data usually needs much less space on disk than the
original data.
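
To illustrate the idea, here is a rough sketch in Python. It is a toy
inverted index, not what Lucene actually writes to disk; the names
tokenize and build_inverted_index and the sample documents are made up
for this example, and a real index also keeps things like term
frequencies and positions in a compressed form.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Toy analyzer (assumption): lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each distinct token to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


# Hypothetical sample documents, standing in for crawled pages.
docs = {
    1: "the quick brown fox jumps over the lazy dog",
    2: "the lazy dog sleeps while the quick fox runs",
}
index = build_inverted_index(docs)

# Every distinct token is a key exactly once, however often it occurs.
total_tokens = sum(len(tokenize(text)) for text in docs.values())
distinct_tokens = len(index)

# The "document count" per token: in how many documents it appears.
document_frequency = {token: len(ids) for token, ids in index.items()}
```

The gap between total_tokens and distinct_tokens grows with the size of
the collection, which is the intuition behind the index being smaller
than the original text.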

There's no practical alternative to storing the content in a stored
field. What would you otherwise display as a search result? "The
following web pages have your search term somewhere in their contents,
I don't know where, take a look for yourself"?
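
A small sketch of the point, again in Python and again only an
illustration: make_snippet is a hypothetical helper, not Solr's
highlighter. It shows that building a result excerpt needs the original
text, which an index of tokens alone cannot give back.

```python
def make_snippet(stored_text, term, width=30):
    """Return the text around the first occurrence of term, or None if absent."""
    pos = stored_text.lower().find(term.lower())
    if pos < 0:
        return None
    start = max(0, pos - width)
    end = min(len(stored_text), pos + len(term) + width)
    # Mark truncation on either side with an ellipsis.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(stored_text) else ""
    return prefix + stored_text[start:end] + suffix


# Hypothetical stored field value for one matching document.
stored_content = "Nutch crawls the web and hands each page to Solr for indexing."
snippet = make_snippet(stored_content, "solr", width=10)

# Without a stored copy, all the search can report is which documents matched.
no_match = make_snippet(stored_content, "lucene")
```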


