- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: luca pellegrini
Subject: Re: questions and suggestions on dpsearch

Hi Maxime,

As I continue using DataParkSearch, I have more and more questions:

1) What exactly is IndexDocSizeLimit? I know from the documentation that it corresponds to the amount of data stored in the index per document. What do you mean by "amount of data"? Is it the same as MaxDocSize?

2) I'm trying to index 20 .it domains, and the indexer (using crc-multi) seems to produce 10 GB of data. I think this is a lot; it is due to the fact that each word found in each web page is saved in the database. Do you think we could save some space by using cache-mode indexing (or any other indexing technique)? Is there a way to have a sort of "lookup table" containing only the dictionary of indexed words?

3) Is there a way to tell the indexer to avoid indexing stopwords?

4) Is there a way to tell the indexer to skip a document if a certain string is found in the document body (for example: if it's a blog page, don't index that page)? I think NoIndexIf could be used for this purpose, but how?

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1162208106
