: "By default, no more than 10,000 terms will be
: indexed for a field."
:
: Given your note, then the docs do not mean that no
: more than 10,000 terms will be indexed, but that some
: smaller number of terms will be indexed and only the
: first 10,000 occurrences will be tallied.

It means that by default only the first 10,000 terms of a document are indexed. If only 10,000 terms are indexed for any doc, then the sum of the term frequencies for any single doc should never be more than 10,000.

Your confusion seems to be that you are thinking about indexing "terms" along with their term frequencies, so for you the sentence "Why, Why, Why oh Why did you go?" has 5 terms. But in the context of analysis and maxFieldLength it is a sequence of 8 terms. If you setMaxFieldLength to 4, then only the first 4 terms of that sequence are indexed, and you would wind up seeing just two entries in your index: "Why" with a termFreq of 3 and "oh" with a termFreq of 1.

-Hoss
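
To make the counting concrete, here is a rough simulation of the truncation in plain Python. It is only a sketch and does not use Lucene: the function name index_term_freqs and the naive split-and-strip tokenizer are assumptions made for illustration, and a real analyzer may tokenize differently (for example by lowercasing).

```python
from collections import Counter


def index_term_freqs(text, max_field_length):
    """Simulate a field-length cap: keep only the first N terms, then tally.

    Hypothetical helper for illustration only; not part of Lucene.
    """
    # Naive tokenizer (an assumption): split on whitespace, strip punctuation.
    tokens = [t.strip(".,?!") for t in text.split()]
    tokens = [t for t in tokens if t]
    # The cap applies to the term sequence, not to the set of distinct terms.
    kept = tokens[:max_field_length]
    return Counter(kept)


sentence = "Why, Why, Why oh Why did you go?"

full = index_term_freqs(sentence, 10_000)
truncated = index_term_freqs(sentence, 4)

print(sum(full.values()), len(full))  # 8 terms in sequence, 5 distinct
print(dict(truncated))                # {'Why': 3, 'oh': 1}
```

Note that the term frequencies in the truncated result sum to 4, which is the cap, and that the fourth "Why" is lost because it falls after the cutoff.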