> Maybe your fields are too long so that only part of it gets indexed (look at IndexWriter.maxFieldLength).

This is interesting. I've had a look at the Javadoc and I think I understand: the maximum field length describes the maximum number of unique terms, not the maximum number of words/tokens. Therefore, even if I have a 4 GB field, I could quite safely set maxFieldLength to, say, 100k, which should comfortably cover the number of unique words, rather than the 800 million that would be needed to cover every token. Is this correct?

Is 100k a worrying maxFieldLength, in terms of how much memory it would consume?

Does Lucene issue a warning if this limit is exceeded during indexing? It would be quite worrying if it were silently discarding terms.

Thanks in advance,
Alex
