On Tue, Aug 24, 2010 at 12:03 PM, Eric Pugh <[email protected]> wrote:

> Hi all,
>
> I have maxFieldLength set to 10000 in solrconfig.xml, but was playing
> around with a really large document (The King James Bible) in analysis.jsp.
> I hacked analysis.jsp to show me the number of terms at each filter, and the
> headers, but without turning everything on by checkboxing verbose.
>
> My results shown at this screenshot:
> http://img.skitch.com/20100824-t36rq45i2wfimwyd53gwiqebdy.png seem to
> confirm that maxFieldLength is NOT honored by the analysis.jsp.
>
>
Separate from whether or not analysis.jsp should do this (I happen to think
the closer to "reality" it is, the better), I think the easiest
implementation would be to wrap the entire stream with
LimitTokenCountFilter:

/**
 * This TokenFilter limits the number of tokens while indexing. It is
 * a replacement for the maximum field length setting inside
 * {@link org.apache.lucene.index.IndexWriter}.
 */

If I remember correctly, it's not exactly the same as maxFieldLength, but
it's pretty close.
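
To make the idea concrete, here is a rough sketch of what "wrap the entire
stream" amounts to. It is deliberately not Lucene code: TokenLimitSketch and
limit() are hypothetical names, and a plain Iterator<String> stands in for a
TokenStream so the example runs without any Lucene jars. In analysis.jsp the
real thing would presumably be the final stream wrapped in a
LimitTokenCountFilter with the configured maxFieldLength as the limit.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in, not Lucene's API: passes through at most
// maxTokenCount tokens from the wrapped stream and then reports exhaustion.
final class TokenLimitSketch {

    static Iterator<String> limit(final Iterator<String> tokens, final int maxTokenCount) {
        return new Iterator<String>() {
            private int emitted = 0; // tokens handed out so far

            @Override
            public boolean hasNext() {
                // Stop at the limit even if the underlying stream has more.
                return emitted < maxTokenCount && tokens.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                emitted++;
                return tokens.next();
            }
        };
    }
}
```

Wrapping a four-token stream with a limit of 2 yields only the first two
tokens; everything after the limit is dropped without an error, which is the
behaviour the term counts in analysis.jsp would need to show.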

-- 
Robert Muir
[email protected]
