I created a patch at https://issues.apache.org/jira/browse/SOLR-2086. I went with the simplest approach, since I didn't want to confuse things by adding extra filters to the analysis chain the user created. Either approach would work, though!
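For anyone comparing the two approaches, here is a rough plain-Java sketch of the "wrap the stream and cap the token count" idea. It is only an illustration: it is not the SOLR-2086 patch and it does not use Lucene's LimitTokenCountFilter. The names LimitingIterator, TokenLimitSketch, and analyze are hypothetical, and a simple whitespace split stands in for the real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a token-limiting wrapper (NOT Lucene's LimitTokenCountFilter).
// It passes tokens through from the wrapped iterator until maxTokens have been emitted.
final class LimitingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private final int maxTokens;
    private int emitted = 0;

    LimitingIterator(Iterator<T> inner, int maxTokens) {
        if (maxTokens < 0) {
            throw new IllegalArgumentException("maxTokens must be >= 0");
        }
        this.inner = inner;
        this.maxTokens = maxTokens;
    }

    @Override
    public boolean hasNext() {
        // Stop once the cap is reached, even if the wrapped stream has more tokens.
        return emitted < maxTokens && inner.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        emitted++;
        return inner.next();
    }
}

final class TokenLimitSketch {
    // Assumption: whitespace splitting stands in for whatever analyzer the field type defines.
    static List<String> analyze(String text, int maxTokens) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out; // no tokens at all
        }
        Iterator<String> tokens = Arrays.asList(trimmed.split("\\s+")).iterator();
        // Wrap the whole stream once, at the end of the chain.
        Iterator<String> limited = new LimitingIterator<>(tokens, maxTokens);
        while (limited.hasNext()) {
            out.add(limited.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [In, the, beginning, God]
        System.out.println(analyze("In the beginning God created the heaven and the earth", 4));
    }
}
```

The point of wrapping at the very end is that the user's own filters stay untouched and only the final token count is capped, which is roughly what maxFieldLength does at index time.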

On Aug 24, 2010, at 12:18 PM, Robert Muir wrote:

>
> On Tue, Aug 24, 2010 at 12:03 PM, Eric Pugh <[email protected]> wrote:
> Hi all,
>
> I have maxFieldLength set to 10000 in solrconfig.xml, but was playing around
> with really large document (The King James Bible) in analysis.jsp. I hacked
> analysis.jsp to show me the number of terms at each filter, and the headers,
> but without turning everything on by checkboxing verbose.
>
> My results shown at this screenshot:
> http://img.skitch.com/20100824-t36rq45i2wfimwyd53gwiqebdy.png seem to confirm
> that maxFieldLength is NOT honored by the analysis.jsp.
>
>
> Separate from whether or not analysis.jsp should do this (I happen to think
> the closer to "reality" it is, the better), I think the easiest
> implementation would be to wrap the entire stream with LimitTokenCountFilter:
>
> /**
>  * This TokenFilter limits the number of tokens while indexing. It is
>  * a replacement for the maximum field length setting inside
>  * {@link org.apache.lucene.index.IndexWriter}.
>  */
>
> If i remember, its not exactly the same as the maxFieldLength, but its pretty
> close.
>
> --
> Robert Muir
> [email protected]

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal
