Hi all,

I have maxFieldLength set to 10000 in solrconfig.xml, but was playing around 
with a really large document (the King James Bible) in analysis.jsp. I hacked 
analysis.jsp to show me the number of terms at each filter, along with the 
headers, without having to turn everything on by checking the verbose box.

My results, shown in this screenshot: 
http://img.skitch.com/20100824-t36rq45i2wfimwyd53gwiqebdy.png, seem to confirm 
that maxFieldLength is NOT honored by analysis.jsp.

But it seems to me that folks using analysis.jsp would expect it to process 
text exactly the way a document is processed during indexing. In my specific 
case, it took me a while to realize that my indexing results differed from 
the analysis.jsp results because indexing only looked at the first 10000 
tokens, while analysis looked at all 101561. A horizontal table of 10,000 
cells kind of looks like a horizontal table of 101,561 cells!

Would it make sense to parse the text through the DocInverterPerField in 
analysis.jsp? Or maybe just modify the getTokens method in analysis.jsp to 
only parse maxFieldLength tokens? I think I can get the limit by looking up 
the SolrCore and reading core.getSolrConfig().mainIndexConfig.maxFieldLength.
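
To make the second option concrete, here is a rough sketch of the idea in plain Java. It is not the actual analysis.jsp code: MaxFieldLengthSketch and limitTokens are hypothetical names, the tokens are plain strings rather than a Lucene TokenStream, and the limit is passed in as a parameter instead of being read from the core's config.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Solr.
final class MaxFieldLengthSketch {

    // Keeps at most maxFieldLength tokens and ignores the rest, which is
    // the behavior I would expect analysis.jsp to share with indexing.
    static List<String> limitTokens(Iterator<String> tokens, int maxFieldLength) {
        List<String> kept = new ArrayList<>();
        while (kept.size() < maxFieldLength && tokens.hasNext()) {
            kept.add(tokens.next());
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
                "in the beginning god created the heaven and the earth".split(" "));
        // Assumed limit of 4 for the demo; the real value would come from solrconfig.xml.
        List<String> kept = limitTokens(all.iterator(), 4);
        System.out.println(kept.size() + " of " + all.size() + " tokens kept: " + kept);
    }
}
```

In the real getTokens method the same counter check would presumably sit inside the loop that pulls tokens from the analyzer, so the display would stop at the same point indexing does.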


Eric





-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from 
http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal








