Sorry Andrew, this is something that's bitten people before. Search for maxFieldLength and you will see *2* of them in your config: one in indexDefaults and one in mainIndex. The one in mainIndex is set to 10000 and hence overrides the one in indexDefaults, so only the first 10000 tokens of each field are being indexed.
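
One quick way to spot duplicate settings like this is to list every maxFieldLength element together with the section it sits in. The sketch below is plain Python, not anything that ships with Solr. The inlined config is a hypothetical, trimmed-down stand-in for a real solrconfig.xml, and the helper names find_max_field_lengths and effective_max_field_length are made up for illustration. The only rule it encodes is the one described above: a value in mainIndex wins over the one in indexDefaults.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down solrconfig.xml; only the two relevant
# sections are reproduced here (assumption, not the attached file).
SOLRCONFIG = """
<config>
  <indexDefaults>
    <maxFieldLength>2147483647</maxFieldLength>
  </indexDefaults>
  <mainIndex>
    <maxFieldLength>10000</maxFieldLength>
  </mainIndex>
</config>
"""


def find_max_field_lengths(xml_text):
    """Return (parent_tag, value) for every <maxFieldLength> element."""
    root = ET.fromstring(xml_text)
    found = []
    for parent in root.iter():          # walk every element, root included
        for child in parent:            # look only at direct children
            if child.tag == "maxFieldLength":
                found.append((parent.tag, int(child.text.strip())))
    return found


def effective_max_field_length(settings):
    """Apply the precedence described above: mainIndex over indexDefaults."""
    values = dict(settings)
    return values.get("mainIndex", values.get("indexDefaults"))


settings = find_max_field_lengths(SOLRCONFIG)
for section, value in settings:
    print(section, value)
print("effective:", effective_max_field_length(settings))
```

To check a real config, read the file contents into a string and pass that to find_max_field_lengths instead of SOLRCONFIG. More than one entry in the result means one of them is overriding the other.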

-Yonik
http://www.lucidimagination.com

On Mon, Oct 26, 2009 at 10:30 AM, Andrew Clegg <andrew.cl...@gmail.com> wrote:
>
> Yep, I just re-indexed it again to make double sure -- same problem
> unfortunately.
>
> My solrconfig.xml and schema.xml are attached.
>
> In case you want to see it in action on the same data I've got, I've tarred
> up my data and conf directories here:
>
> http://biotext.org.uk/static/solr-issue-example.tar.gz
>
> That should be enough to reproduce it with.
>
> Thanks!
>
> Andrew.
>
>
> Yonik Seeley-2 wrote:
>>
>> Yes, please show us your solrconfig.xml, and verify that you reindexed
>> the document after changing maxFieldLength and restarting solr.
>>
>> I'll also see if I can reproduce a problem with maxFieldLength being
>> ignored.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>> On Mon, Oct 26, 2009 at 7:11 AM, Andrew Clegg <andrew.cl...@gmail.com>
>> wrote:
>>>
>>> Morning,
>>>
>>> Last week I was having a problem with terms visible in my search results
>>> in large documents not causing query hits:
>>>
>>> http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-td26029040.html#a26029351
>>>
>>> Erick suggested it might be related to maxFieldLength, so I set this to
>>> 2147483647 in my solrconfig.xml and reindexed over the weekend.
>>>
>>> Unfortunately I'm having the same problem now, even though Erick appears
>>> to be right! I've narrowed it down to a single document for testing
>>> purposes, and I can get it returned by querying for a term near the
>>> beginning, but terms near the end cause no hit, and I can even find the
>>> point part way through the document, after which, none of the remaining
>>> terms seem to cause a hit.
>>>
>>> The document is about 32000 terms long, most of which is in a single
>>> field called related_ids of about 31000 terms. My first thought was that
>>> the text was being chopped up into so many tokens that it was going over
>>> the maxFieldLength anyway, but 2147483647/32000=67109, and it seems very
>>> unlikely that 67109 tokens would be generated per term!
>>>
>>> I've tried undeploying and redeploying the whole web app from Tomcat in
>>> case the new maxFieldLength hadn't been read, but no difference. If I go
>>> to
>>>
>>> http://localhost:8080/solr/admin/file/?file=solrconfig.xml
>>>
>>> I can see
>>>
>>> <maxFieldLength>2147483647</maxFieldLength>
>>>
>>> as expected.
>>>
>>> Does anyone have any more ideas? This could potentially be a showstopper
>>> for us as we have quite a few long-ish documents to index. (32K words
>>> doesn't seem that long to me, but still...)
>>>
>>> I've tried it with today's nightly build (2009-10-26) and it makes no
>>> difference. If this sounds like a bug, I'll open a JIRA and attach tars
>>> of my config and data directories. Any thoughts?
>>>
>>> Thanks,
>>>
>>> Andrew.
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Solr-ignoring-maxFieldLength--tp26057808p26057808.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>
> http://www.nabble.com/file/p26060882/solrconfig.xml solrconfig.xml
> http://www.nabble.com/file/p26060882/schema.xml schema.xml
> --
> View this message in context:
> http://www.nabble.com/Solr-ignoring-maxFieldLength--tp26057808p26060882.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>