[ https://issues.apache.org/jira/browse/SOLR-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated SOLR-3587:
------------------------------
    Attachment: SOLR-3587.patch

Mentioned patch attached for posterity. I'm going to take a shot at just updating the analyzer now.

> Reloading a core is no longer updates analysis process
> ------------------------------------------------------
>
>                 Key: SOLR-3587
>                 URL: https://issues.apache.org/jira/browse/SOLR-3587
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Alexey Serba
>            Assignee: Mark Miller
>            Priority: Blocker
>             Fix For: 4.0, 5.0
>
>         Attachments: SOLR-3587.patch, SOLR-3587.patch
>
>
> It's common practice in Solr to overwrite the synonyms/stopwords files and
> issue a core reload command to force Solr to reload them. We've noticed that
> this trick no longer works on trunk.
> I started debugging this problem in Eclipse and I can see that Solr actually
> re-reads the updated synonym file on core reload, but the indexing process
> still uses the old reference to the Synonym filter / SynonymMap instance.
> When I start Solr initially I can see that my Solr server has 2 instances of
> SynonymMap (the object that holds the actual synonym data). This is expected,
> as I have 2 SynonymFilters defined in my schema (one field with 2 synonym
> filters: index time + query time).
> After a core reload I see that Solr has 4 instances of this class. So I
> thought that there could be a leak, issued many (N) core reload commands,
> and expected to see 2*N instances. That didn't happen. I can see only
> 20 instances of this class.
> This is a pretty interesting number as well because, according to the thread
> dump/list, Solr/Jetty has 10 worker threads in its thread pool (by default
> for the example configs). So I suspect that we cache something in
> thread-local storage and it bites us.
> I looked into the code and found that Lucene caches token streams in a
> ThreadLocal, but I don't know the code well enough to state that this is the
> problem.
> So I took a different approach and found the commit that introduced this bug.
> I wrote a simple test (shell script) and used the _git bisect_ tool to track
> this bug down.
> {quote}
> ffd9c717448eca895d19be8ee9718bc399ac34a7 is the first bad commit
> commit ffd9c717448eca895d19be8ee9718bc399ac34a7
> Author: Mark Robert Miller <markrmil...@apache.org>
> Date:   Thu Jun 30 13:59:59 2011 +0000
>     SOLR-2193, SOLR-2565: The default Solr update handler has been improved so
>     that it uses fewer locks, keeps the IndexWriter open rather than closing it
>     on each commit (ie commits no longer wait for background merges to complete),
>     works with SolrCore to provide faster 'soft' commits, and has an improved API
>     that requires less instanceof special casing.
>
>     You may now specify a 'soft' commit when committing. This will
>     use Lucene's NRT feature to avoid guaranteeing documents are on stable
>     storage in exchange for faster reopen times. There is also a new 'soft'
>     autocommit tracker that can be configured.
>
>     SolrCores now properly share IndexWriters across SolrCore reloads.
>
>     git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1141542
>     13f79535-47bb-0310-9956-ffa450edef68
> {quote}
> It looks like sharing IndexWriters across SolrCore reloads could be the root
> cause, right?
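
For readers following the thread, the failure mode suspected in the report can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not Solr or Lucene code: `SketchAnalyzer`, `SketchWriter` and `SketchCore` are hypothetical stand-ins for the analyzer, the shared IndexWriter and the SolrCore. The sketch assumes the writer captures its analyzer once at construction and is then handed unchanged to the reloaded core.

```java
import java.util.Map;

/** Hypothetical stand-in for an analyzer built from a synonyms file. */
class SketchAnalyzer {
    private final Map<String, String> synonyms;
    // Per-thread handle on the synonym data, loosely mirroring the
    // ThreadLocal token stream caching mentioned in the report.
    private final ThreadLocal<Map<String, String>> cached;

    SketchAnalyzer(Map<String, String> synonyms) {
        this.synonyms = synonyms;
        this.cached = ThreadLocal.withInitial(() -> this.synonyms);
    }

    String analyze(String token) {
        return cached.get().getOrDefault(token, token);
    }
}

/** Hypothetical stand-in for a writer that outlives a core reload. */
class SketchWriter {
    private final SketchAnalyzer analyzer; // captured once, never refreshed

    SketchWriter(SketchAnalyzer analyzer) {
        this.analyzer = analyzer;
    }

    String index(String token) {
        return analyzer.analyze(token);
    }
}

/** Hypothetical stand-in for a core that re-reads its config on reload. */
class SketchCore {
    final SketchAnalyzer analyzer;
    final SketchWriter writer;

    SketchCore(Map<String, String> synonyms, SketchWriter sharedWriter) {
        this.analyzer = new SketchAnalyzer(synonyms);
        // Reuse the existing writer if one is passed in (the "shared across
        // reloads" assumption); otherwise create one bound to our analyzer.
        this.writer = sharedWriter != null ? sharedWriter : new SketchWriter(this.analyzer);
    }

    SketchCore reload(Map<String, String> newSynonyms) {
        return new SketchCore(newSynonyms, this.writer);
    }
}

class StaleAnalyzerSketch {
    public static void main(String[] args) {
        SketchCore core = new SketchCore(Map.of("tv", "television"), null);
        SketchCore reloaded = core.reload(Map.of("tv", "telly"));

        // The reloaded core's own analyzer sees the new synonyms ...
        System.out.println(reloaded.analyzer.analyze("tv")); // prints "telly"
        // ... but the shared writer still analyzes with the old ones.
        System.out.println(reloaded.writer.index("tv"));     // prints "television"
    }
}
```

In this sketch the stale result comes purely from the writer holding on to the analyzer it was constructed with; handing the new analyzer to the shared writer on reload would make both lines print the same value. Whether that matches the actual cause in Solr is exactly the question raised above.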