Alexey Serba created SOLR-3587:
----------------------------------

             Summary: Reloading a core no longer updates the analysis process
                 Key: SOLR-3587
                 URL: https://issues.apache.org/jira/browse/SOLR-3587
             Project: Solr
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Alexey Serba


It's common practice in Solr to overwrite the synonyms/stopwords files and issue 
a core reload command to force Solr to reload these files. We've noticed that 
this trick no longer works on trunk. 
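
For readers unfamiliar with the reload step, here is a minimal sketch of issuing it from Java through the CoreAdmin RELOAD action. The base URL and core name in `main` are example values only (assumptions, not taken from this report), and the class and method names are hypothetical helpers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;

public class CoreReload {
    /** Builds the CoreAdmin RELOAD URI. baseUrl and core are supplied by the caller. */
    static URI reloadUri(String baseUrl, String core) throws IOException {
        String encoded = URLEncoder.encode(core, "UTF-8");
        return URI.create(baseUrl + "/admin/cores?action=RELOAD&core=" + encoded);
    }

    /** Sends the reload request and returns the HTTP status code. */
    static int reload(String baseUrl, String core) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) reloadUri(baseUrl, core).toURL().openConnection();
        try {
            conn.setRequestMethod("GET");
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Example values only; adjust to the actual Solr host and core name.
        System.out.println(reloadUri("http://localhost:8983/solr", "collection1"));
    }
}
```

`main` only prints the URI, so it can be run without a Solr server; call `reload(...)` after overwriting the synonyms file to trigger the actual reload.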

I started debugging this problem in Eclipse and I can see that Solr actually 
re-reads the updated synonym file on core reload, but the indexing process still 
uses the old reference to the SynonymFilter / SynonymMap instance. 

When Solr first starts, I can see that my Solr server has 2 instances of 
SynonymMap (the object that holds the actual synonym data). This is expected, as 
I have 2 SynonymFilters defined in my schema (one field with 2 synonym filters, 
one at index time and one at query time).

After a core reload I see that Solr has 4 instances of this class. Suspecting a 
leak, I issued many (N) core reload commands and expected to see 2*N instances. 
That didn't happen: I only ever see 20 instances of this class. 

This is a pretty interesting number as well because, according to the thread 
dump/list, Solr/Jetty has 10 worker threads in its thread pool (the default for 
the example configs). So I suspect that we cache something in thread-local 
storage and that this is what bites us. 
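
To make the arithmetic behind that suspicion concrete, here is a toy model, not Solr or Lucene code. It assumes each load (the initial start or a reload) is handled by exactly one worker thread, which then keeps its own freshly built pair of maps in a `ThreadLocal`, replacing whatever it cached before. `ToySynonymMap`, `CACHE`, and the round-robin assignment of loads to workers are all hypothetical and chosen only to keep the result deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalPlateauDemo {
    /** Hypothetical stand-in for SynonymMap; not the Lucene class. */
    static final class ToySynonymMap {}

    // Per-thread cache: each worker remembers only the maps it built most recently.
    private static final ThreadLocal<List<ToySynonymMap>> CACHE = new ThreadLocal<>();

    /** Counts the distinct maps still referenced by some worker's cache after `loads` loads. */
    static int reachableMaps(int loads, int threads, int filters) throws Exception {
        List<ExecutorService> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            workers.add(Executors.newSingleThreadExecutor()); // one thread per executor
        }
        try {
            for (int r = 0; r < loads; r++) {
                // Assumption: loads are spread round-robin over the workers.
                workers.get(r % threads).submit(() -> {
                    List<ToySynonymMap> fresh = new ArrayList<>();
                    for (int f = 0; f < filters; f++) {
                        fresh.add(new ToySynonymMap());
                    }
                    CACHE.set(fresh); // overwrites this thread's previous entry
                }).get();
            }
            // ToySynonymMap does not override equals, so the set compares by identity.
            Set<ToySynonymMap> reachable = new HashSet<>();
            Callable<List<ToySynonymMap>> readCache = CACHE::get;
            for (ExecutorService worker : workers) {
                List<ToySynonymMap> cached = worker.submit(readCache).get();
                if (cached != null) {
                    reachable.addAll(cached);
                }
            }
            return reachable.size();
        } finally {
            for (ExecutorService worker : workers) {
                worker.shutdown();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(reachableMaps(1, 10, 2));  // 2
        System.out.println(reachableMaps(2, 10, 2));  // 4
        System.out.println(reachableMaps(50, 10, 2)); // 20
    }
}
```

Under these assumptions the count grows by 2 per load and then plateaus at 2 filters x 10 threads = 20, which is the same pattern as the 2, 4, and 20 instances observed. This only shows that per-thread caching is consistent with the numbers; it does not establish where the stale references actually live.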

I looked into the code and found that Lucene caches token streams in a 
ThreadLocal, but I don't know the code well enough to say that this is the problem.

So I took a different approach and looked for the commit that introduced this 
bug. I wrote a simple test (shell script) and used the _git bisect_ tool to 
track it down.

{quote}
ffd9c717448eca895d19be8ee9718bc399ac34a7 is the first bad commit
commit ffd9c717448eca895d19be8ee9718bc399ac34a7
Author: Mark Robert Miller <[email protected]>
Date:   Thu Jun 30 13:59:59 2011 +0000

      SOLR-2193, SOLR-2565: The default Solr update handler has been improved so
      that it uses fewer locks, keeps the IndexWriter open rather than closing it
      on each commit (ie commits no longer wait for background merges to complete),
      works with SolrCore to provide faster 'soft' commits, and has an improved API
      that requires less instanceof special casing.
    
      You may now specify a 'soft' commit when committing. This will
      use Lucene's NRT feature to avoid guaranteeing documents are on stable storage in exchange
      for faster reopen times. There is also a new 'soft' autocommit tracker that can be
      configured.
    
      SolrCores now properly share IndexWriters across SolrCore reloads.
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1141542 13f79535-47bb-0310-9956-ffa450edef68
{quote}

It looks like sharing IndexWriters across SolrCore reloads could be the root 
cause, right?
