On Thu, Oct 30, 2008 at 2:06 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I've been running solr 1.3 on an ec2 instance for a couple of weeks and I've 
> had some stability issues. It seems like I need to bounce the app once a day. 
> That I could live with and ultimately maybe troubleshoot, but what's more 
> disturbing is that three times in the last 2 weeks my index has been 
> corrupted when FileNotFoundExceptions started to appear.
>
> I'm running in jetty and had my index on the local file system until I lost 
> the index the first time. Then I moved it to my mounted ebs volume so I could 
> restore from a snapshot if needed. I'm wondering if perhaps there are issues 
> with the locking mechanism on either the local directory (which is really a 
> virtual instance), or the mounted xfs volume. Has anyone seen this, or have 
> suggestions re the cause? I'm using the single lockType.
>
> I'm running a single solr instance that gets frequent updates from multiple 
> threads, and commits about every hour.
>
> A few things I see in the logs:
>
> - From time to time I see write lock timeouts:
> SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed 
> out: SingleInstanceLock: write.lock

This is really strange.  It suggests that there is another in-process
writer that is holding the lock.  That should be impossible, unless
it's caused by a previous exception trying to open an IndexWriter and
the lock is simply stale.  What seems to be the first exception that
occurs?

Also, you might try changing the lock type from single to simple to
make the lock visible cross-process.  That would rule out a second
Solr instance having been started on the same index directory.
Opening two writers on the same directory is one way to get missing
files like you appear to have.
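
To illustrate the difference between the two lock types, here is a
minimal sketch.  It is not Lucene's actual lock code: the class names
(LockSketch, SingleInstanceLock, SimpleFileLock) are made up for the
example, and it only assumes that "single" keeps lock state in memory
while "simple" uses a marker file in the index directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class LockSketch {

    /** In-memory lock: only code sharing this object (one JVM) can see it. */
    static class SingleInstanceLock {
        private final Set<String> held = new HashSet<>();

        synchronized boolean obtain(String name) {
            return held.add(name);   // false if the name is already held
        }

        synchronized void release(String name) {
            held.remove(name);
        }
    }

    /** File-based lock: a marker file that any process on the machine can see. */
    static class SimpleFileLock {
        private final File lockFile;

        SimpleFileLock(File dir, String name) {
            this.lockFile = new File(dir, name);
        }

        boolean obtain() {
            try {
                // Atomically creates the file; returns false if it already exists.
                return lockFile.createNewFile();
            } catch (IOException e) {
                return false;        // treat an I/O failure as "not obtained"
            }
        }

        void release() {
            lockFile.delete();
        }
    }

    public static void main(String[] args) {
        // Two separate in-memory lock objects stand in for two processes.
        SingleInstanceLock processA = new SingleInstanceLock();
        SingleInstanceLock processB = new SingleInstanceLock();
        System.out.println(processA.obtain("write.lock")); // true
        System.out.println(processB.obtain("write.lock")); // true: B never sees A's lock

        // Within one process, a lock that was never released stays held (stale).
        System.out.println(processA.obtain("write.lock")); // false

        // Hypothetical scratch directory standing in for the index directory.
        File dir = new File(System.getProperty("java.io.tmpdir"),
                            "locksketch-" + System.nanoTime());
        dir.mkdirs();
        SimpleFileLock writer1 = new SimpleFileLock(dir, "write.lock");
        SimpleFileLock writer2 = new SimpleFileLock(dir, "write.lock");
        System.out.println(writer1.obtain()); // true
        System.out.println(writer2.obtain()); // false: the marker file already exists
        writer1.release();
        dir.delete();
    }
}
```

With the in-memory variant a second process is never blocked, which is
why a file-based lock is the one that would expose a second writer on
the same directory.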

> - I've seen OOM exceptions during warming. I've set 
> maxWarmingSearchers=1, which I suspect will do the trick

OOM errors are really tricky - if they happen in the wrong place, it's
hard to recover from them gracefully.  Cleanup after an OOM error in
IndexWriter recently got a few small fixes in Lucene trunk - you
might want to try the latest dev version of Lucene and see if it
helps.


-Yonik


> - Then finally, this is what I found in the logs today when the index got 
> corrupted:
>
> Oct 29, 2008 12:18:39 AM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
> Oct 29, 2008 12:18:41 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /var/local/solr/data/production/index/_2rv.fdt (No such file or directory)
>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>        at 
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:368)
>        at 
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:77)
>        at 
> org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:226)
>        at 
> org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
>        at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
>        at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>        at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>        at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>        at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>        at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>        at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>        at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>        at org.mortbay.jetty.Server.handle(Server.java:285)
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>        at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>        at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: java.io.FileNotFoundException: 
> /var/local/solr/data/production/index/_2rv.fdt (No such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>        at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:77)
>        at 
> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:355)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:304)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:226)
>        at 
> org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:56)
>        at 
> org.apache.lucene.index.ReadOnlyMultiSegmentReader.<init>(ReadOnlyMultiSegmentReader.java:27)
>        at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:101)
>        at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>        at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>        ... 25 more
>
>
> Oct 29, 2008 12:18:41 AM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update params={} status=500 QTime=1149
> Oct 29, 2008 12:18:41 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /var/local/solr/data/production/index/_2rv.fdt (No such file or directory)
>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
>        at 
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:368)
>        at 
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:77)
>        at 
> org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:226)
>        at 
> org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
>        at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
>        at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>        at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>        at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>        at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>        at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>        at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>        at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>        at org.mortbay.jetty.Server.handle(Server.java:285)
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>        at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>        at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: java.io.FileNotFoundException: 
> /var/local/solr/data/production/index/_2rv.fdt (No such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>        at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:77)
>        at 
> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:355)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:304)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:226)
>        at 
> org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:56)
>        at 
> org.apache.lucene.index.ReadOnlyMultiSegmentReader.<init>(ReadOnlyMultiSegmentReader.java:27)
>        at 
> org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:101)
>        at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>        at 
> org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
>        ... 25 more
>
> [ then things look ok for a while, but these exceptions start to appear ]
>
> Oct 29, 2008 12:41:45 AM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update params={} status=0 QTime=1291
> Exception in thread "Lucene Merge Thread #0" 
> org.apache.lucene.index.MergePolicy$MergeException: 
> java.io.FileNotFoundException: /var/local/solr/data/production/index/_2rv.fdt 
> (No such file or directory)
>        at 
> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:314)
>        at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
> Caused by: java.io.FileNotFoundException: 
> /var/local/solr/data/production/index/_2rv.fdt (No such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>        at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:77)
>        at 
> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:355)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:304)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:258)
>        at 
> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4280)
>        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3932)
>        at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
>        at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
>
> Any ideas?
>
> thanks,
> Bill
