Just found the root cause:

* The segments.gen file does not always get replicated to the slaves.

For some reason, this small (20 bytes) file lives in memory and does not
always get written to the master's hard disk. So, obviously, it does not get
transferred to the slaves.

The solution was to shut down the master web app (it must be a clean shutdown,
not a kill of Tomcat) and then do the replication.

Also, if the timestamp and size are unchanged (the size won't change anyway!),
rsync does not seem to copy this file over either. Forcing this file to be
copied in the replication scripts solved the problem.
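
For anyone hitting the same thing: by default rsync skips a file whose size and
modification time match on both sides, which is what seems to bite here. rsync's
--checksum and --ignore-times options are the usual ways around that. The Python
sketch below shows the same idea outside rsync, assuming the slave can reach the
master's index directory through some local path. The function names and paths
are hypothetical and are not part of the Solr replication scripts.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def force_sync_file(src: Path, dst: Path) -> bool:
    """Copy src over dst whenever the contents differ.

    Size and modification time are deliberately ignored, because a tiny
    fixed-size file such as segments.gen can change without either one
    looking different to a size/mtime quick check.
    Returns True if a copy was made.
    """
    if dst.exists() and file_digest(src) == file_digest(dst):
        return False
    shutil.copy2(src, dst)  # copy2 also carries over the timestamps
    return True


if __name__ == "__main__":
    # Hypothetical paths; adjust to the real master snapshot and slave index.
    master_gen = Path("/path/to/master/index/segments.gen")
    slave_gen = Path("/path/to/slave/index/segments.gen")
    if master_gen.exists():
        copied = force_sync_file(master_gen, slave_gen)
        print("copied" if copied else "already in sync")
```

Comparing contents is cheap for a 20-byte file, so doing this on every
replication run should cost next to nothing.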

Thanks Otis and everyone for all your support!

Madu


-----Original Message-----
From: Maduranga Kannangara
Sent: Monday, 16 November 2009 12:37 PM
To: solr-user@lucene.apache.org
Subject: RE: Segment file not found error - after replicating

Yes. We have tried Solr 1.4 and so far it's been a great success.

I am still investigating why Solr 1.3 gave us this issue.

Currently it seems to me that
org.apache.lucene.index.SegmentInfos.FindSegmentsFile.run() is not able to
figure out the correct segments file name. (Maybe it is an index replication
issue, leading to "not fully replicated", but that is so hard to believe as
both master and slave have 100% the same data now!)
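
One way to check this theory on a slave is to compare the generation recorded
in segments.gen with the segments_N files actually present. The sketch below
assumes the Lucene 2.x on-disk layout as I understand it: segments_N carries
the generation N in base 36 (so segments_v would be generation 31), and
segments.gen holds a 4-byte format marker followed by the generation written
twice as 8-byte big-endian longs, which would match a 20-byte file. The
function names are made up for illustration, and the layout should be verified
against the Lucene version in use.

```python
import struct
from pathlib import Path
from typing import Optional


def generation_from_name(name: str) -> int:
    """Parse the base-36 generation out of a name such as 'segments_v'."""
    return int(name.split("_", 1)[1], 36)


def latest_generation_on_disk(index_dir: Path) -> Optional[int]:
    """Highest generation among the segments_N files in the directory."""
    gens = [generation_from_name(p.name) for p in index_dir.glob("segments_*")]
    return max(gens) if gens else None


def generation_in_segments_gen(index_dir: Path) -> Optional[int]:
    """Read the generation from segments.gen (assumed layout, see above)."""
    gen_file = index_dir / "segments.gen"
    if not gen_file.exists():
        return None
    data = gen_file.read_bytes()
    if len(data) < 20:
        return None
    _fmt, gen_a, gen_b = struct.unpack(">iqq", data[:20])
    # The two copies are meant to agree; if not, treat the file as unreadable.
    return gen_a if gen_a == gen_b else None


def check_index(index_dir: Path) -> bool:
    """True when segments.gen agrees with the newest segments_N file on disk."""
    gen = generation_in_segments_gen(index_dir)
    return gen is not None and gen == latest_generation_on_disk(index_dir)
```

Running check_index() on a slave's index directory right after replication
should flag a mismatch before the Solr web app is reloaded.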

Anyway, I will keep on trying till I find something useful, and will let you
know.


Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but 
I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Maduranga Kannangara <mkannang...@infomedia.com.au>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Tue, November 10, 2009 5:42:44 PM
> Subject: RE: Segment file not found error - after replicating
>
> Thanks Otis,
>
> I did the du -s for all three index directories as you said, right after
> replicating and when I found errors.
>
> All three gave me the exact same value. This time I found the error in a
> rather small index too (31 MB).
>
> BTW, if I copy the segments_x file to the name Solr is looking for and
> restart the Solr web app from Tomcat manager, the error goes away. But it's
> just a workaround, never good enough for production deployments.
>
> My next plan is to do a remote debug to see what exactly is happening in
> the code.
>
> Is there anything else I should be looking at?
> Any help is really appreciated on this matter.
>
> Thanks
> Madu
>
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Tuesday, 10 November 2009 1:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Segment file not found error - after replicating
>
> Madu,
>
> So are you saying that all slaves have the exact same index, and that index
> is exactly the same as the one on the master, yet only some of those slaves
> exhibit this error, while others do not?  Mind listing the index directories
> of 1) master, 2) slave without errors, 3) slave with errors, and doing:
> du -s /path/to/index/on/master
> du -s /path/to/index/on/slave/without/errors
> du -s /path/to/index/on/slave/with/errors
>
>
> Otis
>
>
>
> ----- Original Message ----
> > From: Maduranga Kannangara
> > To: "solr-user@lucene.apache.org"
> > Sent: Mon, November 9, 2009 7:47:04 PM
> > Subject: RE: Segment file not found error - after replicating
> >
> > Thanks Otis!
> >
> > Yes, I checked the index directories and they are 100% the same, both
> > timestamp and size wise.
> >
> > Not all the slaves face this issue. I would say roughly 50% have this
> > trouble.
> >
> > The logs do not have any errors either :-(
> >
> > Any other things I should do/look at?
> >
> > Cheers
> > Madu
> >
> >
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> > Sent: Tuesday, 10 November 2009 9:26 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Segment file not found error - after replicating
> >
> > It's hard to troubleshoot blindly like this, but have you tried manually
> > comparing the contents of the index dir on the master and on the slave(s)?
> > If they are out of sync, have you tried forcing replication to see if one
> > of the subsequent replication attempts gets the dirs in sync?
> > Do you have more than 1 slave and do they all start having this problem at
> > the same time?
> > Any errors in the logs for any of the scripts involved in replication in
> > 1.3?
> >
> > Otis
> >
> >
> >
> > ----- Original Message ----
> > > From: Maduranga Kannangara
> > > To: "solr-user@lucene.apache.org"
> > > Sent: Sun, November 8, 2009 10:30:44 PM
> > > Subject: Segment file not found error - after replicating
> > >
> > > Hi guys,
> > >
> > > > We use Solr 1.3 for indexing large amounts of data (50G avg) in a Linux
> > > > environment and use the replication scripts to make replicas that live
> > > > on load-balanced slaves.
> > > >
> > > > The issue we face quite often (only on Linux servers) is that the slaves
> > > > are not able to find the segments file (segments_x etc.) after the
> > > > replication has completed. As this has become quite common, it is now a
> > > > serious issue for us.
> > > >
> > > > Below is a stack trace, if that helps. Any help on this matter is
> > > > greatly appreciated.
> > >
> > > --------------------------------
> > >
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created gap: org.apache.solr.highlight.GapFragmenter
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created regex: org.apache.solr.highlight.RegexFragmenter
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
> > > > INFO: created html: org.apache.solr.highlight.HtmlFormatter
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > > SEVERE: Could not start SOLR. Check solr/home property
> > > > java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > > >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
> > > >         at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > > >         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
> > > >         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > > >         at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > > >         at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > > >         at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > > >         at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > > >         at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > > >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > > >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > > >         at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > > >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > > >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > > >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > > >         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > > >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > > >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > > >         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > > >         at java.lang.Thread.run(Thread.java:619)
> > > > Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > > >         at java.io.RandomAccessFile.open(Native Method)
> > > >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> > > >         at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
> > > >         at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
> > > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > > >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > > >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > > >         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > > >         ... 30 more
> > > > Nov 5, 2009 11:34:46 PM org.apache.solr.common.SolrException log
> > > > SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
> > > >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
> > > >         at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
> > > >         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
> > > >         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
> > > >         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
> > > >         at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
> > > >         at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
> > > >         at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
> > > >         at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
> > > >         at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
> > > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > > >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > > >         at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> > > >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> > > >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> > > >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> > > >         at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> > > >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> > > >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> > > >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> > > >         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> > > >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> > > >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> > > >         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> > > >         at java.lang.Thread.run(Thread.java:619)
> > > > Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
> > > >         at java.io.RandomAccessFile.open(Native Method)
> > > >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> > > >         at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
> > > >         at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
> > > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
> > > >         at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
> > > >         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
> > > >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:94)
> > > >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> > > >         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
> > > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> > > >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> > > >         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:951)
> > > >         ... 30 more
> > >
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
> > > INFO: SolrDispatchFilter.init() done
> > > Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrServlet init
> > > INFO: SolrServlet.init()
> > >
> > > --------------------------------
> > >
> > > > Steps to reproduce the error (however, this did not work for me on my
> > > > local box, and the remote server is too far away to remote-debug!):
> > > >
> > > > -  Post some new data to the master server (usually about 1 GB worth of
> > > >    text files)
> > > > -  Run the replicate script on the slave Solr instance
> > > > -  Try to log in to admin on the slave Solr instance
> > > >
> > > > You should then see the above stack trace, even in the Tomcat output.
> > >
> > >
> > > Thanks in advance.
> > > Madu
