EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, I am using EmbeddedSolrServer for full indexing (multi-core) and StreamingUpdateSolrServer for incremental indexing. The steps involved are listed below.

Full indexing (daily)
1) Start EmbeddedSolrServer
2) Delete all docs
3) Add all docs
4) Commit and optimize collection
5) Stop EmbeddedSolrServer
6) Reload core: http://localhost:7070/solr/admin/cores?action=RELOAD&core=docs

Incremental indexing (hourly)
1) Start StreamingUpdateSolrServer
2) Add/delete docs
3) Commit collection

The issue is that the index gets corrupted if we run full indexing and incremental indexing one after the other without restarting the Tomcat web server (localhost:7070). There is no issue if we restart Tomcat after each indexing process (full and incremental).

Please let me know how we can avoid corrupting the index without restarting Tomcat. I am fairly new to Solr, so I may be missing something here.

Below are some details about our Solr installation.
1) JVM: OpenJDK 64-Bit Server VM (19.0-b09)
2) solr-spec-version: 4.0.0.2011.12.08.06.33.52
3) solr-impl-version: 4.0-SNAPSHOT 1211898 - root - 2011-12-08 06:33:52
4) lucene-spec-version: 4.0-SNAPSHOT
5) lucene-impl-version: 4.0-SNAPSHOT 1211898 - root - 2011-12-08 06:24:12
6) OS: Red Hat Enterprise Linux Server release 6.1 (Santiago)

Thanks, PC Rao.

-- View this message in context: http://lucene.472066.n3.nabble.com/EmbeddedSolrServer-and-StreamingUpdateSolrServer-tp3889073p3889073.html Sent from the Solr - User mailing list archive at Nabble.com.
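Step 6 of the full-indexing procedure above (the core RELOAD call) could be scripted roughly as follows. This is only a sketch using the JDK's own HTTP classes, not SolrJ; the class name `CoreReload`, its method names, and the timeout values are illustrative assumptions, and it assumes Java 10 or later. The base URL and core name are the ones quoted in the message.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or SolrJ.
final class CoreReload {

    // Builds the CoreAdmin RELOAD URL in the same shape as the one in the message.
    static String reloadUrl(String solrBase, String coreName) {
        String encodedCore = URLEncoder.encode(coreName, StandardCharsets.UTF_8);
        return solrBase + "/admin/cores?action=RELOAD&core=" + encodedCore;
    }

    // Sends the RELOAD request and returns the HTTP status code.
    // Timeouts are arbitrary illustrative values.
    static int reload(String solrBase, String coreName) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(reloadUrl(solrBase, coreName)).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(60_000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        System.out.println(reloadUrl("http://localhost:7070/solr", "docs"));
        // Call reload("http://localhost:7070/solr", "docs") only when a Solr
        // instance is actually listening at that address.
    }
}
```

Checking the returned status code before starting the next incremental run is one way to make sure the reload actually happened; whether a successful reload is enough to avoid the corruption is not established in this thread.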
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, Any update on this? Please let me know if you need additional information. Thanks, PC Rao.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Mikhail Khludnev, Thank you for the reply. I think the index is getting corrupted because StreamingUpdateSolrServer keeps references to index files that EmbeddedSolrServer deletes during its commit/optimize. As a result, when I run a full index with EmbeddedSolrServer and then an incremental index with StreamingUpdateSolrServer, the incremental run fails with a FileNotFoundException.

A special note: we do not optimize the index after incremental indexing (StreamingUpdateSolrServer), but we do optimize it after full indexing (EmbeddedSolrServer).

Please see the log below and let me know if you need further information.

---
Mar 29, 2012 12:05:03 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[035405]} 0 28
Mar 29, 2012 12:05:03 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract params={stream.type=text/html&literal.stream_source_info=/snps/docs/customer/q_and_a/html/035405.html&literal.stream_name=035405.html&wt=javabin&collectionName=docs&version=2} status=0 QTime=28
Mar 29, 2012 12:05:03 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitSearcher=true,expungeDeletes=false,softCommit=false)
Mar 29, 2012 12:05:03 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {commit=} 0 10
Mar 29, 2012 12:05:03 AM org.apache.solr.common.SolrException log
SEVERE: java.io.FileNotFoundException: /opt/solr/home/data/docs_index/index/_3d.cfs (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.MMapDirectory.createSlicer(MMapDirectory.java:229)
    at org.apache.lucene.store.CompoundFileDirectory.<init>(CompoundFileDirectory.java:65)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:82)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:112)
    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:700)
    at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:263)
    at org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2852)
    at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2843)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2616)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2731)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2719)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2703)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:325)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:84)
    at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
    at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:52)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1477)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
---

Thanks, PC Rao.
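The failure mode suspected in the message above (one process still refers to a segment file by name after another process has rewritten the index and deleted it) can be reproduced in miniature with plain files. This is a toy illustration only: it does not use Lucene or Solr, the class name `StaleSegmentDemo` and the file name `_3e.cfs` are made up for the example, and whether this is really what happens inside the Tomcat-hosted Solr is the poster's hypothesis, not something the thread confirms.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy model of a stale segment reference; not Lucene code.
final class StaleSegmentDemo {

    // Returns true when opening the remembered file name fails with
    // FileNotFoundException after the "other process" deleted the file.
    static boolean staleOpenFails() {
        try {
            Path indexDir = Files.createTempDirectory("toy_index");
            Path oldSegment = indexDir.resolve("_3d.cfs");
            Path newSegment = indexDir.resolve("_3e.cfs"); // hypothetical name
            Files.write(oldSegment, new byte[] {1, 2, 3});

            // Long-running process: remembers the segment by name only.
            String remembered = oldSegment.toString();

            // Second process: writes a replacement segment, deletes the old one.
            Files.write(newSegment, new byte[] {1, 2, 3});
            Files.delete(oldSegment);

            // Long-running process now tries to open what it remembered.
            try (RandomAccessFile raf = new RandomAccessFile(remembered, "r")) {
                return false;
            } catch (FileNotFoundException e) {
                return true;
            } finally {
                Files.deleteIfExists(newSegment);
                Files.deleteIfExists(indexDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stale open failed: " + staleOpenFails());
    }
}
```

The point of the sketch is only that a list of file names held in memory goes stale as soon as another process rewrites the directory. It says nothing about how Solr should be made to refresh its view of the index.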
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Shawn, Thanks for sharing your opinion. Mikhail Khludnev, what do you think of Shawn's opinion? Thanks, PC Rao.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Mikhail Khludnev, You are partially right: we have two separate processes accessing the same Lucene Directory, but they do not run simultaneously. They run one after the other, and the second starts only after the first has completed. The commit from the EmbeddedSolrServer is successful, and I am posting the log below.

---
INFO: [] webapp=null path=/update/extract params={stream.type=text%2Fhtml&collectionName=docs} status=0 QTime=5
Apr 5, 2012 7:28:34 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitSearcher=true,expungeDeletes=false,softCommit=false)
Apr 5, 2012 7:28:34 AM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
    commit{dir=/opt/solr/home/data/docs_index/index,segFN=segments_4v,version=1333471748253,generation=175,filenames=[_5a.fdt, _5a_0.tip, _5a.fdx, _5a.tvf, _5a.tvx, segments_4v, _5a.tvd, _5a_0.prx, _5a.per, _5a_0.frq, _5a_0.tim, _5a.fnm]
    commit{dir=/opt/solr/home/data/docs_index/index,segFN=segments_4w,version=1333471748256,generation=176,filenames=[_5b.fnm, _5b.tvd, _5b.tvf, _5b_0.tip, _5b.nrm, _5b_0.tim, _5b.fdx, _5b_0.prx, _5b_0.frq, segments_4w, _5b.tvx, _5b.per, _5b.fdt]
Apr 5, 2012 7:28:34 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1333471748256
Apr 5, 2012 7:28:34 AM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@17c232ee main
Apr 5, 2012 7:28:34 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Apr 5, 2012 7:28:34 AM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@17c232ee main{DirectoryReader(segments_4w:1333471748256 _5b(4.0):Cv1000)} from Searcher@658f7386 main{DirectoryReader(segments_4v:1333471748253 _5a(4.0):Cv16787)}
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Apr 5, 2012 7:28:34 AM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@17c232ee main{DirectoryReader(segments_4w:1333471748256 _5b(4.0):Cv1000)}
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Apr 5, 2012 7:28:34 AM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@17c232ee main{DirectoryReader(segments_4w:1333471748256 _5b(4.0):Cv1000)}
Apr 5, 2012 7:28:34 AM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@658f7386 main
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Apr 5, 2012 7:28:34 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {commit=} 0 344
Apr 5, 2012 7:28:34 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=/update params={commit=true&waitSearcher=true} status=0 QTime=344
Apr 5, 2012 7:28:34 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[009658]} 0 9
Apr 5, 2012 7:28:34 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=/update/extract params={stream.type=text%2Fhtml&collectionName=docs} status=0 QTime=9
---

Please let me know your thoughts. Thanks, PC Rao.
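A note for reading the log above: the suffix of a `segments_N` file is the commit generation written in base 36, which is why `generation=175` appears next to `segFN=segments_4v` and `generation=176` next to `segments_4w`. The small sketch below does that conversion with the JDK only; the class name `SegmentsGeneration` and its method names are made up for illustration and are not Lucene APIs.

```java
// Hypothetical helper for reading segments_N file names; not a Lucene class.
final class SegmentsGeneration {

    private static final String PREFIX = "segments_";

    // Generation number -> file name, using radix 36 (Character.MAX_RADIX).
    static String segmentsFileName(long generation) {
        return PREFIX + Long.toString(generation, Character.MAX_RADIX);
    }

    // File name -> generation number.
    static long generationOf(String segmentsFileName) {
        String suffix = segmentsFileName.substring(PREFIX.length());
        return Long.parseLong(suffix, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        System.out.println(segmentsFileName(175));       // prints segments_4v
        System.out.println(generationOf("segments_4w")); // prints 176
    }
}
```

Comparing the generation that the long-running Tomcat instance last opened with the newest `segments_N` file on disk is one way to check whether it is still looking at an older commit.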
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, Any update? Thanks, PC Rao
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Mikhail Khludnev, Thank you for your help. Let me explain the JVM scenario. The JVM in which Tomcat runs is not restarted between StreamingUpdateSolrServer runs, whereas EmbeddedSolrServer runs in a fresh JVM instance (a new process) every time. In this scenario the index gets corrupted. If I restart Tomcat (i.e. restart the JVM in which StreamingUpdateSolrServer is running) after each indexing run completes, the index does not get corrupted. However, this is not a viable option for us because Solr would not be available to users during the restart. Let me know if you have any more thoughts on this. If you don't, could you also let me know how I can seek help from others? Thanks again, PC Rao.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, Any more thoughts?? Thanks, PC Rao.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi Ryan, I see. Yes, for incremental indexing (hourly) we use StreamingUpdateSolrServer, and it is faster than EmbeddedSolrServer. We also use EmbeddedSolrServer for full indexing on a daily basis; it is efficient for full indexing because it handles a large number of documents better. Thanks, PC Rao.
Re: EmbeddedSolrServer and StreamingUpdateSolrServer
Hi, Can someone officially confirm that the current Solr version does not support using both EmbeddedSolrServer (for full indexing) and StreamingUpdateSolrServer (for incremental indexing) to update the same index? How can I request this as an enhancement for the next version? I think this requirement is valid and very useful. Any disagreements? Thanks, PC Rao.